Analysis: Three Observations About the Rise of the State in Shaping Cyberspace
by Andrea Little Limbago
Last month marked the 100th anniversary of the start of World War I. It was a time when states were so interdependent and borders so porous that some call it the first era of globalization. In fact, immediately prior to World War I, many forecast that interdependence would be the predominant driving force for the foreseeable future, diminishing states’ tendencies toward war and nationalism. World War I abruptly halted this extensive global interdependence, in large part due to rising nationalism and the growth of inward-facing policies. On the surface, there seems to be little in common between that era and the current Digital Age. However, the misguided presumption prior to World War I that interdependence would render states’ domestic interests obsolete is at risk of resurfacing in the cyber domain. Given the narrow focus on connectivity during previous waves of interdependence, here are three observations about the role of the state in the Digital Age worth considering:
1) In “borderless” cyberspace, national borders still matter. Similar to perspectives on the growth of interdependence prior to World War I, there is currently an emphasis on the borderless, connected nature of cyberspace and its uniform and omnipresent growth across the globe. While borders – both virtual and physical – have become more porous, the state nevertheless is increasingly shaping the structure and transparency of the Internet. From Russia’s recent expansion of web control to Brazilian-European cooperation on undersea cables, there is a growing patchwork approach to the Internet – all guided by national interests to maintain control within state borders.
2) “Data Nationalism” is the new nationalism of the Digital Age. While traditional nationalism still exists, thanks to the information revolution it now manifests in more nuanced ways. “Data nationalism”, where countries seek to maintain control of data within their physical borders, has strong parallels to traditional nationalism. In both cases, nationalism serves as a means to shape and impact a state’s culture and identity. As history has shown, states – and the governments running them – aim to maintain sovereign control of their territory and stay in power. Nationalistic tendencies, especially state preservation, tend to strongly influence the depth and structure of connectivity among people and states. This was true one hundred years ago, and it is true today. States are disparately invoking national legislation and barriers to exert their “data nationalism” within a virtual world, possibly halting the great expansion of access and content that has occurred thus far. Just as nationalism and states’ interests eventually altered the path of the first era of globalization, it is essential to acknowledge the growing role of the state in shaping the Internet during the Digital Age.
3) Although a technical creation, the cyber domain is not immune from the social construct of states’ interests. During each big wave of globalization and technological revolution, the idea emerges that interdependence will triumph and trump individual states’ interests. However, this idea discounts the role of the state in continuing to shape and maintain sovereign control while simultaneously influencing the structure of the newly connected system. This is true even in the cyber realm, which is not immune to the self-interest of states. From the great firewall of China to various regulations over content in Western European countries to Internet blackouts in Venezuela, states are increasingly leveraging their power to influence Internet access and control data and content within their borders. This has led to a growing discussion of the “Splinternet” or Balkanization of the Internet, which refers to the disparate patchwork of national policies and regulations emerging globally. Although such control runs counter to the ideals of openness and transparency on which the Internet was founded, it comes as no surprise to international relations scholars that states would seek to control (as best as possible) the cyber domain.
The role of self-interested states has largely been absent from discussions pertaining to the future of the Internet. Fortunately, there is a growing dialogue on the impact of national barriers and disparate national legislation on the Internet’s evolution. A recent article in The Atlantic reflects on the growing fractionalization of the Internet, and is reminiscent of earlier eras’ articles about the hub-and-spoke system of international trade. Similarly, a Pew Research Center poll highlights concern over the potential fractionalization of the Internet due to state intervention. As we continue to consider how the Internet will evolve and how policymakers will respond to an increasingly interconnected digital domain, we must not ignore the inherent tendency of states to demarcate both physical and virtual control within their sovereign borders.
Time Series Analysis for Network Security
by Phil Roth
Last week, I had the opportunity to attend a conference that had been on my radar for a long time. I’ve been using scientific Python tools for about 10 years, so it was with great excitement that I attended SciPy 2014 in Austin. I enjoyed meeting the developers of this excellent open-source software as well as other enthusiastic users like me. I learned a great deal from talks about some Python tools I haven’t yet tried but should really already be using, like conda, bokeh, and others. I also gave a talk describing how I have been using the SciPy stack of software in my work here at Endgame. In this post, I’ll summarize and expand on the first half of my presentation.
My work at Endgame has focused on collecting and tracking metrics associated with network and device behavior. By developing a model of normal behavior on these metrics, I can find and alert users when that behavior changes. There are several examples of security threats and events that would lead to anomalies in these metrics. Finding them and alerting our users to these threats as soon as possible is critical.
The first step in finding anomalies in network and device behavior is collecting the data and organizing it into a collection of time series. Our data pipeline here at Endgame changes rapidly as we develop tools and figure out what works and what doesn’t. For the purposes of this example, the network traffic data flows in the following way:
Apache Kafka is a distributed messaging system that views messages as a log. As data comes in, Kafka takes care of receiving it and distributing it to other systems that have subscribed to it. A separate system archives this data to HDFS for later processing over historical records. Reading the data from the Kafka servers allows my database to stay as current as possible. This allows me to send alerts to users very soon after a potential problem occurs. Reading historical data from HDFS allows me to backfill metrics once I create a new one or modify an existing one. After all of this data is read and processed, I fill a Redis database with the time series of each metric I’m tracking.
The three Python tools that I use throughout this process are kairos to manage the time series database, kafka-python to read from Kafka, and pyspark to read from HDFS. I chose each project for its ease of use and ability to get up to speed quickly. They all have simple interfaces that abstract away complicated behavior and allow you to focus on your own data flow. Also, by using a Python interface to old and new data, I can share the code that processes and compares data against the metrics I’ve developed.
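To make that flow concrete, here is a minimal sketch of the ingestion step. The "network-events" topic, the JSON message fields, and the metric naming scheme are hypothetical stand-ins for illustration, not the actual Endgame pipeline:

import json
from redis import Redis
from kafka import KafkaConsumer   # kafka-python
from kairos import Timeseries

intervals = {"days": {"step": 60, "steps": 2880},
             "months": {"step": 1800, "steps": 4032}}
ktseries = Timeseries(Redis("localhost", 6379), type="histogram",
                      intervals=intervals)

# Consume network events as they arrive and record one observation per
# event; kairos takes care of binning into the intervals defined above.
consumer = KafkaConsumer("network-events",
                         bootstrap_servers=["localhost:9092"])
for message in consumer:
    event = json.loads(message.value)
    ktseries.insert("conns." + event["ip"], event["count"],
                    timestamp=event["ts"])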
I gave my presentation on the third and final day of SciPy. Up until that point, I hadn’t heard Apache Spark or pyspark mentioned once. Because of this, I spent an extra minute or two evangelizing for the project. Later, the Blaze developers gave a similar endorsement. It’s good to know that I’m not alone in the scientific Python community in loving Spark. In fact, before using Spark, I had been running Pig scripts in order to collect historical data. This required a bunch of extra work to run the data through the Python processing scripts I had already developed for the real-time side of things. Using Spark definitely simplified this process.
The end result of all this work is an easily accessible store of all the metrics. With just a couple lines of code, I can extract the metric I’m interested in and convert it to a pandas DataFrame. From there, I can analyze it using all of the scientific computing tools available in Python. Here’s an example:
# Make a connection to our kairos database
from redis import Redis
from kairos import Timeseries

intervals = {"days": {"step": 60, "steps": 2880},
             "months": {"step": 1800, "steps": 4032}}
rclient = Redis("localhost", 6379)
ktseries = Timeseries(rclient, type="histogram", intervals=intervals)

# Read data from our kairos database
from pandas import DataFrame, to_datetime

series = ktseries.series(metric_name, "months")
ts, fields = zip(*series.items())
df = DataFrame({"data": fields}, index=to_datetime(ts, unit="s"))
And here’s an example time series showing the number of times an IP has responded to connection requests:
Thanks for reading. Next week I’ll talk about the different models I’ve built to make predictions and find anomalies in the time series that I’ve collected. If you’re interested in viewing the slides from my presentation, I’ve shared them here.
Building Models for Time Series Analysis
by Phil Roth
In my last post, I talked about the different Python projects I used to put together a pipeline for network security data. In this post, I’ll talk about how I used the scientific computing software stack in Python (numpy, scipy, and pandas) to build a model around that data and detect outliers. We left off last week with a pandas DataFrame containing example data:
This plot is an example taken from the database that shows the number of times an IP responds to connection requests over time. In order to find potential security threats, I’d like to find outliers in this and any other time series. In order to find outliers, I need to build a model around what I believe is normal behavior based on past data.
The simplest approach to building a model is to take the mean and standard deviation of the data I’ve seen so far. I can then treat the mean as a prediction of the next value and generate an alert when the actual value exceeds a configurable number of standard deviations from that prediction. The results of that simple algorithm are shown below:
In this plot and the ones that follow, the actual number of connections observed is in blue. The green window is centered on the prediction made for that time bin and extends one standard deviation in each direction. A red vertical line is drawn when the actual data is a configurable distance away from that prediction window.
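In code, that baseline might look something like the sketch below. It follows the column conventions used later in this post, but the function itself is my reconstruction rather than the original implementation:

import pandas as pd

def mean_outlier(tsdf, stdlimit=5):
    # Predict each bin from the mean of all data seen before it, and
    # measure spread with the standard deviation of that same history.
    tsdf['conns_binpred'] = pd.expanding_mean(tsdf['conns']).shift(1)
    tsdf['conns_binstd'] = pd.expanding_std(tsdf['conns']).shift(1)
    tsdf['conns_stds'] = ((tsdf['conns'] - tsdf['conns_binpred']) /
                          tsdf['conns_binstd'])
    tsdf['conns_outlier'] = tsdf['conns_stds'].abs() > stdlimit
    return tsdf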
As you can see in this first model, the prediction window is not highly correlated with the data and the spread is very large. A better model would be to fit the data to a sine curve using the tools that scipy provides. The prediction is the fit value and the standard deviation is derived from the residuals to the fit:
import numpy as np
from scipy.optimize import leastsq

def fitfunc(p, x):
    return p[0] * (1 - p[1] * np.sin(2 * np.pi / (24 * 3600) * (x + p[2])))

def residuals(p, y, x):
    return y - fitfunc(p, x)

def fit(tsdf):
    tsgb = tsdf.groupby(tsdf.timeofday).mean()
    p0 = np.array([tsgb["conns"].mean(), 1.0, 0.0])
    plsq, suc = leastsq(residuals, p0,
                        args=(tsgb["conns"], np.array(tsgb.index)))
    return plsq
At least on weekdays, the prediction mirrors the data better and the window is tighter. But we can improve these models even further. When looking through the data, it became apparent to me that different kinds of metrics required totally different models. I therefore developed a method for classifying the time series by asking two different questions:
- Does this metric show a weekly pattern (i.e. different behavior on weekdays and weekends)?
- Does this metric show a daily pattern?
In order to answer the first question, I fit the sine curve displayed above to the data on weekdays and weekends separately and compared the overall level of the fit (the p[0] parameter in the fit function above). If the levels differed, then I would build a model for the weekday data separately from the weekend data. If the overall levels of those fits were similar, then I kept that time series intact.
def weekend_ratio(tsdf):
    tsdf['weekday'] = pd.Series(tsdf.index.weekday < 5, index=tsdf.index)
    tsdf['timeofday'] = (tsdf.index.second + tsdf.index.minute * 60 +
                         tsdf.index.hour * 3600)
    wdayplsq = fit(tsdf[tsdf.weekday == 1])
    wendplsq = fit(tsdf[tsdf.weekday == 0])
    return wendplsq[0] / wdayplsq[0]
In the plot above, I show the weekday and weekend fits in red. For this data, the behavior of the time series on weekdays and weekends was different enough that I decided to treat them separately.
The next step is to determine if the time series displays daily patterns. In order to do this, I use numpy to take the Fourier transform of the time series and inspect the bins associated with a frequency of a day. I sum the three bins closest to that frequency and compare them to the first bin or the DC component. If the sum is large enough compared to that first bin, then the time series is classified as having a daily pattern.
def daily_ratio(tsdf):
    nbins = len(tsdf)
    deltat = (tsdf.index[1] - tsdf.index[0]).seconds
    deltaf = 1.0 / (nbins * deltat)
    daybin = int((1.0 / (24 * 3600)) / deltaf)
    rfft = np.abs(np.fft.rfft(tsdf["conns"]))
    daily_ratio = np.sum(rfft[daybin - 1:daybin + 2]) / rfft[0]
    return daily_ratio
Plots are sometimes the best way to explain these results, so I show two examples of the procedure below. In the first example, I show all the weekday data together in blue and the Fourier transform of that data in green. Red lines highlight the values corresponding to the frequency of a day in the Fourier transform data. The spike there is obvious and indicates a strong daily pattern.
The next figure shows the second example of the daily classification procedure. Here, the weekend data is combined in blue and the Fourier transform of that is in green. The Fourier transform data is flat and tells me that there is no daily pattern in this data.
The next step in the analysis is to apply a predictive model to the weekdays and weekends separately. In both cases, I apply an exponentially weighted moving average (EWMA). This calculation weights more recently occurring data more heavily in the calculation of an average. Trends and events in the past have less and less of an effect on future predictions. It’s a very simple calculation to do in pandas:
def ewma_outlier(tsdf, stdlimit=5, span=15):
    tsdf['conns_binpred'] = pd.ewma(tsdf['conns'], span=span).shift(1)
    tsdf['conns_binstd'] = pd.ewmstd(tsdf['conns'], span=span).shift(1)
    tsdf['conns_stds'] = ((tsdf['conns'] - tsdf['conns_binpred']) /
                          tsdf['conns_binstd'])
    tsdf['conns_outlier'] = tsdf['conns_stds'].abs() > stdlimit
    return tsdf
For time series that show no daily pattern, such as the weekend days of the example data we’ve been working with, I calculate the moving average and standard deviation and flag outliers when the actual data is a certain number of standard deviations away from the average. This procedure works best for data that does not vary significantly over time. It does not work as well when predictable daily patterns are present. In this case, the moving average lags the actual data in a predictable way that I should be able to account for. I’ve been calling this method a “stacked EWMA” because I group the data by time of day and stack each day on top of another. The next scatter plot shows the data stacked in this way.
Each vertical line corresponds to the number of connection responses occurring during a certain time of day over the span of about three weeks. Now I track the EWMA of the data in each of those vertical lines separately. This is illustrated in the next plot.
Here, the number of connection responses between 8AM and 8:30AM are expanded over the range of days on which they were collected. The green solid line shows the EWMA calculated from just those points and the dashed green line shows the edges of the prediction window. The same analysis is carried out for each time of day bin. After it’s completed, I have a prediction window for each bin that’s based on what’s happened at this time of day over the previous days and weeks. Here is the code that completes this stacked analysis:
def stacked_outlier(tsdf, stdlimit=4, span=10):
    gbdf = tsdf.groupby('timeofday')['conns']
    gbdf = pd.DataFrame({'conns_binpred': gbdf.apply(pd.ewma, span=span),
                         'conns_binstd': gbdf.apply(pd.ewmstd, span=span)})
    interval = tsdf.timeofday[1] - tsdf.timeofday[0]
    nshift = int(86400.0 / interval)
    gbdf = gbdf.shift(nshift)
    tsdf = gbdf.combine_first(tsdf)
    tsdf['conns_stds'] = ((tsdf['conns'] - tsdf['conns_binpred']) /
                          tsdf['conns_binstd'])
    tsdf['conns_outlier'] = tsdf['conns_stds'].abs() > stdlimit
    return tsdf
This last plot shows the final result when the weekday and weekend models are executed and combined in the same figure. Daily patterns are predicted and accounted for. Flat periods during the weekends are well tracked. In further testing, this prediction model proved very robust across different types of time series.
In the future, I’d like to create some metric for judging different prediction models that adequately penalizes for false positives and false negatives. I’d also like to further experiment with ARIMA (autoregressive integrated moving average) models and automatically finding repeated patterns instead of counting on them occurring in daily and weekly time spans. Also, a different technique will probably be necessary for time series with low statistics.
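For what it’s worth, a first ARIMA experiment might look something like this sketch with statsmodels. The (1, 0, 1) order is an arbitrary placeholder, and none of this is part of the pipeline described above:

from statsmodels.tsa.arima_model import ARIMA

# Fit an ARIMA model to the connection counts and produce a one-step-ahead
# forecast along with its standard error and confidence interval.
model = ARIMA(tsdf['conns'], order=(1, 0, 1))
results = model.fit()
forecast, stderr, conf_int = results.forecast(steps=1)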
Thanks so much for reading. I hope you’ve learned a bit about the simplicity and power of working with the scientific computing stack in Python and its applications to network security data. I’ve posted the slides from which this material was taken here.
Report Analysis: A Data-Driven Approach to a Cyber Security Framework
by Andrea Little Limbago
On Monday, I attended the rollout event for former Secretary of the Navy Richard Danzig’s most recent report: “Surviving on a Diet of Poisoned Fruit: Reducing the National Security Risks of America’s Cyber Dependencies.” The report provides eight recommendations to help the government better position itself in light of the abundance of cyberspace challenges. Danzig’s recommendations tackle a range of topics, from federal workforce challenges to the trade-offs between functionality and security. While the main recommendations were thought provoking, Danzig arguably placed the most important portion of his paper in the appendix, meaning it was overlooked during the discussion and likely by readers as well. Danzig notes in the appendix that “there is no reliable data upon which to make decisions.” This is an extraordinarily important point that conceptually applies to the majority of his eight recommendations, but is generally overshadowed by the emphasis on more practical considerations.
The global community is dealing with one of the greatest technological disruptions in history, but, as Danzig argues, policymakers and analysts lack data and metrics upon which to make informed decisions. Both the public and private sectors are operating completely blind when it comes to cyberspace. This enables self-interested organizations and individuals to make claims that cannot be falsified. Based on Popper’s criterion of falsifiability, cyberspace currently resides in the realm of conjecture as opposed to scientific research. While moving cyber into the realm of scientific research may seem like merely an academic exercise, the absence of falsifiability implies that the public and private sectors are spending an exorbitant amount of money in the cyber domain based on assumptions that may or may not be true. In fact, as Danzig notes, assessments “are unconstrained in reflecting their political, ideological and commercial agendas rather than logical inferences.”
While problematic, this phenomenon is not distinct from other periods of technological shock that similarly lacked the data standardization and construct validity required to assess the impact of the changes. For instance, during and in the aftermath of World War II, the first quantitative studies emerged that attempted to understand the great shock that had just occurred to the international system. Lewis Fry Richardson (Statistics of Deadly Quarrels) and Quincy Wright (A Study of War) pioneered quantitative research focused on understanding the causes and consequences of war. Their work paved the way for additional formal modeling and quantitative analysis that helped shape Cold War theories and policy approaches, blurring the line between complex, quantitative analytics and policymaking and grand strategy.
It took a shock to the international system to spark innovation in the realm of conflict and security studies. The creation and expansion of cyberspace is similarly a shock to the international system today, but we have yet to see this same level of innovation in the realm of cyberspace and the data prerequisites that make a cybersecurity framework possible. Where could this theoretical and technical innovation come from? Danzig’s sixth recommendation highlights the fact that cybersecurity is not just a technical problem, but a social and behavioral problem as well. In short, it requires insights from various disciplines to help make sound diagnoses and prescriptions for cybersecurity. Interestingly, the breakthrough in conflict studies did not come solely from within political science, but rather benefited from the multi-disciplinary research of its early pioneers. As the report discussion highlighted, it is very likely that the breakthrough in our comprehension of cybersecurity will not come solely from technologists, but from interdisciplinary practitioners who can help construct and evaluate the relevant data and its impact on the operating environment.
Until that happens, as Danzig notes, cybersecurity will remain fragmented, with decisions made in the dark. Absent an interdisciplinary, data-driven approach to crafting a coherent cybersecurity framework, the pendulum will continue to dramatically swing between fear mongering over a “cyber Pearl Harbor” at one extreme and a blissful ignorance of the reality of potential cyber threats at the other. Decision-makers rely on information that is, according to Danzig, “indeterminate, inconsistent, over-interpreted or all three.” He’s absolutely right, but this must change. Cybersecurity is long overdue for a data-driven framework – crafted by technologists and non-technologists alike – that can assist decision-makers as they grapple with the challenges of the dynamic cybersecurity environment.
Securing the eCampus: Ten Observations About Cyber Security in Academia
by Nate Fick
I recently gave the keynote address at “Securing the eCampus,” a gathering of university CIOs and CISOs hosted by Dartmouth College. Drawing on my fifteen years of experience in the kinetic security world, running a security software company, and serving on the Board of Trustees at Dartmouth, I offered ten observations on the cyber landscape, especially as it pertains to academic environments:
1) Most of us are creatures of analogy when it comes to cyber – we’re all, to some extent, steering by looking in the rear-view mirror. So how did we get here? I think we need to look back to the 1991 Gulf War, after which the Chief of Staff of the Indian Army said that the great lesson was “don’t pick a fight with the U.S. unless you have nuclear weapons.” By dominating the middle of the spectrum of conflict – WW2-style tank-on-tank warfare – we drove our adversaries to the poles: Afghanistan and Iraq on the “low” (insurgency) end, Iran and North Korea on the “high” (nuclear) end. Cyber is where the two meet: there are low barriers to entry to achieving capabilities that can have global impact – very cheap, somewhat easy, and extremely powerful.
2) The cyber world is characterized by four vectors moving in the same direction but at different speeds:
- Technology is progressing very quickly;
- Social norms are evolving just behind that technical frontier;
- Then there’s a huge gap to where the law is trying to keep up;
- Slowest of all, policy formulation within the context of the law remains the most stagnant domain.
Most of the really interesting things are happening in that gap between technical feasibility/new social norms and the law/policy playing catch-up.
3) Cyber is blurring the lines between states and commercial actors, rendering conflict possible between any combination of states, state proxies, non-state actors, and commercial companies. The old model of states solely attacking states and companies spying only on companies is obsolete (the Iranian attacks on American banks a few years ago were a wake-up call in the popular consciousness). Most enterprises, both federal and commercial, now face similar challenges in cyber security.
4) The line between offense and defense is similarly nebulous, driven by technical evolution and also by the dissolution of the perimeter. The old paradigm was American football: 11 guys on the field playing ‘O’ and then 11 guys on the field playing ‘D’. The new paradigm is a lot more like European football: whether you’re on offense or defense depends upon which direction the ball is traveling on the field. It’s a faster and more dynamic world. (Just to be clear: I’m very much against private companies conducting offensive cyber operations…in addition to the legal issues, my time in the kinetic security world as a Marine left me with a strong appreciation for escalation dominance: don’t pick a fight you can’t win, and I don’t know of any company that can possibly win against a state or state-sponsored adversary.)
5) A relentless increase in connected devices, from roughly 9B connected things today to 40B connected things by the end of this decade, will greatly strain an eCampus security environment. Connecting 1B devices per quarter for the foreseeable future means massively increasing device proliferation, data proliferation, and network complexity. Just in the campus environment, for example, the average student today has 3 devices and the average student in four years will have 5 devices.
6) There are no more fortresses: the perimeter isn’t just gone, it’s been burned to the ground. Mobility/device proliferation, cloud migration, and the vastly increasing attack surface of the Internet of Things (remember that the attack vector in the recent Target data breach was the HVAC system…) mean that all this PII is flowing out to IaaS and cloud applications. Security teams need to gain and maintain visibility across infrastructures they don’t own and cannot control.
7) The security industry has a persistent and growing talent gap. Universities can help with the supply side of that equation through STEM education, but that takes a long time, so let’s also focus on the demand side by building tools that are easy to use. Can the industry bring consumer-style ease of use to security? Can we bring all the interesting things that are happening in analytics and visualization to bear on this problem? And incidentally, can we make the adjustment to realize – and I’m making a prediction here – that the Target breach and its aftermath will be looked back on as a watershed moment? A public company CEO was fired by the board because of a data breach. These issues will be less and less delegable to the CISO by a President or CEO, less and less delegable to a communications officer by a commanding general, and so we as an industry need to find a way to present what’s happening in a format that is digestible in the C-suite.
8) We must move from threat-based analysis to risk-based security intelligence. Universities are not immune from the cyber threat, but the degree of risk varies significantly depending on the source of attack and the kind of target. Let’s just postulate that in academic environments Russian intrusions typically target personally identifiable information, while Chinese attacks generally target biochemistry and engineering research. Some universities are implementing frameworks to establish data location and sensitivity – mapping exactly where all of their research and data are stored, and then color-coding it according to sensitivity. Because university networks are so porous and global, it’s often difficult to even recognize a breach attempt. For global universities that experience daily connections from countries around the world, nothing is an anomaly. We need to move towards:
- Reducing noise by focusing on relevance and risk instead of on arbitrary alerts.
- Extending visibility to every aspect of a digital domain instead of only behind the false security of a perimeter fortress.
- Empowering users to do data-driven exploration instead of relying only on PhD data scientists and mathematicians. Lead with insights, not with data.
This all starts to define a new aspect of security, something like “Security Intelligence” – the real-time collection, normalization, and analysis of the data generated by users, applications and infrastructure that impacts the IT security and risk posture of an enterprise. The goal of Security Intelligence is to provide actionable and comprehensive insight that reduces risk and operational effort for any size organization. None of that is especially controversial in the world where I typically operate – corporations and the government. But I put on my Dartmouth Trustee hat and think, “wow, real-time collection and analysis of user-generated data” is going to raise eyebrows in academic environments, which gets to the ninth observation…
9) Privacy and security cannot be in opposition to one another. First, we need to de-couple privacy and civil liberties in this conversation (historically, in America, I would argue that we have seen vastly expanding civil liberties even as privacy has diminished considerably – which isn’t to say that privacy doesn’t have great value; it’s just something a bit different). Second, we’ve been trading privacy for convenience for a long time – think not just of social media and email but about the first person who dictated a communication to a telegraph operator instead of sending a more private letter. But nonetheless, there’s a cultural aspect to academic communities that will inevitably be concerned with security adversely impacting intellectual freedom. The very purpose of these institutions is to promote academic and intellectual freedom, learning, and sharing. Unlike the tight controls that most enterprises employ, university technology systems are very porous. Because of their design (or intentional lack thereof…), university systems are often less secure: security and ops teams don’t manage the entirety of the systems that make up the network; students frequently oversee the assets they use; systems may be deployed and changed without much formal oversight and sometimes without IT’s knowledge. So I’ll conclude with a very simple 10th observation, which is only that…
10) Any workable management of these challenges is mostly human, not technical. Universities have gone through cultural transitions over the years as physical security has become a greater concern. Campus shootings have led to the widespread adoption of measures such as making students wear and show IDs, closing access to certain facilities at certain times of day, posting security guards at campus entrances, and more complex monitoring and alerting systems, while high-profile cases of sexual abuse or assault have led to increasing calls for background checks for employees and faculty. There may now need to be a cultural shift in how digital security is viewed as well—not as an intrusion, but as a measure of protection. Until this cultural shift happens, there will be continuing barriers to adopting better digital security measures on college campuses.
New Internet Hot Spots? Neighborhood Effects and Internet Censorship
by Andrea Little Limbago
During the 2011 London riots, the local government called for a ban on BlackBerry Messenger Service, a key form of communication during these events. Following the riots, Prime Minister David Cameron considered a ban on social media outlets under certain circumstances. Last year, Irmak Kara tweeted as events unfolded during the Gezi Park Protests in Turkey - now, she is on trial and faces up to three years in prison for those tweets. Last month, Iran sentenced eight citizens to a combined total of 127 years in jail for posting on Facebook. At the same time, Iran’s leaders continue to use social media outlets such as Facebook, Twitter, and Instagram. This apparent contradiction highlights the often Janus-faced nature of cyber statecraft. World leaders employ cyber statecraft domestically to exert control over their citizens as well as to propagate their messages and communicate. But which states are more likely to censor and restrict access to the Internet? On the surface, this seems like a fairly straightforward question - clearly, democracies must censor less than authoritarian regimes. However, as these brief examples illustrate, global politics is rarely so straightforward. Spatial patterns may in fact impact the likelihood of Internet censorship more consistently than a state’s domestic attributes. While factors such as regime type, level of economic development, and Internet infrastructure undoubtedly play a role, a look at the spatial data highlights that a neighborhood “hot spot” effect may be a predominant force in a state’s propensity toward Internet censorship.
Hot spots traditionally refer to the geographic clustering of a given event, such as conflict, democracy, or terrorism. Analysts who study hot spots argue that geography – and its diffusion effect – has a stronger impact on the occurrence of these events than domestic factors. Internet censorship may be a likely addition to the ‘hot spots’ literature. An initial investigation of geospatial data shows visible geographic clustering of Internet censorship and freedoms. However, the same linear relationship is not necessarily true between several predominant domestic indicators and Internet censorship. To evaluate these relationships, I reviewed the following indicators for 2013:
- Regime type: Polity IV’s twenty-one-point ordinal measure ranking states from authoritarian to anocratic to democratic regimes.
- Economic development: World Bank GDP per capita (PPP).
- Internet penetration: Percentage of individuals using the Internet, from the International Telecommunication Union.
- Freedom on the Net: Freedom House’s ranking of countries as Free, Partly Free or Not Free with regard to Internet freedoms, as well as the Web Index’s freedom and openness indicator.
The obvious hypotheses assume that democratic regimes, greater economic development, and greater Internet penetration would be inversely related to Internet censorship. However, that’s not always the case. Let’s take democracy. While all but one country ranked as ‘Free’ (minimal or no Internet censorship) is also a democracy (Armenia is the outlier), not all democracies are ranked as ‘Free’. For example, Turkey, Brazil, South Korea, Mexico, Indonesia, and India are all ranked as ‘Partly Free’ for Internet freedoms, even though Polity categorizes them as democracies. In the realm of Internet freedoms, they join authoritarian countries like Azerbaijan and Kazakhstan as only ‘Partly Free’. The nebulous world of the anocracies is even more haphazard with various illiberal democracies exhibiting a range of censorship characteristics. In short, the countries with the greatest Internet freedoms are more likely to be democracies, but democracy does not guarantee the presence of Internet freedoms.
Similarly, economic development does not appear to be correlated with Internet censorship. Countries that display the greatest Internet censorship (i.e. ‘Not Free’) range from Ethiopia (with a GDP per capita of roughly $1300) to Saudi Arabia (with one of the world’s highest GDPs per capita). On the other end of the spectrum, countries with greater Internet freedom (i.e. ‘Free’) range from Kenya and Georgia (~$2200 and $7100 GDP per capita, respectively) to economic leaders such as the United States, Australia, and Germany. The data shows that there are plenty of instances of Internet censorship on both ends of the economic development scale, and the same is true for Internet freedoms.
Finally, it seems intuitive that Internet penetration would be inversely related to Internet censorship. States that censor the Internet seem likely to also impede the development of Internet infrastructure and hinder access. Again, this may not be the case. In the ‘Free’ countries of the Philippines, Ukraine, and South Africa, only 35-50% of the population has access to the Internet. This is the same level of Internet penetration found in the ‘Not Free’ countries of China, Uzbekistan, and Vietnam. Even at the higher levels of Internet access (~85-95%), one finds countries like the United Arab Emirates and Bahrain (Not Free) as well as Iceland and Japan (Free).
In short, many of the usual suspects such as regime type, level of economic development, and Internet penetration may not have as linear an impact on Internet censorship as is commonly assumed. Conversely, the spatial patterns (shown in these interactive maps from Freedom House and Web Index) seem quite apparent with regard to Internet censorship. For example, Africa exhibits discrete clusters of both openness and censorship, as does Asia, while Western Europe and the Middle East exhibit larger, regional clustering patterns at extreme ends of the censorship spectrum. There appears to be a neighborhood effect that may in fact more consistently influence a state’s likelihood of Internet censorship than these domestic factors.
This initial look at the data highlights the need to more rigorously test many common assumptions about Internet censorship. Comprehensive quantitative analysis using spatial statistics modeling techniques could be applied to further test these hypotheses and evaluate the cross-sectional and temporal trends. These models should include additional control variables such as education levels and urbanization, temporal lags, as well as explore the potential for interactive effects between geography (i.e. contiguity) and some of the domestic factors discussed here. Until then, there’s a chance that global Internet ‘hot spots’ may soon become just as synonymous with Internet censorship as they are with Internet access.
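As an illustration of what such a test could look like, global spatial autocorrelation in censorship scores can be measured with Moran’s I in a few lines of PySAL. The shapefile and attribute names below are hypothetical placeholders for a real country-level dataset:

import numpy as np
import pysal

# Queen-contiguity weights: two countries are neighbors if they share a border.
w = pysal.queen_from_shapefile('countries.shp')
db = pysal.open('countries.dbf')
y = np.array(db.by_col('CENSOR'))  # per-country censorship score
mi = pysal.Moran(y, w)
# A significantly positive I indicates neighborhood clustering ("hot spots").
print(mi.I, mi.p_norm)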
We’ll be posting a follow-up to this analysis that contains our own visualizations to enable interactive exploration of the various indicators, so keep an eye out for a more in-depth look at some of this data.
Black Hat Decomposed: Perspectives from a Social Scientist
by Andrea Little Limbago
This week I attended my first-ever Black Hat conference. As a social scientist, I was very intrigued to actually experience the culture of the conference, but anticipated being overwhelmed by the technical nature of the presentations. Little did I know that I was in for a very pleasant surprise during yesterday’s keynote address by Dan Geer, CISO at In-Q-Tel, entitled “Cybersecurity as Realpolitik”. While it appropriately remained very focused on the cybersecurity landscape, I couldn’t help but notice the underlying references and parallels that he drew from my world – the world of international relations and conflict studies. Below are a few of my favorite highlights of the presentation. Please keep in mind that these are based on my own note taking and perceptions – captured with a pen and paper, which I now know is clearly anomalous behavior.
1. Expected utility & rational choice – Before diving into his main argument, Geer referenced the use of funds to buy off adversaries during the Iraq War. This is improbable in the cyber landscape of non-attribution, but he notes the role of incentives in shifting the cost-benefit calculus, as occurs in expected utility models. This notion of assuming rational actors and how to incentivize them reemerged during his discussion of vulnerabilities. This time he hypothesized ways to incentivize the security community to reveal known vulnerabilities, including the US government providing enormous financial rewards for disclosure. Geer references an Atlantic article, which notes that financial incentives will only work if the vulnerability landscape is sparse (as opposed to dense, or plentiful). In a sparse landscape, an individual vulnerability represents a larger proportion of the entire population of vulnerabilities, whereas in a very dense population revealing a single vulnerability would have little impact. In each of these instances, the key focus is on how to change a person’s behavior by understanding and evaluating their risk preferences and motivations.
2. Cybersecurity’s unholy trinity? – In international economics, the unholy trinity (aka the Mundell-Fleming model) represents the trade-offs between open capital flows, a fixed exchange rate, and an independent monetary policy. A state can only pursue two of the three – with each scenario inherently inducing distinct constraints on policy decisions. This came to mind as Geer noted that in cyberspace, a similar choice must be made between freedom, security, and convenience: only two of the three can exist simultaneously. Unfortunately, users demand all three, and from my perspective, security frequently seems to be the lowest priority. This could very well be due to an inability to quantify the return on investment for security…which gets back to point number one.
3. Second-strike – Although Geer used the term ‘strike back’, it maps directly onto the concept of second-strike, which gained prominence during the Cold War as a key component of mutually assured destruction – a posture that rests on a state’s ability to strike back after being hit by a nuclear weapon. Geer adopts this concept in his discussion of cyber smart bombs, which he argues are extraordinarily difficult to build due to non-attribution within the cyber domain. Instead of focusing on second-strike, Geer argues that actors (organizations, states, individuals, etc.) should focus on recovery, given the costs and adversarial identification that cyber smart bombs require.
4. A little bit of Churchill – Winston Churchill famously noted (to paraphrase) that democracy is the worst form of government except for all of the others. Geer provides the cyber version of this quote when he states (again, to paraphrase) that open-sourcing abandoned codebases is the worst option except for all of the others. This was part of Geer’s discussion of abandonment, when organizations no longer provide security updates for older versions of their software. As the quote insinuates, this remains a problematic and lingering aspect of cybersecurity without many effective solutions.
5. Convergence or divergence? – Scholars have long debated whether the international system is moving toward a unified system with an overarching international government or whether it will devolve into an anarchic, fragmented system. Geer also draws this analogy to cyberspace and asks whether it is headed toward a single system or a heterogeneous one broken into manageable chunks. While convergence is the natural flow, there could be significant power struggles over the terms on which any unification occurs.
6. And a little bit of Weber – In a discussion of the growing inability of organizations to protect themselves, Geer referenced a Bloomberg article (possibly this one) that discussed a call by some of the large banks for government assistance with cybersecurity. Geer highlights the dependent nature of this relationship, wherein the only actors powerful enough to protect these multi-national corporations are those with a monopoly on the use of force. According to Max Weber, this monopoly on the legitimate use of force is one of the defining characteristics of a state. This is an interesting juxtaposition after so much discussion of the demise of the nation state due to the borderless nature of cyberspace (as I’ve discussed in a previous post).
7. Complexity – In some of his concluding comments, Geer addressed the complex nature of cyberspace. The dynamism, scale, speed and scope of cyberspace – not to mention the intersection of the virtual and physical worlds – all compound to shape its complexity. While clearly there are differences, many of the same tenets exist in systems theory and structural functionalism. Pioneered by social scientists such as Talcott Parsons and Karl Deutsch, these approaches view social systems as open and complex, and identify the various social functions that together comprise the whole system. In both cyberspace and social systems, complexity remains an underlying attribute, and practitioners and theorists alike will continue to pursue simplicity to advance our understanding of how these systems function.
Final thoughts: Geer began his presentation by highlighting the growing role of cybersecurity within the policy domain. While understandably few and far between, I have found a couple of panels that focus on the intersection of policy and cybersecurity, addressing issues such as legal constraints and state-sponsorship of malware. For clearly selfish reasons, I hope this is just the beginning of a larger trend. As cybersecurity moves increasingly to the forefront of policy – as Geer elucidates – it only makes sense for conferences like Black Hat to keep expanding these kinds of discussions.
Hack Week The Endgame Way
by Andrea Little Limbago, Principal Social Scientist
Several Endgamers attended Black Hat in Las Vegas a couple of weeks ago. Some stayed and many more arrived for DEF CON. Keeping the theme alive, we just finished up this summer’s Vegas hack week, where engineers, data scientists, product managers, and even a social scientist all gathered to pursue new endeavors outside the normal sprint cycles and Jira tickets. With the team split between San Francisco and DC, it was a great opportunity not only to see if we could quickly hack an idea into reality, but also to spend some quality time together.
The purpose of hack week is to set aside time outside of the office for self-forming teams to pursue new ideas. The week culminates with demos by each team, followed of course by a celebration of all the hard work and great projects that emerged. The projects include product enhancements, internal tools creation, data exploration and validation, and even execution of brand new product ideas. However, there are many intangibles that accompany the week that have a lasting impact on the company’s culture and continued emphasis on innovation:
Knocking out the Tech Bucket List: Everyone has a bucket list, but amid our day-to-day work demands, we don’t always get a chance to fully explore the promising tangents we encounter. During this week, we get the chance to explore new technologies, libraries, data sources, and methodologies. Together, these feed into the continuous whiteboarding sessions where ideas are knocked around within and across teams, with the assistance—of course—of plenty of caffeine.
Failure is an option: This may seem simplistic, and possibly even counterintuitive, but the hack week fosters an environment of risk-taking and exploration. The week provides the opportunity to explore ideas without being burdened with whether they succeed or not. Of course, the goal is not to fail, but more often than not failure may close one door but open another, serendipitously providing insights into new solutions or approaches. In Steven Johnson’s book, Where Good Ideas Come From, he explains, “Innovative environments thrive on useful mistakes…”, and these mistakes are an intrinsic component of exploration and innovation.
Cross-pollination of ideas: The group that gathered in Vegas represents a broad range of expertise and backgrounds. Given this diversity, it’s important to foster an environment that encourages the cross-pollination of ideas within and across teams. More often than not, people use this time to brainstorm with groups other than their day-to-day teams about the projects they’re tackling. In fact, on my hack week team alone we counted contributions from at least three different projects. Since we had an astrophysicist participating, it’s almost essential to throw in a quote from Neil deGrasse Tyson (writing for Foreign Affairs). He notes that “cross-pollination involving a diversity of sciences much more readily encourages revolutionary discoveries.” While we didn’t expect revolutionary discoveries, I certainly saw some real excitement about the breakthroughs that emerged.
Mandatory Fun: In the spirit of Weird Al Yankovic’s latest album, hack week eschews corporate-style team-building events in favor of a more natural (and fun!) environment for getting to know colleagues professionally and personally. This is especially important given all of the new folks attending their first Endgame hack week. It gives each of us some time to demonstrate our own skills, while learning more about the capabilities and backgrounds of our colleagues. We also identified some interesting nocturnal eating habits, and possibly even invented a few new dance moves along the way.
It’s safe to say we accomplished many of these intangible goals for the week. And who knows, it just may be the case that what happens in Vegas won’t stay in Vegas, and will ignite future efforts on our teams back home.
Want to hear more about Endgame Hack Week? Read John Herren’s perspective here.
How We Win Hack Week
by John Herren, Director of Product Design
With outstretched arms and a voice just a tad too loud, I shout, “Welcome to Hack Week!” As a fitting coda to Black Hat and DEF CON wrapping up in the previous days, an enthusiastic group of Endgame software engineers, security researchers, designers, data scientists, devops engineers, architects, and yes, even a few managers and company executives have gathered together in a place where we can all concentrate on our ideas and the Hack, devoid of any outside distractions and temptation to break focus. Literally, we are in the middle of the desert; specifically, The Palms Hotel and Casino in Las Vegas.
In our doublewide conference room, still heavy with the lingering aromas of our breakfast buffet combined with what could only be Red Bull, we’ve just heard our CEO and CTO speak on the importance of innovative culture and the value of looking forward. We’re reminded of the difference between invention and innovation and how there always exists the opportunity to make our products better. Now it’s my turn to speak. I’m playing the role of hype man and evangelist. In contrast to the executives’ strategy points, my intentions are a bit more tactical: how to win hack week.
To call it a hack week is a bit generous. This purposefully abridged gathering is more like hack-two-days. After the morning discussion, we have just about forty-eight hours before the klaxon call of hands off keyboard, minus any time reserved for sleep, or blackjack, or rooftop drinks, or yet another steak. As software engineers, we’re taught to embrace constraints, timetables included. We’re also notorious for over-estimating our abilities when it comes to timelines. I’m interested to see how the short timeframe affects our projects with the addition of all of the variables that this odd city has to offer. More poetically: Beer and Coding in Las Vegas.
The material goal of Hack Week is to produce something useful. The official judging criteria echo this and break down into four aspects. The first of these is potential and longevity: will the project find its way into a new product, feature, conference paper, or open source release? Second is innovation. How new and exciting is the project? How does it solve a problem better? Third, how well does the project solve a real-world security problem? And finally, how functional is your hack? How well is it implemented and does it actually operate? These are great criteria and set us up for some healthy competition. Aside from this, only a couple more rules are in place: the deadline for coding is high noon on Wednesday, and team sizes should be two to four people. By the time we arrived in Vegas, we’d hashed out our ideas on the company wiki and recruited our team members.
I begin my rant with a few pointers. Scope your project carefully. The teams are small, and the timeline is tight. The Internet connection is fast, but the action at the craps tables is faster. Concentrate on the important part of your project and borrow liberally from GitHub for the rest. Don’t let yourself get stuck for too long. Ask your teammates for help when you do get stuck. This is a great rule of thumb for everyday development work, but on an accelerated hack project, you benefit greatly from failing fast and relying on your team’s collective experience. To prove this point, I throw some rapid-fire questions at the group to show the diversity of knowledge among our team. I ask for open source components we can use to solve common problems:
“I need to cluster some data points!”
“I need a CSS framework for grid layouts!”
“I want to update a web page in real time!”
“I want to display a time series!”
“I want to do sentiment analysis on a data feed!”
And just so everyone can get involved,
“I need to store and access a thousand rows of data!”
I can’t stump these folks. They’re shouting answers back as soon as I can get the questions out.
“D3!”
“Socket.io!”
“Scikit!”
“Stanford has… some library!”
A program manager even chimes in “SQLite!” for that last question.
At this point I’m hopping around the room awarding airplane bottles of Jaegermeister and Fireball cinnamon whiskey for correct answers, and when those are all gone, some anthropomorphized scratch and sniff stickers of tacos. They have eyes, mouths, mustaches, sombreros, and guitars. They smell like cumin and stale tortillas.
You can feel a great energy in the room building up. We believe in ourselves and we can solve any problem.
After this silliness, I reiterate the criteria for the competitive win, but my main point is to talk about the real win, the team win. The cultural win. The kind of win that makes the time and resources that go into this production worth it for every stakeholder, even if all of our projects are complete duds.
I stress the importance of the presentation. Spend time on preparing your talk! Tell a story, and provide context and background. Dumb it down for the least technical person in the room (and yes, someone volunteered to serve that role). Then, dumb it down some more. Only then are we ready for your demo.
The goal of Hack Week is collective teaching and learning. We learn about ourselves: how we work together, how we solve problems, and how we react and support one another when we fail to solve problems. To win Hack Week, when we give our twelve-minute presentations, we must reveal that journey as much as we show off the bytes and pixels we manipulate:
How did you come up with your idea?
What was your methodology?
What tools did you use?
What did you try that was different?
What worked, and what didn’t work?
What did you learn?
We win Hack Week by teaching that experience. This is a cultural goal of any company, but just as we can accelerate writing code during Hack Week, so can we accelerate the fusing of our culture.
The next few days were frenzied. The four large, portable whiteboards were quickly commandeered and marked up with checklists, diagrams, and even formal user stories. Some of us pulled all-nighters. Some went out for Japanese curry dishes. Some hacked away persistently with headphones. Another found cookies. The energy never subsided. I caught a frontend engineer whispering to his team, with wide eyes and raised brows, “guys, I want to just absolutely crush this thing.”
Ultimately, team Endgame won Hack Week. The projects presented were varied, and all of them were interesting enough to spark lively, inquisitive, and deep Q&A sessions. They included exploration tools for DNS data, large-scale honeypot deployments and analysis, mobile OS process visualization, and a complete lightweight overhaul of an existing Endgame product. A recently-hired UX designer implemented a solo project, an onboarding application for new hires, which rivaled any of the minimum viable products you’d see from a funded startup on Hacker News. Over the two days, one of our data scientists learned the Angular JavaScript framework, and another frontend engineer learned about process threads on Android. Some of the hacks will find their way into product features. Others will become internal tools. Some will never see the light of day but will prompt discussions for new ideas. Hack Week was an amazing opportunity to have fun, teach, and learn, and I’m already looking forward to the next one. For us, what happens in Vegas stays on GitHub!
Want to hear more about Endgame Hack Week? Read Andrea Limbago’s perspective here.
Endgame is always looking for great talent to join the team. If you enjoyed hearing about Endgame Hack Week, please take a look at our job openings to learn more about careers at Endgame.
The More Things Change...Espionage in the Digital Age
by Andrea Little Limbago
Last week, Der Spiegel reported that the BND – Germany’s foreign intelligence agency – had accidentally intercepted calls of U.S. government officials while collecting intelligence on Turkey. For many, this was an example of hypocrisy in international relations, as German Chancellor Angela Merkel was one of the most vocal critics following the Snowden Affair, which strained relations between the U.S. and Germany. But one can’t help but be struck by the media’s surprise that a country that so vocally spoke out against cyber-espionage also conducts it. The main story should not be an overreaction to the collection behavior (accidental or not) between allies, but rather the evolving nature of state behavior in light of technological change. Each historical technological revolution has altered and shaped not only every aspect of warfare, but also peacetime behavior between states. One of the current manifestations of this adaptation to technological change is the creation of state-sponsored cyber units, potentially for cyber offense and defense alike.
First, and it almost seems ridiculous to note this, but recent events warrant it: espionage is not a new phenomenon. Espionage and intelligence gathering have likely existed since the beginning of time, and were certainly factors in many ancient civilizations, including Egypt, Greece and Rome. Just like today, spying was not purely a characteristic of Western behavior - in fact, Sun Tzu devoted an entire chapter of The Art of War to spying and intelligence collection. As technology changed, the modes of espionage evolved from eavesdropping, to binoculars, to pigeons rigged with cameras, to aircraft and satellites, to today’s hot term: cyber-espionage. While this over-simplifies the evolution of espionage, it’s important to note that throughout history, each technological change has similarly impacted collection procedures.
Moreover, technological innovations in both war and peace simply cannot remain indefinitely under the purview of a single actor. Eventually, other actors imitate and even leapfrog ahead after the first use of the technology. In fact, a striking feature of the Digital Age is the decreasing amount of time it takes for the replication of technological innovation. While it used to take years to copy the technological capabilities of other actors, this time lag has dramatically decreased due to the fast pace of technological change characterizing the modern era. While in the past, some states may have held onto anachronistic technologies, even governments of closed societies are increasingly tech savvy, leveraging the cyber domain to achieve domestic and international objectives.
Knowing that espionage is not a new phenomenon, and that technological copycats have occurred throughout history, the obvious question becomes: Is Germany indicative of a broader trend of states creating organizations devoted solely to cyber security? Below are just a few examples. This list is by no means comprehensive, but it is illustrative of a growing trend as states adjust to the realities of the Digital Age. The roles of the United States and Germany have been covered in significant detail elsewhere, as have the cyber units of major global and regional powers such as Russia, China, Israel and Great Britain. Like most behavior in the international system, the scale and scope of these cyber units vary enormously based on the opportunity (i.e. resources) and willingness of each individual state:
- Australia: The Australian Cyber Security Centre, formerly known as the Cyber Security Operations Centre, is scheduled to open later this year with a large focus on domestic cyber security. The Australian Signals Directorate has been noted as having closer ties with foreign signals intelligence organizations.
- Brazil: The Center of Cyber Defense (CDCiber) brings together the Brazilian Army, Air Force and Navy, but is predominantly led by the Army.
- France: Some note that France lags behind Western counterparts, but it has established the Centre d’Analyse en Lutte Informatique Defensive (CALID). According to Reuters, while this year’s increased spending will go toward infrastructure, a large part of it will also be allocated toward, “building up web monitoring and personal data collection.”
- Nigeria: In light of cyber attacks from Boko Haram, Nigeria is stepping up its cyber security capabilities. Recently proposed legislation focuses mainly on combating cyber crime, and includes intercepting various forms of electronic communication.
- North Korea: Has had cyber units since 1998, the most prominent of which is Unit 121. Just this summer, it was reported that North Korea had doubled its cyber military force to 5,900 cyber warriors. The General Bureau of Reconnaissance is likely home to this growing group of hackers.
- Philippines: Despite some delays in its legal system, the Philippine military has created a cybersecurity operations center, called C4ISTAR. This move followed a series of attacks against Philippine government websites and heightened tensions in the South China Sea.
- Rwanda: Perhaps the most unlikely case, Rwanda has had a cyber unit for quite some time. This summer the Rwandan government announced plans to strengthen the cyber unit’s capabilities.
- South Africa: Maintains a National Cyber Security Advisory Council, and as of last year intends to create a cyber security hub based on its National Cyber Security Policy Framework.
- South Korea: Has had a cyber command since 2010, a likely response to increased cyber attacks from North Korea and elsewhere.
- Even IGOs are getting in on the action – NATO has strategically placed a cyber unit in Estonia, called the cyber polygon base. NATO has already carried out several cyber exercises at this site.
In short, similar to what we’ve seen in previous eras, states are altering their behavior and organizational features in light of technological disruption. This quick overview by no means makes a normative claim about whether the rise of state-sponsored cyber organizations is bad or good for society, but instead highlights a growing trend in international relations. The latest disclosure on German collection efforts is likely indicative of things to come. But how states respond to this trend will vary greatly. Like all technology, there will be those who embrace it and those who reject it. Germany’s suggestion of adopting typewriters (and not the electronic kind) to protect sensitive information and counter cyber-espionage is just one example of how reactionary measures by states may risk sending them back to the technological ice age. What a great way to protect information - because after all, everyone knows that espionage didn’t exist before the Digital Age!
Working Across the Aisle: The Need for More Tech-Policy Convergence
by Andrea Little Limbago
Last week, the White House confirmed that Todd Park is stepping down from his position as the country’s second Chief Technology Officer to move back to Silicon Valley, though he’ll remain connected to the administration. The latest news indicates that Google executive Megan Smith is a top candidate for his replacement. This Silicon Valley/Washington, DC crisscrossing, although rare, is a welcome development, and it comes at a time when the Washington establishment – Republicans and Democrats alike – is becoming increasingly known for its lack of technical acumen. The divide between those who are tech savvy and politically savvy is not only geographic, but is also perpetuated by industry and academic stovepipes. This is especially true in the cyber realm, an area that has vast technical and political implications but where the two communities remain separated by geography, disciplinary jargon, and inward-focused communities.
It’s a safe bet that I was the only person to attend both Black Hat in Las Vegas and this past weekend’s American Political Science Association Annual Conference in Washington, DC (perhaps best known now for the disruptive fire at the main conference hotel). If anyone else attended both, I’d love to talk to them. For the most part, I was struck by how little acknowledgement was given to cyber at APSA and how little Black Hat addressed the impact of the foreign and domestic policy issues that greatly impact the future of the cyber domain. Each conference should continue to focus on its core expertise and audience, but the increasing interconnection of cyber and policy can’t continue to be brushed aside. For its part, Black Hat had exactly three policy-related presentations out of roughly 120, and that is based on their own coding schema, which seemed accurate. APSA didn’t do any better – three panels had ‘cyber’ in their title, three had ‘Internet’, and maybe two dozen had ‘digital’, although these really only used it as a synonym for the modern era and had nothing to do with technology. To put this in context, APSA generally has over 1000 panels, and the theme this year was “Politics after the Digital Revolution.”
Why does this even matter? During one of the APSA panels (one of the three that addressed cyber), an audience member asked what political science has to do with cyber studies. I viewed this question as similar to someone in the 1950s asking what political science has to do with nuclear engineering. Clearly, they are distinct domains with distinct experts, but policymakers (and those informing them) cannot simply ignore major technological breakthroughs. Especially in the cyber domain, policymakers currently employ cyber statecraft as both sticks and carrots, but lack a body of literature that explicates the impact of these relatively new foreign policy tools. Similarly, engineers and computer scientists focusing on cyber security may find themselves increasingly affected by legal and political decisions. In fact, based on the surprisingly large attendance I saw at the panels on policy at Black Hat, the tech community seems quite aware of the large impact that policy decisions can have on them.
There is room for cautious optimism. At Black Hat, I attended a panel on “Governments as Malware Authors”. It was an interesting, tactical overview of various malware attacks by a range of governments, many of which were not the usual suspects. Similarly, the APSA panel on “Internet Politics in Authoritarian Contexts” provided a great overview of the myriad ways in which authoritarian regimes employ a diverse range of cyber tools to achieve their objectives, including censorship and DDoS attacks. These two panels covered many similar topics, but with strikingly different methodological approaches and data. It would be phenomenal to see these two groups on one panel. I’d argue that panel would produce exactly the kind of information policymakers could actually use.
Similarly, at the beginning of an APSA panel on “Avoiding Cyber War”, I met one of the panel members. When he learned I worked at a tech company, he quietly admitted, “I’m not a political scientist, I’m really a hacker.” To that, I responded, “I’m not really a hacker, I’m a political scientist.” It would be wonderful to see these two perspectives increasingly collaborate and explore each other’s main venues for intellectual innovation. This small but impactful step could finally provide policymakers the insights and technological information needed to remedy the dearth of tech acumen within the policy domain. The tech community also must be increasingly willing to contribute to the national debate on all of the technology issues that will continue to impact their lives and businesses.
Next year, APSA will be in San Francisco, which presents an exciting opportunity for this kind of collaboration. It would be great to see more panels featuring technology, and more specifically, new analyses of cyberspace and statecraft. Of course, short of a miracle, APSA will have to work on some marketing for that to happen. I, for one, welcome the day when APSA abandons the nylon Cambridge University Press bags it gives away in favor of an APSA sticker or decal that political scientists (and maybe even an engineer or two) can proudly exhibit on their Macs.
Cyber Defense: Four Lessons from the Field
by Casey Gately
In cyberspace, as in more traditional domains, it’s essential to understand both your enemy and yourself. A comprehensive defensive strategy requires a better understanding of the motivations and intent of an adversary, but this is exceedingly difficult due to non-attribution and the complex nature of cyberspace. It’s safe to say that most organizations don’t actually have the required tools, knowledge, mission set, scalability or authority to incorporate analysis of the adversary into their cybersecurity frameworks. But as I’ve experienced, thinking like an adversary and internal analysis of your network and assets are both essential components of cyber defense.
Recently, my colleague Jason and I attended and presented at the 2014 Malware Technical Exchange Meeting (MTEM). MTEM is the annual malware technical exchange event that brings together practitioners and researchers from industry, the FFRDCs (federally funded research and development centers), academia, and government to present and discuss all things malware. MTEM presentations typically focus on malware analysis at scale, incident response, trend analysis, and research, but this year’s theme was more specific: “Evolving Adversaries: Stories from the Field”. The goal was to exchange information on technical and policy issues related to evolving threats, with a focus on presenting new methods for analyzing malware more quickly and effectively and on sharing success stories from the field. Below are four key insights that I’ve gained from my experience at conferences like MTEM and from cyber exercises:
1. Know your network: Today’s cyber defenders must know their network. They need visibility into all assets – operating systems, users, endpoints, and mobile devices – as well as knowledge of normal network behavior. Unfortunately, this isn’t always the case. There are some organizations where the defenders and incident responders have extremely limited access to and visibility into their own network. They are mostly blind, relying solely on anti-virus software, firewalls, and host-based detection systems. A situation like this could have detrimental consequences. For example, if the defenders only saw sensor-detected “known bads”, an attacker could leverage that by deploying low-level, easily detectable malware that would keep the defenders occupied while the attackers carried out their most nefarious acts. In order to proactively defend against the adversary in real time, defenders must seek and obtain ubiquitous presence within their own protected cyber space.
2. Think like the adversary: Defenders must also think like an adversary, which goes above and beyond just monitoring anti-virus tools, IDS alerts, and firewall logs. To truly protect themselves, defenders must understand the aggressor tactics that adversaries will use. For example, once an attacker gains access to a victim network, they’ll most likely conduct reconnaissance to learn the lay of the land. This could reveal some of the defensive tools deployed, enabling the attacker to circumvent them. The attackers’ recon mission could also reveal additional credentials, allowing an attacker to burrow further into the network. Defenders also have to remember that an attacker is not static; the most aggressive attackers will evolve and try new methods to find the most valuable assets. To effectively defend the most critical data networks and level the playing field, defenders must truly think like the adversary. Our MTEM presentation focused on this theme of an evolving adversary and drew on experiences from a recent cyber exercise. The presentation included various network and defender techniques, demonstrating the utility of thinking like the adversary to proactively deter intrusions.
3. Prioritize: A good defense requires organizations to prioritize their most valuable assets, incorporating both what is most valuable to the organization but also what may be deemed most valuable to an adversary. Realistic defensive teams will categorize all of their assets, from the “crown jewels” all the way down to the “postcards at the checkout stand”. To set this in motion, simply put yourself in the mindset of the attacker and ask, “What do I really want or need from this organization?” The answer is most likely where the attacker will try to land. Armed with this information, efforts can be implemented to protect that data and/or alert a defender when someone (or something) tries to access it.
4. Automation & Contextualization: Automation is an essential component of defense, but alone it is not enough. Because today’s attackers use automated techniques to expedite their attacks, manual defensive measures alone will almost certainly prove inadequate. Automated technologies that incorporate contextual awareness are key to maintaining situational awareness and strong cyber defense.
And before I sign off, I’d like to leave you with one more thought. It was something a LtGen told a group of us analysts 10 years ago. Regarding counterterrorism, he said, “We have to throw a strike with every pitch while terrorists only need a single hit.” I believe this same sentiment holds true in the world of cyber defense. An attacker only needs a single success to produce catastrophic results for a victimized network or organization. In cyberspace, a good defense requires the ability to anticipate the adversary and continually evolve your defense accordingly.
Article 5.0: A Cyber Attack on One is an Attack on All? (Part 1)
by Andrea Little Limbago & John Herren
NATO leaders gathered in Wales in early September to address a variety of security challenges, culminating on September 5th with the Wales Summit Declaration. It is no wonder that the summit of an alliance formed 65 years ago did not garner much media attention. With all of the current crises monopolizing the news cycle – the expanding powerbase of ISIS in Iraq and Syria, the Ebola outbreak in West Africa, and the tenuous ceasefire between Ukraine and Russia – little attention has been devoted to a potentially major policy shift within NATO that could have long-term global implications. For the first time, NATO has determined that cyber attacks can trigger collective defense. This shift is particularly important now since offensive cyber behavior is on the rise in Eastern Europe, and Georgia and Ukraine are still being considered for NATO expansion.
NATO’s influence and even existence have been questioned since the dissolution of the Soviet Union in 1991. With over a decade in Afghanistan, NATO largely shifted its focus to counterinsurgency capabilities, virtually rendering the collective defense aspect of NATO obsolete. NATO members have not prioritized the alliance, which currently boasts an old and decrepit infrastructure, as resources were devoted to Afghanistan and not Europe. Article 5 provides the bedrock of the alliance, explicating the notion of collective defense – an attack on one is an attack on all. As the below map demonstrates, over the last 50 years NATO collective defense has slowly crept toward the Russian borders, and now includes the former Soviet states Estonia, Latvia, and Lithuania. This creeping expansion is often cited as inciting Russia to engage in a series of conflicts in Estonia, Georgia, and now Ukraine. Others believe that Russian President Vladimir Putin’s megalomaniacal infatuation with rebuilding the Russian empire fuels his expansionist appetite, including his wide use of the cyber domain to achieve political objectives. With the rising tensions and realpolitik emerging between Russia and several former Soviet states and satellites, NATO leaders have come to the realization that the modern international system now includes an entirely new domain that can’t be ignored – cyberspace.
Russia’s current adventures in Ukraine likely influenced this timing, but the increased use of offensive cyber statecraft in Eastern Europe over the past several years has clearly crossed the tipping point, such that policy is slowly catching up to the realities of international relations. The inclusion of cyber as a catalyst for collective defense brings to the forefront a series of technical and policy issues that must be parsed out in order to truly give this newest addition to Article 5 some teeth. On the policy front, the Wales Summit Declaration notes, “A decision as to when a cyber attack would lead to the invocation of Article 5 would be taken by the North Atlantic Council on a case-by-case basis.” This extraordinarily vague criterion must be made more specific, not only to assuage concerns of NATO’s Eastern European members, but also to signal externally what kind of cyber behavior may actually incur a kinetic response.
Signaling is just as important today as it was during the Cold War, and for policies to be taken seriously, there must be some sign of credible commitment on behalf of member states. The cyber domain is fraught with attribution issues, making the practical aspects of this even more challenging. The Russian group CyberVor has been linked to the theft of passwords and usernames, while a group dubbed DragonFly is possibly responsible for producing the malware Energetic Bear. Energetic Bear was created as a cyber-weapon, crafted to monitor energy usage and disrupt or destroy wind turbines, gas pipelines and power plants. Energetic Bear, similar to other offensive cyber behavior in the region, exhibits characteristics that lead many to infer it is state sponsored, but proving that is extraordinarily difficult in cyberspace. It is important to note that Energetic Bear, unlike many more publicized examples of Russian state-sponsored cyber attacks, mainly targeted Western European countries. The notion of NATO collective defense against cyber is not solely an Eastern European problem.
All of this raises the question: Is it technically possible for NATO to create a cyber umbrella of collective defense around its members, just as the nuclear umbrella protected them during the Cold War? We’ll tackle this question in two additional blogs that address the technical difficulties associated with the cyber aspect of the Wales Summit Declaration. NATO’s inclusion of cyber attacks has long-term implications for the international system, signaling a return to major power politics and realpolitik. Instead of billiard balls crashing in an anarchic world system, we may now be moving to a world where power politics means binaries crashing in cyberspace.
Article 5.0: A Cyber Attack on One is an Attack on All? (Part 2: Technical Challenges of a Mobile Cyber Umbrella)
by Adam Harder
Mobile phone networks are prime targets for a cyber attack, and governments large and small are in a particularly powerful position to execute such an attack on another country. Given the September 5th NATO Wales summit resolution declaring that cyber attacks can now trigger collective defense, how will the “cyber umbrella” extend to the mobile and telecommunications domain? This is not a hypothetical scenario, but already has significant precedent. State actors have conducted cyber-espionage (most recently in Hong Kong) and cyber-sabotage of critical infrastructure. We’ve never seen a significant sabotage operation against mobile phones themselves, but the phone network has long been an infrastructure target in traditional wars going back to the age of the telegraph. Just this spring Russia conducted denial of service attacks targeting the Ukrainian government’s phone system. So what are the technical challenges associated with the NATO Wales summit resolution?
Some Background
Mobile phone networks are culturally distinct and radically different from the Internet. First, there is an asymmetrical relationship between phone and Internet communication because there are no web servers on the phone network. While phones can reach the Internet, computers on the Internet cannot directly touch your phone. This provides one of the many additional layers of complexity when dealing with malicious cyber activity in the mobile domain.
Second, there also is asymmetry of the markets. The Internet marketplace is often referred to as the “Wild West,” while the mobile network marketplace is extraordinarily oligopolistic. Thousands of Internet service providers exist and anyone can build their own network with some Ethernet cable and a router. The Internet is a wild and chaotic place: e-commerce sites are as easy to connect to as social networks, travel-booking sites, and even your bank. Nobody trusts anybody because any stranger can connect to any web site from anywhere in the world. This is why online accounts have passwords and corporations employ VPNs for additional security. Conversely, only a few companies build telecommunications equipment, which communicates over a collection of esoteric protocols. There are also only a few hundred national and regional phone networks, which are owned and managed by about fifty multinational companies. Phone networks only connect to other phone networks, so only other providers have access. The networks connect to each other through tightly controlled connections managed internally or through trusted third parties.
This oligopoly not only results in higher prices, but has also produced a complex web of protocols and technologies that further differentiate mobile networks from the Internet. To oversimplify, the Internet uses HTTP over TCP/IP, while telecommunications networks communicate via SS7 over SCTP/IP. The reality is even far more complicated than this – SIP should eventually replace SS7, but that’s going to be a very long process and new protocols are drafted every year. The situation has been in flux since the 1990s, and that won’t change in the near future.
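For readers who think in code, this split is visible even at the socket layer. The snippet below is only a minimal Python illustration – it creates the two transports side by side, speaks neither HTTP nor SS7, and assumes a Linux host whose kernel exposes SCTP:

import socket

# Web traffic rides TCP; SS7-over-IP (SIGTRAN) signalling rides SCTP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Creating an SCTP socket requires OS-level SCTP support (e.g. Linux).
sctp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                          socket.IPPROTO_SCTP)

tcp_sock.close()
sctp_sock.close()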
Finally, because of the proprietary nature of mobile networks, phone network security is an under-researched area because few researchers can get their hands on telecommunications equipment. The stuff is expensive, rare, horribly complicated to use, and its sale and distribution are heavily regulated. So how is this related to cyber-warfare? National governments are closely tied to the phone network infrastructure and providers. They regulate the providers, some of which are state-owned enterprises, and they commonly operate their own massive internal phone networks. Governments themselves are de facto telecommunications providers, and wield an outsized advantage over non-state actors in instigating malicious mobile network behavior.
Article 5: In Theory and in Reality
In the Wales summit resolution, NATO did not adopt any language that spells out what a cyber attack is, opting instead to say that a decision on when to invoke Article 5 would be made on a “case-by-case basis.”
So let’s see how this would play out in a hypothetical situation. Country X wants to knock country Y offline. X is a de facto provider, and it has a connection into the private global phone network. From its trusted position in the network, X can exploit a vulnerability in Y’s provider and knock it offline. X can even misdirect attribution to another country by tunneling through a third-party provider.
If the target is hit in the right place, the impact can be enormous. In summer 2012, a failure in the Home Location Register (or HLR - one of hundreds of crucial components in a core network) caused the collapse of the provider Orange in France. For twelve hours, 26 million people had no phone or data service. That same year O2 in the UK experienced a similar outage due to an HLR failure.
X could target the HLR in Y’s network or one of several other choke points. Components that affect large numbers of subscribers include the HLR, MSC, HSS, SMSC, MSS, and GGSN. After the initial attack, X would simply wait until the network service was restored, and knock it back down. Lather, rinse, and repeat.
If NATO is serious about Article 5, it needs to be aware that an attack on a telecommunications network could be the catalyst for invoking it. This isn’t simply some hypothetical, futuristic scenario, but has serious precedent. The core mobile network infrastructure is particularly vulnerable, and the perpetrator of such an attack will likely have the access, skills, and resources of a nation state. NATO member states need to prepare now for how they might respond to a mobile blackout – a case-by-case strategy simply won’t suffice when an entire population is disconnected from their smartphones.
Article 5.0: A Cyber Attack on One is an Attack on All? (Part 3: Why Private Companies Should Care About Geopolitics)
by Andrea Little Limbago
In the month since our first post on NATO, the extent and reach of the Sandworm virus have become increasingly publicized. Sandworm is believed to be a Russian cyber-espionage campaign focused on extracting content and emails that reference Ukraine. NATO was among its many targets. To some, this may just appear to be power politics playing out in cyberspace, with only the government sector truly affected. That would be an extraordinarily myopic perspective. Private companies are increasingly entangled in the world of cyber geopolitics and must be wary of how geopolitical developments can impact their own cyber security. When it comes to the cyber realm, the line is increasingly blurred between state and non-state actors. For the private sector, the geopolitical situation may become just as relevant to assessing cyber risk as international markets are to assessing economic risk.
As we’ve noted, there are significant hurdles to implementing NATO’s collective cyber defense, and the challenges in enforcing it will only grow. But the expansion of Article 5 to include cyber is just one tool the West can use to push back against Russian influence. NATO’s adoption of cyber complements the sanctions employed by the US and EU against select (mainly state-owned) Russian companies. US sanctions against Russia largely target the financial, energy, defense, and transportation sectors. Similarly, the Sandworm virus targeted, in addition to NATO and other Western government entities, several energy, telecommunications and defense companies. It also targeted an academic institution because of one professor’s research on Ukraine. The JP Morgan data breach (and that of a dozen other banks) is similarly hypothesized to trace back to Russia, with some viewing it as retaliation for US sanctions.
The permeation of geopolitics into the private cyber domain is not limited to the Russian example. Last year, the Syrian Electronic Army (SEA) attacked several Western media outlets, the most prominent of which was the New York Times website. The timing of the attacks coincided with the Obama administration’s claims that Bashar al-Assad used chemical weapons against his population. The SEA is believed to have targeted anti-government/pro-rebel media outlets. The success of the SEA has led some to wonder whether the Islamic State of Iraq and the Levant (ISIL) is capable of mounting a similar attack.
Geopolitics in cyberspace is certainly not limited to attacks against the West. The recent wave of cyber attacks on mobile phones in Hong Kong is likely an attempt by the Chinese government to quell the pro-democracy demonstrations. In response, Anonymous has vowed to retaliate against the Chinese government. Anonymous is not the only non-state actor fighting back against state-sponsored cyber attacks.
The line between state and non-state actors in cyberspace is becoming increasingly blurred. Members of the private sector, whether companies or individuals, are increasingly likely to be targets of cyber attacks–not because of their own behavior, but because of the growing impact of geopolitics on the private sector. Unlike the seemingly non-politically motivated breaches of companies such as Target or Neiman Marcus, private sector companies may become the targets of retaliatory behavior of foreign governments (or their non-state extensions). Rather than being the result of actions by specific companies, these targeted attacks will more likely be spillover effects of the greater geopolitical tensions between states. Saudi Aramco knows full well just how quickly the business (or state-owned enterprise in their case) sector can become a victim of grander power politics. This is likely to become the norm, not the exception, as states continue to play out disputes in the anonymizing domain of cyberspace. Private sector companies, especially those in energy, finance or defense, are especially likely to be targeted by foreign government-affiliated entities.
Fixing America’s Strategic Analysis Gap Without Creating Another Institution
by Andrea Little Limbago
In his recent Washington Post article “America Needs a Council of International Strategy”, David Laitin accurately makes the case for “better analysis of data, trends, and context…” to help policy makers within the government make more informed international policy decisions. He recommends the creation of a “team of strategic analysts capable of using the statistical models of the forecasters…” so that policy makers can explore global events and related policy options. As someone who helped build exactly that kind of analytic team within the government – and then saw it eliminated – I learned plenty of lessons about obstacles to implementation. Instead of creating yet another analytic organization, we should focus on refining current analytic institutions within the Intelligence Community and Department of Defense (which I’ll refer to as the Community) to fill the gap Dr. Laitin identifies.
As Dr. Laitin rightly notes, there does not exist (to my knowledge) a place within the Community that contains a group of government civilian scholars focused on quantitative modeling to inform strategic level decisions. While there are pockets of these capabilities, they are small and disparate. One reason for this gap is a bias against quantitative social science analyses. This can be partially traced to the false promise of many quantitative models that proved to either be incongruent with the operational pace of the Community or were simply based on faulty or obsolete theory and data. These models often contained proprietary black-box modeling techniques, and thus were impossible to fully comprehend. Because of this, quantitative analyses that truly accommodate academia’s technical rigor as well as the Community’s expedited analytic pace continue to be met with skepticism. I still recall a task in which our team responded to a question from the highest levels of military leadership. Our quantitatively-derived findings – which at the time were counterintuitive but have since been proven out – were never included in the final presentation to leadership. Quite simply, domain expertise trumped quantitative analyses then, and it still does today.
Second, there is a bias in the Community for tactical, real-time analysis over longer-term strategic thinking. This is partly due to an incentive structure that focuses on inputs into daily briefs and quick-turn responses in lieu of longer-term, strategic analyses. This is not a surprise given real-world demand for real-time responses. However, in my experience talking to various levels of leadership within the Community, there is also demand for strategic insights. In fact, as you move up the leadership chain, these kinds of analyses become ever more important to help inform global resource allocation, effects assessments, and planning.
Academia is equally culpable for the gap Dr. Laitin identifies. First, there remains a faulty assumption that scholarship can only be rigorous or policy relevant, and not both. This was evident at this year’s American Political Science Association (APSA) Annual Meeting. To provide just one thematic example, cyber analyses across the soft/hard power spectrum were practically non-existent among the over one thousand panels. Academic leaders, just like their government counterparts, need to adapt the discipline for the challenges of the modern era.
Finally, there needs to be greater academic support outside of the beltway for rigorous, policy relevant research and career tracks. Given the academic job market over the last decade, academic leadership should also encourage, not deter, graduate students from pursuing non-academic positions, including in the government analytic community.
I adamantly agree with Dr. Laitin’s acknowledgement that policy makers require greater access to independent, quantitatively driven and probabilistic policy alternatives. However, the best way to fill this gap is from the inside, not from the creation of yet another new organization. Let’s complement and refine the extant analytic institutions such that they too can conduct and make relevant the quantitatively-derived strategic analyses Dr. Laitin describes.
Malware with a Personal Touch
by Casey Gately
Over the summer, a friend sent me some malware samples that immediately grabbed my attention. The malware was intriguing because each binary’s literal file name was a person’s name or user ID (for example, bob.exe, bjones.exe, bob_jones.exe, etc.). Cool, I thought at first – but after some more detailed analysis, I realized that the malware actually contained hardcoded user information, implying that each binary was crafted to target that particular user. Unlike more prominent instances of malware, these samples contained binaries specifically aimed at a pre-generated list of email addresses. No longer is malware targeting only randomized email addresses - this sample indicates a different variety of malware that has a more “personal touch”.
After digging around a bit, it became apparent that this type of malware has been around for a while. Malware of this type was actually circulating during early 2013, and recent open source research revealed a malicious Facebook campaign earlier this year, in May 2014, that delivered similar malware. In the 2014 reports I read, the malware contained embedded clear text bible scripture. While the samples I received from my friend didn’t contain any scripture, there were enough similarities (such as obfuscation techniques and reach-back communications) to suggest my variants may have been from the same campaign. In typical phishing fashion, the May campaign began with a personalized phishing email.
So it’s been around for a little while, and there are other excellent analytical reports on this piece of malware – some of which delve more into the math behind it, which is quite interesting. In this post, however, I’ll focus on the personalized nature of the malware, which sets it apart from many I have previously analyzed.
Regardless of the malware’s genesis, what really amazes me is the number of people who will receive an email, download the zip file, then open it using the password provided. They then inadvertently run the malware and receive a fake MessageBox notification. This means that while the user probably thought everything was okay, behind the scenes the malware was off and running. Similar to other types of malware, the binaries are triggered by user behavior and continue to run unbeknownst to the infected user. However, unlike many other types, this sample truly contained a personal touch – leveraging social engineering to fool the user into believing the malware was a personalized, benign message. The following section provides a technical walk-through of the various aspects of the malware.
Malware Execution: A Step-by-Step Overview
Upon execution, the affected user will see an error MessageBox.
This would probably lead the unsuspecting user to think the program didn’t work, and there’s a good chance the user would just go about their business. If so, they would never realize their computer had just been compromised, and the seemingly innocent MessageBox would have been the only visual sign of something gone awry.
Now let’s take a closer look at the malware’s footprint on a victim host. It self-replicates to two binaries, both located in a malware-spawned folder within the affected user’s %AppData% folder. The name of the malware folder and the names of the self-replicated binaries are decoded during run time. What’s interesting is that while their naming conventions appear random, they were actually static and quite unique to that particular binary. The names of the folder and each binary are hardcoded, but obfuscated, within the binary. In other words, the malware file structure will be the same each time it is run. Pasted below are three different examples that illustrate this. Note: The literal file names have been intentionally changed to protect the identity of the affected users.
At first glance, this file structure reminded me of several ZeuS 2.1 and ZeuS-Licat variants I analyzed several years back, but the ZeuS file structure was not static in any way.
The communication flow is straightforward: within the span of 16 seconds, the infected host will connect out to 85 different domains, each with the same GET request.
Immediately, a pattern of anomalies can be seen in the reach-back domains. Like children’s mix-and-match clothing, the reach-back domains are mixed-and-matched combinations of two English words. Subsequent reversing revealed that the malware binary contained an obfuscated list of 384 words, ranging from 6 to 12 letters, as follows:
6 letter words = 97
7 letter words = 152
8 letter words = 82
9 letter words = 38
10 letter words = 10
11 letter words = 4
12 letter words = 1
The reach-back domains are dynamically generated using a Domain Generation Algorithm (DGA) that merges two of the 384 words into a single domain name under the “.net” top-level domain (TLD). The infected host will connect out to 85 domains, and the list of 85 domains remains constant for 8 minutes and 32 seconds, meaning that if the malware is restarted during that same window, the same domains will be requested.
For demonstration purposes, if the malware was run at 3:59am on 16 Sep, the domain generation will begin with classshoulder.net, followed by thickfinger.net, etc. (as shown above for connections 1-4). At 4:00am, however, the first domain requested will be thickfinger.net, followed by classfinger.net. Eight minutes later, at 4:09am, the first domain will be classfinger.net, followed by againstbeyond.net. This means that domains 2-85 at 4:00am are identical to domains 1-84 from 3:59am. Below is a representation of the first five domains used between 03:59 and 04:09 AM on 16 September, where the first domain can be seen dropping off after 8 minutes.
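To make the scheme concrete, here is a minimal Python sketch of this style of DGA. The real sample’s 384-word list and its exact index arithmetic are obfuscated inside the binary, so the placeholder words and the sliding-window math below are assumptions for illustration only:

import time

WORDS = ["class", "shoulder", "thick", "finger", "against", "beyond"]

def domains_for(epoch_seconds, words=WORDS, count=85):
    """Return the sliding window of two-word .net domains for a given time."""
    pairs = len(words) // 2
    start = (epoch_seconds // 60) % pairs   # window slides one domain per minute
    out = []
    for i in range(count):
        j = ((start + i) % pairs) * 2       # join two words, append the TLD
        out.append(words[j] + words[j + 1] + ".net")
    return out

# One minute later, the first domain drops off and a new one is appended,
# mirroring the 03:59-04:09 behavior described above.
now = int(time.time())
assert domains_for(now)[1:] == domains_for(now + 60)[:-1]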
Referring back to the four connections above, you can see the affected user’s email address is contained within the outbound “GET” via port 80. This too was hardcoded, but obfuscated within the binary and decoded during run time. In essence, this means each binary was crafted for that particular user. Another interesting aspect is that of the variants examined, the domain name list and the affected user information were encoded with different keys.
Now let’s take a quick peek at the malware in action. Once executed, both self-replicated binaries appear as running processes.
What really intrigued me was that these are bit-for-bit copies of each other, and despite this, they are hooked to one another. This seemed quite odd, but diving in a little deeper revealed a more interesting side of the story. Basically, the first spawned process (ovwcntv.exe) is the primary running process and the second spawned process (ctrdyelvupa.exe) is a slave process, but more on that later. For now, let’s check out a brief chronological snippet of the malware and its spawned processes during run time.
Notice how the original binary (person1.exe) wrote the first copy (ovwcntv.exe) to disk, then spawned its process. Yet ovwcntv.exe was the binary that wrote the second copy (ctrdyelvupa.exe) to disk, subsequently spawning the ctrdyelvupa.exe process. This daisy chain actually acts as a persistence mechanism.
The original binary (person1.exe) is launched via a command line argument. Once running, the binary decodes a string equating to “WATCHDOGPROC”, which the running process looks for. If the “WATCHDOGPROC” string is part of the same command line string as the path for either binary, that particular process is launched. If the “WATCHDOGPROC” string isn’t contained within the same command line string as the binary path, the running process will not launch the additional process. Below are stack excerpts to help demonstrate this.
Will launch:
ASCII "WATCHDOGPROC "C:\Documents and Settings\user\Application Data\nripohbdhnewia\ovwcntv.exe""
Won’t launch:
ASCII "WATCHDOGPROC"
ASCII "C:\Documents and Settings\user\Application Data\nripohbdhnewia\ovwcntv.exe"
As stated above, the ovwcntv.exe binary is the active running process while ctrdyelvupa.exe acts as a safeguarding (or slave) process. Using the WATCHDOGPROC procedure, if ovwcntv.exe is terminated, ctrdyelvupa.exe immediately restarts it. If ctrdyelvupa.exe is terminated, ovwcntv.exe will restart it as well.
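The mutual-restart scheme is simple enough to sketch. The following Python re-creation is illustrative only – the tag is the sample’s decoded string, but passing it as a plain command-line argument and the one-second poll are assumptions rather than recovered logic:

import subprocess
import time

TAG = "WATCHDOGPROC"   # decoded at run time by the real sample

def launch_partner(path):
    # The real binaries prepend the decoded tag to the partner's command
    # line; without the tag, a running process will not launch its partner.
    return subprocess.Popen([path, TAG])

def watchdog(partner_path):
    """Relaunch the partner binary whenever it disappears."""
    proc = launch_partner(partner_path)
    while True:
        time.sleep(1)
        if proc.poll() is not None:      # partner terminated - restart it
            proc = launch_partner(partner_path)

Run inside each of the two binaries and pointed at the other, a pair of loops like this yields exactly the behavior observed: kill either process and its partner immediately respawns it.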
WATCHDOGPROC, in its encoded form, is a 13-byte block embedded in the original binary at offset 00022990. During run time, those bytes are used to populate the ECX register. Each byte is then XOR decoded with the first 13 bytes contained in the EAX register, resulting in the ASCII string “WATCHDOGPROC”. This is demonstrated below.
My initial interest in this particular piece of malware, however, was the hardcoded user email information that was obfuscated within the binary. I was equally interested in the word list used by the DGA. I wanted to find their embedded locations inside the original binary. It was a bit of a trek to get there, but persistence paid off in the end. So let’s begin with the encoded user information.
Within the person1.exe binary, the encoded email address was located at offset 00023611, preceded by the encoded uri string (at offset 00023600).
During run time, this data block was stored in the ESI register.
Additionally, a similarly sized block of data was dynamically generated and stored in EAX. In essence, this was the decoding key.
Each byte of ESI and EAX was then run through an incremental XOR loop…
…producing the decoded URI, which included the victim user’s email address, as in the following example.
XOR EXAMPLE
40 XOR 6F = 2F (‘/‘)
6F XOR 09 = 66 (‘f’)
6F XOR 00 = 6F (‘o’)
56 XOR 24 = 72 (‘r’)
Note: once I isolated the user data (or email address) within the original binary, along with its key, I patched the binary so it would reflect ‘nottherealuser’ vice the name of the actual victim user. This patched binary was then used to obtain the previous examples.
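The decode loop itself is trivial to reproduce offline. Here is a minimal Python equivalent that replays the four documented example bytes:

def xor_decode(encoded: bytes, key: bytes) -> bytes:
    """Byte-wise XOR of a hardcoded data block against the run-time key."""
    return bytes(e ^ k for e, k in zip(encoded, key))

# 40^6F = 2F ('/'), 6F^09 = 66 ('f'), 6F^00 = 6F ('o'), 56^24 = 72 ('r')
assert xor_decode(b"\x40\x6F\x6F\x56", b"\x6F\x09\x00\x24") == b"/for"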
Next, let’s look at the domain name generation. It followed the same scheme as above. A 2800-byte block of hardcoded data from the original binary was stored in ESI. Then an equally sized block of data was dynamically generated and stored in EAX. These two data blocks were run through the same XOR loop producing a list of 384 words. To demonstrate this, the first 48 bytes of the applicable registers are shown below.
XOR EXAMPLE
FB XOR 91 = 6A (‘j‘)
4A XOR 25 = 6F (‘o’)
0A XOR 7F = 75 (‘u’)
30 XOR 42 = 72 (‘r’)
4C XOR 22 = 6E (‘n’)
3F XOR 5A = 65 (‘e’)
46 XOR 3F = 79 (‘y’)
The interesting part about this binary was that while it appeared packed, it wasn’t. Just about everything within the binary (API calls, strings, etc.) was obfuscated and decoded on the fly during run time as needed. Also, it didn’t debug of its own free will. This became apparent while stepping through the binary and hitting a point where EIP was set to 00000000. To overcome this, the binary was patched at that particular offset by changing the opcode to a jump (EB FE) so that it would loop back to itself during run time. The patched binary was then saved and executed again, causing it to run in an infinite loop. While it was running, a debugger was attached to the binary. The jump opcode (EB FE) was then changed back to its original opcode (FF 15 in this case), at which time the intended location of that call (address 00409C50) appeared in the debugger.
At this point, the binary was patched with a call to the newly identified offset by replacing “CALL DWORD PTR DS:[42774C]” (shown above) with “CALL 409C50”. After this, the binary was saved to a new binary (e.g. binary1.exe).
Next, binary1.exe was loaded into a debugger and a break point was set for CreateProcessA. The binary was then run which generated the first copy of the original binary (in this case ovwcntv.exe), but for simplicity’s sake, we’ll call this binary2.exe.
Binary2.exe was then loaded into a debugger and, as we did previously, the opcode at the initial point of user code (E8 48 in this case) was changed to EB FE, changing the initial command from CALL to JMP.
Binary2.exe was then saved over itself, overwriting the original. The save also created a backup copy (.bak), which was deleted. Then the debugger was closed.
After this, a debugger was reopened, but it wasn’t attached to anything just yet. Returning to the still-open debugger for binary1.exe, Alt+F9 was pressed in order to execute the binary till user code. This caused binary2.exe to run in a loop (due to the aforementioned patch). The newly opened debugger was then attached to binary2.exe, opening the binary in the ntdll module. From the debugger for binary2.exe, Alt+F9 was pressed in order to run it till user code. At this point, the opcode EB FE was changed back to its original opcode (in this case E8 48). A breakpoint was then placed on LoadLibraryA and the binary was run again. Stepping through the binary back into the user code led to all of the deobfuscation discussed earlier.
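The byte-patching itself is mechanical, and a small helper makes it repeatable. The sketch below is hypothetical – the offset and file names in the example call are placeholders, not the sample’s actual values:

def patch_bytes(in_path, offset, new_bytes, out_path):
    """Overwrite len(new_bytes) bytes at offset, returning the originals."""
    with open(in_path, "rb") as f:
        data = bytearray(f.read())
    original = bytes(data[offset:offset + len(new_bytes)])
    data[offset:offset + len(new_bytes)] = new_bytes
    with open(out_path, "wb") as f:
        f.write(data)
    return original   # keep this so the opcode can be restored later

# e.g. saved = patch_bytes("person1.exe", 0x1234, b"\xEB\xFE", "binary1.exe")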
Lastly, below is the complete list of words used for the creation of reach-back domains, in order of the list’s creation (reading left to right):
INSA Whitepaper: Operational Cyber Intelligence
by Andrea Little Limbago
Endgame Principal Social Scientist Andrea Little Limbago is a coauthor of the Intelligence and National Security Alliance’s (INSA) latest whitepaper, Operational Cyber Intelligence. The paper is part of an INSA series that addresses the multiple levels of cyber intelligence. While much focus has been devoted to the tactical level, the strategic and operational levels of analysis have not garnered equal attention. This latest whitepaper discusses the role of operational cyber intelligence, the key bridge between the tactical and strategic levels, and between the non-technical and technical domains. It examines the role of operational cyber intelligence in assessing the operating environment, forecasting and assessing adversarial behavior, and concludes with business and mission requirements to develop operational cyber intelligence capabilities.
Visit INSA to read the full whitepaper.