
Three Questions: Smart Sanctions and The Economics of Cyber Deterrence


The concept of deterrence consistently fails to travel well to the cyber realm. One (among the many) reasons is that, although nuclear deterrence is achieved through nuclear means, cyber deterrence is not achieved solely through cyber means. In fact, any cyber activity meant for deterrence is likely going to be covert, while the more public deterrence activities fall into the diplomatic, economic, financial, and legal domains. Less than six months after President Obama signed an executive order to further expand the range of responses available to penalize individuals or companies conducting "malicious cyber-enabled activities", there are now reports that it may be put to use in a significant and unprecedented way. Numerous news outlets have announced the possibility of sanctions against Chinese individuals and organizations associated with economic espionage within the cyber domain. If the sanctions do come to fruition, it may not be for a few more weeks. Until then, below are some of the immediate questions that may help provide greater insight into what may be one of the most significant policy evolutions in the cyber domain.

1. Why now?  

Many question the timing of the potential Chinese sanctions, especially given President Xi Jinping’s upcoming state visit to Washington. It is likely that a combination of events over the summer in both the US and China have instigated this policy shift:

Chinese domestic factors: China's stock market has been consistently falling since June, with the most visible plunge occurring at the end of August, which has had global ramifications. A major slowdown in economic growth has also hit China, which by some estimates could be as low as 4% (counter to the ~10% growth of the last few decades, and lower than even the recent record low of 7.4% in 2014). The latest numbers from today reinforce a slowing economy, with the manufacturing sector recording a three-year low. Simultaneously, President Xi continues to consolidate power, leading a purge of Communist Party officials targeted for corruption and asserting greater control of the military. In short, President Xi is looking increasingly vulnerable, handling economic woes while continuing a political power grab, which has led two influential generals to resign and stirred discontent among some of the highest ranks of leadership.

US domestic factors: The most obvious reason for the timing of potential US sanctions seems to be this summer's OPM breach, which has been largely attributed to China. This is just the latest in an ongoing list of public and private sector hacks attributed to China, including United Airlines and Anthem. The OPM breach certainly helped elevate the discussions over retaliation, but it's unlikely that it was the sole factor. Instead, the persistent theft of IP and trade secrets, undermining US competitiveness and creating an uneven playing field, is the dominant rationale provided. From the defense sector to solar energy to pharmaceuticals to tech, virtually no sector remains unscathed by Chinese economic espionage. The continuing onslaught of attacks may have finally reached a tipping point.

The White House has also experienced increased pressure to respond in light of this string of high-profile breaches. Along with pressure from foreign policy groups and the public sector, given the administration's pursuit of greater public-private partnerships, there is likely similar pressure from powerful parts of the private sector – including the financial sector and Silicon Valley – impacting the risk calculus of economic and financial espionage. For instance, last week, Secretary of Defense Ashton Carter visited Silicon Valley, encouraging greater cooperation and announcing a $171 million joint venture with government, academia and over 160 tech companies. These partnerships have been a high priority for the administration, meaning that the government likely feels pressure to respond when attacks attributed to the Chinese, such as the GitHub attacks this spring, hit America's tech giants.

2. Why is this different from other sanctions?

Sanctions against Russia and Iran were in response to the aggressive policies of those countries, while those against North Korea were in response to the Sony breach. However, each of these countries lacks the economic interdependence with the US that exists for China.  Mutually assured economic destruction is often used to describe the economic grip the US and China have on each other’s economies. The United States is mainland China’s top trading partner, based on exports plus imports, while China is the United States’ third largest trading partner, following the European Union and Canada. Compare this to the situation in Russia, North Korea, and Iran, the most prominent countries facing US sanctions, none of which have significant trade interdependencies with the US.

Similarly, foreign direct investment (FDI) between China and the US is increasingly significant, with proposals for a bilateral investment treaty (BIT) exchanged this past June, and discussions ongoing in preparation for President Xi's visit this month. China is also the largest foreign holder of US Treasury securities, despite its recent unloading of Treasury bonds to help stabilize its currency. Compare this to Russia, North Korea, or Iran, none of which the US economy relied on prior to their respective sanctions. Even in Iran and Russia's strongest industry – oil and gas – the US has become less reliant and more economically independent, especially given that the US was the world's largest producer of oil in 2014.

3. Who or what might be targeted?

If sanctions are administered, the US will most likely continue its use of "smart" or targeted sanctions that focus on key individuals and organizations, rather than the entire country. The US sanctions against Russia provide some insight into the approach the administration might take. Russian sanctions are targeted at Putin's inner circle, including its affiliated companies. These range from defense contractors to the financial sector to the energy sector, and include close allies such as Gennady Timchenko.  Similarly, the North Korean sanctions following the Sony hack focused on three organizations and ten individuals. In the case of China, the state-owned enterprises (SOEs) deemed to reap the most benefits from economic espionage will likely be targeted. In fact, the top twelve Chinese companies are SOEs, meaning they have close ties to the government. More specifically, sanctions could include energy giants CNOOC, Sinopec and PetroChina, some of the large banks, or the global tech giant Huawei because of their large role in the economy and their potential to benefit from IP theft. Interestingly, the largest Chinese companies do not include several of their more famous tech companies, such as Alibaba, Tencent, Baidu and Xiaomi. Most of these enterprises have yet to achieve a significant global footprint, which means they are less likely to top any sanctions list. In considering who among Xi's network might be targeted, some point to the Shaanxi Gang, Xi's longtime friends, while others look at those most influential within the economy, such as Premier Li Keqiang.

Given President Xi's upcoming visit, is the talk of sanctions diplomatic maneuvering, or will it be backed by concrete action? If enacted, the administration's intent will be revealed through the actual targets of the sanctions.  If the objective is to deter future cyber aggression, then sanctions must be targeted at these influential state-owned companies and the inner circle of the regime.  Otherwise, it will be perceived as a purely symbolic act both in the United States and in China, and will lack the teeth to truly effect change.


Meet Endgame at AWS re:Invent 2015


See how we automate the hunt for cyber adversaries.

Stop by Booth #1329 to:

SEE A DEMO OF ENDGAME PRODUCTS

Sign up here for a private demo to learn how we detect attacks that:

  • Use native tools to locate, stage, and exfiltrate customer data
  • Exploit application vulnerabilities to install unknown malware
  • Install backdoors to gain control of critical servers
     

JOIN US AT 1923 BOURBON BAR!

Join Endgame for an evening of bourbon, cigar rolling, and jazz at 1923 Bourbon Bar on Wednesday, October 7. Registration is required to attend. Learn more and register here.

MinHash vs. Bitwise Set Hashing: Jaccard Similarity Showdown


As demonstrated in an earlier post, establishing relationships (between files, executable behaviors, or network packets, for example) is a key objective of researchers when automating the hunt.  But, the scale of information security data can present a challenge if naïvely measuring pairwise similarity.  Let’s take a look at two prominent methods used in information security to estimate Jaccard similarity at scale, and compare their strengths and weaknesses.  Everyone loves a good head-to-head matchup, right?

Jaccard distance is a metric [1] that measures the dissimilarity of two sets, A and B, by

Jd(A, B) = 1 − Js(A, B) = 1 − |A ∩ B| / |A ∪ B|,

where Js denotes the Jaccard similarity, bounded on the interval [0,1].  Jaccard similarity has proven useful in applications such as malware nearest-neighbor search, clustering, and code reuse detection.  In such cases, the sets A and B might contain imported functions, byte or mnemonic n-grams, or behavioral properties observed in dynamic analysis of each file.
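For reference, the quantity both estimators below approximate can always be computed directly, just not cheaply at scale.  Here is a minimal Go sketch of that exact computation; the function name and the use of string sets are mine, not from the original post:

// ExactJaccard computes |A ∩ B| / |A ∪ B| directly from two string sets.
func ExactJaccard(A, B []string) float64 {
  inA := make(map[string]bool, len(A))
  for _, a := range A {
    inA[a] = true
  }
  intersection, union := 0, len(inA)
  seenB := make(map[string]bool, len(B))
  for _, b := range B {
    if seenB[b] {
      continue // ignore duplicate elements of B
    }
    seenB[b] = true
    if inA[b] {
      intersection++ // common element: already counted in the union via A
    } else {
      union++
    }
  }
  if union == 0 {
    return 1.0 // convention: two empty sets are identical
  }
  return float64(intersection) / float64(union)
}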

Since each datapoint (e.g., malware sample) often consists of many feature sets (e.g., imports, exports, strings, etc.) and each set can itself contain many elements, naïve computation of Jaccard similarity can be computationally expensive.  Instead, it's customary to leverage efficient descriptions of the sets A and B together with a fast comparison mechanism to compute Jd(A,B) or Js(A,B). Minwise hashing (MinHash) and bitwise set hashing are two methods to estimate Jaccard similarity.  Bitwise set hashing will be referred to in this blog post as BitShred, since it is used as the core similarity estimator in the BitShred system proposed for large-scale malware triage and similarity detection.

First, let's review some preliminaries: the key ideas behind MinHash and BitShred, with a few observations about each estimator.  Then, the two methods will be compared experimentally on supervised and unsupervised machine learning tasks in information security.

 

MinHash

MinHash approximates a set with a random sampling (with replacement) of its elements.  A hash function h(a) is used to map any element a from set A to a distinct integer, which mimics (but, with consistency) a draw from a uniform distribution.  For any two sets A and B, Jaccard similarity can be expressed in terms of the probability of hash collisions:

Js(A, B) = Pr[ min{h(a) : a ∈ A} = min{h(b) : b ∈ B} ],

where the min operator acts as the random sampling mechanism.  Approximating the probability by a single MinHash comparison of A and B is actually an unbiased estimator, but has quite large variance—the value is either identically 1 or 0.  To reduce the variance, MinHash averages over m trials to produce an unbiased estimator with variance O(1/m).

Estimating Jaccard similarity via MinHash is particularly efficient if one approximates h(a) using only its least significant bit (LSB).  This, of course, introduces collisions between distinct elements, since the LSB of h(a) is 1 with probability 0.5—but the approximation has been shown to be effective if one uses many bits in the code.  Overloading notation a bit, let a (respectively, b) be the bit string of 1-bit MinHashes for set A (respectively, B). Then Jaccard similarity can be approximated via a CPU-efficient Hamming distance computation (xor and popcount instructions):

Js(A, B) ≈ 1 − 2 · Ham(a, b) / m,

It has been shown that the variance of 1-bit MinHash is 2(1−Js)/m when using m total bits, and indeed any summary-based Jaccard estimator has variance at least 1/m.  Interestingly, the variance of b-bit MinHash does not decrease if one uses more than b=1 bits to describe each hash output h(a) while retaining the same number of bits in the overall description.  With a little arithmetic, one can see that to achieve an estimation error of at most εJs with probability exceeding 1/2, one requires m > (1−Js)/(εJs)² bits of 1-bit MinHash, by Chebyshev's inequality.
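To put concrete numbers on that bound (plugging into the inequality exactly as stated above): for Js = 0.5 and a target relative error of ε = 0.1, one needs m > (1 − 0.5)/(0.1 × 0.5)² = 0.5/0.0025 = 200 bits, while relaxing the error tolerance to ε = 0.2 drops the requirement to m > 0.5/0.01 = 50 bits.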

Code (golang) to generate a 1-bit MinHash code and approximate Jaccard similarity from two codes is shown below.

import (
  "encoding/binary"
  "hash/fnv"
  "math"
  "math/bits"
)

// Hash64 maps a string to a 64-bit integer, salted by seed.  This is one
// possible stand-in for the helper assumed in the original post; any
// well-mixed, seeded 64-bit hash will do.
func Hash64(s string, seed uint64) uint64 {
  h := fnv.New64a()
  var b [8]byte
  binary.LittleEndian.PutUint64(b[:], seed)
  h.Write(b[:])
  h.Write([]byte(s))
  return h.Sum64()
}

// PopCountUint64 counts the set bits in x.
func PopCountUint64(x uint64) int {
  return bits.OnesCount64(x)
}

// OneBitMinHash builds an N_BITS-bit code for a set: bit i is the least
// significant bit of the minimum hash value under the ith hash function.
func OneBitMinHash(set []string, N_BITS int) []uint64 {
  code := make([]uint64, N_BITS/64)
  var minhash_value uint64
  for bitnum := 0; bitnum < N_BITS; bitnum++ {
    minhash_value = math.MaxUint64
    for _, s := range set {
      minhash_tmp := Hash64(s, uint64(bitnum)) // bitnum as seed
      if minhash_tmp < minhash_value {
        minhash_value = minhash_tmp
      }
    }
    whichword := bitnum / 64   // which uint64 in the slice?
    whichbit := bitnum % 64    // which bit in the uint64?
    if minhash_value&0x1 > 0 { // is the bit set?
      code[whichword] = code[whichword] | (1 << uint8(whichbit))
    }
  }
  return code
}

// JaccardSim_OneBitMinHash estimates Jaccard similarity from two 1-bit MinHash
// codes via Hamming distance: Js ≈ 1 - 2*Ham(a,b)/m.
func JaccardSim_OneBitMinHash(codeA []uint64, codeB []uint64) float64 {
  var hamming int
  N_BITS := len(codeA) * 64
  for i, a := range codeA {
    hamming += PopCountUint64(a ^ codeB[i])
  }
  return 1.0 - 2.0*float64(hamming)/float64(N_BITS)
}
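As a quick, hypothetical usage example (the set contents are illustrative and an "fmt" import is assumed; this is not from the original post):

// ExampleOneBitMinHash builds 128-bit codes for two small import sets and
// prints the estimated similarity.
func ExampleOneBitMinHash() {
  A := []string{"kernel32.dll!CreateFileW", "ws2_32.dll!connect", "advapi32.dll!RegSetValueExW"}
  B := []string{"kernel32.dll!CreateFileW", "ws2_32.dll!connect", "user32.dll!MessageBoxW"}
  codeA := OneBitMinHash(A, 128) // 128-bit code: two uint64 words
  codeB := OneBitMinHash(B, 128)
  fmt.Printf("estimated Js: %.2f\n", JaccardSim_OneBitMinHash(codeA, codeB)) // true Js = 2/4 = 0.5
}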

 
BitShred: Bitwise Set Hashing

Feature hashing is a space-efficient method to encode feature-value pairs as a sparse vector.  This is useful when the number of features is a priori unknown or when otherwise constructing a feature vector on the fly.  To create an m-dimensional vector from an arbitrary number of feature/value pairs, one simply applies a hash function and modulo operator for each feature name to retrieve a column index, then updates that column in the vector with the provided value.  Column collisions are a natural consequence in the typical use case where the size of the feature space n is much larger than m.
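As a rough sketch of plain feature hashing itself (reusing the Hash64 helper from above; the function name and the fixed hash seed are my own choices for illustration):

// FeatureHash folds an arbitrary collection of feature-name/value pairs into a
// fixed m-dimensional vector; distinct feature names may collide on a column.
func FeatureHash(features map[string]float64, m int) []float64 {
  vec := make([]float64, m)
  for name, value := range features {
    col := Hash64(name, 0) % uint64(m) // hash the feature name to a column index
    vec[col] += value                  // accumulate the value at that column
  }
  return vec
}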

BitShred uses an adaptation of feature hashing in which the elements of a set are encoded as single bits in a bit string.  Since m << n, the many-to-one mapping between set elements and bit locations introduces collisions.  A concise bit description of set A is created by setting the bit at position h(a) mod m for every element a in A.  Overloading notation again, let a (respectively, b) be the BitShred description of set A (respectively, B).  Then Jaccard similarity is estimated efficiently by replacing set operators with bitwise operators:

Js(A, B) ≈ popcount(a AND b) / popcount(a OR b),

To make sense of this estimator, let the random variable Ci denote the event that one or more elements from each of sets A and B map to the ith bit, and let the random variable Ui denote the event that one or more elements from either set A or B (or both) map to the ith bit.  Then the BitShred similarity estimator can be analyzed by considering the ratio

Js ≈ (Σi Ci) / (Σi Ui),

which is simply the (noisy, with collisions) sum of the intersections divided by the sum of the unions.  Estimating the bias of a ratio of random variables will not be detailed here.  But note that due to the many-to-one mapping, the numerator generally overestimates the true cardinality of the set intersection, while the denominator underestimates the true cardinality of the set union.  So, without cranking laboriously through any math, it's easy to see from this ratio of "too big" to "too small" that the estimator is biased [2], and generally overestimates the true Jaccard similarity.

Code (golang) to generate a BitShred code and estimate Jaccard similarity from two BitShred codes is shown below.

 

// Hash64 and PopCountUint64 are the same helpers defined in the MinHash example above.

// BitShred encodes a set as an N_BITS-bit string: bit h(s) mod N_BITS is set
// for every element s of the set.
func BitShred(set []string, N_BITS uint16) []uint64 {
  code := make([]uint64, N_BITS/64)
  for _, s := range set {
    bitnum := Hash64(s, 0) % uint64(N_BITS)
    whichword := bitnum / 64  // which uint64 in the slice?
    whichbit := bitnum % 64   // which bit in the uint64?
    code[whichword] = code[whichword] | (1 << uint8(whichbit))
  }
  return code
}

// JaccardSim_BitShred estimates Jaccard similarity as
// popcount(a AND b) / popcount(a OR b).
func JaccardSim_BitShred(codeA []uint64, codeB []uint64) float64 {
  var numerator, denominator int
  for i, a := range codeA {
    numerator += PopCountUint64(a & codeB[i])
    denominator += PopCountUint64(a | codeB[i])
  }
  return float64(numerator) / float64(denominator)
}
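To tie the pieces together, here is a hypothetical side-by-side run.  It assumes the ExactJaccard sketch and the MinHash/BitShred functions defined earlier in this post, plus an "fmt" import; the PE section names are purely illustrative:

// CompareEstimators contrasts the exact Jaccard similarity with the two
// 128-bit estimates on a pair of small string sets.
func CompareEstimators() {
  A := []string{".text", ".rdata", ".data", ".rsrc", ".reloc"}
  B := []string{".text", ".rdata", ".data", ".rsrc", ".upx0"}
  fmt.Printf("exact:    %.3f\n", ExactJaccard(A, B)) // 4 shared / 6 total ≈ 0.667
  fmt.Printf("MinHash:  %.3f\n", JaccardSim_OneBitMinHash(OneBitMinHash(A, 128), OneBitMinHash(B, 128)))
  fmt.Printf("BitShred: %.3f\n", JaccardSim_BitShred(BitShred(A, 128), BitShred(B, 128)))
}

With only 128 bits, expect the two estimates to scatter around the exact value, with BitShred tending to land above it, which is exactly the behavior quantified in the next section.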

 

 
Estimator Quality

The math is over; let’s look at some plots.

This plot shows the estimated vs. true Jaccard similarity for MinHash and BitShred, for the contrived case where sets A and B consist of randomly generated alphanumeric strings, |A|=|B|=64, and the number of bits m=128.  The mean and one standard deviation error bars are plotted from 250 trials for each point on the similarity graph.  The y=x identity line (dotted) is also plotted for reference.

A few things are evident. As expected, MinHash shows its unbiasedness with modest variance.  BitShred is grossly biased, but has low variance.  Note, however, that the variance of both estimators vanishes as similarity approaches unity.  In many applications such as approximate nearest-neighbor search, it's the consistent rank-order of similarities that matters, rather than the actual similarity values.  In this regard, one is concerned about the variance and strict monotonicity of this kind of curve only on the right-hand side, where Js approaches 1.  The extent to which the bias and variance near Js=1 play a role in applications will be explored next.

 

Nearest Neighbor Search

So, what about nearest-neighbor search?  Let’s compare k-NN recall.

As a function of neighborhood size k, we measure the recall of true nearest neighbors, that is, what fraction of the true k neighbors did we capture in our k-NN query?  The plot above shows recall vs. k averaged over 250 trials with one standard deviation error bars for MinHash vs. BitShred.  The same contrived case is used as before, in which sets A and B consist of randomly generated alphanumeric strings, |A|=|B|=64, and the number of bits m=128.  While it's mostly a wash for small k, one observes that the lower-variance BitShred estimator generally provides better recall.

Note that in this toy dataset, the neighborhood size increases linearly with similarity; but in real datasets the monotonic relationship is far from linear.  For example, the first 3 nearest neighbors may enjoy Jaccard similarity greater than 0.9, while the 4th neighbor may be very dissimilar (e.g., Jaccard similarity < 0.5).

Applications: Malware Visualization and Classification

Let's take a look at an application.  In what follows, we form a symmetric nearest neighbor graph of 250 samples from each of five commodity malware families plus a benign set, with k=5 nearest neighbors retrieved via Jaccard similarity (MinHash or BitShred).  For each sample, codes are generated by concatenating five 128-bit codes (640 bits per sample), consisting of a single 128-bit code for each of the following feature sets extracted from publicly available VirusTotal reports (a sketch of this concatenation follows the list):

  • PE file section names;
  • language resources (English, simplified Chinese, etc.);
  • statically-declared imports;
  • runtime modification to the hosts file (Cuckoo sandbox); and
  • IP addresses used at runtime (Cuckoo sandbox).
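A rough Go sketch of that concatenation (the parameter names are mine, and BitShred codes could be substituted for the MinHash codes in exactly the same way):

// SampleCode builds a 640-bit description of one sample by concatenating a
// 128-bit code for each of the five feature sets listed above.
func SampleCode(sectionNames, languages, imports, hostsMods, runtimeIPs []string) []uint64 {
  code := make([]uint64, 0, 5*2) // five 128-bit codes = ten uint64 words
  for _, featureSet := range [][]string{sectionNames, languages, imports, hostsMods, runtimeIPs} {
    code = append(code, OneBitMinHash(featureSet, 128)...)
  }
  return code
}

Comparing two samples then amounts to running the corresponding similarity function over the two concatenated codes.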

t-SNE plots of the data—which aim to respect local similarity—are shown below for MinHash and BitShred. (I use the same random initialization for both plots.)

Figure 1: MinHash similarity from k=5 symmetric similarity matrix

Figure 2: BitShred similarity from k=5 symmetric similarity matrix

The effects of BitShred’s positive bias can be observed when comparing to the MinHash plot.  It’s evident that BitShred is merging clusters that are distinct in the MinHash plot.  This turns out to be good for Allaple, but very bad for Ramnit, Sality and benign, which exhibit cleaner separation in the MinHash plot.  Very small, tight clusters of Soltern and Vflooder appear to be purer in the BitShred visualization. Embeddings produced from graphs with higher connectivity (e.g., k=50) show qualitatively similar findings.

For a quantitative comparison, we show results for simple k-NN classification with k=5 neighbors, and measure classification performance.  For MinHash the confusion matrix and a classification summary are:


And for BitShred:

In this contrived experiment, the numbers agree with our intuition derived from the visualization: BitShred confuses Ramnit, Sality and benign, but shows marginal improvements for Soltern and Vflooder.

 

Summary

MinHash and BitShred are two useful methods to approximate Jaccard similarity between sets with low memory and computational footprints.  MinHash is unbiased, while BitShred has lower variance with nonnegative bias.  In non-extensive experiments, we verified the intuition that BitShred overestimates Jaccard similarity, which can introduce errors for visual clustering and nearest-neighbor classification.  In our contrived experiments (and this also plays out in practice), this caused the confusion/merging of distinct malware families.

The bias issue of BitShred could be partially ameliorated by using neighbors that fall within a ball of small radius r, where the BitShred bias is small.  (This is in contrast to k-NN approaches in which similarities in the “local” neighborhood can range from 0 to 1, with associated bias/variance.) 

Finally, the Jaccard metric represents a useful measure of similarity.  There are many others based on common or custom similarity measures, which may also be approximated by Hamming distance on compact binary codes.   These, together with efficient search strategies (also not detailed in this blog post) can be employed for powerful large-scale classification, clustering and visualization.

[1] How can one show that Jaccard distance is really a metric?  Nonnegativity, the coincidence axiom, and symmetry? Check, check and check.  But the triangle inequality?  Tricky!  Alternatively, one can start with a known metric—the symmetric set difference between A and B—then rely on the Steinhaus transform to crank through the necessary arithmetic and arrive at Jaccard distance.

[2] One may reduce the bias of BitShred by employing similar tricks to those used in feature hashing. For example, a second hash function may be employed to determine whether to xor the current bit with a 1 or a 0. This reduces bias at the expense of variance.  For brevity, I do not include this approach in comparisons.

Read more blog posts about Data Science.

Follow Hyrum on Twitter @drhyrum.

Webinar: Automating the Hunt for Network Intruders


As adversaries - whether criminal or otherwise - make use of increasingly sophisticated attack methods, network defenses have not kept pace; they remain focused on signature-based, reactive measures that close the barn door after the horses have escaped. Automated threat detection offers the opportunity for truly proactive network defense, by reducing the amount of time an intruder remains undetected and introducing remedies earlier than otherwise possible. Automation can also enable better use of scarce resources and reduced exposure to network-based threats. This webcast discusses how to automate the hunt for network threats and move an organization's security posture to the next level.

Sign up for this SANS webcast and be among the first to receive an advance copy of a SANS whitepaper discussing the automation of threat detection. Register here.

Empty Promises, Broken Memes: Why Skepticism Should Prevail When It Comes to Sino-American Cooperation


Last week's understanding reached between Chinese President Xi Jinping and US President Barack Obama highlighted the attempt to mitigate the growing tension between the countries over espionage. In response, a series of commentaries applauded the agreement for its deterrent effect, viewing it as a sign of détente or simply a good first step. This agreement, coupled with Xi's meeting with top US CEOs, has been interpreted as a sign of growing collaboration in both the public and private domains. In contrast, as yesterday's Senate hearing exemplified, many in and out of the national security community view it as a hollow agreement that will not alter Chinese behavior in the cyber domain. Below are three key areas that, when analyzed, illustrate the need to maintain a healthy dose of skepticism when it comes to Sino-American relations in the cyber domain.

 

An Inflated Threat

Many contend that the Chinese threat to US interests in the cyber domain is inflated because there has yet to be physical destruction as a result of malicious digital activity, or because China has yet to convert the stolen information to its advantage. These arguments often rely on the nebulous term cyber war, which is the wrong gauge of the threat to US national interests. The absence of war does not imply peace. In contrast, conflict in the cyber domain is very similar (although dramatically different by the three Vs: velocity, volume, and variety) to economic conflict of the mercantilist era, where economic warfare was an extension of politics and part of the escalatory path to military conflict. For instance, it's quite unlikely that Lockheed Martin or Dupont (among numerous private and public organizations) would agree that the Chinese threat is inflated. Similarly, while there has not been physical destruction, intrusions into critical infrastructure already exist and could lead to sabotage during times of heightened tensions. Moreover, the aggregation of health records, background checks, and travel records, to name a few, together provides a vast network view of US citizens that can be used for recruitment, blackmail, and exploitation of vulnerabilities. Just because the full extent of the possible has not yet occurred does not mean that preparation of the operating environment is not well underway.

 

The Tech Community Embraces China

From Cloudflare's venture with Baidu to Microsoft's partnerships with politically connected Chinese companies to Google's latest partnership with Huawei to make the Nexus 6P, one might believe the tech community is openly embracing the world's largest market. However, US companies' growing concerns over IP theft and increased restrictions on doing business in China have led to increasingly strained relations. Last week's forum in Seattle, organized by Xi to bring together Chinese tech CEOs with their US counterparts, illustrates these growing tensions as well as the challenges of doing business in China. For instance, there were notable absentees from the invite list, apparent in the forum's class picture, which lacks the Google, Twitter and Uber CEOs. Moreover, this forum normally does not require CEO-level attendance, but China threatened regulatory scrutiny against invited companies that did not send CEO-level representatives, effectively rendering attendance mandatory for any company hoping to avoid retaliation. Furthermore, this summer's announcement that China will be inserting cybersecurity police into tech companies is indicative of its ongoing push for greater control of the internet, which runs counter to the internet freedoms and global norms promoted by the US government and tech companies alike. The tech community is increasingly coming to grips with the tradeoff between access to the largest market and the acknowledgement that the Chinese government could exploit their technologies as part of its ongoing censorship campaign. In addition, China's crackdown on VPN access and its use of US partnerships to build domestic competitors are evidence of the Chinese strategy to replace all foreign technologies with domestic counterparts by 2020. This is hardly the warm embrace corporations seek.

 

Deterrence & Credible Commitments

The notion that last week's agreement could be a deterrent fails to account for the fact that deterrence depends on credible commitments, which are sorely lacking in the Sino-American relationship. Xi's stance that China does not steal IP or PII from the US, despite the ever-growing list of intrusions, sparks little confidence when it comes to his ability to commit to the agreement. Those in the national security community as well as the tech community have a hard time taking him at his word.  This skepticism is compounded by the fact that the agreement was negotiated under the threat of sanctions. Leaders are self-interested actors, and Xi was able to shape the agreement to stall (temporarily?) sanctions while enabling him to maintain his stance that China does not conduct cyber espionage. Finally, the agreement not only lacks any compliance mechanism, but it also fails to address the theft of PII and is so nebulous in so many areas that the Chinese government can easily continue to lean on proxy actors inside and outside of government and feign ignorance whenever a new intrusion is identified. Clearly, this is not what is meant when discussing deterrence, as there has been little to no impact on the decision calculus of the Chinese, which is at the core of successful deterrence.

Discussion of détente is as ridiculous as comparing Chinese open economic policies to Glasnost, or their anti-corruption campaign to Perestroika. Obviously, it's important diplomatically to seek to prevent the growing intrusions, but it's naïve to believe this might be the first step toward achieving a deterrent effect. As yesterday's Senate Armed Services Committee hearing demonstrated, there is little faith in the agreement, and it will likely be forgotten as soon as the next major breach is revealed. In that regard, the aspect of the past week that may have the longest media cycle is not so much the idea of a plausible détente, but rather the attire of Silicon Valley's CEOs, who stunned the Twitter-sphere by proving they do in fact own suits.

To Patch or Not to Patch? The Story of a Malicious Update


While it's unlikely that Shakespeare had patching in mind when he penned "to be or not to be", I started thinking about this seemingly simple question the other day when I heard about a recent Microsoft out-of-cycle patch (which means that Microsoft pushed out a critical patch outside of its regular "patch Tuesday"). Patching is always a good idea, but not all patches are created equal - especially if they are received via email.  Those are almost always malicious.  Since some readers of this blog may not have experienced this first-hand, I'd like to share an example of a malicious campaign with you and explain how a link to a malicious binary, spread via email under the guise of a Microsoft update, can have a catastrophic effect.

The malware link shown below in Figure 1 was spread via an email phishing campaign that purportedly had a cyber spin to it.  The body of the email notified the recipient that an urgent update was necessary and included a hyperlink to a malicious download; the link still hosted the malware, so it was easy to obtain for analytical purposes.  It's important to note that while I'm not at liberty to disclose the actual email, the malicious link was reported by urlquery.net, an online service for testing and analyzing URLs for malicious content, making it an excellent resource for researchers (portions of the link have been redacted for security reasons).

 


Figure 1: urlquery.net displaying a malicious link to malware

After obtaining the malicious ZIP file, closer inspection revealed the compressed archive file contained an executable binary named ‘Mse-Security.Update.exe’ (binary icon included below)—and this is where our story begins.

                                                       

Upon execution of the binary, the unknowing victim wouldn't notice anything unusual.  However, underneath the operating system's hood, the story is quite different.  The user would have been none the wiser that several artifacts were dropped on their system, including a persistent executable binary (.exe) along with two dynamic link library (.dll) files.  Nor would the user have known that the malware would capture keystrokes and web activity and document all running applications.  Not only that, the user wouldn't have been aware that the malware was also grabbing screen captures of their desktop, then disseminating all of this collected data back to its Command and Control (CnC) server via FTP.  And this is only the beginning.  The real fun comes when we start dissecting this in greater detail.

‘Mse-Security.Update.exe’ is a dropper that drops four files contained within a newly created directory named ‘LMCAEC’.  This directory is created with System Hidden attributes and it resides in the application data directory for all users.  Here is how it appears on Windows XP:

 

            Figure 2: LMCAEC Directory Tree

The binary PLG.exe is the persistent implant, which has a registry RUN key to ensure it runs at startup.  It also has a unique icon. Some will recognize this as the Ardamax Keylogger (see below).

 

Upon execution, it captures, encodes, then writes four different types of stolen data.  Each of these data types gets stored in another directory named IGW, which also resides within the same all-users path.  The contents of these files are continually appended to.  Figure 3, below, shows an example of the directory as well as the type of data stored within each file:

C:\Documents and Settings\All Users\Application Data\IGW

PLG.001                                                          (keys)

PLG.002                                                          (web)

PLG.004                                                          (apps)

screen_[datetimestamp].005                       (screen capture)

Figure 3: IGW Directory Tree

 

There’s also some network activity, but we’ll look at that in a moment.  First, let’s take a look at the stolen data files.  At first glance PLG.001, PLG.002 and PLG.004 look similar, but obfuscated.  After closer inspection, however, a few things jump out.

The first thing I noticed was the appearance of an every-other-byte pattern consisting of the same two bytes (see red highlights in Figure 4).  These turned out to be extra bytes, probably thrown in for obfuscation.  These extra bytes begin appearing regularly at offset 0x12, but they also appear in the first dword (or 4 bytes) of the file (also highlighted in red).

Second was a 2-dword (or 8-byte) separator, or delimiter (see green highlights in Figure 4).  The first dword of the separator consists of null bytes, while the first byte of the second dword contains the length (in hex) of the data segment to follow (i.e. 0x8A = 138; 0x6A = 106; 0x4A = 74).  These data segments are the encoded stolen data.  Additionally, with the exception of the first segment, each concluded with a dword bitmask (see black highlights).

Most important, however, was the encoding key.  The stolen data is encoded with a 2-byte xor key found interspersed with the extra bytes within the first dword of the file (see blue highlights in Figure 4).  Once the extra bytes are removed, this 2-byte key can be applied to decode the data.  I'll expand upon this in more detail shortly.

Figure 4: PLG.001 Encoding Schemes (with decoded data)

Scripting a decoder gives you a quick peek at the stolen data from a command line, as shown in Figure 5.
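The decoder in Figure 5 is written in Python; purely for illustration, a Go sketch of the scheme described above might look like the following.  The exact offsets, the byte order of the key, and which byte of each interleaved pair is filler are assumptions inferred from the description, so treat this as a starting point rather than a verified tool:

// DecodePLG recovers the 2-byte XOR key from the first dword (where it is
// interleaved with filler bytes), skips the filler bytes in the body, and
// XORs the remaining bytes with the key.
func DecodePLG(data []byte) []byte {
  if len(data) <= 0x12 {
    return nil
  }
  // Assumed layout for PLG.001/PLG.004: key:filler:key:filler in the first
  // dword; PLG.002 and PLG.00 flip this, putting the key at offsets 1 and 3.
  key := [2]byte{data[0], data[2]}
  var out []byte
  for i := 0x11; i < len(data); i += 2 { // assume data at odd offsets, filler in between
    out = append(out, data[i]^key[len(out)%2])
  }
  return out
}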

Figure 5: PLG.001 output from a python decoder

Interestingly, analysis of PLG.004 revealed it followed the exact obfuscation scheme of PLG.001, whereas the extra bytes and encoding key were flip-flopped within the first dword of PLG.002.  Figure 6 illustrates this by highlighting their respective ‘xor keys’ in blue and their ‘extra bytes’ in red.  Notice too that these values are different for each file – they are variables created on the fly during run time.  Another interesting piece to these files can be found at offsets 0x11 through 0x21.  Every other byte decodes to ‘Wonderful’ (see highlighted green below—did you spot that previously in Figure 5?).

Figure 6: File Comparison (key in blue, extra bytes in red)

After examining the three files above, I took a quick peek at the config file ‘PLG.00’ in the LMCAEC directory.  It too began with the same encoding scheme described above, but it followed the path of PLG.002 in that the first dword was:  extra byte : key : extra byte : key (see Figure 7).  Moreover, once the file is decoded, two interesting strings appear between offsets 0x121 and 0x14B: ‘a5XXXX64’ and ‘metXXXXXXt85’ (portions intentionally redacted with Xs).  These strings can be seen below in the pop up box in Figure 7, but we’ll come back to them momentarily.

Figure 7: PLG.00 Decoded

First, let’s get back to the files in the IGW directory for a moment.  These eventually have the ‘PLG’ in their name replaced with a date time stamp.  For example, PLG.001 becomes something like 2015-04-08_13-01-23.001.  These files are still encoded as described above; however, within milliseconds of their respective name changes, they are decoded and the contents are added to an html page for exfiltration.  The html page has the same basic naming convention, but the filename is prepended with a ‘flag’ indicating the type of data contained within the html page.  These flags are: App, Keys, Web and Screen.  This means our file ‘2015-04-08_13-01-23.001’ becomes ‘Keys_2015-04-08_13-01-23.html’.  Figure 8 demonstrates their respective naming conventions before and after.

Figure 8: Stolen Data Files (naming convention: before and after)

The html files are then exfiltrated via FTP to the CnC server ‘aymail[.]site11[.]com’ logging in with the credentials: username ‘a5XXXX64’ and password ‘metXXXXXXt85’ (remember those from our PLG.00 file?).  This login can be seen in Figure 9.

Figure 9: FTP Login Session

Once these are pushed to the CnC server, both versions of these files are deleted from the system.  Interestingly, the html files are in the clear as can be seen in Figures 10 through 13, detailing examples of the exfiltrated data.  This data loss could be quite damaging depending on the unsuspecting user’s activity.

Figure 10: Keystrokes by the Victim User

Figure 11: Websites Visited by the Victim User

Figure 12: Applications Used by the Victim User

Figure 13: Exfiltrated Screen Capture of the Victim User’s Desktop

And there you have it – a day in the life of a malicious update; one that updated nothing except the attacker’s stolen data repository.  Before signing off though, I’d like to leave you with a chronological snippet of the malware during runtime.  Table 1 details the dropper and implant along with their respective operations and results.  The chronology is followed by the file identifying hashes of the malware discussed within this post.  Until next time--patch smartly!

Table 1: Chronological Gist of Malware During Runtime

File Identifiers:

 

If this sort of analysis interests you, check out our Senior Malware Research Scientist position. We are always looking for great malware research talent to join the Endgame team!

The State of the State: Tech & Data Science


A few years ago Jeff Hammerbacher famously claimed that, "The best minds of my generation are thinking about how to make people click ads." This seems to have only marginally changed, with teams of data scientists in Silicon Valley often devoted to discovering solutions that yield indiscernible improvements within a broader range of recommender engines. In large part, data science within the tech community remains focused on e-commerce and the sharing economy – which are largely at the point of diminishing returns from a customer's perspective – instead of disrupting industries such as healthcare, education or security. This general lack of integration of data science innovations into products in other realms is anecdotally reinforced at the various data science focused conferences, which overwhelmingly present incremental changes to driving times, deliveries, or more targeted shopping experiences. Areas awash with data at scale – such as security – rarely even garner a blip on the radar at data science-focused tech conferences.

The failure of data science to extend significantly into products in new industries may be a major contributing factor when looking at data science within the 2015 Gartner Hype Cycle for Emerging Technologies. The 2015 Hype Cycle divides the various approaches within data science, placing each of them just before or after the peak of inflated expectations, including machine learning and NLP. Interestingly, digital security remains in the innovation trigger phase, highlighting the great opportunities that exist in the security space.

 

 

Below is a quick synopsis of some observations from a range of data science and technology focused conferences I’ve attended on both coasts this year. In short, Hammerbacher’s admonitions are as relevant today as they were a few years ago. However, this does not need to be the case, with great opportunities for data science to disrupt the security industry.

 

Current State of Data Science

  • Much of the Same: Targeted marketing continues to prevail, with emphasis on fine-tuning the already refined and complex algorithms for better shopping experiences and search results within sites.
  • Diminishing Returns: Large teams are focused on incremental improvements to the user experience, creating an ever bigger void between what users understand is being done with their personal data and the reality. Much of this also focuses on social media mining for marketing and e-commerce purposes.
  • Black Box Approach: Although hailed by the Harvard Business Review as the sexiest job of the twenty-first century, there are signs that many believe current work by data scientists will soon be automated or simply is not the silver bullet it has been portrayed to be in marketing and media materials. The prevalent mentality belittles domain expertise of the data and/or data science techniques in favor of a black box approach. This impacts the frequency and kind of data collected, what questions can be addressed with the data, and even the theoretical validity of the multitude of correlations that are bound to occur in a large-scale data environment.
  • Chasing Fads: The majority of data science research and development focuses on edge cases that solve niche problems, instead of the more common problems whose solutions would be most disruptive across an industry.  While the technology may be novel and groundbreaking, it may provide little utility for a product. Theoretically interesting breakthroughs that fail to be relevant for a product remain stovepiped in the Ivory (or Silicon) Tower.
  • General misperception of data science: The less technical conferences with sections on data science or big data generally feature lengthy Q&A sessions, which reveal the ongoing struggles of those outside the field to comprehend how data science might be applied within their company or industry. In many cases, companies have hired data scientists but aren't really sure what to do with them. The media portrayal does not help in this regard, arguing that BI tools can serve as next-gen data science.

 

Data Science’s Next Disruption:  Security

The Gartner Hype Cycle for Emerging Technologies' bleak outlook for data science highlights the necessity for data science to expand into products in industries beyond the e-commerce, sharing economy, and marketing realms. These markets have greatly benefited from machine learning and other data science techniques, but could very well be at the point of diminishing returns. In contrast, the security community – which is ultimately a key player in the protection of individual privacy as well as economic and national security – greatly underachieves in integrating vetted and advanced data science techniques into commercial software. The vast majority of security products are based on rules and signatures, which are tenuous and fail to scale or generalize to current environments. While there is arguably a growing emphasis on quantitative approaches to security research, these remain one-off services, with very few actually making their way into products that could truly disrupt an industry that remains focused on Cold War, perimeter-based mindsets.

There are great opportunities for data science to play a critical role in the next generation of security research and product instantiation. There is untapped potential for the application of anything from machine learning to natural language processing to dynamic, Bayesian approaches that can be automatically updated with prior and additional knowledge. Similarly, the socio-technical interplay is another under-explored area. For instance, time-series econometric models could help inform repeatable and scalable risk assessment frameworks. Finally, there is the unfortunate perception that security-related work is orthogonal to individual privacy. In fact, data science algorithms should help inform the next wave of privacy features – ranging from encryption to fraud detection to preventing the extraction of personally identifiable information by malicious actors.

 

Join Us at the Data Mining in Cybersecurity Meetup

Data science within security is admittedly difficult, with low tolerance for errors and few open datasets for training and testing. These challenges, however, make the work that much more rewarding and impactful. Endgame's data science and research and development teams are increasingly pursuing many of the established and bleeding-edge techniques in data science across a wide range of data feeds. If you'd like to meet some of the team and hear more about our research, we'll be hosting the Data Mining in Cybersecurity Meetup in San Francisco on November 12th.

Adobe Flash Vulnerability CVE-2015-7663 and Mitigating Exploits


Today Adobe released a patch for CVE-2015-7663 [1] that addresses a vulnerability we discovered in Flash Player.

The vulnerability exists due to the improper tracking of freed allocations associated with a “Renderer” object when handling multiple progress bar additions. This can be forced to overflow a Bitmap object corrupting adjacent memory. As we will discuss later, we originally exploited this bug in the lab using the common Vector length corruption target.

In this post I wanted to focus on mitigating the exploitation of Flash Player, and the challenges associated with it, instead of the traditional look at this particular vulnerability in detail.

But first, a little insight into why we see Flash in APT campaigns and exploit markets. From the attacker's perspective, Flash is an amazing access capability.

1. It's cross platform
2. It's cross browser
3. It can be embedded in other documents and formats
4. It has a very rich programming language available
5. It's easy to fuzz
6. There is so much code, vulnerabilities are sure to shake out

Because of this, attackers know that a good Flash exploit can give them reliable access to Windows, Linux, OS X, and Android systems through Chrome, Firefox, IE, Reader, Office, and more! For these reasons Flash exploitation is valuable, and will continue to be so. One capability could easily cover a large majority of all Desktop targets. 

 

Adobe Vector length corruption technique

The Vector corruption technique we used to exploit CVE-2015-7663 has been publicly known since at least 2013 [2]. It is a classic exploitation concept that provides a few "nice to haves" for an exploit writer.

1. Corruption gives the attacker read/write of virtual memory
2. You can allocate arbitrary sizes
3. It is resistant to corruption and application crash
4. There is no validation or protection of its contents

Due to the popularity of this technique over the past year, we have seen a rise [3] in Flash exploits [4] using it.

 

Vector Isolation

In response to the widespread use of this technique, Adobe has strengthened the security posture of Flash by adding two defenses to help reduce the effectiveness of zero day exploits [5].

Previously, Flash used a single heap for allocating all ActionScript objects. This allowed an attacker to target a Vector object's length property when overflowing an adjacent buffer, by coercing the allocation algorithm to position different objects consecutively. Corrupting the length this way gave attackers read and write access to virtual memory, making it simple to bypass ASLR and execute code. Here is a graphic showing this memory layout.

 

 

However, starting with Flash Player version 18.0.0.209 [6], Adobe has made this more difficult.

Now, Flash Player allocates Vector objects in the default runtime heap, instead of the heap associated with ActionScript interpretation. This effectively removes the ability to coerce the allocator into creating adjacent blocks of memory for an attacker to use when corrupting the "length" property. The memory layout now looks more like this.

 

 

 

This concept of moving specific allocations into separate heaps is called “Heap Isolation”.

The idea of isolating heap allocations for security purposes is not new. In fact, Microsoft [7], Mozilla [8], and others have been pursuing the idea for years. This approach disrupts some of the steps in the exploitation process that are typical of almost all exploits leveraging memory corruption vulnerabilities.

1. Allocate memory linearly in a predictable order of specific sizes
2. Free a subset of those allocations, creating predictable "holes"
3. Trigger a vulnerability that falls into the predictable locations and corrupts adjacent memory allocated in step 1.

Heap isolation effectively breaks Steps 1 and 3 by placing certain objects in isolated sections of memory. For example, only Vector objects can be allocated in Heap A, and only ByteArray objects can be allocated in Heap B. Never together. This ensures an attacker cannot allocate memory of another type adjacent to the target objects, making it impossible to control virtual memory precisely enough to corrupt adjacent allocations of interest.

 

Vector Property Guarding 

Additionally, because moving an allocation is not sufficient in some cases, the Vector object now contains a precomputed value [9], often called a "cookie", that is checked for consistency before the length property is used. If an attacker corrupts this cookie, the Flash application aborts and alerts the user. This breaks #4 in the "nice to haves" we discussed earlier.

Unfortunately, we know from experience that preventing a specific technique, such as Vector length corruption, will not stop attackers [10]. Instead, we see new techniques pop up with the same characteristics outlined above. There are likely many more "Vector-like" objects available in ActionScript/Flash Player beyond ByteArrays, and attackers have already found them.

 

Endgame Heap Isolation

Beyond vulnerability discovery, Endgame's Vulnerability Research and Prevention team is also focused on mitigating and detecting exploits.  We provide protections to customers against whole classes of attacks, without the need for source. We feel particularly aligned to do this because of our extensive experience discovering and exploiting software vulnerabilities.

One of our first research efforts focused on generic enforcement of heap isolation. Instead of enforcing isolation on specific objects like Vectors, we apply it to every object that fits our criteria. This is particularly well suited for prevention of the previously described techniques, as well as vulnerability classes like Use-After-Frees (UAF).

An attacker exploiting a UAF must reallocate a different object into the freed memory location when an object has been released. This reallocation is what eventually gains the attacker code execution by controlling the function pointers in an object.

Forcing heap isolation ensures the attacker can only reallocate the original object, effectively preventing exploitation. The illustration below helps to visualize this effect.

 

Before

 
After

 

This can be a powerful mitigation against specific bug classes and in our testing it has been proven to prevent a large portion of reported vulnerabilities. But we can do more. 

 

Endgame Control Flow Integrity

In addition to heap isolation, we can also enforce control flow integrity (CFI) policies on an application. Whereas heap isolation can be very effective at preventing successful exploitation, a CFI-based approach additionally allows us to detect active exploitation attempts, since we are inspecting and validating when control flow (the path that an application executes) has changed. In the majority of exploits we have studied, there is a point when the attacker must "hijack" control of the process to begin executing a ROP chain (used to bypass DEP) or arbitrary code.

To accomplish this, Endgame has adapted and expanded on the idea of utilizing processor functionality to determine the exact moment when this happens. Inspired by a novel approach published by researchers at Fudan University [11, 12], we leverage CPU branch misprediction [13], allowing us to introduce control flow integrity policies without expensive binary modifications to a target application such as hooking or dynamic instrumentation.

We have extended this technique to work on both Linux and Windows 64-bit operating systems and have used it to detect our exploitation of CVE-2015-7663 as well as others, including CVE-2014-0556 [14] and the exploit used in the APT campaign Russian Doll, CVE-2015-3043 [15].

The following output shows our system catching the exploitation of CVE-2014-0556 on a 64-bit Linux host.

[Console output: anomalous branch detected, from libpepflashplayer.so to libpepflashplayer.so]

 

The FROM_IP in the anomalous branch detection is the point when the exploit has control over execution.

 

 

The TO_IP is the beginning of the payload. In this case no ROP is used, which means the attack would be missed by ROP-only detection methods.

 

 

The following screenshot shows the full system preventing this exploit in real time.

 

This work is exciting, as it has already shown its effectiveness at comprehensively detecting unknown exploits regardless of the specific technique used by observing abnormal program execution indicative of exploitation.

 

Conclusion

We know from experience that vulnerabilities and exploits will continue to make headlines. With the ubiquity of Flash and its high value, attackers will invent creative ways to exploit bugs. We have already seen how Adobe’s recent mitigations are a great step forward, but are not keeping pace with the attackers' ability to exploit vulnerabilities. We understand it’s an iterative process that eventually poses a significant limitation to attackers, but there is still a long way to go.

Endgame is working hard to defend against advanced attacks on all software by developing cutting edge mitigations that work in tandem with strong vendor protections, affording the end user better defense in depth. Our unique experience allows us to test real exploits against real software, something we find necessary for providing adequate protections.

Look for future posts where we cover additional mitigations and share more vulnerabilities!

 

References

[1] https://helpx.adobe.com/security/products/flash-player/apsb15-28.html

[2] https://sites.google.com/site/zerodayresearch/smashing_the_heap_with_vector_Li.pdf

[3] http://krebsonsecurity.com/2015/07/third-hacking-team-flash-zero-day-found/

[4] https://www.fireeye.com/blog/threat-research/2015/06/operation-clandestine-wolf-adobe-flash-zero-day.html

[5] http://googleprojectzero.blogspot.com/2015/07/significant-flash-exploit-mitigations_16.html

[6] https://helpx.adobe.com/security/products/flash-player/apsb15-19.html

[7] https://labs.mwrinfosecurity.com/blog/2014/06/20/isolated-heap-friends---object-allocation-hardening-in-web-browsers/

[8] http://robert.ocallahan.org/2010/10/mitigating-dangling-pointer-bugs-using_15.html

[9] http://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html

[10] http://blog.trendmicro.com/trendlabs-security-intelligence/latest-flash-exploit-used-in-pawn-storm-circumvents-mitigation-techniques/

[11] http://ipads.se.sjtu.edu.cn/_media/publications:perf-apsys.pdf

[12] http://ipads.se.sjtu.edu.cn/_media/publications:cfimon.pdf

[13] https://en.wikipedia.org/wiki/Branch_predictor

[14] http://googleprojectzero.blogspot.com/2014/09/exploiting-cve-2014-0556-in-flash.html

[15] https://www.fireeye.com/blog/threat-research/2015/04/probable_apt28_useo.html


Beyond Privacy: Trans-Pacific Partnership & Its Potential Impact on the Cyber Domain


For months, there has been sharp criticism of the secret negotiations surrounding the Trans-Pacific Partnership (TPP), which is on track to become the world’s largest trade agreement, covering 40% of the global economy. If implemented by all twelve countries involved, this trade agreement would have profound geo-political consequences, largely driven by the exclusion of China from the agreement. As the White House website states, the TPP is a means to rewrite the rules of trade, otherwise “competitors who don’t share our values, like China, will step in to fill that void.” Clearly, in addition to the pursuit of trade openness, this agreement is a major geo-political tool to shape global norms to the US advantage. The geo-political consequences – which may very well play out in the cyber domain – have been all but ignored by the tech community, which has focused almost entirely on the agreement’s privacy implications. This is unfortunate and leads to myopic conversations that ignore the agreement’s larger implications for the tech community, and specifically cybersecurity. While Internet privacy absolutely is a high priority, those privacy-centered arguments are misplaced. Instead, the tech community – and especially security – should be very wary of how China may respond to this open pursuit of economic containment. At the very least, in the short term it does not bode well for the cooperative agreement made last month between China and the US. To help briefly fill this void, below is a cheat sheet of sorts for those unfamiliar with trade agreements and the TPP's more probable implications in the cyber domain.

  

  1. Misplaced criticism – Critiques focused solely on the TPP and its potential infringement on digital privacy are simply misplaced. In fact, based on the few aspects of the agreement that address the digital domain, the intent is to protect privacy, not erode it. Moreover, trade agreements have a statistically significant relationship with decreasing government repression, and can support democratic consolidation. If the TPP follows the pattern of other trade agreements (especially those including some democratic members), it likely will also help support Internet openness, not repression. This includes a component that protects organizations from having to submit source code, thus addressing a major concern of the tech community when working abroad. Even setting those arguments aside, the more appropriate target for international online privacy concerns would be the rise of bilateral cyber agreements, not trade agreements.
  2. Turn the map around – Any global map of the TPP with the Eastern hemisphere on the left quickly highlights the economic containment of China. This is reinforced when considering China’s pursuit of shaping global economic norms through the Asian Infrastructure Investment Bank, and more recently its pursuit of membership in the European development bank. Although there has been limited talk about including China in the TPP, China would have to adhere to rules of privacy protection that are completely orthogonal to its domestic interests and policies around the Great Firewall. Viewing the TPP through the lens of China quickly highlights the likelihood that it feels encircled, and may respond accordingly.
  3. Stumbling blocs – Depending on how the TPP plays out, it could mirror the discriminatory trade blocs prevalent during the Interwar Era that helped lay the foundation for future conflict. While trade agreements increase trade between member countries, they can also lead to a contraction of global trade, and certainly heighten tensions between member states and those excluded from the agreement. Clearly, the current environment is extremely different from that of the Interwar Era, but trade agreements that perpetuate geo-political fault lines will, at best, fail to improve relations that are already tenuous.

 

Given the larger geo-political context surrounding the TPP, those in the tech and security communities would be well advised to spend some time looking at its broader implications. In reality, the TPP could be a means to expand the US vision of a free and open Internet within signatory states, but that gets lost on those who don’t see the whole picture. This would be especially true if China actually becomes a signatory in the future, much as it did with WTO membership. Until then, China likely perceives the agreement as one meant to spread US influence in the region, and may be considering a range of retaliatory responses, including its persistent reliance on digital statecraft to achieve political objectives. Instead of a myopic focus on Internet privacy, the security community’s broader concerns should focus on China’s potential retaliation in the digital domain, which impacts national security, economic stability and – given the enormous data breaches of the last year – privacy. 

 

May the Source Be With You: 4 Implications of China’s Latest Stance on the OPM Hack


According to the Chinese state-run Xinhua news agency, the OPM breach “turned out to be a criminal case rather than a state-sponsored cyber attack as the U.S. previously suspected.” Yesterday, the Washington Post similarly reported the arrest of Chinese criminal hackers, a story which has since circulated and been sourced across numerous outlets.

Similar to remarks following the US-Sino cyber agreement from September, many pundits are claiming a sea change in Chinese cyber activity. These perceptions unfortunately ignore centuries of theories and data on how states manage the tight balancing act to appease both international as well as domestic audiences. The need to assuage both international and domestic groups leads states to pursue policies that support their own incentive structure and overarching goal of staying in power. By focusing on this latest news from the Chinese government’s perspective, it’s easier to extract insights on their actions and the plausible gap between what is said in the diplomatic arena and what occurs in the nebulous realm of cyberspace.

Below are four assumptions that – when viewed through a strategic, Chinese perspective – should be met with a solid amount of skepticism as the OPM plot thickens:

  1. OPM was not state-sponsored. China has devoted significant capital to claiming it is not a perpetrator of malicious activity in the cyber domain. By allegedly finding the criminal group behind the OPM hack, China is able to save face internationally and maintain the façade of pursuing only benign activity in cyberspace. Moreover, by identifying the perpetrators as Chinese criminals, the Chinese government rationalizes away any evidence that may point to China, while distancing the government from any involvement.
  2. China is holding domestic criminals accountable. The Chinese government has a long history of leveraging scapegoats, as is evident in the ongoing crackdown on corruption. Accountability and scapegoating are very different, and conflating the two only leads to erroneous interpretations of activities.
  3. Norms are working. Xinhua’s announcement supports the ongoing perception that US-shaped global norms may be impacting Chinese digital activity. Unfortunately, this ignores the difficulty in establishing norms, which generally follow a steep S-curve and take significant time and resources to establish in the international system. Moreover, China's overt announcements toward cooperation occurred just as the US was about to impose economic sanctions over the string of breaches attributed to China, including OPM as well as GitHub, United Airlines, Anthem, and the National Oceanic and Atmospheric Administration, to name a few. This behavior does not change overnight, nor do norms become embedded quickly enough to alter behavior that significantly. Conversely, self-interest (not so-called cyber norms) dominates states' behavior, and will continue to rationalize the gap between diplomatic behavior and covert activity.
  4. The source is credible. Finally, a dominant source of information on the arrest of Chinese criminals for OPM is Xinhuanet, run by Xinhua News Agency, the official media outlet of the state government. Like virtually all state-run media platforms in non-democracies, Chinese state-run platforms have a reputation for serving as a propaganda tool of the state. In the 2014 World Press Freedom Index, China ranks 175 out of 180, barely edging out Somalia and Syria in press liberties.

As additional details are disclosed over the next few weeks and months regarding the OPM hack, greater scrutiny of the sources and incentive structures should be explored before making grand assertions of strategic behavioral shifts. Diplomatic maneuvering between states to shape both domestic and international perceptions is an omnipresent characteristic of the international system. It would be wise to remain cognizant of motives and activities before believing the next state-sponsored media report.

Why Banning Tor Won’t Solve France’s National Security Problem


Throughout the second half of this year, there has been much heated debate about proposed changes to the Wassenaar Arrangement, which seeks to expand export controls on dual-use technologies, including those that pertain to intrusion software. While the intention was good, the first iteration of these changes, released this past spring, was more likely to hurt those who adhere to the arrangement while tilting the playing field in favor of non-participants (e.g. China, Iran, North Korea).

This week’s proposal by the French government to impose a ban on Tor – the most popular anonymity network – is just the latest in a series of myopic policy solutions (e.g. the encryption debate) that similarly seem to entail undesirable externalities. In the wake of the Paris terrorist attacks, the government likely feels obliged to expedite policy changes to demonstrate a tough stance against terrorist activity. Unfortunately, banning Tor not only will fail to meet those objectives, but will also disrupt the democratic ideals of a free and open Internet. Below are four of the key problematic issues that arise from this initial policy proposal.

  • Failure to deter terrorist activity. Malicious actors – including terrorists, criminal networks, and lone wolves – will simply adapt and find another venue for their activity. Tor is just one aspect of a multi-pronged OPSEC strategy pursued by ISIS and other groups. In fact, roadblocks are more likely to force adversaries toward more innovative strategies and activity outside of the law enforcement radar.
  • Difficult to enforce. At a technical level, it is not difficult to identify whether a specific computer is communicating directly with the Tor network. The difficulty lies in attributing the actual person behind the keyboard, especially given the ability to bounce connections, obfuscate connections via proxy or VPN, and the possibility of multiple users at a single computer. In fact, banning Tor eliminates a known source of activity and data, thereby making it arguably much harder to monitor and attribute criminal and terrorist behavior. Identifying whether a particular individual is using Tor inherently involves monitoring Internet usage, which may require additional legal provisioning. Finally, the technical logistics of implementing the ban are far from simple; even if it were enacted, enforcement returns to the question of whom to charge if a given computer is discovered to be using Tor.
  • Decreases civil liberties. While the French proposal is in response to terrorist activities, it is more likely to harm those human rights and civil liberties groups who use Tor to express their perspective, collaborate, and coordinate with journalists. The only people that will stop using Tor in France as a result of the ban are people who were using it for legal purposes. Anyone using Tor for an illicit criminal or terrorist agenda will continue to use Tor. The ban therefore decreases an important outlet for these civil liberties groups while enabling illicit activity to persist.
  • A global Splinternet. Despite the widespread perception that state boundaries are obsolete, they do in fact still matter. A ban on Tor in France would accelerate the trend toward a Balkanized Internet, again undermining the spread of a free and open Internet. Moreover, just because Tor is blocked in France does not mean that malicious actors can’t access it elsewhere. This is especially pertinent and returns to our first point: because the attacks in Paris were largely planned in Belgium, it is extremely unlikely – based on what we know now – that this legislation would have prevented them even if it had been in place in France prior to the attacks.

The policy debates around Wassenaar, encryption, and now the ban on Tor all reflect the naïve belief that a policy can simply make these capabilities disappear. The genie is out of the bottle. Instead of placing bans on these technologies, which will only hinder licit activity while enabling illicit activity, the policy world needs to dig deep and provide innovative solutions that better align with the realities of the modern world system.

Jamie Butler Cigital Podcast: On Enterprise Security and Thinking Like a Hacker


Today, Gary McGraw of Cigital spoke with our CTO Jamie Butler about enterprise security, thinking like an attacker, and his specialization in rootkit development. Head over to Cigital and listen in as Gary and Jamie discuss the attribution problem and how enterprises can think like a hacker.

A New Year, A New Normal: Our Top Cybersecurity Predictions for 2016


Each of the last several years has been dubbed the “year of the breach,” or more creatively the “year of the mega-breach.” But instead of continuing this trend and calling 2016 the “year of the uber-mega-breach,” Endgame’s team of engineers, researchers and scientists has pulled together its top predictions for the security industry. We anticipate a threatscape that will continue to grow in complexity and sophistication. And while policymakers have yet to acknowledge that cyber innovations like encryption, Tor, and intrusion software will not simply go away through legislation, global enterprises should recognize that the “year of the breach” is the new normal.

 

Increased Focus on the Cloud
Mark Dufresne, Director Malware Research and Threat Intelligence

Cyber attackers will increasingly interact with cloud services to acquire sensitive data from targets. Through compromising credentials and socially engineering their way into access, attackers will successfully gain control of sensitive data and services hosted by commercial cloud providers. In addition to data exposure, we may see companies that rely heavily on the cloud significantly impacted by ransom demands for restoration of cloud-hosted assets, potentially with new cases of outright destruction of data or services that are often perceived by users as backed up and secured by their cloud provider. As part of their continuing effort to evade detection, adversaries will increasingly use these same cloud-based services for command and control as well as exfiltration in order to blend in with the noise in modern Internet-connected environments. Encryption and the heterogeneity of most environments make drawing a distinction between legitimate and malicious activity very difficult. Attackers will increasingly take advantage of this heterogeneity, leading some organizations to increase investments in securing and controlling their perimeter.

 

Targeted Malvertising Campaigns
Casey Gately, Cyber Intel/Malware Analyst

State sponsored actors will continue exploiting the social dimension of breaches, focusing on susceptible human vulnerabilities in diverse ways, such as through targeted spear phishing or more widespread malvertising campaigns. Many of these widespread campaigns will become increasingly targeted given the growing sophistication of attacks. Spear-phishing is a very reliable method for a state-sponsored actor to gain a foothold into a given network. In contrast, malvertising is more of a 'spray and pray' approach - where attackers generally hope that some of the millions of attempts will succeed.

Attackers could also take a more targeted malvertising approach by dispersing a series of weaponized ads for a particular item – such as weight training equipment. When someone conducts a search for “barbell sets” those ads would be auto-delivered to the potential victim. If the ads were targeted to fit the output, mission statement or goal of a specific corporation, the chance of victimizing someone from that company would be greater.

 

Increase in Mobile App Breaches
Adam Harder, Technical Director of Mobile Strategy

The volume of payments and digital transactions via mobile apps will continue to grow as end-users continue to shift from desktops and the web to mobile platforms.  Walmart is in the process of standing up a complete end-to-end mobile payment system, and 15% of all Starbucks revenue is processed through its mobile app.  Unfortunately, more of these apps will likely fall victim to breaches this year. Consider all the apps installed on your mobile device. How many of these are used to make purchases or view credit/loyalty account balances? Several popular consumer apps - Home Depot, Ebay, Marriott, and Starbucks - have been associated with data breaches in the last 24 months.

 

Public Perception Shift from Security to Safety
Rich Seymour, Data Scientist

People are slowly coming to realize the lack of implicit security in the systems they trust with their data. Many users operate under the false assumption that security is inherently baked into private services.  This isn't a paradigm shift for folks used to untrusted networks (like the manually switched telephone systems of the pre-rotary era), but people who simply assumed their information was stored, retrieved, and archived securely need to recognize that not only must they trust the physical security of a data center, they must also trust the entire connected graph of systems around it.  

Based on some leading literature from last year, including the work of Nancy Leveson, expect “safety” to become the information security buzzword of 2016. There could also be big things from the Rust community (including intermezzOS and nom) and from QubesOS.

 

Malicious Activity Focused on Exploiting PII & Critical Infrastructure
Doug Weyrauch, Senior CNO Software Engineer

With the rise in frequency and severity of data breaches, including those at OPM and several health care companies, cyber criminals and hacktivists are increasingly using PII and other sensitive data for extortion, public shaming, and to abuse access to victims’ health records and insurance. Unlike credit card information, personal health and background information cannot be canceled or voided. If health records are co-opted and modified by a malicious actor, it is very difficult for a victim to correct the misinformation. And with the US Presidential election heating up this year, it’s likely one or more candidates will suffer a breach that will negatively impact their campaign.

As more stories surface regarding the cyber risks unique to critical infrastructure, such as in the energy sector, terror groups will increasingly target these systems. In 2016, there will likely be at least one major cyber attack impacting critical infrastructure or public utilities. Hopefully this propels critical infrastructure organizations and governments to actually put money and resources behind overhauling the digital security of the entire industry.

How Banks' Spending on Cybersecurity Ranks If They Were Small Countries


Last week, our team predicted the biggest cybersecurity trends in the new year – specifically, that as attacks grow in complexity and sophistication, breaches will be the new normal.

Indicative of the growing importance of cybersecurity to critical infrastructure industries, the financial sector is responding to this new normal, and is investing its resources accordingly. In light of high profile breaches like JP Morgan Chase and the Carbanak campaign, current and anticipated spending on cybersecurity in the financial sector exposes the resources required to counter this new normal. To highlight this, we’ve compared cybersecurity spending of four of the largest banks to the GDP of four small countries to demonstrate the vast resources required to manage current and emerging threats.

 

 

With the new year kicking off with a high profile attack on the Ukrainian power grid, it is increasingly evident that the new normal is here to stay. Tackling this dynamic and complex threatscape requires organizations – especially those in the highly targeted critical infrastructure sectors –  to think like the adversary. That’s why we’ve built a solution that intimately understands adversarial techniques and tactics – enabling our customers to go from being the hunted to the hunter and identifying threats at the earliest possible moment before damage and loss can occur.

Endgame Crushes the Industry Average for Gender Diversity


In the State of the Union address on Tuesday, President Obama highlighted the important contributions of women in science and technology fields. Unfortunately, the tech industry on average has less than 30% women in the workforce, a figure that itself dwarfs the paltry 10% of women in cybersecurity, at any position. Endgame understands that today’s complex threatscape requires new thinking and diverse perspectives. As we continue to grow, we keep this in mind, with women comprising almost 42% of our recent hires. We value the contributions of all of our team members and continue to bring a diversity of perspectives to ensure our products and research and development are best prepared to tackle the adversaries of today and tomorrow. If you want to see firsthand presentations from some of our team members, please visit our RSA booth in San Francisco next month, or attend the Women in Cybersecurity conference in Dallas at the end of March.


Moving Beyond the Encryption Debate


With the Cybersecurity Information Sharing Act snuck into the omnibus budget bill in December, and the horrific terrorist attacks in Paris and San Bernardino, encryption has returned front and center as the next cybersecurity policy battleground.  Unfortunately, like so many reactive policy issues, the encryption debate remains muddled in myopic discussions that ignore the complex realities of both technology as well as the modern international system. Since the technological challenges have been widely covered, below are just three of the key structural social challenges that further indicate that it’s time to move onto more productive discussion regarding the national security implications of the cyber domain.

 

  • Collective Action Problem  – Similar to the Wassenaar Arrangement, any policy that depends on global adherence will fail unless it is in everyone’s interest to abide by it. Digital safe havens will continue to exist with and without legislation requiring backdoor access to data. Nefarious actors will take advantage of and circumvent any legal mandates if deemed in their best interest to do so. This is why norms are so challenging in this domain. Because – whether illegal or not – encryption without backdoor access will be used by criminals, spies and terrorists if it helps them achieve their objectives. Moreover, adhering to the law would then become a self-imposed competitive disadvantage for corporations as it could weaken the security and protection of their PII and IP. Weakening encryption assists those trying to exploit the system or limit civil liberties, while hindering those trying to protect them. Given the very widespread data breaches of the last few years, if anything, we need stronger security practices around our personal and intellectual data, not weaker.

 

  • Dictatorships – While the notion that we’re entering an era of authoritarian resurgence remains highly debated, it is clear that major powers such as China and Russia, as well as smaller states like Uzbekistan, continue to leverage the Internet as a key source of international statecraft and domestic control. Many state and non-state actors better achieve their objectives if the Internet is not free and open. In this case, encryption becomes part of their strategy of domestic control, either by implementing impenetrable systems to safeguard their own communications and data – an advantage, since they are not required to provide backdoor access – or by cracking into encryption as part of a larger surveillance strategy. Dictatorships further achieve these objectives by working with companies whose main purpose is to crack the encryption systems of companies such as Facebook and Google. As long as there are leaders who pursue domestic policies of censorship and Internet control, they will find ways to impose or crack encryption systems to their benefit. They also constantly pursue vulnerabilities and weaknesses to exploit – especially among pro-democracy groups and social media companies – and therefore will devote significant resources toward gaining access to data via any backdoor channels.

 

  • Head in the sand – Finally, as policy slowly muddles along to grasp technological realities, encryption systems are increasingly ubiquitous. The recent presidential debates demonstrated the void in comprehension of the problem and certainly did not provide viable solutions. On the one hand, the most recent Democratic debate avoided providing any coherent platform other than the need to balance security and privacy. The Republican debate similarly failed to offer viable solutions, with bewildering comments ranging from cutting off parts of the Internet to confusing statements about smartphone encryption. Unfortunately, it’s possible that reactive policy responses may win out over more thoughtful recommendations that clearly address the core problems. The recent terrorist acts put renewed pressure on Congress to respond quickly to a dominant national security concern, elevating the risk that misguided policy will be passed.

 

For instance, there has been talk of a bipartisan commission that would bring both DC and Silicon Valley leadership together to explore the problem, similar to the 9/11 commission. Worried that it will take too long, the Senate may instead push forth with encryption legislation that may not be an adequate solution to the actual national security challenges. A bipartisan commission – a rare display of unity in Congress – could help Congressional leaders better grasp the technical implications of their policies, while also helping the tech community better comprehend the complexity of modern national security challenges. Until then, based on the recent level of discourse, the more likely reality unfortunately is ill-conceived, reactionary legislation.

The encryption debate – centered at its core on whether there is a security and privacy trade-off – only continues to drive the wedge between DC and Silicon Valley deeper. It would be more productive for both the tech and policy communities to look beyond encryption. Although cybersecurity was not addressed in last month’s State of the Union address, hopefully meetings such as the one between national security leaders and Silicon Valley CEOs last month are a sign that these two sides can work toward more innovative solutions that meet both the technological and geopolitical realities of the current era. Of course, this will require both sides to compromise. Silicon Valley needs to accept that safeguards are necessary given the national security landscape, while Congress needs to lean on Silicon Valley to optimize the way advanced technologies can simultaneously protect both privacy and national security. Until then, we’re likely to see misguided policy proposals that are ill-fitted to achieve the desired national security objectives.

Distilling the Key Aspects of Yesterday’s Threat Assessment, Budget Proposal, and Action Plan


In light of the latest breach – including 200GB of PII belonging to Department of Justice and FBI personnel – yesterday’s news from DC is all the more compelling. As is often the case, the most intriguing aspects are hidden deep within the texts or spread across the various documents and hearings. To help make sense of this extremely active week in cyber policy, we have analyzed some of the crosscutting themes on the threat and policy responses found across yesterday’s threat assessment, budget proposal, and Cybersecurity National Action Plan (CNAP):

 

Disparate & Unprecedented Paths

  • State & Non-State Actors: The CNAP and the threat assessment both highlight the range of adversaries, including criminals, lone wolves, terrorists, and state-sponsored espionage (i.e. spies). The sophistication of their techniques clearly varies, but each type of threat actor is increasingly leaning on the availability and low risk of offensive cyber operations to achieve their objectives.
  • Adversaries’ offensive tradecraft: Threat actors are keeping all options on the table, pursuing the range of cyber statecraft from propaganda to deception to espionage. Both Russia and China rely heavily on misinformation and espionage, while data integrity and accountability are increasingly problematic, which has strategic-level implications for attribution and U.S. policy responses.
  • Targets: The targets vary depending on the threat actor, which means that most industries remain potential targets. Those entities with significant PII, IP, or critical infrastructure are at the greatest risk. These include power grids and financial systems, as well as defense contractors.
  • Tech & Data Science: Cyber and technology dominate all discussions of leading national security challenges, consistent with previous assessments. In contrast, data science and security are rarely referenced when talking about adversaries’ capabilities, but this year’s threat assessment breaks new ground in identifying the foreign data science capabilities of threat actors. While Director Clapper focuses more on foreign data collection capabilities, the sophistication of the data science will determine any insights that can be gleaned from the collection.
  • Between the lines: There is increasingly the potential for unintended consequences given the complex mix of actors, capabilities, and targets. Sophisticated digital tools in the hands of unsophisticated actors are likely to produce negative externalities. Moreover, adversaries’ risk calculus is extraordinarily slanted in favor of offensive attacks. As long as the benefits of a cyber attack outweigh the costs, prepare for more high profile breaches.

  

Multi-faceted Responses

  • Greater spending: The new budget proposal includes a 35% increase in cybersecurity spending to $19 billion. This will cover a broad range of initiatives, including new defensive teams, IT modernization, and broader training initiatives across society.
  • Additional bureaucracy: Just as the NCCIC was formed to create a central source for information sharing, the CNAP recommends the creation of a federal CISO. While the attempt is to parallel the organizational feature of the private sector, it may cause confusion considering there is an extant cyber czar.
  • Proactive hunting: Given the seemingly endless string of breaches, the CNAP calls for “proactively hunting for intruders”. This will be an interesting area to observe, as it’s among the first federal signs of an offensive-based strategy to defend the government networks.
  • Tech Outreach: The budget and the CNAP both stress the need for better government relationships with Silicon Valley. This includes the formation of a new commission comprised of national security experts and Silicon Valley technologists, which would be responsible for longer-term cyber initiatives. President Obama’s reference to the federal system as an “Atari game in an Xbox world” likely resonates with the tech crowd. However, given the absence of anything close to security at this week’s Crunchies, it is unclear whether Silicon Valley is ready to invest in the tough security challenges.
  • Elevated Role of R&D: The CNAP calls for a testing lab for government and industry to pursue cutting-edge technologies. Director Clapper similarly noted the need to stay ahead of the sophisticated research of many adversarial states in the realms of AI, data science and the Internet of Things. This may be another signal that we are working toward crafting this era’s Sputnik moment, just as President Obama described over five years ago.
  • Between the lines: Protecting digital infrastructure remains a top national security priority, with an emphasis on strengthening and diversifying our cyber defenses to counter the growing range of adversaries. Interestingly, the pursuit of norms to counter adversarial behavior was markedly absent, potentially because it has yet to have any clear deterrent effect. Instead, the budget and CNAP advocate for changes across the workforce, modernization of archaic federal IT infrastructure, creative strategic thinking, proactive cyber techniques, and strengthened partnerships between Silicon Valley and DC. This is a challenge that requires the best strategic thinkers working alongside the most innovative technologists to help secure the country’s critical assets. The budget battle has already begun, so it is uncertain whether many of these necessary changes will in fact become a reality.

Welcome to the Jungle: RSA 2016


RSA is just a few weeks away, and everyone is finalizing their dance cards. There are multiple opportunities to meet the Endgame team and talk about everything from the Endgame Hunt Cycle to data science to our global network of honeypot sensors to gender diversity in the cybersecurity workplace.

1. Booth 2127 – Stop by our RSA booth to learn how Endgame detects known and unknown adversarial techniques and eradicates them from enterprise networks. 

2. Lightning Tech Talks – We’re excited to share with you some of the great work of our R&D team. Building upon the theme of multi-layer detection, we’ll show you three distinct approaches to detection. The first focuses on strategic level trends, providing insights garnered from our global honeypot network. The second dives into dynamic malware analysis and the tit-for-tat interactions of defenders and attackers. The final talk describes our automated malware classification capabilities, which build upon the broad expertise of our data science team.

3. Personalized demo – Overwhelmed by the crowds and prefer a more quiet and calm environment to take a look at the Endgame platform? Schedule a private demo here.

 

Employing Latent Semantic Analysis to Detect Malicious Command Line Behavior


Detecting anomalous behavior remains one of security’s most impactful data science challenges. Most approaches rely on signature-based techniques, which are reactionary in nature and fail to predict new patterns of malicious behavior and modern adversarial techniques. Instead, as a key component of research in intrusion detection, I’ll focus on command line anomaly detection using a machine learning-based approach. A model based on command line history can potentially detect a range of anomalous behavior, including intruders using stolen credentials and insider threats. Command lines contain a wealth of information and serve as a valid proxy for user intent. Users have their own discrete preferences for commands, which can be modeled using a combination of unsupervised machine learning and natural language processing. I demonstrate the ability to model discrete commands, highlighting normal behavior while also detecting outliers that may be indicative of an intrusion. This approach can help inform anomaly detection at scale without requiring extensive resources or domain expertise.

 

A Little Intro Material

Before diving into the model, it’s helpful to quickly address previous research, the model’s assumptions, and its key components. Some previous work focuses solely on the commands, while some use a command's arguments as well to create a richer dataset.   I focus only on commands and leave the arguments for future work.  In addition, this work focuses on server resources, as opposed to personal computers, where command line is often not the primary means of interacting with the machine.  Since we are focusing on enterprise-scale security, I leave applications of this model for personal computers to future work.  I also focus on UNIX/Linux/BSD machines due to current data availability.

Authors in previous work often rely on the uniqueness of their set of commands.  For (an overly simple) example, developer A uses emacs while developer B uses vi, hence it is an anomaly if user A uses vi.  These works come in many forms including sequence alignment (similar to bioinformatics), command frequency comparisons, and transition models (such as Hidden Markov Models).  One common issue across many of these works is the explosion in the number of dimensions.  To illustrate this, how many commands can you type from your command line? My OS X machine has about 2000 commands.  Now add Linux, Windows and all the uncommon or custom commands.  This can easily grow to the order of tens of thousands of commands (or dimensions)! 

In addition to dimensionality challenges, data representation further contributes to the complexity of the data environment. There are many ways to represent a bunch of command sequences. The simplest is to keep them as strings. Strings can work for some algorithms, but can lack efficiency and generalization. For example, assumptions of Gaussian distributions don’t really work for strings. In addition, plugging strings into complex models requiring mathematical operators like matrix multiplies (i.e., neural nets) is not going to work. Often, people use one-hot encoding in order to use more complicated models with nominal data, but this still suffers from the curse of dimensionality as the number of unique names increases. In addition, one-hot encoding treats each unique categorical value as completely independent from other values. This, of course, is not an accurate assumption when classifying command lines.

Fortunately, dimensionality reduction algorithms can counteract the growing number of dimensions caused by one-hot encoding. Principal Component Analysis (PCA) is one of the most common data reduction techniques, but one-hot encoding doesn’t follow Gaussian distributions (for which PCA would optimally reduce the data). Another technique is binary encoding. This technique is generic, making it easy to use, but can suffer in performance as it doesn’t take domain-specific knowledge into account. Of course, binary encoding is typically used for compression, but it actually works fairly well in encoding categorical variables when each bit is treated as a feature.
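
To make the binary encoding baseline concrete, here is a minimal sketch of encoding a categorical command vocabulary so that each bit of a command's integer index becomes a feature. The tiny vocabulary is a hypothetical stand-in for the thousands of real commands.

  import numpy as np

  # Hypothetical vocabulary; in practice this grows to thousands of commands.
  vocab = {"ls": 0, "cd": 1, "cat": 2, "vi": 3, "emacs": 4}
  n_bits = int(np.ceil(np.log2(len(vocab))))  # bits needed to cover the vocabulary

  def binary_encode(command):
      """Encode a command's integer index as a vector of bits (one feature per bit)."""
      idx = vocab[command]
      return np.array([(idx >> b) & 1 for b in range(n_bits)], dtype=float)

  print(binary_encode("cat"))  # 3 features instead of 5 one-hot dimensions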

So how can we reduce the number of dimensions while utilizing domain knowledge to squeeze the best performance out of our classifiers?  One answer, that I present here, is Latent Semantic Analysis or LSA (also known as Latent Semantic Indexing or LSI). LSA is a technique that takes in a bunch of documents (many thousands or more) and assigns "topics" to each document through singular value decomposition (SVD). LSA is a mature technique that is heavily used (meaning lots of open source!) in many other domains.  To generate the topics, I use man pages and other documentation for each command.  
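
As a rough illustration of this step, the following sketch builds LSA topic vectors for commands from their documentation using scikit-learn. The handful of man page snippets is hypothetical, and the toy model uses two topics where the real corpus of roughly 3100 commands would use something like 200.

  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.decomposition import TruncatedSVD

  # Hypothetical corpus: command name -> cleaned man page text
  # (stemmed, stop words removed, as described later in this post).
  man_pages = {
      "ls": "list directory contents sort entries alphabetically",
      "cp": "copy files and directories to a destination",
      "mv": "move rename files and directories",
      "rm": "remove unlink files or directories",
  }

  commands = list(man_pages)
  tfidf = TfidfVectorizer()
  doc_term = tfidf.fit_transform([man_pages[c] for c in commands])

  # LSA: truncated SVD over the TF-IDF matrix yields one topic vector per command.
  lsa = TruncatedSVD(n_components=2, random_state=0)   # ~200 topics on the real corpus
  command_topics = lsa.fit_transform(doc_term)          # shape: (n_commands, n_topics)
  topic_of = dict(zip(commands, command_topics))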

The assumption (or hypothesis) is that we can represent commands as a distribution of some limited and overlapping set of topics that proxy user intent, and can be used for detecting anomalous behavior. For an overlapping example, mv (or move) can be mimicked using a cp (copy) and an rm (delete). Or, from our previous example, emacs and vi do basically the same thing and probably overlap quite a bit.

 

LSA on Command Lines

To test the hypothesis, I need to evaluate how well LSA organizes commands into topics using the text from man pages.  I use around 3100 commands (and their respective man pages) to train my LSA model.  Next, I take the top 50 most used commands and show how well they cluster with other commands using cosine similarity.  I could visualize even more commands, but the intent is to show a coherent and understandable clustering of commands (so you don't have to run man a hundred times to understand the graphic). Similarly, only edges with weights greater than .8 are kept for visualization purposes (where cosine similarity is bounded in [0,1] with 1 as the most similar).  
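
The clustering visualization can be reproduced along these lines, reusing the topic_of mapping from the sketch above; the 0.8 cutoff mirrors the one used for the graphic, and the resulting edge list can be handed to any graph tool for plotting.

  from itertools import combinations
  from sklearn.metrics.pairwise import cosine_similarity

  def similarity_edges(topic_of, threshold=0.8):
      """Return (command_a, command_b, similarity) edges above the cutoff."""
      names = list(topic_of)
      sims = cosine_similarity([topic_of[n] for n in names])  # pairwise similarities
      return [(names[i], names[j], sims[i, j])
              for i, j in combinations(range(len(names)), 2)
              if sims[i, j] > threshold]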

 

[Figure: cosine similarity graph of the top 50 commands, clustered by LSA topic vectors]

 

If you look closely you can see clusters of like commands. This was done totally unsupervised.  No domain experts. That's pretty cool!

That’s a great first step, but how can we use this to classify command lines?  The idea is to average intent over small windows of commands (such as three, ten, or fifty commands) and to use this average as a feature vector.  For example, if the user types cd, ls, cat, we find the LSA representation of each command from its corresponding man page.  Assuming we model commands with 200 topics, we take each of the three 200-point feature vectors and do a simple mean to get one 200-point feature vector for those three commands.  I tried a few other simple ways of combining feature vectors, such as concatenating, but found the mean works best.  Of course, there are probably better, more advanced techniques, but this is left to future work.  We can generate a large training and testing set by applying a sliding window over a user’s command sequence. For fun, I use the one-class SVM from sklearn and employ data from the command line histories of eleven colleagues.  I create a total of eleven models, each trained on one respective user.  These are one-class models, so no positive (i.e., anomalous) examples are in any of the training data.  I run ten folds using this setup and average the results. For each fold, I train on 50% of the data and keep 50% of all commands from each user for testing.  I admit this setup is not completely representative of a real-world deployment, as the number of anomalous command sequences far outweighs the number of normal ones. I also do the most basic preprocessing, such as stemming and removal of stop words using NLTK and stop_words (both can be installed through pip), on the man pages before running LSA to create topics. 
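
A condensed sketch of the windowing and per-user modeling described above is shown here, assuming the topic_of lookup built earlier; the window size and one-class SVM parameters are illustrative rather than the exact settings used in the experiment.

  import numpy as np
  from sklearn.svm import OneClassSVM

  def window_features(command_seq, topic_of, window=10):
      """Average the LSA vectors in each sliding window of a user's history
      into a single feature vector."""
      vecs = [topic_of[c] for c in command_seq if c in topic_of]  # skip unknown commands
      return np.array([np.mean(vecs[i:i + window], axis=0)
                       for i in range(len(vecs) - window + 1)])

  def train_user_model(train_commands, topic_of, window=10):
      X = window_features(train_commands, topic_of, window)
      return OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X)

  # model.predict(test_windows) returns -1 for windows flagged as anomalous.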

For a baseline, I run the same experiment using one-hot, binary, and PCA encoded feature vectors for each command. I take the mean of these feature vectors over windows as I did before.

I run the experiment on windows of three, ten, and fifty and display the corresponding receiver operating characteristic (ROC). The ROC curve describes how well the eleven user models identified the held out commands. One caveat is that not all commands are represented in the man pages.  For simplicity and reproducibility, I currently ignore those commands and leave that to future work. 
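
For reference, a per-user ROC curve of the kind plotted below can be computed roughly as follows, assuming held-out window features from the modeled user (label 0) and from the other users (label 1); the one-class SVM's decision function is negated so that higher scores mean more anomalous.

  import numpy as np
  from sklearn.metrics import roc_curve, auc

  def user_roc(model, normal_windows, anomalous_windows):
      """ROC for one user's one-class model over held-out window features."""
      X = np.vstack([normal_windows, anomalous_windows])
      y = np.concatenate([np.zeros(len(normal_windows)),    # 0 = same user
                          np.ones(len(anomalous_windows))])  # 1 = other users
      scores = -model.decision_function(X)  # higher = more anomalous
      fpr, tpr, _ = roc_curve(y, scores)
      return fpr, tpr, auc(fpr, tpr)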

[Figure: ROC curves for a window size of three]

The first image is not so great.  Here we are displaying the ROC for a window size of 3.  Except for PCA, everything is about the same. LSA is marginally better than one-hot and binary encoding. However, with such a small window size, you’re best off using PCA.

[Figure: ROC curves for a window size of ten]

As we increase the window size, the results get a little more interesting.  One-hot and LSA encoding get the largest boost in performance, while PCA degrades.  As I stated earlier, PCA is a bad choice for reducing categorical variables, so this drop-off is not overly surprising. The other interesting point is that larger windows make a better classifier.  This is also not very surprising as the users in this study are very similar in their usage patterns.  Larger windows incorporate more context allowing for a more informative feature vector.

[Figure: ROC curves for a window size of fifty]

The results get even better for LSA with a window size of fifty.  Of course, we could enrich our features with command line arguments and probably get even better results, but we are already doing really well with just the commands.

 

Final Thoughts

LSA works very well in clustering command lines, serving as a useful proxy for user intent and, more importantly, detecting anomalous behavior. This was completely unsupervised, making the model an easy fit for a real-world deployment where labels often don’t exist. One assumption of this post is that the training data is not polluted (i.e., does not contain command line sequences from other users). Also, this data comes from the command lines of software developers and researchers who are very similar in their usage patterns.  This means a command line pattern may be common across several users, leading to false negatives in this experimental setup.  Hence, we may see much better results when we feed a malicious user’s command lines to a normal user’s model.  In addition, we could possibly create a more robust model by using the command histories of multiple normal users (instead of building a model from a single user). I will leave the answers to these questions to another post!

 

 

Glimmers of Hope: Why All is Not Lost for Silicon Valley and DC


By most accounts, the dispute between Apple and the FBI over the San Bernardino attacker’s mobile phone has escalated tension between the tech community and the federal government. In one of the best recent examples of misinformation and misunderstanding of the nuances, this case is now viewed as indicative of the insurmountable divide between Silicon Valley and the tech community on one side, and Washington, DC and the federal government on the other. There are certainly enormous challenges to overcome – culturally, technically, and organizationally. However, despite these doomsday scenarios of an intractable bicoastal feud, it’s important to keep in mind that there also have been, and continue to be, a growing number of olive branches between the two groups. These clearly are not as sensational and generally fail to make it through a 24-hour news cycle, but these collaborative efforts are fundamental to maintaining both our innovative spirit and national security. In fact, the growing national security and privacy challenges cannot be resolved without the input and collaboration of the leading minds from both coasts.

 

This fact was made increasingly apparent during last week’s Cyber Analytic Exercise at the UC-Berkeley Center for Long-Term Cybersecurity, sponsored by the RAND Corporation and the Hewlett Foundation. With an intentionally diverse group of attendees from deep inside Silicon Valley as well as from the government, think tanks, and academia, the attendees were divided into groups to discuss two different hypothetical scenarios in the not-so-distant future: one pertaining to the Internet of Things (IoT), and one involving a series of breaches that leads to a loss of faith in online banking and data protection. The objective was for each group to identify solutions that take into account core values such as economic vitality, innovation, privacy, and security. It immediately became apparent that almost every solution had a government component to it and required collaboration with the tech industry. Even when thinking about market forces driving innovation, the groups concluded that objectives would be better achieved when the tech community and federal government work together. It simply is an imperative going forward for both national security and the preservation of privacy and innovation.

 

Despite the dominant headlines, and very real challenges, there are real-world indications that both sides realize the mutual value of working together. Here are just a few recent examples:

  • Valley visits: East meets West. President Obama broke ground last year when holding the White House Summit on Cybersecurity at Stanford. While many in the Valley responded with skepticism, it was the first real outreach by a sitting president. Similarly, Secretary of Defense Ashton Carter was the first defense secretary in two decades to visit Silicon Valley, eliciting greater collaboration and support from the community to handle the range and sophistication of national security threats.
  • Inside the Beltway: West also meets East.  With former Googler Megan Smith as the U.S. CTO, it’s a clear signal on both sides that an embedded position – not just sporadic visits – is required to truly enact change within the government. Smith replaced another Silicon Valley figure, Todd Park, with a successful transition indicating this is not merely a passing fad. Similarly, DJ Patil is the first US Chief Data Scientist. With roots in Silicon Valley, including LinkedIn and ebay, Patil continues the trend of bringing Valley expertise to the federal government. The recent announcement of the creation of a Federal CISO provides yet another opportunity to bring a Silicon Valley mindset into the federal government.
  • Outside the Beltway: There has been a quiet emergence of government entities starting to pop up in the greater Bay Area. From the Defense Innovation Unit Experimental to DHS’ Silicon Valley office, there is growing acknowledgement of the need for a federal presence within the Valley whose mission is to reach out to the community, while also bringing in new approaches to the respective organizations.
  • Cyber commission: President Obama’s recent Cybersecurity National Action Plan formalized the creation of a bicoastal commission tasked with identifying new cybersecurity recommendations and solutions. In addition to many other information-sharing groups, this is the latest attempt to provide opportunities for the two communities to collaborate, find common ground, and identify mutually beneficial recommendations.
  • Wassenaar Arrangement: Last year’s recommended imposition of export controls on intrusion software ignited a strong debate about the research and security implications of such a ban. Yet again, many saw this as indicative of the bicoastal divide. However, the verdict is still out. The government requested and expeditiously received broad input from the security community. Given the breadth of input from the tech community, the government is delaying any final decisions until it receives another round of public comments. This is hopefully a good sign and indicator that the government is taking into account the hurdles inherent in the agreement.

 

Returning to last week’s Berkeley event, thought leaders from the various communities intentionally came to collaborate and network with one another. While likely appearing naively optimistic in light of the high profile FBI and Apple case, there simply seems to be a growing number of people and opportunities available to bring the national security and tech communities together. As the media continues to portray a divide as deep as the Mariana Trench, there is simultaneously a grassroots movement underway and real desire for collaboration between the communities. It is increasingly apparent that the complexities of both the national security and technological landscapes require input, collaboration, and thought leadership from both Silicon Valley and DC.    It is essential to continue fostering and promoting these grassroots efforts in order to maintain our innovative, competitive global edge, as well as the security and privacy that make it possible.

 
