
Endpoint Malware Detection for the Hunt: Real-world Considerations


In the first blog post of this series, we discussed considerations for measuring and understanding the performance of machine learning models in information security.  In the second post, we compared machine learning models at a fairly coarse level on a malware detection task, noting several considerations of performance for each model, including accuracy (as measured by AUC), training time, test time, model size, and more.  In this post, we'll depart slightly from the more academic discussion of our previous blog posts and discuss real-world implementation considerations. Specifically, we’ll build upon a related white paper and address operationalizing a malware classifier on an endpoint in the context of a hunt paradigm.

 

A Word about the Hunt

First, let's establish some context around malware detection in a hunt framework.  We define hunting as the proactive, methodical, and stealthy pursuit and elimination of never-before-seen adversaries in one's own environment.  Threat hunters establish a process to locate and understand a sentient adversary prior to eviction, without prematurely alerting the adversary to the hunter's presence.  A thorough understanding of the extent of an adversary's access is necessary for complete removal and future prevention of the adversary.  Prematurely alerting an adversary that he's being pursued can prompt him to destroy evidence of his exploitation, accelerate his timeframe for causing damage and stealing data, or cause him to burrow deeper and more carefully into systems than he otherwise might.  Therefore, the hunter uses stealthy tools to survey the enterprise, secures assets to prevent the adversary from moving laterally within networks, detects the adversary's TTPs, and surgically responds to adversary tactics without disruption to day-to-day operations, for example, by terminating a single injected thread used by the adversary. 

In this context, malware detection is part of a multi-stage detection framework that focuses particularly on discovering tools used by the adversary (e.g., backdoors used for persistence and C2).  Unlike a passive detection framework, malware detection in the hunt framework must meet particularly rigid standards for stealth, including a low memory and CPU footprint, while still providing high detection rates with low false positive rates.  A low memory and CPU footprint allows an agent to hide amongst normal threads and processes, making it difficult for an adversary to detect or attempt to disable monitoring and protective measures.  For this purpose, we focus specifically on developing a robust, lightweight model that resides in memory on an endpoint to support hunting as one part of a larger detection framework.  The task of this lightweight classifier is to determine the maliciousness of files that are:

  1. Newly created or recently modified;
  2. Started automatically at boot or other system or user event;
  3. Executed (pre-execution);
  4. Backing running processes;
  5. Deemed suspicious by other automated hunting mechanisms; or
  6. Specifically queried by the hunt team.

 

Lightweight Model Selection

Similar to results presented in the coarse model comparison experiment in our second post, and following additional, more detailed experiments, gradient boosted decision trees (GBDTs) offer a compelling set of metrics for lightweight malware detection for this hunt paradigm, including:

  1. Small model size;
  2. Extremely fast query time; and
  3. Competitive performance as measured by AUC, allowing model thresholds that result in low false positive and false negative rates.

However, to improve the performance of GBDTs and tune them for real-world deployment, one must do much more than train a model using off-the-shelf code on an ideally-labeled dataset.  We discuss several of these elements below.
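
To make this concrete, here is a minimal sketch of training a GBDT and choosing an operating threshold with scikit-learn. The synthetic feature matrix, feature count, and target false positive rate are placeholders; a real model would be trained on features extracted from a large, curated corpus of binaries.

```python
# Minimal GBDT training sketch; X and y are synthetic stand-ins for real
# malware/benign features and labels.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 40))                      # placeholder feature matrix
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)        # placeholder benign(0)/malicious(1) labels

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

gbdt = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
gbdt.fit(X_tr, y_tr)

scores = gbdt.predict_proba(X_val)[:, 1]
print("validation AUC:", roc_auc_score(y_val, scores))

# Choose a score threshold that keeps the validation false positive rate near 0.1%.
benign_scores = np.sort(scores[y_val == 0])
threshold = benign_scores[int(0.999 * len(benign_scores))]
print("threshold for ~0.1% FPR:", threshold)
```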

 

From Data Collection to Model Deployment

The development of a lightweight malware classifier requires a large amount of diverse training data.  As discussed in the first blog post of this series, these data come from a variety of both public and private sources.  Complicating our initial description of data collection is that: (1) data must often be relabeled based on preliminary labels, and (2) most of these data are unlabeled: there is no definitive malicious/benign or family indicator for the collected samples.

 
Data Labels: It Just Seemed Too Easy, Didn't It?

Even when data come with labels, they may not be the labels one needs for a malware classifier.  For example, each label might consist of a family name (e.g., Virut), a malware category (e.g., trojan), or, in a more unstructured setting, a free-form description of functionality from an incident response report.  From these initial labels, one must produce a benign or malicious tag.  When curating labels, consider the following questions:

  • While a backdoor is malicious, what about legitimate network clients, servers and remote administration tools that may share similar capabilities?  
  • Should advanced system reporting and manipulation tools, like Windows Sysinternals, that might be used for malicious purposes be considered malicious or benign?
  • Are nuisance families like adware included in the training dataset? 

These questions speak to the larger issue of how to properly define and label "grayware" categories of binaries. Analyzing and understanding grayware categories to build a consensus are paramount for constructing an effective training dataset. 

 

Unlabeled Data: The Iceberg Beneath the Surface

Unlabeled data may be under-appreciated in the security data science field, but are often the most interesting data.  For example, among the unlabeled data may be a small fraction of bleeding-edge malware strains that have yet to be detected in the industry.  Although it is not always straightforward, these unlabeled samples can still be leveraged using so-called semi-supervised machine learning methods.  Semi-supervised learning can be thought of as a generalization of supervised learning, in which both labeled and unlabeled samples are available for training models.  Most models, like many considered in our second post, do not natively support the use of unlabeled samples, but with care, they can be modified to take them into account.  We explain two such methods here.

First, semi-supervised methods exist that work in cooperation with a human analyst to judiciously select "important" samples for hand-labeling, after which traditional supervised learning models can be used with the augmented labeled dataset.  This so-called active learning framework is designed to reduce the burden on the human analyst while enhancing the model's performance. Instead of inspecting and hand-labeling all of the unlabeled samples, a machine learning model guides the human to the small fraction of samples whose labels would most benefit the classifier.  For example, samples may be selected to maximize how many labels each query reveals: by asking the human to label a single sample, the labels for an entire tight cluster of samples can be inferred.  There are several similar, sometimes competing and sometimes complementary selection objectives:

  • Which unlabeled samples is the model most uncertain about?
  • If labeled correctly, which sample will maximize information gain in my model?  
  • Which unlabeled samples could represent new attack trends?

Malware samples selected by active learning can address one or more of these objectives while respecting the labeling bandwidth of human analysts.
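
As a concrete illustration of the first objective above, the sketch below performs simple uncertainty sampling: it ranks unlabeled samples by how close the model's score is to 0.5 and hands only the top candidates to an analyst. The trained model, the unlabeled feature matrix, and the analyst-labeling step are assumed from context or hypothetical.

```python
# Uncertainty sampling sketch; `model` is any classifier with predict_proba.
import numpy as np

def select_for_labeling(model, X_unlabeled, budget=20):
    """Return indices of the `budget` unlabeled samples scored closest to 0.5."""
    scores = model.predict_proba(X_unlabeled)[:, 1]
    return np.argsort(np.abs(scores - 0.5))[:budget]

# queried = select_for_labeling(gbdt, X_unlabeled)
# y_new = ask_analyst(X_unlabeled[queried])            # hypothetical human-in-the-loop step
# gbdt.fit(np.vstack([X_tr, X_unlabeled[queried]]),    # retrain on the augmented labeled set
#          np.concatenate([y_tr, y_new]))
```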

A second category of semi-supervised methods leverages unlabeled data without human intervention.  One approach in this category involves hardening decision trees (and, by extension, GBDTs), which are known to be overly sensitive in regions of the feature space where there are few labeled samples.  The objective is to produce a GBDT model that is regularized towards uncertainty (producing a score closer to 0.5 than to 0.0 or 1.0) in regions of the feature space where there are many unlabeled samples but few or no labeled samples.  Especially in the hunt paradigm, a model should have a very low false positive rate: this locally-feature-dependent regularization can save a hunter from alert fatigue, which quickly destroys the utility of any alerting system.

Other semi-supervised methods that do not require human intervention include label spreading and label propagation, which infer labels from nearby samples (neighbors of a labeled sample should carry the same label), and self-training, in which the model predicts labels for unlabeled samples and its most confident predictions are added to the labeled training set for re-training.
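
A minimal sketch of these two no-human-in-the-loop approaches, using scikit-learn's built-in semi-supervised tools, is shown below. The placeholder features and the convention of marking unlabeled samples with -1 are illustrative.

```python
# Label spreading and self-training sketch; unlabeled samples carry the label -1.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.semi_supervised import LabelSpreading, SelfTrainingClassifier

X = np.random.RandomState(0).normal(size=(1000, 20))   # placeholder features
y = np.full(1000, -1)                                   # mostly unlabeled...
y[:100] = (X[:100, 0] > 0).astype(int)                  # ...with a small labeled set

# Label spreading: propagate labels to nearby unlabeled points.
spread = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y)

# Self-training: iteratively add the classifier's most confident predictions.
self_trained = SelfTrainingClassifier(
    GradientBoostingClassifier(n_estimators=100), threshold=0.95
).fit(X, y)
```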

 

Automation and Deployment

For an enterprise-relevant data science solution, a wholly automated process is required for acquiring samples (malicious, benign and unlabeled), generating labels for these samples, automatically extracting features, partitioning the features into training and validation sets (for feature and model selection), then updating or re-training a model with new data. This may seem like a mundane point, but data lifecycle management and model versioning and management don’t enjoy the standard processes and maturity that are now common within software version management.  For example, consider four independent elements of a data science solution that could change the functionality and performance of an endpoint model: 1) the dataset used to train the model; 2) the feature definitions and code to describe the data; 3) the model trained on those features; and 4) the scaffolding that integrates the data science model with the rest of the system.  How does one track versioning when new samples are added to or labels are changed in the dataset?  When new descriptive features are added?  When a model is retrained? When encapsulating middleware is updated?  Introducing the engineering processes into a machine learning solution narrows the chasm between an interesting one-off prototype and a bona fide production machine learning malware detection system. Once a model is trained and its performance on the holdout validation set is well characterized, the model is then automatically pushed to a customer.
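
One lightweight way to impose that discipline is to emit a manifest with every trained model that pins all four moving parts. The sketch below is illustrative only; the file names, version strings, and metric are hypothetical.

```python
# Model/dataset versioning sketch: record what a deployed model was built from.
import datetime
import hashlib
import json

def sha256_of_file(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

manifest = {
    "trained_at": datetime.datetime.utcnow().isoformat() + "Z",
    "dataset_hash": sha256_of_file("training_set.parquet"),   # 1) training dataset
    "feature_version": "pe-features-2.3.1",                   # 2) feature definitions and code
    "model_hash": sha256_of_file("malware_gbdt.model"),       # 3) trained model artifact
    "scaffold_version": "endpoint-agent-1.8.0",               # 4) integration scaffolding
    "holdout_auc": 0.99,                                       # placeholder validation metric
}
with open("model_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```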

But the job doesn’t stop there. In what constitutes quality assurance for data science, performance metrics of the deployed model are continuously gathered and checked against pre-deployment metrics. Following deployment, the following questions must be answered in order to monitor the status of a deployed model:

  • Is there an unusual spike in the number of detections or have the detections gone quiet?
  • Are there categories or families the model is no longer correctly classifying?
  • For a sampling of files submitted to the model, can we discover the true label and compare them against the model’s prediction? 

The answers to these questions are particularly important in information security, since malware samples are generated by a dynamic adversary. In effect, the thing we're trying to detect is a moving target: the malware (and benign!) samples we want to predict continue to evolve away from the samples we trained on. Whether one acknowledges and addresses this drift head on is another factor that separates naive from sophisticated offerings. Clever use of unlabeled data, and strategies that proactively probe machine learning models against possible adversarial drift, can be the difference between rapidly discovering a new campaign against your enterprise and being "pwned".
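
As one example of this kind of monitoring, the sketch below flags days on which the fleet-wide detection rate drifts well outside what the pre-deployment baseline would predict; the binomial approximation and the thresholds are illustrative, not a prescribed QA procedure.

```python
# Drift alarm sketch: compare daily detection rates against a validation-time baseline.
import math

def detection_rate_alarm(daily_detections, daily_scans, baseline_rate, tolerance=3.0):
    """Flag days whose detection rate deviates from the baseline by > `tolerance` sigma."""
    alarms = []
    for day, (hits, total) in enumerate(zip(daily_detections, daily_scans)):
        rate = hits / total
        sigma = math.sqrt(baseline_rate * (1 - baseline_rate) / total)
        if abs(rate - baseline_rate) > tolerance * sigma:
            alarms.append((day, rate))
    return alarms

# A sudden spike (or an unusually quiet day) stands out against the baseline:
print(detection_rate_alarm([12, 9, 250], [100000, 98000, 101000], baseline_rate=1e-4))
```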

 

Endgame MalwareScore™

Optimizing a malware classification solution for the hunt use case produces a lightweight endpoint model trained on millions of benign, malicious, and unlabeled samples.  The Endgame model allows for a stealthy presence on the endpoint by maintaining a minuscule memory footprint without requiring external connectivity. Paired with a sub-100-millisecond query time, the model represents the ideal blend of speed and sophistication necessary for successful hunt operations.  The endpoint model produces the Endgame MalwareScore™, where scores approaching 100 inform the analyst that the file in question should be considered malicious.  The analyst can also easily tune the malicious threshold to better suit the needs of the current hunt operation, for example reducing the threshold during an active incident to surface more suspicious files to the hunter. The Endgame MalwareScore™ is an integrated element of detecting an adversary during hunt operations: every executable file-backed item is enriched with the score so that potentially bad artifacts, such as persistence mechanisms and processes, are highlighted to guide the hunter as effectively as possible.

 

That's a Wrap: Machine Learning's Place in Security

After reading this series of three blogs, we hope that you are able to see through the surface-level buzz and hype that is too prevalent in machine learning applied to cyber security.  You should be better equipped to know and do the following:

  • Understand that, like any security solution, machine learning models are susceptible to both false positives and false negatives.  Hence, they are best used in concert with a broader defensive or proactive hunting framework.
  • Ask the right questions to understand a model's performance and its implications on your enterprise.
  • Compare machine learning models by the many facets and considerations of importance (FPR, TPR, model size, query time, training time, etc.), and choose one that best fits your application.
  • Identify key considerations for hunting on the endpoint, including stealthiness (low memory and CPU footprint), model accuracy, and a model's interoperability with other facets of a detection pipeline for use by a hunt team.
  • Understand that real-world datasets and deployment conditions are more "crunchy" than sterile.  Dataset curation, model management, and model deployment considerations have major implications in continuous protection against evolving threats.

Endgame’s use of machine learning for malware detection is a critical component of automating the hunt. Sophisticated adversaries lurking in enterprise networks constantly evolve their TTPs to remain undetected and subvert the hunter. Endgame’s hunt solution automates the discovery of malicious binaries in a covert fashion, in line with the stealth capabilities developed for elite US Department of Defense cyber protection teams and high end commercial hunt teams. We’ve detailed only one layer of Endgame’s tiered threat detection strategy: the endpoint. Complementary models exist on the on-premises hunt platform and in the cloud that can provide additional information about threats, including malware, to the hunter as part of the Endgame hunt platform.

Endgame is productizing the latest research in machine learning and practices in data science to revolutionize information security. Although beyond the scope of this blog series, machine learning models are applicable to other stages of the hunt cycle—survey, secure, and respond. We’ve described the machine learning aspect of malware hunting, specifically the ability to identify persistence and never-before-seen malware during the “detect” phase of the hunt. Given the breadth of challenges in the threat and data environment, automated malware classification can greatly enhance an organization’s ability to detect malicious behavior within enterprise networks. 


Capturing Zero Day Exploits with Automated Prevention Technology


As we discussed in an earlier post, most defenses focus on the post-exploitation stage of the attack, by which point it is too late and the attacker will always maintain the advantage. Instead of focusing on the post-exploitation stage, we leverage the enforcement of coarse-grained Control Flow Integrity (CFI) to enhance detection at the exploitation stage. Existing implementations of CFI require recompilation or extensive software updates, or incur a significant performance penalty, making them difficult to adopt and use in the enterprise. At Black Hat USA 2016, we presented a hardware-assisted technique that has proven successful at blocking exploits while minimizing the impact on performance, ensuring operational utility at scale. We call this concept Hardware-Assisted Control Flow Integrity, or HA-CFI. It utilizes hardware features available in Intel processors to monitor and prevent exploitation in real time, with manageable overhead. By leveraging hardware features, we can detect exploits before they reach the post-exploitation stage and provide stronger protections while defense still has the upper hand.

 

Prior Art and Operational Constraints

Our work builds on previous research that identified the Performance Monitoring Unit (PMU) of microprocessors as a good candidate for enforcing control-flow integrity. The PMU is a specialized unit in most microprocessor architectures that provides useful performance measuring facilities for developers. Most features of the unit are intended to count hardware level events during program execution to aid in program optimization and debugging.

In their paper, Yuan et al. [YUAN11] introduced the novel application of these events to exploit detection for software security. Their research focused on using PMU events along with Branch Trace Store (BTS) messages to correlate and detect code-injection and code-reuse attacks without source code. Xia et al. explored the idea further in their paper on CFIMon [XIA12], combining precise event context gathering with the BTS and PEBS to enforce real-time control-flow integrity. In addition to these foundational papers, others have pursued variations on the idea to specifically target exploit techniques such as Return-Oriented Programming (ROP).

Alternatively, just-in-time CFI solutions have been proposed using dynamic instrumentation frameworks such as PIN [PIN12] or DynamoRIO [DYN16]. These frameworks dynamically interpret code as it executes while providing instrumentation functionality to developers. Applying control flow policies with a framework like PIN allows for flexible and reliable checking of code. However, it often incurs a significant CPU overhead, in the area of 10 to 100x, making it unusable in the enterprise.

Our research into dynamic run-time CFI was guided by constraints that we feel make the approach relevant to enterprise security while still providing significant detection and prevention assurances. We established several functional requirements, including operation on both 32- and 64-bit operating systems and deployment without software recompilation or access to source code.

 

Approach

HA-CFI uses PMU-based traps to apply coarse-grained CFI to indirect calls on the x86 architecture. The system uses the PMU to count and trap mispredicted indirect branches in order to validate branch destinations in real time. In addition to a carefully tuned PMU, a practical implementation of this approach requires support from Intel's Last Branch Record (LBR) feature, a method for tracking thread context switches in a given OS, and an algorithm for validating branch destination addresses, all while keeping performance overhead to a minimum. After more than a year of fine-tuning these hardware features, we have shown that our model is capable of generically detecting control-flow hijacks in real time with acceptable performance overhead on both Windows and Linux. Because control-flow hijack attacks often stem from a corrupted or modified VTable, many CFI designs focus on validating all indirect branches. Because these call sites have never before jumped to the attacker-controlled address, the hijacked indirect call is almost always mispredicted by the branch prediction unit. Therefore, by focusing only on mispredicted indirect call sites, we greatly limit the number of places where a CFI check is necessary.

HA-CFI configures the Intel PMU on each core to count and generate an interrupt on every mispredicted indirect branch. The PMU can deliver an interrupt any time an event counter overflows, so HA-CFI sets the initial counter value to -1 and resets the counter to -1 from the interrupt service routine in order to generate a trap for every occurrence of the event. In this way, the HA-CFI interrupt service routine becomes our CFI component, capable of validating each mispredicted call and determining whether it is the result of malicious behavior. To validate target indirect branch addresses, HA-CFI builds a comprehensive whitelist of valid code pointer addresses as each .dll/.so is loaded into a protected process. When a counter overflows, the interrupt service routine (ISR) compares the mispredicted branch target to the whitelist and determines whether the branch is anomalous.

 

Figure 1: High level design of HA-CFI using the PMU to validate mispredicted branches


 

To ensure we minimized the overhead of HA-CFI while maintaining an extremely low false-positive rate, several key design decisions had to be made, and are described below.

The Indirect Branch: On the Intel x86 architecture, an indirect branch can occur at either a CALL or a JMP instruction. We focus exclusively on the CALL instruction for several reasons, including the frequent use of indirect JMP branch locations for switch statements. In our experimentation on Linux, we found that roughly 12% of hijackable indirect branches occurred as part of an indirect JMP, and the fraction was even lower on Windows. Secondly, ignoring mispredicted JMP instructions further reduces the overhead of HA-CFI. We therefore opted to omit mispredicted JMP branches during this research, which can be achieved with settings on the PMU and LBR.

 

Figure 2: A breakdown of hijackable indirect JMP vs CALL instructions found in Windows and Linux x64 binaries

 

Added Precision with the LBR: Given our requirement for real-time detection and prevention of control-flow hijacks, unlike the majority of previous research, we couldn’t use the Intel Branch Trace Store (BTS), which does not permit analysis of the trace data in real-time. Instead, to precisely resolve the exact branch that caused the PMU to generate an interrupt, we make use of Intel’s Last Branch Record (LBR) stack.  A powerful feature of the LBR is the ability to filter the types of branches that are recorded. For example, returns, indirect calls, indirect jumps, and conditional branches can all be included or excluded. With this in mind, we can configure the LBR to only record indirect call branches occurring in user mode. Additionally, the most significant bit of the LBR branch FROM address indicates whether the branch was actually mispredicted. As a result, this provides a quick filter for the ISR to ignore the branch if it was predicted correctly. It’s important to note that we are not iterating over the entire LBR stack, only the most recently inserted branch.

On-Demand PMU-Assisted CFI: HA-CFI is focused on protecting commonly exploited applications such as browsers, mail clients, and Flash. As such, the PMU and LBR are both configured to only operate on mispredicted indirect calls occurring in user mode, ignoring those that occur in ring-0. Moreover, by monitoring thread context switches in both Windows and Linux, we can turn the entire PMU on and off depending upon which applications are being protected. This design decision is perhaps the most critical element in keeping our performance overhead at an acceptable level.

Runtime Whitelist Generation: The final component of the HA-CFI system is the actual integrity check, which involves querying a whitelist data structure containing valid destination addresses for indirect calls. Whitelist generation is performed at run-time for each image loaded into a protected process. We generated the whitelist such that every legitimate branch in our dataset could be verified with a hashtable lookup, leaving zero unknown captured branches.
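
At its core, the integrity check is a set-membership test against that whitelist. The sketch below illustrates the build-and-check logic in Python purely for exposition; the real implementation lives in a kernel-mode driver and its interrupt service routine, and the module base address and offsets shown are hypothetical.

```python
# Conceptual whitelist build-and-check sketch (illustration only; not kernel code).
valid_call_targets = set()

def on_image_loaded(image_base, code_pointer_offsets):
    """As each .dll/.so loads into a protected process, record every address that an
    indirect CALL may legitimately target (e.g., exports and vtable entries)."""
    for off in code_pointer_offsets:
        valid_call_targets.add(image_base + off)

def on_mispredicted_indirect_call(branch_to_address):
    """Invoked for every PMU trap on a mispredicted indirect CALL."""
    if branch_to_address not in valid_call_targets:
        raise RuntimeError(f"control-flow hijack suspected: 0x{branch_to_address:x}")

# Example: a module at a hypothetical base exposes two legitimate call targets.
on_image_loaded(0x7FF600000000, [0x1040, 0x22F0])
on_mispredicted_indirect_call(0x7FF600001040)      # in the whitelist: allowed
# on_mispredicted_indirect_call(0x7FF6DEADBEEF)    # not in the whitelist: flagged
```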

 

Implementation Challenges 


Throughout the course of our research, we encountered numerous hurdles to meeting our original goal of low overhead and high detection rates. First, registering for PMU interrupts on Windows was a major challenge. Our initial prototype was developed under Linux, and transferring the same techniques to Windows proved problematic, especially with regard to Kernel Patch Protection. After significant research, we discovered an undocumented option in the Windows Hardware Abstraction Layer (HAL) that registers a driver-supplied interrupt handler for PMU interrupts. Second, our Windows implementation needed a way to restrict PMU monitoring to only the processes and threads being protected.

The technique we ultimately arrived at makes use of a thread's Asynchronous Procedure Call (APC) mechanism. Windows allows developers to register APC routines for a given thread, which are then added to a queue to be executed at certain points. By keeping an APC registered on every thread that we seek to monitor, we are notified that a thread has resumed execution when our routine runs; the routine re-enables the PMU counter if necessary and updates various tracking metrics. We detect that a processor has swapped out a thread and begun executing another when we receive an interrupt in a different thread context, and we can then disable the PMU counters if needed.

 

Results

To evaluate our system, we measured success both in terms of performance overhead added by HA-CFI as well as detection statistics when tested against various exploits in common client applications, including the most common web browsers, as well as Microsoft Office and Adobe Flash. We sourced exploits from Metasploit modules for testing, as well as numerous live samples from popular Exploit Kits found in the wild.

Performance: After completing our prototype, we were concerned about the overhead of monitoring with HA-CFI and its impact on system performance and usability. Since each mispredicted branch in a monitored process causes an interrupt, there was the potential for a very high number of interrupts to be generated. We subjected our prototype implementations to several tests to measure overhead. Monitoring Internet Explorer 11 while running a JavaScript performance test suite, the driver handled approximately 83,000 interrupts per second on average. In contrast, monitoring an "idle" IE resulted in roughly 1,000 interrupts per second. Our performance analysis revealed that overhead is highly dependent upon the process being protected. For example, with Firefox we saw around 10% overhead while running Dromaeo [DRO16] JavaScript benchmarks, and with the PassMark benchmarking tool we saw 8-10% overhead. With Internet Explorer under heavy usage this number was above 10%. Under normal user behavior, however, the overhead is significantly lower: we have deployed HA-CFI on systems in daily use monitoring web browsing and observed little impact on performance or usability.

Exploit Detection: We extensively tested HA-CFI against a variety of exploits to determine its efficacy against as many bug classes and exploitation techniques as possible, with an emphasis on recent samples using approaches intended to bypass other mitigation measures. We ran one set of tests against more than 15 Metasploit exploits targeting Adobe Flash Player, Internet Explorer, and Microsoft Word. HA-CFI detected and prevented exploitation for each of the tested modules, with an overall detection rate greater than 98%. 

We found the Metasploit results to be encouraging, but came to the conclusion that they did not provide sufficient diversity in exploitation techniques needed to comprehensively test HA-CFI. We used the VirusTotal service to download a set of samples used in real-world exploit kit campaigns from several widespread kits [KAF16]. In total, we tested forty-eight samples comprising twenty unique CVE vulnerabilities. We analyzed the samples to verify that they employed a varied set of both Return-Oriented Programming (ROP) and “ROPless” techniques. HA-CFI succeeded in detecting all 48 samples, with an overall detection rate of 96% in a multiple trial consistency test.

 

Results of VirusTotal Sample Testing, by Exploitation Technique

 

Results of VirusTotal Sample Testing, by Bug Class

 

Conclusion

Modern exploitation techniques are rapidly changing, requiring a new approach to exploit detection. We demonstrated such an approach by using the Performance Monitoring Unit to enforce control flow integrity on mispredicted branches. A whitelist generated at run-time determines the validity of each indirect call target, and calls to addresses outside the whitelist are classified as malicious. This approach greatly reduces the overhead of instrumentation by moving policy enforcement to a coarse-grained verifier applied only to mispredicted indirect branch targets. The data also show the efficacy of such a system on samples captured in the wild. These samples, drawn from popular exploit kits, allowed us to measure the system against previously unknown threats, further validating its applicability. As exploits advance, exploit detection must advance too. Our hardware-assisted CFI (HA-CFI) system has a low performance impact and measurable prevention success against 0day exploits and previously unknown exploitation techniques. Using HA-CFI we have advanced the state of the art, moving detection from the post-exploitation stage to the exploitation stage, giving enterprise-scale security software an upper hand in earlier detection. To learn more about pre-exploit detection and mitigation, we'll be discussing our approach during a webinar on August 25th, at 1 pm ET.

 

References

[YUAN11] L. Yuan, W. Xing, H. Chen, and B. Zang, "Security Breaches as PMU Deviation: Detecting and Identifying Security Attacks Using Performance Counters," APSys '11, July 11-12, 2011.

[XIA12] Y. Xia, Y. Liu, H. Chen, and B. Zang, "CFIMon: Detecting Violation of Control Flow Integrity Using Performance Counters," in Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 1-12, IEEE Computer Society, 2012.

[PIN12] PIN: A Dynamic Binary Instrumentation Tool. https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool

[DYN16] DynamoRIO: Dynamic Instrumentation Tool Platform. http://www.dynamorio.org/

[DRO16] Dromaeo JavaScript Benchmark. http://www.dromaeo.com

[EME16] The Enhanced Mitigation Experience Toolkit. https://support.microsoft.com/en-us/kb/2458544

[KAF16] Kafeine. Exploit Kit Samples. http://malware.dontneedcoffee.com/

Instegogram: Leveraging Instagram for C2 via Image Steganography


Social media sites are frequently used for stealthy malware command and control (C2). Because many hosts on most networks communicate with popular social media sites regularly, it is very easy for a C2 channel hiding in this traffic to appear normal. Further, social media sites often expose rich APIs, allowing a malware author to easily and flexibly use the services for malicious purposes. Blocking HTTP and HTTPS connections to these sites is generally infeasible, since it would likely cause a revolt amongst the workforce. Security researchers have discovered multiple malware campaigns that have used social media for C2 capabilities. For example, Twitter has been used to direct downloaders to websites for installing other malware, or for controlling botnets. Further, the information posted to a social media site may be obfuscated or Base64 encoded, so even if the malicious social media content is discovered, it would require an astute defender to recognize its intent.

Separately, a few malware strains (e.g., ZeusVM banking trojan) have used image steganography to very effectively disguise information required to operate malware. The stego image represents an effective decoy, with information subtly encoded within something that is seemingly innocuous. This can make malicious intent difficult to uncover. In the case of ZeusVM, an encoded image contained a list of financial institutions that the malware was targeting. However, even a close look at the image would not reveal the presence of a payload. The image payloads were discovered only because the stego images were retrieved on a server containing other malicious files.

We combine these two methods for hiding in plain sight and demonstrate "Instegogram", wherein we hide C2 messages in digital images posted to the social media site Instagram. We presented this research earlier this month at Defcon’s Crypto Village, as part of our larger research efforts that leverage our knowledge of offensive techniques to build more robust defenses. Since our research aims to help inform and strengthen defenses, we conclude with a discussion of some simple approaches for preventing steganographic C2 channels on social media sites, such as bit jamming methods and analysis of user account behaviors.

 

A Brief History of Steganography

Steganography is the art of hiding information in plain sight and has been used by spies for centuries to conceal information within other text or data. Recently, digital image steganography has been used to obfuscate configuration information for malware. The following timeline contains a brief history of steganography used in actual malware case studies:

 

JPEG-Robust Image Steganography Techniques & Instagram

Digital image steganography involves altering bits within an image to conceal a message payload.  Let’s say Alice encodes a message in some of the bits of a cover image and sends the stego image (with message payload) to Bob, who decodes the message using a private key that he shares with Alice. If Eve intercepts the image in transit, she is oblivious to the fact that the stego image contains any message at all since the image appears to be totally legitimate both digitally and to the human eye.  

A simple steganography approach demonstrates how this is possible.  Alice wants to encode the bits 0100 into an image whose first 4 pixels are 0x81, 0x80, 0x7f, 0x7e.  Alice and Bob agree (private key) that the message will be encoded in row-major order in the image using the least significant bit (LSB) to determine the message bit.  Since the LSBs of the first 4 pixels are 1010, Alice must flip the LSBs of the first three pixels so that the LSBs are equal to the desired message bits.  Since modifying the LSBs of a few pixels changes the pixel intensity by 1/255, the stego image appears identical to the original cover image.
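
The sketch below reproduces that example in code, flipping LSBs so the four cover pixels carry the message bits 0100.

```python
# LSB embedding sketch using the pixel values from the example above.
def lsb_encode(pixels, bits):
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def lsb_decode(pixels, n_bits):
    return [p & 1 for p in pixels[:n_bits]]

cover = [0x81, 0x80, 0x7F, 0x7E]            # LSBs are 1, 0, 1, 0
stego = lsb_encode(cover, [0, 1, 0, 0])
print([hex(p) for p in stego])               # ['0x80', '0x81', '0x7e', '0x7e']
print(lsb_decode(stego, 4))                  # [0, 1, 0, 0]
```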

 

Additionally, since Instagram re-encodes uploaded images in JPEG/JFIF form, the steganography encoding for Instegogram must be robust against JPEG compression artifacts.

JPEG images are compressed by quantizing the coefficients of a 2D block discrete cosine transform (DCT).  The DCT is applied after a color transformation from RGB to YCbCr color space, which consists of a luminance (Y) channel and two chrominance channels, blue (Cb) and red (Cr).  The DCT on each channel has the effect of condensing most of an image block's information into a few upper-left (low frequency) coefficients in the DCT domain.  The lossy compression in JPEG comes from applying an element-wise quantization to each coefficient in the DCT coefficient image, with the amount of quantization per coefficient determined by a quantization table specified in the JPEG file.  The resulting quantized coefficients (integers) are then compactly encoded to disk.

 

One method of encoding a message in a JPEG image is to encode the message bits in the quantized DCT coefficients rather than the raw image pixels, since there are no subsequent lossy steps.  But for use with Instagram, an additional step is required. Upon upload, Instagram standardizes images by resizing and re-encoding them using the JPEG file format.  This presents two pitfalls in which the message can be clobbered in the quantized DCT coefficients: (1) if Alice's stego image is resized, the raw DCT coefficients can change, since the 8x8 block in the original image may map to a different rectangle in the resized image; (2) if Alice's stego image is recompressed using a different quantization table, this so-called double-compression can change the LSBs that may contain the secret message.
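
The sketch below illustrates the DCT-domain variant on a single 8x8 block: level-shift, transform, quantize, then hide a bit in a quantized coefficient's LSB. The quantization table and coefficient position are illustrative; a real encoder would also avoid zero-valued coefficients and handle the full image and entropy coding.

```python
# DCT-domain embedding sketch for one 8x8 luminance block.
import numpy as np
from scipy.fft import dctn

def embed_bit(block, qtable, bit, coeff=(2, 1)):
    dct = dctn(block.astype(float) - 128, norm="ortho")   # level shift, then 2D DCT
    q = np.round(dct / qtable).astype(int)                 # the lossy quantization step
    q[coeff] = (q[coeff] & ~1) | bit                        # hide the bit in the coefficient's LSB
    return q                                                # entropy coding would follow here

def extract_bit(q, coeff=(2, 1)):
    return int(q[coeff]) & 1

qtable = np.full((8, 8), 16)                                # illustrative quantization table
block = np.random.default_rng(1).integers(0, 256, (8, 8))   # one 8x8 block of pixels
assert extract_bit(embed_bit(block, qtable, 1)) == 1
```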

To prevent the effects of resizing, all cover images can be resized in advance to a size and aspect ratio that Instagram will accept without resizing.  To prevent the double-compression problem, the quantization table can be extracted from an existing Instagram image; over the course of this research, monitoring these tables suggested that the same quantization table is used across all Instagram images.  Messages are then encoded in images that use that same quantization table.

A number of other standard practices can be used to provide additional robustness against image manipulations that might occur upon upload to Instagram.  First, pre-encoding the message payload using error correcting coding allows one to retrieve the message even when some bits become corrupted.  For small messages, the encoded message can be repeated in the image bits until all image bits have been used.  A message storing format that includes a header to specify the message length allows the receiver to determine when the message ends and a duplicate copy begins.  Finally, for a measure of secrecy, simple methods for generating a permutation of pixel locations (for example, a simple linear congruential generator with shared seeds between Alice and Bob) can communicate to Bob the order in which the message bits are arranged in the image.
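
A minimal sketch of the last two ideas, a length header plus repetition and an LCG-derived bit ordering shared between sender and receiver, is shown below. The LCG constants and header width are illustrative; the receiver regenerates the same order, reads the header to learn the message length, and majority-votes the repeated copies of each bit.

```python
# Payload framing sketch: length header, repetition, and a shared LCG bit order.
def lcg_order(n_slots, seed, a=1103515245, c=12345, m=2**31):
    """Shared pseudorandom visiting order over the image's available bit slots."""
    order, seen, x = [], set(), seed
    while len(order) < n_slots:
        x = (a * x + c) % m
        slot = x % n_slots
        if slot not in seen:
            seen.add(slot)
            order.append(slot)
    return order

def encode_payload(message_bits, n_slots, seed):
    """Prefix a 16-bit length header, then repeat the payload until every slot is used."""
    header = [int(b) for b in format(len(message_bits), "016b")]
    payload = header + list(message_bits)
    slots = [0] * n_slots
    for i, slot in enumerate(lcg_order(n_slots, seed)):
        slots[slot] = payload[i % len(payload)]
    return slots

bits_for_image = encode_payload([0, 1, 0, 0], n_slots=4096, seed=20160804)
```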

 

Instegogram Overview

Digital image steganography can conceal any message.  Stego appeals to malware authors for C2 because of its inherent stealthiness, and hosting C2 on social media sites appeals for the same reason: the traffic is hard to filter out or flag as malicious.  Instegogram combines these two capabilities, digital image steganography and social media C2, mirroring the use of social networks for C2 that has grown for years, while exploring the feasibility of using stego on a particular site, Instagram.

The delivery mechanism for our proof of concept (POC) malware was chosen based on one of today's most commonly used infiltration methods: a successful spearphish that causes the user to open a document and run a malicious macro. The remote access trojan (RAT) is configured to communicate with specific Instagram accounts that we control, to which we post images containing messages encoded with our steganographic scheme. The malware includes a steganographic decoder that extracts a payload from each downloaded image, allowing arbitrary command execution on the remote system.

The malicious app continuously checks the Instagram account feed for the next command image, decodes the shell command, and executes it to trigger whatever nefarious behavior is requested in the command.  Results are embedded steganographically in another image which is posted to the same account.  
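
Conceptually, the POC's control loop looks like the sketch below. Every helper named here (fetch_latest_image, post_image, run_command, stego_decode, stego_encode) is hypothetical and stands in for the Instagram access and steganography routines described elsewhere in this post.

```python
# Conceptual sketch of the Instegogram polling loop; all helpers are hypothetical.
import time

def c2_loop(account, poll_seconds=60):
    last_seen = None
    while True:
        image, image_id = fetch_latest_image(account)   # pull the newest post for the account
        if image_id != last_seen:
            last_seen = image_id
            command = stego_decode(image)                # extract the embedded command
            result = run_command(command)                # execute it and capture the output
            post_image(account, stego_encode(result))    # embed the result and post it back
        time.sleep(poll_seconds)
```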

As with any steganographic scheme, there is a limited number of characters which can be sent through the channel. In this simple POC, 40 characters could be reliably transmitted in the JPEG stego images.  The capacity can be increased using more robust coding techniques as discussed previously.

In short, once the remote system is compromised, encoded images can be posted from the command machine using Instagram’s API. The remote system will download the image, decode it, execute the encoded commands, encode the results in another image, and post back to Instagram. This process can be repeated at will.  This attack flow is depicted in the graphic below.

 

Our Instegogram POC was built on Mac OS X, specifically as an uncertified macOS app developed in Objective-C.  To execute our POC, we needed to bypass Apple's built-in Gatekeeper protection, which enforces code signing requirements on downloaded applications and thereby makes it more difficult for adversaries to launch a malicious application on an endpoint. We discovered a Gatekeeper bypass, disclosed it to Apple, and are currently working with Apple on a fix.

 

Instagram API and Challenges

Instagram's API is only partially public. They encourage third-party apps to use likes, subscriptions, requests for images, and similar actions. But, the specifics of calls for uploads are only provided via iPhone hooks and Android Intents. The webapp is a pared down version of the mobile app and doesn't allow uploads. This is likely to deter spam, bots, and maybe malware C2 architectures.

To work as an effective C2 system, we needed to integrate with Instagram's API to automate uploads, downloads, and comments. Creating an image with an embedded message, transferring it to a cell phone, and then uploading it via the official Instagram app is too cumbersome to be useful, so we needed to reverse-engineer the upload API.

With the help of some open source efforts and a little web-proxy work, we identified the required fields and formats for all the necessary API calls. Charles Proxy was used to sniff out the payloads of API requests coming from the phone app itself. After the general structure was determined, we used a fake Android user-agent, crafted the body of the request, and were in business. The code demonstrating this is in the "api_access" section of Endgame’s github repo.

It is worth noting that the steganographic encode/decode capabilities and capability to programmatically interact with the Instagram API are not specific to a malware use case.  They are tools that can be used for other purposes such as secure hidden messaging.  

 

Detecting & Preventing Instegogram

As previously mentioned, this research is motivated by the requirement to strengthen defenses. Therefore, after successfully implementing the malware, we identified the most robust means to detect and prevent Instegogram.  There is a range of measures for detecting and preventing C2 via image steganography as well as additional non-steganography measures that can be implemented.

First, a complementary field of research to steganography is steganalysis, in which a defender aims to:  (a) predict whether an image may contain a steganographic payload, and if detected, (b) attempt to recover the message from the stego image.  However, using statistical techniques to detect whether an image is corrupted may be infeasible for very small message payloads relative to the image size.  Payload recovery is a difficult cryptanalysis problem in its own right, but is of interest only for forensic analysis of a detected C2 channel.  Given the challenges in successfully implementing these steganalysis techniques through per-image or per-user defensive strategies, we don’t recommend a steganalysis approach.

In contrast, a much simpler set of measures can be implemented by the social media site owner via site-wide policies that effectively jam potential stego traffic.  For example, one can turn the stego channel into a noisy channel, with minimal visual distortion, through one or more of the following methods:

  1. Regularly and pseudorandomly change the site-wide JPEG quantization table used for re-encoding images.  This can induce double-compression problems for simple stego routines like our POC that rely on a specific quantization table.  Introducing a quantization table mismatch can reduce the channel capacity for communication, and force sophisticated attackers to resort to more robust and advanced information hiding methods.
  2. Implement other minor, visually imperceptible but digitally altering transformations on the image, such as cropping the image by a few boundary pixels, randomly shifting the image by a few pixels, and/or randomly flipping the LSBs of quantized coefficients (see the sketch after this list).  While this does not qualitatively alter the image, it creates a challenging environment for transmitting information via stego C2.
  3. Institute a policy that requires mandatory application of an Instagram filter, which represents a nonlinear transformation in the image domain.  While this qualitatively changes the image, it represents a “visually appealing” change, whilst also providing an effective attack on possible stego C2.
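
A minimal sketch of the second method, assuming the site already holds the quantized DCT coefficients during re-encoding, is shown below; the flip probability and shift range are illustrative.

```python
# Site-side "jamming" sketch: corrupt naive LSB stego channels with imperceptible changes.
import numpy as np

def jam_coefficients(quantized_coeffs, flip_prob=0.02, rng=None):
    """Randomly flip the LSB of a small fraction of nonzero quantized DCT coefficients."""
    rng = rng or np.random.default_rng()
    coeffs = quantized_coeffs.copy()
    mask = (coeffs != 0) & (rng.random(coeffs.shape) < flip_prob)
    coeffs[mask] ^= 1
    return coeffs

def jitter_crop(image, rng=None):
    """Crop the image by up to two boundary pixels to break pixel/block alignment."""
    rng = rng or np.random.default_rng()
    dy, dx = rng.integers(0, 3, size=2)
    return image[dy:, dx:]
```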

In addition to the stego-focused mitigations, the social media provider can attempt to detect accounts which may be hosting C2.  For example, the provider could look for account access anomalies such as a single account being used in rapid succession from geographically distant locations.  Anomalies may also be visible in account creation or other factors.

Non-steganography based security policies can also be implemented by a local admin in an attempt to defend against this attack:

  1. Limit access to third-party websites: Like any other remote access trojan or backdoor, you want a policy that limits connections to the command and control server. The simplest technical solution would be to block third-party websites entirely if they are not related to the mission. As mentioned earlier, although this is a simple solution, it may be infeasible due to workforce demands.
  2. Outliers in network behavior: Network monitoring can be configured to detect anomalous network behavior, such as responses from multiple infected machines that utilize a single Instagram account for C2.
  3. Android string detection: The Instagram API access we provided utilizes an Android network configuration. If your network consists primarily of Windows endpoints, a user-agent string identifying an Android device would be an obvious anomaly.
  4. Disable VBA Macros: If not needed, Microsoft Windows Office VBA Macros should be disabled to avoid spearphishing campaigns utilizing the infiltration technique adopted by the POC.

 

Conclusion

We accomplished our goal of creating a proof of concept malware that utilizes a C2 architecture for nearly untraceable communications via social media and encoded images. We demonstrated a small message channel with the simplest of steganography algorithms.  By combining digital image steganography with a prominent trend in malware research - social media for C2 - our research reflects the ongoing vulnerabilities of social media, as well as the novel and creative means of exploiting these vulnerabilities against which modern defenses must be hardened.

There are numerous defensive measures that can be taken to detect and prevent malware similar to Instegogram. As noted, applying Instagram filters or identifying preprocessed images based on quantization tables could be a powerful prevention strategy.  Account anomalies can also point at potential issues.  However, this is really only possible on the provider side, not the local defender side.  

This type of attack is difficult to defend against within a given enterprise.  Blocking social media services will be infeasible in many organizations.  Local defenders can look at policy-based hardening and more general anomaly detection to defend against attacks like our proof of concept.  These steps will help defend against an even broader set of malware.  

 

Malware Timeline Sources

 

Influencing Elections in the Digital Age


Throughout history, foreign entities have meddled in the internal affairs of other countries, seeking to influence leadership tenure, reputations, and elections. Whether it's a coup receiving external support, such as last year's attempted coup in Burundi, or major power politics battling it out, such as between the East and West during the Cold War, external actors often attempt to influence domestic elections to achieve a variety of objectives. Yesterday, The Washington Post reported that the US is investigating the possibility of covert Russian operations to influence this fall's presidential elections. Following the DNC hack, the DCCC hack, and last week's news about breaches of Illinois and Arizona's databases, this is just the latest potential indication of Russian tactics aimed at undermining the US elections and democracy.

For years, Russian President Vladimir Putin has exerted domestic control of the Internet, employing his team of Internet trolls to control the domestic narrative. Building upon the domestic success of this tactic, he has also employed it in Ukraine and other former Eastern bloc countries. As targeted sanctions, coupled with low oil prices, continue to strangle the Russian economy, Putin's international behavior predictably reflects a rational-actor response to his domestic situation. He is going on the offensive while attempting to undermine the very foundation of US democracy.  The potential for Russian covert operations reinforces the point that offensive cyber activity does not occur in a stovepipe, separate from other cross-domain geo-political strategies. Instead, it is one part of a concerted, multi-pronged effort to achieve strategic objectives, which may include undermining the foundation of US democracy both domestically and internationally.

 

US Election Digital Hacking: A brief history

At least as far back as 2004, security professionals have warned that elections can be digitally hacked. By some accounts, the US experienced its first election hack in 2012, when Miami-Dade County received several thousand fraudulent phantom absentee ballot requests. Miami-Dade wasn't the only political entity subject to hacking attempts: both the Romney and Obama campaigns experienced attempted breaches by foreign entities seeking to access databases and social media sites. Even earlier, in 2008, the Obama and McCain campaigns suffered the loss of "a serious amount of files" to cyber attacks, with China or Russia the key suspects based on the sophistication of the attacks. Clearly, this has been an ongoing problem with little fix in sight. Many states continue to push forward with electronic voting systems vulnerable to common yet entirely preventable attacks, and many have yet to implement a paper trail.

 

Hacking an Election around the World

The US is not the only country that has or continues to experience election hacking. External actors can pursue multiple strategies to influence an election. In the digital domain, these can roughly be bucketed into the following:

  • Data theft: This is not merely a concern in the United States; it is a global phenomenon and is likely to grow as elections increasingly become digitized. The hack of the Philippines Commission on Elections exposed the personally identifiable information of 55 million citizens. The Ukrainian election system was attacked prior to the 2014 election with a virus intended to delete the election results. More recently, Taiwan has experienced digital attacks (likely from China) aimed at gaining opposition information. Hong Kong is in a similar situation, with digital attacks targeting government agencies in the lead-up to legislative elections. The stolen data often serves espionage purposes, either kept hidden for intelligence gain or disclosed as a large data dump, such as the DNC leak.
  • Censorship: Government entities, largely in authoritarian countries, attempt to control social media prior to elections. As a recent article noted, "Election time in Iran means increased censorship." This is not just true of Iran, but of many countries across the globe. In Africa, countries as diverse as Ghana, Ethiopia, and the Republic of Congo have censored social media during election season. Ugandan president Yoweri Museveni, who has ruled for 30 years, has deployed Internet censorship targeting specific social media sites as a means to influence elections.  Last year, Turkey similarly cracked down on social media sites prior to the June elections, echoing Erdogan's tactics during the 2014 presidential elections. Of course, a government's censorship often coincides with the electorate's circumvention of it, which can be observed in the increased use of anti-censorship tools like Tor or Psiphon.
  • Disinformation: In one of the most intriguing alleged 'election hacks' to date, Bloomberg reported earlier this year the case of a man who may have literally rigged elections in nine Latin American countries. He did not do this solely via data theft, but rather through disinformation. Andrés Sepúlveda would conduct digital espionage to gather information on the opposition while also running smear campaigns and influencing social media, creating fake YouTube videos, Wikipedia entries, and countless fake social media accounts. This is similar to Putin's tactics via his Internet trolls. While the Russian trolls have largely focused domestically or on Eastern Europe, there are indications that they are increasingly posing as supporters within the US election.

 

Adjusting the Risk Calculus

Elections are just one of the many socio-political events that experience heightened malicious activity. However, within democracies, elections are the bedrock of the democratic process, and manipulation of them can undermine both domestic and international legitimacy.  At this weekend’s G-20 summit, President Obama and Putin met to discuss, among other things, acceptable digital behavior. President Obama noted the need to avoid the escalation of an arms race as has occurred in the past, stressing the need to institutionalize global norms.

This is an important point: digital espionage and propaganda, especially when aimed at a core foundation of democracy, do not necessitate digital responses. In yesterday's Washington Post article, Senator Ben Sasse urged Obama to publicly name Russia as the source behind the latest string of attacks on the US presidential elections. He noted, "Free and legitimate elections are non-negotiable. It's clear that Russia thinks the reward outweighs any consequences…. That calculation must be changed. . . . This is going to take a cross-domain response — diplomatic, political and economic — that turns the screws on Putin and his cronies."

A key feature of any deterrence strategy is to adjust the risk calculus. Russia’s risk calculus remains steadfast, with the benefits of disinformation and data theft clearly outweighing the costs. The latest investigation of Russian covert operations further elucidates that the cyber domain must not be viewed solely through a cyber lens, as is too often the case. Instead, nefarious digital activity is part of the broader strategic context, requiring creative, cross-domain solutions that impact an adversary’s risk calculus, while minimizing escalation. 

How Domain Expertise And AI Can Conquer The Next Generation Of Cyber Threats


No one, not even Google CEO Sundar Pichai, is immune to being hacked. And this problem isn’t going away. Cybercrime figures are increasing each year, with a reported 22% rise in breaches already in 2016. While organizations spent $75 billion on security products and services in 2015, there were still 2,000 breaches (75-90% of which were in large enterprises), with the median dwell time of 146 days before detection.

The security industry has not kept pace with cybercriminals' innovation or creativity; it’s been left playing checkers while the enemy plays chess. Cybercriminals are not only becoming more sophisticated and numerous, but today’s IT infrastructure is more vulnerable than ever thanks to the growing number of entry points into an organization.

After 20 years in the security industry with organizations like the NSA, Mandiant and FireEye, and now as the CTO of cybersecurity firm Endgame, I've seen a lot of changes within the field. However, I know that all hope is not lost when it comes to strengthening enterprise security. For instance, in the past few years, the scientific community has empowered companies to leverage artificial intelligence (AI) for everything from fraud detection to self-driving cars. I've seen through my own company's work that utilizing AI in the cybersecurity industry can help platforms block and expel threats from a network. But AI cannot do it alone; it must be critically informed by domain expertise, or it could become just another buzzword.

How AI Could Modernize Security

AI should be seen as a multiplier, not a silver bullet. The hype around AI and data science is warranted, but AI isn’t a homogenous black box. There are many machine learning approaches and different models that fall under the AI umbrella. Like all robust data science, the model and approach must be suited to each case’s unique data and operational constraints. Nevertheless, machine learning can expedite data wrangling -- a process that becomes tougher as the velocity, volume, variety and veracity (the four Vs) of data continue to expand -- and it can also help organizations make intelligent decisions about the data.

My firm recently conducted research to demonstrate how AI can improve security processes in a project called Machine Learning Red Teams. "Red Teaming" usually entails a team of people manually simulating an attack on a system to test its defense, identifying vulnerabilities, and then patching any weaknesses they’ve found. Think of it as two computer robots facing off in a series of rounds. The Red Team model generates malicious samples to bypass the Blue Team detector, which learns how to identify these attacks. Through this process, the Red Team’s technical capabilities against the defense improve. Meanwhile, the Blue Team becomes hardened against blind spot attacks simulated by the generator. Using machine learning, we can simulate the testing portion of the process -- pitting offense against defense -- and attempt to deliver faster results in greater volume to recursively improve defenses.
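To make the loop concrete, here is a toy sketch of that generator-versus-detector cycle in Python. It is purely illustrative -- the random feature vectors, perturbation strategy, and scikit-learn model stand in for whatever real malware features and models a production system would use, and it is not Endgame's implementation.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy stand-ins for benign and malicious feature vectors (hypothetical data).
X_benign = rng.normal(0.0, 1.0, size=(500, 10))
X_malicious = rng.normal(2.0, 1.0, size=(500, 10))
X = np.vstack([X_benign, X_malicious])
y = np.array([0] * 500 + [1] * 500)

blue_team = GradientBoostingClassifier().fit(X, y)

for round_number in range(3):
    # Red Team: perturb malicious samples and keep the ones the detector misses.
    candidates = X_malicious + rng.normal(0.0, 0.5, size=X_malicious.shape)
    evasive = candidates[blue_team.predict(candidates) == 0]
    if len(evasive) == 0:
        break
    # Blue Team: retrain with the evasive samples added as labeled malicious examples.
    X = np.vstack([X, evasive])
    y = np.concatenate([y, np.ones(len(evasive), dtype=int)])
    blue_team = GradientBoostingClassifier().fit(X, y)
    print(f"Round {round_number}: hardened against {len(evasive)} evasive samples")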

While Machine Learning Red Teaming involves proprietary research, it demonstrates the power of AI to take common, known activities and supercharge them. Notably, DARPA is getting in on the action with their Cyber Grand Challenge, investing in automated systems that can work alongside human security practitioners. We are just scratching the surface of how AI can transform the security world, but it’s incumbent upon all of us to push ourselves in that direction.

Looking Towards Data Science Plus Domain Expertise

Given the complexity of data in the security industry, damage to or loss of critical assets can occur if models are applied thoughtlessly. This is where domain expertise comes into play. Domain experts go beyond security practitioners and should include vulnerability researchers, hackers, and those familiar with implementing advanced techniques that stop the most sophisticated cyberattacks. Together, domain experts and data scientists can shape the model’s assumptions, spotting anomalies and patterns in the data. While an automated system can analyze and forecast events on its own, humans are needed to interpret the results, provide the best courses of action, and further shape the AI model.

Finding domain experts can be tricky. However, we’re starting to see companies post job descriptions for them. At my firm, we used our roots within the public and private sectors to assemble a group of over 20 domain experts, but it wasn't easy to build this kind of team. For others looking to do so, I recommend recruiting those leaving government service looking for commercial opportunities. Federal labs, the intelligence community, and federal law enforcement are great places to find trained domain experts who have faced the full range of cybercriminals, including sophisticated nation state adversaries.

In short, AI still requires domain experts to work alongside the data scientists who are building the models to ensure they are relevant and operationally useful in order to truly innovate the industry.

A Change In Mindset Creates A Change in Culture

For AI to become the revolutionary tool it has the potential to be, enterprises should adopt a new way of thinking that emphasizes a scientific approach, not only among practitioners but across the entire enterprise all the way up to the C-suite. Enterprises that understand machine learning is not a silver bullet by itself, and that invest intelligently in the expertise and processes around it, will see improvements across their entire business.

In the security industry, thanks to the automation of many time-consuming and repetitive tasks, companies can take a more proactive approach to security. Instead of simply responding to alerts, security personnel could proactively hunt malicious activity within their networks.

Technology and cybercrime are evolving quickly. I believe it's time to integrate a diversity of expertise and bring the security industry up to speed. AI bolstered by domain expertise could put the security industry at the forefront of innovation. These machine learning-backed defenses will continue to evolve, expanding the shelf-life of security investments while enabling organizations to have greater defensive capabilities and a more proactive approach to cybersecurity.

This piece was originally featured on Forbes.com

How to Hunt: Detecting Persistence & Evasion with the COM


After adversaries breach a system, they usually consider how they will maintain uninterrupted access through events such as system restarts. This uninterrupted access can be achieved through persistence methods. Adversaries are constantly rotating and innovating persistence techniques, enabling them to evade detection and maintain access for extended periods of time. A prime example is the recent DNC hack, where it was reported that the attackers leveraged very obscure persistence techniques for some time while they evaded detection and exfiltrated sensitive data.

The number of ways to persist code on a Windows system can be counted in the hundreds and the list is growing. The discovery of novel approaches to persist is not uncommon. Further, the mere presence of code in a persistence location is by no means an indicator of malicious behavior as there are an abundance of items, usually over a thousand, set to autostart under various conditions on a standard Windows system. This can make it particularly challenging for defenders to distinguish between legitimate and malicious activity.

With so many opportunities for adversaries to blend in, how should organizations approach detection of adversary persistence techniques? To address this question, Endgame is working with The MITRE Corporation, a not-for-profit R&D organization, to demonstrate how the hunting paradigm fits within the MITRE ATT&CK™ framework. The ATT&CK™ framework—which stands for Adversarial Tactics, Techniques & Common Knowledge—is a model for describing the actions an adversary can take while operating within an enterprise network, categorizing actions into tactics, such as persistence, and techniques to achieve those tactics. Endgame has collaborated with MITRE to help extend the ATT&CK™ framework by adding a new technique – COM Object Hijacking – to the persistence tactic, sparking some great conversations and insights that we’ve pulled together into this post. Thanks to MITRE for working with Endgame and others in the community to help update the model, and a special thanks to Blake Strom for co-authoring this piece. Now let the hunt for persistence begin!

 

Hunting for Attacker Techniques

Hunting is not just the latest buzzword in security. It is a very effective process for detection as well as a state of mind. Defenders must assume breach and hunt within the environment continually as though an active intrusion is underway. Indicators of compromise (IOC) are not enough when adversaries can change tool indicators often. Defenders must hunt for never-before-seen artifacts by looking for commonly used adversary techniques and patterns. Given constantly changing infrastructure and the increasingly customized nature of attacks, hunting for attacker techniques greatly increases the likelihood of catching today’s sophisticated adversaries.

Persistence is one such tactic for which we can effectively hunt. Defenders understand that adversaries will try to persist and generally know the most common ways this can be done. Hunting in persistence locations for anomalies and outliers is a great way to find the adversary, but it isn’t always easy. Many techniques adversaries use resemble ways software legitimately behaves on a system. Adversary persistence behavior in a Windows environment could show up as installing seemingly benign software to run upon system boot, when a user logs into a system, or even more clever techniques such as utilizing Windows Management Instrumentation (WMI). Smart adversaries know what is most common and will try to find poorly understood and obscure ways to persist during an intrusion in order to evade detection.

MITRE has provided the community with a cheat sheet of persistence mechanisms through ATT&CK™, which describes the universe of adversary techniques to help inform comprehensive coverage during hunt operations. It includes a wide variety of techniques ranging from simply using legitimate credentials to more advanced techniques like component firmware modification approaches. The goal of an advanced adversary is not just to persist - it is to persist without detection by evading common defensive mechanisms as well. These common evasion techniques are also covered by ATT&CK™. MITRE documented these techniques using in-depth knowledge about how adversaries can and do operate, like with COM hijacking for persistence.

To demonstrate the value of hunting for specific techniques, we focus on Component Object Model (COM) Hijacking, which can be used for persistence as well as defense evasion.

 

So What’s Up with the COM?

Microsoft’s Component Object Model (COM) has been around forever – well not exactly – but at least since 1993 with MS Windows 3.1. COM basically allows for the linking of software components. This is a great way for engineers to make components of their software accessible to other applications. The classic use case for COM is how Microsoft Office products link together. To learn more, Microsoft’s official documentation provides a great, comprehensive overview of COM.

Like many other capabilities attackers use, COM is not inherently malicious. However, there are ways it can be used by the adversary which are malicious. As we discussed earlier, most adversaries want to persist. Therefore, hunters should regularly look for signs of persistence, such as anomalous files which are set to execute automatically. Adversaries can cleverly manipulate the COM to execute their code, specifically by manipulating software classes in the current user registry hive, and enabling persistence.

But before we dive into the registry, let’s have a quick history lesson. Messing with the COM is not an unknown technique by any means. Even as early as 2005, adversaries were utilizing Internet Explorer to access the machine’s COM to cause crashes and other issues. Check out CVE-2005-1990 or some of CERT’s vulnerability notes discussing exactly this problem.

COM object hijacking first became mainstream in 2011 at the Virus Bulletin conference, when Jon Larimer presented “The Dangers of Per-User COM Objects.” Hijacking is a fairly common term in the infosec community, and describes the action of maliciously taking over an otherwise legitimate function at the target host: session hijacking, browser hijacking, and search-order hijacking to name a few. It didn’t take long for adversaries to start leveraging the research presented at Virus Bulletin 2011. For example, in 2012, the ZeroAccess rootkit started hijacking the COM, while in 2014 GDATA reported a new Remote Administration Tool (RAT) dubbed COMpfun which persists via a COM hijack. The following year, GDATA again reported the use of COM hijacking, with COMRAT seen persisting via a COM hijack. The Roaming Tiger Advanced Persistent Threat (APT) group reportedly also used COM hijacking with the BBSRAT malware. These are just a few examples to demonstrate that COM hijacking is a real concern which hunters need to consider and handle while looking for active intrusions in the network.

 

The Challenges and Opportunities to Detect COM Hijacking

Today, COM hijacking remains relevant, but is often forgotten. We see it employed by persistent threats as well as included in crimeware. Fortunately, we have one advantage - the hijack is fairly straightforward to detect. To perform the hijack, the adversary relies on the operating system to load current user objects prior to the local machine objects in the COM. This is the fundamental principle behind the hijack, and also the key to detecting it.

Easy, right? Well, there are some gotchas to watch out for.  Most existing tools detect COM hijacking through signatures. A COM object is identified in the system by a globally unique identifier called a CLSID.  A signature-based approach will only look at and alert on specific CLSIDs which reference an object that has been previously reported as hijacked. This is nowhere near enough because, in theory, any COM object and any CLSID could be hijacked.

Second, for us hunters, the presence of a user COM object in general can be considered anomalous, but some third-party applications will generate such objects causing false positives in your hunt. To accurately find COM hijacks, a more in-depth inspection within the entire current user and local machine registry hive is necessary. In our single default Windows 7 VM, we had 4697 CLSIDs within the local machine hive. To perform the inspection, you will need to dust off your scripting skills and perform a comparative analysis within the registry. This could become difficult and may not scale if you are querying thousands of enterprise systems, which is why we baked this inspection into the Endgame platform.
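As a starting point, the comparison can be sketched in a few lines of Python using the standard winreg module. This is a minimal illustration of the idea rather than Endgame's implementation: it simply enumerates per-user CLSIDs and notes which ones shadow an object already registered machine-wide.

import winreg

def enum_clsids(hive):
    # Collect every CLSID registered under Software\Classes\CLSID for the given hive.
    clsids = set()
    try:
        with winreg.OpenKey(hive, r"Software\Classes\CLSID") as key:
            index = 0
            while True:
                try:
                    clsids.add(winreg.EnumKey(key, index))
                    index += 1
                except OSError:
                    break
    except OSError:
        pass
    return clsids

user_clsids = enum_clsids(winreg.HKEY_CURRENT_USER)
machine_clsids = enum_clsids(winreg.HKEY_LOCAL_MACHINE)

# Per-user COM registrations are rare enough to warrant a look; ones that shadow a
# machine-wide object are classic hijack candidates.
for clsid in sorted(user_clsids):
    note = "shadows an HKLM object" if clsid in machine_clsids else "user-only object"
    print(f"HKCU CLSID {clsid}: {note}")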

Image: file folders of registry objects ("So many objects to hijack...")

At Endgame, we inspect the registry to hunt exactly for these artifacts across all objects within the registry and this investigation scales across an entire environment. This is critical because hunters need to perform their operations in a timely and efficient manner. Please reference the following video to see a simple COM hijack and automatic detection with the Endgame platform. Endgame enumerates all known persistence locations across a network, enriches the data, and performs a variety of analytics to highlight potentially malicious artifacts in seconds.  COM hijacking detection is one capability of many in the Endgame platform.

 

Hunting for COM Hijacking using Endgame

 

Conclusion

Persistence is a tactic used by a wide range of adversaries.  It is part of almost every compromise. The choice of persistence technique used by an adversary can be the most interesting and sophisticated aspect of an attack. This makes persistence, coupled with the usual defense evasion techniques, prime focus areas for hunting and subsequent discovery and remediation. Furthermore, we can’t always rely on indicators of compromise alone. Instead, defenders must seek out anomalies within the environment, either at the host or in the network, which can reveal the breadcrumbs to follow and find the breach.

Without a framework and intelligent automation, the hunt can be time-consuming, resource-intensive, and unfocused. MITRE’s ATT&CK™ framework provides an abundance of techniques that can guide the hunt in a structured way. With this as a starting point, we have explored one persistence technique in depth: COM hijacking. COM hijacks can be detected without signatures through intelligent automation and false positive mitigation, overcoming many of the challenges an analyst would face when hunting for COM hijacks manually. This is just one way in which a technique-focused hunt mindset can allow defenders to detect, prevent, and remediate those adversaries that continue to evade even the most advanced defenses.

Hunting for Exploit Kits


E-mail spam and browser exploitation are two very popular avenues used by criminals to compromise computers.  Most compromises result from human error, such as clicking a malicious link or downloading and opening an attachment within an email and enabling macros.  Email filtering can offer effective protections against the delivery of widespread malicious spam, and user training can be reasonably effective in reducing the number of employees willing to open an unknown document, enable macros, and self-infect.

Protection against browser exploitation, which can occur simply by visiting a website, is more difficult.  Users are prone to clicking links to malicious sites, and much worse, criminals actively exploit well-trafficked sites (directly through web server exploitation or indirectly via domain squatting, ad injections, or other techniques) and cause the user’s browser to covertly visit an exploit server. The user’s browser gets hit, a malicious payload such as ransomware is installed, and the user has a bad day.

The hardest part of this for attackers is the exploitation code itself.  Fortunately for criminals, a thriving underground market for exploit kits is available.  These occasionally contain zero days, but more often, exploit kit authors rapidly weaponize new vulnerabilities, allowing them to exploit users who fail to patch their systems quickly.

Exploit kits are often key components of crimeware campaigns, part of a global crimeware market estimated at $400 billion. Capturing these often evasive exploit kits is essential to advance research into protecting against them, but samples are hard to obtain for researchers. To solve this problem, we created Maxwell, an automated exploit kit collection and detection tool that crawls the web hunting for exploits. For researchers, Maxwell significantly decreases the time it takes to find exploit kit samples, and instead enables us to focus on the detection and prevention capabilities necessary to counter the growing criminal threat of exploit kits.

 

Exploit Kits in the Wild

The Angler exploit kit - responsible for a variety of malvertising and ransomware compromises - is indicative of just how lucrative these exploit kits can be. By some estimates, Angler was the most lucrative compromise platform for crimeware, reeling in $60 million annually in ransomware alone. Earlier this year, a Russian criminal group was arrested in connection with the Lurk Trojan, which coincided with the end of Angler exploit kit activity. A battle for market share has ensued since, with RIG and Neutrino EK jockeying for the leading position.

A typical business model for exploit kit authors is malware as a service. The authors rent access to their exploit kits to other criminals for several thousand dollars a month, on average.

Other criminal groups instead opt to focus more on traffic distribution services or gates, and set out to compromise as many web servers as possible. Once compromised, they can insert iframe re-directions to websites of their choosing. The users of exploit kits can pay for this as a service to increase the amount of traffic their exploit kits receive and drive up infections.

 

The Exploitation Process

The graphic below depicts a high-level overview of the six core steps of the exploitation process. There are numerous existing case studies on exploit kits, such as one on Nuclear, that provide additional, very low level details on this process.

 


  1. A user visits a legitimate web page.
  2. Their browser is silently redirected to an exploit kit landing page. This stage varies, but focuses on traffic redirection. Either the attacker compromises the website and redirects the user by adding JavaScript or an iframe, or the attacker pays for an advertisement to perform the redirection.
  3. The exploit kit sends a landing page which contains JavaScript to determine which browser, plugins and versions the user is running. Neutrino EK is a little different in that this logic is embedded in a Flash file that is initially sent.
  4. If the user’s configuration matches a vulnerability to a particular exploit (e.g., an outdated Flash version), the user’s browser will be directed to load the exploit.
  5. The exploit’s routines run, and gain code execution on the user’s machine.
  6. The exploit downloads and executes the chosen malware payload. Today, this is usually ransomware, but it can also be banking trojans, click fraud, or other malware.

 

How to Catch an Exploit Kit: Introducing Maxwell

There are numerous motivations for collecting and analyzing exploit kits. As a blue teamer, you may want to test your defenses against the latest and greatest threat in the wild. As a red teamer, you may want to do adversary emulation with one of the big named exploit kits (e.g., Neutrino, RIG, Magnitude). Or maybe you have some other cool research initiative. How would you go about collecting new samples and tracking activity? If you work for a large enterprise or AV company,  it is relatively easy as your fellow employees or customers will provide all the samples you need.  You can simply set up packet collection and some exploit kit detections at your boundary and sit back and watch.  But what if you are a researcher, without access to that treasure trove of data? That’s where Maxwell comes in. Maxwell is an automated system for finding exploit kit activity on the internet. It crawls websites with an army of virtual machines to identify essential information, such as the responsible exploit kit, as well as related IPs and domains.

 

The Maxwell Architecture

Maxwell consists of a central server, which is basically the conductor or brains of the operation, connecting to a vSphere or other cloud architecture to spin up and down virtual machines. RabbitMQ and ElasticSearch provide the means for message queuing and indexing the malicious artifacts. The virtual machine consists of a variety of Python agent scripts to enable iterative development, as well as a pipe server that receives messages from the instrumentation library, filters those that match a whitelist, and forwards the remaining messages to a RabbitMQ server.

Flux is our instrumentation library, a DLL loaded into new processes with a simple AppInit_DLLs registry key. Flux hooks the usual functions for dropping files, creating registry keys, process creation, etc. The hooking is done only in user mode at the Nt function level; the hooks must be at the lowest possible level in order to capture the most data. Flux also has some exploit detection capabilities built in, as well as shellcode capturing, which will be discussed shortly.

Moving outside the virtual machine, the controller is a Python script that listens on a RabbitMQ channel for new jobs, including basic information like the website to visit, a UUID, and basic config information. Once a new job is received, the controller is responsible for spinning up a virtual machine and sending the job information and the plugin to execute (currently only Flux). The controller uses ssh to copy files into the virtual machine. The results server is also a Python script that listens on a different RabbitMQ channel. This script receives data from the virtual machines during execution. The data is forwarded to an Elasticsearch index for permanent storage and querying. Once a job has completed, the results server determines if any malicious activity has occurred. If so, it executes post-processing routines. Finally, all extracted data and signature information is sent in a notification to the researcher. An important design decision worth noting is to stream events out of the virtual machine during execution, as opposed to all at once after a timeout. The latter is susceptible to losing information after ransomware wreaks havoc in the virtual machine.
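The results-server pattern is straightforward to sketch in Python. The snippet below is a minimal illustration, not Maxwell's actual code: the queue name, index name, and message format are assumptions, and it targets pika 1.x and elasticsearch-py 8.x.

import json

import pika
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def on_message(channel, method, properties, body):
    event = json.loads(body)
    # Index each event as it arrives, so nothing is lost if ransomware trashes the VM
    # before the job times out.
    es.index(index="maxwell-events", document=event)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="maxwell-results")
channel.basic_consume(queue="maxwell-results", on_message_callback=on_message, auto_ack=True)
channel.start_consuming()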

When configuring your virtual machine, it’s important to make it an attractive target for attackers, who work by market share and target the most popular software, such as Windows 7, Internet Explorer, and Flash. Be sure to also remove vmtools and any drivers that get dropped by VMware. You can browse to the drivers folder and sort by publisher to find VMware drivers. Finally, you should consider patch levels, and pick plugin versions that are exploitable by all major exploit kits, while also disabling any additional protections, such as IE protected mode.

 

Exploit Detection in Maxwell

As mentioned earlier, Maxwell automates exploit detection. While previously ROP detection was reliable enough, it is no longer effective at detecting modern exploit kits. The same is true for stack pivot, which basically checks ESP to see if it points to the heap instead of the stack, and is easily evaded by Angler and other exploit kits.

In Flux, we throw guard pages not only on the export address table, but also the IAT and MZ header of critical modules. We also use a small whitelist instead of a blacklisting technique, enabling us to catch shellcode that is designed to evade EMET. Even better, we can detect memory disclosure routines that execute before the shellcode. When a guard page violation is hit, we also save the shellcode associated with it for later inspection.

Similar to all sandboxes, Flux can log when files are dropped, new processes are created, registry keys are written to, or even when certain files are accessed. File access can be used to detect JavaScript vm-detection routines before an exploit is even triggered. However, these are all too noisy on their own so we must rely heavily on the whitelist capability. Essentially, every typical file/registry/process action that is observed from browsing benign pages is filtered, leaving only malicious activity caused by the exploit kit. Even if they evade our exploit detection techniques, we will still detect malicious activity of the payload as it is dropped and executed.
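The whitelisting itself can be as simple as matching each event against a set of patterns collected from benign browsing runs. Below is a minimal sketch of the idea; the event format and whitelist entries are hypothetical, not Flux's actual rules.

import fnmatch

# Patterns observed while browsing known-good pages (illustrative examples only).
WHITELIST = [
    "file_write|C:\\Users\\*\\AppData\\Local\\Microsoft\\Windows\\Temporary Internet Files\\*",
    "reg_write|HKCU\\Software\\Microsoft\\Internet Explorer\\*",
    "proc_create|C:\\Program Files\\Internet Explorer\\iexplore.exe",
]

def is_whitelisted(event_type, target):
    key = f"{event_type}|{target}"
    return any(fnmatch.fnmatch(key, pattern) for pattern in WHITELIST)

def filter_events(events):
    # Anything seen during benign browsing is dropped; whatever remains is worth a look.
    return [e for e in events if not is_whitelisted(e["type"], e["target"])]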

If malicious activity is detected, the post-processing step is activated by executing tcpflow on the PCAP to extract all sessions and files. Next, regular expressions are run across the GET/POST requests to identify traffic redirectors (such as EITest), EK landing pages, and payload beacons. Finally, any dropped files, shellcode, and files extracted from the PCAP are scanned with Yara. The shellcode scanning allows for exploit kit tagging based on the specific shellcode routines used in each kit, which make very long-lasting signatures.
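A stripped-down version of that post-processing step might look like the following. It is a sketch under assumptions -- the landing-page regex, rule file, and directory layout are illustrative rather than the signatures Maxwell ships with -- and it requires tcpflow and yara-python to be installed.

import re
import subprocess
from pathlib import Path

import yara

def post_process(pcap_path, out_dir, rules_path):
    # Extract every TCP session and transferred file from the capture.
    subprocess.run(["tcpflow", "-r", pcap_path, "-o", out_dir], check=True)

    landing_page = re.compile(rb"GET /[a-z0-9]{40,}\.php\?", re.I)  # illustrative EK pattern
    rules = yara.compile(filepath=rules_path)

    for artifact in Path(out_dir).iterdir():
        if not artifact.is_file():
            continue
        data = artifact.read_bytes()
        if landing_page.search(data):
            print(f"possible landing page request in {artifact.name}")
        for match in rules.match(data=data):
            print(f"{artifact.name}: Yara hit {match.rule}")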

If you are protecting a network with Snort rules for a component of the EK process, you also need to know when those signatures stop firing in Maxwell, since that usually means the kit has changed and the signatures have gone stale. Building robust signatures limits the need to update them frequently. There are a few tricks for writing robust signatures, such as comparing samples over time to extract commonalities or creating signatures from the Flash exploits themselves. Both of these may result in longer-lasting signatures. You can also take advantage of social media, and compare samples in Maxwell against those posted on Twitter by researchers such as @kafeine, @malware_traffic and @BroadAnalysis.

 

Hunting for Exploit Kits

With the architecture and detection capabilities in place, it’s time to start hunting. But which websites should be explored to find evil stuff? A surprisingly effective technique is to continually cycle through the Alexa top 25,000 or top 100,000 websites. This can be streamlined by browsing five websites at a time instead of one, giving a 5x boost to your processing capability. In less than 24 hours, you can crawl through 25,000 websites with just a handful of virtual machines. The only downside is losing the ability to know exactly which of the five websites was compromised without manually looking through the PCAP. If you have a good traffic anonymizing service, you can just reprocess each of the five websites.
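Feeding the site list to the controller in groups of five is the easy part. Here is a minimal sketch; the input file and the submit_job stub are hypothetical placeholders for however your controller actually queues work.

def submit_job(group):
    # Placeholder: in Maxwell this would publish a job message to the controller's queue.
    print("queueing batch:", ", ".join(group))

def batches(items, size=5):
    for i in range(0, len(items), size):
        yield items[i:i + size]

with open("alexa-top-25k.txt") as fh:  # hypothetical input: one domain per line
    sites = [line.strip() for line in fh if line.strip()]

for group in batches(sites, size=5):
    submit_job(group)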

At DerbyCon 6.0 Recharge, the Maxwell research was presented for the first time and we released the code under an MIT license. You can find it on GitHub. We look forward to comments, contributions, and suggestions for advancements. Maxwell has proven extremely useful in fully automating the detection and analysis of exploit kits and watering holes. Ideally, Maxwell can help both red and blue teamers test an organization’s defenses without requiring extensive resources or significant time. It also greatly simplifies a key pain point for researchers - actually collecting the samples. By hunting for exploit kits with Maxwell, researchers can spend more time analyzing and building defenses against exploit kits, instead of searching for them.

Is Hadoop Ready for Security?



 

 

 In 2008, the number of internet-connected devices surpassed the number of people on the planet and Facebook overtook MySpace as the most popular social network. At the time, few people grasped the impact that these rapidly expanding digital networks would have on both national and cyber security. This was also the year I first used Hadoop, a distributed storage and processing framework.

Since then,  Hadoop has become the core of nearly every major large-scale analytic solution, but it has yet to reach its full potential in security. To address this, last week Cloudera, Endgame and other contributors announced the acceptance of Apache Spot, a cyber analytics framework, into the Apache Software Foundation incubator. At the intersection of Hadoop and security, this new project aims to revolutionize security analytics.

 

“When you invent the ship, you also invent the shipwreck.” – Paul Virilio

 

Back in 2008, the security industry was recovering from the Zeus trojan and unknowingly gearing up for a date with Conficker, a worm that would go on to infect upwards of 9 million devices across 190 countries. Simultaneously, government think tanks warned that web 2.0 social networks would facilitate the radicalization and recruitment of terrorists.

As a computer engineer in the US Intelligence Community, I was lucky enough to work on these large-scale problems. Problems that, at their core, require solutions capable of ingesting, storing, and analyzing massive amounts of data in order to discern good from bad.

Around this time, engineers at search engine companies were the only other teams working on internet-scale data problems. Inspired by Google’s MapReduce and File System papers, Doug Cutting and a team at Yahoo open sourced Apache Hadoop, a framework that made it possible to work with large data sets across inexpensive hardware. Upon its release in 2006, this project began democratizing large-scale data analysis and gained adoption across a variety of industries.

Seeing the promise of Hadoop, the Intelligence Community became an early adopter, as it needed to cost-effectively perform analysis at unprecedented scale.  In fact, they ultimately invested in Cloudera, the first company founded to make Hadoop enterprise ready.

 

Fast Forward to Today: Hadoop for Security

In 2016, forward-leaning security teams across industry and government are increasingly adopting Hadoop to complement their Security Incident and Event Management (SIEM) systems. There are a number of fundamental characteristics that make Hadoop attractive for this application:

1. Scalability: Network devices, users, and security products emit a seemingly infinite flow of data.  Based on its distributed architecture, Hadoop provides a framework capable of dealing with the volume and velocity of this cross-enterprise data.

2. Low Cost-per-Byte: Detection, incident response, and compliance use cases increasingly demand longer data retention windows.  Due to its use of commodity hardware and open source software, Hadoop achieves a scaling cost that is orders of magnitude lower than commercial alternatives.

3. Flexibility: Starting with a single Apache project, the Hadoop family has grown into an ecosystem of thirty plus interrelated projects.  Providing a “zoo” of data storage, retrieval, ingest, processing, and analytic capabilities, the Hadoop family is designed to address various technical requirements from stream processing to low-latency in-memory analytics.

Unfortunately, many Hadoop-based security projects exceed budget and miss deadlines. To kick off a project, engineers have to write thousands of lines of code to ingest, integrate, store, and process disparate security data feeds. Additionally, the numerous ways of storing data (e.g., Accumulo, HBase, Cassandra, Kudu...) and processing it tees up a myriad of design decisions. All of this distracts from the development and refinement of the innovative analytics our industry needs.

 

Apache Spot

Apache Spot is a new open source project designed to address this problem, and accelerate innovation and sharing within the security community. It provides an extensible turnkey solution for ingesting, processing, and analyzing data from security products and infrastructure. Hadoop has come a long way since its inception. Apache Spot opens the door for exciting security applications. Purpose-built for security, Spot does the heavy lifting, providing out-of-the-box connectors that automate the ingest and processing of relevant feeds. Through its open data models, customers and partners are able to share data and analytics across teams -  strengthening the broader community.

Endgame is proud to partner with Cloudera and Intel to accelerate the adoption of Apache Spot across customers and partners.  Our expertise in using machine learning to hunt for adversaries and deep knowledge of endpoint behavior will help Apache Spot become a prominent part of the Hadoop ecosystem. We’re excited to contribute to this open source project, and continue pushing the industry forward to solve the toughest security challenges.

To find out more about Apache Spot, check out the announcement from Cloudera and get involved.


Defeating the Latest Advances in Script Obfuscation


As the security research community develops newer and more sophisticated means for detecting and mitigating malware, malicious actors continue to look for ways to increase the size of their attack surface and utilize whatever means are necessary to bypass protections. The use of scripting languages by malicious actors has spiked in recent years, despite their often limited access to native operating system functionality, because of their flexibility and straightforward use in many attack scenarios.

Scripts are frequently leveraged to detect a user’s operating system and browser environment configuration, or to extract or download a payload to disk. Malicious actors also may obfuscate their scripts to mask the intent of their code and circumvent detection, while also deterring future reverse engineering. As I presented at DerbyCon 6 Recharge, many common obfuscation techniques can be subverted and defeated. Although they seem confusing at first glance, there are a variety of techniques that help quickly deobfuscate scripts. In this post, I’ll cover the latest advances in script obfuscation and how they can be defeated. I’ll also provide some practical tips for quickly cleaning up convoluted code and transforming it into something human-readable and comprehensible.

 

Obfuscation

When discussing malicious scripts, obfuscation is a technique attackers use to purposefully obscure their source code. They do this primarily for two purposes: subverting antivirus and intrusion detection / prevention systems and deterring future reverse engineering efforts.

Obfuscation is typically employed via an automated obfuscator, and there are many freely available tools to choose from.

Since obfuscation does not alter the core functionality of a script (superfluous code may be added to further obscure a script’s purpose), would it be possible to simply utilize dynamic malware analysis methods to determine the script’s intended functionality and extract indicators of compromise (IOCs)? Unfortunately for analysts and researchers, it’s not quite that simple. While dynamic malware analysis methods may certainly be used as part of the process for analyzing more sophisticated scripts, deobfuscation and static analysis are needed to truly know the full extent of a script’s capabilities and may provide insight into determining its origin.

 

Tips for Getting Started

When beginning script deobfuscation, you should keep four goals in mind:

  1. Human-readability: Simplified, human-readable code should be the most obvious goal of the deobfuscation process.
  2. Simplified code: The simpler and more readable the code is, the easier it will be to understand the script’s control flow and data flow.
  3. Understand control flow / data flow:  In order to be able to statically trace through a script and its potential paths of execution, a high level understanding of its control flow and data flow is needed.  
  4. Obtain context: Context pertaining to the purpose of the script and why it was utilized will likely be a byproduct of the first three goals.

Prior to starting the deobfuscation process, you should be sure you have the following:

  • Virtual machine
  • Fully-featured source code editor with syntax and function / variable highlighting
  • Language-specific debugger

It should also go without saying that familiarity with scripting languages is a prerequisite, since you’re trying to (in most cases) understand how the code was intended to work without executing it. The official documentation for the scripting language at hand will be particularly useful.

Online script testing frameworks provide a straightforward means for executing script excerpts. These frameworks can serve as a stepping-stone between statically evaluating code sections and setting up a full-fledged debugging session, and are highly recommended.

Before you begin, it is important to know that there is no specific sequence of steps required to properly deobfuscate a script. Deobfuscation is a non-linear process that relies on your intuition and ability to detect patterns and evaluate code. So, you don’t have to force yourself to go from top to bottom or the perceived beginning of the control flow to the end of the control flow. You’ll simply want to deobfuscate code sections that your eyes gravitate towards and that are not overly convoluted. The more sections you’re initially able to break down, the easier the overall deobfuscation process will be.

Code uniformity is crucial to the deobfuscation process. As you’re deobfuscating and writing out your simplified version of the code, you’ll want to employ consistent coding conventions and indentation wherever possible. You’ll also want to standardize and simplify how function calls are written where possible and how variables are declared and defined. If you take ownership of the code and re-write it in a way that you easily understand, you'll quickly become more familiar with the code and may pick up on subtle nuances in the control or data flow that may otherwise be overlooked.

Also, as previously mentioned, simplify where possible. It can't be reiterated enough!

 

Obfuscation Techniques

Garbage Code

Obfuscators will sometimes throw in superfluous code sections. Certain variables and functions may be defined, but never referenced or called.  Some code sections may be executed, but ultimately have no effect on the overall operation of the script. Once discovered, these sections may be commented out or removed.

In the following example, several variables within the subroutine are defined and set to the value of an integer plus the string representation of an integer. These variables are not referenced elsewhere within the code, so they can be safely removed from the subroutine and not affect the result.

Obfuscation_Code_1
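As an illustrative analogue (the original sample is VBA; the Python below, including its names and values, is hypothetical), the same pattern looks like this:

def decode_config(blob):
    filler_one = 4521 + int("887")    # defined but never referenced again
    filler_two = 13 + int("20931")    # defined but never referenced again
    # Only the line below affects the result, so the two lines above can be deleted outright.
    return bytes(b ^ 0x5A for b in blob)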

Complicated Names

Arguably the most common technique associated with obfuscation is the use of overly complicated variable and function names. Strings containing a combination of uppercase and lowercase letters, numbers, and symbols are difficult to look at and differentiate at first glance. These should be replaced with more descriptive and easier to digest names, reinforcing the human-readable goal. While you can use the find / replace function provided by your text editor, in this case you’ll need to be careful to avoid any issues when it comes to global versus local scope. Once you have a better understanding of the purpose of a function or a variable later on in the deobfuscation process, you can go back and update these names to something more informative like “post_request” or “decoding_loop.”

Obfuscation_Code_2

 In the above example, each variable and function that is solely local in scope to the subroutine is renamed to a more straightforward label describing its creation or limitation in scope. Variables or function calls that are referenced without being declared / defined within the subroutine are left alone for the moment. These variables and functions will be handled individually at the global scope.   
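When a plain find / replace feels too risky, a word-boundary regex helps avoid clobbering identifiers that merely contain the old name, though scope collisions still need to be checked by hand. A minimal Python helper (hypothetical, purely for illustration):

import re

def rename_identifier(code, old_name, new_name):
    # \b ensures "sub1" is not rewritten inside "sub10" or "mysub1a"; global versus
    # local scope still has to be verified manually.
    return re.sub(rf"\b{re.escape(old_name)}\b", new_name, code)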

Indirect Calls and Obscured Control Flow

Obscured control flow is usually not evident until much later in the deobfuscation process. You will generally look for ways to simplify function calls so they’re more direct. For instance, if you have one function that is called by three different functions, but each of those functions transforms the input in the exact same way and calls the underlying function identically, then those three functions could be merged into one simple function. Function order can also come into play. If you think a more logical ordering of the functions that matches up with the control flow you are observing will help you better understand the code, then by all means rearrange the function order.

 

Obfuscation_Code_3

In this case, we have five subroutines. After these subroutines are defined, there is a single call to sub5. If you trace through the subroutines, the ultimate subroutine that is executed is sub2. Thus, any calls outside of this code section to any of these five subs will result in a call to sub2, which actually carries out an operation other than calling another subroutine. So, removing sub1, sub3, sub4, and sub5 and replacing any calls to those subs with a direct call to sub2 would be logically equivalent to the original code sequence.

Arithmetic Sequences

When it comes to hard-coded numeric values, obfuscators may employ simple arithmetic to thwart reverse engineers. Other than doing the actual math, it is important to research the exact behavior of the scripting language implementation of the mathematical functions.

Obfuscation_Code_4

In the line above, the result of eight double values which are added and subtracted to / from each other is passed into the ASCII character function. Upon further inspection, the obfuscator likely threw in the "86" values, as they ultimately cancel each other out and add up to zero. The remaining values add up to 38 which, when passed into the character function, results in an ampersand.
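In Python terms (the constants here are illustrative stand-ins for the eight doubles), the simplification boils down to:

# The 86s cancel out, and the remaining values sum to 38, i.e. an ampersand.
value = 86.0 - 86.0 + 10.0 + 8.0 + 9.5 + 6.5 + 2.0 + 2.0
print(chr(int(value)))  # -> '&'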

Obfuscation_Code_5

While the code section above may look quite intimidating, it is easily reduced down to one simple variable definition by the end of the deobfuscation process. The first line initializes an array of double values which is only referenced on the second line. The second line declares and sets a variable to the second value in the array. Since the array is not subsequently referenced, the array can be removed from the code. The variable from the second line is only used once in the third line, so its value can be directly placed inline with the rest of the code, thus allowing us to remove the second line from the code. The code further simplifies down as the Sgn function calls cancel each other and the absolute value function yields a positive integer, which will be subtracted from the integer value previously defined in the variable from the second line.

 

Obfuscated String Values

As for obfuscated string values, you’ll want to simplify the use of any ASCII character functions, eliminate any obvious null strings, and then standardize how strings are concatenated and merge together any substrings where possible.

This line of code primarily relies on the StrReverse VBA function which, you guessed it, reverses a string. The null strings are removed right off the bat since they serve no purpose. Once the string is reversed and appended to the initial “c” string, we’re left with code which invokes a command shell to run the command represented by the variable and terminate itself.

Obfuscation_Code_6

 

A common technique employed in malicious VBA macros is dropping and invoking scripts in other scripting languages. The macro in this case builds a Windows batch file, which will later be executed. While it is quite evident that a batch file is being constructed, the exact purpose of the file is initially unclear. 

Obfuscation_Code_7

If we carry out string concatenations, eliminate null strings, and resolve ASCII characters, we can see that the batch file is used to invoke a separate VBScript file located in a subdirectory of the current user’s temp directory. 

 

Advanced Cases

Okay, so you tried everything and your script is still obfuscated and you have no idea what else to do…

Well, in this case you’ll want to utilize a debugger and start doing some more dynamic analysis. Our goal in this case is to circumvent the obfuscation and seek out any silver bullets in the form of eval functions or string decoding routines. Going back to the ”resolve what you can first” approach, you also might want to start out by commenting out code sections to restrict program execution. Sidestepping C2 and download functions with the aid of a debugger may also be necessary.

Obfuscation_Final_Code_8

 

If you follow the function that is highlighted in green in the above example, you can see that it is referred to several times. It takes as input one hexadecimal string and one alphanumeric string with varying letter cases, and returns a value. Based on the context in which the function is called (as part of a native scripting language function call in most cases), the returned value is presumed to be a string. Thus, we can hypothesize that this function is a string decoding routine. Since we are using a debugger, we don’t need to manually perform the decoding or reverse engineer all of its inner workings. We can simply set breakpoints before the function is called and prior to the function being returned in order to resolve what the decoded strings are. Once we resolve the decoded strings, we can replace them inline with the code or place them as inline comments as I did in the sample code.

 

Conclusion

Script deobfuscation doesn't require any overly sophisticated tools. Your end result should be simple, human-readable code that is logically equivalent to the original obfuscated script. As part of the process, rely on your intuition to guide you, and resolve smaller sections of code in order to derive context to how they’re used. When provided the opportunity, removing unnecessary code and simplifying code sections can help to make the overall script much more readable and easier to comprehend. Finally, be sure to consult the official scripting language documentation when needed. These simple yet effective tips should provide a range of techniques next time you encounter obfuscated code.  Good luck!

How to Hunt: The [File] Path Less Traveled


As any good hunter knows, one of the first quick-win indicators to look for is malware within designated download or temp folders. When users are targeted via spear phishing or browser based attacks, malware will often be initially staged and executed from those folders because they are usually accessible to non-privileged processes. In comparison, legitimate software rarely runs from, and even more rarely persists from, these locations. Consequently, collecting and analyzing file paths for outliers provides a signature-less way to detect suspicious artifacts in the environment.

In this post, we focus on the power of file paths in hunting. Just as our earlier post on COM Hijacking demonstrated the value of the ATT&CK™ framework for building defensive postures, this piece addresses another branch of the persistence framework and illustrates the efficacy of hunting for uncommon file paths. This method has proven effective time and time again in catching cyber criminals and nation state actors alike early in their exploitation process.

 

Hunting for Uncommon Paths: The Cheap Approach

In past posts, we highlighted some free approaches for hunting, including passive DNS, DNS anomaly detection, and persistence techniques. Surveying file locations on disk is another cost-effective approach for your team. The method is straightforward: simply inspect running process paths or persistent file locations, and then look through the data for locations where attackers tend to install their first-stage malware. For an initial list of such locations, Microsoft’s Malware Protection Center provides a good list of folders where malware authors commonly write files.

Your hunt should focus on files either running or persisting within temp folders, such as %TEMP%, %APPDATA%, %LOCALAPPDATA%, and even extending to %USERPROFILE%, but first you need to collect the appropriate data. To gather persistence data with freely available tools, you can use Microsoft’s SysInternals Autoruns to discover those persistent files and extract their paths. For detailed running process lists, there are many tools available, but we recommend utilizing PowerShell via the Get-Process cmdlet.

Some popular commands include:

  • “Get-Process | Select -Property name, path”: To list current processes and their path.
  • “Get-Process | Select -Property path | Select-String C:\\Users”: In this case we filter for running processes from a user’s path. Like grep, utilize Select-String to filter results.
  • “Get-Process | Group-Object -Property path | Sort count | Select count, name”: Group processes by path and sort by count to see how often each path occurs.

Expanding your hunt beyond just temp folders can also be beneficial, but most legitimate processes will be running from %SystemRoot% or %ProgramFiles%. Because of this, outlier analysis will be effective and assist your hunt. Simply aggregate the results from your running process or persistence surveys across your environment, then perform frequency analysis to find the least frequently occurring samples.
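If you want to do the stacking outside of PowerShell, a few lines of Python will do. This sketch assumes you have already exported (hostname, process path) pairs from across the environment -- for example via the commands above -- into a hypothetical process_paths.csv.

from collections import Counter
import csv

path_counts = Counter()
with open("process_paths.csv", newline="") as fh:  # hypothetical export: host,path per row
    for host, path in csv.reader(fh):
        path_counts[path.lower()] += 1

# The rarest paths across the fleet are the ones worth a closer look.
for path, count in sorted(path_counts.items(), key=lambda kv: kv[1])[:25]:
    print(f"{count:5d}  {path}")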

 

The Noise Dilemma

As you may have guessed, anomalous items uncovered with this approach to hunting are not necessarily suspicious. Users can install applications locally in uncommon locations, adding benign results to your hunt. Experienced hunters know that noise is always going to be part of many hunt analytics, and the art is in discerning the artifacts that are most likely to be malicious and should thus be focused on first.

There are many ways to triage and filter results, but here are a few examples.

  • You can utilize authenticode by filtering out executables that are legitimately signed by well-known signers or by signers you might expect in your environment. Unsigned code out of an odd location might be especially unusual and worth inspecting.
  • If you have a known-good, baseline image of your environment, you can use hash lookups to weed out any applications which were not approved for installation.
  • You may want to ignore folders protected by User Account Controls (UAC). In a well-configured environment, a standard user is not allowed to write to a secure directory, such as %SystemRoot% and some directories under %ProgramFiles%. Filtering out items executing from there can reduce the data set. It’s worth mentioning that you are assuming administrative credentials weren’t harvested and UAC wasn’t bypassed.
  • Once you’ve filtered data out, especially if you are left with many suspicious files, you may want to submit files for malware scans (e.g. VirusTotal). At Endgame, we prioritize based on the MalwareScore™ of the file, our proprietary signature-less malware detection engine. The higher the score, the more likely it is that the file is malicious.

You may come up with many other ways to filter and reduce the data based on your specific knowledge of your environment. As always, your analysis of the hunt data and subsequent analytics should be driven by environmental norms, meaning that if you observe something uncommon to your system configurations, it’s worth investigating.

 

How Endgame Detects

At Endgame, we facilitate one-click collection of processes and persistence locations to allow for rapid anomaly and suspicious behavior detection. Our platform examines each endpoint within your network and can prevent, alert, or hunt for malicious behavior at scale. The following video shows a simple PowerShell-based approach to collection and stacking, and then shows the Endgame platform enumerating and enriching data in seconds to bubble suspicious artifacts to the top with the uncommon path technique.

 

 

Conclusion

There are many very effective ways to hunt for suspicious artifacts across your systems. Hunting for files executing or persisting out of strange paths can be effective in many environments. We’ve discussed some free ways to make this happen, and shown you how this can be done with the Endgame platform, highlighting malicious activity with ease. Layering this hunt approach with other analytics in your hunt operations will allow you to find intrusions quickly and do what every hunter wants to do: close the gap between breach and discovery to limit damage and loss.

It's Time for Cyber Policy to Leapfrog to the Digital Age


In Rise of the Machines, Thomas Rid details the first major digital data breach against the US government. The spy campaign began on October 7, 1996, and was later dubbed Moonlight Maze. This operation exfiltrated data that, if stacked, would exceed the height of the Washington Monument.  Once news of the operation was made public, Newsweek cited Pentagon officials as clearly stating it was, "a state-sponsored Russian intelligence effort to get U.S. technology".  That is, the US government publicly attributed a data breach aimed at stealing vast amounts of military and other technology trade secrets.

Fast-forward twenty years, and on October 7, 2016, ODNI and DHS issued a joint statement noting, “The U.S. Intelligence Community (USIC) is confident that the Russian Government directed the recent compromises of e-mails from US persons and institutions, including from US political organizations.” It’s been twenty years, and our policies have not yet evolved, leaving adversaries virtual carte blanche to steal intellectual property, classified military information, and personally identifiable information. They’re able to conduct extensive reconnaissance into our critical infrastructure, and hack for real-world political impact without recourse. This recent attribution to Russia, which cuts at the heart of democratic institutions, must be a game-changer that finally instigates the modernization of policy as it pertains to the digital domain.

Despite the growing scale and scope of digital attacks, with each new record-breaking breach dwarfing previous intrusions, this is only the fourth time in recent years that the US government has publicly attributed a major breach. Previous public attributions resulted in the indictment of five People’s Liberation Army officials, economic sanctions against North Korea following the Sony breach, and, earlier this year, the indictment of seven Iranians linked to attacks on banks and a New York dam. As breach after breach occurs, those in both the public and private sectors are demanding greater capabilities in defending against these attacks.

Unfortunately, much of the cyber policy discussion continues to rely upon frameworks from decades, if not centuries, ago and is ill equipped for the digital era. For instance, Cold War frameworks may provide a useful starting point, but nuclear deterrence and cyber deterrence differ enormously in the core features of Cold War deterrence – numbers of actors, signaling, attribution, credible commitments, and so forth. Unfortunately, even among the highest ranking government officials there continue to be comparisons between nuclear and cyber deterrence, and so we continue to rely upon an outdated framework that has little relevance for the digital domain.

Some prefer to look back not decades but centuries, and point to Letters of Marque and Reprisal as the proper framework for the digital era. Created to legally empower private companies to take back property that was stolen from them, they are beginning to gain greater attention as ‘hacking back’ grows in popularity in the discourse. Nevertheless, there’s a reason Letters of Marque and Reprisal no longer exist. They fell out of favor, largely because of their escalatory effect on international conflict, during an era that didn’t even come close to the scope and scale of today’s digital attacks, or the interconnectivity of people, money and technologies. Similarly, technical challenges further complicate retaking stolen property.  Adversaries can easily make multiple copies of the stolen data and use misdirection, obfuscation, and short-lived command and control infrastructure.  This confounds the situation and heightens the risk of misguided retaliation.

So where does this leave us? The Computer Fraud and Abuse Act (CFAA) from 1986 remains the core law for prosecuting illegal intrusions. Unfortunately, just like the Wassenaar Arrangement and definitions of cyber war, the CFAA is so vaguely worded that it risks both judicial over-reach and circumvention.  This year’s Presidential Policy Directive 41 is essential and helps incident response, but it has no deterrent effect. In contrast, Executive Order 13694 of 2015, which authorizes sanctions against those engaging in malicious cyber activity, is a start.  It clearly signals the repercussions of an attack, but it has yet to be implemented and thus lacks a deterrent effect. 

Similar steps must be taken to further specify the options available and that will be enacted in response to the range of digital intrusions. Too often it is assumed that a cyber tit for tat is the only viable option. That is extremely myopic, as the US has the entire range of statecraft available, including (but not limited to) cutting diplomatic ties, sanctions, indictments and, at the extreme, the military use of force. The use of each of these must be predicated on the target attacked, the consequences of that attack, as well as the larger geopolitical context.

Clearly, it is time for our policies to catch up with modern realities, and move beyond decades of little to no recourse for adversaries. This must be a high priority, as it affects the US as well as our allies. Last year’s attack on the French TV network, TV5Monde, came within hours of destroying the entire network. The attacks on a German steel mill, which caused massive physical damage, as well as the Warsaw Stock Exchange, not to mention the attacks on the White House, State Department and Joint Staff unclassified emails systems, have also been linked to Russia.

The world has experienced change arguably at a more rapid pace than at any other time in history, largely driven by the dramatic pace of technological change. At the same time, our cyber policies have stagnated, leaving us unprepared to effectively counter the digital attacks that have been ongoing for decades. Given both the domestic and global implications, the US must step forward and offer explicit policy that clearly states the implications of a given attack, including consideration of targets, impacts of the attack, and the range of retaliatory responses at our disposal.

To be fair, balancing between having a retaliatory deterrent effect and minimizing escalation is extremely difficult, but we haven’t even really begun those discussions. Absent this discourse and greater legal and policy clarity, the intrusions will continue unabated. At the same time, many in the private sector will continue to debate the merits of a hacking back framework that carries serious escalatory risks and is likely ineffective.  The next few weeks are extremely important, as the Obama administration weighs the current range of options that cut across diplomatic, information, military and economic statecraft. Hopefully we’ll see a rise in discourse and concrete steps that begin to address viable deterrent options, and signal the implications of digital attacks that have hit our economy, our government, and now a core foundation of our democracy.

 

*Note: This post was updated on 10/12/2016 to also include the indictment against seven Iranians.

 

The Hard Thing About Safe Things


Information security needs a more accurate metaphor to represent the systems we secure. Invoking castles, fortresses and safes implies a single, at best layered, attack surface for security experts to strengthen. This fortified barrier mindset has led to the crunchy outside and soft, chewy center decried by those same experts. Instead of this candy shell, a method from safety engineering - System Theoretic Process Analysis - provides a way to deal with the complexity of the real-world systems we build and protect.

 

A Brief Background on STPA

Occasionally referred to as Stuff That Prevents Accidents, System Theoretic Process Analysis (STPA) was originally conceived to help design safer spacecraft and factories. STPA is a toolbox for securing systems which allows the analyst to efficiently find vulnerabilities and the optimal means to fix them. STPA builds upon systems theory, providing the flexibility to choose the levels of abstraction appropriate to the problem.  Nancy Leveson’s book, Engineering a Safer World, details how the analysis of such systems of systems can be done in an orderly manner, leaving no possible failure unexamined. Because it focuses on the interaction within and across systems, it can be applied far outside the scope of software, hardware and network topologies to also include the humans operating the systems and their organizational structure. With STPA, improper user action, like clicking through a phishing email, can be included in the analysis of the system as much as vulnerable code.

 

Benefits of STPA

At its core, STPA provides a safety first approach to security engineering. It encourages analysts to diagram and depict a specific process or tool, manifesting potential hazards and vulnerabilities that otherwise may not be noticed in the daily push toward production and deadlines. There are several key benefits to STPA, described below.

Simplicity

Diagrammatically, the two core pieces of system theory are the box as a system and the arrow as a directional connection for actions. There is no dogmatic view of what things must be called.  Just draw boxes and arrows to start, and if you need to break it down, draw more or zoom into a box.  The exhaustive analysis works on the actions through labels and the systems’ responses.  The networked system of systems can be approached one connection at a time. Unmanageable cascading failure becomes steps of simultaneous states. Research has shown that this part of STPA can be done programmatically with truth tables and logic solvers. The diagram below illustrates a simple framework and good starting point for building out the key components of a system and their interdependencies.
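To make the truth-table idea concrete, here is a small, hypothetical sketch in Python (our own illustration, not from Leveson's book or any STPA tooling) that enumerates every action/state combination for a single control action and flags the ones an analyst has judged hazardous:

from itertools import product

# Hypothetical control action ("push security patch") examined against a few
# system states. Enumerating every pair keeps the analysis exhaustive, in the
# spirit of STPA's truth-table treatment of control actions.
actions = ["patch provided", "patch not provided", "patch provided too late"]
states = ["host internet-facing", "host air-gapped", "host mid-transaction"]

# Analyst-supplied judgments; these labels are illustrative only.
hazards = {
    ("patch not provided", "host internet-facing"),
    ("patch provided too late", "host internet-facing"),
    ("patch provided", "host mid-transaction"),
}

for action, state in product(actions, states):
    label = "HAZARD" if (action, state) in hazards else "ok"
    print(f"{action:25s} | {state:22s} | {label}")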

 

Completeness

The diagram of systems and their interconnections allows you to exhaustively check the possible hazards that could be triggered by actions.  Human interactions are modeled the same way as other system interactions, allowing for rogue operators to be modeled as well as attackers. This is an especially useful distinction for infosec, which often fails to integrate the insider threat element or human vulnerabilities into the security posture. As you see below, the user is an interconnected component within a more exhaustive depiction of the system, which can be useful to extensively evaluate vulnerabilities and hazards.

 

Clear Prioritization

Engineering a Safer World - and STPA more broadly - urges practitioners and organizations to step back and assess one thing: What can I not accept losing?  In infosec, for example, loss could mean exfiltrated, maliciously encrypted or deleted data or a system failure leading to downtime. During system design, if the contents of a box can be lost acceptably without harming the boxes connected to it, you don’t have to analyze it. An alternative method is to estimate the likelihood of possible accidents and assign probabilities to risks. Analyzing these probabilities of accidents instead makes it more likely that low likelihood problems will be deprioritized in order to handle the higher likelihood and seemingly more impactful events.  But, since the probabilities of failure for new, untested designs can’t be trusted, the resulting triage is meaningless. Instead, treating all losses as either unacceptable or acceptable forces analysts to treat all negative events seriously regardless of likelihood. Black Swan events that seemed unlikely have taken down many critical systems from Deepwater Horizon to Fukushima Daiichi. Treating unacceptable loss as the only factor, not probability of loss, may seem unscientific, but it produces a safer system.  As a corollary, the more acceptable loss you can build into your systems, the more resilient they will be. Building out a system varies depending on each use case. In some cases, a simple diagram is sufficient, while in others, a more exhaustive framework is required. Depending on your specific situation, you could arrive at a system diagram that falls in between those extremes, and clearly prioritizes components based on acceptable loss, as the diagram depicts below.

 

Resilience

Still, accidents happen, and we must then recover.  Working on accident investigation teams, Dr. Leveson found that the rush to place blame hindered efforts to repair the conditions that made the accident possible.  Instead, STPA focuses the investigation on the connected systems, turning the chain of cause and effect into more of a structural web of causation. To blame the user for clicking on a malicious link and say you’ve found the root cause of their infection ignores the fact that users click on links in email as part of their job. The solutions to such problems require more than a blame, educate, blame cycle. We must look at the whole system of defenses, from the OS, to the browser, to the firewall. No longer artificially constrained to simply checking off the root cause, responders can address the systemic issues, making the whole structure more resilient.

 

Challenges with STPA

Although designed for safety, STPA has recently been expanded to security and privacy. Colonel William E. Young, Jr created STPA-Sec in order to directly apply STPA to the military's need to survive attack. Stuart Shapiro, Julie Snyder and others at MITRE have worked on STPA-Priv for privacy related issues. Designing safe systems from the ground up, or analyzing existing systems, using STPA requires first defining unacceptable loss and working outwards. While there are clear operational benefits, STPA does come with some challenges.

Time Constraints

STPA is the fastest way to perform a full systems analysis, but who has the luxury of a full system analysis when half the system isn’t built yet and the other half is midway through an agile redesign? It may be difficult to work as cartographer, archeologist and safety analyst when you have other work to get done. Also, who has the time to read Engineering a Safer World? To address the time constraint, I recommend the STPA Primer.  When time can be found, the scope of a project design and the analysis to be done may look like a never-ending task.  If a project has 20 services, 8 external API hits and 3 user types, the vital systems can be whittled down to perhaps 4 services and 1 user type, simply by defining unacceptable loss properly.  Then, within those systems, subdivide out the hazardous from the harmless.  Now the system under analysis only contains the components and connections relevant to failure and unacceptable loss. While there may be a somewhat steep learning curve, once you get the hang of it, STPA can save time and resources, while baking in safe engineering practices.

Too Academic

STPA may be cursed by an acronym and a wordiness that hides the relative simplicity beneath.  The methodology may seem too academic at first, but it has been used in the real world from Nissan to NASA. I urge folks to play around with the concepts which stretch beyond this cursory introduction. Getting buy-in doesn’t require shouting the fun-killing SAFETY word and handing out hard hats.  It can be as simple as jumping to a whiteboard while folks are designing a system and encouraging a discussion of the inputs and outputs to that single service systematically.  I bet a lot of folks inherently do that already, but STPA provides a framework to do this exhaustively for full systems if you want to go all the way.

 

From Theory to Implementation: Cybersecurity and STPA

STPA grew out of the change in classically mechanical or electromechanical systems like plants, cars and rockets as they became computer controlled. The layout of analog systems was often laid bare to the naked eye in gears or wires, but these easy-to-trace systems became computerized. The hidden complexity of digital sensors and actuators was missed by the standard chain-of-events models.  What was once a physical problem now had an additional web of code, wires, and people that could interact in unforeseen ways.

Cybersecurity epitomizes the complexity and systems of systems approach ideal for STPA. If we aren’t willing to methodically explore our systems piece by piece to find vulnerabilities, there is an attacker who will.  However, such rigor rarely goes into software development planning or quality assurance. This contributes to the assumed insecurity of hosts, servers, and networks and the “Assume Compromise” starting point which we operate from at Endgame. Secure software systems outside of the lab continue to be a fantasy. Instead defenders must continually face the challenge of detecting and eradicating determined adversaries who break into brittle networks. STPA will help people design the systems of the future, but for now we must secure the systems we have.

 

 

Protecting the Financial Sector: Early Detection of Trojan.Odinaff


The financial sector continues to be a prime target for highly sophisticated, customized attacks for an obvious reason - that’s where the money is. Earlier this year, the SWIFT money transfer system came under attack, resulting in an $81 million heist of the Bangladesh Bank. This number pales in comparison to estimates close to $1 billion stolen by the Carbanak group from over 100 banks worldwide.  

Earlier this month, Symantec detailed a new threat to the financial sector, which they said resembles the highly sophisticated Carbanak group.  In their excellent post, they describe Odinaff, a precision toolkit used by criminal actors with a narrow focus on the financial industry and tradecraft resembling that of nation-state hackers.  It appears that the malware is being used in part to remove SWIFT transaction records, which could indicate an attempt to cover up other financial fraud.  

Given the sophistication and stakes involved in the Odinaff campaign, we wanted to see how well Endgame’s early-stage detection capabilities would do against this emergent and damaging campaign. The verdict: extremely well.  Let’s walk through this campaign and show how it can be detected early and at multiple stages, before any damage.

 

Background

According to Symantec, the Odinaff trojan is deployed at the initial compromise. Additional tools are deployed by the group to complete their operations on specific machines of interest.  The group is conscious of its operational footprint and uses stealth techniques designed to circumvent defenses, like in-memory only execution.  The toolkit includes a variety of techniques offering the group flexibility to do just about anything including credential theft, keylogging, lateral movement, and much more.

The integration of multiple, advanced attack techniques with careful operational security is a trademark of most modern, sophisticated attacks.  We’ve put a lot of effort into developing detection and prevention techniques which allow our customers to detect and prevent initial compromise and entrenchment by adversaries (for example, see our How to Hunt posts on file paths and COM hijacking for more information on how we’re automating and enabling hunt operations across systems).  Our exploit prevention, signature-less malware detection, in-memory only detections, malicious persistence detection, and kernel-level early-stage adversary technique detections combine to make it extraordinarily difficult for adversaries to operate.  This prevents the adversary from establishing a beachhead in the network and protects the critical assets they’re after. Let’s take a look and see how this layering and integration of early stage detection and prevention fare against the Odinaff trojan. We tested the following dropper referenced in the Symantec post: F7e4135a3d22c2c25e41f83bb9e4ccd12e9f8a0f11b7db21400152cd81e89bf5.
 

Initial Malware Infection

According to Symantec, the initial trojan is delivered via a variety of methods including malicious macros and uploaded through an existing botnet.  The instant this malware hits disk, Endgame catches it. Our proprietary signature-less detection capability, MalwareScore™, immediately alerts that the new file is extraordinarily malicious, scoring it a 99.86 out of 100.  This would lead to immediate detection through Endgame.  

 

pic 3_odinaff.png

 

Persistence Detection

One of the first things the malware does is persist itself as a run key in the registry. Endgame’s comprehensive persistence enumeration and analytics cause the malicious Odinaff persistence item to clearly stand out, warning the network defender and enabling quick remediation.  The persistence inspection is crucial because even if the Odinaff actors had cleverly written their malware to evade our MalwareScore™, other characteristics of the dropped persistence item are caught by Endgame’s automated analytics. These include an anomaly between the filename on disk and the original compilation filename from Microsoft’s Version Info, the fact that it’s unsigned, outlier analysis highlighting the anomalous artifact in the environment, and more. All these analytics are presented to the user in a rich and intuitive user interface and point to the persistence item as very suspicious.

 

odinaff_analytics.png

 

In-Memory Detection

Endgame’s patent-pending detections of memory anomalies allow users to find all known techniques adversaries use to hide in memory.  On install, Odinaff sleeps for about one minute and then modifies its own memory footprint in a way which Endgame detects as malicious. The evasion technique used is uncommon and will be very difficult to detect with other endpoint security products, requiring at a minimum a tremendous amount of manual analysis.  On the other hand, Endgame highlights Odinaff’s in-memory footprint as malicious with high confidence in seconds.  Endgame discovers other known in-memory stealth techniques just as easily.

 

odinaff_inmem.png

 

Layered and Early Detection at Scale

This malware was not widely discussed in the security community before the Symantec report, yet these sophisticated attackers have been deploying Odinaff in the wild since at least January 2016, according to Symantec.  Signature-based techniques do not provide adequate protection as new threats emerge because it takes time for threats to become known and for signatures to be created and propagate.  

As we’ve described above, Endgame’s layered detection technology detects Odinaff with ease with no prior knowledge of signatures. By focusing on detection of techniques including in-memory stealth, which are seen time and time again as initial access is gained, detection and prevention can reliably take place early.  Early detection stops advanced adversaries from achieving their objectives and in turn prevents damage.  Take a look at the video below to walk through these layered defenses and see how Endgame detected Odinaff early and at various stages of the attack.

 

 

 

How to Hunt: The Masquerade Ball


Masquerading was once conducted by the wealthiest elite at elaborate dances, allowing them to take on the guise of someone else and hide amidst the crowd. Today, we see digital masquerading used by the most sophisticated as well as less skilled adversaries to hide in the noise while conducting operations. We continue our series on hunting for specific adversary techniques and get into the Halloween spirit by demonstrating how to hunt for masquerading. So let’s start the masquerade ball and hunt for a simple but more devious defense evasion technique.

 

Defense Evasion

In nature, camouflage is a time-proven, effective defensive technique which enables the hunted to evade the hunters. It shouldn’t come as any surprise that attackers have adopted this strategy for defense evasion during cyber exploitation, hiding in plain sight by resembling common filenames and paths you would expect within a typical environment. By adopting common filenames and paths, attackers blend into and persist within environments, evading many defensive techniques.

Part of the attacker’s tradecraft is to avoid detection. We can look to frameworks, like MITRE’s ATT&CK™, to guide us through the adversary lifecycle.  We’ve shown how it’s useful for hunting for persistence (as our COM hijacking post demonstrated), and it also covers the broad range of attacker techniques, including defense evasion. DLL search order hijacking, UAC bypassing, and time stomping are all effective for defense evasion, as is the one we will discuss today - masquerading.

Attackers use these defense evasion techniques to blend in, making them easy to miss when hunting, especially when dealing with huge amounts of data from thousands of hosts. Let’s start with some DIY methods to hunt for masquerading, which require an inspection of persistent or running process file names or paths.

 

The Masquerading Approach

We previously explored hunting for uncommon filepaths, which is a simple approach for detecting suspicious files. We can expand on this method by understanding masquerading. Let’s focus on two different masquerading techniques: 

  1. Filename masquerading where legitimate Windows filenames appear in a non-conventional location.
  2. Filename mismatching where filenames on disk differ from those in the resource section of the compiled binary.

Filename Masquerading

For filename masquerading, you need to first build the list of files which have masquerade potential.  We’ll call that the anchor list.  A good approach is installing a clean base image representative of your environment (a fresh install of Windows will do).  Next, you need to choose which files you care about.  Like most things, there is a lazy approach and an approach that takes a little more effort, but will probably give you more meaningful results with less noise.  To build your anchor list the lazy way, simply enumerate all files in C:\Windows including the filename and path and use that as your anchor list.  

However, there are a huge number of filenames in this list, and you should ask yourself questions about the likelihood of an adversarial masquerade before putting a file in the anchor list. After all, it isn’t much of a masquerade if the legitimate filename seen in a process list or anywhere else might cause someone to question its legitimacy, even if it’s a system file, such as NetCfgNotifyObjectHost.exe.  So, put in a bit more work and make a custom list of native Windows files, such as svchost, lsass, wininit, smss, and logonui, which show up constantly and are likely to be passed over if an experienced but rushed investigator is inspecting the name.  It is also a good idea for the anchor list to include names for other common applications you expect to find in your environment, such as reader_sl.exe, winword.exe, and more.  
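For the lazy approach, a minimal sketch (our own helper, not Endgame tooling) that walks a clean image and records each executable name with its expected paths might look like this:

import json
import os
from collections import defaultdict

def build_anchor_list(root=r"C:\Windows"):
    """Walk a clean base image and map each executable name to its expected paths."""
    anchors = defaultdict(set)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".exe", ".dll", ".sys")):
                anchors[name.lower()].add(os.path.join(dirpath, name).lower())
    return anchors

if __name__ == "__main__":
    anchors = build_anchor_list()
    # Persist for later hunts; sets become sorted lists for JSON.
    with open("anchor_list.json", "w") as fh:
        json.dump({name: sorted(paths) for name, paths in anchors.items()}, fh, indent=2)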

Once the anchor list is compiled, you can start using it during your hunt operations.  List the running processes, persistent files, or some other file-backed artifact you’re interested in. Compare those names to the anchor list.  Do the filenames match?  There will be many matches.  What about the filepaths?  If not, you know where to target your hunt.  There are legitimate reasons for this happening (users do unexpected things), but locating this simple defensive evasion technique is a good way to find intrusions.  
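One way to make that comparison, sketched here with the third-party psutil library (an assumption on our part, not a requirement of the approach), is to flag any running process whose name appears in the anchor list but whose on-disk path does not:

import json

import psutil  # third-party: pip install psutil

with open("anchor_list.json") as fh:
    anchors = json.load(fh)

for proc in psutil.process_iter(attrs=["name", "exe"]):
    name = (proc.info["name"] or "").lower()
    path = (proc.info["exe"] or "").lower()
    expected_paths = anchors.get(name)
    # A well-known filename running from an unexpected location is worth a closer look.
    if expected_paths and path and path not in expected_paths:
        print(f"possible masquerade: {name} running from {path}")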

We’d also recommend some additional triage of results before calling this a legitimate detection and embarking on an incident response.  Easy steps include checking the file’s hash against the masquerade target in the anchor list - if it’s a match, it’s probably a false alarm - and checking the signer information for the file, as we discussed in the previous post. Just don’t be too trusting of the name on the cert, as actors can sometimes get code signing certs that look similar to something legitimate...but that’s a topic for another day.

If you find this approach worthwhile, you will have to keep your anchor list updated.  Software changes and if you don’t change with it, you’ll have gaps in your analysis.

Filename Mismatch

Why stop at simply comparing files to your anchor list when more can be done? In this bonus masquerading approach, let’s look at filenames on disk versus those in the resource section of the binary. There’s a wealth of additional information here, including the MS Version Info. As Microsoft notes, it includes the original name of the file, but does not include a path. This can inform you whether the file has been renamed by a user.

Obviously, if the filename on disk doesn’t match the original file name, there are generally two possibilities: either the user renamed it, or maybe someone brought a tool with them, but doesn’t want you to know. Let’s take DLL implants for example. Many APT groups have brought rundll32 with them, as opposed to using the native Windows version. APT groups aren’t the only ones masquerading. Everyone does this!
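A rough way to pull that compiled-in original name is with the third-party pefile library; the attribute layout differs across pefile versions, so treat this as an illustrative sketch rather than a drop-in tool:

import os

import pefile  # third-party: pip install pefile

def original_filename(path):
    """Return the OriginalFilename string from a PE's version resource, if present."""
    pe = pefile.PE(path, fast_load=True)
    pe.parse_data_directories(
        directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_RESOURCE"]])
    for info in getattr(pe, "FileInfo", None) or []:
        # Newer pefile versions nest FileInfo entries one list deeper.
        for entry in (info if isinstance(info, list) else [info]):
            if getattr(entry, "Key", b"") == b"StringFileInfo":
                for table in entry.StringTable:
                    name = table.entries.get(b"OriginalFilename")
                    if name:
                        return name.decode("utf-8", errors="ignore")
    return None

path = r"C:\Windows\System32\svchost.exe"  # substitute any file-backed artifact from your hunt data
compiled_name = original_filename(path)
if compiled_name and compiled_name.lower() != os.path.basename(path).lower():
    print(f"name mismatch: {path} was compiled as {compiled_name}")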

 

Endgame @ the Masquerade Ball

Crafting your own anchor list, regularly updating it, and manually comparing the list to your hunt data or adding this analytic to your bag of post-processing scripts may work for some, but it calls for routine grooming.  Let’s take a look at how easy it is to hunt for masquerading using Endgame, where we provide this as one of the many one-click automations in the platform.

 

 

Conclusion

Who amongst us doesn’t love to use Halloween as an excuse to masquerade as someone, or something, else? Unfortunately, adversaries embrace this mentality year round, hiding in plain sight, actively evading detection, and trying to blend in.  Clever use of masquerading within filenames can make their activities difficult to detect.  While there are manual means to detect mismatches and masquerading, they can be time intensive and may not scale well to larger environments. Thanks to Endgame’s advanced detection capabilities, in a few clicks we are able to quickly catch those masqueraders, remediate the intrusion early, and get back to the ball.

 

 

Endgame Research @ AISec: Deep DGA


Machine learning is often touted as a silver bullet, enabling big data to defeat cyber adversaries, or some other empty trope. Beneath the headlines, there is rigorous academic discourse and advances that often are lost in the hype. Last month, the Association for Computing Machinery (ACM) held their 9th annual Artificial Intelligence in Security (AISec) Workshop in conjunction with the 23rd ACM Conference on Computer and Communications Security Conference in Vienna, Austria. This is largely an academic-focused workshop, highlighting some of the most novel advances in the field. I was lucky enough to present research co-authored by my colleagues Hyrum Anderson and Jonathan Woodbridge on Adversarial Machine Learning, titled DeepDGA: Adversarially-Tuned Domain Generation and Detection. As one of only three presenters outside of academia, it was quickly evident that more conferences which focus on the intersection of machine learning and infosec are desperately needed. While the numerous information security conferences are beginning to introduce data science tracks, it is simply not enough. Given how nascent machine learning is in infosec, the industry would benefit greatly from more venues for cross-pollinating insights across industry, academia, and government.

 

AISec brings together researchers from academic institutions and a few corporations to share research in the fields of security and privacy, highlighting what most would consider novel applications of learning algorithms that address hard security problems. Researchers applied machine learning (ML) and network analysis techniques to malware detection and application security, and to address privacy concerns. The workshop kicked off with a keynote from Elie Bursztein, the director of Anti-Abuse at Google. Although the bulk of his talk discussed various applications of ML at Google, he ended by stressing the need for openness and reproducibility in our research. It was a great talk, and Google is backing this call with open source efforts like TensorFlow and a steady pace of high-performing model releases such as Inception-ResNet-v2, which allows even an entry-level deep learning (DL) enthusiast to get their hands dirty with a “real” model. In a way, this call for reproducible research could go a long way toward eliminating a common misconception about the use of ML in infosec: that ML is only a black box obfuscated by the latest marketing speak. Opportunities abound to provide infosec data scientists an avenue to demonstrate results without “giving away the farm” and potentially losing out on any (and much deserved) intellectual property rights.

 

AISec compiled a fantastic day of talks, ranging from the release of a new dataset, cleverly named "SherLock vs Moriarty: A Smartphone Dataset for Cybersecurity Research" (Mirsky et al), to "Identifying Encrypted Malware Traffic with Contextual Flow Data" (Anderson and McGrew – Cisco Systems) and "Prescience: Probabilistic Guidance on the Retraining Conundrum for Malware Detection" (Deo et al – Royal Holloway, U of London, UK). The latter tackled a problem common amongst those who do ML on malware: stale models, or models that were trained on old or outdated samples (a problem known as adversarial drift). The basic question is "how do you decide when it is time to retrain a malware classification model?" This can be difficult, especially since training can be expensive, both in terms of time and computational resources. The model can go stale for a variety of reasons, such as shifts in malware techniques or changes to original labels of training samples. Drift is a fascinating topic, and the researchers did an excellent job describing their methodology (the use of Venn-Abers predictors) for handling such problems.

 

Our presentation focused on the use of Generative Adversarial Networks (GANs) to create a “red team vs. blue team” game for the creation of Domain Generation Algorithms (DGAs). We leveraged GANs to construct a deep-learning DGA designed to bypass an independent classifier (red team), and posited that adversarially generated domains can improve training data enough to harden an independent classifier (blue team). In the end, we showed that adversarially-crafted domain names targeting a DL model are also adversarial for an independent external classifier and, at least experimentally, that those same adversarial samples could be used to augment a training set and harden an independent classifier.

 Bobby presenting at AISec in Vienna, Austria on October 28, 2016

 

While more mainstream security conferences have an occasional talk on ML, this was my first experience where it was the sole focus. Given the rise of ML in the information security domain, it would be shocking if AISec remained the only ML-focused security conference in the coming years. Frankly, it’s well past time to have a larger, multi-day conference, where both academic and information security company researchers come together to learn, recruit, and network on topics we spend our days (and often nights) trying to solve.

 

That is not to take anything away from AISec, as this conference packed quite the proverbial punch with the eight hours it had at its disposal. In fact, my biggest takeaway from AISec is that more conferences like it are needed.  To that end, we’re working across partners and within our networks to formulate such a gathering in 2017, so stay tuned! Providing a venue for researchers to put aside various rivalries, including academic, corporate or public sectors, even for a day, would greatly benefit the entire infosec community, allowing us the opportunity to listen, learn, and apply.

 

Written by Bobby Filar


Cyber Threat Lessons Learned from Reversing the Flare-On Challenge


The FireEye Labs Advanced Reverse Engineering (FLARE) team just hosted the third annual FLARE-On Challenge, its reverse-engineering CTF. The CTF is made up of linear challenges where one must solve the first to proceed to the next. Out of the 2,063 participants, only 124 people were able to finish all of the challenges. Of those 124 successful participants, only 17 were from the U.S., including us! Josh completed the FLARE-On Challenge last year (2015), learned a lot, and improved his reversing skills significantly, which was motivation enough to challenge himself again and attempt to finish this year’s contest. Last year’s challenge had interesting problems and gave him first-hand experience reverse engineering Windows kernel drivers, native Android ARM libraries, and .NET executables, among other types of binaries. This year’s challenge was also a good way to validate that his reverse-engineering skills hadn’t dipped, and he was able to get through some of the challenges this year faster thanks to the lessons learned last year. Blaine's passion is reverse engineering, which he has applied to analyzing malicious binaries for years, including those of APTs. Due to the competition's reputation, Blaine decided to participate as a way to validate and hone his reverse engineering and malware analysis skills. While we both approached each problem set in unique ways, the competition this year did not disappoint.

So What is the FLARE-On Challenge?

This year's contest consisted of ten levels, each requiring a different strategy and set of skills. The levels progressed in difficulty starting with more basic reversing skills and escalating to the more difficult and lesser known skills. Many levels employed different anti-analysis techniques including:

    • Custom base64 encoding
    • Various symmetric encryption routines
    • Obfuscation
    • Anti-VM and anti-debugger checks
    • Custom virtual machine

As per the tradition in past FLARE-On challenges, each level consisted of a binary that participants needed to reverse-engineer to uncover a hidden flag -- an email address ending in “@flare-on.com”. Each challenge is unique and doesn’t build upon the previous challenges. Some examples of binaries seen in this year’s contest were:

    • .NET executable
    • DLL
    • Compiled Go executable
    • Ransomware sample
    • PCAP
    • SWF

This year’s FLARE-On Challenge used a variant of the CTFd framework to host the competition. Upon completing a level and finding the flag, you’d enter it into the system and it would score your flag as correct or incorrect, and record your time of completion. The framework is nice as it also provides you with a statistics view of how you are proceeding through the challenge. Below are examples of our stats:

Screen Shot 2016-11-08 at 2.49.30 PM.png

 

  

 

Screen Shot 2016-10-30 at 4.26.37 PM.png

Note: The failed submission attempts are a result of not being able to tell the difference between “0”s and “O”s on the challenges where the flag was in the form of an image.

 

Getting Started

The FLARE-On Challenge is open to all who wish to participate, and welcomes all skill levels from beginners to experts, or just the plain curious. Simply register on the site with your email and you’re off to the races. The challenges are generally open to contestants for 5-6 weeks at a time and have so far been held between July and November. This year’s contest was held for 6 weeks, starting on Sept 23 and ending on Nov 4.

If you’re new to the CTF world, there are some fundamental building blocks you’ll need to get started. First, you’ll need a virtual machine to enable you to run applications and various programs in either Windows or Linux. While not 100% necessary, we always recommend a VM as precautions should be taken when running unknown binaries on your machine. Your debugger of choice such as OllyDbg, Immunity Debugger, WinDbg, or x64dbg is also a necessity. Similarly, you’ll need your favorite disassembler. We highly recommend IDA (pro or free version), radare2, or Hopper (pro or free version). Additionally, a foundational knowledge of x86/x64 fundamentals will help with being able to read the disassembly.

With that infrastructure in place, there are a few more decisions to make. The challenge is language agnostic, although Python and C/C++ are always solid options. As we mentioned previously, each challenge is unique, so you’ll be acquiring and relying on different tools as you progress through the challenges. At various points, we relied upon tools like dnSpy and ffdec, keeping in mind that each binary is a precious snowflake that brings unique challenges. To that end, an interest in solving puzzles is perhaps the most essential requirement for succeeding at the FLARE-On challenge.

FLARE-On Strategery

A good strategy, especially for CTF-style reversing problems, is to start at the “win” basic block, and work your way backwards to see what conditions need to be satisfied to reach it. Pay attention to how your user input affects the flow of execution, and learn to block out the stuff that doesn’t matter (i.e. the white noise) which generally comes with experience -- so until then, enjoy determining what all the nitty-gritty parts do!

Writing out the pseudocode can aid in understanding what various functions and basic blocks do. For practice we highly recommend the open source IDA Pro Binary Auditing Training Material, which has many binaries representing high-level language (HLL) constructs (such as If-Then statements, pointers, C++ virtual tables, etc...). By understanding how these HLL constructs map to their disassembly counterparts you’ll quickly be able to understand what’s happening at the disassembly level and be able to reproduce near-source pseudocode. Or you can use a decompiler to do the bulk of the work for you, such as the ones found in IDA Pro and Hopper Pro.

Anti-analysis checks (such as anti-VM and anti-debugging) are sometimes thrown in the challenges to slow analysis of the binaries. However, this mirrors real world malware which usually has multiple anti-analysis checks built in. These checks serve multiple purposes in real world malware -- to hinder analysis and prevent infection of non-targeted systems (such as a malware analyst’s machine or honeypot). These checks can usually be overcome via binary patching (NOPing out the instructions) or modifying the VM if necessary (renaming or deleting specific drivers or programs).

If all else fails, keep it simple. Break apart the code one piece at a time, and if you hit a wall, Google it out! Remember, any and all tools (hopefully obtained legally) are at your disposal, so use them. Be creative! This is a great opportunity to gain a deeper understanding of real-world techniques used by malware authors to increase the difficulty of reversing.

So You Think You Know Reversing?

Are you looking for a personal test of skills and mental fortitude? Yearning for that mad street cred? Want to be the envy of everyone in your office with this most elusive of swag? The FLARE-On Challenge is for you! This year’s prize is the police-style badge below. Pretty cool, right?

1478293868725.jpg

More importantly, the FLARE-On Challenge is a tremendous way to continue to test, hone, and expand your reverse engineering skills. Now that you know how to get started, we strongly encourage you to consider participating in next year’s challenge. Over the remainder of the year, and to further assist you in your FLARE-On aspirations, we’ll provide a few more posts pertaining to the FLARE-On Challenge. We’ll get into the weeds of some of the more creative and daunting challenges we overcame en route to joining the esteemed ranks of those who completed previous challenges.

 

 

 

 

Using Deep Learning to Detect DGAs


The presence of domain names created by a Domain Generation Algorithm (DGA) is a telling indicator of compromise.  For example, the domain xeogrhxquuubt.com is a DGA generated domain created by the Cryptolocker ransomware. If a process attempts to connect to this domain, then your network is probably infected with ransomware.  Domain blacklists are commonly used in security to prevent connections to such domains, but this blacklisting approach doesn’t generalize to new strains of malware or modified DGA algorithms.  We have been researching DGAs extensively at Endgame, and recently posted a paper on arxiv that describes our ability to predict domain generation algorithms using deep learning.

 

In this blogpost, we’ll briefly summarize our paper, presenting a basic yet powerful technique to detect DGA generated domains that performs far better than “state-of-the-art” techniques presented in academic conferences and journals.  “How?” you may ask.  Our approach is to use neural networks (more popularly called deep learning) and more specifically, Long Short-Term Memory networks (LSTMs).  We’ll briefly discuss the benefits of this specific form of deep learning and then cover our simple yet powerful approach for detecting DGA algorithms.

 

If you are unfamiliar with machine learning, you may want to jump over to our three part intro series on machine learning before continuing on.

 

The Benefits of Long Short-Term Memory Networks

 

Deep learning is a recent buzzword in the machine learning community.  Deep learning refers to deeply-layered neural networks (one type of machine learning model), in which feature representations are learned by the model rather than hand-crafted by a user.  With roots that go back decades, deep learning has become wildly popular in the last four to five years in large part due to improvements in hardware (such as improved parallel processing with GPUs) and optimization tricks that make training complex networks feasible. LSTMs are one such trick for implementing recurrent neural networks, that is, neural networks that contain cycles. LSTMs are extremely good at learning patterns in long sequences, such as text and speech. In our case, we use them to learn patterns in sequences of characters (domain names) so that we can classify them as DGA-generated or not DGA-generated.

 

The benefit of using deep learning over more conventional machine learning algorithms is that we do not require any feature engineering.  Conventional approaches generate a long list of features (such as length, vowel to consonant ratio, and n-gram counts) and use these features to distinguish between DGA-generated and not DGA-generated domains (our colleague wrote about this in a previous post). If an adversary knows your features, they can easily update their DGA to avoid detection.  This requires us, as security professionals, to go through a long and arduous process of creating new features to stay ahead.  In deep learning, on the other hand, the model learns the feature representations automatically allowing our models to adapt more quickly to an ever-changing adversary.

 

Another advantage to our technique is that we classify solely on the domain name without the use of any contextual features such as NXDomains and domain reputation.  Contextual features can be expensive to generate and often require additional infrastructure (such as network sensors and third party reputations systems).  Surprisingly, LSTMs without contextual information perform significantly better than current state of the art techniques that do use contextual information!  For those who want to know more about LSTMs, tutorials are plentiful.  Two that we recommend: colah’s blog and deeplearning.net, and of course our paper on arxiv.

 

What is a DGA?

 

First, a quick note on what a DGA is and why detecting DGAs is important.  Adversaries often use domain names to connect malware to command and control servers.  These domain names are coded into malware and give the adversary some flexibility in where the C2 is actually hosted in that it’s easy for them to update the domain name to point to a new IP.  However, hardcoded domain names are easy to blacklist or sinkhole.

 

Adversaries use DGAs to evade blacklisting and sinkholing by creating an algorithm (DGA) that creates pseudorandom strings that can be used as domain names.  Pseudorandom means that the sequences of strings appear to be random but are actually repeatable given some initial state (or seed).  This algorithm is used by both malware that runs on a victim’s machine as well as on some remote software used by the adversary.
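As a purely illustrative example (a toy algorithm of our own, not one of the real DGA families and not the implementations on our github), a DGA seeded by a keyword and the current date might look like:

import hashlib
from datetime import date

def toy_dga(seed, day, count=10):
    """Derive pseudorandom-looking domains from a seed and a date."""
    domains = []
    for i in range(count):
        digest = hashlib.md5(f"{seed}:{day.isoformat()}:{i}".encode()).hexdigest()
        # Map hex digits to letters to get a plausible-looking label.
        label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:12])
        domains.append(label + ".com")
    return domains

print(toy_dga("examplemalware", date.today()))

Because the output depends only on the seed and the date, the malware and the operator can run the same function independently and agree on the day's candidate domains without any direct communication.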

 

On the adversary side, the attacker runs the algorithm and randomly selects a small number of domains (sometimes just one) which he knows will be predicted by the DGA, registers them, and points them at C2 servers.  On the victim side, malware runs the DGA and checks an outputted domain to see if it’s live.  If a domain is registered, the malware uses this domain as its Command and Control (C2) server.  If not, it checks another.  This can happen hundreds or thousands of times. 

 

If security researchers gather samples and can reverse engineer the DGA, they can generate lists of all possible domains or pre-register domains which will be predicted in the future and use them for blacklisting or sinkholing, but this doesn’t scale particularly well since thousands or more domains can be generated by a DGA for a given day, with an entirely new list each following day.  As you can see, a more generalized approach for predicting whether a given domain name is likely generated by a DGA would be ideal.  That generalized approach is what we’re describing here.

 LSTMblack background.jpeg

Figure 1 demonstrates the discovery process used by many types of malware.  In this case, the malware attempts three domains: asdfg.com, wedcf.com, and bjgkre.com.  The first two domains are not registered and receive an NXDomain response from the DNS server.  The third domain is registered and the malware uses this domain to call home.

 

Building the LSTM

 
Training Data

Any machine learning model requires training data.  Luckily, training data for this task is easy to find.  We used the Alexa top 1 million sites for benign data.  We also put together a few DGA algorithms in Python that you can grab on our github site.  We use these to generate malicious data.  These will generate all the data needed to reproduce the results in this blog.

 
Tools and Frameworks

We used the Keras toolbox, which is a Python library that makes coding neural networks easy. There are other tools that we like, but Keras is the easiest to demo and understand.  Keras is stable and production ready as it uses either Theano or TensorFlow under the hood. 

 
Model Code

The following is our Python code for building our model:

LSTM2.png
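Since the original snippet is embedded as an image, here is a minimal reconstruction assembled from the breakdown below; it assumes the Keras Sequential API, and the Dropout rate is our own guess since the post doesn't state it:

from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout, Embedding, LSTM

def build_model(max_features, maxlen):
    """max_features: number of valid characters; maxlen: longest string we will pass in."""
    model = Sequential()
    model.add(Embedding(max_features, 128, input_length=maxlen))  # each char -> 128 floats
    model.add(LSTM(128))              # 128-dimensional internal state
    model.add(Dropout(0.5))           # rate is our assumption; guards against overtraining
    model.add(Dense(1))
    model.add(Activation("sigmoid"))  # squash output to [0, 1]: benign vs. malicious
    model.compile(loss="binary_crossentropy", optimizer="rmsprop")
    return model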

 

Yup, that’s it! Here is a very simple breakdown of what we’re doing (with links to additional information):

  • We create a basic neural network model on the first line.  The next line adds an embedding layer.  This layer converts each character into a vector of 128 floats  (128 is not a magical number. We chose it as it gave us the best numbers consistently).  Each character essentially goes through a lookup once this layer is trained (input character and output 128 floats). max_features defines the number of valid characters.  input_length is the maximum length string that we will ever pass to our neural network.
  • The next line adds an LSTM layer.  This is the main workhorse of our technique.  128 represents the number of dimensions in our internal state (this happens to be the same size as our previous embedding layer by coincidence).  More dimensions mean a larger more descriptive model, and we found 128 to work quite well for our task at hand.
  • The Dropout layer is a trick used in deep learning to prevent overtraining.  You can probably remove this, but we found it useful. 
  • This Dropout layer precedes a Dense layer (fully connected layer) of size 1. 
  • We added a sigmoid activation function to squash the output of this layer between 0 and 1, which represent, respectively, benign and malicious. 
  • We optimize using the cross entropy loss function with the RMSProp optimizer. RMSProp is a variant of stochastic gradient descent and tends to work very well for recurrent neural networks. 

 

Preprocessing Code

Before we start training, we must do some basic data preprocessing.  Each string should be converted into an array of ints that represents each possible character.  This encoding is arbitrary, but should start at 1 (we reserve 0 for the end-of-sequence token) and be contiguous. An example of how this can be done is given below.

LSTM3.png
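The screenshot isn't reproduced here, but the encoding can be done along these lines (variable names are ours):

# Build a contiguous character-to-int encoding starting at 1 (0 is reserved for padding).
domains = ["google.com", "xeogrhxquuubt.com"]  # in practice: the full Alexa + DGA corpus
labels = [0, 1]                                # 0 = benign, 1 = DGA-generated

valid_chars = {ch: idx + 1 for idx, ch in enumerate(sorted(set("".join(domains))))}
max_features = len(valid_chars) + 1            # number of valid characters, plus the pad token
maxlen = max(len(d) for d in domains)          # maximum length string we will pass to the network

X = [[valid_chars[ch] for ch in domain] for domain in domains]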

Next, we pad each array of ints to the same length.  Padding is a requirement of our toolboxes to better optimize calculations (theoretically, no padding is needed for LSTMs). Fortunately, Keras has a nice function for this:

 LSTM4.png
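That call looks roughly like this:

from keras.preprocessing import sequence

# Pad (or crop) every encoded domain to the same length; padding uses 0s.
X = sequence.pad_sequences(X, maxlen=maxlen)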

maxlen represents the length that each array should be.  This function pads with 0’s and crops when an array is too long.  It’s important that your integer encoding from earlier starts at 1 as the LSTM should learn the difference between padding and characters.

From here, we can divide out our testing and training set, train, and evaluate our performance using a ROC curve.

LSTM5.png
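A sketch of those last steps, assuming scikit-learn for the split and the AUC computation and reusing the build_model helper from the model sketch above (run this against the full corpus, not the toy two-domain list):

import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

y = np.array(labels)  # one label per row of X: 0 = benign, 1 = DGA-generated

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = build_model(max_features, maxlen)
model.fit(X_train, y_train, batch_size=128, epochs=10)  # nb_epoch= on older Keras versions

probs = model.predict(X_test).ravel()
print("AUC:", roc_auc_score(y_test, probs))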

 

Comparisons

We compared our simple LSTM technique to three other techniques in our arxiv paper.  To keep things simple for this blog, we only compare the results with a single method that uses bigram distributions with logistic regression.  This technique also works better than the current state of the art (but not as good as an LSTM) and is surprisingly fast and simple. It is a more conventional feature-based approach where features are the histogram (or raw count) of all bigrams contained within a domain name. The code for this technique is also on our github site.

 

Results

We are finally ready to see some results!  And here they are:

LSTM6.png

 

Nice!  An AUC of 0.9977 with just a few lines of code! All of this is featureless with a simple and straightforward implementation.  We actually did a much deeper analysis on a larger and more diverse dataset and observed 90% detection with a 1/10,000 false positive rate, and this can be combined with other approaches outside the scope of this post to improve detection even further.  With our LSTM, we even did quite well on multiclass classification, meaning we could classify a piece of malware just by the domains it generated.  

 

Conclusion

We presented a simple technique using neural networks to detect DGAs.  The technique uses no contextual information (such as NXDomains or domain reputation) and performs far better than state of the art. While this was a brief summary of our DGA research, feel free to read our complete paper on arxiv and explore our code on github.  As always, comments and suggestions are welcome on our github site! 

Another 0 Day: Protection from a Use-After-Free Vulnerability


A new 0day against the popular browser Firefox was revealed yesterday. It specifically targets the “Tor Browser” project, a favorite of Tor users. The Endgame Vulnerability Research & Prevention team quickly analyzed the exploit from the original post, as well as a cleaned version of the reduced JavaScript. 

This vulnerability appears to be a fairly typical Use-After-Free (UAF) triggered when animating SVG content via JavaScript. In general, when exploiting UAF vulnerabilities there are a few key steps to gain code execution, which also occur here. The code must first create a situation where an object is allocated, freed, and then used after the free. Next, the memory of the process must be arranged so that when the use-after-free happens, code execution is gained.

In this exploit, the initial HTML page includes JavaScript that first sets up some global variables used to communicate with the initial page, and then spins up a Web Worker thread responsible for the majority of the exploit, including heap grooming, memory reading/writing, ROP chain creation, and payload storage.

In the cleaned version of the worker JavaScript it is evident that the exploit first creates three utility objects. First, the Memory class is used to precisely read and write arbitrary virtual memory. By exposing a few functions it gives the exploit writer a clean interface to bypass ASLR, as well as store the ROP and payload into memory. The PE class is next. This is designed to dynamically resolve import functions for use in the ROP chain. The final helper class is ROP. Its purpose is to dynamically scan memory for ROP gadget bytes and generate a ROP chain. This is done so the author can target multiple versions of Firefox without requiring version-specific information beforehand.

The bulk of the worker thread is designed to drive the exploitation in the following steps.

  1. Prepare memory by allocating a large number of objects to address 0x30000030.
  2. Trigger the usage of the free memory in the original exploit function which will corrupt the allocated heap blocks.
  3. Use the memory class to leak the base address of XUL.dll.
  4. Use the PE class to resolve the import address of Kernel32.dll!CreateThread.
  5. Dynamically  construct the ROP chain using the ROP class.
  6. Store the addresses of the ROP chain and payload in memory.
  7. Trigger the control flow hijack that calls the stack pivot to initiate the ROP chain.
  8. Gain code execution and execute the payload stored in the JavaScript variable “thecode”.

While this vulnerability was surely discovered through fuzzing, the exploit writer wasn’t a novice. The use of helper classes to dynamically build the ROP chain and payload reflects moderate skill. 

 

Prevention

After analyzing the exploit we quickly wanted to determine how our Endgame exploit prevention techniques would do against this never-before-seen vulnerability and exploit. 

Endgame has two distinct approaches to exploit prevention. One uses our novel Hardware-Assisted Control-Flow-Integrity (HA-CFI) that can predict when a branch misprediction is about to execute malicious instructions. The other consists of dynamically inserted checks that are performed during program execution.

When testing HA-CFI against exploits, we really want to catch the exploit before the ROP chain runs. This gives the defender an earlier place to determine whether something “bad” is about to happen. In this instance, because we know this UAF will eventually alter control flow, we were able to successfully detect that first change.  Using IDA, we have visualized the detection information straight from HA-CFI's alert. The source is the exact location of the control-flow hijack, and the destination is the beginning of the ROP chain. It is important to note that we block the destination address from ever executing.  

 

HA-CFI Source

 

odaysource.png 

 

HA-CFI Destination

 

odaydestination.png

 

Pretty neat! As an added bonus it helps us when reverse engineering exploits because we know exactly where the bug is.  

As mentioned earlier, we also have a dynamic ability to detect exploits. We call this DBI, for Dynamic Binary Instrumentation. When talking about exploit prevention, there is early and late stage detection. Late detection is typical of most exploit prevention software, and for an experienced exploit writer, the later the detection, the easier it is to bypass. This is why it is essential to detect exploits as early as possible with innovative techniques such as HA-CFI and new DBI checks.

Our recently released DBI prevention check detects this vulnerability early. The new protection is designed to detect when an attacker is attempting to dynamically resolve PE imports and construct a ROP chain. By doing this we actually catch the exploit in Step 4 above. Below is some output from a debug version of the software showing exactly what happened.

Screen Shot 2016-11-30 at 8.14.26 PM.png

You can see that the exploit tried to read the header of xul.dll (base address 0x695c0000). And if we disassemble the source of the read we see an executable JIT page making the offending read.

 

oday3.png

 

Stay tuned for the next release, where some of our latest research is designed to catch exploits even earlier, at Step 1.

 

Conclusion

At this point, 0days should not surprise anyone. They have been, and will be, a regular occurrence. However, the targeting of Firefox and Tor in this case is particularly notable, as Tor is often used for illegal purposes, and exploiting the user's browser is a quick way to “de-anonymize” the user. Also, unlike most exploit kits, this exploit doesn’t appear to be highly obfuscated, which also differentiates it from other 0days.

Mozilla should be releasing a patch today or tomorrow, and users are urged to upgrade if they are using Firefox. In the meantime Endgame users are protected, and everyone else should disable JavaScript.

 

 

How to Hunt: Finding the Delta


Identifying outliers or anomalous behavior depends heavily on a robust and credible understanding of baseline characteristics within a network. Normal behavior and attributes vary significantly and are unique to each environment. Any effort to structure the baseline should account for essential factors such as temporal and geographic considerations, number of users, file types, approved applications, and so forth. In the past, we have referenced the use of baselines and gold images to reduce false positives. Since so much of outlier detection and hunting rests on a solid foundation of these baseline characteristics, it is worth devoting some time to how to use gold images for hunting. When a gold image is available, it is extremely valuable and impacts every aspect of the hunt.

 

Baselining the Hunt

So what is the most useful baseline for hunting? Most IT staffs start with a standard OS build and expand it to include additional applications approved for internal company use. The gold image is this clean slate prior to any user interaction. By comparing workstations to the baseline image, you can perform a delta analysis to help hone your hunt. This is especially important if your users are unable to install new software or don't commonly do so, as anything that deviates from the gold image might be an anomaly worth investigating. Even if the analysis determines the file or application is benign, it may surface a policy violation. This sounds simple, but like any hunting approach, a degree of grooming is required to reduce false positives in your environment. Baselining is straightforward in principle but can be very difficult in heterogeneous environments.

 

Tips to Help Baseline Your Environment

While there are numerous ways to baseline your environment, SANS wrote a great white paper on the topic that remains useful today. However, you can't keep the same baseline forever; it is imperative to keep the image updated to stay current and avoid noise in the analysis. For instance, if you aren't updating your baseline with routine OS updates, you could end up comparing Windows machines at different patch levels, which will generate many false positives.

In the end, if you are able to keep an actively maintained baseline image, what comparisons should be prioritized between the baseline and a live system? It is far too complicated and inefficient to compare every system file and running process. Those comparisons are riddled with false positives, so it's important to compare only artifacts worth investigating. As we've previously written, any discussion of malware quickly turns to the challenges of persistence. So why not compare only persistent files? This is a much more manageable approach: collect only persistent artifacts in the baseline and then compare the differences, which will more directly illuminate potentially suspicious persistence items. Even if you are worried about user-installed applications, outlier analysis will assist you, assuming there is some consistency across business units at your company. As always, that consistency may not exist, which introduces an additional factor to consider when building your baseline.

 

From Baseline to Hunt

Now that you have established your baseline, there are many open-source methods for conducting in-depth delta analyses to help guide the hunt. While the list of approaches is long, it's most useful to start with PowerShell and Sysinternals Autoruns. To do this, collect persistent files with Autoruns on your gold image and store those results. Next, you can execute remote scripts to collect Autoruns output from the rest of your environment (check out our previous blog for more information on Hunting on the Host); a rough sketch of remote collection is shown below. Compare the results of these collections to your gold image and highlight the differences. You might get lucky and find something interesting without applying any further logic!
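For the remote collection step, a minimal sketch might look like the following. This assumes PowerShell remoting (WinRM) is enabled and that autorunsc.exe has already been staged on each target; the hosts.txt file and the C:\Tools path are illustrative assumptions, not requirements.

# Sketch: collect autoruns hash output from remote hosts via PowerShell remoting.
# Assumes WinRM is enabled and autorunsc.exe has been copied to C:\Tools on each target.
$targets = Get-Content .\hosts.txt
foreach ($t in $targets) {
    Invoke-Command -ComputerName $t -ScriptBlock {
        & "C:\Tools\autorunsc.exe" -accepteula -a * -h |
            Select-String MD5 | Sort-Object -Unique
    } | Set-Content ".\$t-autoruns-output.txt"
}

Each output file can then be compared against the gold image results exactly as in the example below.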

Here is a quick example of how to compare MD5 hashes from persistent files:

  1. On the gold image execute: PS>.\autorunsc.exe -a * -h | select-string MD5 | sort-object -unique > baseline-autoruns-output.txt
  2. On the target host execute: PS>.\autorunsc.exe -a * -h | select-string MD5 | sort-object -unique > target-autoruns-output.txt
  3. Compare: PS>compare-object -ReferenceObject $(get-content .\baseline-autoruns-output.txt) -DifferenceObject $(get-content .\target-autoruns-output.txt)

To optimize your chances of success, put some thought into what you collect. If you collect too many attributes, the data becomes unwieldy for manual analysis; too few, and you risk omitting key evidence. In the example above, we looked only at MD5 hashes and examined only unique occurrences of each hash, but you can expand this logic. Starting with the persistence location is a great first step, but you should also consider adding hashes, filenames, signer information, and other attributes to enrich the comparison, as the sketch below illustrates. If you need to perform outlier analysis, you may want to include counts as well. There are plenty of third-party applications that may not be in your baseline but will be installed by your users. Let's just hope malware isn't installed broadly enough to ruin our outlier analysis!
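As an illustrative sketch of that richer comparison, assume both machines' Autoruns output was exported as CSV (for example with .\autorunsc.exe -accepteula -a * -c -h -s). The column names 'Image Path', 'MD5', and 'Signer' vary by autorunsc version, so treat them as assumptions to adjust against your own output.

# Sketch: surface entries present on the target but absent from the baseline,
# comparing on image path, MD5 hash, and signer rather than hash alone.
$baseline = Import-Csv .\baseline-autoruns.csv
$target   = Import-Csv .\target-autoruns.csv

Compare-Object -ReferenceObject $baseline -DifferenceObject $target -Property 'Image Path', 'MD5', 'Signer' |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Sort-Object 'Image Path' -Unique |
    Format-Table 'Image Path', 'MD5', 'Signer' -AutoSize
# A SideIndicator of '=>' means the entry appears only on the target host.

Grouping the same output by hash and counting occurrences across many hosts is one simple way to fold in the outlier analysis mentioned above.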

 

Baselining with Endgame

Manually constructing that gold image can take significant time, especially at enterprise scale and when business units run different environments. Using Endgame, you can create a baseline by investigating a clean image. As our previous posts and videos have shown, an Endgame investigation is an aggregation of hunts or surveys, including persistence, process, network, application, user, and other surveys. With this baseline investigation in hand, we can survey production workstations and compare.

Rather than using our UI, the referenced video shows how to do all of this with our RESTful API in a few simple steps. As we hunt with Endgame, each investigation is given a unique UUID. Using the UUID of the baseline investigation and the UUID of the target system's investigation, we can compare all of the tasks and collections executed through the Endgame platform. All of our collection data is stored as JSON, which makes these comparisons simple; for instance, you can look for those pesky persistent files that were not in the baseline. Any differences you find may indicate something malicious.
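The exact schema of an Endgame collection isn't reproduced here, but the comparison itself is simple once the JSON is in hand. Below is a hedged sketch that assumes each investigation's persistence collection has been exported to a JSON file whose items carry sha256 and path fields; the file names and field names are assumptions for illustration only, not the documented API.

# Sketch: flag persistent items seen in the target investigation that never
# appeared in the baseline investigation (field names are assumed, not documented).
$baselineItems = Get-Content .\baseline-persistence.json -Raw | ConvertFrom-Json
$targetItems   = Get-Content .\target-persistence.json -Raw | ConvertFrom-Json

$baselineHashes = $baselineItems | ForEach-Object { $_.sha256 } | Sort-Object -Unique

$targetItems |
    Where-Object { $baselineHashes -notcontains $_.sha256 } |
    Format-Table sha256, path -AutoSize

Anything that surfaces here is simply a persistent item the baseline has never seen, which is exactly the kind of lead worth pulling on.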

 

 

Conclusion

A solid baseline is necessary for executing hunts based on finding deviations from an established norm. This approach can quickly help identify key areas that look suspicious and reduce false positives in relatively homogeneous environments. Unfortunately, most of us don't have the luxury of obtaining a snapshot of an enterprise environment, so we need a shortcut for creating this baseline quickly and precisely so we can get to the more interesting aspects of the hunt. Endgame's Investigation feature provides this shortcut, allowing you to intuitively compare the baseline investigation against an investigation containing the new tasks and collections, and providing a quick means of exploring any differences. For us, the API becomes an analytic haven, helping us structure that baseline and quickly leading us to potentially malicious activity.

Inside Endgame's $18.8 Million Deal with US Air Force (Video)


Watch Endgame CEO Nate Fick on Bloomberg discussing the company's deal with the U.S. Air Force, the largest endpoint detection and response (EDR) deal of the year.

 

 
