Channel: Endgame's Blog

Stopping FIN7: Endgame's Full Stack Protection Against Fileless Attacks


Financially motivated cyber attacks occur on a daily basis, often via ransomware, but also through direct and aggressive targeting of organizations both in and out of the financial sector. Attackers delivering ransomware can and do make significant sums of money doing so - after all, that’s why they do it - but targeting specific institutions directly has also proven extremely profitable. For instance, last year’s heist from the Bangladesh Central Bank brought in $81 million. In a similar fashion, criminal groups have been targeting a range of organizations across numerous industries, with estimates of the impact of cyber crime on the global economy ranging from $120 billion to $454 billion in 2017.

One of the most prolific criminal groups is FIN7, who has targeted a range of enterprises, including restaurants, hotels and banks, and is closely associated with the Carbanak gang, responsible for attacks on financial firms since at least 2013. Unlike many other groups, FIN7 employs a range of financial attack vectors, compromising companies through ATMs, point of sale systems, fileless attacks, and spear phishing. FIN7 has proven to be one of the most successful cyber criminal groups to date. We’ll describe the unique characteristics of this group, their attack vectors, and then detail how the Endgame platform stops multi-faceted attack vectors, such as those used by FIN7, through our layered prevention and detection capabilities. 

 

Who is FIN7?

During the past year, information pertaining to FIN7 has slowly surfaced, with a focus on their adoption of fileless attacks and effective spear phishing tactics. FIN7 emerged as a major cyber criminal group in 2017, making headlines thanks to several successful attacks against prominent organizations, as well as their unique integration of a range of fileless attack vectors and targeted spear phishing campaigns. In March, a FIN7 fileless attack campaign targeted government and financial institutions involved in Securities and Exchange Commission filings. FIN7 then turned their sights to the restaurant industry, targeting many restaurant chains, including Chipotle, Baja Fresh, and Ruby Tuesday.

The FIN7 label is often used interchangeably with Carbanak, a well-known family of malware associated with FIN7. The connection within and between these threat actors remains up for debate. While Carbanak malware has been associated with over a billion dollars in financial compromises, it remains unclear whether this represents the actions of a single group, let alone those of FIN7 as well. In an era of proliferating malware availability, it is not certain that everyone who deploys Carbanak belongs to the same operation. FIN7 has used a modified version of Carbanak malware in recent operations, so there is valid reason for the confusion. Given the technical similarities of the malware and approaches, as well as the shared financial motivations, these groups are likely either the same or closely associated. The uncertainty surrounding their association is likely a deliberate strategy to help disguise the breadth and extent of their global operations.

 

FIN7 in Action

FIN7 relies first on social engineering to compromise its victims, manipulating them with a false sense of urgency to either download the malware or self-infect - often through malicious, highly customized Word documents (in some cases macro enabled). While spear phishing is nothing new, what is noteworthy about these attacks is the thoughtful implementation of evasion techniques to avoid both signature-based and behavioral detections. The user opens a macro-enabled Word document, which uses PowerShell to perform a series of DNS queries that are used to construct a memory-resident payload (e.g. Meterpreter). Unlike other kinds of attacks that might download a malicious file to disk, these actions occur completely in memory. By working entirely in memory and keeping executables off the filesystem, attackers are often able to evade detection due to the reliance many security products have on artifacts on disk.  The attackers continue to use techniques which often evade defenses by living off the land through the use of PowerShell and other native scripting languages throughout the operation.
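The memory-resident staging described above can be illustrated with a toy sketch. In the Python below, the domain names, chunk contents, and `resolve_txt` helper are all invented for illustration (and the "payload" is harmless text, not FIN7's actual code); it shows only the general idea of reassembling data smuggled in DNS-style responses entirely in memory:

```python
import base64

# Simulated DNS TXT responses; a real attack would issue live queries.
simulated_txt_records = {
    "chunk0.example.test": "aGVsbG8g",   # base64 fragments of the payload
    "chunk1.example.test": "d29ybGQ=",
}

def resolve_txt(name):
    """Stand-in for a DNS TXT lookup."""
    return simulated_txt_records[name]

def reassemble(chunk_count, domain="example.test"):
    # Fetch each chunk in order and concatenate, keeping everything in
    # memory; nothing is ever written to disk.
    encoded = "".join(resolve_txt(f"chunk{i}.{domain}") for i in range(chunk_count))
    return base64.b64decode(encoded)

payload = reassemble(2)
print(payload)  # b'hello world'
```

In the real attack chain, equivalent logic runs in PowerShell against live DNS infrastructure; the burst of lookups such a script generates is one of the few observables it leaves behind.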

 

Endgame’s Layered Protections

Endgame’s platform features signatureless detections at each phase of the attack cycle, and is exceptional in its ability to stop fileless attacks and other advanced evasion techniques. Fileless attack protection is just one layer of the Endgame platform, which allows customers to stop FIN7 in its tracks at several steps, before damage and loss. If FIN7 changes one technique and manages to bypass a given layer, the next will be there waiting. Below, we lay out the various ways in which Endgame is able to stop this advanced attack, and those using similar evasion techniques.

 

State of the Art Prevention

To gain initial access to victims, FIN7 uses targeted spear phishing, tricking users into opening a customized attachment, such as a Microsoft Word document, which loads malicious macros in the process. Endgame’s dynamic binary instrumentation technology detects the malicious macro content, preventing it before execution. If macro prevention is disabled on a system protected by Endgame, we can see what happens next. As noted, FIN7 lives off the land in its operations, hiding in the noise by using legitimate applications to evade most security products. These activities can be easily seen and responded to using the Endgame platform. During their operation, FIN7 attempts process injection to gain in-memory execution. As we have previously detailed in depth, process injection techniques are often employed to help attackers maintain stealth by lurking within a legitimate process’s memory. Endgame’s in-memory protection technologies prevent this entirely, stopping the attacker before any damage and loss can take place and sending a clear signal to the Endgame customer that an attack has been stopped.

 

Premier Detection & Response

Sophisticated attackers are constantly modifying their tools and tradecraft to evade defenses. Defenders should therefore assume they are breached, even with the best preventative technologies in place. Flexible and intuitive detection and response capabilities are of paramount importance, as they enable defenders to triage active attacks and pivot through rich, comprehensive data to determine the extent of a breach and terminate all unwanted activity.

Endgame complements its preventative layers with powerful detection and response capabilities that give the operator complete access to endpoint and network data. The power of that data is unlocked through Endgame’s Artemis, an AI-powered security chatbot built upon natural language understanding technology. It assists analysts of all skill levels in quickly responding to compromise. Artemis expedites and facilitates search and discovery, a key part of the workflow that helps analysts cut through the noise within the immense amount of security-relevant data generated on endpoints. From supporting quick exploration of event logs to searching across systems to determine the extent of a breach, Artemis expedites the analyst workflow by quickly guiding them to the events and data that matter most for decision-making.

Endgame’s Artemis also delivers simple and straightforward information on process lineage for suspicious processes. This gives the analyst a concise temporal overview of what occurred, where, and when. In addition to providing the analyst with complete information on actions taken by that process, it enables root-cause understanding. As the video below demonstrates, pivoting from the FIN7 macro detection makes it possible to view an entire series of suspicious events using Artemis. By asking for the process lineage of the Microsoft Office application, we discern that malicious PowerShell and VBScript were executed and suspicious network traffic was generated.

 

Hunting with Tradecraft Analytics

What if the breach occurred before Endgame was installed? Endgame’s powerful hunt analytics can immediately pinpoint outliers and suspicious artifacts on endpoint systems. These analytics are designed to immediately surface malicious artifacts with low noise in most environments. Suspicious running processes, network connections, and persistent software can be easily identified with Endgame. In addition, Endgame opens up memory to hunters, enabling users to find all in-memory adversaries, including FIN7. Unlike other security products, Endgame’s memory analysis scales, allowing the analyst to examine all processes for entrenched fileless adversaries across tens of thousands of endpoints in minutes. Hunting in memory and hunting for persistence are just two of the ways Endgame’s hunting capabilities easily detect infections by a range of the most advanced groups, such as NetTraveler, Roaming Tiger, Fancy Bear, and Cozy Bear.

Specifically for FIN7, Endgame is able to detect the in-memory component of FIN7's attack as well as discover FIN7’s persistence methods of creating scheduled tasks and WMI objects. The Endgame hunt capability, combined with our tradecraft analytics, is essential to determining breach status in a fraction of the time.

The above hunt, detection, and triage capabilities provide the responder with all the necessary information to respond to the attack and execute their incident response procedures, with Endgame providing thread-level remediation capabilities on even the most critical systems.

 

See Endgame in Action

The video below walks through each of these layers in the Endgame product - prevention, detection and response, and hunt - to demonstrate not only the power of, but also the necessity of, a layered approach in light of increasingly creative and evasive adversarial techniques.

 

Conclusion

FIN7 is arguably the most sophisticated financially motivated group, but they certainly aren’t the only group combining a range of advanced attack vectors to compromise organizations. These include not only frequently employed techniques such as spear phishing and malicious macros, but also sophisticated in-memory stealth, persistence, and living-off-the-land strategies. With the odds remaining in favor of targeted attackers compromising a network, a layered approach is absolutely essential to prevent, detect, and respond to the wide range of attack vectors as early as possible. If you’ll be at Black Hat this week, stop by our booth 1360 to see the demo live, and attend any of our talks and workshops at BSidesLV, Black Hat, and DEF CON.


New Open Source Repositories for Data Scientists in Infosec


Over the past few years, we have published numerous posts on the benefits and challenges of machine learning in infosec in an effort to help fellow practitioners and customers separate hype from reality. We also believe contributing to the larger open source community is an essential component of this outreach. In conjunction with Black Hat, DEF CON, and BSidesLV, we have released two GitHub repositories, each a playground for data scientists in information security.

 

gym-malware: An OpenAI Gym for Malware Manipulation

First, last week our research team released gym-malware, an open source OpenAI gym for manipulating Windows PE binaries to evade next-gen AV models. The “gym” allows data scientists in information security to simulate realistic black-box evasion attacks against their own machine learning model by training a reinforcement learning agent to compete against it. In contrast to other approaches for attacking machine learning models, this approach is agnostic to the architecture of the model under attack and requires only API access to it. The reinforcement learning agent can probe the model to retrieve a malicious or benign label for any query. By learning through tens of thousands of competitive rounds, the agent can begin, with modest success, to produce functional malware that evades the model under attack.

Data scientists may use and modify this framework to answer questions such as:

  1. How sensitive is my model to evasion attacks for ransomware (or other category)?
  2. What mutations tend to evade my model the most?
  3. How can I create a killer reinforcement learning agent to bypass my model?

The repository contains a toy machine learning malware model and some preliminary agents (but bring your own malware!) that data scientists can use as a starting point to improve and optimize.
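As a minimal illustration of the black-box setting gym-malware targets (this is not the repository's API; the model, mutation action, and loop below are toy stand-ins), an agent that may only query labels can still search for an evasive variant:

```python
import random

# Toy black-box "model": flags any sample whose fraction of 0x90 bytes
# exceeds a threshold. The attacker can only call it, not inspect it.
def toy_model(sample: bytes) -> str:
    return "malicious" if sample.count(0x90) / len(sample) > 0.5 else "benign"

def mutate(sample: bytes, rng) -> bytes:
    # One action from a tiny action space: append benign-looking padding.
    return sample + bytes(rng.randrange(1, 8))

def evade(sample: bytes, max_steps=100, seed=0):
    """Query-only evasion loop: mutate until the black-box label flips."""
    rng = random.Random(seed)
    for step in range(max_steps):
        if toy_model(sample) == "benign":
            return sample, step
        sample = mutate(sample, rng)
    return sample, max_steps

evaded, steps = evade(bytes([0x90]) * 32)
print(toy_model(evaded))  # "benign" once padding dilutes the signature
```

The real gym replaces the random mutation policy with a reinforcement learning agent and the toy rule with an actual PE malware classifier, but the query-only contract is the same.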

 

You Are Special, But Your Model Probably Isn’t

On a lighter note, at BSidesLV I presented “Your model isn’t that special: zero to malware model in not much code, and where the real work lies”.  A GitHub repo accompanies this talk, and contains a series of Jupyter notebooks that demonstrate building deep neural networks for Windows PE malware classification. The playground includes code (bring your own data!) for creating:

  1. A multilayer perceptron using hand crafted features (feature extraction code included);
  2. An end-to-end convolutional deep learning network for malware detection;
  3. A slightly silly re-work of ResNet for malware that I’ve named MalwaResNet, for even deeper end-to-end convolutional deep learning for malware detection.

The talk and the notebooks aim to demonstrate that one cannot always simply port sophisticated deep learning models from computer vision domains and expect them to work immediately for malware classification. Architectures developed to identify cats in images may not be optimally designed for finding malicious content in raw bytes. Deep learning models do require work, and training them can be a challenge. These notebooks point practitioners in the right direction, but also highlight some of the shortcomings through the toy demonstration. For example, in the notebooks, intended for consumption on a modest computer, models are trained on far too little data, for too few epochs, with non-optimized optimization parameters. In fact, the simple multilayer perceptron with hand-crafted features and careful attention to the data (bring your own!) can actually produce a decent Windows PE malware machine learning model. Each of the deep learning model architectures in the repository can be instructive to those interested in getting started with feature-based and end-to-end deep learning models in infosec.
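As a taste of the feature-based approach, here is a minimal, pure-Python sketch of a multilayer perceptron over hand-crafted features. The features and weights are made up for illustration; the notebooks train real models on real data:

```python
import math

def extract_features(data: bytes):
    # Two toy hand-crafted features: byte entropy and fraction of high bytes.
    counts = [data.count(b) for b in range(256)]
    n = len(data)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    high_fraction = sum(1 for b in data if b >= 0x80) / n
    return [entropy / 8.0, high_fraction]   # scale entropy into [0, 1]

def mlp_score(features, w_hidden, w_out):
    # One hidden layer with ReLU, sigmoid output interpreted as P(malicious).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features))) for row in w_hidden]
    z = sum(w * h for w, h in zip(w_out, hidden))
    return 1 / (1 + math.exp(-z))

# Made-up weights; a trained model would learn these from labeled samples.
w_hidden = [[2.0, 1.0], [0.5, 3.0]]
w_out = [1.5, 1.0]

score = mlp_score(extract_features(bytes(range(256))), w_hidden, w_out)
print(f"P(malicious) = {score:.3f}")
```

The end-to-end architectures in the repository skip the `extract_features` step entirely and learn representations from raw bytes, which is precisely where the "bring your own data" caveat bites hardest.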

 

A Deeper Look at Machine Learning in Infosec

Machine learning has become an important tool in security for detecting and preventing unknown threats, in large part because of its ability to generalize.  However, all machine learning models have blind spots that present an attack surface for motivated and sophisticated adversaries.  These open source packages help demystify machine learning for malware, and allow others in security to understand, attack, and harden their own machine learning models.  Especially in security, a rising tide lifts all boats. At Endgame, we continuously work to improve our models for malware and other threat detection and prevention, and share our insights and lessons learned to support others in the community.  

Bot Talk Pretty One Day


Conversational interfaces have improved customer interactions across a wide range of industries and use cases, providing interactive and intuitive experiences. That experience, however, is diminished if the core model of the conversational interface is poorly implemented. While many chatbots have a wealth of data on which to train, many have not been trained in the wild. This limits their ability to target customers’ pain points, and at worst, the bot can become infuriating (think Clippy). In other words, garbage in, garbage out, garbage UX. Unlike, for example, the ATIS dataset for the airline industry, there currently are not any open infosec training datasets. The absence of training datasets was one challenge we encountered in developing Artemis, our machine learning (ML)-powered security chatbot.

Artemis is designed to elevate and enhance the capabilities of analysts across all levels of expertise. Given natural language queries, Artemis can perform and automate complex analyses, data collection, investigation, and response to threats. But how does this bot’s ML engine work? Artemis extracts information from natural language queries and converts it into an action, typically an API call to the core Endgame platform. This requires an optimized training set that accurately captures a generalized set of expected and unexpected user input. I’ll discuss the fundamental steps our team took to train Artemis, including the process of collecting training data, as well as our tool for validating the language models, BotInspector. Going forward, this foundation will be essential to meet evolving customer needs and use cases through a regular cadence of Artemis updates.

 

Natural Language Understanding

Before discussing the data, it is useful to provide a brief refresher on two of the key components of natural language understanding (NLU). As we discussed in a previous post, entities and the intent are core components of the user input (the utterance) passed into our data pipeline. The entities in an utterance may include IP addresses, endpoints, usernames, and filenames, all of which the platform can perform some action with or on. The intent is, of course, what we would like to do. Do we want to search for processes or process lineage? Search DNS or netflow data? Sentence structures often vary depending on the intent, so entity extraction and intent classification go hand in hand.
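As a concrete, if simplistic, illustration of these two components (Artemis's actual engine is statistical; the patterns and keyword lists below are invented), a rule-based parser makes the entity/intent split tangible:

```python
import re

# Toy entity patterns and intent keywords for illustration only.
ENTITY_PATTERNS = {
    "ip_address": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
    "filename": r"\b[\w-]+\.(?:exe|dll|ps1|doc)\b",
}

INTENT_KEYWORDS = {
    "search_process": ("search process", "find process"),
    "process_lineage": ("lineage", "parent process"),
}

def parse(utterance: str):
    # Entity extraction: pull out every span matching a known entity type.
    entities = {name: re.findall(pat, utterance) for name, pat in ENTITY_PATTERNS.items()}
    # Intent classification: first keyword set that appears in the utterance.
    intent = next((i for i, kws in INTENT_KEYWORDS.items()
                   if any(k in utterance.lower() for k in kws)), "unknown")
    return intent, {k: v for k, v in entities.items() if v}

print(parse("search process for notasuspiciousfile.exe on 10.0.0.5"))
# ('search_process', {'ip_address': ['10.0.0.5'], 'filename': ['notasuspiciousfile.exe']})
```

A statistical model replaces both the regexes and the keyword lookup, which is exactly why it needs the labeled training data discussed next.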

 

Training a Chatbot with a Chatbot

To train the entity extractor and intent classifier components of the model, we need quite a bit of data—particularly labeled data consisting of real utterances matched with their intents and entities. Sure, there are natural language libraries freely available, but we aren’t aiming for Artemis to understand all sentences and strike up a casual conversation with us. Rather than arbitrary English sentences, we need to train the model on realistic natural language queries used in infosec. This requires amassing enough data to train a chatbot to comprehend intent based on utterances full of security jargon, and translate it into a more intuitive conversational structure. Unfortunately, this data is not readily available, so we generated our own!

Initially, this data was generated using a complex script that randomized both utterance structures and fields in a manner resembling Mad Libs. That is, we created a set of templates that were auto-populated with information security jargon, each paired with its associated label (intent), to compile our dataset. While this data enabled us to train a reasonably accurate model, it was limited by the speech patterns of the author of the templates. How did we fix this without having to manually modify the generator to support more utterance structures? With another chatbot!

Designed as a chat room on an Endgame chat server, the artemis_test chatbot allows users to directly supply sample queries to the Artemis engine, which responds with its interpretation based on the current model. This allows us to employ active learning to improve the model. The bot prompts the user to correct any issues with its interpretation via a straightforward conversational interface. This process outputs labeled data that can be used to train both components of the model, as you see in the HipChat conversations below.

 

           

HipChat Output of User and Bot Interactions 

Since several samples of each utterance structure are required for the machine learning algorithms for Artemis, we wrote another generator that treats data from artemis_test as templates, but only randomizes the extracted entities such as filenames or IP addresses, rather than the entire utterance structure. This allows us to generate numerous samples for each template provided by the original data generator and artemis_test. While it is true that we must rely on the original generator for a sizable portion of the training data, artemis_test allows us to add new utterance structures and patch misinterpretations incredibly quickly without requiring a lot of tedious work. We have seen massive improvements in entity extraction due to this new implementation.
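The entity-randomizing generator can be sketched roughly as follows (the template strings, intents, and vocabulary are illustrative, not our production data): each utterance structure stays fixed while only the entity slots are refilled.

```python
import random

# Fixed utterance structures with their intent labels; in practice these
# come from the original generator and from artemis_test.
TEMPLATES = [
    ("search process for {filename}", "search_process"),
    ("show me the parent process tree for {filename}", "parent_process_tree"),
    ("search dns for {domain}", "search_dns"),
]

# Entity vocabulary used to fill the slots.
VOCAB = {
    "filename": ["evil.exe", "dropper.dll", "stage2.ps1"],
    "domain": ["bad.example.com", "c2.example.net"],
}

def generate(n, seed=0):
    """Emit n labeled samples by filling entity slots in fixed templates."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        template, intent = rng.choice(TEMPLATES)
        fills = {slot: rng.choice(values) for slot, values in VOCAB.items()}
        samples.append({"text": template.format(**fills), "intent": intent})
    return samples

for sample in generate(3):
    print(sample)
```

Because new structures arrive as templates rather than code changes, patching a misinterpretation is a matter of adding one line of data, which is what makes the artemis_test loop so fast.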

 

Tuning Our Training Data

BotInspector, a tool we developed to evaluate our results, serves two purposes.  It gives us detailed understanding of the makeup of our training data and also provides insights on the resulting models’ performance.

We analyze the composition of our training data so that we can be sure that different intents and entity types occur at frequencies that are statistically appropriate based on the complexity of utterances for each intent. For example, there are more ways to phrase the query, “I wanna search process for notasuspiciousfile.exe” than there are to ask Artemis to “cancel,” so we must make sure our training data reflects this difference. Furthermore, the intents, “search process” and “parent process tree”, are often present in queries with only subtle differences in utterance structure. Significant differences in the amount of training data for either intent could cause one intent to dominate in the classifier, resulting in misclassifications in unfamiliar utterances. BotInspector also generates frequencies of different entity types. This ensures that the model is trained equally on each entity type, as well as their presence in lists of varying lengths.
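A composition check of this kind boils down to counting labels and comparing their frequencies. A minimal sketch (the dataset and the guardrail rule below are invented for illustration):

```python
from collections import Counter

# Tiny labeled dataset standing in for real training data.
training_data = [
    {"text": "search process for a.exe", "intent": "search_process"},
    {"text": "find process b.exe", "intent": "search_process"},
    {"text": "parent process tree for c.exe", "intent": "parent_process_tree"},
    {"text": "cancel", "intent": "cancel"},
]

# Report the share of each intent in the dataset.
intent_counts = Counter(s["intent"] for s in training_data)
total = sum(intent_counts.values())
for intent, count in intent_counts.most_common():
    print(f"{intent}: {count} ({count / total:.0%})")

# One simple guardrail: complex intents should outnumber trivial ones.
assert intent_counts["search_process"] > intent_counts["cancel"]
```

The same counting extends to entity types and list lengths, which is how we verify that the model sees each entity form often enough.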

 

Model Validation

In addition to analyzing training data, BotInspector serves as a model validation tool. Given a test file containing a set of queries not present in the training data, BotInspector returns a results file containing accuracy percentages for entity extraction, intent classification, and a combination of the two. It also displays a list of misinterpreted samples, including incorrect entity extractions that are sorted by intent, and incorrect intent classifications that are categorized by whether the entity extractor also failed on the same samples. This can be useful because, as previously mentioned, intent classification and entity extraction are intertwined.
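The three accuracy figures can be computed straightforwardly from per-sample results. A sketch, using an invented record format:

```python
# Per-sample validation results; the record shape is illustrative only.
results = [
    {"intent_ok": True,  "entities_ok": True},
    {"intent_ok": True,  "entities_ok": False},
    {"intent_ok": False, "entities_ok": False},
    {"intent_ok": True,  "entities_ok": True},
]

def accuracy(rows, key):
    # True counts as 1, so the mean of the flags is the accuracy.
    return sum(r[key] for r in rows) / len(rows)

intent_acc = accuracy(results, "intent_ok")
entity_acc = accuracy(results, "entities_ok")
both_acc = sum(r["intent_ok"] and r["entities_ok"] for r in results) / len(results)
print(f"intent: {intent_acc:.0%}, entities: {entity_acc:.0%}, both: {both_acc:.0%}")
# intent: 75%, entities: 50%, both: 50%
```

Keeping the per-sample flags, rather than only the aggregates, is what lets BotInspector sort the misinterpretations by intent and by whether both components failed together.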

These results can be juxtaposed against those of previous model versions via the comparison function of BotInspector, which highlights areas of improvement and decline between model versions.

With these results, we can now complete the model building cycle. BotInspector tells us where the model goes wrong, indicating which utterance structures it fails to comprehend. We can then add or reinforce these structures by supplying them to artemis_test and using BotInspector to monitor training data quality. Although that closes the loop, the cycle never really ends: we constantly train, validate with BotInspector, and repeat.

 

Next Steps

The implementation of artemis_test and BotInspector has vastly improved our NLU training pipeline, not only allowing us to ascertain deficiencies, but also providing a means of eliminating problems and measuring the NLU engine’s improvement. The system we use is optimal for our use case. Compared to personal assistants like Siri and Alexa, Artemis supports a smaller, domain-specific set of intents and entity types. This limited domain allows Artemis to support more complex natural language queries with varied phrasing.

As users increasingly utilize Artemis to interact with their data and systems via the Endgame platform, we will use targeted customer feedback as an additional training data source. This helps us close any unanticipated gaps in Artemis’ comprehension, increase the functionality within Artemis, and enable us to continue to hone and improve the user experience.  

Tools need to empower analysts, not obstruct them. At Endgame, we are committed not only to providing the best prevention and protection in the industry, but also making our platform as easy and intuitive as possible to use. Artemis reflects this integration of ease of use with bleeding edge protections within the Endgame platform, facilitating and expediting the analytic workflow while surfacing insights for analysts across a broad range of expertise.

Milliseconds Matter: Prevention Architecture and Cloud Considerations


The rise of ransomware and other destructive attacks in the last year demonstrates that prevention is critical to stopping damage and loss in your enterprise. Attacks come in many shapes and sizes, requiring a broad range of prevention and detection capabilities. Attacks that directly lead to the destruction of IP, disclosure of sensitive information, and lateral movement are the most existential to enterprises because they represent tangible damage and loss. Attacks such as ransomware, credential theft, and pass-the-hash occur extremely quickly, and demand additional considerations when evaluating endpoint security effectiveness.

To counter this broad range of attacks, many in security are quick to highlight the role of cloud-based analytics to make security decisions. While the cloud is certainly one of the hottest buzzwords and does provide tangible value, the cloud cannot solve all problems. Early prevention is the most effective way to stop attacks. This entails a fast response, and low time-to-kill that cannot be achieved when solutions require an out-and-back trip to the cloud.

I’ll discuss how attacks are increasingly time sensitive, compare endpoint prevention versus those in the cloud, and demonstrate the limitations of cloud-based analytics for prevention.  At Endgame, our prevention capabilities focus on stopping both known and emerging threats, early in the attack cycle, and on the endpoint, without requiring the roundtrip to the cloud. Attacks happen fast, and when preventing them, milliseconds matter.

 

Time to Kill

When an attacker lands on a host they have multiple options, which are in turn determined by the motive. Frequently the motive boils down to a subset of extortion, theft, or entrenchment. By analyzing attacks driven by these motivations, we have determined that some of today’s most prominent and impactful threat vectors require prevention to react in milliseconds to be effective.

The quickness of these attacks should not be minimized. When not contained, organizations suffer greatly. To estimate just how quickly these attacks propagate, we measured the speed at which ransomware and several malware-less attacks occur. This analysis demonstrates that prevention is necessary and must be immediate. The following examples show that damage-and-loss containment isn’t an abstract concept: when organizations shut down due to attacks, the reality is clear. You need to stop attackers early.

malware propagation analysis

Ransomware

Ransomware is the most obvious of these attacks. Organizations cannot afford to have critical assets encrypted for ransom. These attacks instantly encrypt documents in common locations such as a user’s data directories, and they can be especially difficult to remediate due to increasingly strong encryption and the disabling of backup solutions. Prevention is the best means of eliminating the cost associated with this attack. For instance, our analysis of the WannaCry ransomware determined that it encrypts files on disk within 1322 milliseconds of launch. Protecting against WannaCry, and other forms of ransomware, requires prevention technologies that act instantly.

 

Credential Theft and Lateral Movement

Additionally, we measured alternative damage-and-loss tactics such as credential theft. In credential theft attacks, the attacker scans memory for domain passwords and leverages common weaknesses to steal password hashes. This attack happens in less than 50 milliseconds, creating an even smaller window for effectively stopping critical authentication compromise.

Similarly, entrenchment techniques like pass-the-hash are also time sensitive. It is well understood that once an adversary spreads across a network, evicting them becomes increasingly difficult. In our analysis, it only took 832 milliseconds to use stolen credentials to move to another endpoint. If you cannot contain this threat in less than a second, response costs escalate rapidly.

 

Cloud-based Prevention

The cloud is great. It is unmatched when considering scale, computing, storage, and redundancy, strengths that make it the obvious choice for web service providers. If you are Amazon, there is no other option. In fact, we thoughtfully utilize these technologies to provide centralized management, instant updates, and agile access to Endgame threat data and services such as Endgame Arbiter. However, while a cloud architecture is excellent for these use cases, it does not provide comprehensive endpoint protection.

Endgame does provide detections which require visibility on data from many endpoints or require the vast compute capabilities of the cloud. These capabilities are additive and generally retrospective.  That is, they will tell you about an active attack.  This is extremely important, but it’s too late to achieve prevention.  When your endpoint is under attack, a cloud-prevention architecture alone is a tough sell.

Cloud solutions that rely on streaming event data through the cloud come with two costs. The first cost is generally understood: providers such as Amazon Web Services (AWS) charge for every resource you use, such as network bandwidth, elastic storage, and compute instances.

In addition to the financial costs of maintaining a cloud architecture, there is also an efficacy cost. Because milliseconds matter, we must consider whether cloud solutions are fast enough to achieve effective prevention, and even detection, during attacks such as those involving lateral movement and credential theft. Let’s look at a typical request to a cloud service.

Delays in cloud prevention approaches

What Can Happen on the Way to the Cloud and Back

As the graphic above depicts, the time lag for the trip to the cloud and back impacts efficacy, which comes down to a few key elements. The network latency of a cloud lookup when attempting to prevent an intrusion could be many seconds, allowing an attacker to encrypt files or steal credentials. Additionally, to make relevant security decisions in real time, data filtering must occur. Given the hundreds of security relevant events available on each endpoint, multiplied by every endpoint in your enterprise, it is strikingly clear that data must be selectively filtered before sending it to the cloud due to bandwidth restrictions. Filtering by its very nature means you have less information when making a security decision. With more and more complex attacks you need more and more data. In a cloud-architecture, the round-trip time required for prevention and the speed of the attack are at odds with providing the best security.
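A back-of-the-envelope budget makes the point concrete. All latency figures below are illustrative assumptions, not measurements; only the 1322 ms and 50 ms attack windows come from our analysis above:

```python
# Attack windows measured earlier in this post.
attack_window_ms = 1322          # WannaCry begins encrypting files
credential_theft_ms = 50         # credential theft completes

# Assumed cloud-path components (illustrative figures, not measurements).
local_decision_ms = 5            # assumed on-endpoint analysis time
cloud_round_trip_ms = 80         # assumed network RTT to a cloud service
cloud_analysis_ms = 50           # assumed server-side processing
queueing_jitter_ms = 200         # assumed worst-case congestion/retry delay

cloud_total_ms = cloud_round_trip_ms + cloud_analysis_ms + queueing_jitter_ms
print(f"local: {local_decision_ms} ms, cloud: {cloud_total_ms} ms")
print(f"margin vs. WannaCry window: {attack_window_ms - cloud_total_ms} ms")
# Even under these generous assumptions, the cloud path already exceeds
# the 50 ms credential-theft window before any data filtering is counted.
```

The exact numbers will vary by deployment, but the structure of the budget does not: every component of the round trip subtracts from a window that is fixed by the attacker, not the defender.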

The final cost to consider is the “offline” capabilities when endpoints aren’t connected to the corporate network, internet, or VPN. In a mobile world where users work from anywhere, your endpoint security solution should, too. That is why an autonomous agent that is always-on and always-preventing is the optimal solution to handle the speed of today’s attacks.

 

Autonomous Endpoints

Endgame’s security architecture puts the intelligence on each endpoint giving defenders the power to react instantly to threats. This contrasts with those relying on cloud-connected analytics to make security decisions. Our autonomous agent reads security event data in real-time, as it occurs, with prevention technologies such as machine-learning, pattern recognition, and domain expertise protecting the endpoint. Because we deploy our capabilities to each endpoint this protection works 24/7/365 whether the endpoint is connected to the Internet, VPN, in the corporate office, or traveling the globe.

Our smart endpoint agent has a three-phase approach to delivering strong prevention across all deployment options. The process starts with acquiring relevant event data. On every operating system we collect telemetry on dozens of sources including, but not limited to, processes, files, network traffic, user behavior, and logs. With each telemetry source, we have access to hundreds of data points that feed into our next phase.

The analysis phase of prevention consumes these sources and applies the best available analysis capability for each attack type. Security is not a one-size-fits-all approach. That’s why our analysis engines rely on multiple inputs to structure our prevention capabilities to make real-time prevention decisions in milliseconds. For instance, MalwareScore machine learning models live on the endpoint, and require no external connections.

Once the analysis engine makes a security decision, our final phase determines the appropriate action to take against the attack. This action can include process termination, IOC collection, network termination, or, soon, endpoint isolation. Each action can be configured to happen autonomously with zero analyst interaction.

This acquire, analyze, and act process happens on each endpoint across your enterprise in real time, watching each system for attacks. Because milliseconds matter, we do not have to “phone home” during a ransomware outbreak. By strengthening the endpoint, we provide resilient and effective protection at all stages of the attack kill chain.
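The acquire, analyze, act loop can be sketched as follows. The event fields, scoring heuristics, thresholds, and action names here are illustrative assumptions, not the actual Endgame agent implementation:

```python
# Minimal sketch of an on-endpoint acquire -> analyze -> act loop.
# All fields, weights, and actions are hypothetical placeholders.

def acquire(raw_event: dict) -> dict:
    """Normalize a telemetry event (process, file, network, ...)."""
    return {"type": raw_event.get("type", "unknown"), "data": raw_event}

def analyze(event: dict) -> float:
    """Return a maliciousness score in [0, 1] from purely local analytics."""
    score = 0.0
    if event["data"].get("unsigned_binary"):
        score += 0.4
    if event["data"].get("writes_to_many_files"):
        score += 0.5
    return min(score, 1.0)

def act(event: dict, score: float, threshold: float = 0.7) -> str:
    """Choose a response action; no cloud round trip required."""
    return "terminate_process" if score >= threshold else "record_only"

event = acquire({"type": "process",
                 "unsigned_binary": True,
                 "writes_to_many_files": True})
print(act(event, analyze(event)))  # -> terminate_process
```

The key design point is that every step runs locally, so the decision latency is bounded by on-host computation rather than network conditions.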

 

Prevention in Real-time

Your endpoint agent should be smart, autonomous, and have everything it needs to prevent attacks. This is becoming more important as detect-only strategies fail against attackers who extort organizations with the threat of data destruction and IP theft. Given the high stakes involved in maintaining data security, prevention is required.

Endgame’s endpoint-first architecture ensures that we are encapsulating our security expertise into various preventions that provide fast and effective security across your enterprise. We are determined to analyze real-time data without needing cloud assistance, providing a better solution for our customers.

The cloud has revolutionized software as a service and the way content is delivered on the web. Endgame effectively uses cloud services when it makes sense, providing enhanced customer support, centralized management, hosted services, and some secondary detection capabilities. However, it is not appropriate for every security use case. When prevention is on the line, we utilize extensive capabilities on each endpoint to ensure the most comprehensive security effectiveness across your organization. Thanks to our expertise in vulnerabilities and attacker techniques, we know  which technologies are (and are not) right for the job. By taking an endpoint-first approach, the Endgame platform provides innovative and expedited prevention capabilities, protecting enterprise data from today’s attackers.

Data Visualization for Machine Learning


Building a machine learning model for a data product is a difficult task involving many steps, from data collection and management all the way to integration and presentation of results. One of the most important steps in the process is validating that your model is providing useful predictions in the way you intended. Normally, model validation occurs through onerous data munging and analysis and can take a significant amount of time. Instead, simple data visualizations can greatly expedite the validation process while also helping bridge the gap between data scientists, engineers, and product managers.

At Endgame, we needed a faster and easier way to receive feedback on our malware model, MalwareScoreTM, so I created an internal tool called Bit Inspector. The tool provides this feedback in the form of data visualizations, and was originally meant only for the developers of MalwareScoreTM. Since its original implementation, the audience has expanded to include subject matter experts, internal red teamers, and managers. Building upon my talk last month at BSidesLV (video and slides available), I will first discuss the value of data visualization for machine learning, and then dive into the technical details of Bit Inspector. Data visualization is an under-utilized tool for validating research and development efforts, while also providing great explanatory power across teams to help translate the intricacies of machine learning models.

 

The Value of Data Visualization

Data visualizations have different levels of explainability, trustworthiness, and beauty. Each visualization has a limited time and resource budget, so sometimes one of these aspects won’t be prioritized. As the audiences for your visualizations change, the relative importance of spending time on each quality will also change. The categories aren’t rigorous, but they do provide some rough guidelines I follow when crafting data visualizations.

Let’s first define each category. Explainability refers to the ability of the visualization to stand on its own, requiring little or no explanation to the viewer. There are several obvious factors that can enhance a visualization including labels, units of measurement, and a title. But beyond that, annotations and explanations can go a long way towards making a visualization truly understandable. This Earth temperature timeline done by XKCD is an excellent example of annotations adding to the viewer’s understanding. Relatively short timeframes such as years or decades are easy for us to understand, but when we increase the scope to many thousands or millions of years our ability to conceptualize it breaks down. By annotating human accomplishments along the y-axis, the reader is better able to understand and appreciate these human events.
 

A Timeline of Earth's Average Temperature  

Another way to improve a visualization is to make it more professional looking in order to increase its trustworthiness. Credibility can be gained by citing sources of the data, explaining methods, and citing the author’s expertise. Media companies will also attempt to maintain a strong brand that audiences trust and then communicate that brand through consistent styling in their data visualizations. FiveThirtyEight and The Economist are both good examples of this.

Beauty is, of course, in the eye of the beholder and so I’m not going to try and define it here. Needless to say, a visualization must be more than eye candy, but those that are visually striking are going to leave a lasting impression when information is conveyed well. Below are internal and external examples of what I believe are beautiful visualizations.

 

A Timeline of Ransomware Attacks

 

 

A Comparison of Rappers’ Vocabularies

 

What is Bit Inspector?

At Endgame, I wanted to explore how data visualization could help during the R&D process itself to support model validation. The early versions of Bit Inspector were really just plots I made to convince myself that a proof of concept malware model was actually accomplishing something and that it was worth further engineering effort. Model metrics such as area under a ROC curve and false positive rates were really important at this point to tell me if the model was effective at discriminating between benign and malicious software. At this level, Bit Inspector was effective even without spending time improving the data visualizations.

Over time, Bit Inspector has evolved into a Flask app that makes visualizations built with D3.js, matplotlib, and seaborn internally available. There are two main routes in Bit Inspector: viewing information related to a single PE file (a sample) and viewing information related to one version of the MalwareScoreTM model. On sample pages, contextual information is listed and visualized, including model features, section information, and links to outside resources like VirusTotal. A history of MalwareScoreTM predictions for the sample over time is also visualized, allowing the viewer to quickly diagnose whether the model is improving or worsening on that sample.
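The logic behind the per-sample prediction history might look something like the sketch below. The function name, score format, and thresholds are illustrative assumptions, not Bit Inspector’s actual code:

```python
# Hypothetical sketch of per-sample score-history analysis: given
# MalwareScore-style predictions from successive model versions,
# report whether the model is trending better or worse on a
# known-malicious sample.

def score_trend(history, label_is_malicious=True):
    """history: list of (model_version, score in [0, 1]), oldest first."""
    if len(history) < 2:
        return "insufficient data"
    delta = history[-1][1] - history[0][1]
    if abs(delta) < 0.01:
        return "stable"
    # For malware, higher scores over time mean the model is improving.
    improving = delta > 0 if label_is_malicious else delta < 0
    return "improving" if improving else "regressing"

history = [("v1.0", 0.62), ("v1.1", 0.74), ("v1.2", 0.91)]
print(score_trend(history))  # -> improving
```

Plotting the same history per sample lets a viewer make this judgment at a glance rather than computing it.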


Screenshot of one sample’s page on Bit Inspector

 

The model page displays model-specific metrics like false positive rates, false negative rates, and ROC curve plots. Plots also display the confusion matrix and the distribution of scores for various subsets of our training data. By tracking problem areas after they’ve been addressed, the MalwareScoreTM team can quickly verify after training each model whether problems remain or have been fixed. The other model metrics help us decide if a model is an improvement over the last iteration.
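The metrics on the model page reduce to simple counts over labeled scores. Here is an illustrative computation (not Bit Inspector’s actual code) of the confusion matrix and the false positive and false negative rates from labels and thresholded scores:

```python
# Compute a confusion matrix, FPR, and FNR from binary labels
# (1 = malicious) and model scores in [0, 1].

def confusion(labels, scores, threshold=0.5):
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = s >= threshold
        if y and pred:
            tp += 1
        elif y and not pred:
            fn += 1
        elif not y and pred:
            fp += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # benign called malicious
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # malware called benign
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn, "fpr": fpr, "fnr": fnr}

m = confusion([1, 1, 0, 0, 1, 0], [0.9, 0.4, 0.2, 0.6, 0.8, 0.1])
print(m["tp"], m["fp"], m["fn"], m["tn"])  # -> 2 1 1 2
```

Sweeping the threshold over these same counts is exactly what produces the ROC curve plotted on the model page.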

Screenshot of one model’s page on Bit Inspector

 

Data Visualization to Inform R&D

As MalwareScoreTM increasingly proved effective, more people became interested in it and in Bit Inspector, including other Endgame data scientists and malware researchers. Soliciting feedback on problem areas and generally red teaming the model were most important at this point. To that end, it was necessary to add context: data sources and model parameters for the data scientists; hashes, PE header information, and links to VirusTotal for the malware researchers. This required a focus on explainability over trustworthiness to quickly convey information. Because of this, Bit Inspector became easier to understand for the majority of users, but there was still work to do on increasing user confidence and simplifying the presentation.

Eventually, the audience of Bit Inspector expanded to managers and executives. This audience was interested not only in the performance of the models, but also the progress in training new models and fixing problem areas. The explainability of the plots was tested at this level and Bit Inspector would benefit from additional focus and design to make it understandable to the widest audience possible.

 

Looking Ahead

Fortunately, there are great resources for data visualization, including the new Facets tool for exploratory data analysis. If the BSidesLV Ground Truth track on data and visualizations is any indicator, infosec is beginning to embrace the value of data visualization. This is a valuable trend that will help increase data literacy across the industry. I thoroughly enjoyed all the talks and I hope that I sparked more interest and ideas in data visualization for other researchers.

 

Kicking off the Endgame Threat Hunting Workshop Series


Last night, we kicked off our first Threat Hunting Workshop Series in the Endgame Arlington office. Guided by Endgame and Capital One practitioners, hunters and incident responders from the government and the commercial sector (with industries ranging from telecommunications to legal to entertainment) engaged in a hands-on, collaborative and informative discussion about proactive threat detection. We covered a range of use cases, as well as tips for justifying and evaluating your team and involving leadership, and provided additional networking time for follow-on questions and conversation. This is just the first of these workshops, and we hope to continue the discussion at future events both in the DC area and across the United States.

We have previously written about open source hunting tools and techniques, including hunting in memory, hunting on the cheap and hunting for persistence. We have also presented our research at industry conferences such as Black Hat, DEF CON, DerbyCon, the SANS Threat Hunting & IR Summit, and SANS DFIR. In these cases, the information flow tends to be unidirectional, with readers and attendees frequently requesting additional time and outlets for more information and additional interactions. Based on these experiences, as well as input from customers and those in the community, it was time to integrate this expertise and lessons learned into an interactive workshop.

 

Endgame Threat Hunting Workshop

 

Endgame researchers Paul Ewing and Devon Kerr, both experienced in threat detection and incident response, were joined by threat hunting expert Roberto Rodriguez from Capital One, to lead attendees through three hunt use cases: hunting for persistence, lateral movement, and credential theft. Using a range of open source tools and techniques, participants learned the details of each of these adversary techniques, the evidence necessary to see them, analysis techniques to generate high fidelity detections, and how each of these fit into the broader MITRE ATT&CK matrix. Importantly, MITRE’s ATT&CK matrix helped move the conversation from very specific, tactical questions - such as those pertaining to specific devices or operating systems - to more broadly thinking like an attacker. Knowing how attackers operate at each phase of the attack lifecycle and understanding how they might adapt to an environment is essential for teams proactively looking for threats.

 


 

Clearly, successfully identifying these malicious techniques is a core component of the process. However, technical insights alone don’t make the successful hunter; there are often organizational constraints that can make or break this capability. Our trio of facilitators armed participants with resources and quantifiable models to help justify the value of the hunt within an organization. A key challenge hunt teams encounter is that the analyst may come up empty-handed, which itself provides measurable but often overlooked value. The importance of assessing or measuring the value of your program cannot be overstated, and is a topic we’ll cover more deeply in a subsequent technical blog, especially when considering the challenges surrounding scalability.

 


 

We had a great time sharing our experiences and insights, and helping to grow the threat hunting community. This was just the beginning, and we look forward to building community and exchanging insights with other threat hunters across the United States. Future workshops will continue last night’s discussion, and expand across additional use cases, analyses, industries, and geographies. To request a workshop in your area, please complete this form.

 

Transparency in Third-Party Testing


Before making a major purchase, chances are you shop around, compare products with a critical eye, and rely heavily on the experiences and opinions of people you trust to inform your buying decision. For example, before purchasing a car, customers often turn to friends and third-party reviews, such as Consumer Reports, Kelley Blue Book, or the wisdom of crowds captured in sites such as Yelp and Angie’s List. These sources can validate claims made by the automobile manufacturer or salesman, and are essential to help inform consumers and provide transparency.

But what happens if we’re talking about computer security instead of cars?  A consumer seeking a third-party review might feel somewhat bewildered for a few key reasons.  First, some vendors may not participate in public third-party testing.  In addition, the methodologies employed by testing companies may not be fully transparent to customers or even agreed upon by those vendors whose products are being evaluated.  More transparency is required to incentivize both vendors and testers to adopt uniform and mutually understood standards for third-party evaluations, as occurs in many other industries. The expanded participation in third-party assessments by vendors, and an improvement of third-party tests, can best be achieved by fully adopting testing standards, such as those drafted by the Anti-Malware Testing Standards Organization (AMTSO), of which Endgame is a participating member.  These assessments are a key step toward providing better transparency across the industry, and arming consumers with the required knowledge to make more informed decisions.

 

Moving Beyond Self-Evaluations

At Endgame, we maniacally and rigorously test our platform internally in environments that we have set up to mimic conditions experienced by our customers, including nascent threats.  We exercise our detection engines, probe our protection mechanisms, even proactively and adversarially probe our machine learning models for blind spots.  As such, we believe our product provides superior protection to our customers.  However, moving beyond self-evaluations is critical for at least three reasons.

First, we’re not naïve to the fact that---subtly, but importantly---our product has been built to protect against threats that fit our understanding of the contemporary threat landscape.  We have designed our product to generalize to evolving threats using behavioral-based and signatureless detections. Layered protections minimize the impact of a miss at any single point in the attack chain.  As such, we’re confident in our ability to protect customers.  It’s probably safe to assume that most mature vendors believe that their product is superior to competitors based on self-critical internal evaluations and subsequent improvements.  We certainly do.  But dataset bias is real, and by nature invisible.  So, it behooves the rigorous and objectively-minded to submit to third-party testing, even when the testing results are never made public.  It explicitly benefits the vendor by identifying any hidden weaknesses and strengths.  

Related is the fact that internal testing methodologies create a conflict of interest with inherent bias, either innocently or purposefully, because the vendor benefits from self-assigned high grades. Vendors optimize to their success metrics. Methodological massaging to boost metrics is a direct consequence of financial incentives to broadcast the metrics’ supremacy.   Of course, anyone can score well on a test they write for themselves.  As amateur statistician Homer Simpson rightly noted, “people can come up with statistics to prove anything.” This notion of juking the stats was popularized on The Wire, but reflects a serious challenge when defining testing methodology and evaluation criteria. Thus, integrity compels an honest vendor to participate in independent third-party tests.

Thirdly, publishing test results is critical to the consumer.  While some customers have the resources to conduct their own extremely rigorous product evaluations, many must rely on third-party tests as an objective confirmation that products protect as advertised.  For vendors who do not participate in public tests, the consumer is left to wonder whether private test results (if any) were unsatisfactory or would otherwise be embarrassing to the vendor if published.  

To summarize, vendors should participate in public third-party tests with motivations of objectivity, integrity, credibility, in addition to the obvious value-add as a critical comparative tool for customers.  Just like the attackers, these tests also must evolve to evaluate against the broader range of attacker techniques in addition to malware, such as fileless attacks. Customers can and should demand this.  Endgame is proud to participate in public third-party testing, including AV Comparatives and SE Labs, and we look forward to sharing results of ongoing third-party public testing in the near future.

 

But Who is Testing the Test?

Just as imperative as the need for vendors to participate in independent third-party testing, the security community must also ensure credibility in how tests are performed. From a customer's perspective, there are several elements of testing in infosec, especially related to potential conflicts of interest, that should at least raise an eyebrow.  To paraphrase Dennis Batchelder, current president of AMTSO, at the most recent AMTSO meeting, in what other industry do you find the following scenarios?

  • A testing organization that is evaluating a product may get paid by the vendor (or by a competitor) that is authorizing the test.

  • A testing organization may test an unwilling vendor’s product for a use case or in an environment not intended by the vendor.

  • A testing organization may receive significant input (e.g., malware samples) from the vendor being tested.

  • A vendor may monitor or influence the outcome of the test while it is being performed, even changing the behavior of the product while it is being evaluated.

  • A vendor may pay the testing organization for additional privileges or upgrades to the basic test.

Let’s be clear, here.  Nonprofit testing is important, but it is not the only option.  Testing organizations are expending significant effort to develop realistic testing methodologies, which requires resources.  At Endgame, we pay testers to test our product.  Furthermore, the methodology by some testers may not be perfectly aligned with Endgame’s protection strategy, and we appreciate testers who seek our input into how to exercise our product’s functionality. Customers must be able to trust that money isn’t driving the outcomes of the test, and that the relationship of the vendor to tester is indeed independent rather than influential.  In both cases---payment and methodology refinement---the best way to ensure that these relationships don’t undermine customers’ interests is for both parties to agree to the highest standards of transparency.  

Specifically for this reason, in May AMTSO adopted a draft standards document that outlines appropriate protocol for testers and vendors when evaluating anti-malware solutions.  Like many technical standards documents (e.g., ISO 27001), the standards do not mandate specific implementation details. Rather, the document establishes a foundational protocol that promotes tests that are impartial, transparent, and notably address the inherent conflicts of interest present in our industry. Without this basic foundation that outlines protocols for fairness and transparency, more detailed standards aimed to improve testing quality are a proverbial house upon the sand. When testing methodology is transparent and becomes the subject of public scrutiny, testing quality is bound to follow.

The draft status of the standards is intended to help AMTSO, testers, and vendors work through the implementation details in providing transparency and rules of engagement for vendors and testers.  Through this process, AMTSO will revise the draft as needed.  Once adopted, AMTSO (comprised of a consortium of vendors and testers and interested parties) then becomes an organization that, indeed, can implement testing standards, which have notably been absent in its decade-long history.

Endgame is proud to support AMTSO in the organization’s effort to set standards of transparency, fairness and impartiality in third-party validation of security products.  I am hopeful that after any necessary minor revisions revealed during the exercise of the draft, AMTSO member organizations will come to a consensus and fully adopt the testing protocol standards document.  Similarly, Endgame is committed to additional tests that extend beyond malware to cover a broad range of attacker techniques. Customers deserve it.

 

Moving Forward with Openness

The call to vendors and testers for openness and transparency is an invitation to somewhat unfamiliar territory in infosec.  As Nate Fick, our CEO, sharply called out in a 2017 New America conference keynote: “Security is bedeviled by a dark arts culture that’s both self-serving and wrong.  Security is no more a dark art than finance or real estate or tax policy or animal husbandry...and to wrap itself in a cape of black magic is nothing more than self-importance and a vain attempt at job security.  It’s bad for customers.”

We are in a transformative stage of the infosec industry. As enterprises iterate on compliance with frameworks such as NIST, there is a parallel demand to ensure greater transparency of the products to ensure they perform as advertised. Just as Kelley Blue Book and Angie’s List have provided independent means to assess various products, so too is infosec moving in this direction. Driven by the expectation of methodologically rigorous and fair independent evaluation, vendors are participating in these assessments, and consumers are demanding them.  While testing methodologies may not be perfect, openly participating in them is crucial, and that’s why we have been and will continue to be actively engaged in a number of public tests with various vendors.  Together as members of AMTSO, we’ll adopt what are now draft standards, and can finally move the community forward, removing the ‘dark arts’ mystique of the industry, and provide protection assurances to consumers, corporations and national security assets. Transparency is key.  We should expect nothing less.

Beyond the Math: Effective Machine Learning in Security


In an attempt to appeal to information security executives and practitioners, some vendors have positioned Machine Learning (ML) – often liberally decorated as “Artificial Intelligence” (AI) – as a panacea for information security’s challenges.  In many cases, the hype has gone well beyond reality and become marketing nonsense.  Is there a useful place in information security for machine learning?  Most definitely. Will algorithms replace domain experts?  No.  Is machine learning automagically better than other carefully crafted detections because, well, math?  No.

Whether you’re on the vendor side (as I am) or on the user side (as many of you surely are), there is far more nuance to effectively using ML to achieve your security objectives.   ML can be extremely powerful, but it is not always the answer.  Even when it is the best tool for the job, it is very easy to screw up an ML model through bad data, incomplete data, the wrong features, or a variety of other factors.  Worse, vendors’ poorly explained metrics and unverified claims can make it impossible for users to recognize bad implementations.

Even the concepts themselves are misused and mischaracterized interchangeably, leading to the necessity for articles to ‘demystify’ AI and ML in security.   To simplify, AI is the practice of making computers behave or reason in an “intelligent” manner.  ML is a subset of AI, in which the computer makes predictions or insights from data.  In security, most discussions related to AI are actually just ML so we will stick to that terminology here for consistency.    

As ML remains a hot buzzword in security, the misconceptions surrounding it only seem to be increasing. I’ll first address many of the misconceptions of ML in security, and then highlight the advantages of good ML-driven solutions when properly implemented. Given the range of pitfalls and considerations required to optimize an ML-driven solution, a hybrid approach combining domain expertise with ML is necessary. It is this interplay that has made our research and development so successful, and it continues to drive innovation in the detection and protection capabilities of the Endgame platform.  

 

Common Misconceptions

Various myths about ML are gaining a foothold in the security community.  Many industry veterans are rightly skeptical and growing increasingly cynical as they watch vendors claim their magical ML solutions will solve all security problems. Machine learning is not the silver bullet often claimed.  Below are  four of the biggest misconceptions in security that result from that type of messaging.

  1. ML REPLACES THE NEED FOR SKILLED EMPLOYEES: This is not likely in infosec. According to some projections, the industry will face a workforce shortage of close to two million by 2022.  Demand for talent is likely to outstrip supply for a long time.  Vendors need experts well versed in both ML and security to properly implement ML-driven security products.  In addition, security teams need people to interpret the results and take actions via their tools, whether they are ML-driven or not.  Some organizations may be able to reduce the need for additional resources, but don’t expect a significant drop.  ML can augment your current workforce, but the value of your human capital isn’t going down anytime soon.
  2. ML IS INHERENTLY BETTER THAN HUMANS AT FINDING AND STOPPING INTRUSIONS: There are instances where this is true, but it is not universal. To be effective, solutions which are implemented using ML need to be trained, tuned, and tested by domain experts who understand the problem.  Solutions which lack this rigor tend to be ineffective, noisy, and ultimately shelved by users.
  3. IN ML, ALGORITHMS ARE THE MOST IMPORTANT FACTORS: We hear a lot about deep learning, neural networks, and a host of other fancy sounding terms.  Most people don’t know what these words mean, but they sound great!  The truth is that the quality of the data you feed to any ML technology greatly determines the real-world efficacy of the resulting model.  Data curation, cleaning and labeling is hard and extremely time consuming.  It requires deep domain knowledge and collaboration between data scientists and security experts.
  4. ML IN SECURITY IS ONLY ABOUT DETECTION: It is true that ML is an excellent way to solve some detection problems, but security is much bigger than your appliance blocking or finding something bad.  False positives happen.  An alert rarely tells the entire story of an attack.  Security teams often need help deciding what to do next and how to respond.  The industry’s narrow focus on ML only for detection has hindered potential advances in other key areas such as triage, response, and workflow.

 

Advantages of Machine Learning in Security

Despite these challenges with ML, when properly implemented, ML can be powerful in security.  In fact, when used correctly, it has significant advantages over non-ML approaches in solving certain problems.  These include:

  1. GENERALIZATION: Models can generalize to never-before-seen techniques, sequences of adversary actions, or malware samples based on structural or behavioral relationships within the data that are not obviously constructed by hand.  Human-constructed signatures or heuristics tend to be very specific and reactionary, giving rise to subsequent high false negative rates because of evolving adversary tools and tradecraft.  ML solutions, if done right, can consistently detect never-before-seen evil.
  2. SCALE AND AUTOMATION: ML solutions scale well with increasing volumes of data, a pervasive problem in the industry. They can aggregate, synthesize, and analyze disparate data sources automatically. When a novel attack occurs or a new malware sample emerges, just adding it to the training set can improve the model.  An army of malware analysts and signature developers is not necessary.
  3. DEEP INSIGHTS: ML learns from the data what constitutes malicious and benign content or behaviors.  A human does not dictate exactly where the decision boundary between benign and malicious lies.  This can lead to surprising and sometimes non-intuitive ways the underlying data can be sliced to make detection decisions – things which would not occur to a domain expert.
  4. INFREQUENT UPDATES: A need for constant signature updates can be a substantial operational burden and put systems, especially offline or off-corporate-network systems, at rapidly increasing risk of compromise.  Well-constructed ML solutions generalize well to new threats and thus require far less frequent updates to be effective.

 

Considerations When Building ML/AI Models

Endgame has a number of powerful ML-based detection solutions in our product, with many more in development.  Where we have chosen to implement ML, the advantages are significant.  However, in building these capabilities, we are always extremely aware of both the pitfalls (ensuring we avoid them) and opportunities (ensuring we optimize them) inherent within ML-based solution. For others looking to evaluate a ML model, or thinking about building their own, I’ve compiled some of the key considerations to keep in mind when building ML-based solutions.

  1. GARBAGE IN/GARBAGE OUT: This is the most significant issue facing security researchers building models.  First, gathering the right representative data, both malicious and benign, is a huge challenge.  Any model will have holes in its global understanding of data and behavior, but the bigger the holes, the worse the model will perform in the real world. In addition, unsupervised machine learning - where the model alone derives inferences from the data without human labeling - is generally inadequate for most security use cases. Instead, most models are built through supervised machine learning - that is, feeding the tool a large amount of training data labeled either good or bad.  The model then learns whether to call future unlabeled content or behavior, fed to it by the security application for evaluation, good or bad.  Labeling of training data may sound simple, but in practice it is very hard.  There are significant edge cases, such as how to handle adware, or legitimate tools like remote access solutions that behave much like malware.  Worse, attacker content can and does sneak into the benign training set, usually not through some conscious effort but through labeling laziness or mistakes.  A file has zero malicious detections in VirusTotal?  Scoop up that piece of advanced malware, call it benign, and your model will ignore it forever.
  2. FEATURE ENGINEERING: Models are usually built upon features which describe the data.  Feature extraction is the process by which input data, such as files, are transformed deterministically into a representation of that data comprising many features. The features are engineered to encode domain knowledge about the problem to be solved. For example, Windows executable files are transformed into thousands of floating point numbers during our MalwareScore™ feature extraction process.   Feature engineering, that is, researching and implementing the right feature set, is an enormously important part of the model building process.  To generate features that can generalize to the future, an understanding of how and why things work is required: operating systems, networks, adversary tradecraft, etc.  For example, we can simply make a feature that represents whether a binary imports functions for keylogging, but what if an adversary decides to dynamically load functions associated with keylogging to evade detection?  The feature space needs to be diverse and account for evasion techniques to generalize well.  Featureless learning is an exciting area of research, but it is nascent, and just as vulnerable to the garbage in/garbage out problem. In many security domains (e.g., static malware detection) hand-crafted features still represent the state of the art, allowing designers to encode decades of domain knowledge that have yet to be replicated through end-to-end deep learning.
  3. USERS WILL DO THINGS YOU DO NOT EXPECT: Your model needs an extremely low false positive rate to have a chance at being successful in production.  Many models look great when tested against known data, but explode with false positives once deployed.  This is often because users don’t act in predictable ways.  Administrators may create new accounts, bounce between systems aggressively, and use tools, such as PowerShell, which are often co-opted by attackers. Someone from accounting might log in at unpredictable times or try out new tools she discovered online.  Software your model doesn’t know about might get installed.  Researchers must be aware of these issues and expect the unexpected in the real world. In addition, researchers must be aware of scale.  A 1 in 10,000 false positive rate might seem great, but what if you’re observing a billion events per day?
  4. OVERFITTING: Overtraining isn’t only an issue at the gym.  It is a real issue in data science.  It is possible to build a model that is exceptionally well tuned to detect known data but does not extrapolate well to unknown data.  This leads to a loss of generalization and sub-par real-world efficacy.  Researchers need to avoid the trap of seeking perfection against a training set and instead constantly assess model performance against representative slices of withheld data - that is, data the model did not see during training. This helps avoid performing perfectly on the known data set while failing when applied to real-world, unknown data.
  5. MISLEADING METRICS: This is mostly an industry problem, but it is also a potential issue for internal teams building and “selling” custom models as they seek funding and production deployment on their own networks.  There are always tradeoffs in model performance, and researchers and engineers must select cutoffs between good and bad. It’s quite easy to manipulate the numbers and market a certain level of performance. For instance, it’s easy to detect 99.99% of all malware, but at what false positive rate?  Probably way too high.  This is a dangerous but unfortunately common behavior by vendors, and the impact is compounded by how hard it can be for customers and users to validate claims in this industry.  Test against real, representative data, be transparent, and demand that solutions submit to third-party validation whenever possible.
  6. BURN-IN TIME: Some solutions require months to tune to an environment’s baseline of normal, otherwise known as burn-in time. Beyond forcing users to wait months before there is any value, a lengthy burn-in time increases the opportunity for malicious data or behavior to creep into the benign training set, and does not eliminate the holes in understanding what is normal.  Efforts should be taken to minimize burn-in time or avoid it altogether.
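To make the feature engineering point concrete, here is a minimal, illustrative sketch of one classic hand-crafted static feature: a normalized byte histogram that turns a file of any size into a fixed-length vector of floating point numbers. This is not Endgame’s actual MalwareScore™ pipeline; production feature sets combine thousands of richer features (imports, entropy, header fields, strings, and more).

```python
# Illustrative only: a single hand-crafted static feature for malware
# classification. Real feature sets are far richer and are engineered
# to resist evasion (e.g., dynamically loaded imports).
from collections import Counter

def byte_histogram(data: bytes) -> list:
    """Map raw file bytes to a 256-dimensional vector of byte frequencies."""
    counts = Counter(data)
    total = len(data) or 1          # guard against empty input
    return [counts.get(b, 0) / total for b in range(256)]

# A toy "file": the vector is fixed-length regardless of input size,
# which is what lets a model consume arbitrary files uniformly.
vector = byte_histogram(b"MZ\x90\x00" * 100)
print(len(vector))                  # 256
```

The fixed dimensionality is the point: whatever the input, the model always sees the same feature space.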
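The scale warning in consideration 3 is worth doing the arithmetic on. Assuming a hypothetical telemetry volume of one billion events per day, even an apparently excellent false positive rate buries an analyst team:

```python
# Back-of-the-envelope math: a 1-in-10,000 false positive rate against
# one billion events per day (hypothetical volume).
events_per_day = 1_000_000_000
false_positive_rate = 1 / 10_000

false_alarms_per_day = events_per_day * false_positive_rate
print(f"{false_alarms_per_day:,.0f} false alarms per day")  # 100,000
```

One hundred thousand alerts a day is not a detection capability; it is a denial-of-service attack on your own SOC.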

 

Moving Beyond Detection

In addition to these considerations, we must also hold the actual use case up to tight scrutiny. To date, ML has largely been applied only to detection. For those who spend their time protecting systems day in, day out, you may by this point be asking yourself, “Do I even need more alerts?”  “Is detection even my biggest problem?”  I find that the answer to both questions is often no.  Security products need to be much broader than pure detection.  They need to provide context, tie information together, and guide the practitioner through incident triage, scoping, and response.  Today’s products are generally severely lacking in these areas. It is time to consider how AI and ML can improve the workflow both within and beyond detection.

ML and AI can help automatically gather and correlate data, can suggest or even automatically execute response actions, remember what you did before, and make the process of asking questions of a mountain of available data straightforward and accessible to users.  We need to collectively think bigger about AI in security and demand tools which apply AI and ML not just to create more alerts about things that might be malicious, but make the process of doing our jobs as security analysts easier.

At Endgame, through our own experiences as operators and analysts, we know the key pain points and hurdles, and bake ML into our endpoint security product in effective ways.  Our research and development team combines experienced data scientists, reverse engineers, incident responders, and threat experts hailing from diverse industry and government backgrounds.  Because domain expertise is so critical, security domain experts are paired with data scientists to craft, evaluate, and red-team solutions.  We constantly question our assumptions, endlessly seek and fill gaps in our data sets and features, and get way into the weeds in analyzing our results to identify and fix potential shortcomings. The fruits of this tireless labor include our ML-powered MalwareScore™ feature, which we’d put up against anyone in the industry (and you can check it out yourself in VirusTotal).

We are also very comfortable in saying that ML is not the answer to all our detection woes.  Our researchers and engineers have built world-class capabilities our customers use every day to block exploits, stop process injection, find injected code, eliminate ransomware, and much more.  The techniques we use are very powerful and no less interesting or revolutionary because of the absence of ML – our growing patent portfolio and publicly shared research in these areas attests to this.

We also treat the usability and workflow problem as a serious R&D challenge in need of a solution.  By pairing researchers with user experience experts, our product addresses those key workflow pain points with novel, yet intuitive to use, ways to fix them. The result is Artemis®, our AI powered chatbot which eases search, triage, and response throughout our endpoint protection platform.  This does not eliminate the need for highly skilled users, but it does facilitate training of new users and removes friction for more advanced users.  Under the hood remains a powerful two-way API which is fully documented and open for any user, extending our commitment to usability and extensibility.

 

The Bottom Line

Looking beyond the hype and buzzwords, when implemented well, ML can be a powerful piece of a security program or tool. The key challenge is to build it correctly, taking into account the range of factors I’ve addressed in this post.  The marketplace is littered with poorly implemented ML-based solutions fronted by aggressive marketing.  To help cut through the noise, the graphic below provides a cheat sheet for key questions to ask when evaluating ML-driven solutions.

[Graphic: Machine Learning Ground Truth Checklist]

The key to effective implementation of ML is domain expertise.  In security, that domain expertise needs to drive dataset curation, labeling, feature selection, and evaluation.  It is equally effective to apply that domain expertise directly via other methods of detection.  You need real-time, inline analysis of data and actions on monitored systems, looking for behaviors across the range of adversary techniques.  ML isn’t the answer when domain expertise is required to go deep into the kernel, boil adversary actions like process injection down to their essence, and block them in real time.

Most importantly, remember that machine learning is a powerful tool, but is not inherently good or even better than alternatives.  The appropriate use case, data, parameters, and domain expertise (just to name a few of the challenges) all impact the efficacy of ML-based solutions. Moreover, beyond improving detection, as a community we must think bigger and address our industry’s usability problems and how ML could help alleviate some of the major workflow challenges.  To this end, at Endgame we will continue to research and operationalize powerful new features, both ML and non-ML based. Stay tuned!


The Escalation of Destructive Attacks: Putting Dragonfly in Context


Today, Symantec released another report on Dragonfly, a cyber-espionage group targeting the energy sector in the United States, as well as Turkey and Switzerland. As the report thoroughly details, the campaign has evolved beyond the initial phase of reconnaissance, and has shifted to gaining access to the operational systems of energy facilities. This transition elevates the intent beyond exploration and espionage toward an operational foundation for sabotage and destruction. Importantly, as Mark Dufresne and I presented at BSidesLV this summer, attacks intended to sabotage and destroy are not new. Destructive attacks have been increasing in frequency, and are tightly linked to rising geopolitical tensions between the most conflict-prone country pairs. In short, this latest news of deep intrusions into the energy sector fits a much larger and disconcerting pattern, and should be viewed not as an anomaly but rather as the ‘state of the cyber state’ in 2017 and beyond.

 

Destructive Attacks: A Brief Overview

As the timeline below illustrates, destructive and sabotage attacks have targeted critical infrastructure, media, and other select organizations for the last decade. Importantly, while these kinds of attacks have long been on the rise, there has been a marked spike in destructive attacks since late 2016. In December, the Ukraine power grid was struck again with destructive malware, later identified as the Russia-linked Crash Override. Crash Override is highly customized malware with a wiper component, compiled to control grid circuit switches and breakers. A few weeks earlier, Shamoon 2.0 surfaced, targeting Saudi government entities, infecting thousands of machines, and spreading to Gulf states. Shamoon 2.0 was followed by the discovery of Stonedrill, another destructive malware targeting Saudi entities that has also been found in at least one European organization.

 

[Figure: timeline of destructive and sabotage attacks]

 

These destructive attacks are not only expanding their target set, but also innovating for additional effects. Just as Crash Override innovated through sophisticated customization to control power grids, other destructive malware has innovated this year.  KillDisk, malware with a wiper component, has been linked to previous attacks on the Ukraine power grid as well as the shipping and financial sectors, and has recently been updated with a ransomware component that encrypts files. Conversely, NotPetya masqueraded as ransomware, but was likely a targeted wiper malware attack focused on Ukraine. Finally, Ireland’s EirGrid was compromised in April, a breach reported only last month. It remains unclear whether destructive malware was installed, which could have resulted in a blackout. These are just the attacks that have been publicized. As the Dragonfly report makes clear, these campaigns can remain undiscovered for quite some time before public acknowledgement.

 

Phases of Tensions, Phases of Escalation

As the Symantec report articulates well, Dragonfly 2.0 reflects an escalation from general intelligence gathering toward the deeper access to, and reconnaissance on, control systems necessary for potential sabotage. These phases of escalation are increasingly common, and often coincide with escalating geopolitical tensions between countries. More often than not, the escalation to destructive attacks occurs between interstate rivals - pairs of countries that exhibit a higher propensity toward conflict. While not noted in the above timeline, the Dark Seoul gang, linked to North Korea, was among the first to deploy wiper malware within a larger campaign, targeting the United States and South Korea in 2009 with a combination of DDoS attacks and wiper malware. North Korea has a long history of integrating wiper malware with additional attack vectors, often timed closely to planned exercises, anniversaries of key events, or other geopolitical events such as the disintegration of the six-nation talks.

Similarly, Shamoon 2.0 manifests the escalation from the previous campaign, and the geopolitical tensions between Iran and Saudi Arabia. While there was a relative ‘cease fire’ of destructive attacks between the pair from 2012-15, following the Iran nuclear agreement there was a major escalation of tit-for-tat attacks on websites prior to Shamoon 2.0 and Stonedrill. Finally, Russia and Ukraine represent the most prominent use of destructive attacks, as well as the asymmetric and uni-directional use of these attacks by major powers on smaller countries. Unfortunately, many of these attack vectors and wiper malware are now in the wild, and are likely to be deployed by other groups similarly seeking larger effects and objectives.

 

Protecting Against Targeted Attacks

In many of these examples, private sector organizations are caught in the geopolitical crossfire, and often are viewed simply as collateral damage by the attackers. NotPetya (or “Nyetya”) may cost shipping giant Maersk $300 million, even though by most accounts it was not the intended target. Because of these growing externalities of targeted attacks, it is important to remain cognizant of the attack vectors and protect against them accordingly.

First, although the energy sector is a prime target for destructive attacks, enterprises in other industries must also be prepared to protect against these kinds of attacks. Second, the Symantec report notes that Dragonfly uses a range of infiltration techniques to access a victim’s network, including “malicious emails, watering hole attacks, and Trojanized software.” Because destructive attacks, and targeted attacks in general, integrate a variety of intrusion techniques, prevention must be considered a top priority.  This is why at Endgame we focus on prevention across the attack lifecycle - exploits, malware, and post-exploit techniques including living-off-the-land.  Such an approach is necessary to protect against threats such as Dragonfly.

 

Conclusion

Today’s Symantec report on Dragonfly is just the latest reminder that attackers are increasingly brazen, and critical infrastructure remains a prime target.  Unlike the series of publicized destructive attacks that have slowly risen over the last decade, this campaign shows no proof of actual sabotage yet, but pre-positioning is probably underway.  We should not panic that the grid is about to go down, but we must pay attention to the trend.  Too often these attacks are viewed through a limited lens and remain on the radar for only a brief news cycle. This myopic view overlooks the larger, escalatory increase in these attacks, especially within and between geopolitical rivals. As long as geopolitical tensions remain high, and with the growing open source proliferation of nation-state malware, this trend is unlikely to abate any time soon.

Corvil and Endgame: Safeguarding the World's Algorithms


To obtain a competitive advantage, businesses across nearly every sector are increasingly turning towards algorithms to unlock and act on signals hidden in mounds of data.  Today, algorithms frequently drive key revenue generating and back office functions. These implementations have moved beyond hardcoded or basic statistical models, often incorporating Artificial Intelligence (AI) and Machine Learning (ML) based approaches, where the algorithm’s underlying rules are learned not written. As with most key technological advancements, securing these algorithms has largely been an afterthought. Endgame and Corvil have partnered to address this gap.

 

Early Implementation of Machine Learning

The financial sector was one of the first to turn to algorithms. In the late 1990s, to optimize trading operations, the first computers began trading without human intervention. However, the underlying methods were simple in comparison to the algorithmic trading that accounts for 90% of modern trading activity. Other sectors have followed suit, adopting algorithms as a way to rapidly make complex decisions without human involvement. Today, when you shop online, apply for a loan, or binge watch Netflix originals, algorithms are working behind the scenes.  Over the last 18 months, the security community, a relatively late entrant, has put an almost obsessive focus on leveraging machine learning to improve detection outcomes.

 

Dangers with the Rise of the Machine

While Mark Zuckerberg and Elon Musk debate the dangers of this emerging future, others are looking for ways to exploit this evolving reality.  Massive data breaches often dominate the headlines, but the threats to algorithms are often overlooked.  Algorithms can be gamed, stolen, or manipulated.

To offer a simple example, ride-sharing companies use algorithms to track, motivate, and compensate their drivers. For instance, when there is a lack of drivers on the road in a specific area, pricing algorithms implement price surges to attract drivers. Since fares during a surge are considerably more profitable, drivers have been organizing massive “switch offs” to trigger surge pricing. This is possible because, through experimentation, the drivers understand the basic features that drive the algorithm’s output.  

As the ride-sharing example demonstrates, algorithms can be gamed for profit; they can also be stolen outright.  In 2015, an engineer at Citadel, a Chicago-based hedge fund, was convicted of stealing thousands of files containing "alpha and term" data via a personal hard drive.  The engineer used the data to trade in his personal brokerage account. Ironically, the perpetrator lost money, possibly because the data was no longer reflective of current market conditions.  In April 2017, a KCG employee allegedly installed a credential-stealing tool in order to access systems and steal source code from his hedge fund employer.  While most Wall Street firms aggressively block external email and external storage devices, attackers evolve. What’s interesting about this case is that the actor, an insider, employed attacker tools and techniques to achieve his ultimate objectives.

Five years ago, Knight Capital served as a case study that perhaps foreshadows the dangers posed by an errant or degraded algorithm.  An errant algorithm cost the trading firm $10 million a minute over a 45-minute window, resulting in a 70% loss in market value. Although this was based on developer error, not attacker success, it demonstrates the risks involved when algorithms are manipulated. 

 

Protecting Enterprises from Targeted Attacks in A Machine World

Clearly, reverse engineering, unauthorized access, or manipulation of algorithms pose a threat to algorithmic businesses. Security vendors are not immune. In a recent webinar, Endgame’s technical director of data science, Hyrum Anderson, showed how even blackbox machine learning models can be gamed and exploited if an adversary is given the ability to run sufficient experiments against those algorithms.
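As a toy illustration of that idea (not the actual attack demonstrated in the webinar), consider a blackbox scorer an attacker can only query. Repeated experiments, here a simple bisection, quickly pin down where the model’s decision boundary sits; the `blackbox` function and its 0.7 threshold are hypothetical stand-ins:

```python
# Toy sketch: probing a blackbox model with repeated queries to locate
# its decision threshold by bisection. The stand-in model flags any
# score at or above 0.7, a value the "attacker" does not know.

def blackbox(score: float) -> bool:
    """Stand-in for a model we can only query, never inspect."""
    return score >= 0.7

lo, hi = 0.0, 1.0                   # blackbox(lo) is False, blackbox(hi) is True
for _ in range(50):                 # each query halves the search interval
    mid = (lo + hi) / 2
    if blackbox(mid):
        hi = mid                    # flagged: boundary is at or below mid
    else:
        lo = mid                    # not flagged: boundary is above mid

print(f"estimated threshold ~ {hi:.4f}")
```

Fifty queries suffice to recover the threshold to machine precision, which is why rate-limiting and query monitoring matter for any externally exposed model.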

Endgame and Corvil have partnered to provide a unique solution to address the dangers posed to algorithmic businesses. Endgame, whose customers include the most targeted military and commercial organizations, is the only endpoint platform unifying prevention, detection and response, and threat hunting to stop sophisticated attacks before theft can occur. With customers among the top global banking and financial services companies, Corvil provides real-time traffic analysis to safeguard financial transactions representing over $1 trillion on a daily basis. Together, the Endgame-Corvil partnership provides a joint solution with full stack protection, nanosecond visibility, and lightweight threat hunting for the most sensitive environments.  In the financial sector, the joint solution is uniquely equipped to protect back office, development, and trading environments alike.

With expertise and lineage that well understands the challenges and opportunities this growing reliance on algorithms brings, Corvil and Endgame are uniquely experienced in safeguarding enterprises from these evolving threats. Corvil and Endgame will continue to work closely with algorithmic businesses, across a variety of sectors, to help protect their customers, partners, and investors.

Bots, Trolls, and Warriors: The Modern Adversary Playbook


Last night, The Washington Post published an article on Russia’s use of Facebook for micro-targeting. According to the article, last summer Facebook’s cyber experts found evidence of APT 28 setting up fake accounts, including Guccifer 2.0. APT 28 has been linked to Russia, and conducts not only hacking but also media operations that can be carried out simultaneously. This is just the latest example of how attackers are integrating offensive cyber operations and (dis)information operations, in conjunction with automation and machine learning, to achieve both tactical and strategic impact. This modern adversary playbook was the topic of my talk at Derbycon on Friday. Today’s attackers combine bots, trolls, and warriors in increasingly novel and brazen ways. Defenders must catch up, comprehend this modern adversary playbook, and prepare defenses accordingly.

 

Today’s Battle for Information Control

For today’s attackers, information security is synonymous with information control, including theft, manipulation, destruction, and disinformation, and they achieve this through a combination of bots, trolls, and warriors. First, cyber warriors is the unfortunate term coined for experts in computer network exploitation, both offense and defense. When applied to adversaries, the offense-focused warriors are increasingly brazen and leverage both traditional means of compromise (e.g., the credential theft used in the OPM hack) and sophisticated techniques (e.g., Crash Override, customized for energy grids) to achieve strategic surprise. While most think of Russia, China, and the US as dominant in this area, smaller countries such as Mexico, Vietnam, and Sudan increasingly have their own teams of warriors, who often target domestic populations and corporations. At the same time, anti-government groups are pushing back, such as Venezuela’s Binary Guardians or Ukraine’s hacktivist network. In short, there is no one size fits all, and thanks to the proliferation of open source capabilities, such as through the Shadow Brokers and Vault 7 dumps, the number of both state and non-state cyber warriors is only growing.

While warriors generally focus on compromising machines, trolls focus on compromising hearts and minds. Trolls reflect the growth of entities, largely but not always linked to nation-states, who leverage online forums to influence opinions and perspectives and achieve specific objectives. The Russian trolls are well known and have had an impact across the globe, but China also has the Fifty Cent Party, government-affiliated workers pushing forth positive narratives about the government. To this end, astroturfing, replacing negative narratives about the government with positive ones, has become a popular tactic within authoritarian regimes. In addition, the use of state media, disinformation, and fake or compromised social media personas also serves as a springboard for diffusing the narrative. Just as with the warriors, smaller countries are copying these tactics. For instance, Philippine President Rodrigo Duterte has a keyboard army aimed at drowning out critics of the government. Turkey’s AK Trolls, affiliated with the ruling Justice and Development Party, astroturf critiques of the government, but after some narratives went awry, they now largely stick to traditional state outlets to oppose foreign governments. In many of these cases, governments test tactics on the domestic population and then use them internationally.

Trolls also migrate from the virtual world to the physical. LinkedIn has recently drawn much attention as a forum where fake profiles infiltrate professional networks and then meet targets in person, or convince them to download malware. The Cobalt Gypsy/OilRig group is notorious for targeting executives in high tech and energy. For instance, they have been linked to the fake persona Mia Ash, who connected with targets through LinkedIn, followed by a rapid progression of Facebook and WhatsApp connections, eventually convincing targets to download malicious Excel spreadsheets carrying RATs.

To be fair, troll armies and warriors existed before the internet, so it’s no surprise even more sophisticated versions exist in the virtual world. The difference now is the role of automation, which helps warriors and trolls target tactically while achieving strategic breadth and depth. Bots reflect the trolls’ and warriors’ implementation of automation and machine learning, and manifest in everything from DDoS to malvertising to ransomware fueled by propagating worms to social bots. By some estimates, bots comprise over half of all web traffic, half of which are malicious bots. At the tactical level, machine learning helps warriors evade defenses, including machine learning-powered defenses (e.g., bot vs. bot). Machine learning also helps trolls target very specific subsets of the population to optimize impact. The social media bots are good examples of this, used not only to interfere in elections, but also to prop up governments and weaken opposition. Strategically, bots are essential in helping trolls attain strategic impact through widespread diffusion of narratives, and they enable warriors to spread malware globally, as we saw recently with both WannaCry and NotPetya.

 

An Integrated Strategy

Increasingly, we are seeing this combination of bots, trolls, and warriors achieve strategic impact. The recent French election demonstrates the integration of all three: disinformation by trolls, warriors dumping data 48 hours prior to the election, and bots aiding the diffusion and targeting of both. The Macron Leaks, a trove of false or doctored documents, photos, and correspondence linked to his campaign, were a late effort to influence the election and a continuation of the disinformation leading up to it. Bots helped proliferate the narrative, with 40% of the #MacronGate tweets coming from just 5% of accounts.

Importantly, this is occurring in both peace and wartime. The annexation of Crimea in 2014 also involved all three, and is noted as one of the first hybrid wars, where these digital activities complemented kinetic attacks. DDoS attacks against Ukrainian government and media sites led to an information blackout, which left an opening for pro-annexation actors to gain information superiority and dominate the narrative. Both hackers and infobots were part of the aggression against Ukraine.

Most recently, this summer’s Qatari tensions are indicative of the geopolitical instability that can arise when these three tactics are integrated. It began with the hack of a news agency, followed by the posting of false reports of the Qatari emir praising Iran and Hamas. Twitter bot armies spread the disinformation, leading Saudi Arabia, the UAE, Bahrain, and Egypt to ban Qatari media and then enlist trade and diplomatic boycotts against Qatar. By one report, 20% of Twitter accounts posting anti-Qatari hashtags were bots.

Although these are examples of nation-state attacks on the private sector, the private sector also has adopted some of these tactics. Everyone thinks of Hacking Team, but an article last week in Motherboard describes a leaked catalog that includes services ranging from ‘weaponized information’ to DDoS services to spyware to industrial control system exploits, and is a stark demonstration of the market demand for the range of information-related weapons.

 

Looking Ahead

Remember, it wasn’t long ago that Facebook CEO Mark Zuckerberg stated that the use of social media to influence elections was a crazy idea. It also wasn’t long ago that concerns over critical infrastructure attacks were criticized as fearmongering. Then came this past year’s discoveries of a wave of destructive malware, including Stonedrill, Shamoon 2.0, BlackEnergy, CrashOverride, and NotPetya, as well as reports over the past few weeks of the Iranian-linked APT 33 and Dragonfly 2.0 moving from reconnaissance to destructive objectives. The expansion and brazenness of objectives and intent is clear.

In short, attackers already view the information/cyber domain holistically, through a socio-technical lens and as a battle for information control.  While there are Lithuanian elves combatting the trolls, defenders generally lag far behind in this battle for information control, and must innovate toward novel, multi-faceted, and creative solutions. In recent testimony submitted to the Commission on Security and Cooperation in Europe (U.S. Helsinki Commission), Molly F. McKew explained the challenge succinctly: “Right now, there are efforts to analyze the war; expose the war; map the war — but very little is being done to fight the war.” Defenders must catch up, view the information security landscape through a socio-technical lens, comprehend this modern adversary playbook, and prepare and innovate defenses accordingly.

Practical Tips for Becoming Cyber Savvy


Following the Equifax breach in early September, in which 143 million records were stolen, The New York Times updated their interactive tool for individuals to comprehend how much of their data has been exposed across a range of breaches. Just a few weeks later, they updated it again following the announcement that the 2013 Yahoo breach impacted three billion accounts. Given the extent of the data theft, individuals may feel hopeless since, across these breaches, the majority of personally identifiable information (PII) is available somewhere.

Now is not the time to abandon good security habits. Targeted attacks continue unabated against high profile individuals, such as political candidates and executives, often to inflict reputational damage and steal data. But the vulnerabilities extend beyond high profile individuals, existing throughout organization charts and production supply chains. Cobalt Gypsy infiltrates corporate networks through rank-and-file employees. Target was compromised through an HVAC vendor. If an attacker is determined to access a network, they will likely figure out a way. But there are many steps individuals can and should take to make it harder, ideally deterring the attacker, while also limiting collateral damage should an attack succeed. To kick off National Cybersecurity Awareness Month, below is some background, along with tips, relevant for everyone from candidates running for office to parents wanting to protect their kids online.

 

It’s All Social

It is true that attackers are increasingly finding innovative technical means to compromise a network, but the most prominent initial attack vector remains phishing. Phishing refers to when attackers disguise themselves and seek to dupe targets in order to access sensitive data or information. This occurs largely through electronic modes of communication, frequently emails. Phishing attempts reflect a range of creative malfeasance to achieve a variety of objectives. Attackers may impersonate colleagues to solicit sensitive information or access credentials as the springboard to broader access. An attacker may convince a target to download what appears to be a document or spreadsheet, but in reality contains malicious software, such as ransomware and spyware. Ransomware, wherein data is encrypted and inaccessible until a fee is paid (and maybe not even then), has rapidly increased over the last two years and often gains entry through phishing. Also, while spyware is most commonly associated with nation-state attacks on NGOs and journalists largely in authoritarian countries, spyware is also a prominent tool in domestic abuse and bullying. Finally, long gone are the days of blatantly obvious phishing emails. Today’s attacks leverage the variety of online information available, and are increasingly difficult to differentiate from legitimate emails. Recently, authentic G20 invitations were manipulated with a backdoor Trojan for espionage purposes.

Social media is also a popular means of compromise, with similar objectives of gaining sensitive data, acquiring credentials, and compromising machines. These socially engineered attacks target victims through popular platforms such as LinkedIn, Facebook, and Instagram, and often manipulate the victim into downloading a document or spreadsheet. For example, the Iranian-affiliated group Cobalt Gypsy created a fake persona named Mia Ash. She connected with employees of energy, technology, defense, and finance companies, generally through LinkedIn, and then expanded the relationship through Facebook, WhatsApp, and email. The fake persona eventually convinced targets to download a spreadsheet carrying malicious software, enabling access to the corporate network. In other cases, fake personas pose as recruiters and convince potential job applicants to download malicious software hidden within false job descriptions. The opposite also occurs, wherein fake job applications containing ransomware are sent to HR departments. In short, social media platforms are a prime attack vector, both for gathering information to be used in future attacks and as the mode of compromise itself.

 

What to Do?

There is no way to protect 100% across the variety of social and digital attack vectors, but there are several minimal to no cost steps that can greatly protect your data and your identity.

  1. Password considerations: This usually tops most lists, yet 123456 remained the most popular password of 2016. Strong, unique passwords, changed regularly and never reused across sites, are recommended. When you do change a password, do so directly through the secure website; do not change it through a link in an email you received, as password reset emails are frequently fraudulent. Enable two-factor authentication, such as Google offers for Gmail, everywhere it is available. This means that when you log in, another point of verification is required, often through a text message to your phone. It is also a good idea to provide fake answers for password recovery questions; keep in mind how much of the real information is easily accessible via social media, and this begins to make sense. Finally, given how hard it is to keep up with all your passwords, a password manager can greatly help.

  2. Security best practices: There are a variety of security actions individuals can take that require minimal cost. Keeping patches up to date is especially helpful in preventing widespread attacks like NotPetya. Running basic anti-virus software, such as Windows Defender, which is built into Windows 8 and later, is also helpful. Virtual Private Networks (VPNs) help obfuscate your online activity, but be sure to research them first, as their capabilities vary. VPNs are especially useful for people who travel and rely on external networks. In general, however, it is best to avoid public Wi-Fi if at all possible: thanks to products such as the WiFi Pineapple, both nation-states and criminal actors can easily create fake Wi-Fi access points or gain access through public Wi-Fi. In addition to VPNs, there are also mobile hotspot devices to help avoid connecting to public Wi-Fi if you don’t want to tether to a known device. Security keys based on universal second factor (U2F) provide another layer of authentication. Keeping Bluetooth off except while in use and downloading apps only from known sources are a few additional, easy steps to limit your risk. Finally, back up all data (but don’t lose the backup device!) and opt for HTTPS sites, which as of early 2017 accounted for over half of web traffic.

  3. Protecting your social data: As the phishing examples demonstrate, social media is increasingly a source of data collection, reconnaissance, and infiltration for attackers. Avoid providing personally identifiable information, such as your birthday or mother’s maiden name, which makes attackers’ work easier. Similarly, be protective of your social networks, accepting invitations only from known and trusted people. Even if someone is connected with people you know, that should not serve as validation of legitimacy. Reiterating the earlier point on password management, enable two-factor authentication across all of your social media accounts and don’t reuse passwords between them. Moreover, choose messaging apps with encryption and use those for more sensitive communications. These steps don’t just pertain to work accounts, but to personal correspondence as well; remember that the hacks of both Hillary Clinton campaign chairman John Podesta and then-CIA director John Brennan were of personal accounts. If you receive a suspicious email or suspect a fraudulent profile on any account, report it to your security officer if one exists. Many social media outlets also provide easy means for reporting phishing attempts, which helps those organizations better understand attacks and protect you against them. Finally, particularly for high profile individuals, ensure correspondence passes the front-page headline test: don’t write anything that, if hacked, would make front-page news (e.g., the Sony breach).
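As a small programmatic aside to the password advice in item 1 (this sketch is our illustration, not part of the original tips), Python's standard secrets module generates the kind of strong, random password described above:

```python
import secrets
import string

def generate_password(length=20):
    """Build a random password from letters, digits, and punctuation using
    a cryptographically secure source (secrets, not random)."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Paired with a password manager, generated passwords like these avoid the reuse problem entirely.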

Remember, there is no perfect security, but there is better, resilient security. Too often, attacks are avoidable. A few low cost steps can make it harder for attackers, and help you protect your data and privacy.

Hunting for In-Memory .NET Attacks


In past blog posts, we shared our approach to hunting for traditional in-memory attacks along with in-depth analysis of many injection techniques. As a follow up to my DerbyCon presentation, this post will investigate an emerging trend of adversaries using .NET-based in-memory techniques to evade detection. I’ll discuss both eventing (real-time) and on-demand detection strategies for these .NET techniques. At Endgame, we understand that these differing approaches to detection and prevention are complementary, and together result in the most robust defense against in-memory attacks.

 

The .NET Allure

Using .NET in-memory techniques, or even standard .NET applications, is attractive to adversaries for several reasons. First and foremost, the .NET framework comes pre-installed in all modern Windows versions. This is important as it gives the attackers’ malware maximum compatibility across victims. Next, the .NET PE metadata format itself is fairly complicated. Due to resource constraints, many endpoint security vendors have limited insight into the managed (.NET) structures of these applications beyond what they share with vanilla, unmanaged (non-.NET) applications. In other words, most AVs and security products don’t defend well against malicious .NET code, and adversaries know it. Finally, the .NET framework has built-in functionality to dynamically load memory-only modules through the Assembly.Load(byte[]) function (and its various overloads). This function allows attackers to easily craft crypters/loaders, keep their payloads off disk, and even bypass application whitelisting solutions like Device Guard. This post focuses on the Assembly.Load function due to the robust set of attacker capabilities it supports.

 

.NET Attacker Techniques

Adversary use of .NET in-memory techniques is not completely new. However, in the last six months there has been a noticeable uptick in tradecraft, which I’ll briefly discuss to illustrate the danger. For instance, in 2014, DEEP PANDA, a threat group suspected of operating out of China, was observed using the multi-stage MadHatter implant, which is written in .NET. More interestingly, this implant exists only in memory after a multi-stage Assembly.Load bootstrapping process that begins with PowerShell. PowerShell can directly call .NET methods, and Assembly.Load is no exception; it is as easy as calling [System.Reflection.Assembly]::Load($bin). More recently, the OilRig APT group used a packed .NET malware sample known as ISMInjector to evade signature-based detection. During the unpacking routine, the sample uses the Assembly.Load function to access the embedded next-stage malware known as ISMAgent.

A third example, more familiar to red teams, is ReflectivePick by Justin Warner and Lee Christensen. ReflectivePick allows PowerShell Empire to inject and bootstrap PowerShell into any running process. It leverages the Assembly.Load() method to load their PowerShell runner DLL without dropping it to disk. The image below shows the relevant source code of their tool.

It is important to point out that Assembly.Load, being a core function of the .NET framework, is often used in legitimate programs. This includes built-in Microsoft applications, which has led to an interesting string of defense evasion and application whitelisting bypasses. For example, Matt Graeber discovered a Device Guard bypass that targets a race condition to hijack legitimate calls to Assembly.Load, allowing an attacker to execute any unsigned .NET code on a Device Guard protected host. Because of the difficulty in fixing such a technique, Microsoft currently has decided not to service this issue, leaving attackers a convenient “forever-day exploit” against hosts that are hardened with application whitelisting.

Casey Smith also has published a ton of research bypassing application whitelisting solutions. A number of these techniques, at their core, target signed Microsoft applications that call the Assembly.Load method with attacker supplied code. One example is MSBuild, which comes pre-installed on Windows and allows attackers to execute unsigned .NET code inside a legitimate and signed Microsoft process. These techniques are not JUST useful to attackers who are targeting application whitelisting protected environments. Since they allow attacker code to be loaded into legitimate signed processes in an unconventional manner, most anti-virus and EDR products are blind to the attacker activity and can be bypassed.

Finally, James Forshaw developed the DotNetToJScript technique. At its heart, this technique leverages the BinaryFormatter deserialization method to load a .NET application using only JScript. Interestingly enough, under the hood the technique makes a call to the Assembly.Load method. DotNetToJScript opened the door for many new clever techniques for executing unsigned .NET code in a stealthy manner. For example, James demonstrated how to combine DotNetToJScript with COM hijacking and Casey’s squiblydoo technique to inject code into protected processes. In another example, Casey weaponized DotNetToJScript in universal.js to execute arbitrary shellcode or PowerShell commands.
The number of Microsoft-signed applications that can be abused to execute attacker code in a stealthy manner is dizzying. Fortunately, the community has been quick to document and track them publicly in a number of places. One good reference is Oddvar Moe’s UltimateAppLockerByPassList, and another is Microsoft’s own reference.

 

Detecting .NET Attacks

As these examples illustrate, attackers are leveraging .NET in various ways to defeat and evade endpoint detection. Now, let’s explore two approaches to detecting these attacks: on-demand and real-time based techniques.

 

On-Demand Detection

On-demand detection relies on snapshot-in-time data collection. You don’t need a persistent agent running and collecting data when the attack takes place, but you do need the malicious code to be running during the hunt/collection window. The trick is to focus on high-value data that can capture actor-agnostic techniques and has a high signal-to-noise ratio. One example is the Get-InjectedThread script for detecting traditional unmanaged in-memory injection techniques. To demonstrate detecting .NET malware’s usage of the Assembly.Load function, I leverage PowerShell Empire by Will Schroeder and others. Empire allows you to inject an agent into any process by remotely bootstrapping PowerShell. As you see below, after injection, calc.exe has loaded the PowerShell core library System.Management.Automation.ni.dll.

 

This fact alone can be interesting, but a surprisingly large number of legitimate applications load PowerShell. Combining this with process network activity and looking for outliers across all your data may give you better mileage. Upon deeper inspection, we see something even more interesting. As shown below, memory section 0x2710000 contains a full .NET module (PE header present). The characteristics of the memory region are a bit unusual. The type is MEM_MAPPED, although there is no associated file mapping object (Note the “Use” field is empty in ProcessHacker). Lastly, the region has a protection of PAGE_READWRITE, which surprisingly is not executable. These memory characteristics are a side effect of loading a memory-only module with the Assembly.Load(byte[]) method.
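The combination of characteristics just described can be captured as a simple predicate. Below is a hedged, platform-independent sketch in Python (the constants mirror winnt.h values; the region field names and function name are our own, not part of any released tool):

```python
# Win32 memory constants as defined in winnt.h
MEM_MAPPED = 0x40000
PAGE_READWRITE = 0x04

def looks_like_reflective_clr_module(region):
    """Flag a region matching the side effects of Assembly.Load(byte[]):
    memory-mapped, read-write (notably not executable), no associated
    file mapping object, yet starting with a PE header ('MZ')."""
    return (
        region["type"] == MEM_MAPPED
        and region["protect"] == PAGE_READWRITE
        and not region["mapped_file"]    # the 'Use' field is empty
        and region["data"][:2] == b"MZ"  # PE header present
    )
```

Feed it region snapshots from whatever memory-enumeration tooling you already use; any hit warrants pulling the region for deeper analysis.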

 

 

To automate this type of hunt, I wrote a PowerShell function called Get-ClrReflection which looks for this combination of memory characteristics and will save any hits for further analysis. Below is sample output after running it against a workstation that was infected with Empire.

 

 

Once again, you will see hits for legitimate applications that leverage the Assembly.Load function. One common false positive is XmlSerializer-generated assemblies. Standard hunt practices apply: bucket your hits by process name or, better yet, by a fuzzy hash match. For example, ClrGuard (details next) will give you a TypeRef hash with its “-f” switch. Below is an example from Empire.
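Independent of ClrGuard's exact algorithm, the bucketing idea can be sketched as a stable digest over a module's TypeRef names, so functionally similar payloads group together even when their file hashes differ (this is our own approximation of the concept):

```python
import hashlib

def typeref_hash(typeref_names):
    """Illustrative 'TypeRef hash': canonicalize the set of TypeRef names
    from a .NET module (sorted, de-duplicated) and digest it, giving a
    fuzzy bucket key that survives repacking and byte-level changes."""
    canonical = ";".join(sorted(set(typeref_names))).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:16]
```

Because the key depends only on the set of names, two dumps of the same family produce the same bucket regardless of extraction order.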

 

 

Eventing-Based Detection

Eventing-based detection is great because you don’t need the luck of an adversary being active while you are hunting. It also gives you an opportunity to prevent attacker techniques in real time. To provide signals into the CLR, on which all .NET code runs, we developed and released ClrGuard. ClrGuard hooks into all .NET processes on the system. From there, it performs an in-line hook of the native LoadImage() function, which is what Assembly.Load() calls under the CLR hood. When events are observed, they are sent over a named pipe to a monitoring process for further introspection and a mitigation decision. For example, Empire’s psinject function can be immediately detected and blocked in real time, as shown in the image below.
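The monitoring side's mitigation decision can be sketched as a toy policy (our own illustration, not ClrGuard's actual logic; the allowlist is hypothetical): loads backed by a file on disk behave like ordinary assembly loads, while memory-only loads are blocked unless the host process is explicitly approved.

```python
# Hypothetical allowlist of processes permitted to do memory-only loads.
APPROVED_MEMORY_LOADERS = {"devenv.exe", "msbuild.exe"}

def should_block_load(process_name, image_path):
    """Toy policy for a ClrGuard-style monitor: a LoadImage event with a
    backing file path looks like a normal assembly load; a memory-only
    load (empty path) is blocked unless the host process is approved."""
    if image_path:  # assembly is backed by a file on disk
        return False
    return process_name.lower() not in APPROVED_MEMORY_LOADERS
```

A production policy would of course weigh more signals (signer, call stack, parent process), but the file-backed versus memory-only distinction is the core of it.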

 

 

In a similar manner, OilRig’s ISMInjector can be quickly detected and blocked.

 

 

Another example below shows ClrGuard in action against Casey Smith’s universal.js tool.

 

 

While we don’t recommend you run ClrGuard across your enterprise (it is proof-of-concept grade), we hope it spurs community discussion and innovation against these types of .NET attacks. These sorts of defensive techniques power protection across the Endgame product, and an enterprise-grade ClrGuard-like feature is coming soon.

 

Conclusion

It is important to thank those doing great offensive security research who are willing to publish their capabilities and tradecraft for the greater good of the community. The recent advancements in .NET in-memory attacks have shown that it is time for defenders to up their game and go toe-to-toe with the more advanced red teams and adversaries. We hope that ClrGuard and Get-ClrReflection help balance the stakes. These tools can increase defenders’ optics into .NET malware activity and raise visibility into this latest evolution of attacker techniques.

A Cozy Community of Data Scientists in Information Security


Every scientist needs a home.  

Like most PhD research topics, mine was “special”. It was unique enough to straddle a few research communities, but fit snugly into none.  Because conferences often reflect these “communities”, I considered my “home academic community” for machine learning to be ICML and NIPS.  But, significant signal and image processing themes often didn’t fit there, and so I also found a “home” at ICASSP, SSP, and Electronic Imaging. The collegial exchange within each subcommunity enriched my research. Still, it frankly took a few years for me to feel comfortable with the simple fact that my presence at each venue would always present a slight mismatch in disciplines.

If you’re a data scientist in information security, you know this feeling. There are a host of excellent venues for machine learning researchers and practitioners, some of which are mentioned above, but these are generally application-agnostic or focus predominantly on computer vision or natural language understanding themes. At the same time, security conferences like USENIX Security and VirusBulletin provide academics and practitioners a venue for deep dives into infosec, but historically haven’t been appropriate for equal depth in data science methodologies. In fact, until relatively recently, your paper about (fabricated) “Siamese neural networks vs. fuzzy imphashing of PE’s import address table” would likely be viewed as at least slightly esoteric by either community.

To be fair, the burden of communicating and educating the security industry on machine learning rests squarely on you as a data scientist. For many years, there’s been a seat at the table for high-level or introductory machine learning talks at conferences like BlackHat, DefCon, BSides, ShmooCon, DerbyCon, and a host of others. All these are excellent venues and communities whose machine learning sophistication is growing. Your data science presence at these must continue. You must be clear and articulate as you describe “new” applications of machine learning in infosec, while also largely speaking above the math (but without hype!) for a broader audience. Such a venue is not the place to geek out.

Fortunately, there *is* a burgeoning community within information security where geeking out about machine learning is welcome, even if relatively boutique.  For instance, if you’re academically inclined, then AISec is a great venue for rigorous peer review of machine learning applications in information security.  We lauded AISec last year for its singular ML-for-infosec focus.  It leans heavily academic, with great technical talks.

For a slightly broader audience that also includes infosec data science practitioners, we are excited to co-sponsor the inaugural Conference on Applied Machine Learning for Information Security (CAMLIS).  This is intended to be “in the weeds” for infosec data science practitioners, beginning “where the C-level BlackHat talk left off”.  This year, the lineup boasts data science leads and wonks from various (competing) security companies and government.  Indeed, the lineup of invited speakers may possibly be the richest collection of infosec data scientist friends and foes in any one place giving deep-dive technical talks.  The conference is in its first year, and in sharp contrast to large machine learning conferences that sell out before you realize registration is open, aims to keep the small-batch nature that invites participation and community inclusion of every attendee.

So, if you’ve felt altogether too “special” in your quest to find a conference home, then chin up! There’s a thriving community of infosec data scientists. If you’re in the DC metro area on October 28, register and meet them at CAMLIS!  You might just find a “home”!

The Bug or Feature Debate is Back Yet Again: DDEAUTO Root Cause Analysis


Over the last few years, macro-based document attacks have been growing in popularity.  With the rising cost of memory corruption based exploitation due to the required level of expertise and resources, attackers understand that they can accomplish similar results by just convincing users to click through a dialog box.  This has consequently led to more and more security vendors adding protections and detections against these macro-based document attacks. 

Enter Dynamic Data Exchange (DDE).

This newly published technique leveraging a legacy feature was quickly adopted by various groups such as FIN7, and has been employed in malware campaigns such as Vortex ransomware and Hancitor.

The best part about this new DDE attack vector is that it has all of the characteristics of a macro-based document attack, without the macro-based document.  In order to successfully launch their attack, an attacker simply needs to convince a user to click through a few dialogs and suddenly they evade all of these recent macro-based document mitigations.  Despite this, Microsoft has said they will not address this issue in current releases since it is a feature, and not a bug.

Macro-based exploits continue to persist because of their user functionality and the argument that “it’s a feature, not a bug.” However, we wanted to take a deeper look at this new DDE-based attack vector to determine whether the same characterization held true. We found that although DDE has valid usage as a feature, the aspects that make it a security problem stray a bit from how the documentation describes its intended design. Furthermore, unlike Office macros, this issue could be addressed in a way that resolves the security problem without impacting the usability of the feature for the end user. We’ll offer our recommendations both for a fix and for what can be done to help protect against this issue.

 

Background

The excellent post by SensePost first brought the DDEAUTO bug to our attention. To understand the bug, it is important to first briefly discuss Dynamic Data Exchange (DDE). DDE is a legacy Inter-Process Communication (IPC) mechanism dating back to 1987. Like all IPC mechanisms, it consists of a protocol designed to pass messages between two applications. In the case of DDE, it is further enhanced by allowing access to shared memory, either as one-time or continuous data transfers, with event-driven callbacks when new data arrives at each end. The MSDN pages contain additional information about the DDE architecture.

The SensePost blog explains how to reproduce the DDEAUTO bug – demonstrating that command execution could be achieved via DDE in a Word document. Microsoft Office provides an extension to leverage DDE inside of documents to communicate with external processes. DDEAUTO is the specific Word field abused by SensePost.  This is one of many Word field types defined in MS-DOC, Section 2.9.90 (flt).  Moreover, DDEAUTO is a WordprocessingML keyword defined in ECMA-376, Section 3.1.3.2.2 (DDEAUTO). 

From Microsoft:

"For information copied from another application, this field (DDEAUTO) links that information to its original source file using DDE and is updated automatically.  The application name shall be specified in field-argument-1; this application shall be running".

Let’s break down a canonical example of the DDEAUTO field in a document:

{ DDEAUTO excel "C:\\My Documents\\Profits.xls" "Sheet1!R1C1:R4C4" \p }

This WordprocessingML statement will import part of Sheet1 of the Profits.xls spreadsheet into the current Word document via DDE communication to the process EXCEL.EXE, which is presumed to be open.
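Because the field instruction is stored as plain text inside the document XML, files containing DDE fields can be triaged statically. Below is a hedged sketch of such a check (the helper is our own, not from the original post, and it ignores Word's full field syntax):

```python
import re
import zipfile

DDE_FIELD = re.compile(r"\bDDE(AUTO)?\b")

def docx_has_dde_field(docx):
    """Report whether the main document part of a .docx contains a DDE
    or DDEAUTO field instruction. Field text can be split across
    w:instrText runs, so strip XML tags before matching (a deliberate
    simplification)."""
    with zipfile.ZipFile(docx) as zf:
        xml = zf.read("word/document.xml").decode("utf-8", "replace")
    return bool(DDE_FIELD.search(re.sub(r"<[^>]+>", "", xml)))
```

Note this only covers the modern .docx container; legacy .doc files store fields in a binary format and need a different parser.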

SensePost discovered that instead of specifying an application like Excel, they could specify an absolute path to another application as the first parameter to DDEAUTO and quoted arguments as the second parameter.

{DDEAUTO c:\\windows\\system32\\cmd.exe "/k calc.exe"}

An attacker is free to specify arbitrary parameters, with some restrictions we will identify later.  One caveat to be mindful of is that when the DDEAUTO statement attempts to load, the user is prompted by multiple (non-security related) popups. At this time, we are unaware of a bypass for this prompt, but have not ruled out the possibility due to the complex nature of document structures. Furthermore, it also is important to note that Microsoft has encouraged developers to abandon DDE for more modern IPC mechanisms due to the fact that DDE is considered a legacy feature and is no longer widely used. In the SensePost blog, it is stated that Microsoft declined to fix the bug for current releases, deeming it a feature, likely due to what they see as excessive user interaction. Given the already growing use in APT campaigns in-the-wild, we believe Microsoft is taking a very conservative approach to this issue, and encourage an additional review of the implications if it is left unaddressed.

Despite this excellent analysis, several questions still remain: what is the cause, how can it be prevented, and is this a feature with nothing really to fix?

 

Bug or Feature?

Most write-ups of this issue have focused on the way it is triggered maliciously, and the wide variety of commands that can be executed through a malicious document. We wanted to first dive into the underlying code and better understand what the application was capable of with DDE, and what limitations an attacker may encounter when attempting to leverage this issue.  Please note, all symbol names have been inferred from the publicly available Word 1.1a source code.

The actual DDE implementation in Word, at wwlib!FGetAppTopicItemPffb, begins by adding the first, second, and third DDEAUTO field arguments to the Global Atom Table (GAT) via wwlib!AtomAddSt, a wrapper for kernel32!GlobalAddAtomW. This means each DDEAUTO argument is restricted to the maximum size of an atom string, which cannot exceed 255 characters.
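That atom-table ceiling is a concrete constraint on attacker payloads; a trivial sketch of the check (the constant reflects the documented GlobalAddAtomW limit; the function is our own illustration):

```python
MAX_ATOM_LEN = 255  # GlobalAddAtomW atom strings cannot exceed 255 characters

def fits_in_atom(field_argument):
    """Each DDEAUTO field argument must survive the wwlib!AtomAddSt round
    trip through the Global Atom Table, capping any single argument at
    255 characters and thus bounding attacker command lines."""
    return len(field_argument) <= MAX_ATOM_LEN
```

This is one of the restrictions on attacker-supplied parameters mentioned earlier.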

Next, wwlib!DclInitiate is called.  This function first grabs a global struct reference for tracking the current state of the DDE negotiation. This struct has been initialized such that the first member at offset 0 is a HWND of a window owned by winword.exe.  The second member at offset 4 is initialized to NULL, and should represent the HWND of a window we are negotiating DDE with. This struct reference is acquired by calling wwlib!PdcldDcl.

Next, inside wwlib!DclInitiate, USER32!SendMessageTimeoutW is called to broadcast the WM_DDE_INITIATE message to all top-level windows of programs running on the machine.  Each window will be given uTimeout amount of time to respond to this DDE initiate message (0x3e8 or 1000 milliseconds).

If a running application decides to respond to this DDE initiate request, then it sends back a WM_DDE_ACK message.  Word listens for this WM_DDE_ACK message while SendMessageTimeoutW is still blocking execution and waiting for the timeout. Upon receiving this WM_DDE_ACK message, the global DDE communications struct is updated, and the second member of the struct is set to the HWND of the responder. Once the second member of the struct has been set by the first WM_DDE_ACK message, subsequent WM_DDE_ACK responses receive a WM_DDE_TERMINATE message.

Once SendMessageTimeoutW returns, it grabs another reference to the DDE communications struct and checks the second member of the struct to see if another process responded to the broadcast.  If this value is set, then it is used as the HWND of the other application that DDE will perform the IPC communication with.  At this point, execution continues as normal and content is exchanged between the applications over this DDE channel.  If no process responded to the WM_DDE_INITIATE message, and the second member of the struct is still not set, then Word determines that the target process for the DDE request is not currently running and is unavailable for the IPC communication. 

The documentation on MSDN for DDE states that the target application for a DDE request should already be running when the request is made. At this point in the code, if no running process has responded to the WM_DDE_INITIATE message, the entire DDE process should stop. To make this more user-friendly, Word detects that the user has not started the target process and attempts to start it for them. This is where things go awry and deviate from the MSDN documentation.

Upon failing to receive a response from another process, execution falls through to a call to the wwlib!FLaunchAppTopic function. This function is called with the corresponding atom IDs from the broadcast message. The wrapper wwlib!StFromAtom is called to extract the strings from the specified atoms via the kernel32!GlobalGetAtomNameW API. These atom strings are then combined in the WCHAR buffer var_414_wstr_cmd shown in the code samples.

A space is appended to this WSTR variable before the second field argument is pulled from the atom table and appended to the string.

The second atom is then read out via wwlib!StFromAtom. Note: the third atom we saw registered in wwlib!FGetAppTopicItemPffb is discarded.

After the atoms are both used to construct the var_414_wstr_cmd string, wwlib!FOurRunApp is called with the constructed string passed in register ECX.

FOurRunApp assigns the pointer value to a local variable lpCommandLine, which is automatically named by IDA through type propagation.

Finally, CreateProcessW is called with lpCommandLine, resulting in command execution.

The MSDN DDE examples mention that the target of a DDE request should already be running, which makes sense because this is an IPC mechanism.  Deviation from this for usability improvements is what allows this feature to be abused by attackers.
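The flow traced above reduces to simple string concatenation. A sketch of the effective behavior (the function name is ours; the real logic lives in native wwlib code):

```python
def build_dde_command_line(app_atom, topic_atom):
    """Mirror the wwlib flow traced above: pull the first two DDEAUTO
    arguments back out of the atom table, join them with a single space,
    and hand the result to CreateProcessW as lpCommandLine. The third
    atom is discarded along the way."""
    return app_atom + " " + topic_atom

# SensePost's payload therefore becomes a literal command line:
cmdline = build_dde_command_line(r"c:\windows\system32\cmd.exe", "/k calc.exe")
```

Seen this way, the root cause is plain: user-controlled field arguments flow directly into a process-creation command line.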

This is a bug specific to the WWLIB implementation of DDE and could be fixed by following the DDE guidelines on MSDN, asking users to start the target process themselves rather than automatically doing it for them. Additionally, the prompts that Word provides could be replaced with more security-oriented wording, such as the dialog that is used by Excel before it starts an application targeted through DDE.

 

 

The downside is that experience with macro-based attacks tells us some users will always click through any dialog, regardless of its wording.  The proper fix would be to require users to start the target application themselves before clicking through the dialog, and then let Word retry the request.

 

Detection

Throughout 2017, we have seen a very common pattern among attacks exploiting Office vulnerabilities and undocumented capabilities. In most of these attacks, the initial exploit is used as a stager, with the next step executing powershell.exe or mshta.exe with an encoded download/execute payload on the command line. Based on our previous research and reverse engineering this bug, we can take two different approaches to detection and prevention – a whack-a-mole approach aimed at this specific bug, or a more holistic prevention approach targeting this class of attacks.

The first approach is to instrument Office applications with code to detect specific violations of expected behavior. Since DDE is meant to be used generically for any application to communicate with another, this issue cannot be detected by simply whitelisting what processes Word can talk to.  In this case, we could take the easier path of simply adding a protection in the Endgame exploit prevention feature that detects instances where a newly created process name and arguments both come from the DDE window messages above.

That approach is effective for this bug, but is narrowly focused. Instead, the next version of the Endgame platform delivers a more general detection capability that uses behavioral analytics to detect Office products executing suspicious child processes such as mshta.exe or powershell.exe. We can also determine whether those child processes make external network connections. Combining these two approaches offers a robust way to detect different vulnerabilities and attacks across multiple Office products.
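As an illustration of the behavioral approach (a simplified sketch, not Endgame's actual implementation or process lists), such a check might look like:

```python
# Hypothetical allow/deny sets for illustration only.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SUSPICIOUS_CHILDREN = {"powershell.exe", "mshta.exe", "cmd.exe",
                       "wscript.exe", "cscript.exe"}

def is_suspicious_spawn(parent: str, child: str,
                        makes_network_conn: bool = False) -> bool:
    """Flag an Office product spawning a script-host/LOLBin child;
    a child that also makes an external network connection is
    suspicious even if it is not on the static list."""
    if parent.lower() not in OFFICE_PARENTS:
        return False
    return child.lower() in SUSPICIOUS_CHILDREN or makes_network_conn
```

A real detection pipeline would enrich this with command-line arguments, parent lineage, and network telemetry before alerting, which is what keeps the false positive rate manageable.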

 

CONCLUSION

The rising use of macro-less document attacks is certainly something we’ll be watching for the foreseeable future. It is unfortunate that Microsoft will not work further to resolve this problem, given its increasing adoption by APTs. As we have demonstrated, DDEAUTO is one specific technique within this broader class of macro-less document attacks. Instead of creating a one-off solution, our research has identified a means to protect not only against the DDEAUTO bug, but against this entire class of attacks.


A Modern Model for Cyber Adversarial Behavior


Organizations worldwide are facing an onslaught of targeted attacks - attacks uniquely designed and executed against a specific enterprise or government entity. These attacks succeed because they outperform enterprise security programs and outpace vulnerability, patch, and configuration management programs. At the heart of this problem is an outdated attack model implicit in most security programs: many enterprises -- and the endpoint technologies that protect them -- account for only a few attacker techniques, such as malware-based attacks. These outdated frameworks lack the comprehensive scope of techniques and technologies used by today’s adversaries. Enterprises and vendors alike must adopt a modern model that accounts for this new level of sophistication.

MITRE, a not-for-profit organization operating Federally Funded Research and Development Centers, created a model with that much-needed granularity. Starting in 2015,  MITRE integrated the vast array of cyber adversarial behavior into the "Adversarial Tactics, Techniques, and Common Knowledge" (ATT&CK™) Matrix. Today, the MITRE ATT&CK™ Matrix provides the most comprehensive framework for adversarial techniques and tactics that enterprises encounter daily.  

MITRE ATT&CK™ has become a leading standard for efficacy measurement in the security community.  A growing community of developers, such as Roberto Rodriguez, has built open source tools that apply the ATT&CK™ Matrix to help security programs assess coverage and identify gaps. While Roberto’s use case for MITRE applies exclusively to threat hunting, the ATT&CK™ Matrix is effective across a range of use cases, including prevention, detection, and incident response.

 

MITRE ATT&CK vs. FIN7

The importance of testing program efficacy against MITRE ATT&CK™ can be understood by applying it to the highly impactful FIN7 attack.

 

MITRE Evaluates Endgame

Endgame recently collaborated with MITRE to go beyond the scope of malware-based efficacy and measure its performance against targeted attacks spanning the broader range of adversarial behavior. MITRE mimicked the tactics used by APT3 (a prolific Chinese APT group) to determine Endgame’s coverage of the ATT&CK™ Matrix. Endgame successfully stopped APT3 in the emulation exercise before any data theft or damage would have occurred.

At Endgame, we believe MITRE’s framework provides a far more realistic understanding of protection against targeted attacks compared to other testing regimens. We are committed to providing full transparency about the efficacy of our platform for customers and the broader community, and look forward to continuing to collaborate with MITRE to measure against more malicious attack types.

 

Innovating People, Processes, & Technology

By evaluating an enterprise security program against a more sophisticated and modern model, enterprises can identify gaps in coverage and protection, shift focus to cover those gaps, help security programs gain greater coverage, and be more proactive in countering the range of adversarial behavior.

Endgame’s prevention, detection and response, and threat hunting capabilities provide the scope, speed and simplicity to cover the entire ATT&CK™ Matrix. Moreover, with our focus on usability and augmenting the analytic workflow, the Endgame platform provides this comprehensive coverage without requiring additional resources.

If you’d like to dive deeper into how Endgame performed on MITRE’s evaluation, please reach out to us at products@endgame.com.

BadRabbit Technical Analysis


On October 12th, Ukraine’s SBU security service warned of an imminent attack against government and private institutions similar to the NotPetya attack in June. Two months earlier, the SBU made a similar warning, noting that a second wave of attacks could follow if attackers maintained covert, unauthorized privileged access. These warnings seemed to bear fruit yesterday, as a new ransomware variant called BadRabbit struck. Named after the dark web-based site where the attackers demand the ransom, BadRabbit first hit three Russian media outlets, including Interfax, as well as the Kiev metro system and Odessa airport. Subsequently, BadRabbit has hit hundreds of organizations, largely in Ukraine and Russia, but it also has spread within Europe, including Turkey and Germany, and US-CERT notes discoveries in the United States as well. The impact and research into BadRabbit remains ongoing, but already there are useful insights and missteps that have occurred. To help separate the facts from rumors, this post provides a technical deep dive into BadRabbit.

 

Similar, But Different

Similar to NotPetya, BadRabbit encrypts files using DiskCryptor and demands a ransom in Bitcoin.  However, there are some key differences.  There were initial reports that BadRabbit leveraged the EternalBlue SMB exploit to propagate, similar to WannaCry and NotPetya, but my research and that of others has since refuted this. Instead, BadRabbit uses two methods for lateral movement: WMIC and open SMB shares. Also, while NotPetya contained a wiper component, BadRabbit interestingly includes the capability of a wiper, but I haven’t seen any evidence of its use.  Finally, while WannaCry and NotPetya compromised victims through more passive behavior, BadRabbit requires the victim to actively execute the malicious file. This may be why BadRabbit is - at least initially - seemingly more contained than WannaCry or NotPetya. A vaccine for BadRabbit was also identified relatively early in the community’s analysis of the malware: if any file is present at C:\windows\cscc.dat, the dropper will fail. The BadRabbit execution flow graphic below summarizes the technical details of the subsequent sections.

 

BadRabbit execution flow
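The vaccine mentioned above can be applied with a short script. This is an illustrative sketch; the path is the one the dropper checks, and stripping write permissions (so the file is not easily replaced) is an extra precaution, not a requirement of the vaccine:

```python
import os
import stat

def apply_vaccine(path=r"C:\Windows\cscc.dat"):
    """Create the file whose presence causes the BadRabbit dropper to
    bail out, then make it read-only so it is not trivially replaced."""
    with open(path, "a"):  # create if missing, never truncate
        pass
    os.chmod(path, stat.S_IREAD)
```

Running this on an endpoint before infection prevents the dropper from writing its own cscc.dat (the DiskCryptor driver), short-circuiting the attack.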

 

 

Files

Original Name             SHA-256 Hash                                                      Description
install_flash_player.exe  630325cac09ac3fab908f903e3b00d0dadd5fdaa0875ed8496fcbb97a558d0da  Dropper
infpub.dat                579fd8a0385482fb4c789561a30b09f25671e86422f40ef5cca2036b28f99648  DLL payload
cscc.dat                  0b2f863f4119dc88a22cc97c0a136c88a0127cb026751303b045f7322a8972f6  DiskCryptor Driver (x64)
dispci.exe                8ebc97e05c8e1073bda2efb6f4d00ad7e789260afa2c276f0c72740b838a0a93  DiskCryptor Client
xxxx.tmp                  301b905eb98d8d6bb559c04bbda26628a942b2c4107c07a02e8f753bdcfe347c  Mimikatz (x64)
xxxx.tmp                  2f8c54f9fa8e47596a3beff0031f85360e56840c77f71c6a573ace6f46412035  Mimikatz (x32)

 

Chasing Down the Rabbit Hole

As the graphic illustrates, there are a series of steps that take place from compromise through encryption to ransom demand. I’ll walk through each of these steps below.

 

Initial Mode of Compromise & Propagation

The BadRabbit attack first begins when the victim receives and installs a fake Adobe Flash update.

 

Dropper Failure Message


 

Dropper Malware Details

Original Name: install_flash_player.exe

SHA-256: 630325cac09ac3fab908f903e3b00d0dadd5fdaa0875ed8496fcbb97a558d0da

Version 1.2.8

Main EntryPoint

  1. Get command line arguments
  2. If the number of arguments does not equal 1:
    1. Store the argument
  3. If the argument equals 1:
    1. Load the string “15”
    2. Get System Directory C:\\windows\\system32\\rundll32.exe
    3. Create and decrypt the payload (install_flash_player.exe:0x4010C0)
      1. Store the encrypted data located at an offset within itself
      2. Allocate space on the heap for the size of that file
      3. Copy 0x5ABA3 bytes starting at offset 0xDE00
      4. Allocate new space on heap
      5. Decrypts DLL into memory of size 0x64488
    4. Create and save DLL as C:\\Windows\\infpub.dat (install_flash_player.exe:0x401260)
    5. Launch new process "C:\\Windows\\system32\\rundll32.exe C:\\Windows\\infpub.dat,#1 15”
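The carve step in 3 above can be sketched as follows. The offset and sizes come from this analysis of the dropper; the decryption routine itself is not reproduced here, so this only extracts the encrypted blob:

```python
PAYLOAD_OFFSET = 0xDE00   # offset of the encrypted DLL within the dropper
PAYLOAD_SIZE   = 0x5ABA3  # number of bytes copied before decryption

def carve_payload(dropper: bytes) -> bytes:
    """Extract the encrypted infpub.dat blob from the dropper image.
    (Decryption to the 0x64488-byte DLL is omitted.)"""
    return dropper[PAYLOAD_OFFSET:PAYLOAD_OFFSET + PAYLOAD_SIZE]
```

Given a copy of install_flash_player.exe, this yields the ciphertext that the dropper decrypts in memory and writes out as C:\Windows\infpub.dat.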

 

DLL Payload

Original Name: infpub.dat

SHA-256: 579fd8a0385482fb4c789561a30b09f25671e86422f40ef5cca2036b28f99648

EntryPoint Function #1 (infpub.dat:infpub_1)
  1. Adjust token to add SeShutdownPrivilege, SeDebugPrivileges, SeTcbPrivilege  (infpub.dat:0x7897)
  2. Allocate an executable memory space using VirtualProtect
  3. Create Mutex "%08X%08X"
    1. I haven’t seen the mutex trigger so its purpose is unknown.
  4. Load Resource “7” File and save as cscc.dat (infpub.dat:0x7E8E).  This is the DiskCryptor driver.
  5. Load DiskCryptor Client and Start Service (infpub.dat:0x10A7)
    1. Load Resource “9” File and save as dispci.exe (infpub.dat:0x8313).  This is the DiskCryptor client
    2. Start up schtasks (infpub.dat:0x1000)
      1. Run command “schtasks /Delete /F /TN rhaegal” - This deletes any existing scheduled task named “rhaegal”
      2. The -id parameter is the randomly generated key. Run command “/c schtasks /Create /RU SYSTEM /SC ONSTART /TN rhaegal /TR "C:\Windows\system32\cmd.exe /C Start \"\" \"C:\Windows\dispci.exe\" -id 3127978853 && exit".  This causes the DiskCryptor client to launch on startup of the system.
    3. Create Service “cscc” for the DiskCryptor Driver (infpub.dat:0x1531).
      1. Call OpenSCManagerW
      2. Binary path name “cscc.dat”
      3. Service Name “Windows Client Side Caching DDriver”
      4. Set Regkey = SYSTEM\\CurrentControlSet\\services\\cdfs
        1. Imagepath “cscc.dat”
      5. LowerFilters: SYSTEM\CurrentControlSet\Control\Class\{71A27CDD-812A-11D0-BEC7-08002BE2092F}
      6. UpperFilters: SYSTEM\CurrentControlSet\Control\Class\{4D36E965-E325-11CE-BFC1-08002BE10318}
      7. DumpFilters: SYSTEM\CurrentControlSet\Control\CrashControl\csccdumpfve.sys
  6. Start the socket connection with WSAStartup
  7. Get Command Line Arguments from options [ -h, -f ] (infpub.dat:0x652F)
  8. Get Server Info (infpub.dat:0x7DD0)
  9. Schedule Shutdown with persistence (infpub.dat:0x8192)
    1. Run command “shutdown.exe /r /t 0 /f”.  This causes a forced reboot with no delay and forces running applications to close.
    2. Run Command “/c schtasks /Create /SC once /TN drogon /RU SYSTEM /TR "C:\Windows\system32\shutdown.exe /r /t 0 /f" /ST 09:29:00”.  This causes a task to be created which schedules a shutdown for 09:29 for the local system. The system needs to reboot in order to install the DiskCryptor driver.
  10. Create an Event and Start Thread   (infpub.dat:0x8A6F)
    1. Run command “schtasks /Delete /F /TN drogon” which deletes the previously scheduled shutdown/reboot task.
  11. Route 1: File Encryption
    1. Derive and setup symmetric encryption key from the hardcoded public key (infpub.dat:0x554A, 0x636B)
      1. Generate AES key with CryptGenRandom
      2. Public key (See Appendix)
    2. Create a thread to start encryption (infpub.dat:0x6299)
      1. Create the Readme.txt ransom note file. (See Appendix)
      2. Ignored paths:
        1. \\Windows
        2. \\Program Files
        3. \\ProgramData
        4. \\AppData
      3. Start encrypting files which have a targeted file extension (See Appendix)
      4. Uses "encrypted" as a part of the encryption header
      5. Display ransom note
  12. Route 2: Lateral Movement
    1. Start thread to connect to the service (infpub.dat:0x77D1)
      1. Get localhost address
      2. Create Thread
        1. GetAdaptersInfo
        2. CreateThread
      3. Load Iphlpapi.dll and call GetExtendedTcpTable to retrieve a table that contains a list of TCP endpoints available to the application
      4. Get IP address
      5. Get Domain and Server
      6. Duplicate process token
      7. Thread: Connect to the service
    2. Create Mimikatz and pipe (infpub.dat:0x7146)
      1. Load Resource and save as temp file xxxx.tmp based on architecture (x86/x64)
      2. Create process from temp file with pipe: C:\\WINDOWS\\xxxx.tmp \\\\.\\pipe\\{GUID}
        1. Example: “C:\Windows\E287.tmp" \\.\pipe\{FA577FE2-92A2-47EF-8EAF-1016B5B22B72}”
      3. Setup Co task memory
    3. EstablishConnection to admin$
      1. Duplicate Process Token and Set thread tokens (infpub.dat:0xA3B1)
      2. Get network resource from server
      3. Enumerate credentials for "TERMSRV/" (infpub.dat:0xA016)
      4. Setup connection via "wbem\\wmic.exe" with username and password (infpub.dat:0x9F7A)
      5. Open Service to \\\\%s\\admin$ to access a remote machine via admin shares, calls NetAddConnection2, and connects to server  (infpub.dat:0x9534)
      6. Set the binary path for cscc.dat and copy binary to remote system
        1. http://<IPAddress>/admin$/infpub.dat
        2. http://<IPAddress>/admin$/cscc.dat
      7. Run remote command C:\Windows\System32\rundll32.exe "C:\Windows\infpub.dat,#2 “  (infpub.dat:0x944F)
    4. Traverse SMB Shares (infpub.dat:0xA420)
      1. Connect to shares via socket connection
      2. Test hardcoded usernames and passwords against IPC$ shares
      3. Copy files over via ADMIN$
    5. Continue back to Route 1: File Encryption.
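The dictionary attack in step 4 above pairs hardcoded usernames and passwords (excerpted here; the full lists are in the Appendix). A minimal sketch, where the `try_ipc_login` callback stands in for the actual SMB authentication attempt:

```python
from itertools import product

# Excerpts of BadRabbit's hardcoded credential lists (see Appendix).
USERNAMES = ["Administrator", "Admin", "Guest", "User", "root"]
PASSWORDS = ["Administrator", "123456", "password", "qwerty"]

def spray(hosts, try_ipc_login):
    """Try every hardcoded username/password pair against the IPC$
    share of each discovered host, yielding the first hit per host."""
    for host in hosts:
        for user, pwd in product(USERNAMES, PASSWORDS):
            if try_ipc_login(host, user, pwd):
                yield host, user, pwd
                break  # move on to the next host after a success
```

This weak, static credential list is one reason the malware spreads well in flat networks with shared local-administrator passwords, and poorly elsewhere.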

 

EntryPoint Function #2 (infpub.dat:infpub_2)
  1. Run the command "C:\\Windows\\system32\\rundll32.exe C:\\Windows\\infpub.dat,#1 %ws”

 

DiskCryptor Client

Original Name: dispci.exe

SHA-256: 8ebc97e05c8e1073bda2efb6f4d00ad7e789260afa2c276f0c72740b838a0a93

 

Main Entry Point

  1. Check for C:\\Windows\\cscc.dat (dispci.exe:0x4052D0)
    1. Set up buffer to connect to DiskCryptor driver  \\.\dcrypt
      1. Set up hooks with callback functions
        1. SetWindowsHookEx
        2. function  fn(int code, WPARAM wParam, LPARAM lParam)
        3. function sub_403FC0(int code, WPARAM wParam, LPARAM lParam)
      2. Get raw disk access
        1. \\.\GLOBALROOT\ArcName\multi(0)disk(0)rdisk(0)partition(1)
      3. Call to Driver DeviceIOControl Code 0x220040 (DC_CTL_RESOLVE)
  2. Run command “schtasks /Delete /F /TN rhaegal”
  3. Take one of two routes: disk decryption, or setup and encryption.
  4. Route 1: Disk Encryption
    1. Set a control handler that allows the writer to be shut down SetConsoleCtrlHandler
      1. Create a COM connection
      2. Get the “DECRYPT” to run the console for decryption
      3. Use \\Desktop\\DECRYPT.lnk
    2. Create Scheduled Tasks
      1. First run command “schtasks /Delete /F /TN drogon”
      2. Loop in new thread: schedule shutdown with persistence
        1.  “shutdown.exe /r /t 0 /f”
        2. Run command “schtasks /Create /SC ONCE /TN viserion_%u /RU SYSTEM /TR %ws" /ST %02d:%02d:00”
        3. Run command “schtasks /Delete /F /TN viserion”
    3. Access driver communication buffer and encrypt (dispci.exe:0x405370)
      1. Communication buffer (dispci.exe:0x402020)
        1. Access physical device  \\.\dcrypt
        2. DeviceIOControl Control code 0x220060 (DC_CTL_LOCK_MEM)
      2. Generate AES key with CryptGenRandom (dispci.exe:0x4012A0)
      3. Encrypt files with public key (dispci.exe:0x4015A0)
        1. Public Key (See Appendix)
        2. Encrypt data from server buffer
      4. Encrypt raw disk and open resource file
        1. Send DeviceIOControl Control code 0x220058 (DC_CTL_GET_FLAGS)
        2. Get raw disk access \\.\GLOBALROOT\ArcName\multi(0)disk(0)rdisk(0)partition(1)
        3. Access raw disk routine: Send DeviceIOControl Control codes:
          1. 0x70048 (IOCTL_DISK_GET_PARTITION_INFO_EX)
          2. 0x74004 (IOCTL_DISK_GET_PARTITION_INFO)
          3. 0x7405C (IOCTL_DISK_GET_LENGTH_INFO)
          4. 0x2D1080 (IOCTL_STORAGE_GET_DEVICE_NUMBER)
          5. 0x560000 (IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS)
        4. Open \\\\.\\PhysicalDrive0 and DeviceIOControl Control code 0x70000 (IOCTL_DISK_GET_DRIVE_GEOMETRY)
        5. Send DeviceIOControl Control code 0x700A0 (IOCTL_DISK_GET_DRIVE_GEOMETRY_EX)
        6. Read from handle of device in 0x200 byte chunks
      5. Open Resource Files Ransom notes (dispci.exe:0x402800)
        1. Open resource files as EXEFILE
          1. 0x8B (bootloader)
          2. 0x8C (Kernel Component) or 0x8D (Kernel Component)
        2. Load resource in memory
        3. Open and Read Raw disk and Send DeviceIOControl Control code 0x700A0 (IOCTL_DISK_GET_DRIVE_GEOMETRY_EX)
        4. Check the filesystem type [NTFS,FAT12,FAT16,FAT32,EXFAT]
        5. Read and write to file
      6. Send DeviceIOControl Control code 0x220064 (DC_CTL_UNLOCK_MEM) to driver
      7. Send DeviceIOControl Control codes:
        1. 0x22003C(DC_CTL_SYNC_STATE)
        2. 0x22001C (DC_CTL_STATUS)
        3. 0x220034 (DC_CTL_ENCRYPT_STEP)
        4. 0x220008 (DC_CTL_CLEAR_PASS)
      8. DeviceIOControl Control codes:
        1. 0x220060 (DC_CTL_LOCK_MEM)
        2. 0x220028 (DC_CTL_ENCRYPT_START)
        3. 0x220064 (DC_CTL_UNLOCK_MEM)
    4. Send DeviceIOControl Control code 0x220008 (DC_CTL_CLEAR_PASS)to driver
    5. Wait for the encryption to finish with WaitForSingleObject and Sleep
    6. Shutdown (dispci.exe:0x405BF0)
      1. Run command “shutdown.exe /r /t 0 /f”
  5. Route 2: Disk Decryption
    1. Start Disk Decryption Logging (dispci.exe:0x405510)
      1. DeviceIOControl Control codes:
        1. 0x22003C (DC_CTL_SYNC_STATE)
        2. 0x220038 (DC_CTL_DECRYPT_STEP)
    2. Access Raw Disk Routine: Send DeviceIOControl Control codes:
      1. 0x70048 (IOCTL_DISK_GET_PARTITION_INFO_EX)
      2. 0x74004 (IOCTL_DISK_GET_PARTITION_INFO)
      3. 0x7405C (IOCTL_DISK_GET_LENGTH_INFO)
      4. 0x2D1080 (IOCTL_STORAGE_GET_DEVICE_NUMBER)
      5. 0x560000 (IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS)
    3. Send DeviceIOControl Control codes:
      1. 0x220058 (DC_CTL_GET_FLAGS)
      2. 0x220060 (DC_CTL_LOCK_MEM)
      3. 0x22002C (DC_CTL_DECRYPT_START)
      4. 0x220064 (DC_CTL_UNLOCK_MEM)
    4. Check for AntiVirus and initiate decryption with provided password
      1. Send DeviceIOControl control code 0x220020 (DC_CTL_ADD_SEED)
      2. Asks “Disable your anti-virus and anti-malware programs”
      3.  Drops Readme.txt in rootpath “C:\Readme.txt”
      4. Create the readme.txt file from resources
  6. After both routes are complete, run the DECRYPT console
    1. Call CoInitialize
    2. Get the “DECRYPT” to run the console for decryption
    3. Use \\Desktop\\DECRYPT.lnk
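The DeviceIoControl codes listed throughout this routine follow the standard Windows CTL_CODE bit layout, so they can be decoded mechanically (the DC_CTL_* names come from the DiskCryptor sources):

```python
def decode_ioctl(code: int):
    """Split a Windows IOCTL into its CTL_CODE components:
    DeviceType (bits 16-31), Access (bits 14-15),
    Function (bits 2-13), Method (bits 0-1)."""
    return (code >> 16,          # device type
            (code >> 14) & 0x3,  # required access
            (code >> 2) & 0xFFF, # function number
            code & 0x3)          # transfer method

# 0x220040 (DC_CTL_RESOLVE): FILE_DEVICE_UNKNOWN (0x22),
# FILE_ANY_ACCESS, function 0x10, METHOD_BUFFERED.
assert decode_ioctl(0x220040) == (0x22, 0, 0x10, 0)
```

Decoding the codes this way is how the DC_CTL_* names above were matched against DiskCryptor's driver interface: all of the 0x22xxxx codes target the custom \\.\dcrypt device, while the 0x7xxxx and 0x2Dxxxx codes are standard disk and storage IOCTLs.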

 

DiskCryptor Driver

Original Name: cscc.dat

SHA-256: 0b2f863f4119dc88a22cc97c0a136c88a0127cb026751303b045f7322a8972f6

The code is verbatim from https://diskcryptor.net and https://github.com/smartinm/diskcryptor. This is not malware; it is yet another example of legitimate software being used for nefarious means in a malware attack.

 

DeviceIOControl Codes:


 

The Countdown Clock Begins

Acquiring the decryption key from the onion server caforssztxqzf2nm[.]onion

Bad Rabbit countdown clock

 

Summary

BadRabbit joins WannaCry and NotPetya among the list of global ransomware attacks in 2017. However, there are many differences between BadRabbit and the other attacks that are missed when simply lumping them all together. BadRabbit does not use the EternalBlue exploit, but demonstrates yet again how these attacks continue to evolve their evasive techniques. I’ll be keeping an eye on BadRabbit, and future variants, as these attacks evolve. The Appendix below provides additional information, including those Game of Thrones and Hackers references that are making headlines. And as always, be wary of pop-up updates, which are an incredibly popular mode of compromise.

 

Appendix

Public Key

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA5clDuVFr5sQxZ +feQlVvZcEK0k4uCSF5SkOkF9A3tR6O/xAt89/PVhowvu2TfBTRsnBs83 hcFH8hjG2V5F5DxXFoSxpTqVsR4lOm5KB2S8ap4TinG/GN/SVNBFwllpR hV/vRWNmKgKIdROvkHxyALuJyUuCZlIoaJ5tB0YkATEHEyRsLcntZYsdw H1P+NmXiNg2MH5lZ9bEOk7YTMfwVKNqtHaX0LJOyAkx4NR0DPOFLDQONW 9OOhZSkRx3V7PC3Q29HHhyiKVCPJsOW1l1mNtwL7KX+7kfNe0CefByEWf SBt1tbkvjdeP2xBnPjb3GE1GA/oGcGjrXc6wV8WKsfYQIDAQAB

 

File Extensions

3ds, 7z, accdb, ai, asm, asp, aspx, avhd, back, bak, bmp, brw, c, cab, cc, cer, cfg, conf, cpp, crt, cs, ctl, cxx, dbf, der, dib, disk, djvu, doc, docx, dwg, eml, fdb, gz, h, hdd, hpp, hxx, iso, java, jfif, jpe, jpeg, jpg, js, kdbx, key, mail, mdb, msg, nrg, odc, odf, odg, odi, odm, odp, ods, odt, ora, ost, ova, ovf, p12, p7b, p7c, pdf, pem, pfx, php, pmf, png, ppt, pptx, ps1, pst, pvi, py, pyc, pyw, qcow, qcow2, rar, rb, rtf, scm, sln, sql, tar, tib, tif, tiff, vb, vbox, vbs, vcb, vdi, vfd, vhd, vhdx, vmc, vmdk, vmsd, vmtm, vmx, vsdx, vsv, work, xls, xlsx, xml, xvd, zip
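A quick way to check whether a given file would be touched by the file-encryption routine (using an excerpt of the extension list above):

```python
# Excerpt of BadRabbit's hardcoded target extensions (full list above).
TARGETED_EXTENSIONS = {
    "doc", "docx", "xls", "xlsx", "pdf", "jpg", "png", "zip", "rar",
    "vmdk", "vhd", "sql", "py", "c", "h", "java",
}

def is_targeted(filename: str) -> bool:
    """Return True if the file's extension appears in the malware's
    hardcoded list of extensions to encrypt."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return ext in TARGETED_EXTENSIONS
```

Note what the list emphasizes: documents, source code, databases, and virtual-machine images, i.e., data a victim is likely to pay to recover, while executables and the system paths listed above are skipped so the machine keeps running long enough to display the ransom note.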

 

Readme.txt Contents Example

Oops! Your files have been encrypted. If you see this text, your files are no longer accessible. You might have been looking for a way to recover your files. Don't waste your time. No one will be able to recover them without our decryption service. We guarantee that you can recover all your files safely. All you need to do is submit the payment and get the decryption password. Visit our web service at caforssztxqzf2nm.onion Your personal installation key#2: Zu3T6///6gTViRsNAWpMUmUAvuseFAhcG/ppEt4WiB+OwqRtZjNJvPbCDn2r20V5 Wn70lrtUES38dabDQMDhLp6ZjSWeCSOk4ek6FL0qF+CwhM9i2mxLsa4DAlxIFunp QatuxDD6AQTsl7OiheHy1/FG9gXQ10+aeXj8B7PIT51T2Iuw/UWNN2iGzvnCMOhZ /DXTL66SfbtyxWfHd9Pvo4S7p5HDlv4SWiQdPHkOidRQqccHHEDD6urvBxQJuSwe tqCBxPwx+B6uwQ/Znco9f+nRxOiX3a0OG/c6xg4+W6qbRqden+4fL1VUsPQSmhod fMt0iuW2ACJ2BFgfvkpZ2MchA8OS7+mGKw==

 

UserNames

"Administrator"

"Admin"

"Guest"

"User"

"User1"

"user-1"

"Test"

"root"

"buh"

"boss"

"ftp"

"rdp"

"rdpuser"

"rdpadmin"

"manager"

"support"

"work"

"other user"

"operator"

"backup"

"asus"

"ftpuser"

"ftpadmin"

"nas"

"nasuser"

"nasadmin"

"superuser"

"netguest"

"alex"

 

Passwords

"Administrator"

"administrator"

"Guest"

"guest"

"User"

"user"

"Admin"

"adminTest"

"test"

"root"

"123"

"1234"

"12345"

"123456"

"1234567"

"12345678"

"123456789"

"1234567890"

"Administrator123"

"administrator123"

"Guest123"

"guest123"

"User123"

"user123"

"Admin123"

"admin123Test123"

"test123"

"password"

"111111"

"55555"

"77777"

"777"

"qwe"

"qwe123"

"qwe321"

"qwer"

"qwert"

"qwerty"

"qwerty123"

"zxc"

"zxc123"

"zxc321"

"zxcv"

"uiop"

"123321"

"321"

"love"

"secret"

"sex"

"god"

 

Service Names

"atsvc"

"browser"

"eventlog"

"lsarpc"

"netlogon"

"ntsvcs"

"spoolss"

"samr"

"srvsvc"

"scerpc"

"svcctl"

"wkssvc"

Multidisciplinary Innovation for Better Defenses


Five years ago, the Strata Conference hosted a panel debating the value of domain expertise versus machine learning skills in data science. The machine learning side won. The debate continues today, and not just in data science: there is frequently news of AI-powered robots on track to replace humans across most industries. In security, this contention generally manifests as some new AI-powered tool that single-handedly stops all digital attacks. While this would certainly be a welcome surprise, security still requires a human in the loop - and not solely experts in computer science, but experts across a range of disciplines. Machine learning isn’t the only technological innovation that will shape security for the foreseeable future; human-computer interaction will also be key to truly innovating security and strengthening defense resiliency. It will require all disciplines on deck to build and apply these defenses.

 

The Data Challenge

Until relatively recently, the overwhelming volume, velocity, variety, and veracity of infosec data remained a natural, but underexplored, data science challenge. This has started to change and, as often happens, the pendulum has swung and is trying to overcorrect. Data science expertise has been designated the sexiest job in the twenty-first century, and security is definitely one area where both attackers and defenders are increasingly integrating data science. Data scientists are essential to tackle a range of security challenges, including malware classification, outlier detection, structuring the data pipeline, and even providing natural language query capabilities. Interestingly, most operationally successful use cases require some level of coordination between data scientists and domain experts. In cases of semi- and supervised machine learning, the domain experts are necessary to ensure the parameters are properly scoped, and to help update and train the model. In addition, an underappreciated, but vital, role of domain experts is ensuring that data scientists are addressing the key pain points for defenders. Domain experts also comprehend biases inherent in data, and provide feedback to ensure algorithms don’t reinforce biases, while simultaneously data scientists craft solutions to provide new insights for domain experts and serve as force multipliers for that expertise. In short, five years after that Strata panel, to build defenses it is not an either/or scenario - both domain experts and data scientists are key to innovating defenses.

 

The Usability Factor

Unlike most tech industries, security has yet to fully embrace the value of user experience and human-computer interaction (HCI). Instead, many interfaces remain clunky, difficult to use, or require proprietary scripting for even simple queries. Too frequently, users are berated for their inability to use the tools, as opposed to making the tools more accessible. Couple this mentality with a growing skills and workforce shortage, and it becomes exceedingly clear that improved HCI could have a big impact on security.

Fortunately, HCI is slowly creeping into security, prompting the integration of user experience professionals, visual designers, data scientists, and domain experts. For instance, alert fatigue is a well-known challenge for defenders, as they must prioritize which of the ever-growing number of alerts to respond to first, and which may end up being ignored all together. A combination of data science, design, and workflow improvements could help make this more manageable by enabling user-defined priorities, context, and easier query capabilities, for instance. In fact, user experience professionals can interview and work with domain experts to enhance the entire workflow across a range of use cases. From improved tooltips to simplified data querying and visualization, more usable and intuitive interfaces not only optimize and enhance the workflow of current defenders, but HCI can make security more accessible for a broader range of defenders as well.

 

Domain Expertise Still Required

Returning to the Strata debate, regardless of the best data science and usability improvements, domain expertise remains essential. This expertise extends both into the technical coding aspect of defense, as well as the analytic side. Malware researchers, reverse engineers, and experts in offensive techniques and exploitation all are essential for understanding and stopping adversarial behavior. Similarly, threat intelligence analysts are necessary for campaign-level insights and identifying the objectives and intent behind the attacks. Each of these, in turn, also are essential to ensure the data scientists and user experience professionals craft the appropriate parameters and workflows into their work.

These are only some of the more technical disciplines where domain expertise will remain vital. Defense must also be viewed through a socio-technical lens, including the legal, policy and privacy domain expertise required to establish the appropriate regulations, rules of the road, and protections necessary for stronger defenses. In addition, since every company is a tech company these days, organizations need cultural shifts in security awareness within their organizations. Experts across a range of disciplines from organizational theory to marketing to security experts can work together to instigate a security culture in ways that can actually resonate within the workforce (as opposed to those click-through trainings that can easily be gamed).

 

Multidisciplinary Innovation for Better Defenses

In short, the path to better defenses is through a multidisciplinary approach, integrating innovations across a range of disciplines. From data complexities to usability to the proliferation of new techniques and capabilities, crafting better defenses is truly a multidisciplinary challenge. For the most part, security is still perceived as a career path only for experts in offensive techniques, reinforcing stereotypes and contributing to the workforce shortage. However, this is changing, as conferences such as O’Reilly Security bring together cutting-edge research across disciplines to improve defenses. For instance, Endgame data scientists Rich Seymour and Bobby Filar will be presenting there next week on usable security, combining data science with user experience and domain expertise to address key pain points in the user workflow. Multidisciplinary, socio-technical solutions are increasingly the key to building stronger defenses, and they require a range of perspectives, insights, and innovations.

Falling into the TRAP: How the Endgame Platform Stops BadRabbit


BadRabbit is the latest auto-propagating ransomware making the rounds and disrupting organizations.  We previously went deep into the technical details.  This post will describe our testing of BadRabbit in the presence of our endpoint protection platform.  I didn’t want to rush to join the pack of self-congratulatory “Look, Vendor X would have protected you!” posts and emails, which make bold, sparse, and often difficult-to-verify claims.  To be sure, a high level of skepticism is warranted when every vendor claims to stop every attack, yet such attacks succeed with mass impact on a regular basis.  Even worse, the detection techniques described in such posts (if one is described at all) may seem exceedingly false positive-prone when considering how the method would apply in a real-world network.

I don’t want to fall into this trap of self-aggrandizement with little value added to the conversation. Instead, I will detail how things unfold when BadRabbit is executed on an endpoint protected by Endgame. I will highlight the core techniques within BadRabbit and how Endgame stops them, while also contributing to the technical conversation and overall understanding of the attack.

 

Endgame’s Protection at Work

So would Endgame have stopped BadRabbit?  Yes.  Three of our signatureless, inline preventions come into play against this attack, and I’ll address each in detail.  None of these preventions were altered after the release of BadRabbit; all of them existed in the form described below prior to its discovery.

 

Malware Prevention

The screenshot below is from a virtual machine where we ran BadRabbit with all our preventions turned on.  Endgame MalwareScoreTM prevents the BadRabbit payload DLL (infpub.dat, 579fd8a0385482fb4c789561a30b09f25671e86422f40ef5cca2036b28f99648) from being loaded and executed by rundll32.  Importantly, this was effective with no updates to our software such as signature or model updates.  If you have VirusTotal Intelligence, you can see that Endgame designated this file as malicious with high confidence the first time it was seen in the wild.

Screenshot of Blocking DLL

 

What is Endgame MalwareScoreTM? MalwareScoreTM is our machine learning-powered malware classifier.  Based on gradient boosted decision trees, it blocks execution of malicious files and loads of malicious DLLs inline, on the endpoint, requiring only milliseconds to classify malware. This is done pre-execution so the malicious file(s) never have a chance to load and do damage.
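To make the mechanics concrete, here is a minimal sketch of how inference works in a gradient boosted tree classifier: each tree contributes a real-valued margin, the margins are summed, and a sigmoid squashes the sum into a score. The two toy trees, the feature names, and the threshold below are entirely hypothetical illustrations, not Endgame’s actual model or feature set.

```python
import math

# Hypothetical PE features; the real feature set is not public.
sample = {"section_entropy": 7.6, "import_count": 3, "has_signature": 0}

# Two toy regression trees standing in for the thousands in a real
# gradient boosted model: each maps features to a real-valued margin.
def tree_1(f):
    # High-entropy sections (often packed code) push the score up.
    return 1.4 if f["section_entropy"] > 7.0 else -0.8

def tree_2(f):
    # Tiny import tables on unsigned binaries are a suspicious combination.
    return 0.9 if f["import_count"] < 10 and not f["has_signature"] else -0.5

def malware_score(f):
    # Boosted ensembles sum the trees' margins, then squash with a
    # sigmoid to produce a probability-like score in [0, 1].
    margin = tree_1(f) + tree_2(f)
    return 1.0 / (1.0 + math.exp(-margin))

score = malware_score(sample)
print(round(score, 3))                      # a score near 1.0 flags the file
print("malicious" if score > 0.5 else "benign")
```

Because the whole computation is a handful of threshold comparisons and additions, scoring a file takes microseconds to milliseconds, which is what makes inline, pre-execution blocking practical.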

Our researchers and data scientists work tirelessly every day to improve this capability.  To build a new candidate model, we train on tens of millions (and growing) of carefully chosen malicious and benign samples.  Building a good model is a significant undertaking: the model currently in production took about 6,700 processor hours to build and is backed by years of iterative research and experimentation dedicated to finding the optimal model parameters. Before shipping a model, we conduct a variety of tests, from analysis of samples whose benign/malicious labels change between models, to analysis of the model’s performance against previous misses, to comparative testing against new in-the-wild samples. This is as much an art as a science.  Model updates ship when we determine we can improve detection efficacy, lower false positive rates, or, usually, both.  This generally happens at least monthly, and updates are applied either automatically via the cloud or through a one-line privileged user command from our platform for customers without cloud connectivity.
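The kinds of pre-ship checks described above can be sketched as a small comparison harness. Everything here is illustrative: the sample hashes, verdicts, and corpus are made up, and this is not Endgame’s actual evaluation pipeline.

```python
# Hypothetical verdicts from the current and candidate models on a
# labeled corpus: (sample_id, true_label, old_verdict, new_verdict),
# where 1 = malicious, 0 = benign.
corpus = [
    ("a1", 1, 1, 1),
    ("b2", 1, 0, 1),   # a previous miss, now caught by the candidate
    ("c3", 0, 0, 0),
    ("d4", 0, 1, 0),   # an old false positive, fixed by the candidate
    ("e5", 0, 0, 0),
]

def flips(rows):
    # Samples whose verdict changed between models are queued for review.
    return [sid for sid, _, old, new in rows if old != new]

def false_positive_rate(rows, col):
    # col 2 = old model's verdict, col 3 = candidate's verdict.
    benign = [r for r in rows if r[1] == 0]
    return sum(1 for r in benign if r[col] == 1) / len(benign)

print(flips(corpus))                    # verdict flips needing analysis
print(false_positive_rate(corpus, 2))   # current model's FP rate
print(false_positive_rate(corpus, 3))   # candidate model's FP rate
```

A candidate like this one, which recovers a previous miss and lowers the false positive rate, is the kind of result that justifies a model update.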

That’s all to say that a lot goes into building a malware classifier that performs excellently against both well-known and, perhaps more importantly, never-before-seen malware like BadRabbit.  As we saw with WannaCry and NotPetya, Endgame MalwareScoreTM is effective in stopping emergent and highly damaging ransomware and other forms of malware.

Below is a screenshot of how the alert appears to a user of our management console when MalwareScoreTM blocks BadRabbit.  There are no child processes or other activities to be seen.  The ransomware is stopped dead in its tracks.

 

Screenshot of the MalwareScoreTM Alert after Blocking DLL

 

Process Injection Prevention

Next, we tested what BadRabbit would do against the rest of our protection platform.  We put MalwareScoreTM prevention into detection-only mode so we could analyze what the malware does next.  Interestingly, a process injection prevention fired.  The screenshot below demonstrates how it looks to the user of our console.

Screenshot of Process Injection Prevention Alert

 

This attack timeline shows the child processes of rundll32.exe before the injection prevention, as well as the file writes and deletes caused by the process.

There has been little to no public discussion of BadRabbit’s in-memory defense evasion techniques. BadRabbit’s payload DLL (infpub.dat) remaps itself into a new (unbacked) memory region within its running rundll32 process. It does this by first allocating new memory with VirtualAlloc and copying the entire contents of the payload DLL’s address space to the new region. Next, it parses the relocation table and performs the fix-ups required for the new base address where the executable will live. After that, it updates the page protections of the new memory to match what is specified in the executable header (e.g., the .text section is marked executable). Execution then jumps to this new memory region. Immediately after this jump, the original payload DLL is unloaded from rundll32’s address space, and the payload file on disk is overwritten with null bytes and then deleted completely. The executable then attempts to run from the new unbacked memory region.  The Endgame injection prevention takes place at this step, identifying the execution as abnormal.
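The relocation step in that sequence is standard PE base relocation: every relocation entry names an offset holding an absolute pointer, and each such pointer must be adjusted by the delta between the image’s original base and the new region’s address. The sketch below models only that arithmetic in toy form; the addresses and offsets are invented, and real code walks IMAGE_DIRECTORY_ENTRY_BASERELOC blocks in the copied image rather than a Python dict.

```python
# Toy model of PE base relocation, the fix-up step performed after the
# payload copies itself into newly allocated memory. Addresses are made up.
ORIGINAL_BASE = 0x10000000      # base the DLL was loaded at
NEW_BASE      = 0x00520000      # the VirtualAlloc'd region it copied into

# Simplified image: offset -> absolute pointer stored at that offset.
image = {0x1F40: 0x10003200, 0x2A08: 0x10007F10}
relocation_offsets = [0x1F40, 0x2A08]   # stand-in for the .reloc entries

def apply_relocations(image, offsets, old_base, new_base):
    delta = new_base - old_base
    for off in offsets:
        # Each relocation entry means: add the base delta to the
        # absolute address stored at this offset.
        image[off] += delta
    return image

apply_relocations(image, relocation_offsets, ORIGINAL_BASE, NEW_BASE)
print(hex(image[0x1F40]))   # 0x523200: now points into the new region
```

Once every stored pointer targets the new region, the copy can execute correctly at its new address even though nothing on disk backs that memory, which is exactly the unbacked-execution condition the injection prevention flags.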

The BadRabbit authors perhaps used this behavior to hamper forensic analysis of infected machines through deletion of the payload on-disk. The technique used is more sophisticated than traditional batch scripts used for self-deletion.  It also underscores the importance of detection across the entire attack lifecycle, and not just tracking the initial malware components.

It should be noted that this particular prevention alone does not entirely stop the ill effects of the BadRabbit malware because of how it is written.  The main thread of execution creates the scheduled tasks for shutdown and for spawning the diskcryptor client malware on reboot before migrating to stealthier in-memory-only execution blocked by the process injection prevention. However, our user would have been alerted to an issue on this endpoint and a SOC team could begin an investigation.  Additionally, stopping this suspicious execution from unbacked memory blocks the credential dumping step described in the next section which would help control the malware’s spread.  

Overall, this is a good example of a single layer in our overall detection and protection system, even though in this case this layer doesn’t prevent the ultimate effect of file encryption on the targeted system due to the specific sequence of actions taken by the BadRabbit ransomware. Keep in mind, though, that with MalwareScoreTM on, it would never get this far.

 

Credential Dumping Prevention

Now let’s allow both the malicious DLL payload and the process injection by putting those protections into detection-only mode.  When we run BadRabbit in this configuration, a credential dumping prevention alert appears, as demonstrated in the screenshot below.

Screenshot of Credential Dumping Alert

 

Endgame credential dumping prevention blocks the techniques used by BadRabbit when it attempts to dump credentials.  This hampers the ability of the malware to move laterally, containing its impact in most environments.

Finally, let’s turn off all Endgame prevention capabilities so we can see what the malware does from beginning to end. The screenshot and video below show Endgame displaying everything BadRabbit does in an interactive attack timeline.  Clearly, we recommend turning on preventions so emergent malware like BadRabbit is stopped before causing damage. For this post, however, exploring BadRabbit without these preventions provides additional insight into how users can understand attacks quickly and easily with Endgame.

Screenshot of Complete BadRabbit Execution

 

Video of Endgame Platform in Detection-Only Mode Against BadRabbit

 

Conclusion

The BadRabbit ransomware uses some interesting techniques as it executes, including the stealthy migration and execution techniques described above in the process injection prevention section.  Testing it in the presence of our platform showed that Endgame MalwareScoreTM was effective on day zero in blocking the attack by preventing the malicious module load.  BadRabbit also tripped additional preventions when execution was allowed to continue.

BadRabbit is the third major international ransomware attack this year, and we are likely to see more such attacks into 2018. It’s important to understand the range of techniques employed, as adversaries increasingly benefit from a mix-and-match approach across attack vectors, exploiting opportunities within single layers of defense. BadRabbit demonstrates why a multi-layer approach to prevention and detection is so necessary: each layer provides an additional trap, catching even the most evasive of attacks.

Increasing Retention Capacity: Research from the Field

Security professionals from academia and industry gather this week in Dayton, OH for the annual National Initiative for Cybersecurity Education (NICE) Conference and Expo.  NICE is a program of the National Institute for Standards and Technology, and focuses on the cybersecurity workforce, education, and training needs of the nation. As part of this conference, I am presenting my research on improving retention within the industry.

The security workforce shortage is well-publicized, and is only expected to grow. By 2022, the industry may face a shortage close to two million qualified security personnel. For the most part, improving the pipeline understandably dominates most discussions when looking for solutions to this shortage. However, all of the resources and work that goes into improving the pipeline will go for naught if the industry fails to address the retention challenges as well.

My research builds upon existing social science research on retention, including organizational change and cultural inclusion. In addition, I distributed a survey throughout August and September to infosec professionals via social media. Over 300 people responded, and represented a range of experience within the industry: three-quarters worked in the field over five years, and 35% eleven years or more. The survey findings and recommendations for addressing them are discussed in detail in the final white paper. Below is a summary of the key findings.

  • Ill-defined Career Path: The lack of professional advancement, a well-defined career path, and work that at times is not challenging were strong factors for respondents when considering leaving the industry, and why they left their previous employment.
  • Burnout: Stress and burnout, coupled with long hours, topped responses for reasons for leaving a position or considering leaving the industry.
  • Industry Culture: The industry culture is among the top reasons respondents consider leaving the industry. Discrimination and harassment at professional conferences far exceeded that found within company work environments. Moreover, males were significantly less likely to experience harassment or discrimination than non-males.

The white paper also covers a range of recommendations for organizations to improve retention, each aimed at addressing the three major findings from the survey. The recommendations are divided into mutually constituted structural factors (material constructs, institutions, environmental constraints) and agents (motives, ideas, and actions of individuals). There is no silver bullet, as social change requires efforts across both categories, as summarized below.

  • Structural factors: Corporate policies (e.g., performance metrics, PTO, social events); Conference culture & representation; Visual cues (e.g., workplace, marketing materials)
  • Agents: Leadership (e.g., by example, policies & values); cultural entrepreneurs (grassroots leadership to shape culture, provide accountability, social capital)

Although the survey analysis highlighted significant challenges, security has a core competitive advantage: the mission. The mission is a key motivating factor for retaining talent across industries, and ranked among the most important workplace factors for half the respondents. By focusing on the mission and addressing the key challenges, not only can retention rates dramatically improve, but doing so will also reinforce many of the ongoing pipeline efforts and truly begin to hack away at the workforce shortage.  Security professionals want to stay in the field - let’s help make it easier.
