From Macro to SSL with Shellcode A Detailed Deconstruction

Nytro · March 10, 2016

petez on ‎03-09-2016 09:30 AM

During the era of Windows 3.1, when “www” was just a repeated consonant and the internet had yet to enter the vernacular, Microsoft Office annoy-ware was already using macros within Office documents to cause mischief. It wasn’t until after Mick Jagger’s “Start me up!” ushered in Windows 95 (with Trumpet Winsock for networking) that Office macros made the media, with the Melissa virus. Back in those early days of Netscape, “co-operative multitasking,” and Doom II, the functionality of these macros was either parasitic or mass-mailing, and the complexity seldom went beyond what VBA itself offered. Why would it? The language was rich enough to write self-replicating code, send emails, and download files – more than enough to declare an undying love, as in the “I love you” virus. String obfuscation is cheap to perform but can quickly become far from trivial to detect generically. Much like the fashions of the late 1990s, macro viruses fell out of vogue. Then, in the last couple of years, they returned. This article aims to deconstruct what is currently the new kid on the block.

While not strictly necessary, I prefer to glance at one-off samples, such as incident responses, with a trusty hex-editor. This not only helps me familiarize myself with the file format, but it also presents an opportunity to spot hints on where to direct my attention next.

The sample I chose to analyze clearly begins with the typical D0 CF 11 E0 magic, confirming this to indeed be an Office document file. Further strings below hint at this being a Microsoft Word file as opposed to an Excel spreadsheet or PowerPoint presentation.
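
The same check can be automated. The sketch below compares the start of a buffer against the full eight-byte compound-file signature, of which the D0 CF 11 E0 mentioned above is the first half. The function names are illustrative and not part of any tool used in this article.

```python
# Full 8-byte OLE2 / Compound File signature; the article quotes the first 4 bytes.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def is_ole2(data: bytes) -> bool:
    """Return True if the buffer starts with the OLE2 compound-file magic."""
    return data[:8] == OLE2_MAGIC


def is_ole2_file(path: str) -> bool:
    """Convenience wrapper (hypothetical helper): check a file on disk."""
    with open(path, "rb") as handle:
        return is_ole2(handle.read(8))
```

A match only says the container is OLE2; telling Word from Excel or PowerPoint still needs the stream names inside it, which is what the strings further down the dump hint at.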

Figure 1 – Hex Workshop view of the sample

Continuing to glance through the hex dump, the presence of Auto-Open related strings hints that a macro, and not some other form of exploit, is the likely trigger mechanism.

Figure 2 – Hex Workshop view showing Auto-Open related strings

If the system had macros enabled, the prompt shown in Figure 3 would not have appeared.

When macros are not enabled (the default setting), the document employs a bit of social engineering. The warning and helpful instructions invite the recipient to enable macros, which in turn invokes the malicious macro functionality. I do not, however, fall victim to this ruse.

Figure 3 - Getting the user to enable macros

Instead, I attempt to view the embedded macros using the Office Visual Basic editor. Clearly the macros are present but, annoyingly, they have been password protected and so are not visible.

Figure 4 - Protected mode

Although Office itself prevents the viewing of the Visual Basic macro without the password, this security feature is easily circumvented.

Where are the macros?

To extract the VBA macro for further study, I first tried some handy Python tools I have available. The oletools package comprises a handful of stand-alone, task-specific components designed to extract various information from an OLE container.

While the metadata and time information tools have their value, the oleid.py and olevba.py are the most useful for payload analysis. The olevba.py script not only allows for rapid triage and dumping of macro code but can also perform some basic analysis.

Another useful OLE tool is oledump.py by Didier Stevens, which also has several options to show increasingly more detail. The stream hierarchy gives a good overview, which makes the large macro at #8 and the smaller one at #9 easy to spot for extraction.

If Python is not your cup of tea, it may be possible to manually extract the macro code with a hex editor – provided the OLE data is not compressed. However, care must be taken to ensure any artifacts get correctly handled. This is not recommended for the inexperienced.

Having extracted the macros into a separate file for convenience, I can now clearly discern two main parts: an auto-open stub and the core functionality. Random names and various other obfuscations in macros are often indicative of maliciousness. Macro malware is often arranged this way to thwart naïve macro analyzers relying on string matching.

I can further break down the core functionality contained in function “tatata” into a few key logical blocks: declaration of variables, a shell call-out, and some ‘glue’. The obvious long string of mostly base64 and a lightly obfuscated “POWERSHELL.EXE” are practically directing the analysis.

It is easy work to determine that the (randomly named) tatata function simply invokes powershell.exe and passes it the large string. The -enc flag is not used as a poor man’s obfuscator but actually for security circumvention for PowerShell. Peeling back the base64 layer is clearly the next step in deconstructing the sample, and I'm employing GNU's base64 on only the necessary part to reveal the PowerShell script.
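
For readers without GNU base64 to hand, the same peeling step can be sketched in Python. PowerShell's encoded-command option expects base64 over UTF-16LE text, so the decode is two steps. The function names are illustrative, and the test string is a harmless stand-in, not the sample's payload.

```python
import base64


def decode_encoded_command(b64_text: str) -> str:
    """Undo PowerShell's -enc wrapping: base64 first, then UTF-16LE text."""
    raw = base64.b64decode(b64_text)
    return raw.decode("utf-16-le")


def encode_command(script: str) -> str:
    """Inverse operation, handy for building harmless test inputs."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


if __name__ == "__main__":
    # "ZABpAHIA" is the encoded form of the benign command "dir".
    print(decode_encoded_command("ZABpAHIA"))
```

When working from the macro source, only the base64 portion of the long string should be fed in; the VBA string concatenation around it has to be stripped first.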

Macro gives rise to PowerShell

As I glance over the resulting script, three functional regions become apparent: setup, data, and glue. Of these, only the large blob of what is easily recognizable as hex data piques my interest. It is almost certainly going to be invoked as 32-bit x86 code, as hinted by the accompanying reference to win32 APIs. In addition, the ‘glue’ residing near the end does not appear to mangle the $z array, which means no additional transformation will be required.

PowerShell hexbytes condense to x86

Before being able to analyze this code, I need to massage the byte array from its textual form into an equivalent binary form conducive to disassembly. I achieve this thru nothing more than some fancy copy-paste (using the “interpret as hex” feature) into Hex Workshop to produce a 448-byte binary file I arbitrarily named sc.bin.
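
The copy-paste step can also be scripted. The sketch below assumes the array is written as comma-separated 0x.. literals, as PowerShell byte arrays usually are. The function name and the sample bytes in the usage comment are made up for illustration.

```python
import re


def hex_list_to_bytes(text: str) -> bytes:
    """Collect every 0xNN literal in the text and pack them into a byte string."""
    tokens = re.findall(r"0x([0-9A-Fa-f]{1,2})\b", text)
    return bytes(int(token, 16) for token in tokens)


if __name__ == "__main__":
    # Illustrative input only; these are not the sample's bytes.
    blob = hex_list_to_bytes("[Byte[]] $z = 0xd9,0xeb,0x0,0x9b")
    with open("sc.bin", "wb") as handle:
        handle.write(blob)
    print(len(blob), "bytes written")
```

Checking the output length against the expected 448 bytes is a cheap way to confirm nothing was dropped in the conversion.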

Even for those familiar with x86 opcodes, the above hex-dump is unlikely to resemble valid code. This is mainly evident due to lack of 0x00's and common opcodes (a familiarity gained after many years). A quick peek with x86dis, an open-source x86 disassembler, reveals why:

The apparently absent call-pop, typically E8 00 00 00 00 58, is actually implemented using the FPU fnstenv instruction in conjunction with a preceding floating-point instruction (fcmovnb), followed by a ‘POP EDI’. Now I can establish EDI’s relative value after the fpu-call-pop. The fnstenv instruction saves the FPU state (including the EIP of the last FPU instruction) to some address, which here is ESP-0xC. The POP EDI then retrieves what is in effect the FPU state’s EIP member. This is how the shellcode locates itself in memory. After the POP, EDI will hold the address of the first FPU instruction. I can now use this value to compute the target of the "XOR [EDI+0x18]", placing it a few (0x18 to be exact) bytes further on.
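
The offset arithmetic behind this trick can be modelled in a few lines. In the 32-bit protected-mode layout, the 28-byte environment written by fnstenv keeps the last FPU instruction pointer at offset 0xC. Storing the environment at ESP-0xC therefore puts that field exactly at ESP, where the POP picks it up. The function below is a toy model with made-up addresses, not an emulator.

```python
import struct

ENV_SIZE = 28      # size of the 32-bit protected-mode FPU environment
FIP_OFFSET = 0x0C  # offset of the FPU instruction pointer within it


def fnstenv_getpc(esp: int, last_fpu_eip: int) -> int:
    """Model 'fnstenv [esp-0xC]; pop edi' and return the value popped into EDI."""
    base = esp - 0x80                 # lowest address covered by the model
    stack = bytearray(0x100)          # covers [esp-0x80, esp+0x80)
    env = bytearray(ENV_SIZE)
    struct.pack_into("<I", env, FIP_OFFSET, last_fpu_eip)
    start = (esp - 0x0C) - base       # where fnstenv writes the environment
    stack[start:start + ENV_SIZE] = env
    (popped,) = struct.unpack_from("<I", stack, esp - base)  # pop reads [esp]
    return popped


if __name__ == "__main__":
    # Hypothetical addresses chosen only for the demonstration.
    edi = fnstenv_getpc(esp=0x0012FF00, last_fpu_eip=0x00401000)
    print(hex(edi), "-> XOR target", hex(edi + 0x18))
```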

I immediately realize that 0x18 is smack-bang in the middle of the shellcode I’m examining! More importantly, the XOR will modify the code as I see it. This reveals a classic snippet of self-modifying code.

Navigating such self-modifying code in a disassembler is tedious, even for an experienced researcher, so it's time to find a debugger or emulator. Of the abundant debuggers at hand, such as OllyDbg, WinDbg, IDA's own, gdb, etc., I found the x86emulator IDA plugin to be the most convenient for this particular task. Using a debugger would require too much faffing about with setup.

Now that I know what to expect, I carefully step thru just past the "XOR [EDI+0x18]" instruction as it patches the code just following. Revealed is a new backward “LOOP” that now forms a familiar decryptor.

Allowing the loop to run until completion reveals a second, similar decryptor at 0x2B, which I overcome in the same way to finally reveal the inner workings of the shellcode. At this point I return to the analysis and annotation of the shellcode (made seamless by my choice to use the x86emul plugin).

Figure 5 - IDA view of encoded shellcode being worked on by x86Emul plugin

Swimming through the shellcode

I continue the deconstruction at offset 0x37 as this is just after the second decryption loop. My analysis now follows execution-flow rather than linear address order, switching as may be required. I begin by observing a CALL which I determine must be to the body. I infer this by recognizing the alternate flow as a typical get-kernel32 and api-hash routine.

Since I’m already expecting some form of API resolver, I ignore the code below the CALL and direct my attention to the target of the call, namely “body”.

As expected, the “POP EBP” at 0xBE confirms the “CALL body” does not return, serving only to get the address of the API-resolver into EBP. The next two immediate PUSHes collectively put the string 'ws2_32' on the stack (recalling that the stack grows down) while the “PUSH ESP” effectively puts the address of that string.
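
How two immediate PUSHes leave a readable string at ESP can be checked with a toy stack. The two dword values below are derived from the string 'ws2_32' itself, little-endian and null-padded; they are the values such a sequence would need, not values read from the sample listing. The class and function names are illustrative.

```python
import struct


class ToyStack:
    """Minimal downward-growing stack backed by a byte buffer."""

    def __init__(self, size: int = 64):
        self.mem = bytearray(size)
        self.esp = size               # empty stack: ESP at the top

    def push(self, dword: int) -> None:
        self.esp -= 4                 # the stack grows toward lower addresses
        struct.pack_into("<I", self.mem, self.esp, dword)

    def cstring_at_esp(self) -> bytes:
        end = self.mem.index(0, self.esp)   # stop at the first null byte
        return bytes(self.mem[self.esp:end])


def build_ws2_32() -> bytes:
    stack = ToyStack()
    stack.push(0x00003233)            # "32\0\0" pushed first, ends up higher
    stack.push(0x5F327377)            # "ws2_" pushed second, ends up at ESP
    return stack.cstring_at_esp()     # what PUSH ESP would then point to


if __name__ == "__main__":
    print(build_ws2_32())
```

The tail of the string is pushed first because later pushes land at lower addresses, which is where a C string begins.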

The final push of an unfamiliar 32-bit value before “CALL EBP” must then be the hash of kernel32!LoadLibraryA by inference (‘ws2_32’ is the name of the Windows networking library, and libraries are typically loaded by the LoadLibrary API). I confirm this assumption by later analyzing the hashing code immediately following the “CALL body”.
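
The article does not spell out the hashing algorithm, so the following is only a sketch of one widely seen scheme: rotate-right-by-13 over the upper-cased UTF-16 module name and the ASCII function name, with the two results added. Whether this sample's resolver works this way is an assumption that would need checking against the disassembly. All names below are illustrative.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    value &= MASK32
    return ((value >> bits) | (value << (32 - bits))) & MASK32


def ror13_hash(data: bytes) -> int:
    """Rotate the accumulator right by 13, then add the next byte."""
    acc = 0
    for byte in data:
        acc = (ror32(acc, 13) + byte) & MASK32
    return acc


def api_hash(module: str, function: str) -> int:
    """Assumed scheme: hash(UTF-16 upper-cased module) + hash(ASCII function)."""
    module_bytes = (module.upper() + "\x00").encode("utf-16-le")
    function_bytes = (function + "\x00").encode("ascii")
    return (ror13_hash(module_bytes) + ror13_hash(function_bytes)) & MASK32


if __name__ == "__main__":
    for name in ("LoadLibraryA", "GetProcAddress"):
        print(name, hex(api_hash("kernel32.dll", name)))
```

Comparing the printed values with the immediates pushed before each “CALL EBP” would confirm or refute the assumption for a given sample.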

The next part takes a leap of faith, or a decade of analysis experience. Knowing the following lets me speed ahead:

A successful call of ‘WSAStartup’ returns zero in EAX
The ‘socket’ API is required to transmit or receive network data
A zero in the protocol argument of socket is acceptable (from MSDN: “…user does not wish to specify a protocol….”).

Working on the premise that the previous API was indeed ‘socket’, and noticing that this shellcode is thus far well-constructed, I’m left with two alternatives for the next API: either ‘connect’ to establish an outbound connection or 'bind' in preparation to accept an inbound connection. The choice is not a difficult one. Following the execution flow in my mind’s eye, I observe no constructs resembling bind-listen-accept, so I choose ‘connect’.

On that stride, what must be pushed is then the contents of the sockaddr_in structure, its size and subsequent address, the socket descriptor, and lastly the API’s hash:

Using the sockaddr_in structure definition (remembering to byte-reverse where appropriate), I can identify the call-home IP address, redacted here as aaa.bbb.ccc.ddd (each octet being the decimal value of the corresponding hex byte), and the port, 0x1bb (443, the SSL port). As a reminder, here’s a basic picture of the stack layout showing how the bytes map out in memory, which confirms the correct interpretation of the IP address.
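
The byte-reversal can be double-checked mechanically. The sketch below rebuilds the eight meaningful bytes of a sockaddr_in from two pushed dwords and decodes them. The address 192.0.2.1 is a documentation placeholder standing in for the redacted one, and the dword values are constructed to match it and port 443 rather than copied from the sample.

```python
import socket
import struct


def decode_sockaddr_in(pushed_first: int, pushed_second: int):
    """Decode (family, port, ip) from two dwords pushed in that order.

    The dword pushed second sits at the lower address, so it comes first
    in memory: sin_family (little-endian), then sin_port (big-endian).
    """
    memory = struct.pack("<I", pushed_second) + struct.pack("<I", pushed_first)
    (family,) = struct.unpack_from("<H", memory, 0)
    (port,) = struct.unpack_from(">H", memory, 2)
    ip = socket.inet_ntoa(memory[4:8])
    return family, port, ip


if __name__ == "__main__":
    # 0x010200C0 -> bytes C0 00 02 01 -> 192.0.2.1 (placeholder address)
    # 0xBB010002 -> bytes 02 00 01 BB -> AF_INET (2), port 0x01BB
    print(decode_sockaddr_in(0x010200C0, 0xBB010002))
```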

Continuing the dissection, the next API-by-hash is reasoned to be ‘recv’. I already know it is socket related thru EDI, and an essentially undefined buffer is being passed in ESI. The hash value is also referenced later by fragments that only make sense for ‘recv’.

When the ‘recv’ API returns (assuming no error), ESI will point to a 4-byte (32-bit word, aka dword) value received from the network. The code will then xor the recv'd dword with some 32-bit constant. This is likely done to obfuscate and/or avoid null bytes.

The next API is easily recognized by its arguments as ‘VirtualAlloc’, and since the previously xor’ed dword is directly passed to it as the size argument (ECX), it reveals it to be just that – the size of data to receive.

The alloc’d pointer in EAX is nudged by 0x100 and some registers saved. For now, I skip over this and expect the purpose will reveal itself in due time.

Now the already familiar hash for ‘recv’ (0x5FC8D902) makes understanding the next fragment simple - receive from the socket until the required number of bytes have been obtained.
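
Taken together, the last few fragments amount to a small length-prefixed protocol, which can be sketched as follows. The XOR constant is a placeholder because the real one is not quoted in the text, and the little-endian length is an assumption based on the code being x86. A local socket pair stands in for the remote server.

```python
import socket
import struct

XOR_KEY = 0x11223344  # placeholder; the sample's actual constant is not given here


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Keep calling recv until exactly 'count' bytes have arrived."""
    buffer = bytearray()
    while len(buffer) < count:
        chunk = sock.recv(count - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        buffer += chunk
    return bytes(buffer)


def read_stage(sock: socket.socket) -> bytes:
    """Read the masked dword length, unmask it, then read that many bytes."""
    (masked,) = struct.unpack("<I", recv_exact(sock, 4))
    size = masked ^ XOR_KEY
    return recv_exact(sock, size)


def demo() -> bytes:
    """Exercise read_stage over a local socket pair with dummy data."""
    server, client = socket.socketpair()
    payload = b"stage-bytes" * 3
    server.sendall(struct.pack("<I", len(payload) ^ XOR_KEY) + payload)
    server.close()
    try:
        return read_stage(client)
    finally:
        client.close()


if __name__ == "__main__":
    print(demo())
```

The loop matters because a single recv may return fewer bytes than requested, which is why the shellcode repeats the call until the count is satisfied.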

Did that look like SSL?

The astute reader will at this point have realized that despite the SSL port being contacted, there is a distinct lack of typical SSL negotiation traffic being sent or expected. Whereas real SSL negotiation should begin with a well-defined ClientHello message, the data actually observed (and expected by the malware) has a different structure.

Instead, the observed stream consists of a leading dword specifying the size, followed by ‘size’ bytes of RC4-encrypted data, as derived from the code that manipulates the content of the received buffer.
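
A rough way to tell the two apart on the wire is to test the first few bytes of the client's stream for a handshake record carrying a ClientHello. The check below is a heuristic sketch with an illustrative function name, not a protocol parser.

```python
def looks_like_client_hello(data: bytes) -> bool:
    """Heuristic: handshake record (0x16), version 3.x, handshake type 1."""
    if len(data) < 6:
        return False
    is_handshake_record = data[0] == 0x16
    is_ssl3_or_tls = data[1] == 0x03 and data[2] <= 0x04
    is_client_hello = data[5] == 0x01
    return is_handshake_record and is_ssl3_or_tls and is_client_hello


if __name__ == "__main__":
    # First bytes of a typical ClientHello record versus arbitrary data.
    print(looks_like_client_hello(bytes.fromhex("160301020001")))
    print(looks_like_client_hello(bytes.fromhex("00100e00aabb")))
```

Note that in this exchange the client sends no negotiation data at all before it starts receiving, which is itself a departure from a real handshake.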

Figure 6 - Redacted TCP reconstruction

So what happens with the data?

Realizing that the network exchange does not appear to be SSL, I continue the analysis to learn how the received data is to be utilized. This must be the code reached when the exit condition of the recv-loop is met (all expected bytes received) at 0x164. Here we find a CALL to a function a short distance away that is recognizable as RC4. The key is in-lined immediately following the CALL and obtained by the pop into ESI.

That bit of code strongly resembles RC4 but caution! I say “resembles” since minor variations on the theme are difficult to spot, and would fail to decrypt with RC4-proper. I’ve not gone the extra mile to verify however, since 99% of the time it is unnecessary. How did I identify this as (likely) RC4? Easily! RC4 consists of three parts: initialize sbox, permute sbox with a key, and encrypt/decrypt data. More importantly, the sbox is 0x100 bytes, and is initialized with the index stored at each respective location. This also explains the nudge by 0x100 earlier.
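
For comparison, here is textbook RC4 with the three parts named above. As noted, the sample's routine has not been verified to be RC4-proper, so this is a reference for the standard algorithm only and may not decrypt the captured stream if the malware uses a variant.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; encryption and decryption are the same operation."""
    # Part 1: initialize the 0x100-byte sbox with each index at its own slot.
    sbox = list(range(256))

    # Part 2: permute the sbox with the key.
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]

    # Part 3: generate the keystream and xor it over the data.
    output = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + sbox[i]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]
        output.append(byte ^ sbox[(sbox[i] + sbox[j]) & 0xFF])
    return bytes(output)


if __name__ == "__main__":
    print(rc4(b"Key", b"Plaintext").hex())
```

The identity-initialized 0x100-byte table in part 1 is the pattern that stands out in a disassembly and makes the routine recognizable.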

Application to data capture

To decrypt the data-stream I had previously captured, I simply loaded the shellcode into a debugger, grafted the data at some nearby address, fixed up the relevant registers and ran it until the ‘RET’ at +0x1BF. Voila! An “MZ” appears! But wait – where does execution continue? Examining the shellcode leads me to conclude that the address of the buffer (now containing a PE file) is what remains on the stack prior to the ‘RET’… so the ‘RET’ resumes execution at the MZ header!? Why sure! The IDA window shows the loader executing “dec ebp ; pop edx” (aka “MZ”) in a clever code-data alias. It is this loader stub hidden inside the MZ header that does the ReflectiveLoader call.

Figure 7 - IDA view showing the “MZ” header as instructions (0x4d => ‘M’, 0x5a => ‘Z’)
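
The alias works because, in 32-bit mode, the byte range 0x40 to 0x5F is a block of one-byte INC, DEC, PUSH and POP instructions, each encoding its register in the low three bits. The small decoder below covers only that range and is enough to show how the header bytes read as code. The function name is illustrative.

```python
REGISTERS = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")
MNEMONICS = {0x40: "inc", 0x48: "dec", 0x50: "push", 0x58: "pop"}


def decode_single_byte_ops(code: bytes) -> list:
    """Decode bytes in 0x40..0x5F as 32-bit one-byte register instructions."""
    listing = []
    for byte in code:
        if not 0x40 <= byte <= 0x5F:
            raise ValueError(f"byte {byte:#04x} is outside the handled range")
        mnemonic = MNEMONICS[byte & 0xF8]   # high bits select the operation
        register = REGISTERS[byte & 0x07]   # low three bits select the register
        listing.append(f"{mnemonic} {register}")
    return listing


if __name__ == "__main__":
    print(decode_single_byte_ops(b"MZ"))
```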

Without too much effort, it turns out that the first function called by this stub is ReflectiveLoader, and that the module is metsrv.dll of Metasploit fame. Sanity-checking also confirms that the module’s size exactly matches the payload size.

I leave it to the reader to confirm that the api_by_hash function is as suggested; that is perhaps a topic for another day.

Conclusion

Use of Office auto-action macros, made even more successful by a simple ruse, was only the first step in this complex chain of events. The obfuscated VBA then invoked PowerShell to construct and launch non-obvious self-modifying shellcode. The shellcode in turn masqueraded as an outbound SSL connection to fly below the radar while downloading a Metasploit module. Once this agent has been deployed, there is no telling what else it may bring with it! Astute readers may have noticed that deep-packet inspection would probably have caught this non-conforming SSL attempt, but I’m sure the next revision will have a fix for that!

Source: http://community.hpe.com/t5/Security-Research/From-Macro-to-SSL-with-Shellcode-A-Detailed-Deconstruction/ba-p/6839623#.VuE2wfl96Uk
