Shellcode analysis like a semi-PRO

Nytro · July 16, 2014


During Nicolas Brulez’s training at REcon there was a challenge where the goal was to get function names instead of hashes into IDA in order to make shellcode analysis easier. This post describes the problem in more detail, the possible solutions, and the approach I took to solve the challenge. If you would like to know the PRO version, take Nicolas’s training next year.

Introduction

In a few words: to resolve API function addresses, shellcode usually parses the EAT (Export Address Table) of loaded modules and compares a hash of each exported function name against a given hash. Malware sometimes uses the same technique for stealth. More information about this technique is available here. The problem is that in IDA Pro you end up with output like the one below:

seg000:00386C91                 mov     ecx, 0A1233BBCh
seg000:00386C96                 mov     edx, [ebp+4]
seg000:00386C99                 call    find_func_addr
seg000:00386C9E                 call    eax

Since there are many calls to find_func_addr, static analysis without knowing which function name corresponds to each hash is very time-consuming: you would have to follow each call in the debugger and update IDA Pro manually.
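The lookup that a routine such as find_func_addr performs can be pictured with a small Python sketch. This is only an illustration under assumptions: the export table, its addresses, and the name resolve_by_hash are made up, and zlib.crc32 is a stand-in hash, not the algorithm used by this sample.

```python
import zlib
from typing import Callable, Dict, Optional


def resolve_by_hash(exports: Dict[str, int],
                    target_hash: int,
                    hash_func: Callable[[bytes], int]) -> Optional[int]:
    """Return the address of the export whose hashed name equals target_hash."""
    for name, address in exports.items():
        # Hash every exported name and compare it with the wanted value,
        # so the function name itself never has to appear in the shellcode.
        if hash_func(name.encode("ascii")) == target_hash:
            return address
    return None


# Hypothetical export table: real names, made-up addresses.
fake_exports = {
    "ExitProcess": 0x7C81CAA2,
    "LoadLibraryA": 0x7C801D77,
    "WinExec": 0x7C86114D,
}

# zlib.crc32 is only a stand-in hash for the demonstration.
wanted = zlib.crc32(b"WinExec")
address = resolve_by_hash(fake_exports, wanted, zlib.crc32)
print(hex(address))
```

The analyst sees only the constant passed in the register, which is why each hash has to be mapped back to a name before the listing becomes readable.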

Possible solutions

There are two ways to solve this problem: statically or dynamically. Each of them has its advantages and disadvantages.

Dynamic

The only advantage I can think of for this approach is that you don’t need to reverse engineer the hash function in order to get every function name. The disadvantages are: (a) the code can take different paths, so you won’t be able to identify all the hashes; (b) it’s not portable, since the Immunity or Olly script needs to be changed for every other sample that uses hashes. A similar example of the dynamic approach can be found at the VRT blog.

Static

This is the approach I took. Its advantages and disadvantages are the reverse of the dynamic one: I had to crack the hash algorithm (which was pretty simple), but at least the other parts can easily be ported to further samples.

The static solution

The solution is divided into three parts: (a) get the function names from the EAT of a given DLL; (b) calculate the hash for each function name; (c) import the data into IDA.

Getting the function names

I ended up using DLL Export Viewer; Nicolas used pefile, which I had totally forgotten about. For the purpose of this post, the code below is enough to illustrate:


import pefile

dll = "c:/windows/system32/kernel32.dll"
pe = pefile.PE(dll)
# Print the name of every export; ordinal-only exports have no name.
for func in pe.DIRECTORY_ENTRY_EXPORT.symbols:
    if func.name:
        print(func.name)

Calculate the hash

This will change for every sample. In this example the hash algorithm was very simple; it can get really complicated, and then the dynamic approach would be better. There is a MUCH easier and cleverer way to calculate the hash; however, you will need to take Nicolas’s training to learn it. His solution to this was a *facepalm* moment.

Hash function:

seg000:00386BE0 calc_hash       proc near               ; CODE XREF: find_func_addr+28p
seg000:00386BE0                 push    eax
seg000:00386BE1                 xor     eax, eax
seg000:00386BE3                 xor     ecx, ecx
seg000:00386BE5
seg000:00386BE5 loop_calc_hash:                         ; CODE XREF: calc_hash+13j
seg000:00386BE5                 lodsb
seg000:00386BE6                 test    al, al
seg000:00386BE8                 jz      short calc_hash_end
seg000:00386BEA                 xor     ecx, eax
seg000:00386BEC                 rol     ecx, 3
seg000:00386BEF                 inc     ecx
seg000:00386BF0                 shl     eax, 8
seg000:00386BF3                 jmp     short loop_calc_hash
seg000:00386BF5 ; ---------------------------------------------------------------------------
seg000:00386BF5
seg000:00386BF5 calc_hash_end:                          ; CODE XREF: calc_hash+8j
seg000:00386BF5                 pop     eax
seg000:00386BF6                 retn
seg000:00386BF6 calc_hash       endp
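As a cross-check of the listing above, here is a minimal Python 3 sketch of the same loop. The names rol32 and calc_hash are chosen here for illustration, and the masking with 0xFFFFFFFF is an assumption needed to emulate 32-bit registers in Python; the result is taken from ECX because the routine restores EAX before returning.

```python
MASK32 = 0xFFFFFFFF


def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits (0 < count < 32)."""
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def calc_hash(name: bytes) -> int:
    """Emulate calc_hash: EAX collects the input bytes, ECX holds the hash."""
    eax = 0
    ecx = 0
    for byte in name:
        eax |= byte                 # lodsb: AL is zero here, so OR loads the byte
        ecx ^= eax                  # xor ecx, eax
        ecx = rol32(ecx, 3)         # rol ecx, 3
        ecx = (ecx + 1) & MASK32    # inc ecx
        eax = (eax << 8) & MASK32   # shl eax, 8
    return ecx


print("0x%08x" % calc_hash(b"A"))   # 0x00000209
```

Because EAX is shifted rather than cleared, each step mixes in the current byte together with up to three preceding ones, which is easy to miss when reading the disassembly quickly.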

Relevant C code:

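#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static uint32_t rol(uint32_t value, unsigned n) {
    return (value << n) | (value >> (32 - n));
}

int main(int argc, char *argv[]) {
    char buf[100];
    FILE *ptr_file;

    if (argc < 2) {
        printf("Usage: %s <names.txt>\n", argv[0]);
        return 1;
    }
    ptr_file = fopen(argv[1], "r");
    if (!ptr_file) {
        printf("Unable to read text file\n");
        return 1;
    }
    /* One function name per line; print "0x<hash>,<name>" for each. */
    while (fgets(buf, sizeof buf, ptr_file) != NULL) {
        uint32_t eax = 0;   /* mirrors EAX in calc_hash */
        uint32_t out = 0;   /* mirrors ECX, the running hash */
        size_t i;

        buf[strcspn(buf, "\r\n")] = '\0';   /* drop the line ending */
        for (i = 0; buf[i] != '\0'; i++) {
            eax |= (unsigned char)buf[i];   /* lodsb */
            out ^= eax;                     /* xor ecx, eax */
            out = rol(out, 3);              /* rol ecx, 3 */
            out += 1;                       /* inc ecx */
            eax <<= 8;                      /* shl eax, 8 */
        }
        printf("0x%08x,%s\n", (unsigned)out, buf);
    }
    fclose(ptr_file);
    return 0;
}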

Import data into IDA

Basically, the script below adds an enum to IDA; afterwards, just press ‘M’ on every hash to get the function name. If a hash is not found, consider importing other DLLs. I didn’t find any function in IDA to automatically “refresh” a hex value into an enum value; if one is available, please let me know.
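Before importing anything, it can help to check which hashes from the shellcode are covered by the generated text files and which still need another DLL. The sketch below runs outside IDA with plain Python 3; it assumes the "0x<hash>,<name>" line format produced by the C program, and the function names load_hash_table and resolve as well as the sample data are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def load_hash_table(lines: Iterable[str]) -> Dict[int, str]:
    """Parse "0x<hash>,<name>" lines into a hash -> name dictionary."""
    table: Dict[int, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        hash_text, name = line.split(",", 1)
        table[int(hash_text, 16)] = name
    return table


def resolve(hashes: Iterable[int],
            table: Dict[int, str]) -> Tuple[Dict[int, str], List[int]]:
    """Split the hashes into resolved names and still-unknown values."""
    found: Dict[int, str] = {}
    missing: List[int] = []
    for value in hashes:
        if value in table:
            found[value] = table[value]
        else:
            missing.append(value)
    return found, missing


# Made-up sample data standing in for one generated text file.
sample_lines = ["0x11111111,FunctionOne\n", "0x22222222,FunctionTwo\n"]
table = load_hash_table(sample_lines)
found, missing = resolve([0x22222222, 0x33333333], table)
print(found)
print([hex(value) for value in missing])
```

Any value left in the missing list is a hint that the function lives in a DLL whose exports have not been hashed yet.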

IDAPython script:

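import re

import idaapi
import idc

# Characters that are not valid in IDA names are replaced with "_".
SANE_NAME_RE = re.compile(r"[@?$:`+&\[\]]")

def sanitize_name(name):
    return SANE_NAME_RE.sub("_", name)

def main():
    enum_name = idc.AskStr("Kernel32_Functions_Hash", "Enter Enum name:")
    enum_id = idc.AddEnum(0, enum_name, idaapi.hexflag())
    print 'Enum id %d' % (enum_id)
    file_path = idc.AskFile(0, "*.*", "Open txt file")
    with open(file_path, 'r') as hash_file:
        for line in hash_file:
            # Each line has the form "0x<hash>,<function name>".
            addr, name = line.split(',')
            addr_hex = long(addr, 16)
            name = sanitize_name(name.strip())
            # Assumption: the listing as posted stops before this step,
            # so the call that adds the enum member is restored here.
            idc.AddConstEx(enum_id, name, addr_hex, -1)

if __name__ == '__main__':
    main()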

After executing the script and applying the enum to the hash value:

.data:00405159                 mov     ecx, KERNEL32_ExitProcess
.data:0040515E                 mov     edx, [ebp+4]
.data:00405161                 call    get_func_addr
.data:00405166                 push    0
.data:00405168                 call    eax

Conclusion

As shown in another post, even without a clean IAT or a loadable binary file, as is the case when dealing with shellcode, it’s still possible to get a decent static analysis by mixing IDAPython with other tools.

Source: Shellcode analysis like a semi-PRO | drimeldotorg
