ionut97

Decrypting RunPE malware

RunPE in a nutshell

RunPE is a technique often used by (novice) virus authors to hide their viruses from an anti-virus scanner. It works by using a small launcher application that has the executable virus file embedded in an encrypted state. The easiest way to launch the executable would be to write a decrypted version of the virus to a file, but this would give the anti-virus scanners a chance to detect and subsequently disable it. Instead, the RunPE loader runs an innocent application and replaces its loaded process image with the virus.

To understand the reasoning in the rest of the document, it's important that you understand how running a process from a memory buffer works from a technical point of view.

First, the RunPE loader launches an innocent process using the CreateProcess API with the CREATE_SUSPENDED flag. This suspends the process right after it is mapped into memory, but before the Windows PE loader loads any additional library files.

Next, the RunPE loader calls GetThreadContext on the main thread of the newly created process. The returned thread context holds the state of all general-purpose registers. In a freshly created, suspended 32-bit process, the EBX register holds a pointer to the Process Environment Block (PEB), and the EAX register holds a pointer to the entry point of the innocent application. In the PEB structure, at an offset of eight bytes, is the base address of the process image.
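
The eight-byte offset can be checked against the layout of the first PEB fields. The following is a minimal, portable sketch, not part of the original tool: the struct Peb32Head and both helper functions are hypothetical names, pointer-sized fields are modelled as 32-bit integers, and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the first fields of the 32-bit PEB.
// Pointer-sized members are written as uint32_t so the layout is the same on any host.
struct Peb32Head
{
    std::uint8_t  InheritedAddressSpace;
    std::uint8_t  ReadImageFileExecOptions;
    std::uint8_t  BeingDebugged;     // the flag IsDebuggerPresent() reports
    std::uint8_t  BitField;
    std::uint32_t Mutant;
    std::uint32_t ImageBaseAddress;  // the field the loader reads and later overwrites
};

static_assert(offsetof(Peb32Head, ImageBaseAddress) == 8,
              "the image base sits 8 bytes into the PEB");
static_assert(sizeof(Peb32Head) == 12, "no padding expected in this prefix");

// Address of the image base field, given the EBX value taken from the
// suspended thread's context. This is the address a loader would read or write.
std::uint32_t ImageBaseFieldAddress(std::uint32_t ebx)
{
    return ebx + static_cast<std::uint32_t>(offsetof(Peb32Head, ImageBaseAddress));
}

// Extracts the image base from a raw copy of the first 12 PEB bytes
// (assumes the host is little-endian, like the x86 target).
std::uint32_t ImageBaseFromPebBytes(const unsigned char *raw)
{
    Peb32Head head;
    std::memcpy(&head, raw, sizeof(head));
    return head.ImageBaseAddress;
}
```

In a real loader or debugger the 12 bytes would come from ReadProcessMemory on the address held in EBX; here they are just a plain buffer.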

The loader calls NtUnmapViewOfSection with that base address. This unmaps the image of the innocent executable, freeing up the memory space for the virus to be mapped in.

Then the loader reads the headers of the decrypted virus, and maps the headers and all sections into the innocent process using WriteProcessMemory. The correct memory page permissions are set using VirtualProtectEx.
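
Which writes this step produces can be derived from the PE headers alone. The sketch below is an illustration under assumptions, not the code of any particular loader: PlannedWrite, ReadLE and PlanWrites are hypothetical names, only 32-bit (PE32) images are handled, and each section is assumed to be copied as SizeOfRawData bytes to image base + VirtualAddress.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// One WriteProcessMemory call the loader would have to make (hypothetical helper type).
struct PlannedWrite
{
    std::string   name;          // "headers" or the section name
    std::uint32_t remoteAddress; // destination inside the target process
    std::uint32_t fileOffset;    // where the bytes start in the decrypted buffer
    std::uint32_t size;          // number of bytes to copy
};

// Reads a little-endian value of 2 or 4 bytes from the buffer.
static std::uint32_t ReadLE(const std::vector<unsigned char>& b, std::size_t off, int bytes)
{
    std::uint32_t v = 0;
    for(int i = 0; i < bytes; i++)
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return v;
}

// Returns the header write followed by one write per section,
// or an empty list if the buffer is not a well-formed PE32 image.
std::vector<PlannedWrite> PlanWrites(const std::vector<unsigned char>& pe, std::uint32_t imageBase)
{
    std::vector<PlannedWrite> plan;
    if(pe.size() < 0x40 || pe[0] != 'M' || pe[1] != 'Z')
        return plan;

    const std::size_t nt = ReadLE(pe, 0x3C, 4);               // e_lfanew
    if(nt > pe.size() - 24 || std::memcmp(&pe[nt], "PE\0\0", 4) != 0)
        return plan;

    const std::size_t sections = ReadLE(pe, nt + 6, 2);       // NumberOfSections
    const std::size_t optSize  = ReadLE(pe, nt + 20, 2);      // SizeOfOptionalHeader
    const std::size_t opt      = nt + 24;                     // start of the optional header
    const std::size_t table    = opt + optSize;               // start of the section table
    if(optSize < 64 || table + 40 * sections > pe.size() || ReadLE(pe, opt, 2) != 0x10B)
        return plan;

    const std::uint32_t headerSize = ReadLE(pe, opt + 60, 4); // SizeOfHeaders
    if(headerSize > pe.size())
        return plan;
    plan.push_back({"headers", imageBase, 0, headerSize});

    for(std::size_t i = 0; i < sections; i++)
    {
        const std::size_t s = table + 40 * i;                 // each section header is 40 bytes
        std::size_t len = 0;
        while(len < 8 && pe[s + len] != 0)                    // names are not always NUL-terminated
            len++;

        PlannedWrite w;
        w.name          = std::string(reinterpret_cast<const char *>(&pe[s]), len);
        w.remoteAddress = imageBase + ReadLE(pe, s + 12, 4);  // VirtualAddress
        w.size          = ReadLE(pe, s + 16, 4);              // SizeOfRawData
        w.fileOffset    = ReadLE(pe, s + 20, 4);              // PointerToRawData
        if(static_cast<std::uint64_t>(w.fileOffset) + w.size > pe.size())
            return std::vector<PlannedWrite>();
        plan.push_back(w);
    }
    return plan;
}
```

Each entry corresponds to one WriteProcessMemory call, which is also why a debugger that intercepts those calls sees the executable arrive in several pieces.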

The loader writes the new base address into the PEB and calls SetThreadContext to point EAX to the new entry point.

Finally, the loader resumes the main thread of the target process with ResumeThread and the Windows PE loader will do its magic. The executable is now mapped into memory without ever touching the disk.

Plan of attack

The weaknesses of RunPE should be obvious to anyone: at some point the loader has to decrypt the executable in the loader's memory space. Furthermore, the original executable will be mapped in the target process's memory space in a readable state, so you can easily dump the executable into a file. My first instinct was to try OllyDbg with the OllyDump plugin. Sadly, the RunPE loader left the process in a mutilated state, causing the plugin to fail.

Another way to solve the problem would be forcing the RunPE loader to write the executable to a file instead of to the memory space of another process. The easiest way to achieve this is by hooking the WriteProcessMemory calls. You have to place the hooks before the RunPE loader ever gets control of execution, which proved to be quite challenging when the RunPE loader is written in a .NET language (and it often is; the people using this technique usually aren't very good at what they do).

To solve this I decided to create my own debugger application that places an int3 breakpoint on WriteProcessMemory and reads the required data straight from the RunPE loader process with ReadProcessMemory.
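
Because the breakpoint sits on the very first instruction of WriteProcessMemory, the stack still has the plain 32-bit stdcall layout: the return address at ESP, followed by the five arguments in declaration order. A minimal portable sketch of decoding those slots is shown here; WpmCall and DecodeWpmCall are hypothetical names, and the slot values are assumed to have been read from the loader's stack already.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Arguments of one intercepted WriteProcessMemory call (hypothetical helper type).
struct WpmCall
{
    std::uint32_t returnAddress; // [ESP]      where the loader continues after the call
    std::uint32_t processHandle; // [ESP + 4]  hProcess
    std::uint32_t remoteAddress; // [ESP + 8]  lpBaseAddress, in the target process
    std::uint32_t localBuffer;   // [ESP + 12] lpBuffer, in the loader's address space
    std::uint32_t size;          // [ESP + 16] nSize
};

// `slots` holds consecutive 32-bit values read upward from ESP at function entry.
// Returns false when fewer than five slots were captured.
bool DecodeWpmCall(const std::vector<std::uint32_t>& slots, WpmCall& out)
{
    if(slots.size() < 5)
        return false;

    out.returnAddress = slots[0];
    out.processHandle = slots[1];
    out.remoteAddress = slots[2];
    out.localBuffer   = slots[3];
    out.size          = slots[4];
    return true;
}
```

The buffer pointer is only valid inside the loader, which is why the data has to be fetched with ReadProcessMemory on the loader process rather than read directly.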

The Source

The code I have so far does its job but is quite dirty. There is no real design behind it; I just started coding and fixing stuff as I thought about it. It also doesn't yet create a real decrypted executable but instead creates a binary file for each WriteProcessMemory call. I haven't tested how anti-debugger techniques react to this debugger yet. IsDebuggerPresent() will return true for sure, but that can easily be prevented. How it reacts to tricks with exception handlers is something I have to test. I'd like to eventually expand this code into some sort of simple debugger framework where you can execute callback functions for every breakpoint. Not sure if I'll ever be motivated enough though.
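
One possible next step toward a single output file is to stitch the per-call dumps together by their target addresses. The following is only a sketch of that idea and not part of the tool: Chunk and MergeChunks are hypothetical names, and the result is laid out the way the image sits in memory, so turning it into a loadable file on disk would still require moving each section back to its raw file offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One captured WriteProcessMemory call (hypothetical helper type).
struct Chunk
{
    std::uint32_t              remoteAddress; // destination address in the target process
    std::vector<unsigned char> bytes;         // data copied out of the loader
};

// Places every chunk at (remoteAddress - lowest address) in one zero-filled buffer.
// Later chunks overwrite earlier ones where they overlap, like the real writes would.
// baseOut receives the lowest address, or 0 when there is nothing to merge.
std::vector<unsigned char> MergeChunks(const std::vector<Chunk>& chunks, std::uint32_t& baseOut)
{
    std::uint64_t lowest = UINT64_MAX;
    std::uint64_t highest = 0;
    for(const Chunk& c : chunks)
    {
        if(c.bytes.empty())
            continue;
        lowest  = std::min<std::uint64_t>(lowest, c.remoteAddress);
        highest = std::max<std::uint64_t>(highest, std::uint64_t(c.remoteAddress) + c.bytes.size());
    }

    baseOut = 0;
    if(lowest == UINT64_MAX)
        return std::vector<unsigned char>();

    baseOut = static_cast<std::uint32_t>(lowest);
    std::vector<unsigned char> image(static_cast<std::size_t>(highest - lowest), 0);
    for(const Chunk& c : chunks)
        std::copy(c.bytes.begin(), c.bytes.end(),
                  image.begin() + static_cast<std::ptrdiff_t>(c.remoteAddress - lowest));
    return image;
}
```

A loader that also writes to unrelated addresses (for example the PEB) would stretch the buffer considerably, so a real implementation would probably filter chunks by address range first.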

C:\>debugger.exe target.exe
Process target.exe Loaded at 00400000
Handling exception chain...
Unknown Breakpoint at 7C90120E
Creating breakpoint (WriteProcessMemory) at 7C802213
Handling exception chain...
Exception at 7C812AFB type 4242420
Handling exception chain...
Breakpoint (WriteProcessMemory) at 7C802213
getting stack...
WriteProcessMemory was called at address 4000000 on buffer b3adf8 with length 14

Process closed with exit code 0

C:\>cat 4000000.bin
Hello this is a test
C:\>

#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <iomanip>
#include <sstream>
#include <algorithm>
#include <functional>
#include <fstream>

#include <Windows.h>
#include <TlHelp32.h>

typedef void (*BreakpointCallback)(HANDLE proc, HANDLE thread);

struct Breakpoint
{
    std::string name;
    unsigned char originalBytes[2];
    BreakpointCallback callback; // Not yet used
};

typedef std::map<std::string, MODULEENTRY32> ModuleMap;
typedef std::map<void *, Breakpoint> BreakMap;

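// Copies `length` bytes from the debugged process at bufferAddress into "<address in hex>.bin".
// Returns true only if both the read and the write succeed.
bool DumpDataToFile(HANDLE proc, DWORD address, DWORD bufferAddress, DWORD length)
{
    std::stringstream filename;
    filename << std::hex << address << ".bin";

    std::vector<char> buffer(length);
    SIZE_T bytesRead = 0;
    if(!ReadProcessMemory(proc, reinterpret_cast<void *>(bufferAddress), buffer.data(), length, &bytesRead))
        return false;

    std::ofstream outfile(filename.str(), std::ofstream::binary | std::ofstream::trunc);
    outfile.write(buffer.data(), static_cast<std::streamsize>(bytesRead));
    return outfile.good();
}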

bool UpdateModuleList(int pid, ModuleMap& moduleList)
{
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, pid);
    if(snap == INVALID_HANDLE_VALUE)
        return false;

    MODULEENTRY32 me;
    me.dwSize = sizeof(me); // Module32First fails unless dwSize is initialized
    moduleList.clear();

    if(Module32First(snap, &me))
    {
        do
        {
            // Store modules under a lower-case key so lookups are case-insensitive
            std::string key = me.szModule;
            std::transform(key.begin(), key.end(), key.begin(),
                [](unsigned char c) { return static_cast<char>(tolower(c)); });
            moduleList[key] = me;
        } while (Module32Next(snap, &me));
    }

    CloseHandle(snap);
    return true;
}


// I only place breakpoints on WINAPI functions, so I overwrite 2 bytes:
//   mov edi,edi   becomes   int3
//                           nop
// This way I don't need to do anything special to handle the breakpoint: execution
// just continues with the nop and then follows the function prologue.
bool ToggleInt3Breakpoint(void *address, std::string name, BreakMap& breakList, HANDLE proc)
{
    BreakMap::iterator bpit = breakList.find(address);

    if(bpit != breakList.end())
    {
        // Remove existing BP by restoring the saved bytes
        std::cout << "Removing breakpoint (" << bpit->second.name << ") at " << address << std::endl;

        if(!WriteProcessMemory(proc, address, bpit->second.originalBytes, 2, NULL))
            return false;

        breakList.erase(address);
    }
    else
    {
        // Create new BP: save the original bytes, then write int3 + nop
        Breakpoint bp = {name, {0, 0}, NULL};
        std::cout << "Creating breakpoint (" << name << ") at " << address << std::endl;
        if(!ReadProcessMemory(proc, address, bp.originalBytes, 2, NULL))
            return false;

        if(!WriteProcessMemory(proc, address, "\xCC\x90", 2, NULL))
            return false;

        breakList[address] = bp;
    }
    return true;
}

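// Returns true while breakpoints still have to be placed (kernel32 not loaded yet)
// and false once the attempt has been made.
bool PlaceBreakpoints(ModuleMap& moduleList, BreakMap& breakList, HANDLE proc)
{
    ModuleMap::iterator kernel32 = moduleList.find("kernel32.dll");
    if(kernel32 == moduleList.end())
        return true;

    // The address is resolved in the debugger's own process; this assumes kernel32.dll
    // is loaded at the same base address in the target process.
    void *wpmAddress = reinterpret_cast<void *>(
        GetProcAddress(GetModuleHandle("kernel32.dll"), "WriteProcessMemory"));
    ToggleInt3Breakpoint(wpmAddress, "WriteProcessMemory", breakList, proc);
    return false;
}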

// Reads `slots` DWORDs from the top of the thread's stack, starting at ESP
bool GetStack(int slots, HANDLE thread, HANDLE proc, std::vector<DWORD>& stack)
{
    CONTEXT context;
    context.ContextFlags = CONTEXT_ALL;

    std::cout << "getting stack..." << std::endl;

    if(!GetThreadContext(thread, &context))
        return false;

    for(int i = 0; i < slots; i++)
    {
        DWORD slot;
        if(!ReadProcessMemory(proc, reinterpret_cast<void *>(context.Esp + (i * 4)), &slot, sizeof(DWORD), NULL))
            return false;
        stack.push_back(slot);
    }

    return true;
}

// We handle the breakpoints here (and potentially other exceptions)
void HandleException(DEBUG_EVENT de, BreakMap& breakList, HANDLE proc)
{
    std::cout << "Handling exception chain... " << std::endl;

    // Only the first record is handled: the nested ExceptionRecord pointer refers to
    // the debugged process's address space and must not be dereferenced here.
    EXCEPTION_RECORD *exception = &de.u.Exception.ExceptionRecord;

    BreakMap::iterator bp = breakList.find(exception->ExceptionAddress);
    if(exception->ExceptionCode == EXCEPTION_BREAKPOINT && bp != breakList.end())
    {
        std::cout << " Breakpoint (" << bp->second.name << ") at " << exception->ExceptionAddress << std::endl;

        if(bp->second.name == "WriteProcessMemory")
        {
            // Stack at function entry: [0] return address, [1] hProcess,
            // [2] lpBaseAddress, [3] lpBuffer, [4] nSize, [5] lpNumberOfBytesWritten
            std::vector<DWORD> stack;
            HANDLE thread = OpenThread(THREAD_GET_CONTEXT | THREAD_QUERY_INFORMATION, FALSE, de.dwThreadId);
            bool gotStack = thread != NULL && GetStack(6, thread, proc, stack);
            if(thread != NULL)
                CloseHandle(thread);

            if(gotStack)
            {
                std::cout << "WriteProcessMemory was called at address " << std::hex << stack[2]
                          << " on buffer " << stack[3]
                          << " with length " << stack[4] << std::endl;
                DumpDataToFile(proc, stack[2], stack[3], stack[4]);
            }
        }
    }
    else if(exception->ExceptionCode == EXCEPTION_BREAKPOINT)
        std::cout << " Unknown Breakpoint at " << exception->ExceptionAddress << std::endl;
    else
    {
        std::cout << " Exception at " << std::hex << exception->ExceptionAddress << " type " << exception->ExceptionCode << std::endl;
        MessageBeep(0);
        Sleep(100);
    }
}

int DebugMain(std::string targetPath)
{
    STARTUPINFO si = {0};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi;

    if(!CreateProcess(targetPath.c_str(), NULL, NULL, NULL, FALSE, DEBUG_PROCESS, NULL, NULL, &si, &pi))
    {
        std::cerr << "Error while creating process: " << GetLastError() << std::endl;
        return EXIT_FAILURE;
    }

    DEBUG_EVENT de;
    bool keepLooping = true;
    bool needBreakpoints = true;
    ModuleMap moduleList;
    BreakMap breakList;

    while(keepLooping && WaitForDebugEvent(&de, INFINITE))
    {
        switch(de.dwDebugEventCode)
        {
        case LOAD_DLL_DEBUG_EVENT:
        case UNLOAD_DLL_DEBUG_EVENT:
            UpdateModuleList(pi.dwProcessId, moduleList);
            break;
        case CREATE_PROCESS_DEBUG_EVENT:
            std::cout << "Process " << targetPath << " Loaded at " << de.u.CreateProcessInfo.lpBaseOfImage << std::endl;
            break;
        case EXIT_PROCESS_DEBUG_EVENT:
            std::cerr << "Process closed with exit code " << std::hex << de.u.ExitProcess.dwExitCode << std::endl;
            keepLooping = false;
            break;
        case EXCEPTION_DEBUG_EVENT:
            HandleException(de, breakList, pi.hProcess);
            break;
        default:
            break;
        }

        // Place breakpoints as soon as kernel32 is loaded
        if(needBreakpoints)
            needBreakpoints = PlaceBreakpoints(moduleList, breakList, pi.hProcess);

        // Note: DBG_CONTINUE also marks non-breakpoint exceptions as handled,
        // so the target's own exception handlers never see them.
        ContinueDebugEvent(de.dwProcessId, de.dwThreadId, DBG_CONTINUE);
    }
    return EXIT_SUCCESS;
}

int main(int argc, char **argv)
{
    if(argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " <executable path>" << std::endl;
        return EXIT_FAILURE;
    }

    return DebugMain(argv[1]);
}

Source:

https://thunked.org/programming/decrypting-runpe-malware-t110.html


Yes, now that's more like it. Antivirus products should do something similar, and that would save millions of people from stealers or infected programs. It probably wouldn't protect against every PE loader, but it could do an excellent job.


If there were no more people picking up stealers and infected programs, a lot of antivirus products (and many other services) would disappear, wouldn't they?

The food chain!


OK, two things:

1. Some antivirus products also scan memory, so they would find the malware there...

2. Why couldn't I simply put a breakpoint on ResumeThread and dump the process at that point?
