Nytro
Posted February 19, 2013

[h=3]Process Thread Creation Notification – The Easy Way[/h]
adeyblue @ 3:22 am

If what you’re writing already requires a dll, or you can leverage an existing one, then you’re already set: DllMain gets called when threads are created and destroyed, and you can use that to your advantage. If you’re not, or can’t, then you’re pretty much stuck for an answer. Conventional wisdom on the web seems to revolve around hooking CreateThread. However, with several methods of creating threads, called at various levels of the Win32 system, that isn’t always sufficient, especially if you want to execute code in the context of the new thread.

The DLL_THREAD_ATTACH notifications work because when threads are created and torn down, ntdll loops around the internal structures corresponding to each module loaded in the process and calls their entry point if they meet certain criteria. The structure for the exe is included in the enumeration, but as it doesn’t identify as a dll, its entry point isn’t called. The thing to do, then, is modify it so that a) it looks like a dll, and b) ntdll thinks our entry point is a DllMain. Usually this poking around in officially undocumented stuff is at least a slight pain (after all, you’re not meant to be touching it); in this case, however, the structure is a single call away.

[code]
// in ntdll.dll
EXTERN_C
NTSYSAPI
NTSTATUS
NTAPI
LdrFindEntryForAddress(
    HMODULE hMod,
    LDR_DATA_TABLE_ENTRY** ppEntry
);
[/code]

You give it an address in a module, and it gives you a pointer to the structure. Seems like a fair trade to me.
Here’s what’s in it.

[code]
struct LDR_DATA_TABLE_ENTRY {
    LIST_ENTRY InLoadOrderModuleList;
    LIST_ENTRY InMemoryOrderModuleList;
    LIST_ENTRY InInitializationOrderModuleList;
    PVOID BaseAddress;
    PVOID EntryPoint;
    ULONG SizeOfImage;
    UNICODE_STRING FullDllName;
    UNICODE_STRING BaseDllName;
    ULONG Flags;
    SHORT LoadCount;
    SHORT TlsIndex;
    union {
        LIST_ENTRY HashLinks;
        PVOID SectionPointer;
    };
    ULONG Checksum;
    union {
        ULONG TimeDateStamp;
        PVOID LoadedImports;
    };
    PVOID EntryPointActivationContext;
    PVOID PatchInformation;
};
[/code]

This is the XP version of the layout. More recent versions have appended extra fields, but we ignore them. The three fields of interest are BaseAddress, EntryPoint, and Flags. EntryPoint and Flags are self-explanatory, but BaseAddress? Yep: as well as checking the flags, the looping code also excludes any entry whose BaseAddress is the same as GetModuleHandle(NULL). Changing it might sound like it could be a bit disabling to the program, especially if you know that GetModuleHandle can also loop over these structures. It isn’t in practice though. It only affects GetModuleHandle() calls that look the exe up by name; the value returned from GetModuleHandle(NULL) is cached elsewhere, so modifying the entry leaves it unaffected.
Setting the new entry point to a DllMain-a-like is easy enough, and the BaseAddress value can be changed with BaseAddress += 2 (the result can still be used in kernel32 functions if anybody ever looks the exe up by name with GetModuleHandle()). That leaves the flag values, which we’ve neglected so far.

[code]
//
// Loader Data Table Entry Flags, from ReactOS
//
#define LDRP_STATIC_LINK                0x00000002
#define LDRP_IMAGE_DLL                  0x00000004
#define LDRP_LOAD_IN_PROGRESS           0x00001000
#define LDRP_UNLOAD_IN_PROGRESS         0x00002000
#define LDRP_ENTRY_PROCESSED            0x00004000
#define LDRP_ENTRY_INSERTED             0x00008000
#define LDRP_CURRENT_LOAD               0x00010000
#define LDRP_FAILED_BUILTIN_LOAD        0x00020000
#define LDRP_DONT_CALL_FOR_THREADS      0x00040000
#define LDRP_PROCESS_ATTACH_CALLED      0x00080000
#define LDRP_DEBUG_SYMBOLS_LOADED       0x00100000
#define LDRP_IMAGE_NOT_AT_BASE          0x00200000
#define LDRP_COR_IMAGE                  0x00400000
#define LDRP_COR_OWNS_UNMAP             0x00800000
#define LDRP_SYSTEM_MAPPED              0x01000000
#define LDRP_IMAGE_VERIFYING            0x02000000
#define LDRP_DRIVER_DEPENDENT_DLL       0x04000000
#define LDRP_ENTRY_NATIVE               0x08000000
#define LDRP_REDIRECTED                 0x10000000
#define LDRP_NON_PAGED_DEBUG_INFO       0x20000000
#define LDRP_MM_LOADED                  0x40000000
#define LDRP_COMPAT_DATABASE_PROCESSED  0x80000000
[/code]

There are a lot of them, yet we only need to concern ourselves with three. LDRP_IMAGE_DLL and LDRP_PROCESS_ATTACH_CALLED need to be set, to signal that we are a dll and that we’ve had our init code called. LDRP_DONT_CALL_FOR_THREADS needs to be clear, because being called for threads is exactly what we’re after!
So, putting it all into motion:

[code]
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winternl.h> // for UNICODE_STRING and NT_SUCCESS
#include <cstdio>

#define LDRP_IMAGE_DLL 0x00000004
#define LDRP_DONT_CALL_FOR_THREADS 0x00040000
#define LDRP_PROCESS_ATTACH_CALLED 0x00080000

struct LDR_DATA_TABLE_ENTRY {
    LIST_ENTRY InLoadOrderModuleList;
    LIST_ENTRY InMemoryOrderModuleList;
    LIST_ENTRY InInitializationOrderModuleList;
    PVOID BaseAddress;
    PVOID EntryPoint;
    ULONG SizeOfImage;
    UNICODE_STRING FullDllName;
    UNICODE_STRING BaseDllName;
    ULONG Flags;
    SHORT LoadCount;
    SHORT TlsIndex;
    union {
        LIST_ENTRY HashLinks;
        PVOID SectionPointer;
    };
    ULONG Checksum;
    union {
        ULONG TimeDateStamp;
        PVOID LoadedImports;
    };
    PVOID EntryPointActivationContext;
    PVOID PatchInformation;
};

// Our DllMain-a-like: ntdll calls this once the exe's entry passes for a dll
BOOL APIENTRY ThreadAndShutdownNotify(HMODULE hMod, DWORD reason, PVOID pDynamic)
{
    char buffer[100];
    switch(reason)
    {
        case DLL_THREAD_ATTACH:
        {
            sprintf(buffer, "Thread attach : %lu\n", GetCurrentThreadId());
        }
        break;
        case DLL_THREAD_DETACH:
        {
            sprintf(buffer, "Thread detach : %lu\n", GetCurrentThreadId());
        }
        break;
        case DLL_PROCESS_DETACH:
        {
            sprintf(buffer, "Process detach : %lu\n", GetCurrentThreadId());
        }
        break;
        default:
            // nothing to report, and buffer was never filled in
            return TRUE;
    }
    OutputDebugStringA(buffer);
    puts(buffer);
    return TRUE;
}

// Each test thread just blocks until the event is signalled
DWORD WINAPI WaitThread(PVOID p)
{
    return WaitForSingleObject((HANDLE)p, INFINITE);
}

typedef NTSTATUS (NTAPI*pfnLdrFindEntryForAddress)(HMODULE hMod, LDR_DATA_TABLE_ENTRY** ppLdrData);

int main()
{
    // GetModuleHandleW, so the wide string works whether or not UNICODE is defined
    HMODULE hNtdll = GetModuleHandleW(L"ntdll.dll");
    pfnLdrFindEntryForAddress LdrFindEntryForAddress = (pfnLdrFindEntryForAddress)GetProcAddress(hNtdll, "LdrFindEntryForAddress");
    LDR_DATA_TABLE_ENTRY* pEntry = NULL;
    if(LdrFindEntryForAddress && NT_SUCCESS(LdrFindEntryForAddress(GetModuleHandle(NULL), &pEntry)))
    {
        pEntry->EntryPoint = (PVOID)&ThreadAndShutdownNotify;
        pEntry->Flags |= LDRP_PROCESS_ATTACH_CALLED | LDRP_IMAGE_DLL;
        pEntry->Flags &= ~(LDRP_DONT_CALL_FOR_THREADS);
        pEntry->BaseAddress = (PVOID)(((ULONG_PTR)pEntry->BaseAddress) + 2);
    }
    else
    {
        puts("Something's strange, in the neighbourhood, and my phone doesn't work!!");
        return 1;
    }
    HANDLE hEvent4 = CreateEvent(NULL, TRUE, FALSE, NULL);
    HANDLE hThread[10];
    for(DWORD i = 0; i < ARRAYSIZE(hThread); ++i)
    {
        hThread[i] = CreateThread(NULL, 0, &WaitThread, hEvent4, 0, NULL);
    }
    SetEvent(hEvent4);
    WaitForMultipleObjects(ARRAYSIZE(hThread), hThread, TRUE, INFINITE);
    for(DWORD i = 0; i < ARRAYSIZE(hThread); ++i)
    {
        CloseHandle(hThread[i]);
    }
    CloseHandle(hEvent4);
    return 0;
}
[/code]

And running it produces:

[code]
Thread attach : 4208
Thread attach : 3052
Thread attach : 2076
Thread attach : 3144
Thread attach : 1476
Thread attach : 516
Thread attach : 4224
Thread attach : 4320
Thread attach : 3620
Thread attach : 1280
[/code]

Yay… wait a minute. Where are the detach and shutdown notifications? So much for mimicking DllMain.

See those LIST_ENTRY structures at the head of the LDR_DATA_TABLE_ENTRY structure? Those are the three different orders in which the module entries are linked. LoadOrder is obvious, whereas non-obviously MemoryOrder is the same as LoadOrder, and InitializationOrder is the order in which the DllMains were called…, ah. Our LDR_DATA_TABLE_ENTRY is the first entry of the first two lists; however, since it isn’t a kosher dll, it’s not present in the third. Thread attach notifications work since ntdll walks the LoadOrder module list when calling entry points for thread attaches. The other notifications don’t work, since ntdll walks the InitOrder list for those. The solution therefore is to add ourselves to that one, and that’s quite a simple op too. Where we insert ourselves is up to you; just after kernel32 in the list, so we’re called before it on the way out, seems as good a place as any though.
In summation:

[code]
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winternl.h> // for UNICODE_STRING and NT_SUCCESS
#include <cstdio>

#define LDRP_IMAGE_DLL 0x00000004
#define LDRP_DONT_CALL_FOR_THREADS 0x00040000
#define LDRP_PROCESS_ATTACH_CALLED 0x00080000

struct LDR_DATA_TABLE_ENTRY {
    LIST_ENTRY InLoadOrderModuleList;
    LIST_ENTRY InMemoryOrderModuleList;
    LIST_ENTRY InInitializationOrderModuleList;
    PVOID BaseAddress;
    PVOID EntryPoint;
    ULONG SizeOfImage;
    UNICODE_STRING FullDllName;
    UNICODE_STRING BaseDllName;
    ULONG Flags;
    SHORT LoadCount;
    SHORT TlsIndex;
    union {
        LIST_ENTRY HashLinks;
        PVOID SectionPointer;
    };
    ULONG Checksum;
    union {
        ULONG TimeDateStamp;
        PVOID LoadedImports;
    };
    PVOID EntryPointActivationContext;
    PVOID PatchInformation;
};

// Our DllMain-a-like: ntdll calls this once the exe's entry passes for a dll
BOOL APIENTRY ThreadAndShutdownNotify(HMODULE hMod, DWORD reason, PVOID pDynamic)
{
    char buffer[100];
    switch(reason)
    {
        case DLL_THREAD_ATTACH:
        {
            sprintf(buffer, "Thread attach : %lu\n", GetCurrentThreadId());
        }
        break;
        case DLL_THREAD_DETACH:
        {
            sprintf(buffer, "Thread detach : %lu\n", GetCurrentThreadId());
        }
        break;
        case DLL_PROCESS_DETACH:
        {
            sprintf(buffer, "Process detach : %lu\n", GetCurrentThreadId());
        }
        break;
        default:
            // nothing to report, and buffer was never filled in
            return TRUE;
    }
    OutputDebugStringA(buffer);
    puts(buffer);
    return TRUE;
}

// Each test thread just blocks until the event is signalled
DWORD WINAPI WaitThread(PVOID p)
{
    return WaitForSingleObject((HANDLE)p, INFINITE);
}

void InsertIntoList(LIST_ENTRY* pOurListEntry, LIST_ENTRY* pK32ListEntry)
{
    // Dll detach notifications are called in reverse list order, so to be
    // notified before kernel32 we have to sit after it in the list.
    // Our forward link points at whatever currently follows pK32ListEntry,
    // and our back link points at pK32ListEntry itself.
    LIST_ENTRY* pNextEntry = pK32ListEntry->Flink;
    pOurListEntry->Flink = pNextEntry;
    pOurListEntry->Blink = pNextEntry->Blink;
    pNextEntry->Blink = pOurListEntry;
    pOurListEntry->Blink->Flink = pOurListEntry;
}

typedef NTSTATUS (NTAPI*pfnLdrFindEntryForAddress)(HMODULE hMod, LDR_DATA_TABLE_ENTRY** ppLdrData);

int main()
{
    // GetModuleHandleW, so the wide strings work whether or not UNICODE is defined
    HMODULE hNtdll = GetModuleHandleW(L"ntdll.dll");
    pfnLdrFindEntryForAddress LdrFindEntryForAddress = (pfnLdrFindEntryForAddress)GetProcAddress(hNtdll, "LdrFindEntryForAddress");
    LDR_DATA_TABLE_ENTRY* pEntry = NULL;
    if(LdrFindEntryForAddress && NT_SUCCESS(LdrFindEntryForAddress(GetModuleHandle(NULL), &pEntry)))
    {
        pEntry->EntryPoint = (PVOID)&ThreadAndShutdownNotify;
        pEntry->Flags |= LDRP_PROCESS_ATTACH_CALLED | LDRP_IMAGE_DLL;
        pEntry->Flags &= ~(LDRP_DONT_CALL_FOR_THREADS);
        pEntry->BaseAddress = (PVOID)(((ULONG_PTR)pEntry->BaseAddress) + 2);
        // Link the exe's entry into the init order list, just after kernel32
        LDR_DATA_TABLE_ENTRY* pK32Entry = NULL;
        if(NT_SUCCESS(LdrFindEntryForAddress(GetModuleHandleW(L"kernel32.dll"), &pK32Entry)))
        {
            InsertIntoList(&pEntry->InInitializationOrderModuleList, &pK32Entry->InInitializationOrderModuleList);
        }
    }
    else
    {
        puts("Something's strange, in the neighbourhood, and my phone doesn't work!!");
        return 1;
    }
    HANDLE hEvent4 = CreateEvent(NULL, TRUE, FALSE, NULL);
    HANDLE hThread[10];
    for(DWORD i = 0; i < ARRAYSIZE(hThread); ++i)
    {
        hThread[i] = CreateThread(NULL, 0, &WaitThread, hEvent4, 0, NULL);
    }
    SetEvent(hEvent4);
    WaitForMultipleObjects(ARRAYSIZE(hThread), hThread, TRUE, INFINITE);
    for(DWORD i = 0; i < ARRAYSIZE(hThread); ++i)
    {
        CloseHandle(hThread[i]);
    }
    CloseHandle(hEvent4);
    return 0;
}
[/code]

Output:

[code]
Thread attach : 4856
Thread attach : 3896
Thread attach : 4140
Thread attach : 3488
Thread attach : 188
Thread detach : 188
Thread attach : 3272
Thread attach : 3376
Thread attach : 1632
Thread detach : 4856
Thread attach : 4120
Thread detach : 3896
Thread detach : 4140
Thread detach : 3488
Thread attach : 3484
Thread detach : 3272
Thread detach : 3376
Thread detach : 1632
Thread detach : 4120
Thread detach : 3484
Process detach : 4720
[/code]

Yay, for realsies this time. So that’s how you do it, with help from some jiggery-pokery. No hooks, no external code, just pure, clean, faffing around.

Source: Process Thread Creation Notification – The Easy Way