Nytro, posted September 17, 2014 (edited)

Microsoft Windows 8.1 Kernel Patch Protection Analysis & Attack Vectors

Kernel Patch Protection (also known as "patchguard") is a Windows mechanism designed to control the integrity of vital code and data structures used by the operating system. It was introduced in Windows Server 2003 x64 and has been steadily improved in later Windows versions. In this article we present a descriptive analysis of patchguard in the latest Windows 8.1 x64 OS, focusing primarily on patchguard initialization and the attack vectors related to it.

Kernel patch protection has naturally been developed incrementally, so the initialization process is common to all versions of Windows that have patchguard. Many papers about kernel patch protection on Windows describe its initialization; see the references at the end of this article for details.

Initialization sources

As is widely known, the main component of patchguard is initialized in a misleadingly named function, "KiFilterFiberContext". It is the starting point of our investigation. Looking for cross-references does not reveal its call site, but several articles state that patchguard initialization is invoked indirectly from the function "KeInitAmd64SpecificState". By "indirectly" we mean not just an indirect call but the use of exception handlers. This is a very common trick in patchguard-related functions, as we will see below. So the initialization call stack is:

[FONT=Courier New] (call) (call) (exception)[/FONT]
[FONT=Courier New]... --> Phase1InitializationDiscard --> KeInitAmd64SpecificState -> KiFilterFiberContext[/FONT]

This type of initialization is described in more detail in [1]. Incidentally, it is always called on the last CPU core.

However, this is not the only way the kernel initializes patchguard.
With a 4% probability, a patchguard context can also be initialized from another misleadingly named function, "ExpLicenseWatchInitWorker":

[FONT=Courier New]... --> Phase1InitializationDiscard --> sub_14071815C (its symbol is stripped, evidently because it processes the Windows license type for the current PC) --> ExpLicenseWatchInitWorker[/FONT]

The pseudocode of this function looks like this:

VOID ExpLicenseWatchInitWorker()
{
    PVOID KiFilterParam;
    NTSTATUS (*KiFilterFiberContext)(PVOID pFilterparam);
    BOOLEAN ForgetAboutPG;

    // KiServiceTablesLocked == KiFilterParam
    KiFilterParam = KiInitialPcr.Prcb.HalReserved[1];
    KiInitialPcr.Prcb.HalReserved[1] = NULL;

    KiFilterFiberContext = KiInitialPcr.Prcb.HalReserved[0];
    KiInitialPcr.Prcb.HalReserved[0] = NULL;

    ForgetAboutPG = (InitSafeBootMode != 0) | (KUSER_SHARED_DATA.KdDebuggerEnabled >> 1);

    // In 96% of cases initialization is skipped
    if ( __rdtsc() % 100 > 3 )
        ForgetAboutPG |= 1;

    if ( !ForgetAboutPG && KiFilterFiberContext(KiFilterParam) != 1 )
        KeBugCheckEx(SYSTEM_LICENSE_VIOLATION, 0x42424242, 0xC000026A, 0, 0);
}

As you may notice, a small "present" has been left in the "HalReserved" field of the processor control block for this initialization case. Tracing whoever left it there leads us to the very beginning of system startup:

[FONT=Courier New]... --> KiSystemStartup --> KiInitializeKernel --> KeCompactServiceTable --> KiLockServiceTable --> ??????[/FONT]

We have to pause here, because no code writes to the HalReserved fields directly. Instead, this is done through an exception handler, and in a different way from "KeInitAmd64SpecificState", because no exception is triggered. The code looks up the current instruction pointer, manually finds the corresponding function and its exception handler, and then calls that handler. The exception handler of the "KiLockServiceTable" function is an unnamed stub for "KiFatalExceptionFilter".

[FONT=Courier New]?????? ---> KiFatalExceptionFilter[/FONT]

"KiFatalExceptionFilter" in turn looks up the exception handler for the "KiServiceTablesLocked" function. Surprisingly, it is "KiFilterFiberContext"! The parameter that is passed to "KiFilterFiberContext" is located right after the "KiServiceTablesLocked" function. It is a small structure:

typedef struct _KI_FILTER_FIBER_PARAM {
    NTSTATUS (*PsCreateSystemThread)();      // a pointer to PsCreateSystemThread function
    KSTART_ROUTINE sub_140235C44;            // unnamed checker subroutine
    KDPC KiBalanceSetManagerPeriodicDpc;     // global DPC struct
} KI_FILTER_FIBER_PARAM, *PKI_FILTER_FIBER_PARAM;

"KiFatalExceptionFilter" stores these pointers in the "HalReserved" fields.

Creating patchguard context

Let's get back to the "KiFilterFiberContext" function. Its pseudocode is given below:

BOOLEAN KiFilterFiberContext(PVOID pKiFilterParam)
{
    BOOLEAN Result = TRUE;
    DWORD64 dwDpcIdx1 = __rdtsc() % 13;
    DWORD64 dwRand2 = __rdtsc() % 10;
    DWORD64 dwMethod1 = __rdtsc() % 6;

    AntiDebug();

    // Let's call sub_1406D6F78 KiInitializePatchGuardContext since it does initialize patchguard context
    Result = KiInitializePatchGuardContext(dwDpcIdx1, dwMethod1, (dwRand2 < 6) + 1, pKiFilterParam, TRUE);

    // A 50% chance to create two patchguard contexts
    // (note: as written, dwRand2 < 6 holds for 6 of the 10 possible values)
    if (dwRand2 < 6)
    {
        DWORD64 dwDpcIdx2 = __rdtsc() % 13;
        DWORD64 dwMethod2 = __rdtsc() % 6;

        do
        {
            dwMethod2 = __rdtsc() % 6;
        } while ((dwMethod1 != 0) && (dwMethod1 == dwMethod2));

        Result = KiInitializePatchGuardContext(dwDpcIdx2, dwMethod2, 2, pKiFilterParam, FALSE);
    }

    AntiDebug();

    return Result;
}

The code is rather clear. Since each call creates one or two contexts and there are two initialization paths, we can assume that up to 4 patchguard contexts can be active on a running system simultaneously. Remember this function: wherever it is called, we can be 100% sure that a new patchguard context is being initialized.

The function that creates and initializes a patchguard context is the one we named "KiInitializePatchGuardContext". It is a huge, obfuscated function.
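As a side note on the selection logic in the "KiFilterFiberContext" pseudocode above, here is a minimal sketch in plain C that enumerates its outcomes instead of sampling them. The helper names (second_method_accepted, second_context_cases, accepted_pairs) are hypothetical and exist only for this illustration; nothing here is kernel code. It assumes the pseudocode is accurate as printed, that is, a second context is created when a value in 0..9 is below 6, and the do/while loop repeats only while the first method is non-zero and equal to the second.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of scheduling methods (0..5), as in the pseudocode above. */
enum { PG_METHOD_COUNT = 6 };

/* True when a candidate second method ends the do/while loop.
 * The loop repeats only while method1 != 0 and method1 == method2. */
bool second_method_accepted(unsigned method1, unsigned method2)
{
    assert(method1 < PG_METHOD_COUNT && method2 < PG_METHOD_COUNT);
    return method1 == 0 || method1 != method2;
}

/* How many of the 10 possible dwRand2 values satisfy dwRand2 < 6. */
unsigned second_context_cases(void)
{
    unsigned hits = 0;
    for (unsigned r = 0; r < 10; r++) {
        if (r < 6)
            hits++;
    }
    return hits;
}

/* How many of the 36 (method1, method2) pairs the loop can end on. */
unsigned accepted_pairs(void)
{
    unsigned count = 0;
    for (unsigned m1 = 0; m1 < PG_METHOD_COUNT; m1++) {
        for (unsigned m2 = 0; m2 < PG_METHOD_COUNT; m2++) {
            if (second_method_accepted(m1, m2))
                count++;
        }
    }
    return count;
}
```

Under these assumptions second_context_cases() returns 6, so the condition as printed fires in 6 of 10 cases even though the comment in the pseudocode calls it a 50% chance. accepted_pairs() returns 31: the only repeated pair that survives the loop is (0, 0), which agrees with the article's later remark that method 0 is the only one that can be initialized twice.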
It seems fitting to quote Alex Ionescu's tweet about it: "I love the new #Windows 8 Patch Guard. Fixes so many of the obvious holes in downlevel, and the new hyper-inlined obfuscation makes me cry."

Indeed! IDA Pro's decompiler spends about 20 minutes on it on a Core i7-3770 CPU and produces 26K lines of code. It is not worth dealing with as a single unit. Luckily, you can pick out small pieces that give you a clue about the methods the new patchguard uses. That is why we did not reverse engineer it entirely; instead we took and analyzed several parts of it. Feel free to explore this function yourself, and you may discover new wonderful things!

It takes 5 parameters on Windows 8.1:

1. Index of the DPC routine to be called from the created patchguard DPC to check the patchguard context. It may be one of these:

    // These ones don't use exception handlers to fire checks
    KiTimerDispatch (copied to random pool allocation)
    KiDpcDispatch (copied into patchguard context)
    // These use exception handlers to fire patchguard checks
    ExpTimerDpcRoutine
    IopTimerDispatch
    IopIrpStackProfilerTimer
    PopThermalZoneDpc
    CmpEnableLazyFlushDpcRoutine
    CmpLazyFlushDpcRoutine
    KiBalanceSetManagerDeferredRoutine
    ExpTimeRefreshDpcRoutine
    ExpTimeZoneDpcRoutine
    ExpCenturyDpcRoutine

The last 10 DPCs are regular system DPCs with a useful payload, but when they encounter a DeferredContext that holds a non-canonical address, they call the corresponding KiCustomAccessRoutine function. These functions are only called when an appropriate scheduling method is used (0, 1, 2, 5).

2. Scheduling method. These are the methods used to fire the patchguard DPC object that is created inside the "KiInitializePatchGuardContext" function:

KeSetCoalescableTimer (0). A timer object is created with a random fire period between 2 minutes and 2 minutes 10 seconds.

Prcb.AcpiReserved (1). In this case a patchguard DPC is fired when a certain ACPI event occurs, for example a transition to an idle state. "HalpTimerDPCRoutine" checks whether 2 minutes have passed since it last queued a DPC, and if so queues another one, taken from the Prcb.AcpiReserved field.

Prcb.HalReserved (2). Here a patchguard DPC is queued in "HalpMcaQueueDpc" when a HAL timer clock interrupt occurs. This also happens with a period of at least 2 minutes. The queued patchguard DPC is taken from the Prcb.HalReserved field.

PsCreateSystemThread (3). In this case the patchguard DPC routine is not used; a system thread is created instead. The thread procedure is taken from the KI_FILTER_FIBER_PARAM structure. The patchguard DPC is used only as a container for the address of the newly created patchguard context.

KeInsertQueueApc (4). This time a regular kernel APC with the "KiDispatchCallout" APC procedure is queued to one of the system threads. No patchguard DPC is fired here either. The system thread is chosen by its start address, which must equal either PopIrpWorkerControl or CcQueueLazyWriteScanThread.

KiBalanceSetManagerPeriodicDpc (5). The patchguard DPC is stored in a global variable named "KiBalanceSetManagerPeriodicDpc". It is queued in the "KiUpdateTimeAssist" and "KeClockInterruptNotify" functions every "KiBalanceSetManagerPeriod" ticks.

3. This parameter can be either 1 or 2. We are not sure how it affects the "KiInitializePatchGuardContext" function, but it is somehow connected to the number of checks performed while the patchguard context verification routine runs.

4. A pointer to the KI_FILTER_FIBER_PARAM structure. Notably, the set of methods available inside "KiInitializePatchGuardContext" depends on the presence of this parameter. If it is present, the method bit mask is tested against 0x29 (101001b), which allows methods 0, 3 and 5. Otherwise methods 0, 1, 2 and 4 are available. That makes sense, because methods 3 and 5 require a valid KI_FILTER_FIBER_PARAM structure.

5. A Boolean parameter that tells whether the checksums of NT kernel functions have to be recalculated.

As you might guess, the only scheduling method that can be initialized twice is 0, so "KiFilterFiberContext" takes this into account when it chooses a method for the second call of "KiInitializePatchGuardContext".

Firing a patchguard check

Methods that fire patchguard DPC

The main principle of the patchguard check is to launch a patchguard context verification routine at DPC level, and then queue a work item that checks vital system structures at passive level, followed by context recreation and rescheduling. The verification work item uses a copy of the "FsRtlUninitializeSmallMcb" function. You can examine it if you want to figure out how the check works.

For the methods that use DPC activation, there is common code inside the 10 listed DPC routines that checks whether "DeferredContext" is a non-canonical address. If the address is canonical, the DPC simply executes its payload. Otherwise one of the 10 "KiCustomAccessRoutineX" functions is called. When "KiCustomAccessRoutineX" is called, the value of the low 2 bits of "DeferredContext" plus 1 is used as the number of steps to roll through the "KiCustomRecurseRoutineX" functions. These recursive routines call one another in a cycle, incrementing the X value at each step.
When the roll is over, "KiCustomRecurseRoutineX" tries to dereference the DeferredContext value as a pointer, which inevitably generates a #GP exception since this address is non-canonical.

// Inside DPC routine
// True only for a non-canonical address (the shift is treated as arithmetic,
// so canonical values give 0 or all ones)
if ( (DeferredContext >> 47) < 0xFFFFFFFFFFFFFFFFui64 && DeferredContext >> 47 != 0 )
{
    ...
    KiCustomAccessRoutineX(DeferredContext);
    ...
}

void KiCustomAccessRoutine9(DWORD64 DeferredContext)
{
    KiCustomRecurseRoutine9((DeferredContext & 3) + 1, DeferredContext);
}

void KiCustomRecurseRoutine9(DWORD dwRoll, DWORD64 DeferredContext)
{
    DWORD dwNextRoll;
    DWORD64 go_go_GP;

    dwNextRoll = dwRoll - 1;
    if ( dwNextRoll )
        KiCustomRecurseRoutine0(dwNextRoll, DeferredContext);

    go_go_GP = *(DWORD64 *)DeferredContext; // #GP
}

// DPC routine call sequence
ExpTimerDpcRoutine -> KiCustomAccessRoutine0 -> KiCustomRecurseRoutine0 ... KiCustomRecurseRoutineN
IopTimerDispatch -> KiCustomAccessRoutine1 -> KiCustomRecurseRoutine1 ... KiCustomRecurseRoutineN
IopIrpStackProfilerTimer -> KiCustomAccessRoutine2 -> KiCustomRecurseRoutine2 ... KiCustomRecurseRoutineN
PopThermalZoneDpc -> KiCustomAccessRoutine3 -> KiCustomRecurseRoutine3 ... KiCustomRecurseRoutineN
CmpEnableLazyFlushDpcRoutine -> KiCustomAccessRoutine4 -> KiCustomRecurseRoutine4 ... KiCustomRecurseRoutineN
CmpLazyFlushDpcRoutine -> KiCustomAccessRoutine5 -> KiCustomRecurseRoutine5 ... KiCustomRecurseRoutineN
KiBalanceSetManagerDeferredRoutine -> KiCustomAccessRoutine6 -> KiCustomRecurseRoutine6 ... KiCustomRecurseRoutineN
ExpTimeRefreshDpcRoutine -> KiCustomAccessRoutine7 -> KiCustomRecurseRoutine7 ... KiCustomRecurseRoutineN
ExpTimeZoneDpcRoutine -> KiCustomAccessRoutine8 -> KiCustomRecurseRoutine8 ... KiCustomRecurseRoutineN
ExpCenturyDpcRoutine -> KiCustomAccessRoutine9 -> KiCustomRecurseRoutine9 ... KiCustomRecurseRoutineN

Here comes structured exception handling again.
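Before turning to the handlers themselves, the two small computations in the DPC path above (the canonical-address test and the roll count) can be made concrete with a short sketch in plain C. The function names is_canonical, roll_count and final_recurse_index are hypothetical and not taken from the kernel. The sketch assumes 48-bit virtual addressing, where bits 63..47 must be all zeros or all ones, and it assumes the ten recurse routines wrap from 9 back to 0, as the "KiCustomRecurseRoutine9" pseudocode suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of KiCustomRecurseRoutineX functions listed in the article. */
enum { RECURSE_ROUTINE_COUNT = 10 };

/* Canonical under 48-bit addressing: bits 63..47 are all 0 or all 1. */
bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* upper 17 bits, logical shift */
    return top == 0 || top == 0x1FFFF;
}

/* Roll count as computed in KiCustomAccessRoutineX: low 2 bits plus 1. */
unsigned roll_count(uint64_t deferred_context)
{
    return (unsigned)(deferred_context & 3) + 1;
}

/* Index X of the innermost KiCustomRecurseRoutineX reached by the roll,
 * i.e. the first one to attempt the dereference.
 * Assumption: routine X calls routine (X + 1) % 10 until the roll ends. */
unsigned final_recurse_index(unsigned start, uint64_t deferred_context)
{
    assert(start < RECURSE_ROUTINE_COUNT);
    return (start + roll_count(deferred_context) - 1) % RECURSE_ROUTINE_COUNT;
}
```

For example, a DeferredContext whose low two bits are 01 gives a roll count of 2, so a DPC that enters through KiCustomAccessRoutine9 would reach KiCustomRecurseRoutine0 before the faulting dereference. Unlike the decompiled check, which relies on an arithmetic shift, this sketch uses an unsigned shift and compares the upper 17 bits explicitly.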
If you look up all the exception handlers for these DPC routines, you'll discover several nested __try/__except and __try/__finally blocks. For example, "ExpTimerDpcRoutine" looks something like this:

...
__try
{
    __try
    {
        __try
        {
            __try
            {
                KiCustomAccessRoutine0(DeferredContext);
            }
            __finally
            {
                FinalSub1();
            }
        }
        __except (FilterSub1()) // patchguard context decryption occurs here
        {
            // Nothing
        }
    }
    __finally
    {
        FinalSub2();
    }
}
__except (FilterSub2())
{
    // Nothing
}
...

ExpCenturyDpcRoutine, ExpTimeZoneDpcRoutine, ExpTimeRefreshDpcRoutine, KiBalanceSetManagerDeferredRoutine, CmpLazyFlushDpcRoutine, CmpEnableLazyFlushDpcRoutine, PopThermalZoneDpc, ExpTimerDpcRoutine ... -> __C_specific_handler
IopIrpStackProfilerTimer, IopTimerDispatch ... -> __GSHandlerCheck_SEH (GS check + __C_specific_handler)

Depending on the DPC routine, the decryption routine (based on the KiWaitAlways and KiWaitNever variables) may reside in one of the exception filters, exception handlers or termination handlers. Further patchguard context verification also occurs inside the decryption routine, right after the decryption.

As for the "KiTimerDispatch" and "KiDpcDispatch" DPC routines, they call patchguard context verification directly. Also, depending on the DPC routine, a different type of patchguard context encryption is used (or none at all).

Other methods

Method 3 creates a system thread. The thread procedure sleeps for between 2 minutes and 2 minutes 10 seconds, using either "KeDelayExecutionThread" or "KeWaitForSingleObject" on a kernel object that is never signaled. After the wait times out, it decrypts the patchguard context and executes the verification routine.

Method 4 inserts an APC with the "KiDispatchCallout" function as the kernel routine and "EmpCheckErrataList" as the normal routine. Patchguard context decryption and validation occur upon APC delivery to the target waiting thread, which happens almost immediately.
In this method, the 2-minute wait is located inside the verifier work item routine.

One more piece of a puzzle

That would be all about patchguard initialization, but looking for cross-references to KUSER_SHARED_DATA.KdDebuggerEnabled led us to a suspicious function named "CcInitializeBcbProfiler". It is full of bit rotations and magic numbers, which prompted us to check whether it is related to the patchguard mechanism.

[FONT=Courier New]... -> Phase1InitializationDiscard --> CcInitializeCacheManager --> CcInitializeBcbProfiler[/FONT]

It seems to have the same roots! With a 50% chance it queues either a DPC with the "CcBcbProfiler" routine or a work item with an unnamed work item routine (which is almost identical to the "CcBcbProfiler" routine). This mechanism picks one random function from the NT kernel module and checks its consistency every 2 minutes.

Interestingly, all of the patchguard-related functions are located next to each other, one after another, starting from "FsRtlMdlReadCompleteDevEx". This suggests that they come from a single compilation unit, which gives us hope that all of the patchguard initialization paths have been covered in this article.

Attacks

Now that we have covered patchguard initialization, we know which wires of the patchguard bomb can be cut to defuse it! However, the approach depends on the patchguard DPC scheduling method. Since we cover a specific version of patchguard, that of Windows 8.1, we are going to use precomputed offsets for accessing the fields of private kernel structures.

The common defusing principle is first to check whether the verification routine is in progress, and to wait a bit if it is. Then do the following:

KeSetCoalescableTimer (0). Scan through the Prcb timer table and disable the timer that has a suitable DPC object.

AcpiReserved field (1). Zero this field out, so the DPC won't be fired again.

HalReserved field (2). Same here.

PsCreateSystemThread (3). Enumerate all threads in the system and unwind their stacks. Then check whether the start routine from the structure that follows "KiServiceTablesLocked" (KI_FILTER_FIBER_PARAM) is present in a call stack. If it is there, it is a patchguard thread. Disable it while it is in a wait state by setting the wait time to infinite.

APC (4). Take the NUMA node of the current Prcb and its worker thread pool. Scan through its sleeping worker threads, unwinding their stacks down to the "ExpWorkerThread" function. If there are functions that cannot be found in the NT image runtime function data, try to unwind them sequentially with the runtime data for "FsRtlMdlReadCompleteDevEx" and "FsRtlUninitializeSmallMcb". If that succeeds, then it is a patchguard worker. Disable it by setting the wait time to infinity.

KiBalanceSetManagerPeriodicDpc (5). Zero this struct out.

By disabling a timer we mean setting its due time to infinity, so that it never fires. By a suitable DPC object we mean a DPC object whose deferred context is set to a non-canonical address. Furthermore, you can additionally check that this pointer is valid after XORing its value with the quad-word that immediately follows the KDPC struct and ANDing it with 0xFFFF800000000000.

As for the "CcBcbProfiler" piece, we consider it not to be relevant, since there is only a small chance that it will check exactly the function of interest.

Summary

The quality of the Windows 8.1 kernel patch protection mechanism is extremely high. Many interesting anti-debugging tricks are used against dynamic analysis, for example resetting the IDT before accessing debug registers (which hangs the system if you set a break on debug register access), along with overall obfuscation such as macros that generate pseudo-random values, loop unrolling and so on. Static analysis is also extremely difficult, since many indirect function calls are used, including calls made through exception handlers.

It is a really nice tool to keep the system safe.
Therefore, we hope that as a developer you won't face situations where you need to disable this cool mechanism!

Authors: Mark Ermolov, Artem Shishkin, Positive Research

Source: Positive Research Center: Microsoft Windows 8.1 Kernel Patch Protection Analysis & Attack Vectors

Edited September 17, 2014 by Nytro