Interactive guide to Buffer Overflow exploitation

Nytro · April 15, 2020

Interactive guide to Buffer Overflow exploitation

December 16, 2019
Written by: Vetle Økland

A Buffer Overflow is a bug class in a program typically written in a memory unsafe language like C or C++. Buffer Overflow bugs from user-input can often allow someone to overwrite some data in memory they weren't supposed to. Before we dive into how to exploit Buffer Overflow bugs, we will do a quick introduction to Assembly.

Assembly is a language that describes raw machine code. Assembly is the lowest level programming language, and the processor executes exactly what you write in Assembly. This means that we can also directly translate machine code bits and bytes back to Assembly without losing any information. Languages like C, C++, Rust, etc. on the other hand translate what you write to machine code, which we can translate back to Assembly, but we can't translate it back to exactly what the code was in C, C++ or Rust, only approximations based on the Assembly. For all intents and purposes, Assembly is the exact code that the processor executes.

Let's jump into a little interactive primer on Assembly. First of all, in Assembly we don't really have variables in the sense that we have in JavaScript, C, Rust, Go, etc. Instead, we have a set amount of registers that can store one value at a time. On a 64-bit system, these registers can store up to 64 bits of values (e.g. number from 0 to 18 446 744 073 709 551 615), on a 32-bit system registers can store up to 32-bit of values (0 to 4 294 967 295). Some of the registers in Intel architectures are named RAX, RDI, RBX, these are general purpose registers that you can pretty much use for whatever you want. Then there are others like RIP and RSP which control the address of the next instruction we should execute is in memory and address to the stack (more on that later), respectively.

Underneath you will see what happens when we execute mov {register}, {value} instructions. mov is a "command" that tells the processor to store (or "move") values into a register or memory address. When you step through the program you can see that for each mov instruction the register updates with the value specified in the mov instruction. Most values in these examples are represented as HEX values.

main:

0xFA2 mov eax, 0x1337

0xFA7 mov ebx, 0x7331

0xFAC mov edi, 0x1995

0xFB1 call exit

RAX0x00000000RBX0x00000000RDI0x00000000

Note that eax and rax are two different names for the same register. You can read about that here if you're interested in learning more about it.

Most processor architectures have a concept of a "stack". The stack is a quick place to store temporary values in memory (RAM). 64-bit systems store 64 bits (8 bytes) at a time in a growing stack, and when you go to fetch the values you fetch them 64 bits at a time, or 8 bytes. You store a value on the stack by using instruction push {register name} and you fetch a value into a register by using instruction pop {register name}. There is a special register RSP that has the value (pointer) of the memory address of the last value on the stack. This means RSP should always point to the last value you pushed to the stack. If you pop (fetch) a value from the stack, RSP decreases by 8, because we always push (store) and pop (fetch) values in 8-byte increments or decrements.

main:

0xFA9 mov eax, 0x13371337

0xFAE push rax

0xFAF pop rdi

0xFB0 call exit

RAX0x00000000RDI0x00000000RSP0x00005000

Here we set RAX to the HEX value 0x13371337, we push that value to the stack and then pop it into RDI

On the right side underneath the registers you can see the stack, each box is 1 byte, each row is 1 8-byte value. On the top of each box you can see the memory address of the byte, on the bottom is the byte's value and in the middle you can see the ASCII representation of the value.

Here's how it looks if you push and pop multiple values.

main:

0xFA5 mov eax, 0x13371337

0xFAA push rax

0xFAB push rax

0xFAC push rax

0xFAD pop rdi

0xFAE pop rbx

0xFAF pop rsi

0xFB0 call exit

RAX0x00000000RDI0x00000000RBX0x00000000RSI0x00000000RSP0x00005000

There's another special register called RIP that always points to the memory address of the next instruction the processor is going to execute. This register cannot be changed with mov instructions, but we can use jmp instructions instead. This allows us to jump to different places in the program when we need to.

add1:

0xFA5 add rax, 1

0xFA9 jmp add2

main:

0xFAB mov eax, 1

0xFB0 jmp add1

add2:

0xFB2 add rax, 2

0xFB6 jmp add1

RAX0x00000000RIP0x00000fab

When we jump to "add2" we add 2 to the current value of RAX, when we jump to add1 we add 1.

The emulator has conveniently labeled addresses for us, so instead of seeing jmp 0xFA5, it's changed to the more readable jmp add1. The labels in the assembly view are for convenience, but when the processor executes the code, it doesn't care nor need the labels. This means that the label add1 is for address 0xFA5, main is for address 0xFAB and add2 is for address 0xFB2.

Another way to jump to another place in the code is to use a call {address} instruction. A call instruction basically pushes the address of the next instruction after the call instruction to stack then jumps to the specified address. The counterpart to call is ret which pops the last value on the stack and jumps (returns) to that address. This lets pieces of code act as functions that you can call and then return from when it's done and continue normal execution. Functions we often want to call is to print something on screen, transform text, etc., just like you use functions in other programming languages.

main:

0xF9D mov eax, 0x666

_loop:

0xFA2 call sub1

0xFA7 call add2

0xFAC jmp _loop

add2:

0xFAE add rax, 2

0xFB2 ret

sub1:

0xFB3 sub rax, 1

0xFB7 ret

RAX0x00000000RIP0x00000f9dRSP0x00005000

Calls can also be nested, and you should see the stack grow and shrink as calls and returns are being executed.

main:

0xF93 call call1

0xF98 call exit

call1:

0xF9D call call2

0xFA2 ret

call2:

0xFA3 call call3

0xFA8 ret

call3:

0xFA9 call call4

0xFAE ret

call4:

0xFAF mov eax, 0x42

0xFB4 ret

RAX0x00000000RIP0x00000f93RSP0x00005000

We can also call functions with arguments, but we cannot do that inline in a call instruction, we need to either use the stack or the registers. For functions with few parameters we use registers for the first argument, the second, and the third. If we want to print something, we can use the printf function with the memory address of the string we want to print in the RDI register. To load the memory address into RDI we need to use the lea instruction, for our intents and purposes you can think of lea as a fancy mov. When we run the lea instruction in the emulator you can see that RDI is updated to 0x1000 which is the address of the "Hello, world!" string.

main:

0xFA2 lea rdi, qword ptr [hello_world]

0xFA9 call printf

0xFAE call exit

RDI0x00000000

Which registers to use for function arguments, and how many before the stack is used instead, varies a lot depending on programming language, compiler, operating system, etc. Assembly does not specify which registers must be used for arguments.

Some functions also return some value, this is often in the RAX register. After we've called a function we can get the return value from the RAX register.

Underneath is a little program that parrots what you input. First it prints the string "Say something: " to the screen, then it allocates 8 bytes on the stack for your input by subtracting 8 from the value of RSP. The read function takes two arguments, the first (in RDI) is the address of where to put the user's message, the second (in RSI) is the maximum number of bytes from user input to read.

main:

0xF7C lea rdi, qword ptr [say_something]

0xF83 call print

0xF88 sub rsp, 8

0xF8C mov rdi, rsp

0xF8F mov esi, 8

0xF94 call read

0xF99 mov rax, rsp

0xF9C lea rdi, qword ptr [you_said]

0xFA3 call printf

0xFA8 call exit

RDI0x00000000RSP0x00005000

When you reach the read instruction, all further execution is blocked until you press enter in the console.

When running the example, you should be able to see your input (up to 8 bytes) in the stack visualization, each character represented in their own "boxes". printf in this case takes the first format argument in RAX, so we copy the value of RSP over to RAX because the user input string is currently the last thing on the stack, and we want printf to print that string.

A bug arises when user input is allowed to exceed the number of bytes we reserved for it. In the case of the last example, if RSI had been a larger number than what we subtracted from RSP before calling read, the user could have written over some previous values on the stack.

In the get_name function of the example below we are allocating 0x10 (16) bytes from RSP (with sub rsp, 0x10), but when calling read we are instructing it in the RSI register to accept up to 0x18 (24) bytes into the memory location of RDI (which points at the stack). The value put on the stack before the 16 bytes allocated for user input is a return address from the call made to get_name in main.

secret_function:

0xF52 lea rdi, qword ptr [secret_text]

0xF59 call print

0xF5E call exit

main:

0xF63 call ask_name

0xF68 call get_name

0xF6D mov rax, rdi

0xF70 call welcome

0xF75 call exit

ask_name:

0xF7A lea rdi, qword ptr [wassa_you_name]

0xF81 call print

0xF86 ret

get_name:

0xF87 sub rsp, 0x10

0xF8B mov rdi, rsp

0xF8E mov esi, 0x18

0xF93 call read

0xF98 mov rdi, rsp

0xF9B add rsp, 0x10

0xF9F ret

welcome:

0xFA0 lea rdi, qword ptr [welcome_msg]

0xFA7 call printf

0xFAC ret

RDI0x00000000RSP0x00005000

In this example there is a function called secret_function that is not called anywhere in normal execution. However, because of the buffer overflow it is possible to "return" to that function by first writing 16 bytes of whatever you want followed by 8 bytes that represent the address you want to return to. For example, if I wanted to go directly to call exit at address 0xF75, I would input AAAAAAAAAAAAAAAA\x00\x00\x00\x00\x00\x00\x0f\x75 into the console when prompted for input. By prepending \x to a hexadecimal, you input that byte directly, so \0x41 is the ASCII representation for A, \x00 is just a null byte. The secret_function starts at address 0xf52, you should be able to edit my example input to jump to that function.

The last example is more complex than the previous examples. This program generates a random password each time it starts and requires you to input the right one to "log in". You'll find that in the enter_password function read is being given more space than was allocated on the stack. Try to bypass the login function and get to say_success.

main:

0xE80 call boot_up

0xE85 call exit

say_success:

0xE8A lea rdi, qword ptr [_loggedin_msg]

0xE91 call print

0xE96 call exit

say_wrong:

0xE9B lea rdi, qword ptr [wrong_pass_msg]

0xEA2 call print

0xEA7 ret

enter_password:

0xEA8 lea rdi, qword ptr [enter_pass_msg]

0xEAF call print

0xEB4 sub rsp, 0x10

0xEB8 mov rdi, rsp

0xEBB mov esi, 0x20

0xEC0 call read

0xEC5 lea rsi, qword ptr [password_ptr]

0xECC mov rsi, qword ptr [rsi]

0xECF mov edx, 0x10

0xED4 call strncmp

0xED9 add rsp, 0x10

0xEDD ret

boot_up:

0xEDE lea rdi, qword ptr [booting_msg]

0xEE5 call print

0xEEA call integrity_check

0xEEF lea rdi, qword ptr [welcome_msg]

0xEF6 call print

password_loop:

0xEFB call enter_password

0xF00 cmp rax, 0

0xF04 je say_success

0xF06 call say_wrong

0xF0B jmp password_loop

0xF0D ret

integrity_check:

0xF0E push rdi

0xF0F lea rdi, qword ptr [password_ptr]

0xF16 mov edi, 0x10

0xF1B call malloc

0xF20 mov rdi, rax

0xF23 call create_random_password

0xF28 lea rax, qword ptr [password_ptr]

0xF2F mov qword ptr [rax], rdi

0xF32 pop rdi

0xF33 ret

RAX0x00000000RBX0x00000000RDI0x00000000

Some of the concepts in this blog post are grossly oversimplified, and we are not talking about any mitigations to these types of attacks at all. These types of attacks have worked well in the past, but due to mitigations such as ASLR, DEP, Stack Canaries, and more recently Pointer Authentication, modern binary exploitation requires a lot more effort from the attacker.

I have not touched endianness because I only see it as adding unnecessary complexity that isn't really relevant to the exercises. In fact, the emulator actually handles endianness wrong, according to the Intel specifications.

Normally, in a situation where you can control the instruction pointer and DEP is not present (e.g. you can execute code on the stack or other writable memory) you would just write your own custom code there and jump to it. That gives you a lot of freedom in terms of how to attack, but with DEP enabled you instead have to create a ROP chain. We have another interactive tutorial on ROP that you might want to look at.

I'd love some feedback over on Twitter @bordplate. Or you can check out the code for the emulator on GitHub.

Sursa + Interactiv: https://nagarrosecurity.com/blog/interactive-buffer-overflow-exploitation

Sign In

Interactive guide to Buffer Overflow exploitation

Recommended Posts

Nytro

Interactive guide to Buffer Overflow exploitation

Join the conversation

Browse

Activity

Pages