ropasaurusrex: a primer on return-oriented programming

Nytro · May 17, 2013

ropasaurusrex: a primer on return-oriented programming

One of the worst feelings when playing a capture-the-flag challenge is the hindsight problem. You spend a few hours on a level—nothing like the amount of time I spent on cnot, not by a fraction—and realize that it was actually pretty easy. But also a brainfuck. That's what ROP's all about, after all!

Anyway, even though I spent a lot of time working on the wrong solution (specifically, I didn't think to bypass ASLR for quite awhile), the process we took of completing the level first without, then with ASLR, is actually a good way to show it, so I'll take the same route on this post.

Before I say anything else, I have to thank HikingPete for being my wingman on this one. Thanks to him, we solved this puzzle much more quickly and, for a short time, were in 3rd place worldwide!

Coincidentally, I've been meaning to write a post on ROP for some time now. I even wrote a vulnerable demo program that I was going to base this on! But, since PlaidCTF gave us this challenge, I thought I'd talk about it instead! This isn't just a writeup, this is designed to be a fairly in-depth primer on return-oriented programming! If you're more interested in the process of solving a CTF level, have a look at my writeup of cnot.

What the heck is ROP?

ROP—return-oriented programming—is a modern name for a classic exploit called "return into libc". The idea is that you found an overflow or other type of vulnerability in a program that lets you take control, but you have no reliable way get your code into executable memory (DEP, or data execution prevention, means that you can't run code from anywhere you want anymore).

With ROP, you can pick and choose pieces of code that are already in sections executable memory and followed by a 'return'. Sometimes those pieces are simple, and sometimes they're complicated. In this exercise, we only need the simple stuff, thankfully!

But, we're getting ahead of ourselves. Let's first learn a little more about the stack! I'm not going to spend a ton of time explaining the stack, so if this is unclear, please check out my assembly tutorial.

The stack

I'm sure you've heard of the stack before. Stack overflows? Smashing the stack? But what's it actually mean? If you already know, feel free to treat this as a quick primer, or to just skip right to the next section. Up to you!

The simple idea is, let's say function A() calls function B() with two parameters, 1 and 2. Then B() calls C() with two parameters, 3 and 4. When you're in C(), the stack looks like this:

+----------------------+
|         ...          | (higher addresses)
+----------------------+

+----------------------+ <-- start of 'A's stack frame
|   [return address]   | <-- address of whatever called 'A'
+----------------------+
|   [frame pointer]    |
+----------------------+
|   [local variables]  |
+----------------------+

+----------------------+ <-- start of 'B's stack frame
|         2 (parameter)|
+----------------------+
|         1 (parameter)|
+----------------------+
|   [return address]   | <-- the address that 'B' returns to
+----------------------+
|   [frame pointer]    |
+----------------------+
|   [local variables]  |
+----------------------+

+----------------------+ <-- start of 'C's stack frame
|         4 (parameter)|
+----------------------+
|         3 (parameter)|
+----------------------+
|   [return address]   | <-- the address that 'C' returns to
+----------------------+

+----------------------+
|         ...          | (lower addresses)
+----------------------+

This is quite a mouthful (eyeful?) if you don't live and breathe all the time at this depth, so let me explain a bit. Every time you call a function, a new "stack frame" is built. A "frame" is simply some memory that the function allocates for itself on the stack. In fact, it doesn't even allocate it, it just adds stuff to the end and updates the esp register so any functions it calls know where its own stack frame needs to start (esp, the stack pointer, is basically a variable).

This stack frame holds the context for the current function, and lets you easily a) build frames for new functions being called, and return to previous frames (i.e., return from functions). esp (the stack pointer) moves up and down, but always points to the top of the stack (the lowest address).

Have you ever wondered where a function's local variables go when you call another function (or, better yet, you call the same function again recursively)? Of course not! But if you did, now you'd know: they wind up in an old stack frame that we return to later!

Now, let's look at what's stored on the stack, in the order it gets pushed (note that, confusingly, you can draw a stack either way; in this document, the stack grows from top to bottom, so the older/callers are on top and the newer/callees are on the bottom):

Parameters: The parameters that were passed into the function by the caller—these are extremely important with ROP.
Return address: Every function needs to know where to go when it's done. When you call a function, the address of the instruction right after the call is pushed onto the stack prior to entering the new function. When you return, the address is popped off the stack and is jumped to. This is extremely important with ROP.
Saved frame pointer: Let's totally ignore this. Seriously. It's just something that compilers typically do, except when they don't, and we won't speak of it again.
Local variables: A function can allocate as much memory as it needs (within reason) to store local variables. They go here. They don't matter at all for ROP and can be safely ignored.

So, to summarize: when a function is called, parameters are pushed onto the stack, followed by the return address. When the function returns, it grabs the return address off the stack and jumps to it. The parameters pushed onto the stack are removed by the calling function, except when they're not. We're going to assume the caller cleans up, that is, the function doesn't clean up after itself, since that's is how it works in this challenge (and most of the time on Linux).

Heaven, hell, and stack frames

The main thing you have to understand to know ROP is this: a function's entire universe is its stack frame. The stack is its god, the parameters are its commandments, local variables are its sins, the saved frame pointer is its bible, and the return address is its heaven (okay, probably hell). It's all right there in the Book of Intel, chapter 3, verses 19 - 26 (note: it isn't actually, don't bother looking).

Let's say you call the sleep() function, and get to the first line; its stack frame is going to look like this:

          ...            <-- don't know, don't care territory (higher addresses)
+----------------------+
|      [seconds]       |
+----------------------+
|   [return address]   | <-- esp points here
+----------------------+
          ...            <-- not allocated, don't care territory (lower addresses)

When sleep() starts, this stack frame is all it sees. It can save a frame pointer (crap, I mentioned it twice since I promised not to; I swear I won't mention it again) and make room for local variables by subtracting the number of bytes it wants from esp (ie, making esp point to a lower address). It can call other functions, which create new frames under esp. It can do many different things; what matters is that, when it sleep() starts, the stack frame makes up its entire world.

When sleep() returns, it winds up looking like this:



          ...            <-- don't know, don't care territory (higher addresses) +----------------------+ |      [seconds]       | <-- esp points here +----------------------+ | [old return address] | <-- not allocated, don't care territory starts here now +----------------------+           ...            (lower addresses)

And, of course, the caller, after sleep() returns, will remove "seconds" from the stack by adding 4 to esp (later on, we'll talk about how we have to use pop/pop/ret constructs to do the same thing).

In a properly working system, this is how life works. That's a safe assumption. The "seconds" value would only be on the stack if it was pushed, and the return address is going to point to the place it was called from. Duh. How else would it get there?

Controlling the stack

...well, since you asked, let me tell you. We've all heard of a "stack overflow", which involves overwriting a variable on the stack. What's that mean? Well, let's say we have a frame that looks like this:

          ...            <-- don't know, don't care territory (higher addresses)
+----------------------+
|      [seconds]       |
+----------------------+
|   [return address]   | <-- esp points here
+----------------------+
|     char buf[16]     |
|                      |
|                      |
|                      |
+----------------------+
          ...            (lower addresses)

The variable buf is 16 bytes long. What happens if a program tries to write to the 17^th byte of buf (i.e., buf[16])? Well, it writes to the last byte—little endian—of the return address. The 18^th byte writes to the second-last byte of the return address, and so on. Therefore, we can change the return address to point to anywhere we want. Anywhere we want. So when the function returns, where's it go? Well, it thinks it's going to where it's supposed to go—in a perfect world, it would be—but nope! In this case, it's going to wherever the attacker wants it to. If the attacker says to jump to , it jumps to 0 and crashes. If the attacker says to go to 0x41414141 ("AAAA"), it jumps there and probably crashes. If the attacker says to jump to the stack... well, that's where it gets more complicated...

DEP

Traditionally, an attacker would change the return address to point to the stack, since the attacker already has the ability to put code on the stack (after all, code is just a bunch of bytes!). But, being that it was such a common and easy way to exploit systems, those assholes at OS companies (just kidding, I love you guys ) put a stop to it by introducing data execution prevention, or DEP. On any DEP-enabled system, you can no longer run code on the stack—or, more generally, anywhere an attacker can write—instead, it crashes.

So how the hell do I run code without being allowed to run code!?

Well, we're going to get to that. But first, let's look at the vulnerability that the challenge uses!

The vulnerability

Here's the vulnerable function, fresh from IDA:

 1   .text:080483F4vulnerable_function proc near
 2   .text:080483F4
 3   .text:080483F4buf             = byte ptr -88h
 4   .text:080483F4
 5   .text:080483F4         push    ebp
 6   .text:080483F5         mov     ebp, esp
 7   .text:080483F7         sub     esp, 98h
 8   .text:080483FD         mov     dword ptr [esp+8], 100h ; nbytes
 9   .text:08048405         lea     eax, [ebp+buf]
10   .text:0804840B         mov     [esp+4], eax    ; buf
11   .text:0804840F         mov     dword ptr [esp], 0 ; fd
12   .text:08048416         call    _read
13   .text:0804841B         leave
14   .text:0804841C         retn
15   .text:0804841Cvulnerable_function endp

Now, if you don't know assembly, this might look daunting. But, in fact, it's simple. Here's the equivalent C:

1   ssize_t __cdecl vulnerable_function()
2   {
3     char buf[136];
4     return read(0, buf, 256);
5   }

ARTICOL COMPLET:

http://www.skullsecurity.org/blog/2013/ropasaurusrex-a-primer-on-return-oriented-programming

Edited May 17, 2013 by Nytro

Sign In

ropasaurusrex: a primer on return-oriented programming

Recommended Posts

Nytro

Join the conversation

Browse

Activity

Pages