Writing Linux programs in raw binary

Nytro · March 13, 2009

cat > a.out

by G-Brain

C

Let's begin with Linux system calls. A system call is a request made

by a program to the operating system for performing certain

tasks. System calls provide the interface between a process and the

operating system.

A good example of a Linux system call is _exit:

void _exit(int status)

The function _exit() terminates the calling process "immediately". Any

open file descriptors belonging to the process are closed; any

children of the process are inherited by process 1, init, and the

process's parent is sent a SIGCHLD signal.

The value of status is returned to the parent process as the process's

exit status.

In a C program, you could use _exit like this:

_exit(0)

Ending the program with a status of 0, indicating success.

Another example of a system call is write:

ssize_t write(int fd, const void *buf, size_t count)

write() writes up to count bytes from the buffer pointed to by buf to the

file referred to by the file descriptor fd.

On success, the number of bytes written is returned (zero indicates

nothing was written). On error, -1 is returned, and errno is set

appropriately.

Here's how you'd use write() from a C program:

write(1,"Test\n",5)

There are 3 standard POSIX file descriptors (Linux complies with this

part of the POSIX standard):

0 = Standard Input (stdin)

1 = Standard Output (stdout)

2 = Standard Error (stderr)

So what the above line of code does is write the 5 bytes of "Test\n"

to file descriptor 1, standard output.

That should explain how system calls work.

System call table:

http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html

To get documentation for a system call, look up its name in section 2

of the manual:

man 2 <name>

For example:

man 2 write

Here's a C program using the two syscalls we learned:

syscall.c

#include <unistd.h>

int main()

{

write(1,"Test\n",5);

_exit(0);

}

To compile:

$ gcc -o syscall syscall.c

To see what system calls are being made, use strace:

$ strace ./syscall

execve("./syscall", ["./syscall"], [/* 42 vars */]) = 0

brk(0) = 0x804a000

access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)

open("/etc/ld.so.cache", O_RDONLY) = 3

fstat64(3, {st_mode=S_IFREG|0644, st_size=136536, ...}) = 0

mmap2(NULL, 136536, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7fc2000

close(3) = 0

open("/lib/libc.so.6", O_RDONLY) = 3

read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360d\1"..., 512) = 512

fstat64(3, {st_mode=S_IFREG|0755, st_size=1575187, ...}) = 0

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fc1000

mmap2(NULL, 1357360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e75000

mmap2(0xb7fbb000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x146) = 0xb7fbb000

mmap2(0xb7fbe000, 9776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7fbe000

close(3) = 0

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e74000

set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e746c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0

mprotect(0xb7fbb000, 4096, PROT_READ) = 0

munmap(0xb7fc2000, 136536) = 0

write(1, "Test\n", 5Test

) = 5

exit_group(0) = ?

Process 3492 detached

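Never mind that stuff at the top (it's the dynamic loader setting up

libc). You can see our two system calls being executed at the bottom;

_exit() shows up as exit_group because that's the system call glibc's

_exit() wrapper makes.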

Assembler

Try opening the executable we created above (syscall) in a hex

editor. It's huge, and it's full of stuff we don't need. Surely, we

could use some GCC flags to make it smaller, but to really

understand what's going on, we'll have to write our stuff in

assembler.

syscall2.asm

format ELF executable

entry _start

segment readable executable

_start:

mov al, 4

mov bl, 1

mov ecx, message

mov dl, messageLen

call 0xffffe414

mov al, 1

mov bl, 0

call 0xffffe414

segment readable writable

message db 'Test',0x0a

messageLen = $-message

Which can be assembled using the following command:

$ fasm syscall2.asm

Note that we're using fasm, the flat assembler

(http://www.flatassembler.net) because it produces neat code, and

doesn't clutter our executables like nasm does.

Let's go through the code:

format ELF executable

entry _start

We want an ELF executable, and we want it to start at _start.

segment readable executable

A.K.A. section .text. This tells the assembler that everything under

this line will be readable and executable (=code) unless stated

otherwise (with a new "segment" directive).

_start:

This is the entry point of our program.

mov al, 4

mov bl, 1

mov ecx, message

mov dl, messageLen

call 0xffffe414

Woah, what's that? I'll tell you what it is:

write(1,"Test\n",5)

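The syscall number for write() is 4, 1 is standard output, message

is "Test\n", messageLen is 5, and call 0xffffe414 calls the kernel.

So what we do is, we put the syscall number in eax, the arguments in

ebx, ecx and edx, and then we call the kernel with call 0xffffe414.

Pretty easy. (We only set the low bytes al, bl and dl, which saves a

few bytes but relies on the rest of those registers being zero. That

holds here because Linux starts an i386 process with zeroed registers,

but it's not something to count on in general.)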

Now, the memory location 0xffffe414 might need a bit of explanation:

Since Linux 2.5.53 there is a fixed page, called the vsyscall page,

filled by the kernel.

At kernel initialization time the routine sysenter_setup() is called.

It sets up a non-writable page and writes code for the sysenter

instruction if the CPU supports that, and for the classical int 0x80

otherwise. Thus, the C library can use the fastest type of system

call by jumping to a fixed address in the vsyscall page.

The vsyscall page is mapped in the memory of every process at

0xffffe000-0xffffefff. To read the vsyscall page:

get_vsyscall_page.c

#include <unistd.h>

#include <string.h>

int main()

{

char *p = (char *) 0xffffe000;

char buf[4096];

memcpy(buf, p, 4096);

write(1, buf, 4096);

return 0;

}

$ gcc -o get_vsyscall_page get_vsyscall_page.c

$ ./get_vsyscall_page > vsyscall_page

$ objdump -d vsyscall_page

vsyscall_page: file format elf32-i386

Disassembly of section .text:

ffffe400 <__kernel_sigreturn>:

ffffe400: 58 pop %eax

ffffe401: b8 77 00 00 00 mov $0x77,%eax

ffffe406: cd 80 int $0x80

ffffe408: 90 nop

ffffe409: 8d 76 00 lea 0x0(%esi),%esi

ffffe40c <__kernel_rt_sigreturn>:

ffffe40c: b8 ad 00 00 00 mov $0xad,%eax

ffffe411: cd 80 int $0x80

ffffe413: 90 nop

ffffe414 <__kernel_vsyscall>:

ffffe414: 51 push %ecx

ffffe415: 52 push %edx

ffffe416: 55 push %ebp

ffffe417: 89 e5 mov %esp,%ebp

ffffe419: 0f 34 sysenter

ffffe41b: 90 nop

ffffe41c: 90 nop

ffffe41d: 90 nop

ffffe41e: 90 nop

ffffe41f: 90 nop

ffffe420: 90 nop

ffffe421: 90 nop

ffffe422: eb f3 jmp ffffe417 <__kernel_vsyscall+0x3>

ffffe424: 5d pop %ebp

ffffe425: 5a pop %edx

ffffe426: 59 pop %ecx

ffffe427: c3 ret

As you can see, on my system __kernel_vsyscall is at memory location

0xffffe414. This is what we'll use to call the kernel. If the

address is different on your system, use that instead.

Let's move on:

mov al, 1

mov bl, 0

call 0xffffe414

System call 1 is exit, and its first argument, status, is 0, so we get:

_exit(0)

Makes sense, right?

On with the show:

segment readable writable

message db 'Test',0x0a

messageLen = $-message

readable, writable = data

db = define byte

$ = the current address.

Define message as an array of bytes.

Define messageLen as the current address minus the address of

message. This is a cool trick to calculate string length.

Now, let's strace our program to see how awesome it is:

$ fasm syscall2.asm

$ strace ./syscall2

execve("./syscall2", ["./syscall2"], [/* 43 vars */]) = 0

write(1, "Test\n", 5Test

) = 5

_exit(0) = ?

Process 3627 detached

Two beautiful syscalls.

Hexadecimal

Let's take a look at the syscall2 executable we produced in

hexadecimal. I'll be using emacs' hexl-mode. Use whatever you like.

87654321 0011 2233 4455 6677 8899 aabb ccdd eeff 0123456789abcdef

00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............

00000010: 0200 0300 0100 0000 7480 0408 3400 0000 ........t...4...

00000020: 0000 0000 0000 0000 3400 2000 0200 2800 ........4. ...(.

00000030: 0000 0000 0100 0000 7400 0000 7480 0408 ........t...t...

00000040: 7480 0408 1900 0000 1900 0000 0500 0000 t...............

00000050: 0010 0000 0100 0000 8d00 0000 8d90 0408 ................

00000060: 8d90 0408 0500 0000 0500 0000 0600 0000 ................

00000070: 0010 0000 b004 b301 b98d 9004 08b2 05e8 ................

00000080: 9063 fbf7 b001 b300 e887 63fb f754 6573 .c........c..Tes

00000090: 740a t.

Now what the hell is that? Well, actually, it's not that hard.

You just need to have the right documents.

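The outer parts are added by hexl-mode: the left column is the file

offset of each row, the right column shows the same bytes as ASCII.

The first part is the ELF header (0x34 bytes) followed by two program

headers (0x20 bytes each), one per segment. That takes us up to offset

0x74, where our actual program starts: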

b004 b301 b98d 9004 08b2 05e8 9063

fbf7 b001 b300 e887 63fb f754 6573

740a

That's it. That's our whole program. Seriously.

Let's try translating it back to assembler:

Reading the Intel Software Developers Manual Volume 2A

(http://download.intel.com/design/processor/manuals/253666.pdf)

Appendix A: Opcode map, we discover the following:

b0 means: move immediate byte into the AL register (referring to the

next byte, 04)

b0 04

mov al, 4

b3 means: move immediate byte into BL register

b3 01

mov bl, 1

b9 means: move immediate word or double into the eCX register

(referring to 8d 90 04 08)

8d 90 04 08 is the address of message. It's reversed because I'm on a

little-endian architecture. 0x0804908d is where our data is loaded,

and message is the first piece of data, so it's at offset 0, which

is address 0x0804908d again.

b9 8d 90 04 08

mov ecx, message

b2 means: move immediate byte into DL register

b2 05

mov dl, messageLen

And to top it off... call the kernel!

e8 means: call a relative address. The next four bytes are an offset

that is added to the instruction pointer.

e8 90 63 fb f7

call 0xffffe414

Now how does 90 63 fb f7 translate to 0xffffe414? Firstly, my byte order

is little endian, so the actual value is 0xf7fb6390 (putting the bytes

in reverse order). How do we get to this number? We take the address we

want to call, 0xffffe414, and subtract the instruction pointer from it.

By the time the call executes, the instruction pointer already points

past it: the starting point of our program, 0x08048074, plus the size of

the instructions so far including the call itself, which is 0x10,

resulting in 0x08048084.

So:

(addr - ip)

(0xffffe414 - (0x08048074 + 0x10)) = 0xf7fb6390

One more time from the beginning:

write(1,"Test\n",5);

mov al, 4

mov bl, 1

mov ecx, message

mov dl, messageLen

call 0xffffe414

b0 04

b3 01

b9 8d 90 04 08

b2 05

e8 90 63 fb f7

It makes perfect sense!

Now, for exiting:

_exit(0);

mov al, 1

mov bl, 0

call 0xffffe414

b0 01

b3 00

e8 87 63 fb f7

Note that these are 9 bytes, so the instruction pointer increases by 9,

resulting in:

(0xffffe414 - (0x08048074 + 0x19)) = 0xf7fb6387

as the offset for this call.

And the last part....

54 65 73 74 0a

Test\n

Now the whole thing one more time:

write(1,"Test\n",5);

_exit(0);

mov al, 4

mov bl, 1

mov ecx, message

mov dl, messageLen

call 0xffffe414

mov al, 1

mov bl, 0

call 0xffffe414

message db 'Test',0x0a

messageLen = $-message

b0 04

b3 01

b9 8d 90 04 08

b2 05

e8 90 63 fb f7

b0 01

b3 00

e8 87 63 fb f7

54 65 73 74 0a

You can read hexadecimal!

Binary

At last, you will find out how to write programs in

binary. Hexadecimal is actually shorthand for binary, so the right

numbers are already there, we just have to convert from base 16

(hexadecimal) to base 2 (binary).

Here's a table:

0: 0000

1: 0001

2: 0010

3: 0011

4: 0100

5: 0101

6: 0110

7: 0111

8: 1000

9: 1001

A: 1010

B: 1011

C: 1100

D: 1101

E: 1110

F: 1111

So let's try to convert "Test\n" to binary. Here it is in hexadecimal:

54 65 73 74 0a

5 hexadecimal is 0101 binary.

4 hexadecimal is 0100 binary.

54 hexadecimal is 0101 0100 binary!

The full string:

T         e         s         t         \n
5    4    6    5    7    3    7    4    0    a
0101 0100 0110 0101 0111 0011 0111 0100 0000 1010

It's that simple! You can just convert all the numbers individually.

For more about base conversion, Google it.

Converting our entire program to binary is simple:

b0 04

b3 01

b9 8d 90 04 08

b2 05

e8 90 63 fb f7

b0 01

b3 00

e8 87 63 fb f7

54 65 73 74 0a

1011 0000 0000 0100

1011 0011 0000 0001

1011 1001 1000 1101 1001 0000 0000 0100 0000 1000

1011 0010 0000 0101

1110 1000 1001 0000 0110 0011 1111 1011 1111 0111

1011 0000 0000 0001

1011 0011 0000 0000

1110 1000 1000 0111 0110 0011 1111 1011 1111 0111

0101 0100 0110 0101 0111 0011 0111 0100 0000 1010

And that's all there is to it! Of course, a sequence of bits like that

is unmaintainable, but now you know how it works.

Comments? Suggestions? Drop me a line at g-brain@g-brain.net.
