Modifying a Binary File?

Nytro · July 20, 2013

In this post I will talk about how I cracked one of the CTF challenges during the final semester at my university. The CTF competition was part of the Network Security course and was held over two days on a weekend in spring. Those two days were among the most exciting I have spent in front of a computer. The adrenaline rush of solving problems and being awarded points is something only the participants will know.

Now for the challenge itself. We were given a program binary that provides encryption/decryption services, an encrypted message, and the key that was used to encrypt the message. The challenge was to decrypt the encrypted message. But... the decryption capability in the program binary was disabled! The program took the following command line arguments: "Usage: ./crypt INPUT-FILE OUTPUT-FILE KEY (DECRYPT|ENCRYPT)". If we passed "DECRYPT", it would output "NO DECRYPT ENABLED !!!".

Okay, so I opened up the GNU debugger, gdb, and started stepping through the code to see where the command line arguments were processed. I observed that the function names were still preserved, although they were mangled because the C++ compiler implements name mangling. There were two functions, '_Z7decryptSsSs' and '_Z7encryptSsSs', apart from the 'main' function. Stepping one instruction at a time through the main function, I found the code where the command line argument was being checked for DECRYPT/ENCRYPT. Here it is:

Observe that the code checks only whether the first character of the fifth argument (argv[4], counting the program name) is 'D'. 'D' is 0x44 in ASCII. If it is 'D', register al is set to 1, and the value of al is then copied to the stack variable at [ebp-0x239]. The code then checks whether this value is zero (meaning the first character was not 'D', as with "ENCRYPT") and jumps based on this comparison. If the jump is not taken, the code uses std::cout to print the "NO DECRYPT ENABLED !!!" message and exits. When "ENCRYPT" was passed, the jump was taken and the stack variable [ebp-0x239] was used again further down in the code, when the decision was made to call either _Z7encryptSsSs() or _Z7decryptSsSs(). The basic steps the code takes are shown in this pseudo-code:

main()
{
    // x = ebp-0x239
    al = (argv[4][0] == 'D') ? 1 : 0;   // first character of the mode argument
    x = al;

    if(x == 0) goto do_not_exit;

    std::cout << "NO DECRYPT ENABLED !!!\n";
    exit_program

do_not_exit:
    read input file
    read key file
    open output file

    if(x == 1)
        call _Z7decryptSsSs()
    else
        call _Z7encryptSsSs()

    exit_program
}

As you can see there are two places where the control flow changes. Two ideas came to my mind on that day.

1. Control the stack variable 'x' so that the correct branches are taken even though "DECRYPT" is passed.
2. Change the binary file so that the call to the encrypt() function gets replaced with a call to the decrypt() function.

I chose the first option since it was easier. However, I didn't quite succeed because [ebp-0x239] was being overwritten when some library functions were called. GDB has watchpoints, which let you monitor a memory location and break whenever a read or write is performed on it. I tried using this but got tangled up in the multiple places where this memory area was accessed. It was taking too much time, so I turned to the second option. To change the binary file, all I had to do was modify the target of the call instruction to point to the decrypt function instead of the encrypt function. The Intel near call instruction takes its target as an offset relative to the address of the next instruction. By the time the call executes, EIP already points to that next instruction, so the call simply does this: EIP = EIP + offset. See the disassembly below:

The call instruction at 0x080494ee is "e8 e1 f8 ff ff". The call opcode is 0xe8 and the code_offset is 0xfffff8e1 (stored little-endian, which is why the bytes appear reversed). All we have to do is replace the immediate value 0xfffff8e1 with the new offset 0xfffffa1b, which is the offset from the next instruction to the decrypt function. Simple enough, right? Well... sort of. This binary was a 32-bit ELF binary. We can use "objdump -h" to find the file offsets of the important sections of a binary file. What we are interested in is the offset of the .text section. The output of "objdump -h" (only .text information shown here):

The text section starts at offset 0xd20 in the binary file and its virtual address is 0x08048d20. The call instruction is at 0x080494ee. So, 0x080494ee - 0x08048d20 = 0x7ce. This is the distance between the start of the text section and the call to the encryption routine. Therefore, the call instruction is at offset 0xd20 + 0x7ce = 0x14ee in the binary file. The call instruction format is "opcode (1 byte) code_offset (4 bytes)". So we have to replace the 4-byte integer at offset 0x14ee+1 with 0xfffffa1b. I wrote a program to do this by memory-mapping the binary and replacing the code_offset. Of course, we can simply open the file, seek to 0x14ee+1 and write an integer there, but I used memory mapping because I wanted to check that I was modifying the correct place in the binary file. Memory mapping makes this easy, since you can view the file contents in the memory window of a debugger. A small snippet of the code that makes the modification:

WCHAR *szFile = L"C:\\Users\\Shishir\\CTF\\kes\\crypt";
WCHAR *szOutFile = L"C:\\Users\\Shishir\\CTF\\kes\\crypt_new";
HANDLE hFile = NULL, hFileObj = NULL, hFileView = NULL;
DWORD dwFileSize = 0;

wprintf_s(L"Using input file: %s\n", szFile);

// fOpenAndMapFile is a helper function (not shown) that opens and maps the file
if( ! fOpenAndMapFile(szFile, &hFile, &hFileObj, &hFileView, &dwFileSize) )
{
    wprintf_s(L"Error opening input file\n");
    return 1;
}

unsigned char *fptr = (unsigned char*)hFileView;

// Seek to offset where the call instruction's code_offset is
// (0x14ee is the opcode byte; the 4-byte code_offset follows it)
fptr += 0x14ee + 1;

unsigned int *iptr = (unsigned int*)fptr;
unsigned int old_data = *iptr;       // original code_offset, 0xfffff8e1
unsigned int new_data = 0xfffffa1b;  // code_offset that reaches the decrypt function
*iptr = new_data;
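The plain "open, seek and write" alternative mentioned above can be sketched in portable C. This is only a minimal sketch and not the program used in the CTF: the function names va_to_file_offset and patch_u32_le are hypothetical, and it assumes the 4-byte value is stored little-endian, as it is on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Translate a virtual address inside a section to its offset in the file,
   using the section's virtual address and file offset from "objdump -h". */
long va_to_file_offset(uint32_t va, uint32_t section_va, uint32_t section_file_offset)
{
    assert(va >= section_va);  /* the address must lie inside the section */
    return (long)(va - section_va + section_file_offset);
}

/* Overwrite the 4 bytes at `offset` with `value`, least significant byte first.
   The previous value is stored in *old_value when old_value is not NULL.
   `fp` must be open for binary update (for example "r+b").
   Returns 0 on success and -1 on any I/O error. */
int patch_u32_le(FILE *fp, long offset, uint32_t value, uint32_t *old_value)
{
    unsigned char buf[4];

    if (fseek(fp, offset, SEEK_SET) != 0) return -1;
    if (fread(buf, 1, sizeof buf, fp) != sizeof buf) return -1;

    if (old_value != NULL) {
        *old_value = (uint32_t)buf[0]
                   | ((uint32_t)buf[1] << 8)
                   | ((uint32_t)buf[2] << 16)
                   | ((uint32_t)buf[3] << 24);
    }

    buf[0] = (unsigned char)(value & 0xff);
    buf[1] = (unsigned char)((value >> 8) & 0xff);
    buf[2] = (unsigned char)((value >> 16) & 0xff);
    buf[3] = (unsigned char)((value >> 24) & 0xff);

    /* A seek is required when switching from reading to writing. */
    if (fseek(fp, offset, SEEK_SET) != 0) return -1;
    if (fwrite(buf, 1, sizeof buf, fp) != sizeof buf) return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

With the numbers from this post, va_to_file_offset(0x080494ee, 0x08048d20, 0xd20) gives 0x14ee. A copy of the binary could then be opened with fopen(path, "r+b") and patched with patch_u32_le(fp, 0x14ee + 1, 0xfffffa1b, &old). Reading the old value back first is a cheap way to confirm that the offset really points at 0xfffff8e1 before anything is overwritten.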

Once this was done, I ran objdump on the new modified binary, and the call instruction at 0x080494ee was now pointing to the decryption routine _Z7decryptSsSs()! Note that the virtual address of the call instruction, 0x080494ee, is the same in the following modified code and in the previous screenshot.

I executed the program with "ENCRYPT" as the fifth argument and got the decrypted message in the output file! This was possible because the decryption and encryption routines both had the same function signature, and in particular the same arguments, so the code that set up the arguments on the stack before the call to _Z7encryptSsSs() did not have to be changed.
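The offset arithmetic behind the patch can be double-checked in a few lines of C. This is a rough sketch with hypothetical function names; it assumes the 5-byte near call form (1 opcode byte plus a 4-byte relative offset) described earlier, and relies on unsigned 32-bit wrap-around to handle negative offsets.

```c
#include <stdint.h>

/* Length of a near relative call: 1 opcode byte + 4 offset bytes. */
enum { CALL_REL32_LEN = 5 };

/* Address a near relative call will jump to: the address of the next
   instruction plus the offset. Unsigned arithmetic wraps modulo 2^32,
   which is the behaviour wanted for negative offsets such as 0xfffff8e1. */
uint32_t call_rel32_target(uint32_t call_addr, uint32_t rel32)
{
    return call_addr + CALL_REL32_LEN + rel32;
}

/* Offset to encode so that the call at call_addr reaches target. */
uint32_t call_rel32_for_target(uint32_t call_addr, uint32_t target)
{
    return target - (call_addr + CALL_REL32_LEN);
}
```

For the call at 0x080494ee, the original offset 0xfffff8e1 resolves to 0x08048dd4 and the patched offset 0xfffffa1b resolves to 0x08048f0e. These two addresses are derived here from the offsets quoted in the post rather than taken from it, so they should be compared against the addresses objdump reports for _Z7encryptSsSs and _Z7decryptSsSs.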

Well, after submitting the flag and getting rewarded handsomely with points, I spoke to a fellow student who did it in a different, easier way. GDB allows you to change the contents of any register during debugging. So all you have to do is break the program before each of the branching decision points and change the value of the EIP register so that the correct branch is taken even though you specify "DECRYPT" on the command line. Anyway, it was fun playing around with gdb, objdump and the binary file, even though my method was more complicated!

This actually brings us to the topic of packers. A packer, in the context of computers and security, is a program that compresses and/or encrypts a binary file provided as input. Many of today's software programs are distributed this way in order to protect intellectual property. The compressed/encrypted binary is the one that is distributed to consumers. This supposedly prevents competitors and crackers from reverse engineering the binary and stealing the code implemented by a company. You might ask how the OS manages to run these 'packed' programs. When a binary file is packed, the unpacker is included as part of the final output of the packer. So executing the output binary file first invokes the unpacker, which has the logic and information to unpack (decompress and/or decrypt) the rest of the binary and then execute the unpacked program. The unpacker usually runs the unpacked program by creating a child process. UPX is a popular open-source packer.

Even with a packed binary, we can still read the original code, although it is a little more difficult. This difficulty arises because many packers are clever enough not to unpack everything at once. Unpacking may be done 'on-demand' and in-memory. So the challenge now is to determine when the unpacking process is finished, at which point we can dump the process's memory to disk for later analysis. The dump will hopefully have the whole binary in unpacked form to facilitate disassembly. This is why one must never depend on the assumption that the code will never be leaked. Remember, security through obscurity never works. The topic of packers, unpacking techniques and the countermeasures against them is very interesting. Maybe I will write about it in a later post. That's it for now!

Posted by Shishir Prasad at 01:31

Source: Ring0 - The Inner Circle: Modifying a Binary File?
