Many modern applications today use XML documents to transfer data between clients and servers. Due to its simplicity, XML is actually great for this and is therefore very often used for representation of various data structures.
Typically a rich web application will use XML to send data from the web server to the server side. This XML document, which might contain various data structures related to the web application, is then processed on the server side. Here we can see a typical problem with untrusted input – since an attacker can control anything on the client side he can impact integrity of the XML document that is submitted to the server. Generally this should not be a problem unless the following happens.
Since the web application (on the server side) receives the XML document we just sent, it has to somehow parse it. Depending on the framework the application uses on the server side, it’s is most often (especially business applications) either a Java or a .NET application; other frameworks typically rely on libxml2.
The security problem here is in a special structure defined by the XML standard, entity. Every entity has certain content and are normally used throughout the XML document. However, one specific entity type is particularly dangerous: external entities.
External entity declaration further allows declaration of two types: SYSTEM and PUBLIC. The SYSTEM external entity is what we are interested about – it allows one to specify a URI which will be used during dereferencing to replace the entity. One example of such an entity is shown below:
Now when parsing such an XML document, wherever we have the ISCHandler entity, the parser will replace it with the contents of the retrieved URI. Pretty bad already, isn’t it? But it gets even worse – by exploiting this we can include any local file by simply pointing to it:
Or on Windows
As long as our current process has privileges to read the requested file, the XML parser will simply retrieve it and put it when it finds a reference to the ISCHandler entity (&ISCHandler;).
The impact of such a vulnerability is pretty obvious. A malicious attacker can actually do much more – just use your imagination:
- We can probably DoS the application by reading from a file such as /dev/random or /dev/zero.
- We can port scan internal IP addresses, or at least try to find internal web servers.
Obviously, probably the most dangerous “feature” is extraction of data – similarly to how we would pull data from a database with a SQL Injection vulnerability, we can read (almost) any file on the hard disk. This includes not only the password file but potentially more sensitive files such as DB connection parameters and what not. Depending on the framework, including a directory will give us even the directory’s listing, so we don’t have to blindly guess file names! Very nasty.
Everything so far is nothing new really, however in last X pentesting engagements, for some reason XXE vulnerabilities started popping up. So what is the reason for that? First of all, some libraries allow external entity definitions by default. In such cases, the developer himself has to explicitly disable inclusion of external entities. This is probably still valid for Java XML libraries while with .NET 4.0 Microsoft changed the default behavior not to allow external entities (in .NET 3.5 they are allowed).
So, while XXE will not become as dangerous as SQLi (hey, the subject was there to get your attention), we do find them more often than before which means that we should raise awareness about such vulnerabilities.
(c) SANS Internet Storm Center. http://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.