XML External Entity Injection [XXE]

1. Introduction:

This attack occurs when untrusted XML input containing a reference to an external entity is processed by a weakly configured XML parser.

It may lead to the disclosure of confidential data, denial of service, Server Side Request Forgery (SSRF), port scanning from the perspective of the machine where the parser is located, and other system impacts.

2. Typical vulnerable code:

var parserOptions = {
    noblanks: true,
    noent: true,
    nocdata: true
};

try {
    var doc = libxml.parseXmlString(data, parserOptions);
} catch (e) {
    return Promise.reject('XML parse error');
}

An XXE attack works by taking advantage of a feature in XML, namely XML eXternal Entities (XXE) that allows external XML resources to be loaded within an XML document. By submitting an XML file that defines an external entity with a file:// URI, an attacker can effectively trick the application's SAX(Simple API parser for XML) parser into reading the contents of the arbitrary file(s) that reside on the server-side filesystem.

Finally, we use the XML declaration ENTITY to load additional data from an external resource. The syntax for the ENTITY declaration is ENTITY name SYSTEM URI where URI is the full path to a remote URL or local file. In our example, we define the ENTITY tag to load the contents of "file:///etc/passwd"

This is an example of how the XML "payload" file can look like.

payload.xml
<!DOCTYPE foo [<!ELEMENT foo ANY >
<!ENTITY bar SYSTEM "file:///etc/passwd" >]>
<?xml version="1.0" encoding="UTF-8"?>

<trades>
    <metadata>
        <name>Apple Inc</name>
        <stock>AAPL</stock>
        <trader>
            <foo>&bar;</foo>
            <name>B.Bobi</name>
        </trader>
        <units>1500</units>
        <price>130</price>
        <name>Google</name>
        <stock>GOOGL</stock>
        <trader>
            <name>B.Bobi</name>
        </trader>
        <units>1500</units>
        <price>130</price>
    </metadata>
</trades>

When this XML document is processed by a vulnerable parser, such as the one presented above, any instances of &bar; will get replaced by the contents of /etc/passwd file. Simply put the XML parser was tricked into accessing a resource that the application developers did not intend to be accessible, in this case, a file on the local file system of the remote server. Because of this vulnerability, any file on the remote server (or more precisely, any file that the web server has read access to) could be obtained.

Unfortunately, the SAX parser has not been configured to deny the loading of external entities ( DOCTYPE declarations ), which when specified within our modified payload.xml file can be abused by an attacker to read arbitrary system files on the remote server.

Because user-supplied XML input comes from an "untrusted source"(namely, the client) it is very difficult to properly validate the XML document to prevent this type of attack.

3. Mitigation:

2.1. Disabling inline DTD:

Since we know that the parser should technically not allow loading external entities( any kind of DOCTYPE declaration), we can have the XML parser configured to only use a locally defined Document Type Definition (DTD) and disallow any inline DTD that is specified within a user-supplied XML document(s). Due to the fact that there are numerous XML parsing engines available, each has its own mechanism for disabling inline DTD to prevent XXE. You may need to search your XML parser's documentation for how to "disable inline DTD" specifically.

Here are some examples of how you may disable the inline DTD parsing.

var parserOptions = {
    noblanks: true,
    noent: false, // doesn't allow DOCTYPE declarations
    nocdata: true
};

try {
    var doc = libxml.parseXmlString(data, parserOptions);
} catch (e) {
    return Promise.reject('XML parse error');
}

2.2. Follow the principle of least privilege:

The threat of XXE attacks entirely illustrates the importance of following the principle of least privilege, which states that software components and processes should be granted the minimal set of permissions required to perform their tasks.

Since there are rarely good reasons for an XML parser to make outbound network request, consider locking down outbound network requests for your web server as a whole. If you do need outbound network access: for example, if your server code calls third-party APIs—you should whitelist the domains of those APIs in your firewall rules.

3. Takeaways:

It is common knowledge that DTDs are legacy technology, thus allowing for inline DTDs is ALWAYS A BAD IDEA! Modern XML parsers are hardened by default, and because of this, using such frameworks or parsers means that you might already be protected against attacks.

You can find more details about this topic here:

Last updated