Thursday, August 1, 2019

XML External Entity - Overview - Part I

XXE Attack -

Hi Techies! XML External Entity (XXE) is one of the vulnerabilities in the OWASP Top 10 list. Let us understand more about this vulnerability. This blog gives you the basic understanding of Extensible Markup Language (XML) required for the attack, and covers the different types of XXE attacks, with a practical demonstration for each type in later parts. Before we move further, we need to understand the basics of XML.

What is XML?

Extensible Markup Language (XML) looks similar to HyperText Markup Language (HTML), but the two were designed with different goals. Table 1 shows the differences between HTML and XML.

Sl No. | HTML (HyperText Markup Language) | XML (Extensible Markup Language)
1 | Designed to display data; focuses on how data looks | Designed to transport and store data; focuses on what data is
2 | It is a markup language | It is a framework to define markup languages
3 | Own predefined tags | Custom defined tags
4 | Presentation language | Neither a presentation language nor a programming language
5 | Not case sensitive | Case sensitive
6 | Does not preserve whitespaces | Preserves whitespaces
7 | Closing tag is not mandatory | Closing tag is mandatory
8 | Static as it displays data | Dynamic as it transports data
Table 1

A sample tree structure followed by XML is shown below. XML documents are formed as element trees. An XML tree starts with a root element, which contains child elements, and any element can in turn have its own child elements (sub-elements).


<root>
  <child>
    <subchild>------</subchild>
  </child>
</root>

The terms parent, child, and sibling are used to describe the relationships between elements. 

A simple example of an XML document that describes a student is shown below. The first line declares the XML version and character encoding. The second line opens the root element of the document, i.e., <student>. The <student> element has 3 child elements named <name>, <DOB>, and <address>, and the last line closes the student element.

<?xml version="1.0" encoding="UTF-8"?>
<student category="Class X">
<name>Rohan</name>
<DOB> 01-04-2005</DOB>
<address> New Delhi </address>
</student>
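To see the tree structure in action, here is a minimal Python sketch that parses the student document with the standard library's xml.etree.ElementTree and walks its children. It assumes a single <student> root element carrying the category attribute; the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# The student document from above, with one <student> root element.
STUDENT_XML = """<student category="Class X">
<name>Rohan</name>
<DOB> 01-04-2005</DOB>
<address> New Delhi </address>
</student>"""

root = ET.fromstring(STUDENT_XML)

# The root element and its attributes.
print(root.tag, root.attrib)

# Iterating over an element yields its direct children in document order.
child_tags = [child.tag for child in root]
for child in root:
    print(child.tag, "->", child.text.strip())
```

Note that the parser keeps the whitespace inside <DOB> and <address>; the sketch strips it only when printing.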

XML - DTD (Document Type Definition)

The XML Document Type Definition, commonly known as DTD, is a way to describe an XML language precisely: which elements may appear and how they nest. An XML document is called "well-formed" if its syntax is correct. It is called "valid" if it is well-formed and also conforms to a DTD. An application can use a DTD to verify that the XML data is valid. If the DTD is declared inside the XML file, it must be wrapped inside the <!DOCTYPE> definition as shown below:


<?xml version="1.0"?>
<!DOCTYPE student[
<!ELEMENT student (name, DOB, address)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT DOB (#PCDATA)>
<!ELEMENT address (#PCDATA)>
]>

<student>
<name>Rohan</name>
<DOB> 01-04-2005</DOB>
<address> New Delhi </address>
</student>
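  • !DOCTYPE - defines the root element of this document, i.e. student.
  • !ELEMENT - defines an element and its content. Here, student must contain 3 child elements, in this order.
  • #PCDATA - parsed character data, i.e. the element holds text that the parser will process.

Note: The <student> element will contain only 3 elements, i.e. ‘name’, ‘DOB’, ‘address’.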
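The difference between "well-formed" and "valid" can be sketched in Python. The helper below (is_well_formed is a name made up for this example) only checks syntax. Python's built-in ElementTree parser is non-validating, so it reads the DTD but does not enforce it; checking validity would need a validating parser, which is outside this sketch.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Correct syntax: every opening tag has a matching closing tag.
good = "<student><name>Rohan</name></student>"

# Broken syntax: <name> is never closed.
bad = "<student><name>Rohan</student>"

# Well-formed but NOT valid: the DTD demands name, DOB and address,
# yet only name is present. A non-validating parser still accepts it.
well_formed_but_invalid = """<!DOCTYPE student [
<!ELEMENT student (name, DOB, address)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT DOB (#PCDATA)>
<!ELEMENT address (#PCDATA)>
]>
<student><name>Rohan</name></student>"""

print(is_well_formed(good))
print(is_well_formed(bad))
print(is_well_formed(well_formed_but_invalid))
```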

XML - External DTD (Document Type Definition)


External DTDs are useful for creating a common DTD that can be shared between multiple documents. With an external DTD, the declarations live in a separate file, and the <!DOCTYPE> definition in the XML document only references that file.
Below is an example of an external DTD:


<?xml version="1.0"?>
<!DOCTYPE student SYSTEM "student.dtd">
<student>
<name>Rohan</name>
<DOB> 01-04-2005</DOB>
<address> New Delhi </address>
</student>

The “student.dtd” file will contain the following information.


<!ELEMENT student (name, DOB, address)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT DOB (#PCDATA)>
<!ELEMENT address (#PCDATA)>

The example above shows the SYSTEM identifier being used to reference the “student.dtd” file, which contains the element declarations. Because the DTD lives in one place, changes made to it apply to every document that references it.

XML - DTD (Document Type Definition) - Entity Declaration

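In an XML document, entities are used to define shortcuts to special characters or to reusable pieces of text. Entities are declared in the DTD using the ENTITY keyword and referenced in the document as &entity_name;. The examples below show an entity with an inline value and an entity that uses a system identifier.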

Example 1 - (Internal Entity)
An entity whose value is defined directly in its declaration is called an internal entity. The declaration has the form <!ENTITY entity_name "entity_value">. Here, entity_name is the name of the entity, i.e., "name", and entity_value, written within double or single quotes, is the value it expands to, i.e., "Rohan".


<!DOCTYPE student [
<!ENTITY name "Rohan">
]>

<student>
<name>&name;</name>
</student>
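Here is a small Python sketch of internal entity expansion, assuming the entity is declared inside a <!DOCTYPE> block as XML requires. The standard library parser replaces &name; with the declared value while parsing.

```python
import xml.etree.ElementTree as ET

# The internal entity "name" is declared in the DTD and used in the body.
DOC = """<!DOCTYPE student [
<!ENTITY name "Rohan">
]>
<student>
<name>&name;</name>
</student>"""

root = ET.fromstring(DOC)

# By the time we read the element, the reference is already expanded.
expanded = root.findtext("name")
print(expanded)  # prints "Rohan"
```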

Example 2 - (External Entity)
An entity whose value lives outside the declaration, in a separate file or URL, is called an external entity. An external entity can be referenced using either a system identifier (SYSTEM) or a public identifier (PUBLIC).


<!DOCTYPE student [
<!ENTITY student SYSTEM "https://www.mysite.com/student.dtd">
]>

<student> &student; </student>

I hope you got some basic understanding of how XML works. This is all you need to know about XML to move further with XXE Attack. Now let us understand how an attacker can take advantage of XML Parser.

What is XML External Entity?


XML External Entity (XXE) is an attack against servers that parse XML input. It abuses a widely available but rarely used feature of XML parsers: the ability to resolve external entities. The attack occurs when a weakly configured XML parser processes XML input containing a reference to an external entity. It can compromise the CIA (Confidentiality, Integrity, Availability) of an application and lead to the disclosure of local files, Denial of Service (DoS), server-side request forgery, remote code execution, and other system impacts.

Types of Attack -

Working of XXE Injection

In XXE Injection, an attacker embeds a malicious inline DOCTYPE definition in the XML data. When the web server processes the malicious XML input, the entities are expanded, which can give the attacker access to the web server's file system or remote file systems, or let them establish connections to arbitrary hosts over HTTP/HTTPS. Diagram 1 explains the working of the XML External Entity attack.

Diagram 1
The above diagram explains how an attacker can use an external DTD to gain access to local files. The attacker sends a malicious request to the web server. The web server processes the XML input and, as instructed by the payload, requests the DTD file from the attacker's server. The attacker's server returns a malicious DTD file, which the web server processes, and the resulting data is sent to the attacker's server, as shown in Diagram 1.
Don't worry if this is not fully clear yet. The working is explained in more detail in XML External Entities - Out of band (HTTP) - Part III. You can jump directly to Part III or go through all the parts in order.
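To make the payload shape concrete, here is a minimal Python sketch of a simpler, in-band variant: an entity that points at a local file. This is not the out-of-band flow from Diagram 1. The helper names (build_xxe_payload, parse_untrusted) and the temporary secret.txt file are made up for this example. The standard library's ElementTree parser does not resolve external entities, so the file content should not leak here; a weakly configured parser in another stack could behave differently.

```python
import pathlib
import tempfile
import xml.etree.ElementTree as ET


def build_xxe_payload(target_uri: str) -> str:
    """Return an XML document whose entity 'xxe' points at target_uri."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE student [\n"
        f'<!ENTITY xxe SYSTEM "{target_uri}">\n'
        "]>\n"
        "<student><name>&xxe;</name></student>"
    )


def parse_untrusted(xml_text: str):
    """Parse with ElementTree; return the <name> text, or None if rejected."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return root.findtext("name")


# Use a throwaway file as the "sensitive" target instead of a real system file.
with tempfile.TemporaryDirectory() as tmp:
    secret = pathlib.Path(tmp) / "secret.txt"
    secret.write_text("TOP-SECRET-MARKER")
    payload = build_xxe_payload(secret.as_uri())
    leaked = parse_untrusted(payload)

print(payload)
print("result:", leaked)
```

On a vulnerable parser, the <name> element would come back containing the file's content, which is exactly the disclosure the attacker is after.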

Impact

Because XML external entities allow a document to declare and use external files, an attacker can misuse them to cause a Denial of Service (DoS), read local/system files on the server, reach the internal network, and in some cases even achieve remote code execution.

Vulnerable/Secure XML Parsing Code


ASP.NET - Insecure XML parsing

StreamReader stream = new StreamReader(data);
XmlReaderSettings settings = new XmlReaderSettings();
// DTDs are processed, so entities in the input are expanded.
settings.DtdProcessing = DtdProcessing.Parse;
XmlReader xmlReader = XmlReader.Create(stream, settings);

ASP.NET - Secure XML parsing

StreamReader stream = new StreamReader(data);
XmlReaderSettings settings = new XmlReaderSettings();
// DTDs in the input are ignored.
settings.DtdProcessing = DtdProcessing.Ignore;
XmlReader xmlReader = XmlReader.Create(stream, settings);

PHP - Insecure XML parsing

libxml_disable_entity_loader(false);
$postData = utf8_encode(file_get_contents('php://input'));
$dom = new DOMDocument();
// LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads the external DTD.
$dom->loadXML($postData, LIBXML_NOENT | LIBXML_DTDLOAD);
$items = simplexml_import_dom($dom);

PHP - Secure XML parsing

// Deprecated as of PHP 8.0, where external entity loading is disabled by default.
libxml_disable_entity_loader(true);
$postData = utf8_encode(file_get_contents('php://input'));
$dom = new DOMDocument();
// No LIBXML_NOENT or LIBXML_DTDLOAD: entities are not substituted and no external DTD is loaded.
$dom->loadXML($postData);
$items = simplexml_import_dom($dom);
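The same defensive idea can be sketched in Python: reject any input that carries a DOCTYPE before handing it to the application's parser. This is only a sketch under stated assumptions, not a drop-in fix. DoctypeForbidden and parse_without_dtd are names made up for this example, and it suits applications that never need DTDs in incoming XML.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def parse_without_dtd(xml_text: str) -> ET.Element:
    """Parse XML, refusing any document that declares a DOCTYPE."""
    checker = expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat as soon as a <!DOCTYPE ...> declaration starts.
        raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")

    checker.StartDoctypeDeclHandler = reject_doctype
    checker.Parse(xml_text, True)  # the exception propagates out of Parse

    # No DOCTYPE means no entity declarations, so a normal parse is safe here.
    return ET.fromstring(xml_text)


clean = "<student><name>Rohan</name></student>"
print(parse_without_dtd(clean).findtext("name"))

suspicious = '<!DOCTYPE student [<!ENTITY x "y">]><student>&x;</student>'
try:
    parse_without_dtd(suspicious)
except DoctypeForbidden as err:
    print("rejected:", err)
```

Rejecting the DOCTYPE outright is stricter than ignoring it, and it also shuts out entity-expansion tricks that rely on internal entities.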

References


http://synradar.com/documents/XXE_Attack_Guide.pdf
https://www.w3schools.com/xml/xml_whatis.asp