XML Injection Attacks: What to Know About XPath, XQuery, XXE & More

XML injections are exploits of web app vulnerabilities that can have big payouts for cybercriminals — here’s what to know about these attacks and how you can mitigate them

A computer science PhD student named Florian (who goes by the username FHantke) recently shared an accidental XML injection attack discovery on his blog. According to his post, Florian discovered that he could hack Saarland University’s web server using a technique known as XML injection (or an XML injection attack, as it’s sometimes called). By simply removing one number from the applicant ID field, he caused his personally identifying information (PII) to display along with similar data of other users.

But what is an XML injection attack? And what makes this technique such an easy yet often overlooked issue for web app developers?

Let’s hash it out.

What Is an XML Injection Attack? XML Attacks Explained

XML injection, sometimes called XML code injection, is a category of vulnerabilities where an application doesn’t correctly validate or sanitize user input before using it in an XML document or query. XML, which stands for extensible markup language, is a markup format that’s commonly used for structuring and storing data. Having XML injection vulnerabilities within your app means that bad guys will have free rein to cause whatever damage they can to your XML documents.

XML injections are also a subcategory of injection attacks in general. Bad guys use injection attacks to exploit weaknesses in your applications and front-end services that allow them to deploy malicious payloads and gain access to your sensitive stored data.

In an XML injection, unvalidated user data ends up inside an XML query or document, which allows an attacker to read or modify XML documents or execute commands in your XML-enabled database. This enables an attacker to get around your application’s front end and reach the juicy stored data they seek by taking advantage of vulnerabilities that exist in input fields (e.g., the username, password, and search input fields).

Imagine an attacker wants to read the contents of your organization’s stored data files. They’ll try to enter improperly formatted queries on your web app’s front end with the hope that the unvalidated input will trick the system into passing the message on to your web server. This shouldn’t happen if your app is properly coded and configured (woohoo, good job if that’s the case!). But if it does, it means that you dropped the ball somewhere and that unvalidated request will get sent on to your app’s XML parser (more on this in a couple of minutes) and then on to the web server and XML-enabled database for execution.

Suffice to say, that’s no bueno.

XML injection graphic: A basic overview of how an XML injection attack works.
This illustration provides a basic overview of what occurs during an XML injection attack. Of course, this is just one possible scenario — XML injection attacks are more varied. We’ll cover more of the specific technical aspects of this type of attack later in the article.

Where XML Injections Can Be Used

Practically speaking, XML injection is a technique that can be used against virtually any type of software that uses XML for data input, output and/or storage. A few quick examples of these vulnerable surfaces include:

  • Applications that rely on XML-based protocols
  • Applications that store XML documents in a database or as flat files
  • Applications that support XML-based document formats and data import
  • Software that relies on XML-enabled databases (e.g., BaseX or MarkLogic)
  • XML-based application programming interfaces (APIs), e.g., SOAP

What XML Injections Do (And Why They’re Big Problems for Insecure Applications)

Now that we know what these types of injection attacks are, let’s explore what it is they do that can cause big problems for your business. An XML injection attack allows a threat actor to do any or all of the following:

  • Bypass your web app’s authentication measures. An attacker can use an XML injection to gain unauthorized access to your stored sensitive data by entering input that lets them bypass the authentication requirements altogether. One such vulnerability, CVE-2022-25251, was published in the National Vulnerability Database (NVD), which the National Institute of Standards and Technology (NIST) maintains. A vulnerability in Axeda agent and desktop server for Windows allowed an unauthenticated attacker to send XML messages to a specific port and, potentially, read and alter the affected system’s configuration.
  • Read your organization’s stored sensitive files. XML injections typically allow bad guys to read or modify the contents of your XML data files. So, if you don’t like the idea of some unauthorized schmuck snooping around your stuff or doing other stuff they shouldn’t, then you’re definitely not going to be happy.  
  • Alter or modify your XML files. As if reading your data isn’t bad enough, some XML injection attacks allow bad guys to change the data contained within those files!
  • Carry out XML-based denial of service (DoS) attacks. Attackers can overload a web app’s memory and block legitimate traffic from accessing your web apps or services.

For example, an attacker can use an XML injection to add themselves to the table of your web app’s user database. Heck, they could even add themselves to your database as an admin user just for kicks and giggles. Basically, they get the application to create a node that’ll be added to the XML-enabled database that gives them access to read whatever files are accessible to the profile that’s been granted admin privileges. We’ll speak more as to how an XML injection attack works later in the article.
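
To make the "add themselves as an admin" scenario concrete, here is a minimal sketch of the kind of code that makes it possible. It assumes a hypothetical user store kept as an XML string and a hypothetical add_user_unsafe function that splices raw form input into that string; the element names (users, user, name, role, email) are illustrative and not taken from any real application.

```python
import xml.etree.ElementTree as ET


def add_user_unsafe(users_xml: str, username: str, email: str) -> str:
    """Vulnerable on purpose: raw input is concatenated into the XML text."""
    new_user = (
        f"<user><name>{username}</name>"
        f"<role>guest</role><email>{email}</email></user>"
    )
    return users_xml.replace("</users>", new_user + "</users>")


store = (
    "<users><user><name>alice</name><role>admin</role>"
    "<email>a@example.com</email></user></users>"
)

# The attacker closes <email> and <user> early, then opens a second
# <user> element that carries an admin role.
payload = (
    "m@example.com</email></user>"
    "<user><name>mallory2</name><role>admin</role><email>m2@example.com"
)
tampered = add_user_unsafe(store, "mallory", payload)

root = ET.fromstring(tampered)
roles = [(u.findtext("name"), u.findtext("role")) for u in root.findall("user")]
print(roles)
# [('alice', 'admin'), ('mallory', 'guest'), ('mallory2', 'admin')]
```

The application only ever meant to create guest accounts, yet the stored document now contains an extra node with an admin role because the email field was never escaped or validated.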

In the meantime, whatever the attacker’s end goal, the important piece we want you to take away from here is that XML injections are bad news, and you need to do everything within your power to prevent these vulnerabilities from being exploited. We’ll address mitigation strategies more toward the end of the article.

Successful XML Injection Attacks Come With Big Price Tags

Although JSON has succeeded XML in some applications, XML is a popular language that’s still in use in many places across the web. (Gotta use the right tool for the right job, yes?) As such, if you don’t lock down the risks associated with injection attacks, you’re in for a world of hurt in many ways:

  • Data compromise issues
  • Unauthorized access to your secure resources and systems
  • Brand reputational damages
  • Compliance issues
  • Loss of customer trust and relationship damage
  • Financial losses (lost revenue, fines and penalties, lawsuits, etc.)

The Role of an XML Parser in Client Applications

To use XML for your application, you’ll need to have an XML parser. An XML parser is typically a software package or library that’s responsible for reading, interpreting, editing and validating XML documents and queries.

Because it’s the XML parser’s job to work with XML documents directly, preventing XML injection attacks starts with ensuring your XML parser correctly sanitizes and validates user input. This must occur before the inputs get inserted into an XML document or query.  

Now that we know what an XML parser is and what it does to help protect your web applications and data against invalid inputs, it’s time to familiarize ourselves with some of the different types of XML injection attacks.

An Overview of the Different Types of XML Injection Attacks

XML injections aren’t singular weaknesses. They’re a whole umbrella category that consists of multiple unvalidated input-related vulnerabilities that tend to overlap:

  • XML entity expansion (XEE) — Also known as XML bombs (aka an XML DoS attack or the “billion laughs attack”), this tactic involves an attacker injecting a massive number of recursive or nested entity references to crash your web app or server.
  • XML external entity (XXE) — This is where an attacker inserts an external entity reference into their input to either access sensitive XML files that they shouldn’t have access to, or to make malicious queries to external URIs.  
  • XPath injection — This type of attack involves an attacker sending malicious data via an XPath expression to your XML document or database. (XPath is a query language that lets you select specific parts of XML or HTML documents to display on your site or in your app.) By injecting a malicious value into the XPath expression, an attacker can change what the query matches and, for example, gain remote access to sensitive data by bypassing authentication.
  • Blind XPath injection — This is done as a way to carry out an XPath injection when an attacker doesn’t know how a target XML document is structured or if you’re not displaying errors they find useful. This helps an attacker discover how your files are structured and modify the data contained within as desired. This attack method typically consists of XML crawling and Boolean testing to generate true/false responses that inform them whether an attack is successful or failed.  
  • XQuery injection — An attacker uses a malicious XQuery input to execute a malicious command or add unauthorized info to your XML-enabled database or files. XQuery injections use XML query language characters to create inputs with invalid syntax to access or modify sensitive information contained within your XML documents or database.
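
As a rough illustration of the XPath case, the sketch below shows how a login query built by string formatting changes meaning when the password field contains quote characters. The function name login_xpath_unsafe and the user, name and password element names are assumptions made for this example. The expression is only built and printed here, not evaluated, because Python's standard library XPath support is a limited subset.

```python
def login_xpath_unsafe(username: str, password: str) -> str:
    """Vulnerable on purpose: user input is pasted straight into the XPath."""
    return f"//user[name/text()='{username}' and password/text()='{password}']"


# A normal login attempt produces the query the developer intended.
normal = login_xpath_unsafe("alice", "s3cret")
print(normal)
# //user[name/text()='alice' and password/text()='s3cret']

# The injected quotes end the password literal and append an always-true test.
injected = login_xpath_unsafe("alice", "' or '1'='1")
print(injected)
# //user[name/text()='alice' and password/text()='' or '1'='1']
```

In XPath, "and" binds more tightly than "or", so the injected predicate reads as (name matches and password is empty) or '1'='1'. The second half is always true, so a full XPath engine would match every user node and the password check is effectively skipped.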

XML Injection Is Just One Category of Code Injection Attacks

We touched on this earlier: XML injections are just one type of injection attack. So, if you thought that XML injections were a one-of-a-kind thing, you were sadly mistaken. Code injection is a general umbrella term for bad guys’ attacks that aim to gain access to or modify information they shouldn’t have access to via unvalidated code or commands. They’re able to do this by using malformed queries or commands that don’t trigger red flags in the validation process because the application fails to validate the data.

As such, code injection exploits typically result from misconfiguring your application or just poorly coding it from the get-go.

Some injection attacks are client-side attacks while others are server-side attacks. An XML injection falls under the umbrella of a server-side attack because the goal is to exploit one or more vulnerabilities to gain access to sensitive resources stored on your web server.

While XML injections are among the most common exploits you’ll see in web apps, they’re just one type of code injection you’ll find in the Common Weakness Enumeration (CWE) list of injections. Examples of others include:

  • HTML injection attacks — This injection technique involves an attacker targeting hypertext markup language (HTML) elements. This is an example of a client-side injection attack.
  • LDAP injections — This type of injection attack involves an attacker exploiting lightweight directory access protocol (LDAP) statements to insert additional commands and return sensitive info. This is another example of a server-side attack.
  • SSI injections — Think of this as a type of delayed server-side attack. It allows an attacker to exploit vulnerabilities in a web app to use unvalidated input to insert info into the server’s HTML file (which the server executes later).
  • SQL injection attacks — This attack method involves injecting your own structured query language (SQL) code into a site or service’s data streams through a front end to modify data in the SQL database. This is an example of a server-side injection attack.
  • Cross-site scripting (XSS) injection attacks — XSS is a client-side attack that aims to target users by exploiting a compromised legitimate website through malicious code injection.

So, what’s the difference between an XML injection and, say, an SQL injection? Not all that much — they’re actually two closely related attack techniques that aim to achieve similar goals. But the key difference between these two types of attacks is that the former uses XML inputs to target XML documents and databases, whereas the latter uses SQL queries and targets SQL resources.

Of course, there are always exceptions to the rule. There’s a hybrid attack known as an XML SQL injection, which involves an attacker injecting SQL code into an XML payload to carry out the attack on XML resources.

These vulnerabilities are weaknesses in your defenses that have the potential of being exploited. So, you still need to secure your defenses by taking steps to mitigate them.

How an XML Injection Attack Works

An XML injection attack works much the same way as an SQL injection attack. But in the case of an XML code injection, the attacker is inserting unauthorized information into existing XML data streams or files. Within XML documents, there are special characters called metacharacters (<, >, ', " and &) that can be used to add or modify data or XML syntax. When an application passes these characters through unescaped, an attacker can use them to get the target server to carry out desired operations.

We’re not going to go too in-depth in this section since understanding how XML injections work is another full article’s worth of content and explanation. But we’ll briefly cover it to at least give you a basic understanding of how it works. Here’s a quick overview of what generally happens in an XML injection attack where an attacker intends to read a file:

  • An attacker enters an illegitimate input in your front-end system. They can do this by creating an XML query or by uploading an improperly formatted XML document to your web app that requests access to specific documents or resources. For example, an attacker could remove a digit from a phone number entry or enter ' or '1'='1 in the password field.
  • This unvalidated input will result in an error or be sent along to the database. Without proper parsing and validation processes in place, your web app sends the malicious payload to your server and then to your database for processing.
  • Your database then tries to process the query. This typically results in sending back information to the server in response to the query. If the illegitimate query includes info that’s repeated over and over, it can overload your XML parser, resulting in a DoS attack.
  • The server then responds to the attacker’s query with the requested information. This typically entails sharing whatever information is included in the requested resource or adding the attacker’s specified information to it.
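
The denial-of-service case in the third step is easiest to appreciate with some quick arithmetic. The sketch below assumes the commonly cited "billion laughs" layout: a three-character base entity ("lol") and nine further entities that each reference the previous one ten times. The function name and parameters are illustrative only.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Bytes produced when each nesting level repeats the previous one `fanout` times."""
    return base_len * fanout ** levels


# 3-byte base string, 10 references per entity, 9 levels of nesting.
size = expanded_size(base_len=3, fanout=10, levels=9)
print(f"{size:,} bytes")
# 3,000,000,000 bytes
```

A payload of well under a kilobyte can therefore ask a naive parser to build roughly 3 GB of text in memory, which is why unchecked entity expansion is treated as a DoS vector.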

The following illustration provides a basic overview of how an XML injection attack works:

XML injection graphic: A look at how an attacker uses a vulnerable web app to gain access to or modify XML files.
An example of one type of XML injection attack. An attacker injects improperly formatted code into a vulnerable web application. This unvalidated information gets sent on to the database for processing and, ultimately, returns the requested info to the attacker or adds the specified information to the document.

How to Test Your Application for XML Injection Vulnerabilities

All of this may leave you wondering how you can tell whether your application is secure against an XML injection. You can test your application by entering XML metacharacters (for example, a single or double quote) into one of your app’s fields and seeing how it responds. If the input results in an error, that indicates an XML injection could be possible.
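
The same probing idea can be sketched offline. The example below feeds a handful of metacharacters into two hypothetical builder functions, which stand in for application code that concatenates input into XML, and records which probes leave the document malformed. The function names and the profile element are assumptions for illustration, not part of any real app or of the OWASP methodology.

```python
import xml.etree.ElementTree as ET

PROBES = ["'", '"', "<", ">", "&"]


def build_text_unsafe(value: str) -> str:
    """Hypothetical app code: input lands in element text."""
    return f"<profile><name>{value}</name></profile>"


def build_attr_unsafe(value: str) -> str:
    """Hypothetical app code: input lands in a double-quoted attribute."""
    return f'<profile name="{value}"/>'


def find_breaking_probes(build) -> list:
    """Return the probes that make the generated XML fail to parse."""
    breaking = []
    for probe in PROBES:
        try:
            ET.fromstring(build(f"test{probe}"))
        except ET.ParseError:
            breaking.append(probe)
    return breaking


print(find_breaking_probes(build_text_unsafe))  # ['<', '&']
print(find_breaking_probes(build_attr_unsafe))  # ['"', '<', '&']
```

Which characters cause trouble depends on where the input lands: a double quote is harmless in element text but breaks out of a double-quoted attribute. Any probe that breaks the document is a sign the input is reaching the XML unescaped.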

Check out this resource from OWASP for more information on XML injection testing.

How to Mitigate XML Injection Risks

If you properly write your application to safely handle inputs and outputs, then you really don’t have anything to worry about (at least, as far as XML injection attacks are concerned). Why? Because you’ve blocked an attacker’s ability to inject non-approved code into any XML document or query, thereby creating a strong and secure application.

So, what do you do if your organization doesn’t fall within this camp? If you have an insecure application that’s susceptible to XML injection attacks, there are a few key things you can do:

  1. Sanitize user inputs to filter out unacceptable characters. You can do this by escaping or disallowing the XML metacharacters we mentioned earlier (', ", <, >, & and so on) in your web form user input fields.
  2. Specify which inputs are allowed. Rather than trying to cover all of your bases one by one (which is very tedious and you’re likely to miss something), you can take a different approach and specify which characters are allowed by setting a default deny policy. For example, if you want to include a field for a user’s age, restrict user inputs to only allow the use of numbers.
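  3. Keep an eye on your XML parser. To help make your XML parser more secure against these types of attacks, watch for newly reported vulnerabilities in the parser you use and keep it updated. Also be sure to disallow document type definitions (DTDs) wherever your application doesn’t need them, since entity expansion and external entity attacks rely on them.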
  4. Implement a content security policy (CSP). An HTTP CSP response header restricts the types of resources a user can load while using your site to a set list of predetermined resources.
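
The first three recommendations can be sketched with Python's standard library. This is one possible approach under stated assumptions, not a complete defense: validate_age, build_profile and reject_dtd are hypothetical helper names, the 1 to 3 digit age rule is an arbitrary example of an allowlist, and the DTD check simply refuses any document that contains a DOCTYPE declaration.

```python
import re
import xml.etree.ElementTree as ET
from xml.parsers import expat

AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def validate_age(raw: str) -> int:
    """Allowlist (default deny): accept only one to three ASCII digits."""
    if not AGE_PATTERN.fullmatch(raw):
        raise ValueError("age must contain digits only")
    return int(raw)


def build_profile(name: str, age: str) -> str:
    """Build XML through the library so metacharacters in text get escaped."""
    profile = ET.Element("profile")
    ET.SubElement(profile, "name").text = name
    ET.SubElement(profile, "age").text = str(validate_age(age))
    return ET.tostring(profile, encoding="unicode")


def reject_dtd(xml_text: str) -> None:
    """Raise ValueError if the document declares a DTD (DOCTYPE)."""
    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise ValueError("DTDs are not allowed")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


# The injected tags come back as plain text inside <name>, not as new nodes.
safe_xml = build_profile("</name><role>admin</role><name>", "42")
print([child.tag for child in ET.fromstring(safe_xml)])  # ['name', 'age']
```

Building the document through an XML API rather than string concatenation means the escaping happens in one well-tested place, and rejecting DOCTYPE declarations up front removes the hook that entity expansion and external entity payloads depend on.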

Some additional good rules of thumb include the following recommendations:

  • Follow secure coding best practices
  • Continually educate yourself and other developers on these best practices
  • Keep your software and systems patched and up to date

Final Thoughts on What XML Injections Are and Why You Should Mitigate Them

XML injections are possible because of vulnerabilities introduced by poor coding. However, the truth isn’t so black-and-white. XML injection attacks can also result from poor cybersecurity awareness or a lack of adequate time to test and QA web apps prior to launch.

When it comes to knowledge and following industry best practices, you don’t know what you don’t know. And if your programmers and developers are so overwhelmed with projects that they don’t have the necessary time to dedicate to proper application testing, then there are bigger problems at hand.

So, before you berate a developer for creating an insecure web app, be sure to first look at your organization’s processes, procedures, policies, and project expectations. If you’re not providing your employees with the educational resources, training, and time they need to be thorough and successful, you may be setting them — as well as your organization — up for failure.

Author

Casey Crane

Casey Crane is a regular contributor to (and managing editor of) Hashed Out with 15+ years of experience in journalism and writing, including crime analysis and IT security. Casey also serves as the Content Manager at The SSL Store.