XPath Injection

What is XPath?

XML Path Language (XPath) is a query language for Extensible Markup Language (XML) data, much as SQL is a query language for relational databases. Web applications that store data in XML format therefore rely on XPath to retrieve the data they need.

Before we start pentesting, let’s see how the XPath language actually works!

<?xml version="1.0" encoding="UTF-8"?>
  
<academy_modules>  
  <module>
    <title>Web Attacks</title>
    <author>21y4d</author>
    <tier difficulty="medium">2</tier>
    <category>offensive</category>
  </module>

  <!-- this is a comment -->
  <module>
    <title>Attacking Enterprise Networks</title>
    <author co-author="LTNB0B">mrb3n</author>
    <tier difficulty="medium">2</tier>
    <category>offensive</category>
  </module>
</academy_modules>

Imagine this information is stored in an XML file and we want to write an XPath query that retrieves certain data from it.
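To make the idea of an XPath query concrete, here is a minimal sketch that runs a few queries against the sample document. It assumes Python and its standard xml.etree.ElementTree module, which implements only a subset of XPath; the variable names are chosen for illustration and are not part of the original material.

```python
import xml.etree.ElementTree as ET

# The sample document from above, passed as bytes so the declaration is parsed as-is.
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<academy_modules>
  <module>
    <title>Web Attacks</title>
    <author>21y4d</author>
    <tier difficulty="medium">2</tier>
    <category>offensive</category>
  </module>
  <module>
    <title>Attacking Enterprise Networks</title>
    <author co-author="LTNB0B">mrb3n</author>
    <tier difficulty="medium">2</tier>
    <category>offensive</category>
  </module>
</academy_modules>
"""

root = ET.fromstring(XML_DOC)

# Select every <title> that is a child of a <module> under the root.
titles = [t.text for t in root.findall("./module/title")]

# Select the <title> of the module whose <author> text is "mrb3n".
by_author = root.find("./module[author='mrb3n']/title").text

# Select the <author> elements that carry a co-author attribute.
co_authored = [a.text for a in root.findall(".//author[@co-author]")]

print(titles)
print(by_author)
print(co_authored)
```

Each query is a path through the document tree, optionally filtered by a condition in square brackets.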

First of all, the XML document itself starts with a declaration:

<?xml version="1.0" encoding="UTF-8"?>

The XML declaration identifies the XML version and the character encoding being used. In this case, the version is “1.0” and the encoding is “UTF-8”.
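As a quick check, the values from the declaration can be read back after parsing. The sketch below assumes Python's standard xml.dom.minidom module, whose parsed document object carries version and encoding attributes; the one-element document is a shortened stand-in for the full sample.

```python
from xml.dom import minidom

# A shortened stand-in for the sample document: declaration plus an empty root.
doc = minidom.parseString(
    b'<?xml version="1.0" encoding="UTF-8"?><academy_modules/>'
)

# The parser records what the declaration stated.
print(doc.version)
print(doc.encoding)
```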

Now let’s see how the XML data is structured!

XML data is structured like a tree. The top-level element of the data is called the root element, and the items that make up the tree are called “nodes”.

<academy_modules>  

</academy_modules>

In our case, academy_modules is the root element, which we can also call the root element node of the XML document.
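A parser exposes this root directly. The following is a small sketch, again assuming Python's xml.etree.ElementTree, with the module contents left out for brevity:

```python
import xml.etree.ElementTree as ET

# Only the skeleton of the sample document: the root and its two children.
skeleton = b"""<?xml version="1.0" encoding="UTF-8"?>
<academy_modules>
  <module/>
  <module/>
</academy_modules>
"""

# fromstring() returns the root element of the parsed document.
root_element = ET.fromstring(skeleton)

print(root_element.tag)   # name of the root element
print(len(root_element))  # number of direct child elements
```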

Moving further into the XML file, we find the element nodes, such as module and title:


  <module>
    <title> </title>
  </module>

  <module>
    <title> </title>
  </module>
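To see these element nodes as a parser sees them, the sketch below walks the tree and counts the elements by name. It assumes Python's xml.etree.ElementTree and wraps the two module elements in the academy_modules root so that the snippet is a well-formed document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The two <module> element nodes, wrapped in the root element.
modules_xml = b"""<academy_modules>
  <module>
    <title>Web Attacks</title>
  </module>
  <module>
    <title>Attacking Enterprise Networks</title>
  </module>
</academy_modules>
"""

tree_root = ET.fromstring(modules_xml)

# Direct children of the root are the <module> element nodes.
child_tags = [child.tag for child in tree_root]

# iter() visits the root itself and every element node below it.
tag_counts = Counter(element.tag for element in tree_root.iter())

print(child_tags)
print(dict(tag_counts))
```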