A Comprehensive XPath Tutorial – XML Path Language

Learn all about XML Path Language (XPath) with Examples. This XPath Tutorial covers the Uses and Types of XPath, XPath Operators, Axes, & Applications in Testing:

The term XPath stands for XML Path Language. It is a query language employed for selecting various nodes in the XML document.

Just as SQL is the query language for many databases (For Example, MySQL, Oracle, and DB2), XPath is used by various languages and tools (For Example, languages like XSLT, XQuery, XLink, and XPointer, and tools like MarkLogic and Software Testing tools like Selenium).

XML Path Language(XPath) Introduction

XPath – An Overview

XPath is essentially a language for navigating XML documents. Navigation here means moving through an XML document in any direction to reach any element, attribute, or text node. XPath is a World Wide Web Consortium (W3C) Recommendation.

Where Can We Use XPath?

XPath can be used in both the Software Development industry and Software Testing industry.

If you work in the Software Testing domain, you can use XPath when developing automation scripts in Selenium. If you work in the development domain, almost all programming languages offer XPath support.

XSLT is predominantly used in the XML Content conversion domain and uses XPath for conversion. XSLT works closely with XPath and some other languages like XQuery and XPointer.

Types Of XPath Node

Listed below are the various types of XPath Node.

#1) Element Nodes: These represent the XML elements (tags) in the document. An element node can contain attributes, text, and other element nodes. In the example below, SoftwareTestersList, softwareTester, State, and country are element nodes.

#2) Attribute Nodes: These define a property (attribute) of an element node. An attribute node always belongs to an element node, which is its parent. In the example below, “name” is an attribute node of the softwareTester element. The shortcut to denote attribute nodes is “@”.

#3) Text Nodes: The text that appears between the opening and closing tags of an element is a text node. In the example below, “Delhi”, “India”, and “chennai” are text nodes.

#4) Comment Nodes: These are notes that a tester or developer writes to explain the document; applications do not process them as content. A comment is written between these opening and closing delimiters: <!-- put comments here -->

#5) Namespaces: These are used to remove ambiguity when element names from more than one XML vocabulary appear together. For Example, XSLT elements are conventionally written with the xsl: prefix.

#6) Processing Instructions: These contain instructions that applications can use for processing. Processing instructions can appear anywhere in the document. They are written between <? and ?>.

#7) Root Node: This is the topmost node of the tree and contains everything else in the document. The root node does not have a parent node. In the XML example below, the topmost element is “SoftwareTestersList”; strictly speaking, XPath treats the document itself as the root node, with that element as its child. To select the root node, we use a forward slash, i.e. ‘/’.

We will write a basic XML document to explain the above-mentioned terms.

<SoftwareTestersList>
<!-- Below is the list of Software Testers working in different States in India -->
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>
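To make the node types concrete, here is a minimal Python sketch that walks the example document and counts nodes by type. It uses the standard-library xml.dom.minidom parser, which is a DOM API rather than an XPath engine, so treat it as an illustration of the node kinds and not of XPath itself. The helper name count_nodes is our own, and whitespace-only text between tags is skipped for readability even though XPath would count it as text nodes.

```python
from collections import Counter
from xml.dom import minidom, Node

XML = """<SoftwareTestersList>
<!-- Below is the list of Software Testers working in different States in India -->
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>"""

# Map the DOM node-type constants we care about to readable names.
NODE_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}


def count_nodes(node, counts):
    """Hypothetical helper: walk the tree and tally each node by type."""
    kind = NODE_NAMES.get(node.nodeType)
    if kind == "text" and not node.data.strip():
        kind = None  # assumption: ignore whitespace-only text between tags
    if kind:
        counts[kind] += 1
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes hang off their element; they are not child nodes.
        counts["attribute"] += len(node.attributes)
    for child in node.childNodes:
        count_nodes(child, counts)
    return counts


doc = minidom.parseString(XML)  # doc is the document (root) node
counts = count_nodes(doc, Counter())
print(dict(counts))
```

The tally should show seven element nodes, two attribute nodes, four text nodes, and one comment node for this document.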

Atomic Values: Values that have neither child nodes nor a parent node, such as the string “Delhi” or a number, are known as Atomic Values.

Context Node: This is a particular node in the XML document on which expressions are evaluated. It could also be considered as the current node and abbreviated with a single period (.).

Context Size: This is the number of nodes in the list that is currently being evaluated, of which the Context Node is one. For Example, if a step selects the five children of an element and the Context Node is one of those five, then the Context Size is five.

Absolute XPath: This is an XPath expression that starts at the root node, i.e. with ‘/’. For Example, /SoftwareTestersList/softwareTester[@name="T1"]

Relative XPath: If the XPath expression starts from the currently selected context node, it is a Relative XPath. For Example, if softwareTester is the currently selected node, then @name="T1" is a Relative XPath.
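The effect of the context node on a relative path can be sketched in Python with the standard-library xml.etree.ElementTree module. This module supports only a limited subset of XPath and always evaluates a path relative to the element you call it on, so the absolute form appears only in a comment. The XML is the tutorial's example with the comment line omitted.

```python
import xml.etree.ElementTree as ET

XML = """<SoftwareTestersList>
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>"""

root = ET.fromstring(XML)  # context node: the <SoftwareTestersList> element

# Absolute form in full XPath: /SoftwareTestersList/softwareTester[@name="T1"]
# Relative form, evaluated with <SoftwareTestersList> as the context node:
t1 = root.find("./softwareTester[@name='T1']")

# The same kind of relative path gives a different result when the
# context node changes: here the context is the T1 <softwareTester>.
state = t1.find("State")

print(t1.get("name"), state.text)  # T1 Delhi
```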

Axes In XPath

  • Self axis: Selects the context node itself. This axis is abbreviated by a single period (.), which is short for self::node().
  • Child axis: Selects the children of the context node. Element, comment, text, and processing-instruction nodes can be children of the context node; namespace and attribute nodes are not on the child axis. For Example, child::softwareTester.
  • Parent axis: Selects the parent of the context node (if the context node is the root node, the result is an empty node-set). This axis is abbreviated by a double period (..), which is short for parent::node(). The expression parent::State selects the parent only if it is a <State> element; otherwise the result is an empty node-set.
  • Attribute axis: Selects the attributes of the context node. This axis is abbreviated by the at-sign (@). If the context node is not an element node, the result is an empty node-set. The expressions attribute::name and @name are equivalent.
  • Ancestor axis: Selects the parent of the context node, its parent's parent, and so on. This axis always contains the root node unless the context node itself is the root node.
  • Ancestor-or-self: Selects the context node together with its parent, its parent's parent, and so on. It always includes the root node.
  • Descendant axis: Selects all the children of the context node, their children, and so on. The descendants can be elements, comments, processing instructions, and text nodes. Namespace and attribute nodes are not on the descendant axis.
  • Descendant-or-self: Selects the context node together with all its children, their children, and so on. As above, elements, comments, processing instructions, and text nodes are included, while namespace and attribute nodes are not.
  • Preceding axis: Selects all nodes that come before the context node in the document, excluding its ancestors. Namespace and attribute nodes are not on the preceding axis.
  • Preceding-sibling axis: Selects all preceding siblings of the context node, i.e. all nodes that appear before the context node and have the same parent. The result is empty if the context node is a namespace or attribute node.
  • Following axis: Selects all nodes that come after the context node in the document, excluding its descendants. Namespace and attribute nodes are not on the following axis.
  • Following-sibling axis: Selects all following siblings of the context node, i.e. all nodes that come after the context node and have the same parent. The result is an empty node-set if the context node is a namespace or attribute node.
  • Namespace axis: Selects the namespace nodes of the context node. The result is empty if the context node is not an element node.
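A few of these axes can be tried out with Python's standard-library xml.etree.ElementTree. It understands only abbreviated forms such as ., .. and //, not the full axis:: syntax, so the sibling axes are emulated below with ordinary list slicing. That emulation is our own workaround, not an XPath feature, and it sees element nodes only.

```python
import xml.etree.ElementTree as ET

XML = """<SoftwareTestersList>
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>"""

root = ET.fromstring(XML)
t2 = root.find("softwareTester[@name='T2']")  # context node for what follows

# Child axis: child::* (element children of the context node).
children = [e.tag for e in t2.findall("*")]

# Descendant axis: every <country> element below the root element.
countries = root.findall(".//country")

# Parent axis: ".." steps from each <State> up to its parent element.
parents = [p.get("name") for p in root.findall(".//State/..")]

# Attribute axis: ElementTree exposes attributes through get()/attrib.
name = t2.get("name")

# Sibling axes are not supported by ElementTree; emulate them by position.
siblings = list(root)
position = siblings.index(t2)
preceding = [s.get("name") for s in siblings[:position]]
following = [s.get("name") for s in siblings[position + 1:]]

print(children)              # ['State', 'country']
print(len(countries))        # 2
print(parents)               # ['T1', 'T2']
print(name)                  # T2
print(preceding, following)  # ['T1'] []
```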

Datatypes In XPath

Given below are the various Datatypes in XPath.

  • Number: A number in XPath is a floating-point value, implemented as an IEEE 754 double-precision floating-point number. XPath 1.0 has no separate integer datatype.
  • Boolean: This represents either true or false.
  • String: This represents zero or more characters.
  • Node-set: This represents a set of zero or more nodes.

Wildcards In XPath

Listed below are the Wildcards in XPath.

  • An asterisk (*): This selects all the element children of the context node. It does not select text nodes, comments, processing instructions, or attribute nodes.
  • At-sign with an asterisk (@*): This selects all the attribute nodes of the context node.
  • node(): This selects all the child nodes of the context node, whatever their type: elements, text, comments, and processing instructions. Attribute and namespace nodes are reached only through their own axes, for example attribute::node().
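The element wildcard can be sketched with Python's standard-library xml.etree.ElementTree. This is a limited illustration: ElementTree supports * but has no @* or node() step, so the attribute wildcard is approximated by reading the element's attribute dictionary, and comments and text nodes are not visible as nodes at all.

```python
import xml.etree.ElementTree as ET

XML = """<SoftwareTestersList>
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>"""

root = ET.fromstring(XML)
t1 = root.find("softwareTester")  # first <softwareTester> element

# "*" matches every element child of the context node, whatever its name.
element_children = [e.tag for e in t1.findall("*")]

# ".//*" matches every element below the context node at any depth.
all_descendants = [e.tag for e in root.findall(".//*")]

# Rough stand-in for "@*": all attributes of the context element.
attributes = dict(t1.attrib)

print(element_children)      # ['State', 'country']
print(len(all_descendants))  # 6
print(attributes)            # {'name': 'T1'}
```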

XPath Operators

Note: In the below table, e stands for any XPath expression.

Operator     Description                                                              Example
e1 + e2      Addition (if e1 and e2 are numbers)                                      5 + 2
e1 - e2      Subtraction (if e1 and e2 are numbers)                                   10 - 4
e1 * e2      Multiplication (if e1 and e2 are numbers)                                3 * 4
e1 div e2    Division (if e1 and e2 are numbers); the result is floating-point        4 div 2
e1 | e2      Union of the nodes that match e1 and the nodes that match e2             //State | //country
e1 = e2      Equals                                                                   @name = 'T1'
e1 != e2     Not equal                                                                @name != 'T1'
e1 < e2      True if e1 is less than e2 (inside XML, '<' must be escaped as &lt;)     test="5 &lt; 9" results in true()
e1 > e2      True if e1 is greater than e2 (inside XML, '>' may be escaped as &gt;)   test="5 &gt; 9" results in false()
e1 <= e2     True if e1 is less than or equal to e2                                   test="5 &lt;= 9" results in true()
e1 >= e2     True if e1 is greater than or equal to e2                                test="5 &gt;= 9" results in false()
e1 or e2     True if either e1 or e2 is true                                          @name = 'T1' or @name = 'T2'
e1 and e2    True if both e1 and e2 are true                                          @name = 'T1' and State = 'Delhi'
e1 mod e2    Floating-point remainder of e1 divided by e2                             7 mod 2
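The div and mod rows are worth a closer look, because they behave slightly differently from similar operators in other languages. The Python sketch below mirrors the XPath 1.0 arithmetic under stated assumptions: xpath_div and xpath_mod are hypothetical helper names of our own, and division by zero (which yields Infinity or NaN in XPath) is not modelled.

```python
import math


def xpath_div(a: float, b: float) -> float:
    """Hypothetical helper: floating-point division, as `div` performs."""
    return float(a) / float(b)


def xpath_mod(a: float, b: float) -> float:
    """Hypothetical helper: remainder of a truncating division.

    The sign of the result follows the dividend, which matches
    math.fmod rather than Python's % operator.
    """
    return math.fmod(a, b)


print(xpath_div(4, 2))   # 2.0
print(xpath_mod(7, 2))   # 1.0
print(xpath_mod(-7, 2))  # -1.0, whereas Python's -7 % 2 gives 1
```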

Predicates In XPath

Predicates are filters that restrict the nodes selected by an XPath expression. Each predicate is evaluated to a Boolean value for every candidate node: if it is true, the node is selected; if it is false, the node is not selected.

Predicates always come inside square brackets like [ ].

For Example, softwareTester[@name="T2"]:

This will select the <softwareTester> element that has a name attribute with the value T2.
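Here is a small Python sketch of predicates in action, using the standard-library xml.etree.ElementTree and the tutorial's example XML (comment line omitted). ElementTree supports only a few predicate forms, three of which are shown; full XPath engines accept far richer conditions.

```python
import xml.etree.ElementTree as ET

XML = """<SoftwareTestersList>
    <softwareTester name="T1">
        <State>Delhi</State>
        <country>India</country>
    </softwareTester>
    <softwareTester name="T2">
        <State>chennai</State>
        <country>India</country>
    </softwareTester>
</SoftwareTestersList>"""

root = ET.fromstring(XML)

# Attribute predicate: keep only elements whose name attribute equals T2.
by_attribute = root.findall("softwareTester[@name='T2']")

# Positional predicate: positions are 1-based, so [2] is the second match.
second = root.find("softwareTester[2]")

# Child-text predicate: keep elements that have a <State> child reading Delhi.
by_child_text = root.findall("softwareTester[State='Delhi']")

print([e.get("name") for e in by_attribute])   # ['T2']
print(second.get("name"))                      # T2
print([e.get("name") for e in by_child_text])  # ['T1']
```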

Applications Of XPath In Software Testing

XPath is very useful in Automation testing. Even if you do Manual testing, knowing XPath helps you understand what is happening at the backend of the application.

If you work in Automation testing, you have probably heard of Appium Studio, an automation tool for Mobile Apps Testing. It offers an XPath feature that lets you identify the elements of a specific page throughout the automation script.

We would like to quote another example here from the tool which almost every software tester knows i.e. Selenium. The knowledge of XPath in Selenium IDE and Selenium WebDriver is a must-have skill for testers.

XPath acts as an element locator. Whenever you are required to locate a specific element on a page and perform some action over it, you need to mention its XPath in the target column of the Selenium script.

Xpath Example

As you can see in the above image, if you select any element of a web page and inspect it, you get a ‘Copy XPath’ option. As an example, we inspected the Google search web element in the Chrome web browser and copied its XPath as shown in the above image, which gave the below value:

//*[@id="tsf"]/div[2]/div[3]/center/input[1]

Now, suppose we need to perform a click action on this element. We then provide a click command in the Selenium script, and the target of the click command is the above XPath. The usage of XPath is not limited to the above two tools; many other areas and tools of software testing use XPath.
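To see how a copied XPath like the one above walks down the page, here is a minimal Python sketch using the standard-library xml.etree.ElementTree. The markup is invented for illustration and is not the real structure of the Google page, and the input names are hypothetical. ElementTree needs a leading "." before "//", so the copied expression is adjusted accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: NOT the real Google page structure.
PAGE = """<html><body>
<form id="tsf">
  <div/>
  <div>
    <div/>
    <div/>
    <div>
      <center>
        <input name="first-button"/>
        <input name="second-button"/>
      </center>
    </div>
  </div>
</form>
</body></html>"""

root = ET.fromstring(PAGE)

# Find any element with id="tsf", then its 2nd <div> child, that div's
# 3rd <div> child, the <center> inside it, and finally the 1st <input>.
copied = ".//*[@id='tsf']/div[2]/div[3]/center/input[1]"
target = root.find(copied)

print(target.get("name"))  # first-button
```

Because every step depends on element positions, a locator of this kind stops matching as soon as the page layout changes, which is worth keeping in mind when pasting copied XPaths into automation scripts.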

We hope that you got a fair idea about the importance of XPath in the field of software testing.

Conclusion

In this tutorial, we learned about XPath, how to use XPath expressions, and the support for XPath expressions in different languages and tools. We learned that XPath can be used in any domain of Software Development and Software Testing.

We also learned about the different Datatypes of XPath, the different Axes used in XPath along with their usage, the Node types used in XPath, the different Operators and Predicates in XPath, the difference between Relative and Absolute XPath, the different Wildcards used in XPath, etc.

Happy Reading!!