Working with XPath

XPath (XML Path Language) is a query language for selecting nodes from an XML document. It lets you navigate a document with path expressions, and it is also used to locate elements on a web page through the HTML DOM structure.

XML provides a standard syntax for the markup of data and documents. XML documents contain one or more elements.

<message>
  <to>allusers@forkedblog.com</to>
  <from>admin@forkedblog.com</from>
  <subject>Reminder</subject>
  <body>Don't forget to visit ForkedBlog.com!</body>
</message>

XML has no predefined tags, whereas HTML works with predefined tags such as <p>, <h1>, <table>, etc.

An element may have one or more attributes, whose values provide additional information about the element or its content.

In XPath, the starting point for evaluating an expression is called the context node.
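As a small illustration of a context node, the sketch below parses the message document shown earlier and evaluates two paths relative to its root element. Python and its standard xml.etree.ElementTree module are assumptions chosen for illustration; that module implements only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

xml_text = """
<message>
  <to>allusers@forkedblog.com</to>
  <from>admin@forkedblog.com</from>
  <subject>Reminder</subject>
  <body>Don't forget to visit ForkedBlog.com!</body>
</message>
"""

# fromstring() returns the root element (<message>), which serves as the
# context node for the path expressions below.
message = ET.fromstring(xml_text)

# "subject" is evaluated relative to the context node: a child named <subject>.
subject = message.find("subject")
print(subject.text)

# "./to" says the same thing explicitly: start at the context node (.).
recipient = message.find("./to")
print(recipient.text)
```

Changing the context node changes what the same path selects, which is why the starting point matters.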

Types of XPath

There are two types of XPath:

  • Absolute XPath
  • Relative XPath

Absolute XPath

The key characteristic of an absolute XPath is that it begins with a single forward slash (/), which means the element is selected by its full path from the root node.
The advantage of an absolute XPath is that it points to one exact location, so the element can be identified quickly.
The disadvantage is that if anything changes along the path to the element, the XPath fails.

Example:
If you define XPath as
/html/body/div/div[1]/table/tbody/tr/th

and a tag is later added between div[1] and table, as below
/html/body/div/div[1]/form/table/tbody/tr/th

The first XPath will no longer work, because a ‘form’ tag has been added in between.
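The brittleness described above can be sketched with Python's standard xml.etree.ElementTree module (an assumption made for illustration). Both page fragments are hypothetical. ElementTree evaluates paths relative to an element, so the leading /html is represented by starting the search at the html root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a <form> wrapper is added.
before = ET.fromstring("""
<html><body><div><div>
  <table><tbody><tr><th>Name</th></tr></tbody></table>
</div></div></body></html>
""")

# The same hypothetical page after a <form> is added around the table.
after = ET.fromstring("""
<html><body><div><div>
  <form><table><tbody><tr><th>Name</th></tr></tbody></table></form>
</div></div></body></html>
""")

# Equivalent of /html/body/div/div[1]/table/tbody/tr/th, relative to <html>.
full_path = "body/div/div[1]/table/tbody/tr/th"

found_before = before.find(full_path)
found_after = after.find(full_path)

print("before the change:", found_before is not None)
print("after the change: ", found_after is not None)
```

A path that includes the new form step matches the changed page again, but it has to be updated by hand every time the layout moves.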

Relative XPath

A relative XPath is one where the path starts from the node of your choice, so it doesn’t need to start from the root node. You can start from the middle of the HTML DOM structure, and there is no need to write a long XPath.

It starts with a double forward slash (//), which means it can search for the element anywhere in the document.

Syntax:
//tagname[@AttributeName='AttributeValue']

Example:
1. //input[@id='username']

2. //*[@id='username']

In the first example, ‘//’ tells the processor to search the whole document rather than follow a fixed path, so the expression matches every input element whose ‘id’ attribute equals ‘username’, wherever it sits.
In the second example, ‘*’ matches an element with any tag name, so the expression selects every element whose ‘id’ attribute equals ‘username’.

The advantages of a relative XPath are that 1) you don’t need to write out the long path, and 2) you can start from the middle of the document.

The disadvantage is that it can take more time to identify the element, because only a partial path is given and the document has to be searched.

If several elements match the given XPath, a lookup that returns a single element will select the first one it identifies.
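Here is a minimal sketch of relative lookups, assuming Python's standard xml.etree.ElementTree module and a hypothetical login form. ElementTree writes the document-wide search as .// (descendants of the context node) rather than a bare //.

```python
import xml.etree.ElementTree as ET

# Hypothetical page used only for this illustration.
page = ET.fromstring("""
<html>
  <body>
    <div>
      <form>
        <input id="username" type="text"/>
        <input id="password" type="password"/>
      </form>
    </div>
  </body>
</html>
""")

# Match by tag name and attribute value, anywhere below the root.
by_tag = page.findall(".//input[@id='username']")

# "*" accepts any tag name, so only the attribute value has to match.
by_any_tag = page.findall(".//*[@id='username']")

print(len(by_tag), len(by_any_tag))
print(by_tag[0] is by_any_tag[0])

# find() returns only the first match in document order (or None).
first_input = page.find(".//input")
print(first_input.get("id"))
```

findall() returns every match, while find() shows the "first element wins" behaviour when several elements satisfy the expression.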

There are scenarios where we need to match an element using only part of a value. XPath provides several functions for finding elements by partial match, which is useful when handling elements with dynamic attributes.

Available Methods

contains()

contains() is a function used in an XPath expression when the value of an attribute changes dynamically. It matches an element when the given text appears anywhere inside the value being tested.

In this example, we identify the element using only part of the attribute value. Assume that the complete value of the ‘id’ attribute is ‘submit’.

//button[contains(@id,'sub')]

If two elements match, a lookup that returns a single element will return the first one.

Similarly, in the below expression, we have taken the ‘class’ as an attribute and ‘button-style’ as a partial value.

//button[contains(@class,'button-style')]

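If you want to match against the element’s own text content instead of an attribute, use ‘.’ in place of the attribute name. Note that ‘.’ refers to the element’s string value, not to its attributes; to test every attribute, use a nested predicate such as //button[@*[contains(.,'user')]].

//button[contains(.,'user')]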
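To make the substring rule concrete, the sketch below reproduces what the attribute-based contains() predicates select. It assumes Python's standard xml.etree.ElementTree module, whose XPath subset does not implement contains(), so the check is emulated with a plain substring test. The helper attr_contains and the sample form are hypothetical; a full XPath 1.0 engine would evaluate the expressions directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical form used only for this illustration.
page = ET.fromstring("""
<form>
  <button id="submit" class="primary button-style">Send</button>
  <button id="cancel" class="link">Cancel</button>
</form>
""")


def attr_contains(root, tag, attr, fragment):
    """Emulate //tag[contains(@attr,'fragment')] with a substring test.

    Hypothetical helper: a missing attribute is treated as an empty
    string, which is also how XPath converts an empty node-set.
    """
    return [el for el in root.iter(tag) if fragment in el.get(attr, "")]


id_matches = [b.get("id") for b in attr_contains(page, "button", "id", "sub")]
class_matches = [
    b.get("id") for b in attr_contains(page, "button", "class", "button-style")
]

print(id_matches)
print(class_matches)
```

The match is a pure substring test, so a fragment such as ‘sub’ would also match an unrelated id like ‘resubscribe’. Choose the fragment with that in mind.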

starts-with()

The starts-with() function finds an element whose attribute value begins with the given text. It works for static attribute values as well as dynamic ones.

For example, assume the ID of a particular element changes on every page load as shown below, while the initial text stays the same.

id="username-14534"
id="username-98434"
id="username-23954"

The XPath can be constructed as follows:

//input[starts-with(@id,'username')]
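The prefix rule can be sketched in the same way. Python's standard xml.etree.ElementTree module (assumed here for illustration) does not implement starts-with(), so the predicate is emulated with str.startswith(). The helper attr_starts_with and the password id are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical form; only the username id comes from the example above.
page = ET.fromstring("""
<form>
  <input id="username-14534" type="text"/>
  <input id="password-77120" type="password"/>
</form>
""")


def attr_starts_with(root, tag, attr, prefix):
    """Emulate //tag[starts-with(@attr,'prefix')] with str.startswith()."""
    return [el for el in root.iter(tag) if el.get(attr, "").startswith(prefix)]


matches = attr_starts_with(page, "input", "id", "username")
matched_ids = [el.get("id") for el in matches]
print(matched_ids)
```

Only the stable prefix is tested, so the lookup keeps working when the numeric suffix changes on the next page load.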

text()

With text(), we can find an element by an exact match on its text, as shown below. In our scenario, we find the element whose text is “username”.

//span[text()='username']
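Here is a minimal sketch of exact text matching, assuming Python's standard xml.etree.ElementTree module. ElementTree has no text() node test; its closest equivalent is the predicate [.='username'], which compares the element's complete text. For simple elements like the hypothetical spans below, that gives the same result.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup used only for this illustration.
page = ET.fromstring("""
<div>
  <span>username</span>
  <span>username (optional)</span>
</div>
""")

# Only the span whose complete text equals "username" matches; the second
# span merely contains that word, so it is skipped.
exact = page.findall(".//span[.='username']")

print(len(exact))
print(exact[0].text)
```

An exact match is strict: extra words, or even surrounding whitespace in the element, are enough to make the comparison fail.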

Using OR & AND expression

We use the or operator when at least one of several conditions must be true for an element to match; it also matches when all of the conditions are true. XPath is case-sensitive, so the operators must be written in lowercase.

//button[@type='submit' or @name='submit']

We use the and operator when all conditions must be true for an element to match. The element is not found if any one of the conditions is false.

//button[@type='submit' and @name='submit']
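The difference between the two operators can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. Its XPath subset has no and/or keywords, so this sketch expresses "and" by chaining two predicates and "or" by combining two single-condition queries. The buttons and their ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical buttons covering every combination of the two conditions.
page = ET.fromstring("""
<form>
  <button id="a" type="submit" name="submit">Save</button>
  <button id="b" type="submit" name="save">Save draft</button>
  <button id="c" type="button" name="submit">Preview</button>
  <button id="d" type="reset" name="reset">Reset</button>
</form>
""")

# "and": chained predicates keep only elements that satisfy both conditions.
both = page.findall(".//button[@type='submit'][@name='submit']")

# "or": take the elements found by either single-condition query, walking
# the buttons in document order so that nothing is listed twice.
by_type = page.findall(".//button[@type='submit']")
by_name = page.findall(".//button[@name='submit']")
either = [b for b in page.iter("button") if b in by_type or b in by_name]

both_ids = [b.get("id") for b in both]
either_ids = [b.get("id") for b in either]
print(both_ids)
print(either_ids)
```

Button a satisfies both conditions, so it appears in both results. Buttons b and c satisfy only one condition each and appear only in the "or" result.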

