In this article, you’ll see how you can exploit the DOM. DOM traversal entails getting to the desired element with the help of either XPaths or CSS. It is possible to traverse the DOM in a forward and backward direction with XPaths but traversal through XPaths is slow compared to CSS. Traversal using CSS can only be done in the forward direction. In order to traverse the DOM, using either XPaths or CSS, we need to understand the By class.
Dissecting the By class
The By class is an abstract class that has eight static methods and eight inner classes. Let's understand the structure of the By class.
The following code skeleton shows a fragment of the structure of the By class:
public abstract class By {
    // Static factory methods, one per locator strategy (bodies omitted)
    public static By id(String id);
    public static By cssSelector(String css);
    public static By name(String name);
    public static By linkText(String text);
    public static By className(String className);
    public static By tagName(String tagName);
    public static By partialLinkText(String partialLinkText);
    public static By xpath(String xpath);

    // One inner class per factory method; only ById is shown here
    public static class ById extends By {
        public WebElement findElement(SearchContext context);
        public List<WebElement> findElements(SearchContext context);
    }
}
Note: Inner classes are present corresponding to all the static methods. There are inner classes such as ByName, ByTagName, and so on.
Inner classes similar to ById exist for name, linkText, xpath, and so on. We will use the two static methods xpath and cssSelector to build what are called customized XPath and CSS locators. Let's look at the various mechanisms for accessing DOM elements. There are eight ways to access WebElements using the static methods of the By class, and this article covers only those static methods.
We will look at each of the eight methods individually, and then we will adopt a better approach using the relative (customized) XPath, which will cover each method internally:
- By.id: Uses the id attribute of the element to locate. For example, By.id("userid").
- By.name: Uses the name attribute of the element to locate. For example, By.name("username").
- By.className: Uses the class attribute of the element to locate. For example, By.className("class1").
- By.linkText: Uses the text of any anchor link to locate. For example, By.linkText("Click here to Login").
- By.partialLinkText: Uses the partial text of any anchor link to locate. For example, By.partialLinkText("Login").
- By.xpath: Uses the XPath of the element to locate. For example, By.xpath("//*[text()='Login']").
- By.cssSelector: Uses CSS selectors to locate. For example, By.cssSelector(".ctrl-p").
- By.tagName: Uses tag names, such as input, button, select, and so on to locate. For example, By.tagName("input").
The two types of XPaths
Let's now understand what absolute XPaths are and how they differ from relative (customized) XPaths:
- An absolute XPath is the entire path of the WebElement taken from the root node. For example, html/body/div/a.
- A relative or customized XPath locates the element from an identifying attribute instead of from the root node. For example, if the div has an id of ABC, then the same element can be reached with //div[@id='ABC']/a.
There is an apparent problem with the absolute type of XPath. If the DOM structure changes in the future (for example, if div is removed), then this path will undergo changes.
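The fragility of absolute paths can be tried out without a browser. The following is a minimal sketch that uses the XPath engine bundled with the JDK (javax.xml.xpath) rather than Selenium. The two markup strings and the class name AbsoluteVsRelativeXPath are hypothetical: BEFORE mirrors the html/body/div/a example, and AFTER assumes that a wrapper element has been added around the div.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AbsoluteVsRelativeXPath {
    // Hypothetical markup: the original page, and the same page after a wrapper is added.
    public static final String BEFORE =
            "<html><body><div id='ABC'><a>Login</a></div></body></html>";
    public static final String AFTER =
            "<html><body><section><div id='ABC'><a>Login</a></div></section></body></html>";

    /** Counts the nodes that the XPath expression selects in the given markup. */
    public static int count(String markup, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String absolute = "/html/body/div/a";
        String relative = "//div[@id='ABC']/a";
        System.out.println("absolute, before change: " + count(BEFORE, absolute));
        System.out.println("relative, before change: " + count(BEFORE, relative));
        System.out.println("absolute, after change:  " + count(AFTER, absolute));
        System.out.println("relative, after change:  " + count(AFTER, relative));
    }
}
```

In this sketch the absolute path stops matching once the wrapper is introduced, while the relative path still finds the anchor because it depends only on the id of the div.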
Understanding customized XPaths
The structure of a customized XPath is given as follows: //*[@Attribute = 'Value'].
Here, // indicates that the entire DOM will be searched. We will look at some important XPaths with the help of http://www.freecrm.com.
Some of the commonly used strategies are listed below:
- Using the name attribute: //*[@name='username']. This searches the DOM for an element for which the name value is username. This is the login field on the landing page.
- Using the name and type attributes: //*[@type='password'][@name='password']. In the DOM, the password field on the screen can be identified using just the name attribute. The type attribute is added here only to demonstrate the use of multiple attributes. The need for multiple attributes arises when a unique element cannot be found using just one attribute.
- Using the contains clause: //*[contains(@type,'password')]. This searches for an element whose type attribute contains the text password.
- Using starts-with: //*[starts-with(@name,'user')]. This XPath will find the username field again but this time based on the starting text present in the name attribute.
- Using the following node: //*[@name='username']//following::input. This XPath searches for input tags that follow the username field. Note that the following axis is not limited to the enclosing container: it matches every input that appears after the username field in document order. On this login page, the inputs that follow the username field are the password textbox and the Login button, which sit inside the same form, so those are the elements identified.
- Using the preceding node: //*[@value='Login']//preceding::input. This will provide the username and password textboxes.
- Using the onclick attribute: //a[contains(@onclick,'html/entlnet/userLogin.html')]. This is a very common case and is used when an anchor tag has no ID or name, only an onclick attribute that holds a JavaScript call. In this case, the anchor tag is a link with the text Login whose onclick value contains the path above.
- Using the ExtJs qtip attribute: //*[@*[local-name()='ext:qtip'][.='Account Number']]. With the growing popularity of ExtJs for developing web apps, it is necessary to have something to identify common ExtJs attributes. One ExtJs attribute is qtip. Here we are finding an element whose qtip value is Account Number.
- Using and: //input[@class='textboxes' and @name='firstName']. In this case, an input element with the class attribute as textboxes and name as firstName will be located. Both conditions in and must be satisfied.
- Using or: //input[@class='textboxes' or @name='firstName']. In this case, an input element with the class attribute as textboxes or name as firstName will be located. Only one of the conditions in or needs to be satisfied.
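Several of the strategies above can be checked against a small page with the JDK's built-in XPath 1.0 engine. This is only a sketch under stated assumptions: the markup in PAGE is a hypothetical stand-in for a login form (it is not the real markup of http://www.freecrm.com), the class name CustomizedXPathDemo is made up for illustration, and the and/or examples use name='username' because the sample form has no firstName field.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CustomizedXPathDemo {
    // Hypothetical login form used only for this illustration.
    public static final String PAGE =
            "<html><body><form id='loginForm' class='navbar-form'>"
          + "<input type='text' name='username' class='textboxes'/>"
          + "<input type='password' name='password' class='textboxes'/>"
          + "<input type='submit' value='Login'/>"
          + "</form></body></html>";

    /** Counts the nodes in PAGE that the given XPath expression selects. */
    public static int count(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String[] xpaths = {
            "//*[@name='username']",
            "//*[@type='password'][@name='password']",
            "//*[contains(@type,'password')]",
            "//*[starts-with(@name,'user')]",
            "//*[@name='username']//following::input",
            "//*[@value='Login']//preceding::input",
            "//input[@class='textboxes' and @name='username']",
            "//input[@class='textboxes' or @name='username']"
        };
        for (String xpath : xpaths) {
            System.out.println(count(xpath) + " match(es) for " + xpath);
        }
    }
}
```

On this sample page, the following and preceding expressions each select two inputs, and the or expression selects both textboxes because they share the textboxes class.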
Customized CSS
Now that we have seen the customized XPath, it's time to look at customized CSS. Remember, CSS can be used only for forward traversal.
The following are some customized CSS examples that one can use while coding the program:
- Using the name attribute: input[name='username']. This CSS identifies the username. Notice there are no '//'s.
- Using the name and type attribute: input[type='password'][name='password']. This will identify the password textbox.
- Using the ID and class: form[id='loginForm'], form[id='loginForm'][class='navbar-form']. These two CSS selectors will identify the login form.
- Using the 'contains' clause: form[id*='Form']. This will identify the form since the ID of the form contains the text Form. Contains is indicated by '*' in CSS.
- Using the 'starts-with' clause: form[id^='login']. This will identify the form since the form ID starts with the text 'login'. starts-with is indicated by '^' in CSS.
- Using the 'ends-with' clause: form[id$='Form']. This will identify the form since the form ID ends with the text 'Form'. ends-with is indicated by '$' in CSS.
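The JDK does not ship a CSS selector engine, so the following sketch only models what the attribute operators mean by comparing plain strings. The class CssAttributeOperators and its matches method are hypothetical helpers written for this illustration and are not part of Selenium.

```java
public class CssAttributeOperators {
    /**
     * Returns true when attrValue satisfies the CSS attribute operator
     * ("=", "*=", "^=" or "$=") against the given operand.
     */
    public static boolean matches(String attrValue, String operator, String operand) {
        // In CSS, the substring operators never match an empty operand.
        if (operand.isEmpty() && !operator.equals("=")) {
            return false;
        }
        switch (operator) {
            case "=":  return attrValue.equals(operand);      // exact value
            case "*=": return attrValue.contains(operand);    // contains
            case "^=": return attrValue.startsWith(operand);  // starts-with
            case "$=": return attrValue.endsWith(operand);    // ends-with
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }

    public static void main(String[] args) {
        String id = "loginForm";
        System.out.println("form[id*='Form']  -> " + matches(id, "*=", "Form"));
        System.out.println("form[id^='login'] -> " + matches(id, "^=", "login"));
        System.out.println("form[id$='Form']  -> " + matches(id, "$=", "Form"));
        System.out.println("form[id^='Form']  -> " + matches(id, "^=", "Form"));
    }
}
```

The first three selectors match an id of loginForm, while the last one does not because the id does not start with Form.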
An example traversal
The element retrieval and traversal can be done quite easily in what is known as the browser's console. In all three browsers (Chrome, Firefox, and Internet Explorer), the console can be invoked by pressing the F12 key on the keyboard. In Chrome, the Elements tab will help in finding the XPath. One can traverse back and forth in the DOM using '/..' to move up and '//' or '/' to move down. Let's see what the Chrome console looks like.
The following snapshot shows the Chrome console with the username field highlighted because we tried to find an element through its XPath. In order to search for any element, just press Ctrl + F in the console. A search box opens where you can type the XPath:
A similar console in Internet Explorer is called Developer Options and in Firefox it is called Firepath. In Firefox, one must remember to first add the Firebug plugin from the Firefox plugins page (go to the Tools | Add-ons menu and then select Add-ons from the left pane). Only then can Firepath be accessed using the F12 key.
Apart from the consoles, which come built-in with the browsers, there are a few extensions such as XPath helper in Chrome and MRI in Internet Explorer. MRI is a bookmarklet for IE. One can get it from http://westciv.com/mri/ as a free installation. All the instructions are available on this website.
Note: MRI will not work on popup windows. In the case of popups, the console is a better option.
Understanding the text() methods
One very useful function when writing XPaths is text(). When we need to supply some text at runtime, for example from an Excel file, we can use text() in the following manner:
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class DynamicText {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver",
                "C:\\SeleniumWD\\src\\main\\resources\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
        driver.get("http://www.google.com");
        // The text to match is only known at runtime
        String variableData = "Google";
        String dynamicXpath = "//*[text()='" + variableData + "']";
        List<WebElement> elem = driver.findElements(By.xpath(dynamicXpath));
        System.out.println("no of elements: " + elem.size());
        driver.quit(); // close the browser when done
    }
}
The program above prints:
no of elements: 3
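The way the dynamic expression is assembled can also be tried without launching a browser. The sketch below uses the JDK's javax.xml.xpath engine instead of Selenium. The markup in PAGE, the class name DynamicTextXPath, and the helper buildXpath are all hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DynamicTextXPath {
    // Hypothetical page with two elements whose text is exactly "Google".
    public static final String PAGE =
            "<html><body><a>Google</a><div><span>Google</span></div><p>About</p></body></html>";

    /**
     * Builds the text() XPath for a value supplied at runtime.
     * Assumption: the value contains no apostrophe, which would break the quoting.
     */
    public static String buildXpath(String variableData) {
        return "//*[text()='" + variableData + "']";
    }

    /** Counts the elements in PAGE whose text matches the runtime value. */
    public static int countByText(String variableData) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(buildXpath(variableData), doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath for " + variableData, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildXpath("Google"));
        System.out.println("no of elements: " + countByText("Google"));
    }
}
```

Keeping the expression builder in its own method makes it easy to swap the hardcoded value for one read from a data source later.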
Finding elements within the container element
On the http://www.freecrm.com login page, the structure is such that the username, password, and the login button are contained inside the form with id=loginForm. In such a situation, the child elements can be accessed using findElements on the container or parent element.
The following code displays the number of input elements in the form:
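import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class DynamicText1 {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver",
                "C:\\SeleniumWD\\src\\main\\resources\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
        driver.get("http://www.freecrm.com");
        // Locate the container (the login form) first
        String dynamicXpath = "//*[@id='loginForm']";
        List<WebElement> elem = driver.findElements(By.xpath(dynamicXpath));
        // Then search for input tags only inside that container
        List<WebElement> elem1 = elem.get(0).findElements(By.tagName("input"));
        System.out.println("no of elements: " + elem1.size());
        driver.quit(); // close the browser when done
    }
}
The output displayed in the console is shown below:
no of elements: 3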
This is a very simple program which has hardcoded values. To remove hardcoding from a program, we require a framework, which we will discuss in forthcoming chapters.
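The idea of searching inside a container can be sketched with the JDK's XPath engine as well, by evaluating a second expression relative to the node found by the first. The markup in PAGE is hypothetical (it adds a search box outside the form to make the scoping visible), and ContainerLookup and countInputsInside are illustrative names, not Selenium APIs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ContainerLookup {
    // Hypothetical page: one input outside the form and three inside it.
    public static final String PAGE =
            "<html><body><input type='text' name='search'/>"
          + "<form id='loginForm'>"
          + "<input type='text' name='username'/>"
          + "<input type='password' name='password'/>"
          + "<input type='submit' value='Login'/>"
          + "</form></body></html>";

    /** Finds the container with one XPath, then counts the input tags inside it. */
    public static int countInputsInside(String containerXpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node container = (Node) xpath.evaluate(containerXpath, doc, XPathConstants.NODE);
            if (container == null) {
                return 0; // no such container on the page
            }
            // The leading dot keeps the search inside the container node.
            NodeList inputs = (NodeList) xpath.evaluate(".//input", container, XPathConstants.NODESET);
            return inputs.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + containerXpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inputs in form: " + countInputsInside("//*[@id='loginForm']"));
        System.out.println("inputs in page: " + countInputsInside("/html"));
    }
}
```

Scoping the second lookup to the form leaves out the search box, which is the same effect as calling findElements on the parent element instead of on the driver.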
Best practice
A best practice while coding Selenium is always to follow a design pattern. We will go over design patterns in a subsequent chapter.
We should always keep the code modular and the modules decoupled from each other so that when one module changes, there is no impact on other modules.
Extracting WebElements dynamically using tagName
Now that we have seen how to create relative (customized) XPaths, it's time to see how to retrieve WebElements programmatically using Java lists. The best way to understand this is through an example. Suppose we want to find all the input textboxes on the login page of http://www.freecrm.com. We will make use of the findElements method. Remember, the findElements method is in the SearchContext interface.
Since the WebDriver interface is a child interface of SearchContext, it inherits the findElements method and we can invoke this method on the reference variable of WebDriver. In conjunction with findElements, we will also make use of the static method tagName of the By class.
The following code makes use of the List interface in Java (present in the java.util package):
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class URLTest {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver",
                "C:\\SeleniumWD\\src\\main\\resources\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.get("http://www.freecrm.com");
        // Collect every input tag on the page
        List<WebElement> inputBoxes = driver.findElements(By.tagName("input"));
        System.out.println("No of inputBoxes: " + inputBoxes.size());
        driver.quit(); // close the browser when done
    }
}
Output from this program:
No of inputBoxes: 3
The two textboxes for UserName and Password and the Login button are treated as input tags. The tagName static method is an extremely useful method and you can use this method for almost any element on any web page.
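A browser-free analogue of the tag name lookup is the DOM method getElementsByTagName from the JDK. The sketch below collects the type attribute of every matching element into a Java list. The markup in PAGE is a hypothetical login form, and TagNameLookup and typesOf are illustrative names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TagNameLookup {
    // Hypothetical login form: two textboxes and a submit button, all input tags.
    public static final String PAGE =
            "<html><body><form id='loginForm'>"
          + "<input type='text' name='username'/>"
          + "<input type='password' name='password'/>"
          + "<input type='submit' value='Login'/>"
          + "</form></body></html>";

    /** Returns the type attribute of every element with the given tag name, in document order. */
    public static List<String> typesOf(String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> types = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                types.add(((Element) nodes.item(i)).getAttribute("type"));
            }
            return types;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read tag " + tagName, e);
        }
    }

    public static void main(String[] args) {
        List<String> types = typesOf("input");
        System.out.println("No of inputBoxes: " + types.size());
        System.out.println("types: " + types);
    }
}
```

The list shows why the count is three: the submit button is an input tag just like the two textboxes.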
Properties file for WebElements
We have explored WebElements to a large extent. Now we will actually start preparing for the hybrid framework (we will look at this in a later chapter) by creating a WebElement store. This store will be created in a file known as the properties file, which always has the .properties extension. An example entry in the properties file can be:
USERNAME=//*[@name='username']
Entries in the properties file consist of key-value pairs. Here, USERNAME is the key and //*[@name='username'] is the value.
Note: The key in a properties file should always be unique. The value part can have duplicate values.
These values should be retrieved by the code once the key is supplied. For this purpose, we will be writing a retrieval program in a subsequent chapter.
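As a rough idea of what key-based retrieval involves, the following sketch reads entries with java.util.Properties. It is not the retrieval program developed later in the book. The class LocatorStore, its methods, and the PASSWORD entry are assumptions made for illustration, and the file content is supplied as a string so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class LocatorStore {
    // Stand-in for the content of a .properties file; PASSWORD is a hypothetical extra entry.
    public static final String CONTENT =
            "USERNAME=//*[@name='username']\n"
          + "PASSWORD=//*[@name='password']\n";

    /** Parses properties-file content into a key-value store. */
    public static Properties load(String content) {
        Properties store = new Properties();
        try {
            store.load(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties content", e);
        }
        return store;
    }

    /** Returns the locator stored under the key, failing loudly when the key is missing. */
    public static String locatorFor(Properties store, String key) {
        String value = store.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("No locator stored for key: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties store = load(CONTENT);
        System.out.println("USERNAME -> " + locatorFor(store, "USERNAME"));
        System.out.println("PASSWORD -> " + locatorFor(store, "PASSWORD"));
    }
}
```

Only the first = on each line separates the key from the value, so the = signs inside the XPath are kept as part of the value.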
The next question that might come to mind is: we have created a properties file and will be writing retrieval logic for it, but on what basis should the retrieval logic be invoked?
For this purpose, we will have to create test scripts. The test scripts can be created either in an Excel workbook or in a database. We have APIs such as Apache POI and Fillo available as open source. Fillo gives us certain advantages over POI. Fillo treats an Excel tab as a database table, and regular SQL queries, such as SELECT, UPDATE, and DELETE, can be run on the Excel tab data. Each row is equivalent to a row in a database table, while each column is equivalent to a database field.
Let's take a small diversion here to see what the prerequisites for automating mobile applications are.
Prerequisites for automating mobile applications
For automating mobile applications, there is specific software that needs to be downloaded. The following is a list of all the steps needed to set up the Appium server on your machine:
- Download the Java Development Kit (JDK) (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html).
- Set the Java environment variable path so that Java commands can be executed from anywhere on the system.
- Download the Android SDK/ADB (https://developer.android.com/studio/).
- Install the required Android SDK packages using the SDK Manager in the downloaded Android SDK folder.
- Set the Android Environment variables so that commands can be executed anywhere on the system.
- Download and configure nodejs (https://nodejs.org/en/download/). Take the LTS for whichever OS is applicable.
- Download the Microsoft .net framework (https://www.microsoft.com/en-in/download/details.aspx?id=30653).
- Download the Appium server (http://appium.io/).
XPaths for mobile applications
Let's first understand the various types of mobile applications. There are three types of mobile applications:
- Web application: Works only in the mobile browser (for example, a personal blog site)
- Native application: Works only as a standalone app (for example, a calculator)
- Hybrid application: Works on a mobile browser or standalone (for example, Gmail, Flipkart, and so on); it can also be defined as an application that contains a native view and web view
Finding XPaths for mobile browser applications
We have a variety of ways in which we can find the locators of mobile elements. Let's explore some of them.
Connecting the actual mobile device
Perform the following steps to find locators using an actual device connected to the computer:
- In Chrome on the desktop, type chrome://inspect/#devices in the address bar and ensure that Discover USB devices is checked.
- Type the URL in the URL textbox and click Open. The website now opens on the connected device.
- Click Inspect in Chrome on the desktop. A new instance of Chrome Developer tools opens in the desktop.
- We can interact with mobile web elements using DevTools.
How to use Screencast
Follow the steps to use screencast in DevTools:
- Perform the four steps from the preceding section.
- Click on the screencast icon in DevTools.
A window opens in which one can see the URL opened in the mobile device. You can interact with this window using DevTools.
Appium Inspector window
To use the Appium Inspector window, perform the following steps and find the desired locator:
- Start the Appium server.
- In the downloaded Android SDK folder, open the Android AVD (Android Virtual Device) Manager.
- Start the emulator inside the AVD.
- Click the magnifying glass icon on the Appium server GUI. This opens up the Appium Inspector.
How to use UIAutomatorViewer
Perform the following steps to use UIAutomatorViewer for capturing the locator:
- Download Android SDK from https://android-sdk.en.lo4d.com/.
- Once downloaded, go to Androidsdk/tools and double-click uiautomatorviewer.bat.
- Click on the device screenshot button, second from the left. The device image gets displayed in the left pane.
- Click on any element and the corresponding information is displayed in the right pane.
Mobile locators
The main locators used in mobile automation are as follows:
- Accessibility id: Unique identifier for a UI element.
- TagName: The same as WebDriver. This tells us what the tag is (input, select, and so on).
- Class Name: Identifies by the classname attribute.
- Xpath: Identifies by the absolute or customized XPath.
- ID: Identifies by the ID of the element.
What is a WebView?
The browser view that is embedded inside a native app is called a WebView. To view the XPath of a WebView in a hybrid app, we make use of the Selendroid Inspector. To use the Selendroid Inspector, perform the following steps:
- Open the Appium Server GUI and put the local APK file path in the application path.
- Select the Automation Name as Selendroid and the other mandatory parameters.
- Start the server after selecting Pre-Launch Application.
- Navigate to http://localhost:8080/inspector and start using the Selendroid Inspector.
If you enjoyed reading this article, you can explore Selenium WebDriver Quick Start Guide to write clear, readable, and reliable tests with Selenium WebDriver 3. Selenium WebDriver is a platform-independent API for automating the testing of both browser and mobile applications. It is also a core technology in many other browser automation tools, APIs and frameworks. Selenium WebDriver Quick Start Guide will guide you through the WebDriver APIs that are used in automation tests.