The Selenium Handbook is a reference for understanding Selenium. By the end, you will know how to locate elements on a web page and create your own test scripts. You will also read about new Selenium 4 features such as Relative Locators and the Chrome DevTools Protocol.
Table of Contents – The Selenium Handbook
- Introduction To Selenium
- Selenium IDE
- How To Install Selenium
- Selenium Locators
- Selenium Browser Methods
- Selenium WebElement Methods
- Selenium Navigation Methods
- Selenium Wait Methods
- Selenium Switch Methods
- Selenium Advanced Actions
- Selenium 4 Screenshot and New Window
- Selenium 4 Relative Locators
- Chrome DevTools Protocol (CDP)
- Selenium Grid
Introduction To Selenium
Selenium WebDriver is a reliable API for automating a browser through a driver. That’s useful because most applications are web-based and designed to run in browsers.
- Selenium supports the automation of browsers by sending and receiving commands.
- WebDriver communicates with the browser through a driver.
- Selenium IDE allows an engineer to record a test, play back the recording, and edit and debug the test.
- Selenium Grid executes our tests across several browsers, operating systems, and machines.
The Selenium family of products caters to anyone who is comfortable writing or recording their tests. We can record a test with Selenium IDE, execute our scripts with Selenium WebDriver, and scale up with Selenium Grid.
Selenium IDE
Before Selenium 4, Selenium IDE was only available for Firefox. Now, it’s available as both a Firefox and a Chrome extension. It has many new features, including support for plugins. We can extend Selenium IDE to introduce new commands or integrate with a third party. The following links provide screenshots and examples of more Selenium IDE features:
How To Install Selenium
There is more than one way to install Selenium WebDriver. Some engineers go to Selenium’s Official Site, others to Maven’s Repository. If you want the previous and latest releases for each component, use Selenium’s Official Site. Maven’s Repository, however, has many plugins, project jars, library jars, and artifacts. Detailed instructions for installing Selenium are provided in the following links.
Selenium Locators
Selenium WebDriver has 8 locator types to help find a WebElement. In alphabetical order, the 8 locators are:
- className – locate a WebElement by the value of its class attribute.
- cssSelector – locate a WebElement with a CSS selector.
- id – locate a WebElement by the value of its id attribute.
- linkText – locate a hyperlink WebElement by its entire text on a webpage.
- name – locate a WebElement by the value of its name attribute.
- partialLinkText – locate a hyperlink WebElement by part of its text on a webpage.
- tagName – locate a WebElement by its tag name.
- xpath – locate a WebElement with an XPath expression.
Each Selenium Locator requires a value to find a WebElement. Here’s a screenshot of the 8 locator types.
Selenium Browser Methods
Browser Methods are a group of methods that perform actions on a browser. In alphabetical order, the following links explain each Browser Method.
Selenium WebElement Methods
A WebElement is sometimes called an element. It represents an HTML element within an HTML document. Buttons, links, and images are examples of WebElements. The WebElement Method category therefore covers methods that perform an action on elements. In alphabetical order, the following are 16 Selenium WebElement Methods that carry out actions on the web page. Each link has a description, screenshot, and/or source code to explain the WebElement Method:
Selenium Navigation Methods
The Selenium Navigation Methods are a collection of methods that load a web page, refresh a web page, and move backward or forward in our browser’s history. Each Navigation method becomes available after writing navigate() followed by the dot operator. Writing navigate() allows the driver to access our browser’s history. The following links have code snippets and explain each Selenium Navigation Method:
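To make the idea of moving through the browser’s history concrete, here is a minimal sketch in plain Java. It is a toy model, not Selenium’s implementation: the class name HistorySketch, the load counter, and the example URLs are assumptions made for illustration. Only the method names to, back, forward, and refresh are borrowed from what is reachable through navigate().

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a browser history (hypothetical class, not part of Selenium).
class HistorySketch {
    private final Deque<String> backStack = new ArrayDeque<>();
    private final Deque<String> forwardStack = new ArrayDeque<>();
    private String current;     // page currently shown; null before the first load
    private int loadCount = 0;  // number of page loads, refreshes included

    // Load a new page: remember the current one and drop the forward history.
    void to(String url) {
        if (current != null) {
            backStack.push(current);
        }
        forwardStack.clear();
        current = url;
        loadCount++;
    }

    // Move one step backward, if there is a previous page.
    void back() {
        if (!backStack.isEmpty()) {
            forwardStack.push(current);
            current = backStack.pop();
            loadCount++;
        }
    }

    // Move one step forward, if we went back earlier.
    void forward() {
        if (!forwardStack.isEmpty()) {
            backStack.push(current);
            current = forwardStack.pop();
            loadCount++;
        }
    }

    // Reload the current page; the history itself does not change.
    void refresh() {
        if (current != null) {
            loadCount++;
        }
    }

    String getCurrentUrl() { return current; }
    int getLoadCount() { return loadCount; }
}
```

In this model, calling to() twice and then back() returns to the first page, and forward() moves to the second page again. A refresh() leaves the current URL untouched and only counts as another load.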
Selenium Wait Methods
The Selenium Wait Methods are a group of dynamic methods that pause execution between statements. Two common errors occur when the next statement executes:
- before the page has fully loaded, and
- before an element is visible.
The following links have code snippets and screenshots to describe each Selenium Wait Method:
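The idea behind a dynamic wait can be sketched in plain Java: keep checking a condition at a short interval and stop as soon as it holds or a timeout is reached. This is only an illustration of the polling idea under stated assumptions. The class WaitSketch, its until method, and the millisecond parameters are hypothetical names, not Selenium’s wait API.

```java
import java.util.function.Supplier;

// Hypothetical helper that illustrates "poll until ready or timed out".
class WaitSketch {
    // Calls `condition` repeatedly until it returns a non-null value,
    // sleeping `pollMillis` between attempts. Gives up after `timeoutMillis`.
    static <T> T until(Supplier<T> condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            T result = condition.get();
            if (result != null) {
                return result; // condition met, continue with the next statement
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException(
                        "Condition not met within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}
```

A caller would pass a condition such as "return the element once it is visible, otherwise null". The next statement then runs only after the condition holds, which avoids both errors listed above without pausing for a fixed amount of time.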
Selenium Switch Methods
Before performing an action within a frame, alert, or window, we must switch to it with one of the Selenium Switch Methods. Our program throws an exception if we forget to switch. Methods for switching are accessed through switchTo(), which returns a target locator that is used for sending future commands. The following links provide a description and code snippets for
Selenium Advanced Actions
Advanced Actions in Selenium handle keyboard and mouse interactions. However, there is a difference between Action and Actions. Both are members of Selenium’s interactions package.
- Action is an interface representing a single user interaction. It has only 1 method, and that method is perform.
- Actions is a class with many methods such as build, dragAndDrop, dragAndDropBy, keyDown, keyUp, moveToElement, and perform.
To use the Actions class, we instantiate it and then work through the object reference (e.g., act). Writing the object reference followed by the dot operator gives access to its methods. Here’s an example of the syntax followed by a screenshot of the Actions class.
Actions act = new Actions(driver); act.
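The relationship between the two types can be sketched without a browser. The following plain-Java toy model is an assumption-laden illustration, not Selenium’s code: SketchAction and SketchActions are hypothetical names, elements and keys are plain strings, and a list of log lines stands in for the browser. It shows a builder class queuing steps and bundling them into a single object whose only method is perform.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical counterpart of the single-method Action interface.
interface SketchAction {
    void perform();
}

// Hypothetical counterpart of the Actions class: queues steps, then builds them.
class SketchActions {
    private final List<SketchAction> steps = new ArrayList<>();
    private final List<String> log; // stand-in for a real browser

    SketchActions(List<String> log) {
        this.log = log;
    }

    SketchActions moveToElement(String element) {
        steps.add(() -> log.add("move to " + element));
        return this; // returning this allows method chaining
    }

    SketchActions keyDown(String key) {
        steps.add(() -> log.add("key down " + key));
        return this;
    }

    SketchActions keyUp(String key) {
        steps.add(() -> log.add("key up " + key));
        return this;
    }

    SketchActions click() {
        steps.add(() -> log.add("click"));
        return this;
    }

    // Bundle the queued steps into one composite action; nothing runs yet.
    SketchAction build() {
        List<SketchAction> snapshot = new ArrayList<>(steps);
        return () -> snapshot.forEach(SketchAction::perform);
    }

    // Convenience: build the composite action and run it immediately.
    void perform() {
        build().perform();
    }
}
```

In this sketch, chaining moveToElement, keyDown, click, and keyUp only records the steps. They run in order when perform is called on the built action.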
Selenium 4 Screenshot and New Window
Screenshots are extremely useful when automating an application. Selenium can capture a screenshot and then store it in a file. We can take a full-page screenshot or a WebElement screenshot. A full-page screenshot is only available for Firefox, using the getFullPageScreenshotAs method.
In addition to taking a screenshot, Selenium offers a way to open a new window/tab and automatically switch to it. The following links provide code snippets and screenshots to handle a new window/tab and capture screenshots.
Selenium 4 Relative Locators
The purpose of Relative Locators is to locate a specific element based on its position relative to a different element. Selenium supplies 5 overloaded methods, each of which accepts one of 2 parameter types (By locator or WebElement element). The 5 methods are:
- above() – locates one or more elements above a fixed element.
- below() – locates one or more elements below a fixed element.
- near() – locates one or more elements near a fixed element.
- toLeftOf() – locates one or more elements to the left of a fixed element.
- toRightOf() – locates one or more elements to the right of a fixed element.
There’s more than one way to use the Relative Locators. We can use a single Relative Locator or combine multiple Relative Locators. The following links have syntaxes, code snippets, and screenshots to explain Relative Locators.
- Import Relative Locators
- Use One Relative Locator To Find A WebElement
- Use Multiple Relative Locators To Find A WebElement
- Find A List of WebElements Using Relative Locators
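The positional idea behind the five methods can be sketched with simple bounding boxes. This plain-Java toy model makes several assumptions: Box and RelativeSketch are hypothetical names, coordinates are in pixels with y growing downward, and the exact comparison rules are simplified and may differ from what Selenium actually applies to rendered elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a rendered element: a name plus its bounding box.
class Box {
    final String name;
    final int x, y, width, height; // (x, y) is the top-left corner

    Box(String name, int x, int y, int width, int height) {
        this.name = name;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    int right() { return x + width; }
    int bottom() { return y + height; }
}

class RelativeSketch {
    static boolean above(Box c, Box anchor) { return c.bottom() <= anchor.y; }
    static boolean below(Box c, Box anchor) { return c.y >= anchor.bottom(); }
    static boolean toLeftOf(Box c, Box anchor) { return c.right() <= anchor.x; }
    static boolean toRightOf(Box c, Box anchor) { return c.x >= anchor.right(); }

    // "Near" here means the gap between the two boxes is at most maxDistance pixels.
    static boolean near(Box c, Box anchor, int maxDistance) {
        int dx = Math.max(0, Math.max(anchor.x - c.right(), c.x - anchor.right()));
        int dy = Math.max(0, Math.max(anchor.y - c.bottom(), c.y - anchor.bottom()));
        return Math.hypot(dx, dy) <= maxDistance;
    }

    // Returns the names of all boxes on the page that satisfy the filter.
    static List<String> find(List<Box> page, Predicate<Box> filter) {
        List<String> names = new ArrayList<>();
        for (Box b : page) {
            if (filter.test(b)) {
                names.add(b.name);
            }
        }
        return names;
    }
}
```

Using one filter corresponds to a single Relative Locator, and joining two filters with && corresponds to combining multiple Relative Locators. Because find returns every match, it also mirrors finding a list of elements rather than just one.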
Chrome DevTools Protocol (CDP)
Another name for Chrome DevTools Protocol is Chrome Debugging Protocol. In Selenium 4, it’s a new feature designed for debugging. All browsers built on the Chromium platform, such as Google Chrome and Microsoft Edge, have a Developer Tools option. Here’s a screenshot of the Developer Tools option.
When it comes to Selenium, the ChromiumDriver class has many methods, but 2 of them allow us to control Developer Tools in Chrome and Edge.
The executeCdpCommand method allows us to directly execute a Chrome DevTools Protocol command by passing in the command name and the parameters for that command. getDevTools is a method that returns DevTools. DevTools is a class that has methods to handle developer options. Here are screenshots displaying DevTools, executeCdpCommand, and getDevTools.
Selenium Grid
Selenium Grid is used to run tests on several browsers (Chrome, Firefox, Edge, etc.) and operating systems (Windows, macOS, Linux) using different machines. It allows the execution of Selenium WebDriver scripts on remote machines, whether the machines are real or virtual. The commands are routed from the client to remote browser instances.
Selenium’s original Grid was released in 2011. Since then, technologies such as Docker and Kubernetes have made running and scaling test infrastructure more accessible. Selenium Grid has therefore been updated to take advantage of these newer technologies. The updated Grid also makes it easier to trace and log what is happening during a session, so debugging is more straightforward.
At the heart of Selenium Grid, there is a new architecture that includes 4 processes: router, distributor, session map, and node.
- Router listens for new session requests.
- Distributor selects a node to run a test.
- Session Map is responsible for mapping each session ID to the node running that session.
- Node is a machine that executes our test scripts.
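How the four processes cooperate can be sketched as a small in-memory toy model in plain Java. Everything here is an assumption made for illustration: the class names, the session ID format, and the rule of picking the least-loaded node with free capacity are not taken from Selenium Grid. In a real Grid, these processes communicate over the network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Node: a machine that executes test commands for the sessions it hosts.
class NodeSketch {
    final String name;
    final int maxSessions;
    int activeSessions = 0;

    NodeSketch(String name, int maxSessions) {
        this.name = name;
        this.maxSessions = maxSessions;
    }

    boolean hasCapacity() { return activeSessions < maxSessions; }

    String execute(String sessionId, String command) {
        return name + " ran '" + command + "' for " + sessionId;
    }
}

// Distributor: selects a node for each new session.
class DistributorSketch {
    private final List<NodeSketch> nodes = new ArrayList<>();

    void register(NodeSketch node) { nodes.add(node); }

    // Assumed policy: the least-loaded node that still has free capacity.
    NodeSketch selectNode() {
        NodeSketch best = null;
        for (NodeSketch n : nodes) {
            if (n.hasCapacity() && (best == null || n.activeSessions < best.activeSessions)) {
                best = n;
            }
        }
        if (best == null) {
            throw new IllegalStateException("No node has free capacity");
        }
        best.activeSessions++;
        return best;
    }
}

// Router: entry point for new session requests and for later commands.
class RouterSketch {
    private final DistributorSketch distributor;
    // Session Map: session ID -> node that runs the session.
    private final Map<String, NodeSketch> sessionMap = new HashMap<>();
    private int nextId = 1;

    RouterSketch(DistributorSketch distributor) {
        this.distributor = distributor;
    }

    String newSession() {
        NodeSketch node = distributor.selectNode();
        String sessionId = "session-" + nextId++;
        sessionMap.put(sessionId, node);
        return sessionId;
    }

    String send(String sessionId, String command) {
        NodeSketch node = sessionMap.get(sessionId);
        if (node == null) {
            throw new IllegalArgumentException("Unknown session: " + sessionId);
        }
        return node.execute(sessionId, command);
    }
}
```

A new session request goes to the router, the distributor picks a node, and the session map records the choice. Every later command for that session ID is looked up in the map and forwarded to the same node.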
Thanks for reading this Selenium Handbook. For more on Selenium and Automation Tutorials, check out TestProject’s blog.