The Selenium Handbook

The Selenium Handbook is a reference for understanding Selenium. By the end, you will know how to locate elements on a web page and create your own test scripts. You will also read about new Selenium 4 features such as Relative Locators and the Chrome DevTools Protocol.

Table of Contents – The Selenium Handbook

  1. Introduction To Selenium
  2. Selenium IDE
  3. How To Install Selenium
  4. Selenium Locators
  5. Selenium Browser Methods
  6. Selenium WebElement Methods
  7. Selenium Navigation Methods
  8. Selenium Wait Methods
  9. Selenium Switch Methods
  10. Selenium Advanced Actions
  11. Selenium 4 Screenshot and New Window
  12. Selenium 4 Relative Locators
  13. Chrome DevTools Protocol (CDP)
  14. Selenium Grid
  15. Conclusion

Introduction To Selenium

Selenium WebDriver is a reliable API for automating our browser through a driver. That’s great because most applications are web-based and designed to run on browsers.

  • Selenium supports the automation of browsers by sending and receiving commands.
  • WebDriver communicates with the browser through a driver.

In addition to Selenium WebDriver, the Selenium family also consists of Selenium IDE and Selenium Grid.

  • Selenium IDE allows an engineer to record a test, play back the recording, and edit and debug the test.
  • Selenium Grid executes our test across several browsers, operating systems, and machines.

The Selenium family of products caters to people who are comfortable either writing or recording their tests. We can record a test using Selenium IDE, execute our scripts with Selenium WebDriver, and scale up with Selenium Grid.

Selenium IDE

Before Selenium 4, Selenium IDE was only available for Firefox. Now, it’s available as both a Firefox and a Chrome extension. It has many new features, including plugin support: we can extend Selenium IDE to introduce new commands or integrate with third-party tools. The following links provide screenshots and examples of more Selenium IDE features:

How To Install Selenium

There is more than one way to install Selenium WebDriver. Some engineers download it from Selenium’s official site, others from Maven’s repository. If you want the previous and latest releases for each component, use Selenium’s official site. Maven’s repository, however, also hosts many plugins, project jars, library jars, and other artifacts. Detailed instructions for installing Selenium are provided in the following links.

How To Install Selenium

Selenium Locators

Selenium WebDriver has 8 locator types to help find a WebElement. In alphabetical order, the 8 locators are:

  1. className – locate a WebElement by the value of its class attribute.
  2. cssSelector – locate a WebElement by a CSS selector.
  3. id – locate a WebElement by the value of its id attribute.
  4. linkText – locate a hyperlink WebElement by its entire text on a webpage.
  5. name – locate a WebElement by the value of its name attribute.
  6. partialLinkText – locate a hyperlink WebElement by part of its text on a webpage.
  7. tagName – locate a WebElement by its tag name.
  8. xpath – locate a WebElement by an XPath expression.

Each Selenium Locator requires a value to find a WebElement. Here’s a screenshot of the 8 locator types.

Selenium Locators
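
As a minimal sketch of the 8 locator types, the class below builds one By object per type. The class name LocatorSketch and every locator value (btn-primary, username, and so on) are hypothetical and only serve as illustrations.

```java
import java.util.List;
import org.openqa.selenium.By;

public class LocatorSketch {
    // One By object per locator type; every value below is a hypothetical example.
    static List<By> allLocatorTypes() {
        return List.of(
            By.className("btn-primary"),
            By.cssSelector("form input[type='email']"),
            By.id("username"),
            By.linkText("Forgot your password?"),
            By.name("password"),
            By.partialLinkText("Forgot"),
            By.tagName("button"),
            By.xpath("//button[@type='submit']"));
    }

    public static void main(String[] args) {
        // A By only describes how to search; driver.findElement(by) performs the lookup.
        allLocatorTypes().forEach(System.out::println);
    }
}
```

Building the locators needs no browser. A lookup only happens once a By is passed to findElement or findElements.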

Selenium Browser Methods

Browser Methods are a group of methods that perform actions on a browser. In alphabetical order, the following links explain each Browser Method.

Selenium Browser Methods
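
The sketch below exercises a handful of methods from the WebDriver interface (get, getCurrentUrl, getPageSource, getWindowHandle, getTitle, and quit). It assumes Chrome is installed locally. The class name and the inlined data: URL page are hypothetical, chosen so the example needs no web server.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrowserMethodsSketch {
    // Hypothetical page, inlined as a data: URL so the sketch needs no web server.
    static final String PAGE = "data:text/html,<title>Demo</title><h1>Hello</h1>";

    static String titleOf(WebDriver driver) {
        driver.get(PAGE);                                     // load the page
        System.out.println(driver.getCurrentUrl());           // address of the loaded page
        System.out.println(driver.getPageSource().length());  // size of the HTML source
        System.out.println(driver.getWindowHandle());         // ID of the current window
        return driver.getTitle();                             // text of the <title> tag
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(titleOf(driver));
        } finally {
            driver.quit(); // close every window and end the session
        }
    }
}
```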

Selenium WebElement Methods

A WebElement is sometimes called an element. It represents an HTML element within an HTML document. Buttons, links, and images are examples of WebElements. The WebElement Method category therefore performs actions on elements. In alphabetical order, the following are 16 Selenium WebElement Methods that carry out actions on the web page. Each link has a description, screenshot, and/or source code to explain the WebElement Method:

  1. clear
  2. click
  3. findElement
  4. findElements
  5. getAttribute
  6. getCssValue
  7. getLocation
  8. getRect
  9. getSize
  10. getTagName
  11. getText
  12. isDisplayed
  13. isEnabled
  14. isSelected
  15. sendKeys
  16. submit

Selenium WebElement Methods
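
Here is a small sketch that combines several of the methods listed above (findElement, clear, sendKeys, getTagName, getText, isDisplayed, isEnabled, and getAttribute). The form, its ids, and the class name are hypothetical, and the page is inlined as a data: URL.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class WebElementSketch {
    // Hypothetical form, inlined as a data: URL so no web server is needed.
    static final String PAGE = "data:text/html,"
        + "<form><input id='email' value='old'><button id='go' disabled>Go</button></form>";

    static String fillEmail(WebDriver driver, String address) {
        driver.get(PAGE);
        WebElement email = driver.findElement(By.id("email"));
        email.clear();             // remove the existing value
        email.sendKeys(address);   // type into the field

        WebElement button = driver.findElement(By.id("go"));
        System.out.println(button.getTagName() + " / " + button.getText()
            + " / displayed=" + button.isDisplayed()
            + " / enabled=" + button.isEnabled());

        return email.getAttribute("value"); // current value of the field
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(fillEmail(driver, "user@example.com"));
        } finally {
            driver.quit();
        }
    }
}
```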

Selenium Navigation Methods

The Selenium Navigation Methods are a collection of methods that load a web page, refresh a web page, and move backward and forward in our browser’s history. Each Navigation method becomes available after writing navigate() followed by the dot operator. Writing navigate() allows the driver to access our browser’s history. The following links have code snippets and explain each Selenium Navigation Method:
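
A minimal sketch of the four navigation methods (to, back, forward, and refresh) could look like this. The two pages are hypothetical data: URLs and the class name is made up for the example.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class NavigationSketch {
    // Two hypothetical pages, inlined as data: URLs.
    static final String FIRST = "data:text/html,<title>First</title>";
    static final String SECOND = "data:text/html,<title>Second</title>";

    static String tour(WebDriver driver) {
        driver.navigate().to(FIRST);    // load a page
        driver.navigate().to(SECOND);   // load another page; FIRST is now in the history
        driver.navigate().back();       // move backward in the history
        driver.navigate().forward();    // move forward again
        driver.navigate().refresh();    // reload the current page
        return driver.getTitle();
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(tour(driver));
        } finally {
            driver.quit();
        }
    }
}
```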

Selenium Wait Methods

The Selenium Wait Methods are a group of methods that pause execution between statements until a condition is met. They prevent two common errors, both caused by executing the next statement too early:

  1. Before the page has fully loaded
  2. Before an element is visible

The following links have code snippets and screenshots to describe each Selenium Wait Method:

Selenium Wait Methods
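
As one possible illustration, the sketch below uses an explicit wait (WebDriverWait with ExpectedConditions) for the second case, an element that is not yet visible. The page, its msg id, the 500 ms delay, and the 5 second timeout are all assumptions made for the example.

```java
import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class WaitSketch {
    // Hypothetical page: the paragraph is hidden and a script reveals it after 500 ms.
    static final String PAGE = "data:text/html,"
        + "<p id='msg' style='display:none'>Ready</p>"
        + "<script>setTimeout(function(){"
        + "document.getElementById('msg').style.display='block'},500)</script>";

    static String waitForMessage(WebDriver driver) {
        driver.get(PAGE);
        // Poll until the element is visible, giving up after 5 seconds.
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));
        WebElement msg = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.id("msg")));
        return msg.getText();
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(waitForMessage(driver));
        } finally {
            driver.quit();
        }
    }
}
```

Without the wait, the element would be found right away but would still be hidden, so getText would return an empty string.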

Selenium Switch Methods

With the Selenium Switch Methods, we must switch to a frame, alert, or window before performing an action within it. Our program throws an exception if we forget to switch. Methods for switching are accessed through switchTo(). switchTo() returns a TargetLocator, which directs future commands to the selected frame, alert, or window. The following links provide a description and code snippets for switching to a frame, alert, or window:

Selenium Switch Methods
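
The following sketch switches to a frame and to an alert. The page, the ids inner, note, and warn, and the class name are hypothetical. The alert is fetched through an explicit wait, which is one common way to do it rather than the only one.

```java
import java.time.Duration;
import org.openqa.selenium.Alert;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class SwitchSketch {
    // Hypothetical page with a button that raises an alert and an inline frame.
    static final String PAGE = "data:text/html,"
        + "<button id='warn' onclick='alert(1)'>Warn</button>"
        + "<iframe id='inner' srcdoc='<p id=note>inside</p>'></iframe>";

    static String readInsideFrame(WebDriver driver) {
        driver.get(PAGE);
        driver.switchTo().frame("inner");   // switch by the frame's id or name
        String text = driver.findElement(By.id("note")).getText();
        driver.switchTo().defaultContent(); // return to the top-level page
        return text;
    }

    static String acceptAlert(WebDriver driver) {
        driver.get(PAGE);
        driver.findElement(By.id("warn")).click();
        // alertIsPresent switches to the alert as soon as it is open.
        Alert alert = new WebDriverWait(driver, Duration.ofSeconds(5))
            .until(ExpectedConditions.alertIsPresent());
        String message = alert.getText();
        alert.accept();                     // click OK
        return message;
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(readInsideFrame(driver));
            System.out.println(acceptAlert(driver));
        } finally {
            driver.quit();
        }
    }
}
```

Windows and tabs follow the same pattern through driver.switchTo().window(handle), where the handle comes from getWindowHandle or getWindowHandles.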

Selenium Advanced Actions

Advanced Actions in Selenium handle keyboard and mouse interactions. However, there is a difference between Action and Actions. Both are members of Selenium’s interactions package.

  • Action is an interface representing a single user interaction. Its only method is perform.
  • Actions is a class with many methods, such as build, dragAndDrop, dragAndDropBy, keyDown, keyUp, moveToElement, and perform.

To use the Actions class, we instantiate it and then call its methods through the object reference (e.g., act) and the dot operator. Here’s an example of the syntax followed by a screenshot of the Actions class.

Actions act = new Actions(driver);

Selenium Advanced Actions
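
To show how Action and Actions fit together, the sketch below chains several Actions methods, turns the chain into a single Action with build, and runs it with perform. The input field and its id q are hypothetical.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.interactions.Action;
import org.openqa.selenium.interactions.Actions;

public class ActionsSketch {
    // Hypothetical page with a single text field.
    static final String PAGE = "data:text/html,<input id='q'>";

    static String typeWithShiftHeld(WebDriver driver, String text) {
        driver.get(PAGE);
        WebElement box = driver.findElement(By.id("q"));

        Actions act = new Actions(driver);
        // Chain mouse and keyboard steps, then combine them into one Action.
        Action typing = act.moveToElement(box)
            .click()
            .keyDown(Keys.SHIFT)   // hold SHIFT
            .sendKeys(text)        // type while SHIFT is held
            .keyUp(Keys.SHIFT)     // release SHIFT
            .build();
        typing.perform();          // nothing runs until perform is called

        return box.getAttribute("value");
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(typeWithShiftHeld(driver, "selenium"));
        } finally {
            driver.quit();
        }
    }
}
```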

Selenium 4 Screenshot and New Window

Screenshots are extremely useful when automating an application. Selenium lets us capture a screenshot and store it in a file. We can take a screenshot of the page or of a single WebElement. A full-page screenshot is only available for Firefox, using the getFullPageScreenshotAs method.

In addition to taking a screenshot, Selenium offers a way to open a window/tab and automatically switch to it. The following links provide code snippets and screenshots to handle a new window/tab and capture screenshots.
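
A rough sketch of both features follows: it opens a new tab with newWindow, then saves a screenshot of the page and a screenshot of one WebElement. The page, the headline id, the file names, and the class name are assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import org.openqa.selenium.By;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WindowType;
import org.openqa.selenium.chrome.ChromeDriver;

public class ScreenshotSketch {
    // Hypothetical page with one headline.
    static final String PAGE = "data:text/html,<h1 id='headline'>Hello</h1>";

    // Returns the number of open windows/tabs after the new tab is created.
    static int openTabAndCapture(WebDriver driver, Path pageFile, Path elementFile) {
        driver.switchTo().newWindow(WindowType.TAB); // opens a tab and switches to it
        driver.get(PAGE);
        try {
            // Screenshot of the page as shown in the browser.
            File page = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
            Files.copy(page.toPath(), pageFile, StandardCopyOption.REPLACE_EXISTING);
            // Screenshot of a single WebElement.
            File element = driver.findElement(By.id("headline"))
                .getScreenshotAs(OutputType.FILE);
            Files.copy(element.toPath(), elementFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return driver.getWindowHandles().size();
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // assumes Chrome is installed locally
        try {
            System.out.println(openTabAndCapture(
                driver, Path.of("page.png"), Path.of("headline.png")));
        } finally {
            driver.quit();
        }
    }
}
```

Passing WindowType.WINDOW instead of WindowType.TAB opens a separate window rather than a tab.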

Selenium 4 Relative Locators

The purpose of Relative Locators is to locate a specific element based on its position relative to a different element. Selenium supplies 5 overloaded methods, each of which accepts either a By locator or a WebElement as its parameter. The 5 methods are:

  1. above() – locates an element(s) above a fixed element.
  2. below() – locates an element(s) below a fixed element.
  3. near() – locates an element(s) near a fixed element.
  4. toLeftOf() – locates an element(s) to the left of a fixed element.
  5. toRightOf() – locates an element(s) to the right of a fixed element.

Selenium 4 Relative Locators

There’s more than one way to use Relative Locators. We also have the choice of using a single Relative Locator or chaining multiple Relative Locators. The following links have syntaxes, code snippets, and screenshots to explain Relative Locators.
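
As a minimal sketch, the class below builds one single Relative Locator and one chained Relative Locator with RelativeLocator.with. The ids password and cancel and the class name are hypothetical.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.support.locators.RelativeLocator;

public class RelativeLocatorSketch {
    // Single Relative Locator: an <input> above the element with id "password".
    static By emailField() {
        return RelativeLocator.with(By.tagName("input")).above(By.id("password"));
    }

    // Multiple Relative Locators: a <button> below "password" and right of "cancel".
    static By submitButton() {
        return RelativeLocator.with(By.tagName("button"))
            .below(By.id("password"))
            .toRightOf(By.id("cancel"));
    }

    public static void main(String[] args) {
        // Each result is a By, so it can be passed to driver.findElement(...).
        System.out.println(emailField() != null);
        System.out.println(submitButton() != null);
    }
}
```

The same methods also accept a WebElement that was located earlier, for example above(passwordElement), instead of a By.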

Chrome DevTools Protocol (CDP)

Another name for Chrome DevTools Protocol is Chrome Debugging Protocol. It’s a new feature in Selenium 4 that’s designed for debugging. Browsers built on the Chromium platform, such as Google Chrome and Microsoft Edge, have a Developer Tools option. Here’s a screenshot of the Developer Tools option.

Chrome DevTools Protocol (CDP)

When it comes to Selenium, the ChromiumDriver class has a lot of methods but 2 methods allow us to control Developer Tools in Chrome and Edge.

  1. executeCdpCommand
  2. getDevTools

executeCdpCommand directly executes a Chrome DevTools Protocol command; we pass in the command name and the parameters for that command. getDevTools is a method that returns a DevTools object. DevTools is a class with methods for handling developer options. Here are screenshots displaying DevTools, executeCdpCommand, and getDevTools.

Chrome DevTools Protocol (CDP)
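
Here is a small sketch of executeCdpCommand. It sends the CDP command Browser.getVersion, which takes no parameters, and reads the product entry from the returned map. The class and method names are hypothetical.

```java
import java.util.Map;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chromium.ChromiumDriver;

public class CdpSketch {
    static String browserProduct(ChromiumDriver driver) {
        // First argument: the CDP command name. Second argument: its parameters.
        Map<String, Object> version =
            driver.executeCdpCommand("Browser.getVersion", Map.of());
        return String.valueOf(version.get("product"));
    }

    public static void main(String[] args) {
        ChromeDriver driver = new ChromeDriver(); // ChromeDriver extends ChromiumDriver
        try {
            System.out.println(browserProduct(driver));
        } finally {
            driver.quit();
        }
    }
}
```

getDevTools is the alternative entry point: it returns a DevTools object, and calling createSession on that object opens a CDP session with the browser.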

Selenium Grid

Selenium Grid is used to run tests on several browsers (Chrome, Firefox, Edge, etc.) and operating systems (Windows, macOS, Linux) using different machines. It allows the execution of Selenium WebDriver scripts on remote machines, whether those machines are real or virtual. Commands from the client are routed to remote browser instances.

Selenium’s original Grid was released in 2011. Since then, technologies such as Docker and Kubernetes have made distributed infrastructure more accessible. Selenium Grid has therefore been updated to take advantage of these newer technologies. In addition to being modern, the new Grid makes it easier to trace and log what is happening inside it. As a result, debugging is more straightforward with Selenium Grid.

At the heart of Selenium Grid, there is a new architecture that includes 4 processes: router, distributor, session map, and node.

  • Router listens for new session requests.
  • Distributor selects a node to run a test.
  • Session Map is responsible for mapping each session ID to its node.
  • Node is a machine for executing our test scripts.
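
The four roles above can be pictured with a toy, in-memory model. This is only an illustration of the flow and not Selenium Grid’s actual implementation: the class GridModel, the slot counts, and the "most free slots" selection rule are all assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

public class GridModel {
    // Node: a machine with a limited number of free browser slots.
    static final class Node {
        final String name;
        int freeSlots;
        Node(String name, int freeSlots) { this.name = name; this.freeSlots = freeSlots; }
    }

    private final List<Node> nodes = new ArrayList<>();
    private final Map<String, Node> sessionMap = new HashMap<>(); // session ID -> node
    private int nextId = 1;

    void register(Node node) { nodes.add(node); }

    // Router: entry point for a new session request.
    String newSession() {
        Node chosen = distribute();
        chosen.freeSlots--;
        String sessionId = "session-" + nextId++;
        sessionMap.put(sessionId, chosen); // Session Map remembers the owner
        return sessionId;
    }

    // Distributor: pick the node with the most free slots (an assumed rule).
    private Node distribute() {
        return nodes.stream()
            .filter(n -> n.freeSlots > 0)
            .max(Comparator.comparingInt(n -> n.freeSlots))
            .orElseThrow(() -> new IllegalStateException("no free node"));
    }

    // Router: later commands are sent to the node that owns the session.
    String route(String sessionId) {
        Node node = sessionMap.get(sessionId);
        if (node == null) {
            throw new NoSuchElementException("unknown session " + sessionId);
        }
        return node.name;
    }
}
```

For example, after registering node-a with 1 slot and node-b with 2 slots, the first call to newSession returns session-1, and route("session-1") returns node-b.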


Thanks for reading this Selenium Handbook. For more on Selenium and Automation Tutorials, check out TestProject’s blog.

About the author

Rex Jones II

Rex Jones II has a passion for sharing knowledge about testing software. His background is in development, but he enjoys testing applications.

Rex is an author, trainer, consultant, former member of the Board of Directors for the Dallas / Fort Worth Mercury User Group (DFWMUG), and member of the Dallas / Fort Worth Quality Assurance Association (DFWQAA). In addition, he is a Certified Software Tester Engineer (CSTE) and has a Test Management Approach (TMap) certification.

Recently, Rex created a social network that demonstrates automation through videos. In addition to the social network, he has written 6 Programming / Automation books covering VBScript (the programming language for QTP/UFT), Java, Selenium WebDriver, and TestNG.

✔️ YouTube https://www.youtube.com/c/RexJonesII/videos
✔️ Facebook http://facebook.com/JonesRexII
✔️ Twitter https://twitter.com/RexJonesII
✔️ GitHub https://github.com/RexJonesII/Free-Videos
✔️ LinkedIn https://www.linkedin.com/in/rexjones34/

