
Selenium 4 Is Officially Released!

Selenium 4

In this article, let’s discuss the release of Selenium 4 🥳 By convention, the software release life cycle goes from alpha versions to beta versions, and then to release candidates. After the release candidates, Selenium 4 is considered stable enough for a production release. That means it has passed many verifications with no showstopper bugs. For Selenium 4, there were a total of 7 Alphas, 4 Betas, and 3 Release Candidates.

Selenium 4 Versions

We have been waiting since April 2019 for the official Selenium 4 release date! Now the long-anticipated wait is over. Selenium 4 is here with some valuable upgrades. By the end of this article, you will have read about the whole Selenium 4 family: Selenium IDE, Selenium Grid, and Selenium WebDriver ✅


There are several new features with the release of Selenium 4. Some of the features include network interception and the ability to authenticate with a website using basic authentication or digest authentication.

  • Network interception is the process of capturing network traffic to observe what’s taking place on the network. Selenium 4 allows us to get the HTTP Status Codes and modify the HTTP traffic.
  • Basic authentication is an HTTP authentication method for a client to provide a username and password when sending a request to the server.
  • Digest authentication is an HTTP authentication method designed to be more secure than basic authentication. The password itself is never sent. Instead, the client uses an algorithm to create a hash from the credentials and a value supplied by the server, and sends that hash.
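
To make the difference between the two authentication methods concrete, here is a minimal plain-Java sketch (no Selenium involved) of what the client actually puts on the wire. It assumes the classic MD5, qop=auth variant of digest authentication described in RFC 2617; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class HttpAuthSketch {

    // Basic: the credentials travel base64-encoded (not encrypted) in the header.
    static String basicHeader(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Hex-encoded MD5 of a string.
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Digest (assumed variant: MD5 with qop=auth): only this hash is sent,
    // never the password itself. The nonce comes from the server.
    static String digestResponse(String user, String realm, String password,
                                 String method, String uri,
                                 String nonce, String nc, String cnonce) {
        String ha1 = md5Hex(user + ":" + realm + ":" + password);
        String ha2 = md5Hex(method + ":" + uri);
        return md5Hex(ha1 + ":" + nonce + ":" + nc + ":" + cnonce + ":auth:" + ha2);
    }

    public static void main(String[] args) {
        System.out.println(basicHeader("Aladdin", "open sesame"));
        System.out.println(digestResponse("Mufasa", "testrealm@host.com", "Circle Of Life",
                "GET", "/dir/index.html",
                "dcd98b7102dd2f0e8b11d0f600bfb0c093", "00000001", "0a4f113b"));
    }
}
```

Because the basic header is only an encoding, anyone who sees it can recover the password, which is why digest authentication sends a hash instead.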

Selenium IDE

Selenium IDE records the actions we take on a website and then plays back those same actions. Prior to Selenium 4, the extension was only available on Firefox. With the new version, we can record and then replay a user’s actions on Firefox and Chrome.

The new Selenium IDE features include Backup Element Selectors, Control Flows, and the Command Line Runner, also known as the CLI runner. Also, there are plans for Selenium IDE to be available as an Electron app. The Electron app permits us to use the Debugging Protocol and listen for events from the browser.

Selenium Grid

Selenium Grid allows us to remotely execute Test Scripts on virtual machines or real machines by routing commands from the client to remote browser instances. The benefit is that Test Scripts can run in parallel across multiple machines, browsers, and operating systems.

With Selenium 4, the Grid has a new, more advanced architecture made up of several components. The components include a Client, Router, Session Queue, Distributor, Session Map, Node, and Event Bus. In addition, GraphQL has been added as a new way to query and get data. Here’s a screenshot of the Grid Components from Selenium’s website.

Selenium Grid
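
As a rough illustration of the GraphQL support, the sketch below builds the HTTP request for a simple Grid query using only the JDK's HTTP client (Java 11+). The endpoint URL (a local Grid on port 4444 exposing /graphql) and the field names maxSession and sessionCount are assumptions to check against your own Grid's schema; the class name is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class GridGraphQlSketch {

    // Assumption: a Grid running locally on port 4444 that exposes /graphql.
    static final String ENDPOINT = "http://localhost:4444/graphql";

    // Wraps a GraphQL query in the JSON envelope that is POSTed to the endpoint.
    static String jsonBody(String query) {
        return "{\"query\": \"" + query.replace("\"", "\\\"") + "\"}";
    }

    // Builds (but does not send) the POST request for the given query.
    static HttpRequest buildRequest(String query) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(query)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed field names; verify them against the Grid's GraphQL schema.
        HttpRequest request = buildRequest("{ grid { maxSession, sessionCount } }");
        // Sending it requires a running Grid, for example:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(request.method() + " " + request.uri());
    }
}
```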

The process starts with the Router listening for a session request from the Client. After receiving a request, the Router sends it to the correct component. If the request is for a new session, communication begins between the Router and the Session Queue: the Router adds the new session request to the Session Queue and then waits for a response. If the request belongs to an existing session, communication takes place between the Router and the Session Map. All of the new session requests are held in the Session Queue in First In First Out (FIFO) order.
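
The routing decision described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not the Grid's real classes: requests are reduced to strings, and RouterSketch, sessionQueue, and sessionMap are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class RouterSketch {
    // Pending new-session requests, served first in, first out.
    final Queue<String> sessionQueue = new ArrayDeque<>();
    // Session ID -> address of the Node that runs the session.
    final Map<String, String> sessionMap = new HashMap<>();

    // A request without a session ID is new and goes to the queue;
    // otherwise the Session Map is asked which Node owns the session.
    String route(String sessionId, String requestedBrowser) {
        if (sessionId == null) {
            sessionQueue.add(requestedBrowser);
            return "queued";
        }
        String node = sessionMap.get(sessionId);
        return node == null ? "unknown session" : "forwarded to " + node;
    }

    public static void main(String[] args) {
        RouterSketch router = new RouterSketch();
        router.route(null, "chrome");
        router.route(null, "firefox");
        System.out.println(router.sessionQueue.peek()); // oldest request is served first
    }
}
```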

The Distributor is responsible for requesting the first matching request from the Session Queue (also called the New Session Queue). If a slot is available, the Distributor attempts to create a new session. It searches for an appropriate Node to create the session on and sends the resulting information back to the Session Queue.
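
A simplified way to think about this matching step is sketched below. It is a toy model, not the Grid's implementation: the Node class, its fields, and assignNext are hypothetical, and a request is represented only by the browser name it asks for.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

class DistributorSketch {

    // Hypothetical Node: a name, the browser it serves, and its free slots.
    static class Node {
        final String name;
        final String browser;
        int freeSlots;

        Node(String name, String browser, int freeSlots) {
            this.name = name;
            this.browser = browser;
            this.freeSlots = freeSlots;
        }
    }

    // Takes the oldest queued request that some Node can serve right now,
    // reserves a slot on that Node, and returns "browser@node".
    // Returns an empty Optional when no queued request can be matched.
    static Optional<String> assignNext(Deque<String> queue, List<Node> nodes) {
        Iterator<String> pending = queue.iterator();
        while (pending.hasNext()) {
            String browser = pending.next();
            for (Node node : nodes) {
                if (node.browser.equals(browser) && node.freeSlots > 0) {
                    node.freeSlots--;
                    pending.remove();
                    return Optional.of(browser + "@" + node.name);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(List.of("safari", "chrome"));
        List<Node> nodes = List.of(new Node("node-1", "chrome", 1));
        System.out.println(assignNext(queue, nodes)); // the chrome request matches node-1
        System.out.println(assignNext(queue, nodes)); // safari stays queued: no matching Node
    }
}
```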

There can be multiple Nodes, and each Node can execute sessions in Docker containers or relay commands via a specific configuration. Each Node oversees the slots for its browsers.

By default, the Node automatically registers all available browser drivers on the machine it runs on. The number of slots differs by browser. Only one slot is created for Safari and for Internet Explorer. However, one slot per available CPU is created for Firefox and for Chromium-based browsers (Chrome and Edge).
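
The default slot rule can be written down in a few lines. This is just a sketch of the rule as stated above; defaultSlots is a hypothetical helper, not a Grid API.

```java
class NodeSlotsSketch {

    // One slot for Safari and Internet Explorer; one slot per available CPU
    // for Firefox and the Chromium-based browsers (Chrome and Edge).
    static int defaultSlots(String browser, int availableCpus) {
        if (browser.equalsIgnoreCase("safari") || browser.equalsIgnoreCase("internet explorer")) {
            return 1;
        }
        return availableCpus;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("chrome slots: " + defaultSlots("chrome", cpus));
        System.out.println("safari slots: " + defaultSlots("safari", cpus));
    }
}
```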

Information regarding the Node and Session ID is stored in the Session Map. The Router sends a Session ID to the Session Map. In return, the Session Map passes the associated Node back to the Router.

The Event Bus should be started first when launching the Selenium 4 Grid in its fully distributed mode. Its duty is to act as the communicator between the Nodes, Distributor, Session Queue, and Session Map. The Event Bus lets the Selenium 4 Grid handle internal communication through messages. As a result, the Grid bypasses costly HTTP calls.

Selenium WebDriver


According to Simon Stewart, the creator of Selenium WebDriver, the W3C WebDriver Protocol is the main reason for upgrading to Selenium 4. It has at least 3 advantages:

  1. W3C WebDriver Protocol provides standards.
  2. W3C WebDriver Protocol provides stability.
  3. W3C WebDriver Protocol provides an updated Actions API that is supplied with better resources.

The architecture for Selenium 3 included the JSON Wire Protocol. The objective of JSON Wire Protocol was to transfer information from the client to the server. That information was processed over HTTP by sending HTTP Requests and receiving HTTP Responses.

Selenium Architecture

With Selenium 4, the JSON Wire Protocol has been removed from the new architecture. The Selenium Client & WebDriver Language Bindings and the Browser Drivers both follow the W3C WebDriver Protocol, so we now have direct communication between them.

Selenium 4 Architecture

Relative Locators

Relative Locators were formerly called Friendly Locators. Selenium 4 provides 5 Relative Locators to find an element or elements located relative to another element: above(), below(), near(), toLeftOf(), and toRightOf().

Relative Locators - above, below, near, left of

We have the option of finding a list of elements using a Relative Locator. The Selenium statement consists of List<WebElement> and the findElements() method. Below is a screenshot and code snippet of TestProject’s Addons Library.

An Addon is a collection of coded Automation Actions we can use to empower and extend built-in capabilities. The code snippet shows how our Test Script can combine multiple relative locators to find all platforms (Android, iOS, and Web) using TestProject Addons.

Visible Elements Operations - TestProject Addon

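    public void testRelativeLocator_FindListOfWebElements () {
      // The rest of the locator chain (for example .near(...) or .below(...)) is cut off in the source.
      List<WebElement> allPlatforms = driver.findElements(RelativeLocator.withTagName("img"));

      for (WebElement platform : allPlatforms) {
        // The loop body is cut off in the source.
      }
    }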

Switch To New Window/Tab

Selenium 4 allows us to open a new window or a new tab in the same session. After opening the window or tab, we can work in it without creating a new driver. This is a valuable feature that automation engineers have long requested. The newWindow() method creates a new window or tab and then automatically switches focus to it.
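
Here is a minimal sketch of newWindow() in use. It assumes selenium-java 4 is on the classpath and that Chrome and a matching chromedriver are installed locally; the URLs and the class name are placeholders.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WindowType;
import org.openqa.selenium.chrome.ChromeDriver;

class NewWindowSketch {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://www.selenium.dev/");
            String originalHandle = driver.getWindowHandle();

            // Opens a new tab in the same session and switches focus to it.
            driver.switchTo().newWindow(WindowType.TAB);
            driver.get("https://www.selenium.dev/documentation/");

            // WindowType.WINDOW opens a separate browser window instead.
            driver.switchTo().newWindow(WindowType.WINDOW);

            // Switch back to the first window when we are done.
            driver.switchTo().window(originalHandle);
        } finally {
            driver.quit();
        }
    }
}
```

Keeping the original window handle is what lets the script return to the first window after working in the new one.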

Take WebElement or Full Page Screenshots

With Selenium 3, there was a way to take a screenshot of a page. However, it did not provide a way to capture a screenshot of an individual WebElement or of the full page. The full page screenshot feature captures an application’s entire page, including the footer, whether or not the whole page is visible in the browser window.
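
The sketch below shows both kinds of screenshot. It assumes selenium-java 4 on the classpath plus a local Firefox and geckodriver, because in the Java bindings getFullPageScreenshotAs() is a FirefoxDriver method; the URL and the h1 locator are placeholders.

```java
import java.io.File;
import org.openqa.selenium.By;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

class ScreenshotSketch {
    public static void main(String[] args) {
        FirefoxDriver driver = new FirefoxDriver();
        try {
            driver.get("https://www.selenium.dev/");

            // Screenshot of a single WebElement (hypothetical locator).
            WebElement heading = driver.findElement(By.tagName("h1"));
            File elementShot = heading.getScreenshotAs(OutputType.FILE);

            // Screenshot of the entire page, including parts outside the viewport.
            File fullPageShot = driver.getFullPageScreenshotAs(OutputType.FILE);

            // Both files are temporary; copy them if they need to be kept.
            System.out.println(elementShot.getAbsolutePath());
            System.out.println(fullPageShot.getAbsolutePath());
        } finally {
            driver.quit();
        }
    }
}
```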

Chrome DevTools Protocol (CDP)

CDP is an acronym for Chrome DevTools Protocol, also called the Chrome Debugging Protocol. Support for it is a new API feature in Selenium 4, and the protocol itself is designed for debuggers. It lets us take advantage of the debugging protocol in Google Chrome and Microsoft Edge. Both browsers are built on the Chromium platform and contain a Developer Tools option.

Chrome DevTools - TestProject Blog View

Elements is the most popular tab for automation, but we also have Console, Sources, Network, Performance, Memory, Application, and Security. It’s the same panel that opens when we right-click a page and click Inspect.

Selenium 4 has many methods in the ChromiumDriver class. However, 2 of those methods allow us to control Developer Tools in Chrome and Edge: executeCdpCommand() and getDevTools(). executeCdpCommand() lets us directly execute a Chrome DevTools Protocol command by passing in the command name and its parameters. getDevTools() returns a DevTools object. Here’s a high-level list of what we can perform with CDP:

  • View Console Logs
  • Mock Geolocation
  • Slow Down Our Internet
  • Take The Network Offline
  • Ignore A Certificate Error
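
Several items in the list above come down to sending one CDP command with a small map of parameters. The sketch below only builds those parameter maps in plain Java, so it runs without a browser; the helper names are hypothetical, while the command and parameter names come from the Chrome DevTools Protocol (Emulation.setGeolocationOverride, Network.emulateNetworkConditions, Security.setIgnoreCertificateErrors).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CdpCommandSketch {

    // Parameters for "Emulation.setGeolocationOverride" (mock geolocation).
    static Map<String, Object> geolocation(double latitude, double longitude) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("latitude", latitude);
        params.put("longitude", longitude);
        params.put("accuracy", 1);
        return params;
    }

    // Parameters for "Network.emulateNetworkConditions" (slow down or go offline).
    // Latency is in milliseconds; throughput is in bytes per second.
    static Map<String, Object> networkConditions(boolean offline, int latencyMs, int bytesPerSecond) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("offline", offline);
        params.put("latency", latencyMs);
        params.put("downloadThroughput", bytesPerSecond);
        params.put("uploadThroughput", bytesPerSecond);
        return params;
    }

    // Parameters for "Security.setIgnoreCertificateErrors".
    static Map<String, Object> ignoreCertificateErrors() {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("ignore", true);
        return params;
    }

    public static void main(String[] args) {
        // With a ChromeDriver or EdgeDriver instance named driver, a map is passed like this:
        // driver.executeCdpCommand("Emulation.setGeolocationOverride", geolocation(32.7767, -96.7970));
        // The Network commands may need "Network.enable" to be sent first.
        System.out.println(geolocation(32.7767, -96.7970));
        System.out.println(networkConditions(true, 0, 0));
        System.out.println(ignoreCertificateErrors());
    }
}
```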


Now that Selenium 4 is finally here, are you ready to give it a spin?
Which new feature are you most excited about? Share with us in the comments below 😎

About the author

Rex Jones II

Rex Jones II has a passion for sharing knowledge about testing software. His background is in development, but he enjoys testing applications.

Rex is an author, trainer, consultant, and former member of the Board of Directors for the Dallas / Fort Worth Mercury User Group (DFWMUG) and a member of the Dallas / Fort Worth Quality Assurance Association (DFWQAA). In addition, he is a Certified Software Tester Engineer (CSTE) and has a Test Management Approach (TMap) certification.

Recently, Rex created a social network that demonstrates automation videos. In addition to the social network, he has written 6 Programming / Automation books covering VBScript (the programming language for QTP/UFT), Java, Selenium WebDriver, and TestNG.

✔️ YouTube https://www.youtube.com/c/RexJonesII/videos
✔️ Facebook http://facebook.com/JonesRexII
✔️ Twitter https://twitter.com/RexJonesII
✔️ GitHub https://github.com/RexJonesII/Free-Videos
✔️ LinkedIn https://www.linkedin.com/in/rexjones34/

