In this series of Selenium 4 articles, we will look into the entire Selenium suite. The suite includes Selenium IDE, Selenium WebDriver, and Selenium Grid. Each component has something new that is useful for automation.
- Selenium IDE allows us to record a test, play back the recording, and edit and debug the test.
- Selenium WebDriver is an API that executes our test by driving a browser for automating an Application Under Test (AUT).
- Selenium Grid executes our test across multiple browsers, operating systems, and machines.
We can look at Selenium as a family of products. Our test can be recorded in Selenium IDE, then executed using Selenium WebDriver, and scaled up utilizing Selenium Grid. The Selenium 4 pre-release version is available from Maven’s Repository. By the end of this series, you will have read updates concerning Selenium IDE, Selenium WebDriver, and Selenium Grid.
Tutorial Chapters – Selenium 4
- You’re here → New Selenium 4 Features (Chapter 1)
- Selenium WebDriver (Chapter 2)
- Relative Locators (Chapter 3)
Selenium IDE is an open-source tool for recording and then playing back our test. The new version is named Selenium IDE TNG (The Next Generation). We can view it as the on-ramp to the Selenium family for people without development experience.
Many people wondered what happened to the previous release. Unfortunately, Selenium IDE was only available through Firefox and was packaged as an XPI file. XPI (Cross-Platform Install) is the compressed installation format used by Mozilla applications. The old IDE stopped working after Mozilla dropped support for legacy add-ons in favor of Web Extensions. Since that change, we can download Selenium IDE TNG via https://www.selenium.dev/selenium-ide/ as a Firefox and Chrome extension. It is currently available as a Web Extension and will also be available as an Electron app. Electron apps allow us to use the Debugging Protocol.
We can also extend Selenium IDE with plugins. Plugins are good for introducing new commands or integrating with a third party. In addition to plugin support and availability on multiple browsers, Selenium IDE TNG has the following features:
Backup Element Selectors means that Selenium IDE records multiple locators for each element. Some examples of locators are ID, XPath, and CSS selector. Imagine a scenario where our test passes as expected, then an update is performed on the application. Depending on the update, the test may fail because Selenium can no longer locate the element with the recorded locator.
Not anymore 😊 Now our test can still pass after such an update. If test execution does not locate an element with one locator, it falls back to the next recorded locator until the element is found. Thanks to Backup Element Selectors, each element has more than one locator.
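The fallback idea can be sketched in a few lines of Python. This is only a toy model, not Selenium IDE's actual implementation: the page is a plain dictionary, and the names `PAGE_AFTER_UPDATE`, `find_with_fallback`, and the locator values are hypothetical.

```python
from typing import Optional

# Hypothetical page model: maps (strategy, value) locators to an element.
# The "id" locator is missing because the application update removed that ID.
PAGE_AFTER_UPDATE = {
    ("css", "form button.submit"): "<button Submit>",
    ("xpath", "//form//button[1]"): "<button Submit>",
}


def find_with_fallback(page: dict, locators: list) -> Optional[str]:
    """Try each recorded locator in order and return the first element found."""
    for locator in locators:
        element = page.get(locator)
        if element is not None:
            return element
    return None  # every recorded locator failed


# Locators recorded for one element, in priority order.
recorded = [
    ("id", "submit-btn"),            # no longer present after the update
    ("css", "form button.submit"),   # still valid, so this one is used
    ("xpath", "//form//button[1]"),
]

print(find_with_fallback(PAGE_AFTER_UPDATE, recorded))
```

The test only fails when none of the recorded locators matches, which is why recording several locators per element makes a test more resilient to application changes.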
Control Flow statements tell Selenium IDE the execution order of instructions in our program. There are 2 types of Control Flows: Conditionals and Loops. Conditional statements decide what happens next depending on whether a specific condition is met. Loop statements execute an instruction or instructions repeatedly, either a fixed number of times or for as long as a condition holds.
The Conditional statements are if, else if, else, and end.
- if executes a statement when a condition is true.
- else if executes a statement only when the preceding if condition is false and its own condition is true.
- else executes a statement when all of the preceding conditions are false.
- end is a command that terminates the conditional block.
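For readers who know a programming language, the conditional commands map closely onto familiar syntax. Here is a rough Python analogy; the function `classify_cart` and its thresholds are made up for illustration and are not Selenium IDE commands.

```python
def classify_cart(item_count: int) -> str:
    """Label a shopping cart by size, using one branch per IDE command."""
    if item_count == 0:      # if: runs when its condition is true
        label = "empty"
    elif item_count < 5:     # else if: checked only when the if was false
        label = "small"
    else:                    # else: runs when every condition above was false
        label = "large"
    # end: Python closes the block by dedenting, whereas
    # Selenium IDE needs an explicit end command.
    return label


print(classify_cart(0), classify_cart(3), classify_cart(12))
```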
The Loop statements are do – repeat if, while, times, and forEach.
- do – repeat if starts with a do command and ends with a repeat if command. A block of statements is executed at least one time then evaluated to determine if they should execute again.
- while executes a block of statements repeatedly for as long as the condition is true.
- times specifies the number of times to perform a set of commands.
- forEach iterates over a collection and has a reference to each collection item.
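The four loop styles can be compared side by side in Python. This is an analogy rather than Selenium IDE syntax: Python has no do-while loop, so do – repeat if is emulated with `while True` and a `break`, and the function name `run_loops` and its counters are hypothetical.

```python
def run_loops() -> dict:
    """Run one example of each loop style and collect the results."""
    results = {}

    # do - repeat if: the body runs once before the condition is checked.
    attempts = 0
    while True:
        attempts += 1
        if not (attempts < 3):   # "repeat if attempts < 3"
            break
    results["do_repeat_if"] = attempts

    # while: the condition is checked first, so the body may run zero times.
    countdown = 3
    passes = 0
    while countdown > 0:
        countdown -= 1
        passes += 1
    results["while"] = passes

    # times: repeat the body a fixed number of times.
    clicks = 0
    for _ in range(4):
        clicks += 1
    results["times"] = clicks

    # forEach: visit every item, with a reference to the current one.
    visited = []
    for page_name in ["home", "cart", "checkout"]:
        visited.append(page_name.upper())
    results["for_each"] = visited

    return results


print(run_loops())
```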
Last but not least is Code Export, which lets us export a recorded test as code that runs with Selenium WebDriver and can be used on Selenium Grid. The steps include right-clicking a test or suite, selecting Export, then picking a language/test framework and clicking the Export button. Currently, Selenium IDE TNG supports the following languages and test frameworks:
- C# / NUnit
- C# / xUnit
- Java / JUnit
- Python / pytest
- Ruby / RSpec
Hopefully, support for Java / TestNG comes in the future.
Selenium Grid supplies a way to set up an infrastructure of different browsers and operating systems on different machines. We can specify the browser, browser version, and operating system through Selenium's RemoteWebDriver. The tests are distributed across multiple physical or virtual machines so they can execute in parallel.
The original Selenium Grid was released in 2011, and technology has changed a great deal since then. For example, Docker and Kubernetes have made infrastructure more accessible. Docker allows us to run applications in containers instead of virtual machines. Kubernetes allows us to run those containers across a grid of machines. As a result of Docker and Kubernetes, we can scale using a cloud infrastructure. The goal is for Selenium Grid 4 to take advantage of these newer technologies.
Currently, at startup, the Selenium standalone server goes down one of two code paths. This split is being addressed with a brand-new architecture for Selenium Grid 4, which includes 4 individual processes: Router, Distributor, Session Map, and Node.
- Router listens for new session requests.
- Distributor selects a node to run a test.
- Session Map is responsible for mapping each session ID to its node.
- Node is a machine that executes our test scripts.
Here’s how Selenium Grid 4’s architecture operates. A message arrives at the router. If it is a new session request, the router passes it to the distributor, which holds a list of the current nodes. It’s the distributor’s responsibility to select a node running in the system. After a node is selected to run our test, a session is started on that node. The node responds to the distributor with the URL of the session, the session ID is recorded against that node in the Session Map, and control returns to the user.
Although the architecture is new, it is similar to the original architecture. Selenium Grid 2 has a Node and a Hub, where the Hub combines the roles of 3 processes (Router, Distributor, and Session Map). Therefore, the two versions have the same responsibilities, but Selenium Grid 4 separates the Hub’s roles into individual processes.
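To make the flow concrete, here is a toy, in-memory model of the four processes in Python. It is a sketch under simplifying assumptions, not Selenium Grid's real code: everything runs in one process, and all class names, method names, and node URLs are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Node:
    """A machine that runs tests for the browsers it supports."""
    url: str
    browsers: set
    busy: bool = False

    def start_session(self) -> str:
        self.busy = True
        return uuid.uuid4().hex  # stand-in for a real session ID


@dataclass
class SessionMap:
    """Remembers which node owns each session ID."""
    sessions: dict = field(default_factory=dict)

    def add(self, session_id: str, node: Node) -> None:
        self.sessions[session_id] = node

    def lookup(self, session_id: str) -> Node:
        return self.sessions[session_id]


@dataclass
class Distributor:
    """Knows the current nodes and picks one for each new session."""
    nodes: list
    session_map: SessionMap

    def new_session(self, browser: str) -> str:
        for node in self.nodes:
            if not node.busy and browser in node.browsers:
                session_id = node.start_session()
                self.session_map.add(session_id, node)
                return session_id
        raise RuntimeError(f"no free node supports {browser}")


@dataclass
class Router:
    """Entry point: forwards new sessions and routes later commands."""
    distributor: Distributor
    session_map: SessionMap

    def handle_new_session(self, browser: str) -> str:
        return self.distributor.new_session(browser)

    def handle_command(self, session_id: str, command: str) -> str:
        node = self.session_map.lookup(session_id)
        return f"{command} -> {node.url}"


session_map = SessionMap()
nodes = [
    Node("http://node-1:5555", {"chrome"}),
    Node("http://node-2:5555", {"firefox"}),
]
router = Router(Distributor(nodes, session_map), session_map)

session_id = router.handle_new_session("firefox")
print(router.handle_command(session_id, "GET /title"))
```

Once the session exists, later commands never touch the distributor: the router only needs the Session Map to find the right node.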
Observability helps us trace and log what is happening as a request moves through the grid. This leads to a more straightforward debugging session with Selenium Grid 4. As a request comes in, it is given a trace ID to assist our debugging efforts. It will not be as difficult to track down why a certain request failed.
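The value of a trace ID is easiest to see in a small example. The Python sketch below is a toy illustration only, not how Selenium Grid 4 implements tracing: the log is a plain list, and the function names `handle_request` and `trace` are hypothetical.

```python
import uuid


def handle_request(request: str, log: list) -> str:
    """Tag every log entry for one request with the same trace ID."""
    trace_id = uuid.uuid4().hex
    for component in ("router", "distributor", "node"):
        log.append(
            {"trace_id": trace_id, "component": component, "request": request}
        )
    return trace_id


def trace(log: list, trace_id: str) -> list:
    """Return the components one request passed through, in order."""
    return [entry["component"] for entry in log if entry["trace_id"] == trace_id]


log = []
first = handle_request("new session: chrome", log)
second = handle_request("new session: firefox", log)

print(trace(log, first))
```

Even though the log holds entries for both requests, filtering by trace ID recovers the path of a single request.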
According to Simon Stewart, creator of Selenium WebDriver and Selenium’s Project Lead, the W3C WebDriver Protocol is the reason for upgrading to Selenium 4. There’s not much difference compared to the original JSON Wire Protocol. Here’s a screenshot of the JSON Wire Protocol (left) and W3C WebDriver Protocol (right).
Notice that the JSON Wire Protocol layer no longer appears on the W3C WebDriver Protocol side. That does not mean HTTP is gone: commands are still sent as HTTP requests with JSON payloads, and results come back as HTTP responses. What changes is that the client libraries and the browser drivers now speak the same W3C protocol, so with Selenium 4 information passes between client and server without being encoded and decoded through the JSON Wire Protocol.
An advantage is that tests will execute more consistently across browsers. Kudos to the W3C (World Wide Web Consortium) for developing web standards. Standardizing on the W3C protocol promotes compatibility across WebDriver API implementations.
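As an illustration, the body of a W3C new-session request is a JSON document whose `capabilities` object carries standard capability names such as `browserName`, `browserVersion`, and `platformName`. The sketch below builds such a body with the Python standard library; the helper name `new_session_payload` is hypothetical, and a real client library adds more fields than shown here.

```python
import json
from typing import Optional


def new_session_payload(
    browser_name: str,
    browser_version: Optional[str] = None,
    platform_name: Optional[str] = None,
) -> str:
    """Build a minimal JSON body for a W3C WebDriver new-session request."""
    always_match = {"browserName": browser_name}
    if browser_version:
        always_match["browserVersion"] = browser_version
    if platform_name:
        always_match["platformName"] = platform_name
    return json.dumps({"capabilities": {"alwaysMatch": always_match}})


print(new_session_payload("firefox", platform_name="linux"))
```

Because every compliant driver understands the same capability names, the same request body can be sent to different browser drivers.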
Chrome DevTools are developer tools that are built into the Chrome browser. They help us track what’s going on in the browser, diagnose problems, and communicate directly with the browser. They can be accessed by right-clicking the page and then selecting Inspect. The following panels are displayed after selecting Inspect:
- Elements – view the DOM and inspect an element.
- Network – view and debug activities in the network.
- Performance – monitor the load and runtime performance of a website.
- Memory – track memory leaks and show how a page uses memory.
- Application – examine all loaded resources.
- Security – inspect the security of a page by debugging issues.
- Lighthouse – audit a web application.
There is also a class named DevTools which helps increase productivity with the following capabilities:
- Viewing the DOM
- Handling Developer Options
- Adding Listeners
- Creating a Session
- Inspecting Network Activity
- Emulating Network Connections
- Measuring Performance
Selenium WebDriver uses a driver to manage each browser. ChromeDriver is an executable that Selenium WebDriver uses to control Google Chrome, and EdgeDriver does the same for Microsoft Edge. As an update in Selenium 4, the driver classes for both browsers extend ChromiumDriver. The ChromiumDriver class has methods to create a connection with DevTools.
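Under the hood, a DevTools connection exchanges JSON messages that carry an `id`, a `method`, and `params`. The Python sketch below only builds such messages to show their shape, using the Chrome DevTools Protocol methods `Network.enable` and `Network.emulateNetworkConditions`. It does not open a real connection, the helper `cdp_command` is hypothetical, and the throttling numbers are arbitrary.

```python
import itertools
import json

_ids = itertools.count(1)  # each command carries its own increasing id


def cdp_command(method: str, **params) -> str:
    """Serialize one DevTools protocol command as a JSON message."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Turn on network tracking, then emulate a slow connection.
enable = cdp_command("Network.enable")
throttle = cdp_command(
    "Network.emulateNetworkConditions",
    offline=False,
    latency=200,                # added round-trip delay in milliseconds
    downloadThroughput=50_000,  # bytes per second
    uploadThroughput=20_000,    # bytes per second
)

print(enable)
print(throttle)
```

Selenium's DevTools class wraps this exchange so that tests can work with typed commands instead of raw JSON.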
For more information regarding Selenium 4, we can view the Change Log to see updates concerning each programming language:
- C#: https://github.com/SeleniumHQ/selenium/blob/master/dotnet/CHANGELOG
- Java: https://github.com/SeleniumHQ/selenium/blob/master/java/CHANGELOG
- Python: https://github.com/SeleniumHQ/selenium/blob/master/py/CHANGES
- Ruby: https://github.com/SeleniumHQ/selenium/blob/master/rb/CHANGES
This article discussed some of the upcoming features in Selenium 4 across Selenium IDE, Selenium WebDriver, and Selenium Grid. The next article will cover additional features for Selenium WebDriver. Stay tuned! 😊