Whenever we think of automating a web application, the first name that comes to mind is “Selenium”. It is fascinating how the Selenium project has evolved over the years to become the most popular tool of its kind. At present, it is the de facto standard for almost all kinds of web application automation, thanks to its volunteer maintainers, contributors, and the vibrant community surrounding it.
In case you are not familiar with Selenium: it is an open-source umbrella project, an ecosystem consisting of several libraries, servers, and tools that help us automate web browsers and thereby assist in web application testing.
At its very core is the “Selenium WebDriver”, a remote control interface that enables the control of user agents and interacts with the web browsers, mimicking user interaction with the web elements. It does this with the help of its supported language bindings (written in Java, Python, C#, Ruby, JavaScript) and vendor-provided browser drivers. The browser drivers are standalone proxy servers that let the language bindings speak to the browsers (by exposing the browsers’ internal automation interface) and are available as executable binary files (ChromeDriver, GeckoDriver, IEDriver, EdgeDriver). Because browsers do not have built-in servers to run these commands, the browser drivers are required to communicate with them.
Role of Selenium Custom Capabilities
Selenium WebDriver has language-specific libraries with “REST”-ish APIs. These APIs expose a set of commands and communicate in JSON. In layman’s terms, JSON (an acronym for “JavaScript Object Notation”) is a data format for transporting and storing data. The language bindings use these APIs by sending HTTP requests (with JSON payloads), each with a method and a URI template, over the transport layer. Sessions (with session IDs) are created to speak to the different browsers via the browser drivers, and HTTP responses are received in return. The protocol used to transfer the data is the platform- and language-neutral W3C WebDriver protocol.
“Capabilities” are a set of key-value pairs (encoded as JSON) that are sent with the session request to configure the driver instance and to set or control the browser properties and preferences for each session. We use custom capabilities to define which features we want the browser to satisfy when running a test. They help us to set up our test preconditions (requirements and environments) so that the automated checks can run against various combinations of operating systems, browser configurations, browser versions, etc.
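To make the request shape concrete, here is a minimal sketch of the JSON body a client might send when asking for a new session. It uses only the Python standard library and sends nothing over the network. The “alwaysMatch” and “firstMatch” keys come from the W3C WebDriver specification; the helper name new_session_payload is made up for illustration.

```python
import json


def new_session_payload(always_match, first_match=None):
    """Build the body of a W3C WebDriver "New Session" request (POST /session).

    The function name is hypothetical; only the JSON structure follows the spec.
    """
    return {
        "capabilities": {
            # Capabilities every candidate session must satisfy.
            "alwaysMatch": dict(always_match),
            # Alternatives to try in order; one empty dict means "no extras".
            "firstMatch": list(first_match) if first_match else [{}],
        }
    }


payload = new_session_payload({"browserName": "chrome", "platformName": "windows"})
print(json.dumps(payload, indent=2))
```

The language bindings build an equivalent structure for you; seeing it as plain JSON mainly helps when reading driver logs.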
Why Do We Need Custom Capabilities to Test Web Applications?
Testing a web application is not as simple as it may seem. When we start testing one, we strive to test it in the best possible ways to deliver a high-quality product to the end-users across a variety of devices. We take the help of knowledge, thoughts, experience, collaboration, specifications, ideas, tools, technologies, and techniques to test it.
But web application testing brings its own set of challenges concerning security, performance, usability, interoperability, requests/responses, accessibility, browser rendering, and so on. Every specification that we want to verify with automated checks must be executable under the specific conditions we require. These can be a specific environment (web browsers, mobile devices, mobile browsers, emulators, simulators), a specific platform (Windows, Linux, macOS), or specific browser settings (incognito, maximized, headless). Custom capabilities help us to set these up seamlessly from our test automation code.
Let’s discuss the popular Selenium custom capabilities which make our life easier as testers while testing web applications, and see how we can automatically generate informative HTML test reports out of them.
10 Selenium Custom Capabilities Simplifying Web Testing
- Platform Name
- Browser Name and Browser Version
- Proxy
- Headless Mode
- Loading Extensions and Disabling Extensions
- Incognito Mode
- Setting Window Size
- Setting Default File Download Location
- Dealing with Certificates
- Using a Custom Browser User Profile
- Selenium Custom Capabilities using TestProject OpenSDK
- Built-in HTML Test Reports for Selenium Custom Capabilities
- Conclusion
1) Platform Name
Using “DesiredCapabilities” in WebDriver, we can set and send the common standard WebDriver capabilities. We can also use the different options classes (“ChromeOptions”, “FirefoxOptions”, “EdgeOptions”, “SafariOptions”, “InternetExplorerOptions”) which have convenient methods for setting the browser-specific capabilities.
In the Java binding, the “setCapability()” method can be used to set individual capabilities. One important capability key is “platformName”, which takes a string value that identifies the supported and available platform on which the WebDriver instance should run.
- The values can be “LINUX” (indicating any server or desktop system based on the Linux kernel),
- “WINDOWS” (indicating any version of Microsoft Windows operating system, including desktop and mobile versions),
- “MAC” (indicating any version of Apple’s macOS) and a few others.
If the machine does not have the requested platform, then the session won’t get created and a “SessionNotCreatedException” will be thrown.
Performing cross-platform testing is very important since the end-users can access the web applications from any platform (or OS) and as testers, we have to ensure that the applications work the way they are intended to in all of the platform types without the end-users facing any issues related to consistency, UI, usability, and performance.
2) Browser Name and Browser Version
When it comes to web browsers, the behavior, appearance, and layout of a web application may vary from browser to browser. This is because, even though all the browsers follow open web standards, each renders HTML and CSS in its own way using its own rendering engine.
A good web application should look the same across all browsers and all its features should work the same in all browsers. Incompatibility issues may arise not only for different browsers but also for different versions of the same browser. It may be because an older browser version does not support the latest standards. Hence, it is important to perform automated cross-browser testing and it can be achieved by passing the “browserName” and “browserVersion” capabilities.
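As a rough sketch of how a cross-browser run might be parameterized, the snippet below expands a small set of platforms, browsers, and versions into one capabilities dictionary per combination. The capability keys are the standard W3C ones; the version numbers and the function name capability_matrix are placeholders chosen for illustration.

```python
from itertools import product

# Placeholder targets; substitute the platforms and versions you actually support.
PLATFORMS = ["windows", "linux"]
BROWSERS = {"chrome": ["86", "85"], "firefox": ["82"]}


def capability_matrix(platforms, browsers):
    """Return one capabilities dict per platform/browser/version combination."""
    combos = []
    for platform, (browser, versions) in product(platforms, browsers.items()):
        for version in versions:
            combos.append(
                {
                    "platformName": platform,
                    "browserName": browser,
                    "browserVersion": version,
                }
            )
    return combos


matrix = capability_matrix(PLATFORMS, BROWSERS)
for caps in matrix:
    print(caps)
```

Each dictionary can then be handed to a test runner's parameterization mechanism so that the same test executes once per combination.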
3) Proxy
If a tester is testing a web application from a system in a corporate environment, they will most likely be behind an HTTP or SSL proxy, configured by the network admins to filter or monitor inbound and outbound web traffic. Unless the tests are told about the proxy, a connection to the host(s) will not be established and the automated tests will fail.
In such a situation, the tests have to use the proxy information and set the proxy (with or without authentication). Setting the proxy information is also needed in case of automated Localization testing of web applications where the behavior of the applications has to be tested for specific regions or locales. To set the proxy, we can create a proxy object, set the HTTP or SSL proxies, and send it with the “proxy” capability. We can also turn off proxy settings and pass it with the capability.
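The proxy object described above can be sketched as plain JSON-style data. The keys (“proxyType”, “httpProxy”, “sslProxy”, “noProxy”) are the ones defined by the W3C WebDriver specification; the host names, ports, and function names here are invented for the example.

```python
import json


def manual_proxy(http_proxy, ssl_proxy, no_proxy=()):
    """Describe a manually configured proxy as a W3C "proxy" capability value."""
    return {
        "proxyType": "manual",
        "httpProxy": http_proxy,    # "host:port" used for http:// traffic
        "sslProxy": ssl_proxy,      # "host:port" used for https:// traffic
        "noProxy": list(no_proxy),  # hosts that should bypass the proxy
    }


def direct_proxy():
    """Turn proxying off: the browser connects directly."""
    return {"proxyType": "direct"}


caps = {
    "browserName": "chrome",
    # Hypothetical proxy addresses, for illustration only.
    "proxy": manual_proxy("proxy.example.com:8080", "proxy.example.com:8443", ["localhost"]),
}
print(json.dumps(caps, indent=2))
```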
4) Headless Mode
We see a web application on the screen when it is rendered by a web browser. If we take away the GUI part of the browser and launch it in a non-graphical mode, then the application runs in what we call a “headless browser”.
The advantage of using headless browsers to test web applications is that they can do all the things that normal browsers do, but with improved speed and performance. This is especially helpful for parallel test execution.
Also, the tests can be run on machines with no GUI. They can even pipe the web content to another program and provide a real browser context to the users, without the memory and speed costs of running a full-fledged browser with a GUI. Automation scripts written to perform layout checks, data extraction, document download/upload, page navigation, and web interactions can be executed comfortably in headless mode. For Chrome, the WebDriver session can be set to run in headless mode by adding the “--headless” argument (two ASCII hyphens) to the ChromeOptions object.
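One common arrangement is to run with a visible browser locally and headless on a build machine. The sketch below shows that idea with a plain dictionary in the shape ChromeDriver expects under “goog:chromeOptions”. The environment variable name HEADLESS and the function name are assumptions made for this example, not part of Selenium.

```python
import os


def chrome_capabilities(env=None):
    """Return Chrome capabilities, adding --headless when HEADLESS is set.

    HEADLESS is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    args = []
    if env.get("HEADLESS", "").lower() in ("1", "true", "yes"):
        args.append("--headless")
    return {"browserName": "chrome", "goog:chromeOptions": {"args": args}}


print(chrome_capabilities({"HEADLESS": "1"}))  # headless requested
print(chrome_capabilities({}))                 # regular browser window
```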
5) Loading Extensions and Disabling Extensions
Browser extensions make our life easier while testing web applications by extending the browser functionalities and taking away many of the boring and mundane tasks. Some of the browser extensions we often use while testing web applications are – Postman Interceptor, Selenium IDE, Clear Cache, SelectorsHub, Analyze Page Performance, WAVE, Talend, AdBlock, What Font, Axe, etc.
We may need to load an extension (packed or unpacked) or disable extensions while running the automated tests. For Chrome, a packed extension is a file with a .crx extension and an unpacked one is a directory containing the extension files and a manifest.json file. We can use the addExtensions (in Java) or add_extension (in Python) functions to load extensions, and pass the “--disable-extensions” argument in the capabilities to disable extensions.
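Under the hood, a packed extension travels to ChromeDriver as a base64-encoded string in the “extensions” list of “goog:chromeOptions”, while an unpacked directory is usually passed through Chrome's “--load-extension” argument. The following sketch builds both forms with the standard library; the function name extension_options is hypothetical, and the demo uses a stand-in file rather than a real .crx.

```python
import base64
import tempfile
from pathlib import Path


def extension_options(crx_paths=(), unpacked_dirs=(), disable_all=False):
    """Build the extension-related part of goog:chromeOptions."""
    options = {"args": [], "extensions": []}
    if disable_all:
        options["args"].append("--disable-extensions")
        return options
    for crx in crx_paths:
        # Packed extensions are sent as base64 text inside the JSON payload.
        encoded = base64.b64encode(Path(crx).read_bytes()).decode("ascii")
        options["extensions"].append(encoded)
    if unpacked_dirs:
        # Unpacked extensions are directories, joined with commas.
        joined = ",".join(str(d) for d in unpacked_dirs)
        options["args"].append("--load-extension=" + joined)
    return options


with tempfile.TemporaryDirectory() as tmp:
    fake_crx = Path(tmp) / "demo.crx"
    fake_crx.write_bytes(b"not a real extension")  # stand-in content
    opts = extension_options(crx_paths=[fake_crx])

print(len(opts["extensions"]), "packed extension(s) encoded")
```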
6) Incognito Mode
There may be occasions when you would want to go incognito (private browsing mode) while testing the web applications.
If you are not running in incognito mode, then your browsing activity, data, cookies, saved passwords, and browsing history may influence how the application behaves under test, and troubleshooting issues becomes difficult. Incognito mode discards the temporary data collected during the session on the device you are testing on. It can keep targeted ads from showing up while you are testing the application. It can also prevent the application from showing you preferences or suggestions based on earlier activity, which is what you need if you want to test the application with its default settings. We can force the automated tests to run in the browser’s incognito mode by sending the “--incognito” argument with the capabilities.
7) Setting Window Size
Most of the web applications created nowadays are based on Responsive Web Design which means that the applications are developed to look good and render well in a variety of devices, windows, or screen-sizes.
A responsive web application adjusts to different viewable window sizes by automatically resizing, hiding, enlarging, or shrinking its content. To test such responsive applications, we may need to run the automated tests at various window sizes. When WebDriver starts a session and launches the browser, the browser starts with its default window size. To check how the different web elements behave at other sizes, we can pass the “--window-size” argument (with the desired width and height) in the capabilities so that the browser launches at that size.
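A small sketch of how the argument might be generated for several viewports is shown below. The viewport names and dimensions are illustrative choices, not values from any standard, and window_size_arg is a made-up helper.

```python
# Illustrative breakpoints; pick the sizes that matter for your application.
VIEWPORTS = {
    "phone": (375, 667),
    "tablet": (768, 1024),
    "desktop": (1366, 768),
}


def window_size_arg(width, height):
    """Format Chrome's --window-size argument as "--window-size=W,H"."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return f"--window-size={width},{height}"


capabilities_per_viewport = {
    name: {
        "browserName": "chrome",
        "goog:chromeOptions": {"args": [window_size_arg(w, h)]},
    }
    for name, (w, h) in VIEWPORTS.items()
}

for name, caps in capabilities_per_viewport.items():
    print(name, caps["goog:chromeOptions"]["args"])
```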
8) Setting Default File Download Location
With the browser default settings, when we click on an element in a webpage to download a file, the file gets downloaded in the default Downloads folder of our file system. While running automated tests, we may need to download files from the application and save those in a specific directory different from the default download directory.
Also, it is always advisable to download files in a separate folder where we can verify the successful download of the files. It also helps us to keep related downloaded files at the same location. For Chrome we can easily achieve this programmatically by creating a HashMap, adding “download.default_directory” as a key with the desired path as value to it and passing it as a value with the “prefs” key to the “setExperimentalOption()” method of the ChromeOptions class.
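In JSON terms, the preference ends up under “prefs” inside “goog:chromeOptions”. The sketch below prepares a dedicated download folder and then checks which files in it look complete. It assumes Chrome's habit of giving in-progress downloads a “.crdownload” suffix; the function names and file names are invented for the example.

```python
import tempfile
from pathlib import Path


def download_prefs(directory):
    """Create the folder if needed and return the matching Chrome prefs."""
    path = Path(directory).resolve()
    path.mkdir(parents=True, exist_ok=True)
    # Chrome expects an absolute path for this preference.
    return {"prefs": {"download.default_directory": str(path)}}


def completed_downloads(directory):
    """List finished files, skipping Chrome's partial .crdownload files."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix != ".crdownload"
    )


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "downloads"
    options = download_prefs(target)
    # Simulate one finished and one in-progress download.
    (target / "report.pdf").write_bytes(b"%PDF")
    (target / "video.mp4.crdownload").write_bytes(b"partial")
    finished = completed_downloads(target)

print(finished)
```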
9) Dealing with Certificates
SSL (Secure Sockets Layer) certificates are data files that digitally bind a cryptographic key to an organization’s details, allowing secure connections and secure sessions between web application servers and web browsers.
There might be situations when you get SSL certificate errors while running automated tests on web applications. This may happen when the browser cannot validate the certificate presented by the server, for example because it is self-signed or expired, which is common in test environments. To handle such errors, we can pass “--ignore-certificate-errors” as an argument to the ChromeOptions object.
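Besides the Chrome-specific argument, the W3C WebDriver specification defines a standard boolean capability, “acceptInsecureCerts”, which is not mentioned above but serves a similar purpose across browsers. The sketch below shows both variants as plain dictionaries; the function name is hypothetical.

```python
def insecure_cert_capabilities(use_standard_capability=True):
    """Return Chrome capabilities that tolerate untrusted certificates."""
    caps = {"browserName": "chrome"}
    if use_standard_capability:
        # Standard W3C capability, understood by any conforming driver.
        caps["acceptInsecureCerts"] = True
    else:
        # Chrome-specific command-line switch.
        caps["goog:chromeOptions"] = {"args": ["--ignore-certificate-errors"]}
    return caps


print(insecure_cert_capabilities())
print(insecure_cert_capabilities(use_standard_capability=False))
```

Either way, this is best limited to test environments, since it hides certificate problems that real users would see.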
10) Using a Custom Browser User Profile
When a new Chrome session gets initiated by the WebDriver, ChromeDriver creates a temporary user profile for that session. During the automated test run, instead of a new user profile, we might want to use a custom user profile, predefined with special settings like installed extensions, locale, language, etc. We can pass the “user-data-dir” as key and the custom profile path as its value to the capabilities to tell Chrome to use that custom profile for the tests. If the provided path does not exist, Chrome will create a new profile in that specified location.
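The argument is sensitive to formatting: the key and the path are joined by a single “=” with no surrounding spaces. A minimal sketch of building it follows; the helper name and the example profile path are placeholders.

```python
from pathlib import Path


def user_data_dir_arg(profile_dir):
    """Build Chrome's user-data-dir argument, with no spaces around '='."""
    path = Path(profile_dir).expanduser()
    return f"--user-data-dir={path}"


# Placeholder location; point this at your own prepared profile directory.
arg = user_data_dir_arg("/tmp/selenium-profile")
caps = {"browserName": "chrome", "goog:chromeOptions": {"args": [arg]}}
print(caps)
```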
Selenium Custom Capabilities using TestProject OpenSDK
When working with the TestProject OpenSDK to develop and execute web tests, all that is required to have a completely managed Selenium experience is an installation of the TestProject Agent, and that’s it! Using pure Selenium commands and any Selenium custom capabilities, you can:
- Save the time and effort of downloading and configuring the different Selenium drivers. Everything is handled automatically by the TestProject Agent.
- Use any framework or third-party tool: unit testing frameworks, Cucumber, SpecFlow, JUnit, TestNG, etc.
- Skip uploading anything to the TestProject Cloud or integrating reporters of any kind.
- Benefit from HTML test reports that are automatically created for you on TestProject’s reporting dashboard, where you can also download them in PDF format.
Here are code examples using the Java OpenSDK and Python OpenSDK:
    package com.seleniumCapabilities.demo;

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;

    import org.openqa.selenium.By;
    import org.openqa.selenium.Platform;
    import org.openqa.selenium.Proxy;
    import org.openqa.selenium.chrome.ChromeOptions;
    import org.openqa.selenium.remote.CapabilityType;
    import org.openqa.selenium.remote.DesiredCapabilities;

    import io.testproject.sdk.drivers.web.ChromeDriver;

    public class CapabilitiesDemo {

        public static void main(String[] args) throws Exception {
            String browserDriverPath = System.getProperty("user.dir") + File.separator + "drivers" + File.separator;
            System.setProperty("webdriver.chrome.driver", browserDriverPath + "chromedriver.exe");

            // 1-2) Platform name, browser name and browser version
            DesiredCapabilities desiredCapabilities = new DesiredCapabilities();
            desiredCapabilities.setCapability(CapabilityType.PLATFORM_NAME, Platform.WINDOWS);
            desiredCapabilities.setCapability(CapabilityType.BROWSER_NAME, "chrome");
            desiredCapabilities.setCapability(CapabilityType.BROWSER_VERSION, "86.0.4240.111");

            // The options below are combined for illustration only; some may not
            // work together (for example, extensions in headless mode).
            ChromeOptions chromeOptions = new ChromeOptions();

            // 3) Proxy
            Proxy proxy = new Proxy();
            proxy.setAutodetect(false);
            proxy.setHttpProxy("<http_proxy-url>:<port>");
            proxy.setSslProxy("<https_proxy-url>:<port>");
            proxy.setNoProxy("<no_proxy-var>");
            chromeOptions.setCapability("proxy", proxy);

            // 4) Headless mode
            chromeOptions.addArguments("--headless");

            // 5) Load a packed extension
            chromeOptions.addExtensions(new File("<path to your extension crx file>"));

            // 6) Incognito mode
            chromeOptions.addArguments("--incognito");

            // 7) Window size
            chromeOptions.addArguments("--window-size=500,200");

            // 8) Default file download location
            Map<String, Object> prefs = new HashMap<String, Object>();
            prefs.put("download.default_directory", "<path to your specific downloads folder>");
            chromeOptions.setExperimentalOption("prefs", prefs);

            // 10) Custom user profile (no spaces around '=')
            chromeOptions.addArguments("--user-data-dir=<path to your custom user profile>");

            // merge() returns the combined options, so keep the returned object
            chromeOptions = chromeOptions.merge(desiredCapabilities);

            ChromeDriver driver = new ChromeDriver("<your TestProject SDK Developer Token>", chromeOptions);
            driver.get("https://example.testproject.io/web/");
            driver.findElement(By.id("name")).sendKeys("Sam");
            driver.findElement(By.id("password")).sendKeys("12345");
            driver.findElement(By.id("login")).click();
            driver.quit();
        }
    }
    from selenium.webdriver import ChromeOptions
    from src.testproject.sdk.drivers import webdriver


    def capabilities_test():
        chrome_options = ChromeOptions()

        # Proxy, headless mode, extension, incognito mode and window size
        chrome_options.add_argument("--proxy-server=<http_proxy_url>:<port>")
        chrome_options.add_argument("--headless")
        chrome_options.add_extension("<path to your extension crx file>")
        chrome_options.add_argument("--incognito")
        chrome_options.add_argument("--window-size=500,200")

        # Default file download location
        prefs = {"download.default_directory": "<path to your specific downloads folder>"}
        chrome_options.add_experimental_option("prefs", prefs)

        # Custom user profile (no spaces around '=')
        chrome_options.add_argument("--user-data-dir=<path to your custom user profile>")

        # Start from the options so they are not lost, then add platform and version
        desired_capabilities = chrome_options.to_capabilities()
        desired_capabilities['platform'] = "WINDOWS"
        desired_capabilities['version'] = "86.0.4240.111"

        driver = webdriver.Chrome(
            projectname="<Your TestProject project name>",
            token="<your TestProject SDK Developer Token>",
            desired_capabilities=desired_capabilities,
        )
        driver.get("https://example.testproject.io/web/")
        driver.find_element_by_css_selector("#name").send_keys("Sam")
        driver.find_element_by_css_selector("#password").send_keys("12345")
        driver.find_element_by_css_selector("#login").click()
        driver.quit()


    if __name__ == "__main__":
        capabilities_test()
Built-in HTML Test Reports for Selenium Custom Capabilities
The TestProject OpenSDK generates HTML test reports and automatically publishes them on the TestProject platform for you. This works out of the box, with no additional configuration needed, and you can also download the reports as PDF files. Running the OpenSDK code examples above with the Selenium custom capabilities produces such a report automatically.
Conclusion
There are many other Selenium custom capabilities, some standard and some browser-specific, which you can use for your tests depending on the execution context and need. In this article, I listed the capabilities which help us the most when we test the web applications.
Hope you enjoyed the article – Happy testing! 😉