Hello, I’m Paul Grossman, aka the Dark Arts Wizard!
I’ve been a Test Automation Framework Architect / SDET for 19 years.
This article gives an overview of a Natural Language Processing Engine Addon I recently designed for TestProject.
Before we start, let me offer some advice:
If you are an experienced Framework Architect Engineer… stop reading now, because you already have a cool framework in place.
I’ll even bet you have formed an opinion, good or bad, about the concepts discussed in this post.
- Natural Language Processing
- Dynamic Object Location (Magic Object Model)
- Class Switching
However, if you are one of those curious experimental types, the kind who takes the statement “That’s impossible!” as a personal challenge, then this post is for you.
- Initial Case Study: How Class Switching “Right Shifted” 16 hours of Maintenance
- What If ANY Element Could Be Identified with Plain English “On-the-fly”
- From Magic to Natural Language Processing
- Take Action with Verbs and Watch the NLP Engine demo in action!
- Stay Flexible
- Deductive Class Reasoning
- Implicit Wait vs. Explicit Wait vs. String Wait
- Verify Your Results
- Negative Testing
- A Few More Highlights!
- That was a close one: What if multiple elements match?
- Case Sensitivity
- Don’t Keep Your Comments to Yourself
- Back to Where it all Started: Class Switching
- Enough with the Good News, What’s the Bad News?
I would like to start by introducing you to a TestProject Addon called the Natural Language Processing (NLP) engine. It leverages what I call the Magic Object Model (MOM) pattern.
NLP/MOM attempts to do the impossible: Convert loosely written Plain English statements into executable code without any Page Object Model support.
The Magic Object Model provides an on-the-fly object factory.
It recognizes the following plain-text class descriptions: Link, Button, Field, List, Tab, CheckBox, RadioButton and Text. It also implements Class Switching to reduce maintenance.
Now at this point you may be asking yourself: “I’ve heard of the Page Object Model, but what is this Magic Object Model and Class Switching?”
The Magic Object Model is a technique I’ve been working on for over a decade.
It started with VBScript, and most recently it was ported to Java Selenium.
Before we get too deep into the features of all three concepts, I want you to know why I’m convinced this approach is viable. Let’s start with a subset of this approach, Class Switching, which recently helped me save the day.
Case Study: How Class Switching Helped “Right Shift” 16 Hours of Maintenance
A few years ago I joined an automation team that was supporting a two-year-old “legacy” automation project.
We inherited a suite of 100 test automation scripts for an application. The test suite was run once each quarter and the project was on Cruise Control. Little in the application ever changed, so the scripts and code were very low maintenance.
Maintenance typically took 30 to 90 minutes, and it was often finished before the full test suite run had completed. And that’s how it went for eight more quarters, until one day the application underwent an architectural update to AngularJS.
It broke every automation test case – except Login.
The automation scripts were developed with an automation tool that identified objects by class and most of the Links were now Buttons. The maintenance would require creating 50+ new elements and then debugging over 200 code references due to some Copy & Pasting early in the project. This would take two days to complete when the client was accustomed to having results within two hours of the QA release notification.
I thought, “There must be a better solution than asking the client to wait two days for their results.”
So I chose to take a risk: implement a Class Switching technique and embed it into the framework’s custom .Click method.
- If the Link element did not exist, a Button element would be dynamically created.
- If the new Button element existed, it would replace the Link object and be clicked.
- The replacement that occurred would be reported in the results as a Warning.
The code change took 90 minutes to be completely debugged and now 97 test cases were up and running and passing!
What was the deal with those last three test cases? One had such a weird object description, it had to be modified directly, which took 5 minutes. And the last two test cases identified an application defect.
In the end, the client had their results with just a two-hour delay, with a detected defect and was happy. We still had maintenance to recover a small performance hit, but we had right-shifted 16 hours of it into the next three months.
To be clear, this was not a fault in the tool. It was a fault in the initial approach implemented four years earlier, two years before I had joined the team. This scenario can play out in almost any automation tool available today.
This experience led me to a thought…
What If ANY Element Could Be Identified On-the-fly with Plain English?
Which brings us to the Magic Object Model (MOM). The Magic Object Model identifies an Object in Selenium with one required value, the [Element Name], plus a [Class]. For example “Ok” and “Button”.
The [Class] is normalized to a matching element tag. For example, “Link” is normalized to an anchor tag, “//a”.
Then a collection of matching elements is culled from the web page using XPath or CSS, where an element property either equals or contains the [Element Name].
If no elements are found, alternate attempts are performed with different property values. If two or more elements are found, the first visible element is selected. The result is that most elements can be identified like this:
- “OK” “Button”
- “First Name” “Field”
- “Continue” “Link”
- “Quotes” “Tab”
- “Blue Backpack” “Image”
- “State” “List”
- “Yes” “RadioButton”
- “Keep me Logged In” “Checkbox”
- “Thank you for your purchase” “Text”
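To make the lookup order above more concrete, here is a minimal Python sketch of how an [Element Name] and [Class] pair might be turned into a list of candidate XPath expressions, tried one after another. The tag mapping, the attribute order, and the names `CLASS_TO_XPATH`, `xpath_literal` and `candidate_xpaths` are all assumptions made for illustration; they are not the Addon's actual code.

```python
# Hypothetical mapping from plain-English class names to XPath node tests.
# The Addon's real mapping is not shown in this article.
CLASS_TO_XPATH = {
    "link": "//a",
    "button": "//button",
    "field": "//input",
    "list": "//select",
    "image": "//img",
    "checkbox": "//input[@type='checkbox']",
    "radiobutton": "//input[@type='radio']",
    "text": "//*",
}


def xpath_literal(value):
    """Quote a string for XPath 1.0, which has no escape character."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types present: stitch the pieces together with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def candidate_xpaths(name, element_class):
    """Return XPath candidates, most specific first (assumed order)."""
    base = CLASS_TO_XPATH[element_class.strip().lower()]
    lit = xpath_literal(name)
    tests = [
        f"normalize-space(.)={lit}",  # visible text equals the name
        f"contains(., {lit})",        # visible text contains the name
        f"@id={lit}",
        f"@name={lit}",
        f"@placeholder={lit}",
        f"@value={lit}",
    ]
    return [f"{base}[{test}]" for test in tests]


for xpath in candidate_xpaths("Continue", "Link"):
    print(xpath)
```

In a Selenium test, each candidate would be passed to the driver's element lookup in turn until one returns at least one visible element.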
From Magic to Natural Language
Now that I had the structure for identifying objects with plain English, why not go one step further? Thus came the idea to wrap it in a Natural Language Processing Engine, formerly known as a Lexical Analyzer.
The basic structure of an NLP sentence adds three more values around the [Element Name] and [Class]: [Verb] [Element Name] [Class] [Verify] [Data]. Here are some examples:
- Open https://candymapper.com/the-magic-object-model
- Click Send Button
- Enter Name Field Paul
- Verify Name Field Contains Paul
- Select Vehicle List Audi
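The fixed word order makes these sentences easy to split. Below is a minimal Python sketch of such a parser, assuming the first word is always the verb and the first class keyword separates the element name from the rest. The `Step` type, the keyword sets and `parse_step` are hypothetical names for illustration, not the Addon's implementation.

```python
from dataclasses import dataclass

# Keyword sets assumed from the examples in this article.
VERBS = {"open", "click", "enter", "select", "verify", "hover", "clear", "type"}
CLASSES = {"link", "button", "field", "list", "tab", "checkbox",
           "radiobutton", "image", "text"}
CHECKS = {"exists", "is", "contains"}


@dataclass
class Step:
    verb: str
    name: str = ""
    element_class: str = ""
    check: str = ""
    data: str = ""


def parse_step(sentence):
    """Split '[Verb] [Element Name] [Class] [Verify] [Data]' into a Step."""
    words = sentence.split()
    if not words or words[0].lower() not in VERBS:
        raise ValueError(f"Unknown or missing verb in: {sentence!r}")
    verb, rest = words[0].lower(), words[1:]
    if verb == "open":
        # Open takes only a URL.
        return Step(verb, data=" ".join(rest))
    # The first class keyword ends the element name.
    idx = next((i for i, w in enumerate(rest) if w.lower() in CLASSES), None)
    if idx is None:
        raise ValueError(f"No element class found in: {sentence!r}")
    step = Step(verb, name=" ".join(rest[:idx]), element_class=rest[idx].lower())
    tail = rest[idx + 1:]
    if tail and tail[0].lower() in CHECKS:
        step.check = tail[0].lower()
        tail = tail[1:]
    step.data = " ".join(tail)
    return step


print(parse_step("Verify Name Field Contains Paul"))
```

This sketch would misread an element name that itself contains a class word, which is one reason a real parser needs more than a simple split.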
Take Action with Verbs!
The framework supports the following list of Verbs:
| Verb | Action |
|---|---|
| “Hover” | over a named Button, Link, Tab or Field to open a container of hidden link choices |
| “Click” | a named Button, Link, Tab, Field, Image, RadioButton or Checkbox |
| “Select” | a named Item from a named List |
| “Enter” | Text into a Field |
| “Clear” | a named Checkbox |
| “Type” | Text at current element and Esc, PgUp, PgDn, Up, Down keys at the browser |
| “Wait For” | Text to be visible up to 9 seconds |
| “Verify” | any named element exists, or fields contain text, or text to Exist, or Not Exist |
| “Comment” | for additional documentation |
Check out this demo to see the NLP Engine in action:
Shortly after starting this adventure, I realized one sentence structure did not fit naturally into the five-word format: Enter Name Field Paul
It was just not how anyone would write a sentence, and I did not want to limit anyone to a certain format. So I created two exceptions: the Select and Enter verbs support a natural alternative sentence structure:
[Verb] [Data] [Element Name] [Class]
- Enter Paul Name Field
- Select Audi Vehicle List
I also realized that we all write sentences using prepositions (“on”) and definite articles (“the”). And we should allow for that.
- Verify the Name Field Contains Paul
- Verify on the Submit button
- Enter demo into the Description field
- Select Audi from the Vehicle List
And it should be flexible when it came to capitalization.
ENTER ‘Stop yelling’ INTO THE message FIELD
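One way to support these friendlier forms is to rewrite them into the canonical word order before parsing. The Python sketch below is a simplified illustration under stated assumptions: it drops articles, treats “into”/“in” (for Enter) and “from” (for Select) as the split between data and target, and lowercases only the verb. The word lists and the name `to_canonical` are hypothetical, and the preposition-free form (Enter Paul Name Field) is not handled here.

```python
ARTICLES = {"the", "a", "an"}
# Prepositions that separate [Data] from [Element Name] [Class], per verb.
SPLITTERS = {"enter": {"into", "in"}, "select": {"from"}}
# Words that carry no meaning for the engine in this sketch.
FILLERS = ARTICLES | {"on"}


def to_canonical(sentence):
    """Rewrite a natural sentence as '[verb] [Element Name] [Class] [rest]'."""
    words = [w for w in sentence.split() if w.lower() not in ARTICLES]
    if not words:
        raise ValueError("Empty sentence")
    verb, rest = words[0].lower(), words[1:]
    for i, word in enumerate(rest):
        if word.lower() in SPLITTERS.get(verb, ()):
            # 'Enter demo into Description field' -> target first, data last.
            data, target = rest[:i], rest[i + 1:]
            return " ".join([verb, *target, *data])
    rest = [w for w in rest if w.lower() not in FILLERS]
    return " ".join([verb, *rest])


print(to_canonical("Select Audi from the Vehicle List"))
print(to_canonical("ENTER demo INTO THE Description FIELD"))
```

Because this sketch strips articles everywhere, data that legitimately contains “the” or “a” would need to be protected, for example by quoting it.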
Note that the majority of the element identification is case sensitive, but more on that later…
Deductive Class Reasoning
The NLP framework should be smart and a bit forgiving. What if no object class was included in the sentence structure? Then the framework should use deductive reasoning to guess the element type, and suggest writing the sentence in a more verbose form:
- Click Send (Button)
- Select Audi from Vehicle (List)
- Verify Name Contains Paul (Field)
- Verify Paul (Text and Exists)
“Warning: The element type ‘Text’ was assumed.”
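The guesses listed above can be expressed as a small lookup keyed on the verb. This Python sketch is only an illustration: the Click, Select and Verify defaults come from the examples above, while the Enter and Hover defaults and the name `deduce_class` are my own assumptions.

```python
# Default class per verb. Click and Select follow the examples above;
# the Enter and Hover entries are assumptions for this sketch.
DEFAULT_CLASS = {
    "click": "Button",
    "select": "List",
    "enter": "Field",
    "hover": "Link",
}


def deduce_class(verb, check=""):
    """Guess the element class when the sentence omits it."""
    verb = verb.lower()
    if verb == "verify":
        # A comparison implies a Field; a bare Verify implies Text exists.
        guess = "Field" if check.lower() in {"is", "contains"} else "Text"
    elif verb in DEFAULT_CLASS:
        guess = DEFAULT_CLASS[verb]
    else:
        raise ValueError(f"Cannot deduce an element class for verb: {verb}")
    warning = f"Warning: The element type '{guess}' was assumed."
    return guess, warning


print(deduce_class("Verify"))
print(deduce_class("Verify", "Contains"))
```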
Let Me Be Clear So I’m Not Misquoted: There was one issue. Class names inside the element text might confuse the parser, as in: Click the Grassy Field Button.
So support for single quotes was added to separate element names and data from the class and validation type: Click the ‘Grassy Field’ Button.
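A quote-aware tokenizer is enough to keep “Grassy Field” together as one name. Here is a minimal Python sketch using a regular expression; it accepts both straight and curly single quotes and flags which tokens were quoted so a parser can skip keyword matching on them. The names `TOKEN` and `tokenize` are hypothetical, and quoted text containing an apostrophe is not handled.

```python
import re

# Either a single-quoted run (straight or curly quotes) or a bare word.
TOKEN = re.compile(r"[‘']([^’']*)[’']|(\S+)")


def tokenize(sentence):
    """Return (text, was_quoted) pairs; quoted text is never a keyword."""
    tokens = []
    for match in TOKEN.finditer(sentence):
        if match.group(1) is not None:
            tokens.append((match.group(1), True))
        else:
            tokens.append((match.group(2), False))
    return tokens


print(tokenize("Click the 'Grassy Field' Button"))
```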
Implicit Wait vs. Explicit Wait vs. String Wait
Your next thought might be: How does this handle page build synchronization without any FluentWaits? Each time a Click method is performed, a custom Implicit Dynamic Page Build Sync Mechanism determines the page build completion. This technique responds in a range between 1 and 9 seconds.
Now I know what you are thinking: “Wait a minute! I mean exactly how could I literally wait a minute?”
A valid question.
There are rare occasions when we want to add some additional wait time. So the framework supports an Implicit Wait statement: Wait 60 seconds
These are “Just-In-Time waits” because they are used at just a few specific points of a test flow.
All other waits default to 9 seconds at most.
Now I know some of you might also be thinking: “Wait a minute! Hard-coded waits are just bad automation practices and should never, ever be done!”. And that is why I started out by telling the experienced folks to stop reading at the start.
Of course, those experienced SDETs are totally right! Hard-coded waits should never be our first method to handle tricky web page timing issues. So the framework supports the preferred Wait For [String] statement that continues only after a specific text appears or a 9 second timeout: Wait For ‘Thank you! Your order will ship shortly’.
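The difference from a hard-coded wait is that a Wait For statement polls and returns as soon as the text shows up. A minimal Python sketch of that idea follows; the 9 second default comes from the article, while the polling interval, the `get_page_text` callback and the name `wait_for_text` are assumptions for illustration.

```python
import time


def wait_for_text(get_page_text, expected, timeout=9.0, poll=0.25,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll until `expected` appears in the page text or `timeout` elapses.

    `get_page_text` is any zero-argument callable returning the current
    page text. Returns True if the text appeared, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if expected in get_page_text():
            return True
        if clock() >= deadline:
            return False
        sleep(poll)


# Simulated page that finishes loading on the third poll.
pages = iter(["Loading", "Loading", "Thank you! Your order will ship shortly"])
print(wait_for_text(lambda: next(pages), "Thank you!", poll=0))
```

With Selenium's Python bindings, the callback could be something like `lambda: driver.find_element(By.TAG_NAME, "body").text`, assuming a `driver` is already open.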
Verify Your Results
The Verify verb supports the following element validations:
Verify [Named Element] [Class] [exists | is | contains]
- Verify ‘First Name’ Field exists
- Verify ‘First Name’ Field is ‘Paul’ (Exact match – case sensitive)
- Verify ‘Nickname’ Field Contains Wizard (Close match to “DarkArtsWizard” – case sensitive)
- Verify ‘Zip Code’ Field is Enabled
The NLP Engine’s Verify verb also supports negative testing.
Verify [Named Element] [Class] [does not exist | is not | does not contain]
- Verify ‘Error 404’ text does not exist
- Verify ‘First Name’ field does not contain ‘Enter your first name’
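The positive and negative forms can share one comparison routine, with the word “not” simply flipping the result. The Python sketch below illustrates this under my own assumptions: the element is represented by its text (or `None` when it was not found), comparisons are case sensitive as described above, and a missing element fails every check except “does not exist”. The name `verify` is hypothetical, and the Enabled check is left out.

```python
def verify(actual, check, expected=""):
    """Evaluate a Verify phrase against an element's text.

    `actual` is the element text, or None if the element was not found.
    `check` is a phrase such as 'exists', 'is not' or 'does not contain'.
    """
    words = check.lower().split()
    negated = "not" in words
    if "exists" in words or "exist" in words:
        return (actual is not None) != negated
    if actual is None:
        # Cannot compare the text of an element that was not found.
        return False
    if "contains" in words or "contain" in words:
        result = expected in actual
    elif "is" in words:
        result = actual == expected
    else:
        raise ValueError(f"Unknown check: {check!r}")
    return result != negated


print(verify("DarkArtsWizard", "Contains", "Wizard"))  # close match
print(verify(None, "does not exist"))                  # negative test
```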
A Few More Highlights
Element highlighting has long been disparaged by SDETs, and for good reason. The highlights built into programming languages like VBScript lasted far too long, flashing three times for a total of 2.5 seconds. And they were a single boring color. Due to the dynamic nature of the Magic Object Model, a custom highlight was added to ensure the correct element was being identified during debugging.
This engine uses highlighting for visual validation during execution:
- Selected elements get a BLUE highlight.
- Pass results are briefly highlighted in GREEN for 3/4 of a second.
- Fail results are highlighted in RED and pause for 3 seconds for debugging. Assuming your tests run to passing results, this has little to no impact.
That Was a Close One! Multiple Element Matches
What happens if there is more than one element that matches the provided text? Close matches are a highly debatable topic in NLP. The best answer I have found is to select the first visible element, and document the other matching elements. The test does not fail and continues execution.
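The selection rule described above is simple enough to sketch in a few lines of Python. Here `Match` is a hypothetical stand-in for a located element; with Selenium, the `visible` flag would come from the element's `is_displayed()` call. The note wording and the name `pick_first_visible` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    description: str  # e.g. the element's tag and text
    visible: bool


def pick_first_visible(matches):
    """Choose the first visible match and document the other visible ones."""
    visible = [m for m in matches if m.visible]
    if not visible:
        return None, []
    chosen, others = visible[0], visible[1:]
    notes = [f"Warning: additional match ignored: {m.description}" for m in others]
    return chosen, notes


found = [
    Match("hidden <a> Login", False),
    Match("header <button> Login", True),
    Match("footer <a> Login", True),
]
chosen, notes = pick_first_visible(found)
print(chosen.description)
print(notes)
```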
For the Image element, this framework highlights any on-screen close matches with an extended highlight blink. This is to allow time to hit the pause button and inspect the page to get a better idea of how to improve the element text for a unique match.
Tip: Use F12 to review the exact element wording in the HTML page.
Sometimes the case of the displayed text does not match that of the HTML. At the 1:15 mark there is an example of a close match identification:
Text and Links are the only elements that support case insensitive matches: Click ‘GET IN TOUCH’ Button
All other elements must match case exactly. For example, Verify vehicle list is displayed will not work unless the list contains the string ‘vehicle’ in all lowercase.
Don’t Keep Your Comments to Yourself
Comments can be added to the output with the comment command: Comment ‘Populating the Shopping Cart’
Type Some Text
Sometimes the right element is already selected and all we need to do is type on the keyboard. The Type command types text at the current element: Type Paul
Cheat: Clearing A Popup with Type ESC
Type can be used to clear most popups.
It can also send the PgUp, PgDn, Up and Down keys to the browser to perform custom scrolling:
Click ‘Show All Images’
Comment Scroll to the next page of images
Type Page Up
Type Down Key
Type Arrow Down
Back To Where It All Started: Class Switching
Let’s come back to the story that started it all.
- If no Button element can be found, a class switch to a Link will be executed.
- If a match is found, a Warning is output into the results.
- Likewise for Links as Buttons.
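The fallback rules above can be sketched as a thin wrapper around whatever lookup the framework already uses. In this Python illustration, `find` is a hypothetical callback that returns an element or `None`, and the warning text and the name `find_with_class_switch` are my own assumptions rather than the Addon's code.

```python
# Classes that may stand in for each other when a lookup fails.
SWITCHES = {"button": "link", "link": "button"}


def find_with_class_switch(find, name, element_class):
    """Look up an element, retrying once with the switched class.

    `find(name, element_class)` returns an element or None.
    Returns (element, warning); warning is None when no switch happened.
    """
    cls = element_class.lower()
    element = find(name, cls)
    if element is not None:
        return element, None
    alternate = SWITCHES.get(cls)
    if alternate is not None:
        element = find(name, alternate)
        if element is not None:
            warning = (f"Warning: '{name}' was not found as a {cls}; "
                       f"a {alternate} was used instead.")
            return element, warning
    return None, None


# Simulated page where 'Continue' used to be a link and is now a button.
page = {("Continue", "button"): "<button>Continue</button>"}
element, warning = find_with_class_switch(
    lambda name, cls: page.get((name, cls)), "Continue", "Link")
print(element)
print(warning)
```

Reporting the switch as a warning rather than a failure keeps the test running while still flagging the script for later maintenance.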
Class Switching for Text
If no text element can be found, a class switch to all elements will be executed. If a match is found, a Warning is again output into the results.
- Verify ‘Contract Details’ Text exists (will class switch to search all elements if text is not found)
- Verify ‘Contract Details’ Tab exists
None of us write sentences the same way. But that should not get in the way of quick test case development.
- Nav To and Navigate To are alternates to the verb Open
- Set is an alternate to Enter
- Hover Over is an alternate to Hover
- Is Displayed and Is Visible are alternates to Exists
- Is Not Displayed and Is Not Visible are negative test alternates to Does Not Exist
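Alternates like these can be folded into the canonical keywords before any parsing happens. The Python sketch below assumes verb alternates only appear at the start of a sentence and validation alternates only at the end; the dictionaries mirror the list above, and the name `canonicalize` is hypothetical.

```python
import re

VERB_ALIASES = {
    "nav to": "Open",
    "navigate to": "Open",
    "set": "Enter",
    "hover over": "Hover",
}
CHECK_ALIASES = {
    "is displayed": "exists",
    "is visible": "exists",
    "is not displayed": "does not exist",
    "is not visible": "does not exist",
}


def _longest_first(aliases):
    # Try longer phrases first so 'is not visible' wins over 'is visible'.
    return sorted(aliases.items(), key=lambda item: -len(item[0]))


def canonicalize(sentence):
    """Replace alternate verbs and validations with their canonical form."""
    text = sentence.strip()
    for alias, verb in _longest_first(VERB_ALIASES):
        match = re.match(rf"{re.escape(alias)}\b", text, flags=re.IGNORECASE)
        if match:
            text = verb + text[match.end():]
            break
    for alias, check in _longest_first(CHECK_ALIASES):
        match = re.search(rf"\b{re.escape(alias)}$", text, flags=re.IGNORECASE)
        if match:
            text = text[:match.start()] + check
            break
    return text


print(canonicalize("Navigate To https://candymapper.com"))
print(canonicalize("Verify 'Quotes' Tab is not visible"))
```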
Enough of All the Good News, What’s the Bad News?
This NLP engine Addon is just a start and there are a lot of unanswered questions from experienced SDETs:
1. Wouldn’t this approach be unreliable with invisible matching elements? No. All matching elements are double-checked for Visibility. The time saved in maintenance and test script development makes up for the slight overhead in execution time.
2. How does it support data parameters and data reuse? Great question! The element identification was the primary goal of this release of the Addon. Generating and reusing dynamic data values would be the next big step to implement because support for data-driven testing is already built into the TestProject platform.
3. There appears to be a lack of reusable high-level processes, for example “Enter User Address”, which might consist of multiple statements. That is true. Another planned feature is to have grouped statements executed as a single reusable block within a step.
4. Can this handle a Table or Grid element? Not in this release, as it will likely involve a modification in the sentence structure. However, in past projects I have dynamically identified tables, columns, rows, and two-column matches to capture, verify and modify specific cell values. Stay tuned 😎
5. Can you handle elements within Popups? It depends on the underlying architecture. Currently, I have not had success finding elements on a Popup, aside from clearing it with Type Esc.
6. Can this handle reCaptcha? No. DeathByCaptcha or alternate secure solutions provided by Dev are currently the only way to get past a reCaptcha visual challenge.
7. Can this verify if an element is Enabled or Disabled? Yes, this is available in the released version 2.0.1.
8. What if there are two elements with the same name and we want to click the 2nd one? You may find this to be surprisingly smart. Let’s say, for example, that Hover over ‘Login’ is followed by Click ‘Login’. If the Hover reveals another matching Login button element, the framework will take the next visible element. There are also plans to add ordinal references like this: Click the 2nd OK button, and relative object references like this: Verify the ‘Same as Billing Address’ Checkbox is below the ‘Shipping Address’ Field
9. What if the element can’t be found? One of the coolest features about TestProject is that it automatically takes screen captures when an error is reported.
So here is my challenge: Will this work on any website?
This addon has been tested against six public consumer websites and two sandbox sites, including the Tricentis Sample App, as well as CandyMapper.com, supported by GoDaddy.
I would love to test this approach on other websites and I encourage users to send feedback and feature requests in the comment section below.
Happy Testing! 💪
Paul M. Grossman
@DarkArtsWizard on Twitter
1 comment
Hi Paul, thank you for your amazing work for NLP for web. Is there any possibility to add NLP for mobile? Another question how we distinguish if there is a lot of same texts but different action in one page. Example there is 2 labels Shops, 1 label is just a label and another is a link to other page. Thank you.