
Exercise Bad Behavior – Don’t Let Your Testing Fail Due To Fatigue


So many of our automation endeavors have to suffer through script failures. Yes, many of these failures are our own fault. We still do things like use non-polling waits, use sleeps, and incorrectly handle race conditions. But sometimes, horror of horrors, some of those failures reported by our automation are due to issues with the system under test 😱
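As a rough illustration of the alternative to fixed sleeps, here is a minimal sketch of a polling wait. The helper name `wait_until` and its `timeout` and `interval` parameters are assumptions made for this example; they do not come from any particular automation library, many of which ship their own equivalent.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Return as soon as the condition holds instead of sleeping a fixed time.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

Unlike a hard-coded sleep, this returns as soon as the condition is met and fails with an explicit timeout when it never is, which tends to remove one common source of self-inflicted failures.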

Now, in the best of all worlds, when a test reports a failure, via automation or some other means, the cause of that failure is fixed immediately; most of us, however, don’t get to live in that world. Most of us live in a world where issues are logged and then prioritized to be resolved later, sometimes much later. In fact, due to how overall work is prioritized, an issue may be effectively, or even explicitly, designated as “not going to fix”.

Whither the failing automation? We’ll get to that shortly.

When our scripts fail because we have an inconsistent contract with our system under test (SUT), we necessarily have inconsistent automation results. Sadly, many of us live in this world as well. When that inconsistent behavior is deeply ingrained in the SUT’s code or system base, it can be difficult to influence changes that increase testability.

Whether we are seeing failures due to inconsistent system behavior or due to a consistent issue that’s not yet been fixed, it takes a toll on us. We have yet another failed automation report to investigate. Did we get an expected failure? If so, we check to see if we already created an issue report for it. If it appears to be new, is it a consistent failure or intermittent? When we are seeing a high frequency of failed automation executions, we get worn out and suffer from what I call failure fatigue 😥

Failure fatigue is just that: the exhaustion we eventually feel from continually investigating failed automation executions.

So, what do we do about that? Hang on, before we can address what to do, we have to answer another question first, namely, “why do anything at all”?

The overarching answer to the latter question is “to avoid desensitization to failures”, a situation where we are no longer attuned to failures, especially recurring but intermittent failures. Even the most diligent of us will eventually look at a testing report and say, “there’s that failure again; we really need to get around to addressing that”. Yes, it might be a failure in the same script as the last time there was a failure and it might be the exact same failure. It might, however, be something totally different: an unrelated issue that happens to have been reported by the same test script. If we become desensitized to failures, particularly recurring failures in a specific script, we run the risk of assuming all failures reported by that script are due to the previously-discovered reason as opposed to the actual, newly-occurring reason. Desensitization leads to assuming we know why a failure has occurred without doing the due diligence to determine the actual cause.

Now, back to our first question, “what do we do about failure fatigue”?

Well, if a test script is failing, we can stop running that script; that definitely stops the failure and reduces the fatigue, but then we may be missing some information that the script provides despite the failure. We could do something radical, like fix the script, assuming the script is reporting false positives. Or, we could do something even more radical, like fix the issue in the SUT; remember, sometimes the automation is working correctly when it reports a failure (yes, you do sense some sarcasm there) 😉

Or, we could think a bit more outside of the proverbial box. For the most “out there” approach, why not make the test script report success when the current but undesired behavior occurs? In other words, let’s make the test script check for the current, but wrong, answer. Hang on; don’t toss me on the trash pile just yet. Please, just hear me out.

There are legitimate reasons to consider this approach other than “making all the dots green on the automation dashboard” (I also talk more about the point of automation in this article: Oh No! I Can’t Get My Test Automation Scripts To Pass!). If a fix is being deferred until a later time, the issue is not likely to be a critical/major issue; those are usually fixed with a sense of urgency. Usually, addressing a lower-priority issue isn’t as important or as pressing as other activities; if so, the product decision-maker has, at least temporarily, deemed the current behavior to be “sufficiently correct for now”. Following that line of thinking, making the script confirm the current behavior, undesirable as it is in the long run, is not a far stretch, and it helps alleviate failure fatigue. We get extra points if, in addition to making the script confirm the current behavior, we make the script fail if we get different-than-expected behavior, even if that behavior is the desired behavior. This creates a programmatic check that the “now broken” script doesn’t stay broken forever.
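To make the idea above concrete, here is a minimal sketch of a “check the bad behavior” assertion. Everything named here is an assumption for illustration: the helper `assert_known_bad_behavior`, the exception `KnownIssueResolved`, the issue identifier `BUG-123`, and the price strings are all hypothetical, not part of any real framework or of the client work mentioned in this post.

```python
class KnownIssueResolved(AssertionError):
    """Hypothetical signal that the desired behavior has appeared."""


def assert_known_bad_behavior(actual, known_bad, desired, issue_id):
    """Pass only when the SUT shows the known, already-logged wrong behavior.

    Any other result fails the check, including the desired one, so the
    temporary check cannot silently outlive the fix.
    """
    if actual == known_bad:
        return  # The logged issue is still present; nothing new to investigate.
    if actual == desired:
        raise KnownIssueResolved(
            f"{issue_id}: got the desired value {desired!r}; "
            "the issue looks fixed, so restore the original assertion"
        )
    raise AssertionError(
        f"{issue_id}: expected known bad value {known_bad!r} "
        f"or desired value {desired!r}, got {actual!r}"
    )


# Hypothetical usage: the SUT currently renders "$10.0" instead of "$10.00".
assert_known_bad_behavior(
    actual="$10.0", known_bad="$10.0", desired="$10.00", issue_id="BUG-123"
)
```

The separate exception type is a design choice for this sketch: it lets a report distinguish “the fix has landed, update the script” from “something new and different is wrong”, which is exactly the distinction that desensitization tends to blur. Some test runners offer a related built-in, such as pytest’s `xfail` marker with `strict=True`, though that passes on any failure rather than only on the specific known one.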

❗ There are specific circumstances in which the “check the bad behavior” approach might be a good idea, but most, if not all, of the following criteria must be met when deciding to take on this approach:

  • The issue is that the SUT is not behaving correctly
  • The issue is not with the automation
  • The issue will be fixed eventually, but the failure is causing some amount of fatigue
  • The script can’t or shouldn’t be split into two parts: the part that exercises the steps that are working as expected and the part that is currently failing (and could be temporarily skipped with no appreciable risk)
  • The script can be made to pass if and only if the current behavior occurs, but still fails if other behavior occurs

From a “real life” standpoint, I’ve used this “check the bad behavior” approach with a client recently. At first, they were, shall we say, doubtful but after showing them how it works, we’ve had success with it. Additionally, my Hunting Sasquatch talk discusses failure fatigue; Jenny Bramble also discusses the “check bad behavior” topic in her talk Designing Your Tests Up to Fail.



About the author

Paul Grizzaffi

As a Principal Automation Architect at Magenic, Paul Grizzaffi is following his passion for providing technology solutions to testing, QE, and QA organizations, including automation assessments, implementations, and through activities benefiting the broader testing community. An accomplished keynote speaker and writer, Paul has spoken at local and national conferences and meetings. He is an advisor to Software Test Professionals and STPCon, as well as a member of the Industry Advisory Board of the Advanced Research Center for Software Testing and Quality Assurance (STQA) at UT Dallas where he is a frequent guest lecturer. In addition to spending time with his twins, Paul enjoys sharing his experiences and learning from other testing professionals; his mostly cogent thoughts can be read on his blog at https://responsibleautomation.wordpress.com/.
