This article takes you through measuring code coverage, why it is important, and how to implement it.
Table of Contents
- Metrics – Are They Important?
- Code Coverage – What Is It?
- Code Coverage from API Workflow / Functional UI (end-2-end) Tests
- Implementing Functional Coverage
There are different metrics relevant to each aspect of the software development life cycle (SDLC). These metrics are important because we use them to understand the current state of the product. Without them, we would not know where we stand, how the situation is changing, or what target to aim for in the near future.
In a nutshell, Metrics —
- Help us understand the current state.
- Provide an indication of potential problems/issues.
- Give us the ability to make educated, data-driven decisions, quickly!
We will focus on Coverage, one of the popular metrics used by the development/testing teams.
> In computer science, test coverage is a measure used to describe the degree to which the source code of a program is executed when a particular test suite runs. A program with high test coverage, measured as a percentage, has had more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs compared to a program with low test coverage.
This Wikipedia page for Code coverage explains the concept in detail.
💡 Bear in mind though, high code coverage may give you a false sense of confidence about the quality of the product. Code coverage is ONE of the metrics that give an indication of quality, not THE metric!
It is easy to measure code coverage when running unit tests. You will find a plethora of tools (free and commercial) with a variety of features for whichever programming language you use. You can integrate these tools into your pre-commit hooks (i.e. you cannot push your code to version control if code coverage falls below the minimum threshold), and also into your CI builds (i.e. the build fails if code coverage falls below the expected threshold).
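For example, in a Maven project you could wire such a gate into the build using the JaCoCo plugin's `check` goal. The 80% branch-coverage threshold below is illustrative, not a recommendation:

```xml
<!-- Illustrative: fail the Maven build if branch coverage
     drops below 80%, via the JaCoCo plugin's check goal -->
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.11</version>
  <executions>
    <execution>
      <id>check-coverage</id>
      <goals><goal>check</goal></goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>BRANCH</counter>
                <value>COVEREDRATIO</value>
                <minimum>0.80</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```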
The reason capturing code coverage works easily for unit tests, and perhaps integration tests, is that these tests run in isolation, directly against the code. You do not need the product deployed to any environment in order to run the tests and measure coverage. I found these great resources you can check out to understand more about how code coverage works:
Lastly, we typically think of code coverage as line coverage, which is of limited value. You should focus instead on branch coverage of your product code as part of your unit test implementation.
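To see why, consider this small, hypothetical example. A single test calling `label(-1)` executes every line of the method, so line coverage reports 100% — yet only one of the two branches of the `if` is ever exercised:

```java
public class CoverageExample {
    // A single test calling label(-1) executes every line of this method
    // (100% line coverage), but only takes the "true" branch of the if.
    // The "false" branch (a non-negative input) stays untested, so branch
    // coverage is only 50% - which line coverage alone would never reveal.
    static String label(int n) {
        String result = "number";
        if (n < 0) {
            result = "negative " + result;
        }
        return result;
    }
}
```

Branch coverage flags the untested path; a second test with a non-negative input (e.g. `label(2)`) is needed to close the gap.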
However, one question comes up very frequently: how can we measure code coverage of the API Workflow tests, or of the Functional UI (e2e / end-2-end) tests? Even today, a quick search on this topic does not turn up any concrete suggestions or answers.
That said, I remember people asking me this question for at least the past 8-10 years. Every time, I have given the same answer, because I have not come across any better way to answer it. It is time I wrote it down for easier access for others as well.
So, here we go ✍✨
- Deploy the product-under-test to an isolated environment, and start measuring code coverage.
- Then trigger the API Workflow / Functional UI tests.
- That will tell you how much code coverage is achieved by these tests.
A key criterion for this strategy to work is that the environment is truly isolated — i.e. NO ONE else should be using it (for navigating through the product, or testing of any kind other than the tests triggered to measure coverage).
The above answer has some big gaps:
- What are you trying to understand from the code coverage of your API Workflow / Functional UI tests? What value will it bring to the team? How will it make the product better?
- Do you expect the code coverage of your API Workflow / Functional UI tests to be similar to, identical to, or better than that of the unit tests? If yes, we need to have a different conversation about the Test Pyramid!
However, I believe there is a better approach to this.
As seen in the Test Automation Pyramid above, the API Workflow (web service orchestration) tests and the Functional UI tests are business-facing tests. Hence, instead of measuring the code coverage from these tests, it will add more value if we measure the functional/component coverage from the API Workflow and Functional UI tests.
With this approach, measuring code coverage from the Technology Facing Tests (Unit/Integration/…) brings focus to the technical aspects of coverage, while measuring functional coverage from the Business Facing Tests (API Workflow/Functional UI) brings focus to the business and functional aspects of coverage.
When looked at together, this gives us a better sense of understanding of the overall quality of the product.
So, how can we implement the capture of functional coverage?
Unfortunately, there does not seem to be an out-of-the-box way to do this. But below is how I have implemented this before:
- I used cucumber-jvm in my test framework.
- For each (end-2-end Test) Scenario, in addition to the tags required for the test (as per the test framework design), I added the following types of tags to the test:
- functional areas touched by the test.
- components/modules touched by the test.
- I used the cucumber-reporting plugin to generate rich HTML reports.
- The beauty of the reports generated by cucumber-reporting is that I can now see, for my test execution, the different statistics of the tags when the tests ran.
- Below is an example from the cucumber-reporting github page:
- With this report, you can now get to know:
- The overall tag execution state for the complete test run.
- How many test scenarios each tag was part of, and how many times it ran.
- Drill down into tag-specific reports as well.
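Even if you do not use cucumber-reporting, the core of these tag statistics is straightforward to compute yourself. Here is a minimal, hypothetical sketch in plain Java (the names are mine, not from any framework) that counts how many scenarios each functional/component tag appears in:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TagStats {
    // Count how many scenarios each tag appears in, given a map of
    // scenario name -> tags attached to that scenario.
    static Map<String, Integer> scenarioCountPerTag(Map<String, List<String>> scenarioTags) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> tags : scenarioTags.values()) {
            for (String tag : tags) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the tags of your scenarios shows at a glance that, say, a `@login` tag is touched by two scenarios while a `@reports` tag is touched by only one.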
You can refer to this github project for a sample implementation of the above with Karate API tests, which uses cucumber-reporting to generate reports that provide Functional Coverage.
As mentioned above, I am adding custom tags to each scenario — based on module/functionality/etc. For any critical functionality of my product, I would want a higher concentration of tests covering that feature/module compared to others.
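For example, a Scenario could carry both its framework tags and the functional/component tags. All tag names and steps below are illustrative, not from any real project:

```gherkin
# Illustrative: @e2e / @regression are framework tags;
# @checkout marks the functional area, and @payment-service /
# @inventory-service mark the components this scenario touches.
@e2e @regression
@checkout @payment-service @inventory-service
Scenario: Guest user completes a purchase with a credit card
  Given an anonymous user with items in the cart
  When the user pays using a valid credit card
  Then the order is confirmed and the inventory is updated
```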
Another way to visualize tag statistics is as a tag heat map. You want your critical functionality to have a good-sized bubble in the heat map; small or missing bubbles mean you have little or no coverage for that feature/module.
The above example is one of the easiest ways to implement feature coverage for API/UI tests. It is very likely you are not using cucumber-jvm and the cucumber-reporting plugin; but if this approach makes sense to you, you can easily implement it in your own test framework, using the constructs and features of your programming language and tools.
Note that the above example is one of the ways I implemented this capability in my automation framework.
I chose this approach because I like to keep the 80-20 rule in mind. The information it provided was sufficient to understand the current state and make decisions on “what’s next”. For areas that need additional clarity, I can then talk with the team and explore the code to get to the next level of detail. This makes for a very collaborative way of working, and joint ownership of quality! 🚀
You can choose your own way to implement Functional Coverage – based on your context of team, skills, capability, tech-stack, etc.