When Nothing Goes Right, Shift Left!

The shift-left approach in the software testing world means performing testing early in the lifecycle of the product. But given the growing complexity of products in the market today, solutions must be customized to work for each organization. In this article, we will discuss a few applied scenarios in shifting left/right, and share how customized approaches for your teams are critical to the success of your test automation efforts 💪

In my experience working at top tech companies, I have seen the software development engineer in test (SDET) role and the key metrics in test automation evolve with time.

Once the automation framework matures, the goals shift towards metrics like MTTR (Mean Time to Resolve) and MTTD (Mean Time to Detect). These metrics are critical when we are running tests against pull requests. In general, when a test fails intermittently, passing on some runs and failing on others without any change to the code, we classify it as a flaky test.
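As a rough illustration of how these two metrics could be computed, here is a minimal Python sketch. The `Incident` record and its field names are hypothetical, and definitions vary between teams: this sketch assumes MTTD is measured from when a breaking change landed to when a test flagged it, and MTTR from detection to the fix.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record of one test failure; field names are illustrative.
    introduced_at: datetime  # when the breaking change landed
    detected_at: datetime    # when a test run first flagged it
    resolved_at: datetime    # when the fix was merged


def mean_time_to_detect(incidents: list[Incident]) -> timedelta:
    """Average delay between a problem being introduced and being detected."""
    seconds = mean(
        (i.detected_at - i.introduced_at).total_seconds() for i in incidents
    )
    return timedelta(seconds=seconds)


def mean_time_to_resolve(incidents: list[Incident]) -> timedelta:
    """Average delay between detection and resolution (one common definition)."""
    seconds = mean(
        (i.resolved_at - i.detected_at).total_seconds() for i in incidents
    )
    return timedelta(seconds=seconds)
```

Tracking both numbers per pull request pipeline makes it easier to see whether slow feedback comes from late detection or from slow fixes.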

In one of my roles, we had an SLA (Service Level Agreement) of 24 hours within which we needed to fix our test failures. During that time, we were expected to notify our engineers that a test failure had occurred and that we were investigating it. While debugging, we realized that most of our test failures were due to environment issues, test data issues, or a service being down on staging, which impacted the UI tests running there. Having a customized approach to resolving each kind of issue was essential to delivering quality software products.

In my current role, for example, we resorted to mocking the API responses, especially in cases where we had to match a driver and a passenger. You can imagine what a nightmare it would have been to have a driver paired with someone else’s passenger on staging, which is a shared environment.
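To make the mocking idea concrete, here is a minimal sketch using Python's standard `unittest.mock`. The `RideService` class, the `find_driver` method, and the IDs are hypothetical stand-ins, not the actual code from that role. The point is only that the test gets a canned match instead of calling a matching service on shared staging.

```python
from unittest.mock import Mock


class RideService:
    """Hypothetical code under test; it depends on a matching API client."""

    def __init__(self, matching_api):
        self.matching_api = matching_api

    def request_ride(self, passenger_id: str) -> str:
        match = self.matching_api.find_driver(passenger_id)
        return f"Driver {match['driver_id']} is on the way to {passenger_id}"


def test_passenger_is_paired_with_expected_driver():
    # Replace the shared staging dependency with a canned response so the
    # pairing is deterministic and never touches another team's test data.
    matching_api = Mock()
    matching_api.find_driver.return_value = {"driver_id": "driver-42"}

    service = RideService(matching_api)
    message = service.request_ride("passenger-7")

    assert message == "Driver driver-42 is on the way to passenger-7"
    matching_api.find_driver.assert_called_once_with("passenger-7")
```

The trade-off is that a mocked test no longer verifies the real matching service, so it is usually paired with a smaller set of end-to-end checks.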

In one of my earlier roles, we realized that our UI tests were failing whenever a service was down in staging. Essentially, our tests were notifying other folks that a service was down, which is far from optimal! A test should not be doing the job of an observability tool; monitoring tools exist precisely to surface these errors, and you shouldn’t be waiting for a test to fail to learn that you need to act. This is an example of where having a customized approach is crucial. I shifted right and worked closely with the observability team to set up a monitoring tool like Grafana with alerting capabilities that would notify the on-call person about any spikes or plummets in any of the services 📈
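The logic behind a spike-or-plummet alarm can be sketched in a few lines of Python. This is not Grafana configuration; it is only an illustration of the kind of rule such an alert evaluates, and the function name and the 50% tolerance are assumptions.

```python
from statistics import mean
from typing import Optional


def detect_anomaly(
    history: list[float], current: float, tolerance: float = 0.5
) -> Optional[str]:
    """Compare the current metric value against the recent average.

    Returns "spike" or "plummet" when the relative change exceeds the
    tolerance, otherwise None. The tolerance is an illustrative default.
    """
    baseline = mean(history)
    if baseline == 0:
        # Avoid dividing by zero; any traffic on a silent metric is a spike.
        return "spike" if current > 0 else None
    change = (current - baseline) / baseline
    if change > tolerance:
        return "spike"
    if change < -tolerance:
        return "plummet"
    return None
```

A rule like this runs continuously against service metrics, so the on-call person hears about an outage from the monitoring system rather than from a failing UI test.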

In this next section, we will cover some common questions that I often receive about how to build a customized approach to ensure your test automation success.

Do you test in production OR go to the far left/right part of the pipeline?

In one of my earlier roles, we were able to test in production quite easily. We could run our tests in staging, pre-prod, and production, mainly because we could easily whitelist our test users. To run tests in production, we needed to ensure that they wouldn’t affect analytics. To address this, we added a flag to the user record. It is an optional field, but when it is set to “test”, the user becomes restricted and the analytics API ignores that user.
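Here is a minimal sketch of how such a flag might keep test traffic out of analytics. The `User` and `AnalyticsRecorder` classes and their fields are hypothetical; only the idea of an optional flag set to "test" comes from the setup described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: str
    flag: Optional[str] = None  # optional field; "test" marks an automation account

    @property
    def is_test_user(self) -> bool:
        return self.flag == "test"


class AnalyticsRecorder:
    """Hypothetical analytics sink that drops events from flagged test users."""

    def __init__(self) -> None:
        self.events: list[tuple[str, str]] = []

    def track(self, user: User, event_name: str) -> bool:
        if user.is_test_user:
            return False  # ignored, so production metrics stay clean
        self.events.append((user.user_id, event_name))
        return True
```

Keeping the check inside the analytics layer means individual tests do not need to remember to opt out.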

Because we could control these factors, it was very easy to run our tests in production, and backend configs let us enable those runs like the flip of a switch. Make sure that you’re able to test in production or that your test automation is as seamless as possible. Remember that if you want to write stable tests, you need to control your data: how it is generated and which variables are associated with it.

Especially at startups, I have seen that most features are still in experimental mode. Experiments get turned on randomly and suddenly, say for 10% of your user base, and a test can then behave differently from one run to the next because its user landed in an experiment segment. We were able to control this as well, because we had a feature flag with a toggle for those users: if a user is a test user, don’t add them to any experimental bucket. That made it easier to control how our tests behaved, whether in staging or in production. We were confining all the variables within a specific boundary, which helped us seamlessly shift Left ⬅ or ➡ Right
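The bucketing rule described above could look roughly like the following Python sketch. The function name, the hash-based assignment, and the bucket labels are assumptions for illustration; real experimentation platforms differ in the details.

```python
import hashlib


def assign_bucket(
    user_id: str, is_test_user: bool, experiment: str, rollout_percent: int
) -> str:
    """Return "treatment" or "control" for a user in a given experiment.

    Test users are always kept in "control", so automated tests see the
    same behavior no matter which experiments are currently rolling out.
    """
    if is_test_user:
        return "control"
    # Deterministic hashing: the same user always lands in the same slot.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    slot = int(digest, 16) % 100
    return "treatment" if slot < rollout_percent else "control"
```

For example, `assign_bucket("bot-1", True, "new_checkout", 10)` stays in control even when the experiment is later ramped to 100%.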

How does empowering teams help in determining the overall success of test automation efforts?

In some of my roles, I have been successful in implementing a test automation framework and process to improve the overall CI/CD pipeline because of the support from my managers and my directors, who were open to hearing feedback. I’ve realized that most of the time, the problem is not just within the quality team because there are a lot of dependencies with other teams as well. There are also a lot of cross-platform dependencies involved, which increase the complexity.

Another critical aspect is the OKRs that the company focuses on and how test automation goals can be tied to that vision. Once they are tied together, test automation indirectly becomes crucial for the company, and teams become open to helping and empowering one another. You just need to position your problem statement so that others understand it will advance the company’s goals. It is also about communicating with other teams and helping them understand how improving test automation helps them.

When management shares the desire to improve test automation and asks authentic questions about how to support it, you immediately see the positive impact. Because we had a (psychologically) safe space, we could openly say that we shared dependencies, and we felt empowered by our leaders to get in touch with other key stakeholders to identify roadblocks. When we reached out to those teams, they initially resisted, but once we were able to tie our goals to the company’s goals, they made time to support our efforts.

I have also worked at places where we were unable to accomplish test automation success because of friction from the development team and a lack of support from the dependent teams. Very early on in my career, I joined a team where I was responsible for setting up a test automation framework. When I spoke to the development team, they resisted the efforts, stating that they didn’t believe in tests, even though that attitude was the very impediment preventing quality engineers from moving forward.

Due to a lack of direction or empowerment from our leaders, our team ended up working on the test framework anyway, siloed in a different repository, only to realize after a year that these efforts had gone unnoticed! For our test automation to be successful, we need to empower teams with the knowledge of how automation can help them move faster rather than slow down their progress.

How do you imagine the future of test automation in the upcoming years?

We’re definitely moving towards leveraging AI and ML in the testing space. I believe a few companies are already starting to practice AIOps and MLOps. AI definitely cannot replace testing per se, but it will supplement our current efforts and act as a catalyst to help us move faster. Say we decide to involve AI and ML: you then don’t need a human being to monitor and update the quarantine test suite. Instead, you can write a script that keeps learning and shares analytics on the stability of your test suite, along with the gaps you need to close to meet your metric goals.
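As a starting point for the kind of script mentioned above, here is a deliberately simple, rule-based sketch in Python. It does no learning at all; it only summarizes pass/fail history and suggests which tests to quarantine. The function name, thresholds, and status labels are assumptions.

```python
from collections import defaultdict
from typing import Iterable


def summarize_stability(
    runs: Iterable[tuple[str, bool]],
    min_runs: int = 5,
    flaky_threshold: float = 0.2,
) -> dict[str, tuple[str, float]]:
    """Map each test name to (status, failure_rate).

    `runs` holds (test_name, passed) pairs, ideally from repeated runs
    against unchanged code. Tests with too little history are skipped.
    """
    outcomes: dict[str, list[bool]] = defaultdict(list)
    for test_name, passed in runs:
        outcomes[test_name].append(passed)

    report = {}
    for test_name, results in outcomes.items():
        if len(results) < min_runs:
            continue  # not enough data to judge
        failure_rate = results.count(False) / len(results)
        if failure_rate == 0:
            status = "stable"
        elif failure_rate == 1:
            status = "broken"  # fails every time: a real bug, not flakiness
        elif failure_rate >= flaky_threshold:
            status = "quarantine"
        else:
            status = "watch"
        report[test_name] = (status, failure_rate)
    return report
```

An ML-based version would replace the fixed thresholds with a model trained on historical runs, but the inputs and the report would look much the same.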

As a software quality advocate and an automation engineer, your focus is on understanding whether the test is a good use case for automation in the first place. This process consists of a series of questions such as: Can it be automated? Can it help developers get faster feedback? Can your script take care of figuring out the reason for test flakiness? Or is the failure caused by a different issue?

The script then comes up with a recommendation, saying, for example, that a test is flaky because of the test environment or because it lacks proper test data. The person behind the scenes then goes and fixes the issue. In the not-so-distant future, the system will make recommendations based on all the data it analyzes, and we will just need to step in and fix things. I think that’s the direction we’re moving in, which will help us move even faster! 🏃‍♀️
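A very rough approximation of such a recommendation step, without any ML, is a keyword lookup over failure messages. The keyword lists and category names below are purely illustrative assumptions; a real system would learn these patterns from historical data.

```python
# Hypothetical keyword hints per likely cause; tune these to your own logs.
CAUSE_KEYWORDS = {
    "environment": ("connection refused", "timed out", "503", "dns"),
    "test data": ("not found", "duplicate key", "already exists", "stale"),
}


def suggest_cause(failure_message: str) -> str:
    """Return a likely cause category for a failure message."""
    text = failure_message.lower()
    for cause, keywords in CAUSE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return cause
    return "needs investigation"
```

Even a crude triage like this can route a failure to the right owner faster, which directly helps the time-to-resolve metric.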

Do you think that the communities & the industry are ready to start adopting AI and ML?

In my opinion, the changes are happening very slowly. I think it’s going to take us a while: a couple of tools are launching here and there, but I don’t think they are easily pluggable into an environment, as they need to be customized for each organization and its dependencies. So at this point, it is at a proof-of-concept stage, and I believe it’ll take us a few more years to get there. There is also the fact that everyone agrees that shifting left is important, but few realize that we also need to be adaptable enough to shift right when needed.

Companies are slowly understanding its importance. These days, job descriptions ask for automation skills, but you are also expected to have Docker skills or understand your CI/CD system. Companies realize these skills are critical for the role, since they help you as a quality engineer move faster. You also want a better understanding of the test ops side of the world, which will help the organization adapt to shifting left or right. It is going to be a slow transition, similar to how long it took companies to move from waterfall to agile. These days it is hard to find a company that still follows the waterfall model. Now that the DevOps transition has begun, AIOps and MLOps will definitely follow soon.

Any resources you recommend to our audience?

“Leading Quality” is one of my favorite books. It empowers quality engineers to think outside the box, while leveraging real-life industry experiences. The best quote from the book, in my opinion, is “Continuous deployment without quality is just delivering continuous bugs to your customer”. The 3 C’s are key to a successful implementation of quality: customer issues lead to company issues, which eventually impact the quality engineer’s career. Finally, for success in the test automation space, always connect the quality team’s work to business outcomes.

These were my thoughts on my most asked questions regarding how to build a customized approach to ensure success in your test automation 🥇 

About the author

Niranjani Manoharan

An accomplished software engineering leader with 13+ years of experience leading software engineering for test and developer productivity at industry pioneers like Lyft, Pinterest, eBay and Twitter. Holds a Master’s in software engineering from San Jose State University and has since worked in Silicon Valley, navigating between startups and tech giants. A motivational, influential trailblazer and collaborator who builds and implements solutions to elevate velocity, scalability, and testability.

Twitter- https://twitter.com/RanjaniRambles

Blog- https://www.niranjanimanoharan.org/
