
Shifting Right with Testing and Monitoring in Production

Shifting Right

Many testing leaders over the last several years have been advocating shifting left, meaning getting involved earlier in the design and development process. By understanding the design and implementation of the application, testers can go into active testing with a more comprehensive understanding of its strengths and weaknesses.

The contrarian that I am, over that same period of time I’ve been advocating not shifting left, but shifting right. That is, continuing to test and monitor applications even after they are deployed.

Cloud deployment has made this both possible and necessary. While the cloud enables global reach and almost infinite scalability, it is far more complex than deploying to an enterprise data center. Testers generally run their tests on local systems or on simplified cloud systems that don’t accurately represent the deployment environment. If the test bed differs from the production environment, testing can’t reliably predict what issues might arise in production.

That’s why testers have to be responsible for monitoring and testing applications in production. The work isn’t done yet.

Convincing veteran testers that it is important to test in production can be a challenge. Despite acknowledging that the production environment may be substantially different from the test bed, testers have long-held biases against post-deployment testing. Tinkering with an application in production in the data center can have serious consequences, and testers are conditioned to think that their job is done once the application is turned over to IT.

But that’s simply not the case anymore. Recovering from an application failure can be faster in the cloud. At best, monitoring the application in production will likely speed diagnosis, as teams examine performance and availability data. At worst, it should be possible to roll back to a previous version if necessary, taking perhaps only minutes.

Also, monitoring and testing in the cloud enable testers to find issues before they become problems for users. It’s obvious why this is important in this era of digital user experience. Finding and fixing a bug before users encounter it is a significant win for the enterprise, because it doesn’t expose users to a bad experience, and perhaps lost productivity and sales.

Synthetic Testing in Production

Testers are familiar with automating user interface testing – that is, recording user steps and replaying them automatically. In production, this is referred to as synthetic testing. It is often a part of a monitoring solution, so that the application can be monitored during testing.

These are tests that work through common user scenarios, often defined by the user stories that form the basis of the application. Once recorded, they can be executed at any time of the day or night.  In doing so, they can test conditions of the application at different times and loads. You can even direct requests to different data centers in different parts of the world to better understand availability and response times.
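As a minimal sketch of what such a scripted scenario might look like, the following runs a sequence of user steps against a deployed application, checking both the response status and a latency budget for each step. The URLs, step names, and the `latency_budget_ms` threshold are all hypothetical examples, not part of any particular tool; real synthetic-monitoring products (or a framework like Selenium or Playwright) would drive a full browser rather than raw HTTP requests.

```python
import time
from urllib import request

def run_synthetic_check(steps, fetch=None, latency_budget_ms=2000):
    """Run a scripted user scenario against a deployed application.

    steps: list of (name, url, expected_status) tuples, typically
           derived from the user stories behind the application.
    fetch: callable(url) -> HTTP status; defaults to a plain GET,
           but can be swapped out (e.g. for a headless browser).
    Returns one result dict per step, suitable for feeding into a
    monitoring pipeline or alerting rule.
    """
    if fetch is None:
        def fetch(url):
            with request.urlopen(url, timeout=10) as resp:
                return resp.status

    results = []
    for name, url, expected in steps:
        start = time.monotonic()
        try:
            status, error = fetch(url), None
        except Exception as exc:
            status, error = None, str(exc)
        elapsed_ms = (time.monotonic() - start) * 1000
        results.append({
            "step": name,
            "ok": status == expected and elapsed_ms <= latency_budget_ms,
            "status": status,
            "elapsed_ms": round(elapsed_ms, 1),
            "error": error,
        })
    return results

# Hypothetical scenario mirroring a user story: browse, search, add to cart.
scenario = [
    ("home",   "https://shop.example.com/",             200),
    ("search", "https://shop.example.com/search?q=tea", 200),
    ("cart",   "https://shop.example.com/cart",         200),
]
```

Because such checks are just scheduled code, they can run every few minutes, around the clock, and from different regions – which is exactly what gives them visibility into availability and response times that a one-off pre-release test pass cannot provide.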

This is also the best way of finding errors and other issues before users do. Synthetic testing can expose faulty logic or bugs that weren’t discovered during pre-production testing, even if users haven’t reported any problems yet. These issues can then be diagnosed and fixed before they impact any users or business.

Monitoring for Health and Efficiency

I haven’t yet said a lot about what we should be monitoring on a production application. First, we monitor some of the same things we do in the data center – CPU, disk, and network. Those remain fundamental in understanding how your application is behaving.

But with the increasing emphasis on user experience, the traditional monitoring metrics are taking a backseat to measures that affect real users. Some monitoring tools employ real user monitoring, or RUM, in order to check on time to return data, time to paint a new screen, and likelihood of getting a 404 or other error.  This enables testers to report on deficiencies in user experience.
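To make those user-facing measures concrete, here is a small sketch of how RUM beacons collected from real browsers might be rolled up into tester-facing metrics: percentile latencies and an HTTP error rate. The field names (`ttfb_ms`, `first_paint_ms`, `status`) are assumptions for illustration; actual RUM tools define their own beacon schemas, usually based on the browser's Navigation Timing API.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)."""
    vals = sorted(values)
    k = max(0, math.ceil(p / 100 * len(vals)) - 1)
    return vals[k]

def summarize_rum(beacons):
    """Roll up real-user-monitoring beacons into summary metrics.

    beacons: list of dicts with 'ttfb_ms' (time to first byte),
             'first_paint_ms', and HTTP 'status' per page view.
    Percentiles matter more than averages here: the p95 captures
    the slow experiences that averages hide.
    """
    ttfb = [b["ttfb_ms"] for b in beacons]
    paint = [b["first_paint_ms"] for b in beacons]
    errors = sum(1 for b in beacons if b["status"] >= 400)
    return {
        "ttfb_p50_ms": percentile(ttfb, 50),
        "ttfb_p95_ms": percentile(ttfb, 95),
        "paint_p95_ms": percentile(paint, 95),
        "error_rate": errors / len(beacons),
    }
```

A report like this lets a tester say not just "the app is up," but "5% of real users waited more than two seconds for a first paint" – a deficiency in user experience stated in users' terms.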

But the analytics associated with monitoring do more than give you a snapshot of application health. They provide information on trends, which enables testers to identify whether an application may have a problem in the future. By storing large amounts of monitoring data over a period of weeks or even months, it’s possible to identify memory leaks, bottlenecks, or other issues before they cause larger problems.
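A simple illustration of this kind of trend analysis: fit a least-squares line to memory samples collected over time and flag a persistently positive slope, which is the classic signature of a slow leak. This is a sketch under assumed inputs (hourly RSS samples in megabytes); production monitoring tools use more robust statistics, but the idea is the same.

```python
def leak_slope(samples):
    """Least-squares slope of (hour, rss_mb) samples, in MB/hour.

    A slope near zero means memory usage is stable; a persistently
    positive slope over days of data suggests a memory leak.
    """
    n = len(samples)
    xs = [t for t, _ in samples]
    ys = [m for _, m in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def flag_leak(samples, threshold_mb_per_hour=1.0):
    """Flag when memory growth exceeds a (hypothetical) threshold."""
    return leak_slope(samples) > threshold_mb_per_hour
```

Run against two days of hourly samples, a process leaking 2 MB/hour gets flagged long before it exhausts memory, while a process with flat usage does not – the kind of early warning that turns a future outage into a routine fix.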

Testers Need Monitoring Data

But testing for cloud applications also has to include continuous monitoring and testing in production. Why? Because testing has changed over the last several years. Practices such as agile, DevOps, and continuous deployment have blurred the distinction between phases of the application lifecycle, including production. In that world, testers certainly have to shift right to make sure the application works as expected in production.

I’ve talked a bit about what to monitor and why, but not about how to monitor. There are a wide variety of application monitoring tools available. I’ll name a few here, but do your own research based on your unique needs. The most comprehensive commercial tools include Dynatrace, New Relic, AppDynamics, DataDog, and Sumo Logic, but there are literally dozens of tools to choose from.

There are also a variety of open source monitoring solutions. Geekflare has a description of six popular open source tools, while OpenAPM incorporates a flow that enables you to select the best solutions based on your stated needs.

I’m not advocating for any particular tool. Monitoring systems have different capabilities and price points, and there is no “one product fits all”. But monitoring for availability, performance, load, and errors has become critical to the quality of a cloud-based application. And the many solutions available attest to the importance of these activities.

Are you advocating for the Shift Left or for the Shift Right approach? Share in the comments below; we’d ❤ to hear your thoughts!

About the author

Peter Varhol

Peter Varhol is a well-known writer and speaker on software and technology topics, having authored dozens of articles and spoken at a number of industry conferences and webcasts. He has advanced degrees in computer science, applied mathematics, and psychology, and is Managing Director at Technology Strategy Research, consulting with companies on software development, testing, and machine learning. His past roles include technology journalist, software product manager, software developer, and university professor.

