5 Testing Strategies For Deploying Microservices

Rigorous development and pre-production testing go a long way toward ensuring your microservices perform as they should. However, microservices also need to be continuously tested against actual end-user activity so the application can adapt to changing preferences and requests. This article covers five deployment strategies that help developers and DevOps teams release new features or make changes to microservices-based applications.

Always Test in Production

A staging environment is a necessary but insufficient step toward building a robust microservices infrastructure. Although indicative, it cannot reproduce the evolving traffic and behavior of real users. Once real users interact with the software in production, the services may not perform as efficiently as they did in staging.

Users are unpredictable. They may make requests you did not account for when drawing up communication paths, or your microservices architecture may not sustain actual production traffic. Only by testing with real end users can you be confident of your software's performance and know where to improve it or add new functionality.

The disadvantage of working with real users is that they will be impacted by any errors. However, with specific deployment strategies, you can identify the issues and minimize their impact on those users. 

Depending on your goals, you can implement the following testing strategies in the deployment stage. They can be used individually or in combination, as each approach addresses a different aspect of deployment risk.

Blue-Green Deployment

A blue-green deployment uses two distinct yet identical production environments. For example, the latest version available to end users runs in the green environment, while an idle copy of the application sits in the blue environment. You then push code with the new functionality to the blue environment.

At this stage, you test the services in the blue environment to ensure that they are running smoothly. You can automate this process by using basic smoke tests. When the microservices pass these tests, you can instruct the load balancer to gradually redirect user traffic to this new environment.

If the new functionality poses no problems, the router moves all traffic from the green to the blue environment, but it can immediately revert traffic if any errors arise.
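The cutover described above can be sketched as a simple in-process router. This is a minimal illustration, not a real load balancer; the `Router` class and its `switch_to` method are hypothetical names for this example.

```python
# Minimal sketch of a blue-green traffic switch. Router and switch_to
# are hypothetical; a real setup would configure a load balancer instead.

class Router:
    """Routes every request to whichever environment is currently live."""

    def __init__(self, green, blue):
        self.environments = {"green": green, "blue": blue}
        self.live = "green"               # current production environment

    def switch_to(self, color):
        # Cut over all traffic; the old environment stays warm for rollback.
        self.live = color

    def handle(self, request):
        return self.environments[self.live](request)


def green_app(request):
    return f"v1 handled {request}"        # version currently serving users


def blue_app(request):
    return f"v2 handled {request}"        # new version awaiting release


router = Router(green_app, blue_app)
assert router.handle("/checkout") == "v1 handled /checkout"

router.switch_to("blue")                  # release the new version
assert router.handle("/checkout") == "v2 handled /checkout"

router.switch_to("green")                 # instant rollback on errors
assert router.handle("/checkout") == "v1 handled /checkout"
```

Because both environments stay running, the switch in either direction is a single routing change rather than a redeployment.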

Because the two environments are identical and either one can serve production traffic, there is little to no downtime. The end user's experience is largely unaffected, and developers can move quickly to fix any bugs.

The old environment serves as a backup, making it possible to execute a rollback without any downtime. After releasing the latest version, you can decide to kill the old instances, but only after you are confident that the new environment will continue running without errors. You can also clone both blue and green environments to repeat the process with other feature deployments in the future.

Canary Deployment

Canary deployment is similar to blue-green deployment in that it protects against release-related risks.

The phrase “canary deployment” originates from an old mining practice of placing caged birds in mine shafts to detect the presence of harmful gases. If present, the gases would kill the canary before affecting the miners, thus providing an early warning to vacate the mine immediately.

A canary deployment works on a similar principle. Instead of creating two separate environments, a canary deployment operates within the same microservice or infrastructure. The developers roll out a new service or application version with changes only to a fraction of the end-users.

You can quickly detect errors or vulnerabilities as users interact with the new version. The impact is temporary and minimal because the experiment runs on only a subset of users, with most remaining unaffected. Once the changes are working and have passed verification, you can scale them up to all users.
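Routing a stable fraction of users to the canary is often done by hashing a user identifier into a bucket. The sketch below assumes this hashing approach; the function names and the 5% default are illustrative, not a specific tool's API.

```python
# Minimal sketch of canary routing by stable user-ID hashing.
# is_canary_user and route are hypothetical helper names.
import hashlib

def is_canary_user(user_id, percent):
    """Deterministically place a stable fraction of users in the canary."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100        # bucket in the range 0..99
    return bucket < percent

def route(user_id, percent=5):
    """Send `percent`% of users to the canary, the rest to stable."""
    return "canary" if is_canary_user(user_id, percent) else "stable"


# The same user always lands in the same group for a given percentage.
assert route("alice", 0) == "stable"      # 0% canary: nobody is selected
assert route("alice", 100) == "canary"    # 100% canary: full rollout
assert route("alice", 5) == route("alice", 5)
```

Hashing rather than random sampling keeps each user's assignment stable across requests, which also makes it easier to track which users are on which version for metrics.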

The biggest challenge for those running canary deployments is the requirement to run multiple versions of the same microservices so that they can be deployed to a few users at a time. Implementing multiple versions also means you need to keep track of which users are on which versions to run accurate business metrics and analytics. Such monitoring can get increasingly complex with multiple canary deployments within the same application.

Feature Flags

A feature flag, or feature toggle, wraps a change or feature in conditional code. Developers can turn the feature on or off depending on their testing requirements while the application is running and in use.

The code is already deployed and there are two code paths: One with the code that implements the feature and one without it. The developer only needs to choose one of the two paths. When the switch is toggled on, the code chunk is executed as part of the flow. When switched off, that code is skipped and the feature is not implemented. The rest of the source code continues to run as usual, independent of the feature flag’s condition, with no disturbances in the end user’s experience.
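The two code paths can be sketched as a single conditional. The `FLAGS` dictionary below stands in for whatever store a real system would use (a config service, a flag-management tool); the names are hypothetical.

```python
# Minimal sketch of a feature toggle. FLAGS is a hypothetical stand-in
# for a remote flag store; in production the value is flipped remotely.

FLAGS = {"new_checkout": False}

def is_enabled(flag):
    return FLAGS.get(flag, False)

def checkout(cart):
    if is_enabled("new_checkout"):
        # New code path: executed only while the flag is on.
        return "new checkout"
    # Old code path: the default whenever the flag is off.
    return "legacy checkout"


assert checkout(["book"]) == "legacy checkout"
FLAGS["new_checkout"] = True              # flip the switch at runtime
assert checkout(["book"]) == "new checkout"
```

The rest of the application never consults the flag, so toggling it changes only this one path while everything else runs as usual.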

Once integrated, a feature flag allows you to turn on a feature for a select group of users. Unlike a canary deployment, where the selection is random, feature toggles are usually employed for specific cases. For example, developers do not create two separate applications to implement free and paid subscription tiers. Instead, a feature flag lets you make a particular feature accessible only to paying subscribers.

Feature flags are especially attractive during testing as developers can use them to run small experiments throughout the application without relinquishing control. Being able to flip the switch remotely also makes rollbacks easier. The risk is negligible as the application is self-sufficient in that it can continue to run without the feature. 

Feature flags multiply the possible user-to-service paths through an application. For a specific user, it can be difficult to determine the exact route they took when using the services, and because flags allow each user to have a different experience, the application becomes harder to debug. Toggles are easy to get started with and carry minimal risk, but their maintenance can quickly become complex. It is best to use them only when needed and alongside other deployment strategies.

Traffic Shadowing

With traffic shadowing, or mirroring, the router duplicates incoming traffic to an already-released service and sends the copy to a second service. The request-and-response mechanism between the user and the existing service remains intact, while the second service, which contains the new features that require testing, receives only the copied traffic. It therefore cannot interfere with the existing process; the copy is used purely to test the new functionality.

Its most significant benefit is that the new version receives exactly the same traffic as the service it seeks to replace. There is no need to create test data or worry about replicating scale. This accuracy comes with little risk, as there is no tangible impact on the existing services. Developers can run all relevant tests in this production environment, such as checking for errors and measuring performance. The new version's responses, which are never sent to users, can also be compared with those of the production service. Both versions operate independently of each other with different end goals. This can happen in real time, or a copy of the traffic can be saved and replayed for future testing.
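The mirror-and-compare flow can be sketched in a few lines. This is an illustrative in-process model, not a service mesh; the `primary` and `shadow` handlers and the mismatch log are hypothetical.

```python
# Minimal sketch of traffic shadowing. primary and shadow are
# hypothetical stand-ins for the live service and the candidate version.

mismatches = []   # responses that differ between versions, kept for review

def primary(request):
    return request.upper()                # current production version

def shadow(request):
    return request.upper()                # candidate version under test

def handle(request):
    live_response = primary(request)      # the user only ever sees this
    try:
        candidate = shadow(request)       # the copy; its response is dropped
        if candidate != live_response:
            mismatches.append((request, live_response, candidate))
    except Exception:
        pass                              # a shadow failure must not hurt users
    return live_response


assert handle("ping") == "PING"
assert mismatches == []                   # both versions agreed on this request
```

The key property is that the shadow's response, and even a shadow crash, never reaches the user; only the recorded mismatches are inspected later.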

Traffic shadowing can be used with other deployment techniques like blue-green or canary deployments. After successfully shadowing a production environment, the changes can be rolled out gradually using a canary deployment to gain maximum confidence before a full release.

Shadowing can have unintended consequences, though, so you should exercise caution when deploying this strategy with services that have third-party dependencies. 

A/B Testing

Unlike blue-green and canary deployments, A/B testing focuses on users' perception and experience of new features. It measures whether and how end users interact with those features, gauges whether they are easy to notice and use, and assesses the application's overall functionality. Because features are ultimately implemented in code, this gives developer teams business-level insight they can use to improve their application.

A/B testing divides users into groups that access different features. Group A, for example, sees a different user interface than group B, even though members of both groups make the same requests to the application. Traffic is routed to separate builds, or to different configurations of a common build, based on aspects like the users' operating systems and user agents. The test must be run on a sample that is representative of your end users, and it needs to produce statistically significant results to be valid.
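Group assignment is commonly done by hashing the user together with an experiment name, so each test splits users independently. The sketch below assumes that approach; the variant names and functions are illustrative, not a full experimentation framework.

```python
# Minimal sketch of A/B group assignment. assign_group and render are
# hypothetical names; real frameworks add exposure logging and analysis.
import hashlib

def assign_group(user_id, experiment="new_ui"):
    """Hash user and experiment together so assignment is stable per test."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def render(user_id):
    # Each group sees a different interface for the same request.
    if assign_group(user_id) == "A":
        return "classic interface"
    return "redesigned interface"


# Assignment is deterministic: the same user always sees the same variant.
assert assign_group("alice") == assign_group("alice")
assert render("alice") in {"classic interface", "redesigned interface"}
```

Including the experiment name in the hash means a user's group in one test does not determine their group in another, which keeps concurrent experiments independent.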

This test can be combined with blue-green or canary deployments as they handle the actual feature deployments that this strategy tests. After comparing the versions shown to the groups, the one that has performed better can be pushed to release for all users. 

Building Your Deployment Strategy

Each strategy outlined can be used independently or together in a combination that best suits your goals, workflow and microservices requirements. They allow you to identify and reduce the impact of any vulnerabilities that may only surface at the final stage of your software’s release. Implementing them can be a somewhat complex process, especially with larger architectures and dependencies. But by implementing any of these strategies, your team will release the best possible version of your application. 
