Have you ever been asked by your team or a client about the benefits of automation? Or why they should invest additional resources in it?
Maybe you have been in a situation where a complex architectural change to the system resulted in hours, if not days, of manual regression testing of the affected features?
If situations such as these ring a bell, this story about how automation increased efficiency and saved us days of regression tests is for you.
A bit of background
The product I work on as a Senior Quality Assurance engineer is an order management app for warehouses. It has been live since June 2021 and improves efficiency by drastically reducing the time it takes to manage an order. Recently, we identified the need to split the main flow of our application, which lived in one microservice, into two separate microservices so that we could provide users with new solutions.
There was a lot to do: two months' worth of work for several backend (BE) developers, the introduction of asynchronous message consumption between microservices, and more. And after all of that, we did it.
We were ready to integrate a change that impacted the whole application. You would probably expect a serious workload for the QA and the team: dozens of bugs and many extra days, if not weeks, of fixing and adjusting the solution. However, we avoided all of that!
What we automate and how
Before I reveal how many bugs we had and how long it took us to test the system several times, I would like to talk about our automation processes. We have automated tests at every layer of our application. Unit and integration tests (with other services mocked) are managed by the Java developers. Whenever a new feature or a bug can be tested at the unit level, they write a test for it.
We also have E2E tests that check user flows on the frontend with Cypress. These are written by the frontend developers under the same rule: every new feature needs tests.
In the beginning, we had some issues with the integration of services that use GraphQL and are federated through Apollo. Sometimes a single change in one microservice caused unexpected responses in another, and we detected it quite late: either during testing or even in production.
We decided to add another layer of integration tests, managed by the QA. Their prime directive was to test the federation of microservices and warn us immediately if a new backend change triggered an error somewhere else or required us to adjust the frontend in a way we had not predicted.
Since these are pure API tests, we also use them to test all the negative flows that would take too long to cover with E2E tests. Armed with all of this, we faced the challenge of a big architectural change.
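To make the idea of such API tests more concrete, here is a minimal TypeScript sketch of the kind of check a test might run on a response from a federated gateway. It relies only on the standard GraphQL response shape (`data` plus an optional `errors` array). The function names, field paths, and sample response are hypothetical and made up for illustration.

```typescript
// Standard GraphQL response shape: optional `data` and optional `errors`.
interface GraphQLResponse {
  data?: Record<string, unknown> | null;
  errors?: { message: string; path?: (string | number)[] }[];
}

// Follows a dot-separated path such as "order.warehouse.name".
// Returns undefined as soon as a step cannot be resolved.
function getPath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Hypothetical helper: lists everything that looks wrong in a response,
// either errors reported by the gateway or required fields that are
// missing or null (a typical symptom when one service behind the
// gateway fails and the rest of the data still comes back).
function findFederationProblems(
  response: GraphQLResponse,
  requiredPaths: string[],
): string[] {
  const problems: string[] = [];
  for (const error of response.errors ?? []) {
    problems.push(`error: ${error.message}`);
  }
  for (const path of requiredPaths) {
    const value = getPath(response.data, path);
    if (value === undefined || value === null) {
      problems.push(`missing: ${path}`);
    }
  }
  return problems;
}

// Made-up partial response: the order resolved, the warehouse did not.
const partialResponse: GraphQLResponse = {
  data: { order: { id: "42", warehouse: null } },
  errors: [{ message: "warehouse service unavailable" }],
};

const problems = findFederationProblems(partialResponse, [
  "order.id",
  "order.warehouse.name",
]);
// problems is:
// ["error: warehouse service unavailable", "missing: order.warehouse.name"]
```

A test would then simply assert that the list of problems is empty for every query the frontend depends on. Negative flows can reuse the same helper by asserting that a specific error is present.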
A big (and not-so-time-consuming) reveal
When we first merged to staging the final change that connected the new microservice and, at the same time, replaced part of the old flow with the new one, everything… went well! All the basic user flows worked correctly at first glance. The E2E tests passed, and the API tests, after a small adjustment to accommodate the asynchronous communication between the services, were green as well.
After testing with automation, I manually tested the features at the center of this change, which revealed three minor bugs that were fixed and retested on the very same day. Even when we had to revert the change to unblock another team that was deploying features in the old microservice, and then merge it again, it was not a problem: automation ran the regression in a matter of minutes.
In other words, a big architectural change that took two months to prepare required only a bit more testing effort than a new feature in our system! And we were confident about that change and about the state of the system after it was completed.
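The adjustment itself is not shown here. One common way to make an API test tolerate asynchronous processing is to poll for the expected state instead of asserting immediately after the request. Below is a minimal TypeScript sketch of that idea; `pollUntil`, its options, and the status values are hypothetical, and nothing here claims to match the project's actual code.

```typescript
// Hypothetical polling options.
interface PollOptions {
  attempts: number; // maximum number of reads before giving up
  delayMs: number; // pause between two reads
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Repeatedly calls `read` until `isReady` accepts the value.
// Throws if the condition is still not met after the last attempt.
async function pollUntil<T>(
  read: () => Promise<T>,
  isReady: (value: T) => boolean,
  options: PollOptions,
): Promise<T> {
  let last: T | undefined;
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    last = await read();
    if (isReady(last)) return last;
    // No need to wait after the final attempt.
    if (attempt < options.attempts) await sleep(options.delayMs);
  }
  throw new Error(
    `condition not met after ${options.attempts} attempts; last value: ${JSON.stringify(last)}`,
  );
}
```

In a test, `read` would be the API call that fetches the resource updated by the consuming microservice, and `isReady` would check for the expected state. Keeping `attempts` and `delayMs` small keeps a failing test fast while still leaving room for the message to be consumed.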
The factors that contributed to the success of automation
- Automation covering everything from unit tests, through microservice and FE-BE integration, to E2E flows on the frontend allowed us to test almost the whole system in a matter of minutes.
- Adding tests regularly allowed us to be ready for what came next.
- Every ticket in this architectural change that could be tested as soon as it was done was tested, either by automation or manually by developers or the QA. This way we caught issues before they had time to become problematic.
- The last contributor is…
Communication and teamwork
Why am I highlighting this one in particular, you may wonder? None of what I described would have been possible if our team communication had not been close to perfect. The engineering team regularly informed the business team about our progress.
Backend developers made sure that all members of the engineering team understood the technicalities of the changes; they presented and discussed their solutions weekly during our engineering team’s share meeting.
They listened to and addressed all the doubts, questions, and comments from the business, frontend devs, and QA. As the QA, I communicated the testing strategy and contributed as much as I could throughout the whole process.
I tested each important milestone and made sure I understood how the whole system worked so that I could prepare myself, the API tests, and the team for testing the change. There were many places in which we could have failed; however, we got through all of them by working as one team with one goal.
The future of automation
What will the future bring? I don’t know the exact answer to that question. The microservice separation has made space for new solutions. Whatever they turn out to be, I am confident about the quality of our product, as it is protected by both a great team and a lot of automated tests and processes.