CodeCraft’s POV: Learning From Automation Testing Failures

In the world of software testing, tests sometimes fail! Here's CodeCraft's POV on getting it right and avoiding flaky tests.

CodeCraft
9 min read · Sep 14, 2021

Quality Assurance is a vital part of any successful software development methodology. With the trend turning towards Agile, quick testing feedback, quick deployment and quick deliverables have become a must. You need to research and plan around these SDLC methodologies to facilitate your CI/CD pipeline. CI/CD provides feedback after every check-in, which generally means a faster development process. Faster deployment enables developers to fix any number of bugs in a minimum amount of time, and code is only deployed to the customer once you have ensured that it does not break anything. However, with sites and apps being equipped with increasingly sophisticated features, manual testing becomes a complicated and long-winded task; automation testing makes it a lot easier.

Table of Contents:

1. How to analyse automated software testing failures?

2. How can we avoid flaky tests?

The CI/CD process described above ensured our code was error-free and deployable, and helped us ship better code faster. Let's now focus on the test phase of the pipeline, which provides the most important feedback about a build: all of the automated tests are executed every time a developer checks in new code, whether a new feature or a bug fix. As a tester, I was responsible for this phase of the pipeline and oversaw whether everything was running smoothly. That is not always the case; when the pipeline breaks at the test phase and the build turns red, there is mass panic, and everyone wants to know the reason behind the automation testing failures. For this, I carried out test failure analysis to examine each failed test and figure out what went wrong.

How to analyse automated software testing failures?

Automation testing failures are not a bad thing in themselves; in fact, we write tests precisely so that they fail when there is a weakness in the system. The real problem arises when the time spent investigating test results exceeds the time saved by running automated tests: at that point automation does not improve output quality and is not worth the cost. To reap the benefits of automation testing, it is essential to know how to properly handle the growing volume of test results. A better understanding of results usually creates more transparency within and outside of the testing team.

To achieve effective failure analysis, below are some of the questions every tester needs to ask themselves.

  1. The first question you need to ask yourself is “Did the test fail because of a problem with the software you were testing, or because of a problem with your test?”. After all, before you go telling the developers that their code has a bug, you should make certain the problem was not caused by a test you wrote. To understand whether the problem lies with the software or with your test, you need to find the root cause of the issue.
  2. The second question is “If the failure was caused by a problem with your application’s code, how many builds or configurations are affected?”. If you have multiple test environments, try to replicate the failure in each of them, run the test on multiple devices to see whether it is device-specific, and finally search the automation history for the last time this test passed.
  3. The final question is “How significant is the failure?”. Understand how critical an impact it has on the application. Is it significant enough that you need to delay deployment until it is fixed, or is it a relatively minor issue that doesn’t warrant cancelling a whole deployment?

In general, automation testing failures are caused by an error-prone commit from a developer, a change in the application under test (AUT), or tests that are simply flaky by design. Let’s focus on the latter, as it is the more serious issue that we generally see in our automated software testing reports.

How can we avoid flaky tests?

Software testing projects often fail because of flaky automation tests. To help you avoid these pitfalls, here are some best practices for keeping flakiness out of your suite.

1. Avoid UI testing whenever possible

2. Focus more on automating test scenarios instead of test cases

3. Stop testing multiple things in one script

4. Prerequisites should never be set up using the UI driver approach

5. Stop designing test scripts that are dependent on each other

6. Test scripts thoroughly before committing

7. Avoid excessive use of XPath as a locator

8. Control the controllable

9. Design a good test automation framework

1. Avoid UI testing whenever possible

Everyone is guilty of doing this: we look at a user story and automate its acceptance criteria at the UI layer itself. This is not wrong, but when test scenarios are complex there is a good chance the tests will turn out to be flaky. Instead, we should target assertions at the correct layer. Modern web applications are clearly divided between backend and frontend, and the backend is mostly composed of REST web services or APIs with easily accessible endpoints. The application’s logic can therefore be tested at the API layer as well, instead of always resorting to validating functionality at the UI layer, which is cumbersome at best.
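As a rough sketch of what that looks like in practice (the URL, endpoint and response fields below are hypothetical, not from any real service), the business rule is exercised directly over HTTP instead of through the browser:

```python
import requests

BASE_URL = "https://api.example.com"  # hypothetical backend used for illustration

def test_order_total_is_calculated_correctly():
    # Validate the pricing logic at the API layer instead of driving the
    # whole checkout flow through the UI.
    payload = {"items": [{"sku": "A-100", "qty": 2, "unit_price": 9.99}]}
    response = requests.post(f"{BASE_URL}/orders/quote", json=payload, timeout=10)

    assert response.status_code == 200
    quote = response.json()
    assert quote["total"] == 19.98
```

A check like this runs in milliseconds, has no browser or rendering dependencies, and fails only when the logic itself is wrong.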

2. Focus more on automating test scenarios instead of test cases

When writing test cases for manual testing, testers normally break a scenario into multiple steps, or sometimes into multiple test cases. When designing a test script, target one scenario at a time whenever possible; there need not always be a 1:1 mapping with test cases. This doesn’t mean combining 100 test cases into a single test script; it applies only when you are testing a simple flow and making multiple validations along the way. This also helps with maintaining test scripts as test cases keep growing. Do bear in mind that by automating a test you are not really testing, you are merely checking that the feature in question satisfies some acceptance criteria. You cannot automate testing, but you can automate the checking of known facts.
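A minimal sketch of the idea, assuming a made-up shop site and made-up element IDs: one script walks the whole “search and open a product” scenario and makes its related checks along the way, instead of being split into three separate UI scripts.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_user_can_search_and_open_product():
    # One script covers the whole scenario: search, open a result, verify the detail page.
    driver = webdriver.Chrome()
    try:
        driver.get("https://shop.example.com")  # hypothetical application under test
        driver.find_element(By.ID, "search").send_keys("laptop")
        driver.find_element(By.ID, "search-btn").click()

        results = driver.find_elements(By.CSS_SELECTOR, ".result-item")
        assert results, "search returned no results"

        results[0].click()
        assert "laptop" in driver.find_element(By.ID, "product-title").text.lower()
        assert driver.find_element(By.ID, "add-to-cart").is_displayed()
    finally:
        driver.quit()
```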

3. Stop testing multiple things in one script

This may seem to conflict with the previous point, but what I’m getting at is that we should not pile multiple assertions onto a single UI element. Instead, keep each test script as simple as possible, and do not assert on something that might change tomorrow and break the script. Always remember that most flaky tests are caused by bad assertions in our code.
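As an illustration (the banner element, its id and the site below are made up), contrast a brittle, copy-exact assertion with one that checks only what the feature actually guarantees:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_promo_banner_is_shown():
    driver = webdriver.Chrome()
    try:
        driver.get("https://shop.example.com")  # hypothetical site
        banner = driver.find_element(By.ID, "promo-banner")

        # Brittle: exact marketing copy changes often and breaks the script.
        # assert banner.text == "Welcome to our all-new Summer Sale 2021!"

        # More stable: assert on behaviour the feature actually guarantees.
        assert banner.is_displayed()
        assert "sale" in banner.text.lower()
    finally:
        driver.quit()
```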

4. Prerequisites should never be set up using the UI driver approach

Test cases often have a dependency or precondition that must be met before the test can run. When we automate these kinds of test cases we tend to use a UI-driven approach to satisfy the precondition and then proceed with the test case, failing to note that the behaviour under test is never even executed if the script fails at the precondition stage. To avoid these failures, use APIs to meet such prerequisites whenever possible.
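Here is one way that can look, assuming hypothetical API and app URLs and made-up element IDs: the test user is created through the API, and only the login behaviour under test is exercised through the browser.

```python
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

API_URL = "https://api.example.com"   # hypothetical backend used for setup
APP_URL = "https://app.example.com"   # hypothetical frontend under test

def test_existing_user_can_log_in():
    # Precondition: a registered user must exist. Create it via the API
    # instead of clicking through the sign-up form in the browser.
    requests.post(f"{API_URL}/users",
                  json={"email": "qa@example.com", "password": "s3cret"},
                  timeout=10).raise_for_status()

    # Only the behaviour under test (login) goes through the UI.
    driver = webdriver.Chrome()
    try:
        driver.get(f"{APP_URL}/login")
        driver.find_element(By.ID, "email").send_keys("qa@example.com")
        driver.find_element(By.ID, "password").send_keys("s3cret")
        driver.find_element(By.ID, "login-btn").click()
        assert "dashboard" in driver.current_url
    finally:
        driver.quit()
```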

5. Stop designing test scripts that are dependent on each other

Attempting to execute hundreds of test cases in an exact, predefined order is not a good idea. If your suite of hundreds of tests must run in a certain order and one test case fails, you must run the entire suite again when re-testing, and identifying the error again requires manual inspection. This is obviously very inefficient and works against the benefits that come with test automation: flexibility, agility, and so on. It also defeats a basic principle of testing: each test case should be able to run on its own without depending on other cases, and the order in which cases run should not matter.
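One common way to keep tests independent (sketched here with pytest and a hypothetical orders API) is to have each test create and clean up its own data through a fixture, so any test can run alone or in any order:

```python
import pytest
import requests

API_URL = "https://api.example.com"  # hypothetical backend used for test setup

@pytest.fixture
def order_id():
    # Each test gets its own freshly created order and cleans it up afterwards,
    # so no test relies on another test having run first.
    resp = requests.post(f"{API_URL}/orders", json={"sku": "A-100", "qty": 1}, timeout=10)
    resp.raise_for_status()
    oid = resp.json()["id"]
    yield oid
    requests.delete(f"{API_URL}/orders/{oid}", timeout=10)

def test_order_can_be_fetched(order_id):
    assert requests.get(f"{API_URL}/orders/{order_id}", timeout=10).status_code == 200

def test_order_can_be_cancelled(order_id):
    assert requests.post(f"{API_URL}/orders/{order_id}/cancel", timeout=10).status_code == 200
```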

6. Test scripts thoroughly before committing

Most of the time, after designing a new test case we run it a couple of times and check that it passes. If we see a green check, we move on to automating other test cases. There is a fundamental flaw in this approach: a test case may fail in several different ways, and we have not yet exercised any of those failure scenarios. The system’s behaviour may also differ slightly for a different set of data, which we have not handled either. Therefore, test with multiple combinations of test data before signing off on the test case.
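Parameterising the script over several data combinations is one straightforward way to do this. The sketch below uses pytest’s parametrize with a stand-in shipping-fee rule (the rule and threshold are invented for illustration; in practice the call would go to the application under test):

```python
import pytest

def shipping_fee(order_total):
    # Stand-in for the logic under test; invented rule for illustration only.
    return 0 if order_total >= 50 else 4.99

@pytest.mark.parametrize("total, expected_fee", [
    (0, 4.99),       # degenerate order
    (49.99, 4.99),   # just under the free-shipping threshold
    (50, 0),         # exactly on the boundary
    (120.50, 0),     # well above the threshold
])
def test_shipping_fee(total, expected_fee):
    assert shipping_fee(total) == expected_fee
```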

7. Avoid excessive use of XPath as a locator

Most of the time, developers fail to assign IDs to all the web elements, even though having an ID on every element we interact with makes testing far more effective. So we as testers opt for XPath instead, knowing these XPath lookups are slow and brittle. If the automated test script cannot find these web elements within the prescribed time limit, the test fails, resulting in flaky tests. It is therefore better to ask the development team to add IDs wherever possible.
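The contrast looks roughly like this (element names are made up for illustration): a long positional XPath breaks as soon as the page structure shifts, while a dedicated id survives layout changes and is cheap for the browser to resolve.

```python
from selenium.webdriver.common.by import By

# Fragile: positional XPath tied to the current DOM structure.
SUBMIT_BY_XPATH = (By.XPATH, "/html/body/div[2]/div/form/div[3]/button[1]")

# Robust: a dedicated id (or data-* test hook) added by the development team.
SUBMIT_BY_ID = (By.ID, "submit-order")

# Usage in a test: driver.find_element(*SUBMIT_BY_ID).click()
```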

8. Control the controllable

In the world of software testing, consistency is the mother of quality. When manually verifying automation failures, we may not be able to replicate them every time. It often happens that a test case passes on our local setup but fails as soon as we push it to the CI/CD setup; most of us have faced this at one time or another. This happens when we have not taken machine speed or network speed into account: the machine and network we execute on in CI may be slower than the local machine we developed on. We have to anticipate this behaviour and handle it while developing the scripts by using the waits provided by the Selenium testing tool.
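An explicit wait is the usual Selenium mechanism for this: poll for the element up to a timeout instead of assuming the CI machine is as fast as your workstation. In the sketch below the page URL and CSS selector are hypothetical.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def test_results_load_on_a_slow_ci_machine():
    driver = webdriver.Chrome()
    try:
        driver.get("https://shop.example.com/search?q=laptop")  # hypothetical page

        # Wait up to 15 seconds for the first result instead of failing
        # immediately when CI is slower than the local machine.
        first_result = WebDriverWait(driver, 15).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, ".result-item"))
        )
        assert first_result.is_displayed()
    finally:
        driver.quit()
```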

9. Design a good test automation framework

Automation testing requires the right tools, test automation frameworks, and technical knowledge to yield results. Before building an automation framework, you first need to select the right tool for the project, and for that you need to know whether the application being tested is web-based or mobile-based. To test the former, use Selenium to automate your tests; for the latter, Appium is one of the best possible tools. When creating a test automation framework, we should consider the following main points:

The ability to create automated tests quickly by using appropriate abstraction layers

A meaningful logging and reporting structure

Easy maintainability and extensibility

An error-handling and retry mechanism to rerun failed tests
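To make the abstraction-layer point concrete, here is a minimal page-object sketch (the page, locators and URLs are invented for illustration): tests talk to a small, logged, well-waited API rather than to raw locators scattered across scripts.

```python
import logging
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

log = logging.getLogger("ui-tests")

class LoginPage:
    """Page object: the test suite depends on this abstraction, not on raw locators."""
    EMAIL = (By.ID, "email")
    PASSWORD = (By.ID, "password")
    SUBMIT = (By.ID, "login-btn")

    def __init__(self, driver, base_url):
        self.driver = driver
        self.base_url = base_url

    def login(self, email, password):
        log.info("Logging in as %s", email)  # meaningful logging lives in the framework
        self.driver.get(f"{self.base_url}/login")
        wait = WebDriverWait(self.driver, 15)
        wait.until(EC.visibility_of_element_located(self.EMAIL)).send_keys(email)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()

def test_login():
    driver = webdriver.Chrome()
    try:
        LoginPage(driver, "https://app.example.com").login("qa@example.com", "s3cret")
        assert "dashboard" in driver.current_url
    finally:
        driver.quit()
```

For the retry mechanism, one common option when the suite runs on pytest is a rerun plugin such as pytest-rerunfailures (for example, `pytest --reruns 2`), though any equivalent retry-and-report facility in your framework serves the same purpose.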

Conclusion

Your tests will fail, at least sometimes. Test automation failure analysis is a key pillar of continuous testing. Continuous testing creates a lot of test result data, which inevitably includes failed test cases, and the way you react to those failures plays a pivotal role in shaping the effectiveness of your overall testing strategy. Automation testing done wrong, or without a thought process, is a waste of time and provides no value to anyone. The right failure analysis approach lets you focus on the actual failures that may be a risk to the business rather than on false alarms, and as you mature your DevOps process and expand test automation, smart test reporting becomes critical. The points above are not geared towards any specific testing tool; they can be considered general best practices across any framework, whether it is used for Selenium testing, Appium testing or any other tool. Do this well and you will have no problems maintaining automation scripts as you scale.
