Avoid Frequently Failing Tests With Continuous Integration
Web and mobile applications have become an integral part of our daily lives. But the more frequently software is used, the more likely it is that problems, issues, or bugs will surface. To prevent or limit such behavior, the software industry has increasingly turned to Continuous Integration (CI) systems for complex software development.
The Benefits of Continuous Integration
In short, continuous integration is a practice that tracks small changes: every change to the code is noted, packaged, and tested, which decreases the risk of regression and catches integration problems in new versions early on. The benefits of CI include more frequent and faster feedback from the software development process to developers and stakeholders. Frequent, reliable releases lead to improved product quality and greater customer satisfaction. Additionally, collaboration between development teams is enhanced, and manual tasks can be simplified or eliminated.
At the same time, adopting continuous practices is not a trivial adjustment of organizational processes and habits, and the available tools may not be ready to support their highly complex and challenging nature. This is why complex, multi-layer solutions must be applied at all levels of the software development life cycle (SDLC).
The Pitfalls of Continuous Integration
With CI tools in place, QA professionals can deploy automated tests across multiple instances, runs, and builds, uncovering defects faster and improving regression results. At the same time, the risk of accidental, frequently failing tests increases. These failures may not relate to a specific bug, problem, or issue; they can be due to many external factors, such as misconfiguration of the CI platform, performance issues, a lack of hardware resources, or incompatibility between instances and standards. Software teams sometimes rely too heavily on unit testing and neglect functional and acceptance testing, which can result in working code that misses business requirements.
To avoid frequently failing tests, there are several factors we should consider. In this article, we’ll look in detail at the measures we can take against frequently failing tests caused by bad configuration.
Test Automation
Test automation increases the depth and speed of the quality assurance process. It’s at the core of Agile and a key element of successful development. To automate test cases, we use software frameworks such as JUnit in Java, Robot Framework, and Selenium WebDriver. There are also tools for code inspection like SonarQube, an open-source platform for continuous inspection of code quality that performs automatic reviews with static analysis to detect bugs, code smells, and security vulnerabilities.
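For context, a minimal automated UI check written with JUnit 5 and Selenium WebDriver might look like the sketch below. The URL and expected page title are hypothetical, and a ChromeDriver binary is assumed to be on the PATH:

```java
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.junit.jupiter.api.Assertions.assertEquals;

class LoginPageTest {

    private WebDriver driver;

    @BeforeEach
    void setUp() {
        // Starts a local Chrome instance; requires ChromeDriver on the PATH.
        driver = new ChromeDriver();
    }

    @Test
    void titleIsShownOnTheLoginPage() {
        driver.get("https://example.com/login"); // hypothetical URL
        assertEquals("Log in", driver.getTitle()); // hypothetical expected title
    }

    @AfterEach
    void tearDown() {
        driver.quit();
    }
}
```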
Despite the well-known business advantages of test automation, it’s important to understand that it isn’t a magic wand. Several factors can compromise the integrity of a test automation solution. For example, the automation engineers responsible for deciding whether an organization should adopt test automation often overlook the factors that influence its success.
Why Do Automation Tests Fail?
Here are the most common reasons for test automation failure:
- Lack of experience in designing and/or implementing frameworks and an inability to incorporate improvements
- Lack of a proof of concept (POC) during pilot phases
- Lack of in-depth analysis of test reports or stack trace reports
- Lack of clarity on what to test and what to exclude from tests
- Lack of exception and functionality logs for debugging/stack traces
- Improper handling of site storage, security vulnerabilities, and GDPR requirements
Moreover, an automation solution is complex and raises questions such as:
- Does the organization need test automation?
- What is the optimal degree of automation, and how can it be achieved?
- Are test cases required for all features of the application, or should specific areas be targeted?
There is no one-size-fits-all answer to these questions, as they depend on factors like team, project, and organization. The best solution is one that meets the unique requirements of a company and drives collaboration between development, QA, and business teams. This guarantees transparent and effective communication within the timeframe and delivers the necessary scope and quality of experience for the end user. A lack of effective communication between these teams can lead to critical failures in the process. Thus, companies must first ensure they have identified the “right reasons” for introducing test automation.
Once they have identified the need for automation, they face a number of challenges when it comes to implementation:
- Poorly designed software architecture — a building requires a solid foundation, and so does software: it must be designed with a robust and sound architecture. The inability to integrate key scenarios, address common design issues, mitigate risk, and factor in the implications of long-term key decisions can jeopardize the integrity of any web application. While companies can use modern tools and platforms to simplify the task of building applications, they still need to design applications with the utmost precision and care to ensure all scenarios and requirements are covered. The lack of such care can result in poor architecture, which leads to unstable software that cannot support existing or future business requirements and is challenging to deploy or manage in a production environment.
- Incorrect identification of automation goals — when deciding the type of testing framework to be implemented, it becomes critical to first identify the organization’s automation goals. A framework library must be selected on the basis of its capacity to achieve these objectives. Companies that do not align their goals with the functions of the framework run the risk of incomplete test automation and increased costs.
- Choosing the wrong test automation framework — data-driven and keyword-driven frameworks are two of the most commonly used test automation frameworks. Both have advantages and disadvantages, and organizations must understand the differences in order to choose the most appropriate one for their goals. They have to make sure the architectural approach provides the needed facilities, tool support, and room for customization. An improper evaluation can lead to a negative return on investment and significant release delays.
Jenkins Continuous Integration
There are many continuous integration tools whose purpose is to build continuous delivery mechanisms. Some of the most popular are Buddy, TeamCity, Jenkins, Travis CI, Bamboo, GitLab CI/CD, CircleCI, and CodeShip. In this article, we’ll take a narrower look at the automated Jenkins continuous integration tool.
A free and open-source automation server written in Java, Jenkins helps to automate the non-human parts of the software development process with continuous integration, and it facilitates the technical aspects of continuous delivery. It’s a server-based system that runs in servlet containers such as Apache Tomcat. It supports version control tools including AccuRev, CVS, Subversion, Git, Mercurial, Perforce, TD/OMS, ClearCase, and RTC, and can execute Apache Ant- and Apache Maven-based projects as well as arbitrary shell scripts and Windows batch commands. Jenkins is released under the MIT License.
Builds can be triggered in different ways: by a commit in a version control system, on a schedule via a cron-like mechanism, or by requesting a specific build URL. A build can also be triggered after other builds in the queue have completed. Jenkins’ functionality can be extended with a large ecosystem of plugins.
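As an illustration of the URL trigger, the hedged sketch below queues a build remotely over HTTP. The server address, job name, and token are placeholders; the target job must have remote triggering enabled, and depending on the security configuration, additional authentication or a CSRF crumb may be required:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TriggerJenkinsBuild {
    public static void main(String[] args) throws Exception {
        // Hypothetical Jenkins server, job name, and trigger token.
        URI buildUrl = URI.create(
                "https://jenkins.example.com/job/my-app/build?token=MY_TRIGGER_TOKEN");

        HttpRequest request = HttpRequest.newBuilder(buildUrl)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Jenkins typically answers 201 Created when the build is queued.
        System.out.println("HTTP status: " + response.statusCode());
    }
}
```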
Very often, in integration tests for example, batch tasks running in parallel or in a background thread can also distort test results. If the script runs in a real environment, are there other external requests that might affect the pipeline build? Are there other tests running in parallel? In addition, in asynchronous applications, the execution order can’t always be taken for granted.
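When a suite runs in parallel, one hedged way to protect order-sensitive tests, assuming JUnit 5 with parallel execution enabled via the `junit.jupiter.execution.parallel.enabled=true` property, is to pin a class to a single thread; the class and method names here are hypothetical:

```java
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.parallel.Execution;
import org.junit.jupiter.api.parallel.ExecutionMode;

// With parallel execution enabled, this annotation keeps the whole class
// on a single thread so its tests never race against each other.
@Execution(ExecutionMode.SAME_THREAD)
class OrderSensitiveIntegrationTest {

    @Test
    void writesASharedRecord() {
        // ... touches shared state that parallel tests must not race on
    }

    @Test
    void readsTheSharedRecordBack() {
        // ... touches the same shared state
    }
}
```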
Now let’s walk through the important cases where frequent test failures occur because of missed setup steps in the pipeline build.
- Caching — Caches must be taken into consideration when designing tests. Time manipulation, caching, or stale data can make test results unpredictable: specs may fail frequently because old content was loaded or new content did not load properly. In other cases, once the code has executed a first time, much of the dynamic functionality may be cached, skewing the measured performance of our app. It’s important to measure the first execution separately, and we should consider clearing the main cache in every single build if the application stores cached data after deployment (see the cleanup sketch after this list).
- Cleaning step — A good test should always prepare its expected environment and restore any customized state back to a vanilla one. This is one of the hardest problems to identify, as the offending test isn’t the one that fails; it can affect all the tests that run after it. Most software organizations face this issue, so a cleaning step is important for keeping subsequent tests independent (also covered in the cleanup sketch after this list).
- Dynamic content — When testing user interfaces, we want tests to behave predictably. Sometimes the script has to wait for dynamic content, such as the front end, to load first. Asynchronous data calls can be delayed by infrastructure slowness or by mere milliseconds. In this situation, an explicit wait should be used, or an implicit wait should be configured in our test automation framework, ensuring that tests won’t fail due to delayed dynamic content (see the waits sketch after this list).
- “Time bombs” — We shouldn’t assume that tests will always run in the same time zone the application was designed in. We must measure time intervals accurately, and if a test depends on the current time or on day boundaries (24 hours), we should consider all the special timing cases (see the clock sketch after this list).
- Infrastructure issues — In some cases, frequent failure is not due to the test being flaky. It may be a test framework error, a Selenium WebDriver error, another library, or a changed version of the browser we currently run. Other incidents such as failures on a CI node, network problems, a database disruption, or a misconfigured module can also cause infrastructure issues. We should always check the technologies in use and how they interact with the infrastructure.
- Third-party systems — When integration tests fail in a complex external environment, the cause often lies in third-party systems. It’s good practice to check the correctness of the external systems, where possible, and of every component the system interacts with. There should be tests that confirm the integration with external systems. A good approach is to move these specific tests into a separate package, or to stub out all external systems when checking the integrity of the system itself. These tests are the integration specs related to third-party systems.
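To make the caching and cleaning points concrete, here is a hedged JUnit 5 and Selenium sketch that resets state before each test. The URLs and the cache-clearing endpoint are hypothetical application hooks, not a standard API:

```java
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class CleanStateTest {

    private WebDriver driver;

    @BeforeEach
    void resetToVanillaState() {
        driver = new ChromeDriver();
        // Hypothetical application hook that clears the server-side cache.
        driver.get("https://app.example.com/test-support/clear-cache");
        // Drop client-side session state so stale data can't leak between tests.
        driver.manage().deleteAllCookies();
    }

    @Test
    void showsFreshContentAfterDeploy() {
        driver.get("https://app.example.com/dashboard"); // hypothetical URL
        // ... assertions against freshly loaded content go here
    }

    @AfterEach
    void tearDown() {
        // Quit the browser so the next test starts from scratch.
        driver.quit();
    }
}
```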
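For the dynamic-content case, Selenium WebDriver offers both implicit and explicit waits. The sketch below uses assumed URLs, element IDs, and timeout values:

```java
import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class DynamicContentWaits {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            // Implicit wait: applies to every findElement call on this driver.
            driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(5));

            driver.get("https://app.example.com/reports"); // hypothetical URL

            // Explicit wait: block until the asynchronously loaded widget appears.
            WebElement widget = new WebDriverWait(driver, Duration.ofSeconds(10))
                    .until(ExpectedConditions.visibilityOfElementLocated(
                            By.id("report-widget"))); // hypothetical element id

            System.out.println(widget.getText());
        } finally {
            driver.quit();
        }
    }
}
```

Note that the Selenium documentation advises against mixing implicit and explicit waits in the same session, so in practice most teams standardize on explicit waits.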
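For the “time bombs” case, one common defense, shown here as a minimal assumed example, is to inject java.time.Clock so a test can pin both the instant and the time zone:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Production code asks the injected clock for "now" instead of calling
// LocalDate.now() directly, so tests control both instant and zone.
class DailyReport {
    private final Clock clock;

    DailyReport(Clock clock) {
        this.clock = clock;
    }

    boolean isDueToday(LocalDate deadline) {
        return LocalDate.now(clock).equals(deadline);
    }
}

class DailyReportDemo {
    public static void main(String[] args) {
        // Freeze the clock just before midnight UTC in a non-UTC zone to
        // expose day-boundary "time bombs".
        Clock fixed = Clock.fixed(
                Instant.parse("2023-06-30T23:59:00Z"),
                ZoneId.of("Australia/Sydney")); // already July 1st in Sydney

        DailyReport report = new DailyReport(fixed);
        System.out.println(report.isDueToday(LocalDate.parse("2023-07-01"))); // true
    }
}
```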
Optimizing a Slow Build Process
Large codebases can cause the integration of software components to take considerable time. To determine whether the problem is related to the size or the integration of these components, check how long the compilation step takes. If this step turns out to be time-consuming, perform an incremental build instead of a complete one.
An incremental build only compiles and regenerates the modified files. This can be risky: depending on how it’s done, you may not get all the benefits of CI, and the build could break. An effective CI system is associated with risk mitigation, so ideally the integration environment should be cleaned by removing old files and then compiling/regenerating the code, to reliably determine whether something is broken.
Therefore, use incremental builds only if needed, and explore the other areas that lead to slow builds. Some of them can be optimized. For example, if a Java system depends on a native DLL or a shared object library that rarely changes, it might be wise to rebuild this library only once a day or once per iteration.
In fact, some would argue that such a rarely changing DLL or shared object should be treated as a separate CI project that the main project consumes as a dependency.
Building System Components Separately
In most cases, integration takes a long time because of the time needed to compile the source code and its dependent files. In this situation, we can divide the software into smaller subsystems (modules) and build each of them individually. To build application components separately, we create an independent project for each subsystem in the CI system. When one component changes, the projects that depend on it are rebuilt as well, which speeds up the overall process of building the code.
It’s almost mandatory to avoid brittle dependencies in our CI platform, as every dependency leads to an unstable build and complicated pre-configuration. Taken together, these steps (splitting the system into independently built subsystems, rebuilding only dependent projects, and keeping the integration environment clean) steady the CI integration build and guarantee a smooth, hassle-free implementation.
Conclusion
With the advent of Continuous Integration in software development, the bugs encountered in production environments are limited, but problems with building and running tests are compounded by many dependencies and configuration issues. We provided a series of solutions that facilitate the transition to CI, and specifically to the Jenkins CI build tool, and we demonstrated how to consolidate and avoid flaky, unstable tests. With these steps in place, problems that occur in builds via CI tools can be minimized, and the pipeline of different tests will run smoothly. By arranging the build script into steps, we ensure that tests run in sync and sorted by type, which produces better test plan reports for the iteration cycle. This approach covers everything needed to facilitate the process of Continuous Integration and the running of automated tests.
Original post can be found here.
Authored by Denislav Lefterov:
Denislav’s professional journey in the field of software testing started when he was still a university student. Quality assurance (QA) is a vast area with innovations coming up every year and, hence, numerous opportunities for growth, he says. In search of new horizons, his path crossed ours in 2018.
He joined MentorMate’s QA team where he’s in charge of automation in a healthcare insurance project — an extremely dynamic one which uses the latest technologies in software development. In the future, Denislav plans on acquiring new skills in artificial intelligence (AI) as a component of software testing as he firmly believes that software will increasingly rely on AI and other innovative technologies.
In addition to his job as a QA specialist, he’s engaged with social activities focused on helping children with special educational needs (SEN). He’s developed a platform with useful materials on the matter, and in 2013, he received an award from the Bulgarian Association of Information Technologies for his contribution towards children with SEN.