Understanding the Value of Code Coverage in Software Testing
Chapter 1: The Importance of Code Coverage
Several organizations and governmental bodies set minimum standards for code coverage, but is this truly beneficial?
Photo by Markus Spiske on Unsplash
A few years back, a colleague recounted his experience at a company that was developing a product. They had an opportunity to license it to a government entity, which was an almost perfect match—except for one critical metric: code coverage. While I can’t recall the exact percentage, let’s assume it was 80%, and that particular agency mandated that as a minimum requirement. The challenge was that the software had nearly zero code coverage.
Interestingly, the product was working well and had plenty of real users. Despite the absence of automated tests, the code quality and the product itself were excellent. That was not enough for the government agency, however, which would only accept numerical proof produced by a test suite. So the team set out to raise the code coverage to the required level.
How does one enhance the code coverage of an existing application? The answer is straightforward: by adding tests that execute the code.
He shared how they approached the situation. They began incorporating tests for known functionalities, and gradually, the code coverage increased. However, after several weeks, they realized this method was too slow. They pivoted their strategy to analyze coverage reports, targeting significant uncovered sections of code, and created tests for those areas. While this accelerated the process, it remained sluggish. Over time, the quality of the tests deteriorated, losing both focus and effectiveness. By the final weeks, the tests became nearly worthless, yet they fulfilled the requirement of executing the code, thereby increasing the coverage. After months of effort, they finally achieved the required ratio.
He described this experience as the most frustrating of his career. Initially, when he began to share his story, I thought to myself, "Wow, a government agency that values testing—how commendable!" But as he continued to explain the ordeal they faced to meet the coverage metric, I became alarmed. I recognized how misguided that requirement was. Yes, the application now had tests, but they were ineffective.
Writing tests after the code has been developed can be perilous. Robert C. Martin often points out that writing tests after the fact feels wrong, and he is right. The typical sentiment is, "Why am I writing an automated test for something I know works, since I just tested it manually?" As with my colleague, the whole exercise can feel futile, and that feeling is understandable. But the primary purpose of a test is to prevent functionality from breaking unnoticed in the future, which means we should be testing functionality rather than the code itself, and that is hard to do well once the code is already written.
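To make the distinction concrete, here is a minimal sketch in Python (pytest-style; the apply_discount function and its 10% rule are invented for illustration). The tests pin down the functionality, the observable behavior, rather than the particular way the code is written, so they keep their value through later refactorings.

```python
# A hypothetical piece of production code (invented for this example).
def apply_discount(price: float, customer_is_member: bool) -> float:
    """Members get 10% off; everyone else pays full price."""
    if customer_is_member:
        return round(price * 0.9, 2)
    return price


# Functionality-focused tests: they state the business rule and check the
# observable result, so they keep protecting the behavior even if the
# implementation is refactored later.
def test_members_get_ten_percent_off():
    assert apply_discount(100.0, customer_is_member=True) == 90.0


def test_non_members_pay_full_price():
    assert apply_discount(100.0, customer_is_member=False) == 100.0
```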
Moreover, the test code deserves as much scrutiny as the production code. The goal of a test is to fail when the functionality breaks; if we never see it fail, we have no evidence that the test itself works. Test code can harbor bugs too, and tests written after the code often contain many of them.
A buggy test is worthless: it will not fail when the functionality breaks, which is its entire reason for existing. It still counts toward code coverage, though. If a minimum coverage target exists, it is possible to churn out tests quickly until the target is met. If SonarQube complains that coverage is too low, one can always produce a unit test laden with mocks to satisfy it. Such tests accomplish little beyond satisfying the metric.
I have encountered numerous worthless tests, to the extent that I have videos demonstrating how I could eliminate the functionality code while the test still passes.
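To show what those worthless tests tend to look like, here is a hedged sketch (Python with unittest.mock; OrderService and its payment gateway are invented names, not anyone's real code). The test executes every line of place_order, so it raises the coverage number, yet it asserts nothing: you could delete the method body and it would still pass.

```python
from unittest.mock import MagicMock


class OrderService:
    """A made-up service whose only job here is to be 'covered' by a test."""

    def __init__(self, payment_gateway):
        self.payment_gateway = payment_gateway

    def place_order(self, amount):
        # The actual business logic. You could delete these two lines and
        # the test below would still pass.
        self.payment_gateway.charge(amount)
        return "confirmed"


def test_place_order():
    # A coverage-only test: every line of place_order executes, so the
    # coverage number goes up, but nothing about the behavior is checked.
    service = OrderService(MagicMock())
    service.place_order(100)  # no assertion on the result or on the charge
```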
Our objective should be to cover functionalities, not just the code. Code coverage merely indicates that the code has been executed, not that it performed as expected. While we may hope that mandating code coverage will encourage programmers to create tests, without adequate practice, they may only become adept at producing flawed tests that superficially enhance coverage without addressing the functionalities.
That said, code coverage is not entirely without merit. It serves a specific purpose that can aid in learning testing and avoiding common mistakes. If we create tests before writing the code and then develop the minimal code necessary to pass those tests, we can achieve close to 100% code coverage. Should we write excessive code or mismanage a refactor, the coverage will decrease. Thus, it acts as a useful indicator of when unnecessary code has been introduced.
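As a rough illustration of that test-first flow (pytest-style again, with an invented slugify function): the test exists before the code, the code is only what the test demands, and any extra branch we sneak in later shows up as uncovered.

```python
# Step 1: the test is written first. It fails, because slugify does not exist yet.
def test_slugify_lowercases_and_joins_words_with_dashes():
    assert slugify("Hello World") == "hello-world"


# Step 2: the minimal code needed to make the test pass. Every line here
# exists because a test asked for it, so coverage is naturally close to 100%.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

# If we later added an extra branch that no test asks for, the coverage
# report would show it as unexecuted, a hint that we wrote code nobody needs yet.
```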
Another significant aspect is that programmers often possess an innate sense for identifying missing features or edge cases. While they may also tend to over-engineer solutions, code coverage can help reveal those new scenarios that the business may not have considered. This presents an opportunity to discuss these new cases and determine the appropriate course of action.
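A small, hypothetical example of how that plays out (the shipping rule and its numbers are made up): the programmer adds a guard for a case the requirements never mention, no test reaches it, and the coverage report flags the branch, which is exactly the prompt to bring the question back to the business.

```python
def shipping_cost(weight_kg: float) -> float:
    if weight_kg < 0:
        # An edge case the requirements never mention. No test reaches this
        # line, so the coverage report highlights it, which is the cue to ask
        # the business what should actually happen for invalid weights.
        raise ValueError("weight cannot be negative")
    # The specified rule: a flat rate up to 1 kg, then a per-kilo surcharge.
    return 5.0 if weight_kg <= 1 else 5.0 + 2.0 * (weight_kg - 1)


# The only behaviors the business asked for, and therefore the only tests:
def test_small_parcel_pays_the_flat_rate():
    assert shipping_cost(0.5) == 5.0


def test_heavy_parcel_pays_per_extra_kilo():
    assert shipping_cost(3.0) == 9.0
```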
So, is code coverage pointless? Not at all. Should management impose minimum coverage requirements? No. It should serve solely as a tool for developers: a way to spot over-engineering and to surface the cases nobody has decided on yet.
Chapter 2: Improving Your Code Coverage
This chapter points to a talk on raising code coverage without sacrificing effective testing practices. The video, "Improving Your Code Coverage," offers practical ways to strengthen your testing strategy while keeping quality in view.
Chapter 3: The Pitfalls of Pursuing 100% Code Coverage
This chapter looks at the drawbacks of chasing perfect coverage metrics. The video, "100% Code Coverage Is Useless," examines what goes wrong when teams focus on the coverage percentage without ensuring that the tests themselves are effective.