I've been reading a lot lately about TDD, a topic that plenty of enthusiasts in the dev community keep coming back to. However, few people talk about how valid those tests really are, how code coverage can be a false safety net, and what we can do about it.

Straight to the Showcase

Imagine you have a UserService with a single method that fetches a User by the given id from storage via IUserRepository and checks whether that user is an adult. The criteria for adulthood vary worldwide; for this example, let's stick to Croatian law, under which anyone 18 or older is considered an adult.

public class UserService
{
    private readonly IUserRepository _userRepository;

    public UserService(IUserRepository userRepository)
    {
        _userRepository = userRepository;
    }

    public bool IsUserAdult(int id)
    {
        User? user = _userRepository.GetUser(id);

        if (user == null)
        {
            throw new UserNotFoundException();
        }

        return user.Age >= 18;
    }
}
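
The service above depends on three types that aren't shown in this post. Here is a minimal sketch of what they might look like, so the example compiles on its own. Only User.Age, IUserRepository.GetUser and the parameterless UserNotFoundException constructor are taken from the code above; the exception message and the class shapes are assumptions made for illustration.

```csharp
#nullable enable
using System;

// Hypothetical domain model: the example only ever reads Age.
public class User
{
    public int Age { get; set; }
}

// Hypothetical storage abstraction: returns null when no user has the given id.
public interface IUserRepository
{
    User? GetUser(int id);
}

// Hypothetical exception thrown by UserService when the lookup returns null.
public class UserNotFoundException : Exception
{
    public UserNotFoundException()
        : base("User was not found.") // message text is an assumption
    {
    }
}
```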

So far so good, right? Now, let's write a few unit tests to push the test coverage all the way up to 100% for our use case.

using Moq;
using Xunit;

public class UserServiceTests
{
    private readonly Mock<IUserRepository> _userRepositoryMock;
    private readonly UserService _userService;

    public UserServiceTests()
    {
        _userRepositoryMock = new Mock<IUserRepository>();
        _userService = new UserService(_userRepositoryMock.Object);
    }

    [Fact]
    public void IsUserAdult_WhenUserIsAdult_ReturnsTrue()
    {
        _userRepositoryMock
            .Setup(r => r.GetUser(It.IsAny<int>()))
            .Returns(new User { Age = 20 });

        var result = _userService.IsUserAdult(1);

        Assert.True(result);
    }

    [Fact]
    public void IsUserAdult_WhenUserIsNull_ThrowsUserNotFoundException()
    {
        _userRepositoryMock
            .Setup(r => r.GetUser(It.IsAny<int>()))
            .Returns(null as User);

        Assert.Throws<UserNotFoundException>(
            () => _userService.IsUserAdult(1));
    }
}

To test our UserService, we used a combination of the xUnit and Moq libraries. We have two test methods: the first asserts that a user older than 18 is reported as an adult, and the second makes sure that UserNotFoundException is thrown when the user does not exist in the underlying data storage.

To check the code coverage, we'll use yet another library called Coverlet. After adding it to the test project (the /p:CollectCoverage switch below relies on the coverlet.msbuild package), all we have to do is run the following command:

dotnet test /p:CollectCoverage=true /p:CoverletOutputFormat=opencover

And the results show that our tests went through every single line of code in our UserService.

Starting test execution, please wait...
A total of 1 test files matched the specified pattern.

Passed!  - Failed:     0, Passed:     2, Skipped:     0, Total:     2, Duration: 4 ms - Dotnet.Stryker.Test.dll (net8.0)

Calculating coverage result...
  Generating report '...\coverage.opencover.xml'

+----------------+------+--------+--------+
| Module         | Line | Branch | Method |
+----------------+------+--------+--------+
| Tests          | 100% | 100%   | 100%   |
+----------------+------+--------+--------+

+---------+------+--------+--------+
|         | Line | Branch | Method |
+---------+------+--------+--------+
| Total   | 100% | 100%   | 100%   |
+---------+------+--------+--------+
| Average | 100% | 100%   | 100%   |
+---------+------+--------+--------+

With the addition of a report generator tool, we can also get a neat visual representation of it.

dotnet tool install -g dotnet-reportgenerator-globaltool
reportgenerator -reports:TestResults/coverage.opencover.xml -targetdir:coveragereport -reporttypes:Html

Yay, we wrote unit tests, and our code coverage is 100%! Our code is now verified and bug-free!

WRONG!

This is a super simple scenario, so it's easy to spot that we never checked what happens when a user is under 18 or exactly 18 years old. However, in a real-world application of a much larger scale, with numerous use cases and flows, it's nearly impossible to spot every missed case by eye. Coverage only tells us which lines and branches were executed, not whether their results were meaningfully asserted, so relying purely on it to claim that your code is bug-free is not a wise idea. So... what can we do about it?

Mutation Testing to the Rescue!

What's mutation testing? It's a form of software testing where statements in the source code are deliberately modified, or "mutated", to check whether the existing test cases detect the change. The goal is to evaluate the quality of the existing tests and their ability to catch bugs.

In mutation testing, a small change is made to the program's source code, producing a "mutant", and then the test suite is run again. If at least one test fails (which is the desired outcome), the mutant is "killed." If every test still passes, the mutant "survives." The quality of the test suite can then be evaluated by the percentage of mutants it is able to kill.
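
To make the killed/survived distinction concrete, here is a small hand-written sketch. It is not how any tool works internally: the mutant below (the `>=` in `user.Age >= 18` swapped for `>`) is one plausible mutation picked for illustration, and MutationDemo, Original, Mutant and Kills are hypothetical names.

```csharp
using System;

public static class MutationDemo
{
    // Original condition from UserService.IsUserAdult.
    public static bool Original(int age) => age >= 18;

    // Hand-written mutant: ">=" replaced with ">" (a boundary mutation).
    public static bool Mutant(int age) => age > 18;

    // A "test" is modeled as an input age plus the expected result.
    // The mutant is killed when it returns something other than the expected value.
    public static bool Kills(int age, bool expected) => Mutant(age) != expected;

    public static void Main()
    {
        // Age 20: the mutant still returns true, so the test passes and the mutant survives.
        Console.WriteLine($"age 20 kills mutant: {Kills(20, true)}"); // False

        // Age 18: the mutant returns false where true is expected, so the mutant is killed.
        Console.WriteLine($"age 18 kills mutant: {Kills(18, true)}"); // True
    }
}
```

A test that only uses age 20 cannot tell the original and the mutant apart, which is exactly the kind of gap line coverage doesn't reveal.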

Stryker.NET is a mutation testing framework for .NET applications. It works by altering your source code, running your tests, and reporting on the effectiveness of your tests. It helps developers to find the parts of their code that are not adequately tested, and therefore, might contain hidden bugs. Stryker.NET offers support for a wide range of testing frameworks and is actively maintained and updated, providing a powerful and flexible tool for mutation testing in the .NET ecosystem.

Now, let's check how Stryker.NET behaves when we run it against our test cases.

Stryker.NET Execution

Stryker, like the report generator we used earlier, can be installed as a global dotnet tool. To do so, all you have to do is run the following command:

dotnet tool install -g dotnet-stryker

Now we can move to the tests folder and let Stryker do its job against our test suite.

dotnet stryker

   _____ _              _               _   _ ______ _______  
  / ____| |            | |             | \ | |  ____|__   __| 
 | (___ | |_ _ __ _   _| | _____ _ __  |  \| | |__     | |    
  \___ \| __| '__| | | | |/ / _ \ '__| | . ` |  __|    | |    
  ____) | |_| |  | |_| |   <  __/ |    | |\  | |____   | |    
 |_____/ \__|_|   \__, |_|\_\___|_| (_)|_| \_|______|  |_|    
                   __/ |                                      
                  |___/                                       


Version: 3.10.0

[02:04:29 INF] Analysis starting.
[02:04:32 INF] Found project ...\git\dotnet-stryker\Dotnet.Stryker\Dotnet.Stryker.csproj to mutate.
[02:04:32 INF] Analysis complete.
[02:04:32 INF] Building test project ...\git\dotnet-stryker\Dotnet.Stryker.Test\Dotnet.Stryker.Test.csproj (1/1)
[02:04:38 INF] Number of tests found: 2 for project ...\git\dotnet-stryker\Dotnet.Stryker\Dotnet.Stryker.csproj. Initial test run started.
[02:04:41 INF] 7 mutants created
[02:04:41 INF] Capture mutant coverage using 'CoverageBasedTest' mode.
[02:04:42 INF] 2     mutants got status Ignored.      Reason: Removed by block already covered filter
[02:04:42 INF] 2     total mutants are skipped for the above mentioned reasons
█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████100,00% │ Testing mutant 5 / 5 │ K 4 │ S 1 │ T 0 │ ~0m 00s │                                                                                                 00:00:06

Killed:   4
Survived: 1
Timeout:  0

Your html report has been generated at:
...\git\dotnet-stryker\Dotnet.Stryker.Test\StrykerOutput\2023-07-31.02-04-28\reports\mutation-report.html
You can open it in your browser of choice.
[02:04:48 INF] Time Elapsed 00:00:19.5833809
[02:04:48 INF] The final mutation score is 80.00 %

From the report, we can see that one of the five tested mutants survived, which means our test suite lets at least one change in behavior go undetected.
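
The 80.00 % in the last log line follows directly from the counters printed above it. A minimal sketch of the arithmetic, assuming that timeouts count as detected alongside killed mutants and that ignored mutants are left out of the calculation (MutationScore and Calculate are hypothetical names, not part of Stryker):

```csharp
using System;

public static class MutationScore
{
    // Assumption: detected = killed + timeout; ignored mutants are not counted at all.
    public static double Calculate(int killed, int survived, int timeout)
    {
        int detected = killed + timeout;
        int total = detected + survived;
        return total == 0 ? 0.0 : 100.0 * detected / total;
    }

    public static void Main()
    {
        // Numbers from the run above: Killed 4, Survived 1, Timeout 0.
        Console.WriteLine(Calculate(4, 1, 0)); // 80
    }
}
```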

Let's take a look at the detailed mutation-report.html file that was generated for us.

From this visual report, we can see that 4 of the mutants Stryker came up with were killed by our tests, 2 were ignored (the log above attributes this to the "block already covered" filter), and one survived. From the underlined code it's easy to determine that not all relevant age values are covered in our test suite. We already knew that, but it confirms that Stryker is doing its job, and it will do the same for far more complex scenarios in a real application.

Now, let's enhance our test case and see how it will affect Stryker's report.

Enhancing the Test Suite

We don't have to do much here. By changing the first test method from a Fact to a Theory, which accepts test parameters for different scenarios, we can cover the under 18, exactly 18, and over 18 cases. It would look like this:

[Theory]
[InlineData(16, false)]
[InlineData(18, true)]
[InlineData(20, true)]
public void IsUserAdult_ForGivenAge_ReturnsExpectedResult(int age, bool expectedResult)
{
    _userRepositoryMock
        .Setup(r => r.GetUser(It.IsAny<int>()))
        .Returns(new User { Age = age });

    var isAdult = _userService.IsUserAdult(1);

    Assert.Equal(expectedResult, isAdult);
}

Re-running Stryker.NET afterward gives the following results:

dotnet stryker

[03:27:37 INF] Analysis starting.
[03:27:41 INF] Found project ...\git\dotnet-stryker\Dotnet.Stryker\Dotnet.Stryker.csproj to mutate.
[03:27:41 INF] Analysis complete.
[03:27:41 INF] Building test project ...\git\dotnet-stryker\Dotnet.Stryker.Test\Dotnet.Stryker.Test.csproj (1/1)
[03:27:47 INF] Number of tests found: 4 for project ...\git\dotnet-stryker\Dotnet.Stryker\Dotnet.Stryker.csproj. Initial test run started.
[03:27:49 INF] 7 mutants created
[03:27:49 INF] Capture mutant coverage using 'CoverageBasedTest' mode.
[03:27:51 INF] 2     mutants got status Ignored.      Reason: Removed by block already covered filter
[03:27:51 INF] 2     total mutants are skipped for the above mentioned reasons
[03:27:51 INF] 5     total mutants will be tested
█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████100,00% │ Testing mutant 5 / 5 │ K 5 │ S 0 │ T 0 │ ~0m 00s │                                                                                                 00:00:05

Killed:   5
Survived: 0
Timeout:  0

Your html report has been generated at:
...\git\dotnet-stryker\Dotnet.Stryker.Test\StrykerOutput\2023-07-31.03-27-36\reports\mutation-report.html
You can open it in your browser of choice.

No mutants survive our test suite now: all five tested mutants were killed, which means we've filled the holes in our test cases.

Conclusion

While striving for high code coverage is important in testing, it should not be the only factor considered. As illustrated in our simple scenario, even with 100% code coverage, it was possible to overlook certain critical tests. The concept of mutation testing, as demonstrated with Stryker.NET, helps expose this fallacy by altering the source code and examining whether existing tests can identify these modifications as defects. This method adds another layer of robustness to our test suite, further increasing our confidence in the tested code.

It's crucial to remember that achieving 100% code coverage doesn't guarantee a bug-free application. Rather, it should be viewed as a measure of the code areas that have been exercised by tests. Therefore, we should complement it with other techniques like mutation testing that verify the effectiveness of our tests.

Lastly, incorporating testing as an integral part of the development process and using powerful tools like xUnit, Moq, Coverlet, and Stryker.NET can significantly enhance the reliability of our applications. They can help us identify untested or weakly tested parts of our codebase and consequently, lead us towards more resilient, maintainable, and bug-resistant software.

Hopefully, you learned something from this one. Is there something I missed you'd like to add? Let me know in the comments.