Quis custodiet ipsos custodes? This Latin phrase from the Roman poet Juvenal means "Who will guard the guards themselves?" It's a good question. Is the law truly equal for everyone? Do the guards apply the law correctly?

In software development, the guards are the tests. Tests must ensure that our code meets the business requirements, meaning that it does what it is supposed to do. Tests are also the guardians that must warn us when a modification breaks something.

But how can we ensure that we have enough tests? We can measure code coverage:
In software engineering, code coverage, also called test coverage, is a percentage measure that indicates the degree to which the source code of a program is executed when executing a given set of tests.

As can be seen from the definition, code coverage only measures the percentage of lines that have been executed, not the quality of the tests. I have seen many projects with almost 100% code coverage, but most of the tests were irrelevant or not very useful.

Quantity and quality are different things. But how can we detect the quality of our tests? And who controls whether our tests (the guards) perform their task well? Mutation Testing helps us with this task.
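To make the gap between coverage and quality concrete, here is a minimal sketch. The applyDiscount method and both "tests" are hypothetical and written as plain methods rather than with a test framework; the point is only that the first test reaches full line coverage without verifying anything.

```java
// Illustrative sketch: applyDiscount and both "tests" are hypothetical names.
final class CoverageVersusQuality {

    // Code under test: members get a fixed discount of 10.
    static int applyDiscount(int price, boolean member) {
        if (member) {
            return price - 10;
        }
        return price;
    }

    // Executes every line of applyDiscount (100% line coverage) but verifies
    // nothing, so it would still pass if the discount logic were wrong.
    static boolean coverageOnlyTestPasses() {
        applyDiscount(100, true);
        applyDiscount(100, false);
        return true;
    }

    // Same coverage, but the results are actually checked.
    static boolean assertingTestPasses() {
        return applyDiscount(100, true) == 90
                && applyDiscount(100, false) == 100;
    }
}
```

Both tests report the same coverage, yet only the second one would notice a change in the discount logic.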

What is mutation testing?

The idea of mutation testing is simple: modify the code covered by tests and check whether the existing test suite detects and rejects the modifications. It is used to design new tests and to evaluate the quality of existing tests. What are the tests with poor quality?

Underlying assumptions

Mutation testing is based on two ideas:

Basic concepts

Mutation operators/mutators

A mutator is the operation applied to the original code. Basic examples include changing a '>' operator to a '<', substituting '&&' operators for '||', and substituting other mathematical operators.
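As a rough illustration of the operators mentioned above, the sketch below writes the mutants by hand next to the original. The shipping rule and all method names are hypothetical; a real tool generates such variants automatically.

```java
// Illustrative sketch: the rule and the method names are hypothetical.
final class MutatorSketch {

    // Original rule: free shipping for members whose order total is above 100.
    static boolean original(boolean member, int total) {
        return member && total > 100;
    }

    // Mutant 1: relational operator '>' replaced with '<'.
    static boolean relationalMutant(boolean member, int total) {
        return member && total < 100;
    }

    // Mutant 2: logical operator '&&' replaced with '||'.
    static boolean logicalMutant(boolean member, int total) {
        return member || total > 100;
    }
}
```

For a member with a total of 150, the original returns true and the first mutant returns false; for a member with a total of 50, the original returns false and the second mutant returns true. A test using either input can tell the mutant apart from the original.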

Mutants

A mutant is the result of applying the mutator to an entity. A mutant is a modification of the code at runtime that will be used during the execution of the test suite.

Killed/surviving mutations

When the set of tests is executed against the mutated code, there are two possible outcomes for each mutant: the mutant has died or has survived. A killed mutant means that at least one test has failed as a result of the mutation. A mutant that has survived means that our test suite has not detected the mutation and, therefore, needs to be improved.
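The difference between a killed and a surviving mutant can be sketched as follows. This is a simplified model under stated assumptions: the age rule, the hand-written mutant, and the two "suites" are hypothetical, and a suite is modelled as a method that returns true when all of its checks pass.

```java
import java.util.function.IntPredicate;

// Illustrative sketch: the rule, the mutant, and both suites are hypothetical.
final class KilledOrSurvived {

    // Original code: adulthood starts at 18, inclusive.
    static final IntPredicate ORIGINAL = age -> age >= 18;

    // Hand-written mutant: '>=' changed to '>'.
    static final IntPredicate MUTANT = age -> age > 18;

    // Weak suite: never checks the boundary value 18.
    static boolean weakSuitePasses(IntPredicate isAdult) {
        return isAdult.test(30) && !isAdult.test(5);
    }

    // Stronger suite: adds the boundary case.
    static boolean strongSuitePasses(IntPredicate isAdult) {
        return weakSuitePasses(isAdult) && isAdult.test(18);
    }

    // A mutant survives when the suite still passes against it.
    static String verdict(boolean suitePassesOnMutant) {
        return suitePassesOnMutant ? "survived" : "killed";
    }
}
```

Against the weak suite the mutant survives, because no check exercises the value 18. Adding the boundary case makes one check fail on the mutant, so it is killed.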

How does mutation testing work?

Since mutation testing validates the quality of our tests, before launching the mutation tests, the tests are executed and must pass. Otherwise, you cannot proceed.

If all tests pass correctly, the mutation testing library begins to create mutations. For each mutation, all tests are executed.

Let's look at an example of 10 tests for which 5 mutants can be created.

  1. All tests are executed. If all pass, proceed to step 2.
  2. The first mutant is created, and all tests are executed again. As we see in the following image, the third test fails. This means the mutant was detected and killed.
[Image: table of tests and results for the first mutant; the third test fails]

👍 If our tests fail after the mutation, then we can say that the mutation was detected and killed.

  3. The second mutant is created, and all tests are executed again. This time all tests pass without failing, and the mutant was not detected. Therefore, the mutant survived.
[Image: table of tests and results for the second mutant; all tests pass and the result is "survived"]
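The steps above can be sketched as a small loop. This is not how any particular library is implemented: the class, the way tests are modelled (as predicates over the implementation), and the two hand-written mutants are assumptions made for illustration. It needs Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.Predicate;

// Illustrative sketch: all names here are hypothetical.
final class MutationRunSketch {

    // A "test" is a predicate that returns true when the implementation passes it.
    static boolean allPass(List<Predicate<IntFunction<String>>> tests,
                           IntFunction<String> impl) {
        for (Predicate<IntFunction<String>> test : tests) {
            if (!test.test(impl)) {
                return false;
            }
        }
        return true;
    }

    static List<String> run(IntFunction<String> original,
                            List<IntFunction<String>> mutants,
                            List<Predicate<IntFunction<String>>> tests) {
        // Step 1: the suite must pass on the unmodified code.
        if (!allPass(tests, original)) {
            throw new IllegalStateException("Tests must pass before mutating");
        }
        // Following steps: run the whole suite once per mutant.
        List<String> results = new ArrayList<>();
        for (IntFunction<String> mutant : mutants) {
            results.add(allPass(tests, mutant) ? "SURVIVED" : "KILLED");
        }
        return results;
    }

    static List<String> demo() {
        IntFunction<String> original = n -> n % 3 == 0 ? "Fizz" : String.valueOf(n);
        List<IntFunction<String>> mutants = List.of(
                n -> n % 3 != 0 ? "Fizz" : String.valueOf(n),   // '==' changed to '!='
                n -> n % 3 <= 0 ? "Fizz" : String.valueOf(n));  // '==' changed to '<='
        List<Predicate<IntFunction<String>>> tests = List.of(
                impl -> impl.apply(3).equals("Fizz"),
                impl -> impl.apply(1).equals("1"));
        return run(original, mutants, tests);
    }
}
```

In demo(), the first mutant is killed because the check for 3 fails. The second survives: it only differs from the original for negative inputs, which neither test exercises.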

The quality of the tests is measured based on the percentage of killed mutations. Mutation tests check if the tests are effective.
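As a minimal sketch of that percentage, assuming the score is simply killed mutants divided by all generated mutants (real tools may round the value or leave some mutants out of the total), the calculation could look like this. The class and method names are hypothetical.

```java
// Illustrative sketch: a plain killed/total percentage, not a tool's exact formula.
final class MutationScore {

    static double percent(int killed, int total) {
        if (total <= 0 || killed < 0 || killed > total) {
            throw new IllegalArgumentException("expected 0 <= killed <= total and total > 0");
        }
        return 100.0 * killed / total;
    }
}
```

For the example with 5 mutants, killing 4 of them would give a score of 80%.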

This is the list of automated tools for Mutation Testing:

Mutation testing example with Fizz Buzz Kata

Let's see it with an example of a kata called Fizz Buzz. The requirements of the kata are simple:

Here is our code for FizzBuzz.java:

public class FizzBuzz {

   public String convert(int number) {
       if (isDivisibleBy(3, number)) {
           return "Fizz";
       }

       if (isDivisibleBy(5, number)) {
           return "Buzz";
       }

       if (isDivisibleBy(15, number)) {
           return "Fizz";
       }

       return String.valueOf(number);
   }

   private boolean isDivisibleBy(int divisor, int number) {
       return number % divisor == 0;
   }
}

And this is our FizzBuzzTest.java:

import static org.assertj.core.api.Assertions.assertThat;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import org.junit.jupiter.params.provider.ValueSource;

// The imports assume JUnit 5 (with junit-jupiter-params) and AssertJ on the test classpath.
class FizzBuzzTest {

   private FizzBuzz fizzBuzz;

   @BeforeEach
   void setUp() {
       this.fizzBuzz = new FizzBuzz();
   }

   @ParameterizedTest
   @CsvSource({"1,1", "2,2", "4,4"})
   void convert_regular_number_to_string(int input, String expected) {
       String actual = fizzBuzz.convert(input);

       assertThat(actual).isEqualTo(expected);
   }

   @ParameterizedTest
   @ValueSource(ints = {3, 6, 9})
   void convert_numbers_divisible_by_3_and_not_divisible_by_5_to_Fizz(int input) {
       String actual = fizzBuzz.convert(input);

       assertThat(actual).isEqualTo("Fizz");
   }

   @ParameterizedTest
   @ValueSource(ints = {5, 10, 20})
   void convert_numbers_divisible_by_5_and_not_divisible_by_3_to_Buzz(int input) {
       String actual = fizzBuzz.convert(input);

       assertThat(actual).isEqualTo("Buzz");
   }

   @ParameterizedTest
   @ValueSource(ints = {15, 30, 45})
   void convert_numbers_divisible_by_15_to_Fizz(int input) {
       String actual = fizzBuzz.convert(input);

       assertThat(actual).isEqualTo("Fizz");
   }
}

I'm using Pitest with Maven in this example, but if you want to use it with Gradle, you can follow this tutorial: Gradle quick start.

Reviewing the results

We test if it works by executing the following commands:

mvn clean test 
mvn pitest:mutationCoverage

It's necessary to execute both commands in order, because Pitest works with compiled code. So, if the code and tests aren't compiled, you won't see the result of the last changes we've made. If you haven't made any changes, it's enough to execute the second command. A faster command can be:

mvn clean compile test-compile 
mvn pitest:mutationCoverage

But personally, I like fast feedback, so I prefer to run the tests first and then mutationCoverage. When we run the commands, the build fails with the following error:

[Image: build error reporting that the mutation score is below 95%]

The mutation score is 90%, and it must be at least 95%. Let's review the Pitest report. We open index.html, which can be found in the target/pit-reports folder.

[Image: location of index.html inside the pit-reports folder]

We open it in our favorite browser and review the results:

[Image: Pit Test Coverage Report results]

We navigate through the report until we reach FizzBuzz.java:

[Image: report for FizzBuzz.java, with a surviving mutant on line 15]

We see that on line 15 there is a mutant that has survived. Below we can see the mutations:

[Image: list of mutations, with a surviving mutant on line 15]

We see that the sixth mutation consists of changing the return value to an empty string "" and the mutant has not been detected by the tests.

Improving our code using the Pitest report results

If we review the code and look at line 15, we see that it can never be reached: a number divisible by 15 is also divisible by 3, so the earlier check for divisibility by 3 (line 7) always matches first and returns "Fizz".
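One way to convince ourselves of this is a small sketch that keeps the same order of checks as the first version of convert, but returns a label for the branch taken instead of the result. The class and the labels are hypothetical and exist only for this illustration.

```java
// Illustrative sketch: same branch order as the first FizzBuzz.convert,
// returning which branch matched instead of the converted value.
final class UnreachableBranchSketch {

    static String branchTaken(int number) {
        if (number % 3 == 0) {
            return "divisible-by-3 branch";
        }
        if (number % 5 == 0) {
            return "divisible-by-5 branch";
        }
        if (number % 15 == 0) {
            return "divisible-by-15 branch";
        }
        return "fallback";
    }

    // Checks every number from 1 to upTo for a hit on the third branch.
    static boolean fifteenBranchEverTaken(int upTo) {
        for (int n = 1; n <= upTo; n++) {
            if (branchTaken(n).equals("divisible-by-15 branch")) {
                return true;
            }
        }
        return false;
    }
}
```

For 15, 30, or 45 the first branch always wins, so no test input can make a mutation inside the third branch visible. That is why the mutant survives regardless of how many test cases we add.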

We move the check for divisibility by 15 so that it is the first instruction of our method convert(int number):

public class FizzBuzz {

   public String convert(int number) {
       if (isDivisibleBy(15, number)) {
           return "Fizz";
       }

       if (isDivisibleBy(3, number)) {
           return "Fizz";
       }

       if (isDivisibleBy(5, number)) {
           return "Buzz";
       }

       return String.valueOf(number);
   }

   private boolean isDivisibleBy(int divisor, int number) {
       return number % divisor == 0;
   }
}

We run the mutation tests again, they pass correctly, and the build finishes. We review the report and see that the mutation score is 100%.

[Image: Pitest report showing a 100% mutation score]

Reviewing the changes, we notice that, in our haste, we had made a mistake: for values divisible by both three and five, we returned "Fizz" instead of "FizzBuzz". Let's fix it in the code and the tests and check that everything works correctly.

Thanks to mutation tests, we can detect small, unintentional errors.
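The corrected code is not listed here; the author's version is in the repository. As a minimal sketch, assuming the only change needed is the return value of the first branch, the class could end up like this:

```java
// Sketch of the corrected class, assuming only the first branch's return value changes.
class FizzBuzz {

    public String convert(int number) {
        // Checked first, because every multiple of 15 is also a multiple of 3 and of 5.
        if (isDivisibleBy(15, number)) {
            return "FizzBuzz";
        }

        if (isDivisibleBy(3, number)) {
            return "Fizz";
        }

        if (isDivisibleBy(5, number)) {
            return "Buzz";
        }

        return String.valueOf(number);
    }

    private boolean isDivisibleBy(int divisor, int number) {
        return number % divisor == 0;
    }
}
```

The test for 15, 30, and 45 would then need to expect "FizzBuzz" instead of "Fizz", and its name should change accordingly.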

What do we do with surviving mutants?

Analyzing surviving mutants is fundamental for improving code quality. While some reveal significant problems in the code or tests, others represent equivalent mutations or simply noise. However, all provide valuable information about the effectiveness of the tests.

Types of surviving mutants

We can divide surviving mutants into three categories:

  1. Noisy mutants

In this category we can include:

This code is, in most cases, generated automatically by the IDE or by Lombok, so running mutation tests on it doesn't add much value. It should be excluded from mutation coverage.

  2. Mutants that cannot be killed

This group of mutants gives us valuable information for refactoring. These mutants can show us:

  3. Mutants with valuable information

These are mutants that reveal real data and/or significant problems in our code or tests. We must pay attention to them and solve the problems they show us.

Mutation operators

The number of possible changes is practically unlimited and grows with the size of our code. These are some of the mutators:

Void method mutator

If a method returns nothing, it is a method with a side effect: it changes some global state or something in the infrastructure. For this reason, Pitest removes all the code from the method to see if the tests fail:

[Image: the mutation removes the content of a "public void" method]
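A hand-written sketch of the effect, under stated assumptions: AuditLog, its mutated subclass, and both "tests" are hypothetical, and the mutant is modelled simply as the side effect no longer happening. Pitest's own documentation describes its void method call mutator as removing calls to void methods, which has the same observable result for the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: all names are hypothetical.
final class VoidMethodMutantSketch {

    static class AuditLog {
        final List<String> entries = new ArrayList<>();

        // Void method: its only purpose is the side effect on 'entries'.
        void record(String entry) {
            entries.add(entry);
        }
    }

    // Hand-written mutant: the side effect no longer happens.
    static class MutatedAuditLog extends AuditLog {
        @Override
        void record(String entry) {
            // body removed by the mutation
        }
    }

    // Only checks that nothing throws, so it cannot detect the mutant.
    static boolean weakTestPasses(AuditLog log) {
        log.record("login");
        return true;
    }

    // Verifies the side effect, so it fails against the mutant.
    static boolean strongTestPasses(AuditLog log) {
        log.record("login");
        return log.entries.contains("login");
    }
}
```

The mutant survives the first test and is killed by the second, which is the usual sign that a void method is called in tests but its effect is never verified.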

Null return mutator

[Image: the mutation changes "return new" to "return null"]
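In code, this kind of mutation could look like the following hand-written sketch. The User class, the two factory methods, and the "tests" are hypothetical names chosen for illustration.

```java
import java.util.function.Function;

// Illustrative sketch: all names are hypothetical.
final class NullReturnMutantSketch {

    static final class User {
        final String name;

        User(String name) {
            this.name = name;
        }
    }

    // Original: returns a new object.
    static User createOriginal(String name) {
        return new User(name);
    }

    // Hand-written mutant: the returned object is replaced with null.
    static User createMutant(String name) {
        return null;
    }

    // Calls the factory but ignores the result, so the mutant survives.
    static boolean weakTestPasses(Function<String, User> factory) {
        factory.apply("ada");
        return true;
    }

    // Inspects the returned object, so the mutant is killed.
    static boolean strongTestPasses(Function<String, User> factory) {
        User user = factory.apply("ada");
        return user != null && "ada".equals(user.name);
    }
}
```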

Constant mutator

[Image: the mutation removes the if and replaces it with "return 3"]

Optional mutator

[Image: the mutation returns Optional.empty()]
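A hand-written sketch of this mutation, assuming a hypothetical lookup method over a small in-memory map (Java 9 or later for Map.of):

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: the lookup and its data are hypothetical.
final class OptionalMutantSketch {

    static final Map<Integer, String> NAMES = Map.of(1, "ada");

    // Original: wraps the value when the id is known.
    static Optional<String> findOriginal(int id) {
        return Optional.ofNullable(NAMES.get(id));
    }

    // Hand-written mutant: always returns an empty Optional.
    static Optional<String> findMutant(int id) {
        return Optional.empty();
    }
}
```

A test that only looks up an unknown id, such as 2, gets an empty Optional from both versions and cannot kill the mutant. A test that looks up id 1 and checks for "ada" does.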

The effective implementation of mutation testing

For mutation testing to be possible, our tests must meet the following requirements:

Unit tests meet these requirements, so we should exclude the following tests from mutation testing:

Conclusions

Mutation testing is an advanced validation technique that helps ensure code quality by evaluating the effectiveness of existing tests. Its implementation provides several key benefits:

In summary, mutation testing is a valuable technique that elevates software quality by strengthening its testing system, fostering more efficient code, and ensuring that errors are detected before reaching production. You can see all the code with the step-by-step commits in the GitHub repository fizzbuzz-mutation-testing.
