Mutate: inject bugs into your programs!

Recently added to the package catalog, documented here: Mutate

This is a library for mutating programs, i.e. injecting possible bugs by making small syntactic changes to a program.

There's an example in the docs prologue.

It might be used to create a corpus of known-buggy programs (that's why I wrote it).
Or to evaluate the quality of a test suite through mutation testing.
Or something else?

I'm posting it here in the hope that it can be useful to someone else as well!

Source is here

Thank you @llazarek
This is new to me but I'm finding your documentation very helpful.
best regards
Stephen

This is very cool, but I'm curious about the underlying reasoning. The documentation covers the 'how' but not the 'why'; can you give me a tl;dr so that I don't need to immediately dig into the literature?

The example mutator changes (if c t f) into (if (not c) t f) which is essentially guaranteed to break the software. Is the point simply to verify that you have test cases that will catch this, or is there more to it?

Thanks @spdegabrielle and @dstorrs!

Yeah you're exactly right @dstorrs that the if-swap very likely breaks the program, and that one way to use mutation is to check that a test suite catches such bugs. That's the rationale behind mutation testing: a good test suite should be able to catch bugs injected into the code-to-be-tested.
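To make that concrete, here is a minimal sketch in plain Racket, working on quoted s-expressions. It does not use the Mutate library's API: `negate-if-mutants`, `passes-tests?` and `my-abs` are names made up for this illustration, and the traversal is naive (it would also rewrite an `if` that appears inside quoted data).

```racket
#lang racket

;; Hypothetical helper, NOT part of the Mutate library.
;; Returns a list of mutants of `expr`; each mutant differs from `expr`
;; by negating the condition of exactly one `if`.
(define (negate-if-mutants expr)
  (match expr
    [(list 'if c t f)
     (append
      ;; mutate this `if` itself ...
      (list `(if (not ,c) ,t ,f))
      ;; ... or exactly one `if` nested in one of its subexpressions
      (for/list ([m (negate-if-mutants c)]) `(if ,m ,t ,f))
      (for/list ([m (negate-if-mutants t)]) `(if ,c ,m ,f))
      (for/list ([m (negate-if-mutants f)]) `(if ,c ,t ,m)))]
    [(? list? es)
     ;; any other compound form: mutate exactly one element
     (for*/list ([i (in-range (length es))]
                 [m (negate-if-mutants (list-ref es i))])
       (list-set es i m))]
    [_ '()]))

(define original
  '(define (my-abs x) (if (< x 0) (- x) x)))

;; A tiny stand-in for a test suite: evaluate the definition in a fresh
;; namespace and check two cases.
(define (passes-tests? definition)
  (define ns (make-base-namespace))
  (eval definition ns)
  (and (equal? (eval '(my-abs -3) ns) 3)
       (equal? (eval '(my-abs 4) ns) 4)))

(define mutants (negate-if-mutants original))
(passes-tests? original)                   ; the original passes
(for/list ([m mutants]) (passes-tests? m)) ; a good suite fails on every mutant
```

Here the single mutant is `(define (my-abs x) (if (not (< x 0)) (- x) x))`, and the two test cases are enough to catch it.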

On the other hand, I can elaborate on my use case as a different example of why one might want mutants. I've been using mutation to analyze the debugging information different tools (like contracts) provide when programs go wrong. To do that, I need a corpus of buggy programs where I know for each one exactly where the bug is. So I use mutation to turn a set of programs that aren't known to be buggy into a huge number of mutants, each with a different potential bug that serves as a different scenario for my analysis.

Though... you do need to verify that the change actually introduces a bug, right? Certainly one possible scenario is that you make a change in dead code. I'm sure you've already thought about that.

Yes indeed. And that can be a tricky thing to check! In my use-case we take the simple and conservative choice to filter for mutants that cause the program to crash.
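A rough sketch of what such a crash filter could look like, in plain Racket. This is an assumption about the shape of the idea, not the code used in the analysis described above: `crashes?` is a hypothetical helper, mutants are represented as lists of top-level forms, and there is no protection against mutants that loop forever.

```racket
#lang racket

;; Hypothetical helper: does evaluating `program` (a list of top-level
;; forms) raise an exception?
(define (crashes? program)
  (define ns (make-base-namespace))
  (with-handlers ([exn:fail? (lambda (e) #t)])
    (for ([form program])
      (eval form ns))
    #f))

(define mutants
  (list '((define v (vector 1 2 3)) (vector-ref v 2))   ; unmutated index
        '((define v (vector 1 2 3)) (vector-ref v 3))   ; out of range: crashes
        '((define v (vector 1 2 3)) (vector-ref v 1)))) ; wrong value, no crash

;; Keep only the mutants that definitely misbehave.
(define crashing-mutants (filter crashes? mutants))
```

The third mutant shows why the filter is conservative: it computes a different value than the first program, but because it does not crash it is discarded along with the harmless one.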

In the mutation testing program I built, I only mutate code that is covered by test cases. Mutating code that is dead as far as the test suite is concerned (code that no test executes) is not useful, and it slows down the process of making the test suite robust.

Here is a related scenario: an expression is green on coverage and the associated tests are green, but when you remove that expression the tests are still green ⇒ the expression is not actually tested. That can happen when side effects are not tested.
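A small made-up illustration of that scenario (all names here are hypothetical): the logging expression is executed by the test, so it counts as covered, but nothing checks its effect.

```racket
#lang racket

(define audit-log '())

(define (withdraw balance amount)
  (set! audit-log (cons amount audit-log)) ; side effect: run by the test, never checked
  (- balance amount))

;; The same function with the logging expression removed.
(define (withdraw/mutant balance amount)
  (- balance amount))

;; The "test suite" only looks at the return value.
(define (test-passes? withdraw-impl)
  (= (withdraw-impl 100 30) 70))

(test-passes? withdraw)        ; green
(test-passes? withdraw/mutant) ; still green: the mutant survives
```

Adding an assertion on `audit-log` to the test is what would make the mutant fail.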

Mutation testing will verify the test suite is robust. I think it can be summarized like:

(for-each (lambda (mutated-program)
            (assert (not (program-check mutated-program))))
          (mutate program))

Where program-check runs program's test suite against the mutated program. In case of timeout, program-check must return #f.
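One possible way to get the timeout behaviour, sketched with Racket threads and `sync/timeout`. The interface is an assumption made for this example: the test run is passed in as a thunk `run-tests` that returns a true value when all tests pass, and the `#:timeout-seconds` keyword is invented here.

```racket
#lang racket

;; Hypothetical sketch of program-check.
;; Returns #t if the tests pass in time, #f if they fail, raise an
;; exception, or do not finish within `timeout` seconds.
(define (program-check run-tests #:timeout-seconds [timeout 1])
  (define result-channel (make-channel))
  (define worker
    (thread
     (lambda ()
       (channel-put result-channel
                    (with-handlers ([exn:fail? (lambda (e) #f)])
                      (and (run-tests) #t))))))
  ;; sync/timeout returns #f when nothing arrives before the deadline
  (define result (sync/timeout timeout result-channel))
  (kill-thread worker) ; stop a runaway mutant; harmless if already finished
  result)

;; The mutants that the test suite fails to catch.
(define (surviving-mutants test-thunks)
  (filter program-check test-thunks))
```

A failing run and a timed-out run both come back as `#f`, which matches the rule above: in either case the mutant counts as caught.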

Code that is dead according to the test suite (that is, uncovered code) is never exercised by the tests, so whatever mutation is applied to it, program-check will return the nominal result #t. Mutation testing counts that as an error: the test suite is not robust enough. But coverage already tells you this beforehand. The problem with mutating such code is that it yields many "candidate mutations" that are just noise.

I struggled with the following:

  • For all mutations, the program test suite fails ⇒ mutation testing success ⇒ test suite is robust;
  • It should^W must be possible to run the test suite concurrently; otherwise it is far too costly to run;
  • The more knowledge you can infer from the code, the more interesting mutations it is possible to infer; it is possible to filter out mutations that will always fail, such as replacing + with list, and avoid a combinatorial explosion.
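As an illustration of the last point, one simple form of "knowledge" is a table of operator replacements that accept the same arguments and produce the same kind of result, so that + is swapped with - or * but never with list. The table and the `operator-mutants` helper below are invented for this sketch and are not taken from any particular tool.

```racket
#lang racket

;; Hypothetical table of replacements that keep argument and result kinds.
(define compatible-swaps
  (hash '+ '(- *)
        '- '(+ *)
        '< '(<= > >=)
        'and '(or)))

;; Returns the mutants of `expr` obtained by replacing exactly one
;; operator with a compatible one.
(define (operator-mutants expr)
  (match expr
    [(list (? symbol? op) args ...)
     (append
      ;; swap the operator at the head of this form ...
      (for/list ([new-op (hash-ref compatible-swaps op '())])
        (cons new-op args))
      ;; ... or one operator nested inside one of the arguments
      (for*/list ([i (in-range (length args))]
                  [m (operator-mutants (list-ref args i))])
        (cons op (list-set args i m))))]
    [_ '()]))

(define example-mutants (operator-mutants '(+ (- a b) c)))
```

For `(+ (- a b) c)` this yields four mutants, whereas swapping in arbitrary procedures would produce a much larger set, most of which would fail immediately for uninteresting reasons.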

Mutation testing is fuzzing for code.
