A Two Coffee Problem: Breaking Not Bad

You will often hear developers lament the fact that unit tests are broken and need fixing.

While this can sometimes feel like needless extra overhead when implementing a feature, the breaking of these tests is an important part of the ebb and flow of development and has value in its own right.

Many may scoff at that sentiment. So what is so great about failure?

Change is Good

We can learn a lot about the changes we have just made to the code from the failures they introduce into our unit tests.

Firstly, it demonstrates that we have changed the behaviour of the code we have been working on. I would be much more concerned if no unit tests were broken: that would imply either that our changes have had no impact on the code or that our unit testing is inadequate.

This time the change was intentional; next time it might not be, and surely we would want the tests to fail in that scenario?
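To make that concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `shipping_cost` rule, the thresholds and the test names. It shows a small test suite run against a function before and after a behaviour change, and reports which tests fail.

```python
import unittest


def shipping_cost(order_total):
    # Hypothetical original rule: orders of 50 or more ship free.
    return 0.0 if order_total >= 50 else 5.0


def shipping_cost_changed(order_total):
    # The same function after a change: the threshold is now 100.
    return 0.0 if order_total >= 100 else 5.0


def make_suite(cost_fn):
    # Build the same two tests against whichever implementation is passed in.
    class ShippingCostTest(unittest.TestCase):
        def test_free_shipping_at_threshold(self):
            self.assertEqual(cost_fn(50), 0.0)

        def test_flat_fee_below_threshold(self):
            self.assertEqual(cost_fn(49.99), 5.0)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostTest)


def failing_tests(cost_fn):
    # Run the suite and return the names of the tests that failed.
    result = unittest.TestResult()
    make_suite(cost_fn).run(result)
    return sorted(test.id().split(".")[-1] for test, _ in result.failures)


print(failing_tests(shipping_cost))
print(failing_tests(shipping_cost_changed))
```

Against the original function nothing fails. Against the changed one only `test_free_shipping_at_threshold` fails, which is exactly the test we would expect to break when the threshold moves. Whether the change was deliberate or a slip of the keyboard, the suite tells us the behaviour is different.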

But it's not just a case of seeing failing tests; it's also a matter of seeing which ones have failed.

Do the tests that are failing make sense within the context of the change that was made, or are they actually demonstrating an unintended side effect or a problem with the engineering of the code? Ask yourself: should these tests be broken?

Don't just blindly fix the tests that are failing; also make sure that it all adds up. When it comes to pushing these changes, the tests you had to fix will also serve as documentation for the reviewer of your work, so they can do the same thing.

Walking on Eggshells

Whilst extolling the virtues of failure, it's important to stress that failing unit tests should always be related to a change in the code base, and that the scale of the failure should be in proportion to the change being made.

Unit tests shouldn't fail when no one has touched anything; running the tests over and over again should always produce the same results. Race conditions and brittleness are not qualities we want in our code base's unit tests.
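One common source of this kind of brittleness is a test that depends on the real clock. As a minimal sketch, assuming a hypothetical `is_expired` helper invented for this example, passing the current time in as a parameter lets the test pin it down so that every run sees the same inputs.

```python
from datetime import datetime, timedelta


def is_expired(expires_at, now=None):
    # Hypothetical helper. `now` is injectable so tests control the clock;
    # production callers can omit it and get the real time.
    if now is None:
        now = datetime.now()
    return now >= expires_at


def test_expires_exactly_at_deadline():
    # The deadline is a fixed value, so the result never depends on when
    # (or how fast) the test happens to run.
    deadline = datetime(2016, 11, 6, 12, 0, 0)
    assert not is_expired(deadline, now=deadline - timedelta(seconds=1))
    assert is_expired(deadline, now=deadline)


test_expires_exactly_at_deadline()
```

This is only one way of removing the dependency on the clock, but the principle holds however it is done: if nothing in the code has changed, nothing in the test results should change either.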

When measuring the scale of failure we should consider several factors.

It's natural to just look at the number of broken tests, but this can be misleading. We also need to consider the amount of time needed to fix the tests, even if a relatively small number have been broken.

If we change an interface or a method signature then it's natural that a proportionally large number of tests may break. If, however, only a handful of tests fail but serious thought needs to be put into how the functionality can now be effectively tested, then this is indicative of wider problems.

Taking the Red with the Green

Ultimately we have to ask ourselves why did we write the tests if we didn't expect or even want them to fail occasionally? Are tests that never break actually worthwhile?

Unit tests are a comfort blanket that gives us a nice warm feeling that our code is still fit for purpose and still does what it says on the tin. It's difficult to feel that warmth, and to believe they can truly be relied upon, if we never see them fail, especially if we know the code base is changing.

When we are practising TDD we take satisfaction from seeing a set of unit tests gradually change from red to green. We do this because it proves the code we have just written does what we wanted it to.

We can take similar satisfaction from seeing the same transition from red to green when a change we make breaks some tests. It provides demonstrable evidence that the code is still doing what we want it to.

We need to realise that the value of unit tests increases over time. They have value when the code is first written, helping us engineer interfaces and behaviour and validating that new code works. But they have even more value as time goes by, highlighting when code is broken and acting as documentation for how code should be used.

Failure is a natural part of this process and is simply part of the information unit tests convey.

Next time you break some tests, don't sigh; instead smile, because your unit tests have once again proven their worth.

Sunday 6 November 2016
