Monday 27 May 2019

Effective Test Automation



As a DevOps mentality has taken hold in the industry, teams have increasingly focused on test automation. This is mainly because an effective test automation strategy can drive an increased release cadence by reducing the amount of manual effort involved in declaring code ready for production.

On the face of it this may seem like a straightforward endeavour: write the tests, run the tests, release the software. However, like many aspects of software engineering, there is a subtlety to its proper application. Badly implemented, test automation can add no value, and at worst can actively degrade quality.

Effective test automation cannot be defined or explained in a single blog post, but presented below are a few things to look out for when judging the effectiveness of the test automation in your code base.

Red Means Stop

The number one mistake teams make when implementing automated testing is for there to be no consequences when tests fail. Various reasons are given for this: there is a bug in that test, it's a timing issue, that wouldn't happen when the application runs for real.

In all of those situations, if the underlying issue can't be addressed then those tests shouldn't be run, because they offer little value. The whole purpose of an automated test suite is to act as a traffic light for the code base: if we're all green we are good to go; if anything is red we need to stop and fix it. Releasing software despite failing tests will create a culture of ignoring tests, which will either increase the amount of manual testing you need to do to get comfortable with the codebase, or mean you knowingly ship with defects.
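The traffic-light rule can be sketched in a few lines (a minimal illustration, with hypothetical test names): the suite's results gate the release, and a single red blocks it with no room for interpretation.

```python
def release_decision(results):
    """Treat the suite as a traffic light: one red result blocks the release.

    `results` maps test names to pass/fail booleans (an illustrative shape).
    """
    return "go" if all(results.values()) else "stop"


# All green: the release can proceed.
print(release_decision({"test_login": True, "test_checkout": True}))

# A single red means stop, however "explainable" the failure is.
print(release_decision({"test_login": True, "test_checkout": False}))
```

The point of keeping the decision this blunt is that it removes the human step where results get interpreted and explained away.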

Much more value can be derived from a smaller set of automated tests that are robust and meaningful than from a larger set that is fragile and whose results require significant interpretation.

Fix Early or Release Broken

Once you have a set of reliable tests you are prepared to put faith in, the next most important factor is when you run them. The further right in the development timeline the tests are run, the more pressure there will be to ship code regardless of the results.

Depending on the timespan of your project, it is also likely that the cost of fixing defects found further right in the process will be higher. This will be in spite of the fact that any fix is likely to be more of a patch than a well-engineered solution.

The further left the tests run, by which we mean the nearer to the time the developer first writes the code, the more time is available to find a fix and the cheaper the fix is likely to be. The closer the tests are run to the proposed release date, the more they become a box-ticking exercise, because of the significant pressure to continue regardless of the results.

Variable Scope and Focus

The release of any piece of software is often focussed on particular areas of functionality, which naturally means the potential for bugs or issues is higher in these areas. Whilst there are likely to be core areas of functionality that always need to be verified, we can maximise the effectiveness of an automated test suite by allowing it to adapt to the current focus of developers' attention.

This shift in focus may be automatic, based on analysis of developer commits, or it may come via configuration or any other mechanism that allows manual changes in emphasis. Knowing that the available automation resources have been focused on the areas of code most likely to have regressed will go a long way towards increasing confidence that new or adapted features are ready to ship.
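One simple form of commit-driven focus can be sketched as follows (the area-to-suite mapping and file paths are entirely hypothetical): always run a core smoke suite, then add the suites for whichever areas recent commits have touched.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from source areas to their test suites.
AREA_TESTS = {
    "billing": ["tests/test_invoices.py", "tests/test_payments.py"],
    "auth": ["tests/test_login.py"],
}

# Core functionality that is always verified, whatever changed.
CORE_TESTS = ["tests/test_smoke.py"]


def select_tests(changed_files):
    """Pick the core suite plus the suites for areas touched by recent commits."""
    selected = list(CORE_TESTS)
    for path in changed_files:
        area = PurePosixPath(path).parts[0]  # top-level directory as the "area"
        for test in AREA_TESTS.get(area, []):
            if test not in selected:
                selected.append(test)
    return selected
```

In practice the list of changed files might come from something like `git diff --name-only`; the sketch only shows the selection step.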

The building of an automated test suite is always done with the best of intentions, but implementations often end up sidelined and given no particular relevance in the release process. This usually comes from a viewpoint that simply writing the tests is enough; it isn't. Tests that don't relay unequivocal information about the state of the code base, or that are run too late for that information to be acted upon, represent wasted effort.

To avoid this, decide on your areas of nervousness when releasing and try to develop strategies for addressing those concerns via automation. Also treat this automation like any other code: expect it to need refactoring and developing as the code base moves on. Treat it as a living, breathing area of code that is your ally in making the important decision over when something is ready to ship.


Sunday 19 May 2019

Next Level Over Optimisation



Over optimisation is a common pitfall for software engineers. It is frequently characterised by a disproportionate amount of effort being spent striving for performance levels that have yet to be proven necessary.

However a similar mindset, causing similar diminishing returns, can be seen in other areas of software engineering when too much emphasis is placed on one particular aspect of the code. Software engineers are by nature problem solvers and can often demonstrate compulsive tendencies; this combination can sometimes cause an obsession with providing the ultimate solution to a problem.

Which aspects of code, other than performance, can be over optimised? And what are the downsides when the emphasis becomes obsessive?

Code Size

Engineers can fall into the trap of trying to devise ways to provide the same solution with an ever-decreasing amount of code. Code is not a business asset; our goal isn't to increase the amount of it we have, it is to devise efficient solutions to problems using code as a tool. However, measures of efficiency should also include maintainability and extensibility.

To this end there is a tipping point where decreasing the amount of code also decreases its readability. When this first manifests it can seem a minor concern, because you still understand how the code works, but maintainability and extensibility are long-term goals: they can't be judged in the moment.

The over-emphasis on reducing code size can be driven by an assumption that it will improve performance, and by a desire to prove engineering skill by doing the same with less. A more pragmatic approach would recognise that the performance benefits, if they exist at all, are likely to have a negligible impact on users. An engineer's skill shouldn't be judged on any one piece of code; we must also factor in whether that code continues to offer value over a prolonged period of time, both in the form it was originally written and by being adapted and extended over time without being re-written.
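A small sketch of the tipping point (the order-totalling task is invented for illustration): both functions produce the same result, but only one of them leaves room for the code to grow.

```python
def totals_compact(orders):
    # One line, and it works, but extending it (currency handling,
    # discounts, logging skipped orders) means unpicking it first.
    return {c: sum(o["amount"] for o in orders
                   if o["customer"] == c and o["amount"] > 0)
            for c in {o["customer"] for o in orders if o["amount"] > 0}}


def totals_readable(orders):
    """Same behaviour, more lines, far easier to adapt over time."""
    totals = {}
    for order in orders:
        if order["amount"] <= 0:
            continue  # refunds and zero-value orders are ignored
        customer = order["customer"]
        totals[customer] = totals.get(customer, 0) + order["amount"]
    return totals
```

Neither version is "wrong"; the point is that the second one pays a few extra lines now to keep future changes cheap.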

Re-Use

You will often see an engineer's face light up when they realise they can re-use some previously written code in a new situation. Re-usability is a fine trait for code to have: it increases the speed at which solutions can be delivered and, as previously stated, we aren't in the code generation business, so opportunities to deliver using less of it should be embraced.

But sometimes this drive causes code to be declared re-usable prematurely. When this turns out not to be the case it can lead to the code attempting the re-use being unnaturally bent to fit the underlying code, or to the interface of the re-usable code degenerating over time as it is mangled to suit slightly different use cases. The fear generated by these potential outcomes can also breed a reluctance to make changes to shared code because of the potential knock-on effects.

This is not an argument against code re-use; it is something that should always be strived for. But it's important to develop a sense of when code is truly re-usable and when re-use would create a rigidness in the code base that ultimately degrades its intended benefits. One approach is to not be afraid of refactoring code once its re-usability has become apparent; conversely, when re-use is starting to cause problems, don't be afraid to recognise this and allow responsibility for the functionality to be handed back to the consumers.

Configurability

Related to an over emphasis on re-use can be a drive to make every aspect of an area of code configurable. In order to increase the flexibility of code, and therefore open up more opportunities for its re-use, we make every aspect of its functionality configurable.

As with most topics we have discussed, configurability is an admirable quality, but if it's overdone it can have a negative impact on readability and maintainability. Potential users of the code are presented with a sea of possible options that they may struggle to choose from or appreciate the subtleties of. Changing the code in question can also be hampered by having to maintain support for a large number of use cases and combinations.

This problem can be addressed by not forcing users to deal with the full array of options, instead offering sensible defaults for particular use cases. If an experienced engineer who has a good understanding of the problem domain chooses to dig deeper, they can; consumers who simply want an "out of the box" solution need not dig this deep.
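One common shape for this (a sketch; the retry-policy domain and preset names are invented): every knob stays available, but defaults and named presets mean most callers never face the option sea.

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    # Every knob is still there for those who need to dig deeper...
    retries: int = 3
    backoff_seconds: float = 0.5
    timeout_seconds: float = 5.0


# ...but most callers reach for a named preset instead.
DEFAULT = RetryPolicy()
FLAKY_NETWORK = RetryPolicy(retries=10, backoff_seconds=2.0)
```

The presets also document intent: `FLAKY_NETWORK` tells a reader far more than a bare `retries=10` at a call site would.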

Pragmatism and practicality are important qualities for engineers. Certain aspects of code, most notably the SOLID principles, are important enough to evangelise, but good code is not defined by any one measure. Code quality is also not always possible to measure in the moment; decisions made at the time of writing can have unintended consequences further down the line.

It won't always be possible to avoid this kind of over optimisation; the approach to a piece of code is based on our understanding of the problem at the time of writing. As that knowledge grows, mistakes will surface as things go in a different direction to the one initially anticipated. Don't fear these situations: recognise that they exist, learn to spot the signs, and develop strategies for finding reverse gear.


Tuesday 14 May 2019

Goldilocks Abstraction



Abstraction is one of the more powerful weapons in a software engineer's arsenal. Its proper application is fundamental to ensuring that code is engineered and not simply written.

A formal definition of abstraction might be:

"The process of removing physical, spatial, or temporal details in order to more closely attend to other details of interest."

A more practical description would simply be: the hiding of implementation detail in order to make a system easier to understand, use and maintain. The moment we decide not to write code as one large procedural block, we are engaging in abstraction, modelling our problem domain and the solution we have engineered into distinct, and hopefully well-formed, elements.

However whilst abstraction is fundamental to software engineering it does also demonstrate aspects of a diminishing return that eventually tips into being detrimental to the quality of the code. Abstraction is good but it is possible to have too much of a good thing.

Too Little

When a code base has too little abstraction it can bombard the user with detail. This can make it difficult to separate functionality from implementation, making it harder to identify and take advantage of re-use opportunities.

An area of code may be very close to what you require, but because it is fused with a particular implementation the opportunity for it to be re-used is lost. Too little abstraction is also likely to increase the complexity of maintenance: the impact of required changes in implementation spreads because of the amount of the codebase that is directly exposed to the detail.

Code with too little abstraction will very often be found to be breaking SOLID principles. A lack of adherence to the Single Responsibility Principle and the Interface Segregation Principle will lead to large, unwieldy classes. Breaking the Open-Closed Principle will, as already stated, lead to large-scale refactoring for even apparently simple changes.

As referenced in our more formal definition of abstraction, having too little of it exposes a lot of detail, so much so that it is harder to concentrate on the details you actually care about.     
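A concrete sketch of the lost re-use opportunity (the sales-report example is hypothetical): when a calculation is fused to one storage format it cannot be re-used, whereas a thin abstraction over the rows can be fed from anywhere.

```python
import csv
import io


def total_sales_from_csv(csv_text):
    # Too little abstraction: the totalling logic is fused to one
    # implementation detail (CSV parsing), so it can't be re-used
    # for rows that arrive from a database or an API.
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)


def total_sales(rows):
    # The same logic behind a small abstraction: any iterable of
    # row-like mappings will do, whatever produced them.
    return sum(float(row["amount"]) for row in rows)
```

The second function is also the easier one to test, since no file or parser needs to be involved.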

Too Much

So given that rather damning assessment of the impact of having too little abstraction, we may assume that we should be implementing more of it at every opportunity. In actual fact, having too much abstraction can lead to very similar problems.

A codebase with too much abstraction can also hide opportunities for re-use simply because it becomes increasingly difficult to identify what a piece of code is designed to do. This is usually because the abstraction has created a greater distance between the code and the problem space it was originally supposed to work in.

High levels of abstraction can also create problems when debugging code. As you try to step through a problem you seem to plunge ever deeper down a call stack whilst never appearing to get any nearer the code that is doing the actual work or the code that is manifesting the bug.

Too much abstraction can actually increase the amount of detail in a codebase that must be understood in order to work within it. It may hide implementation detail, but it creates a large amount of knowledge related specifically to the layers upon layers of code that have been implemented. This will often create a high barrier to entry for new members of the team, or even for experienced engineers re-visiting once-forgotten code.
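The "ever deeper call stack" symptom can be made concrete with a deliberately over-layered sketch (all class names are hypothetical): each layer below merely forwards to the next, adding distance without hiding any real detail.

```python
class Store:
    """The layer that does the actual work."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class StoreService:
    # Forwards every call unchanged: a layer of pure indirection.
    def __init__(self, store):
        self._store = store

    def get(self, key):
        return self._store.get(key)


class StoreFacade:
    # Another pass-through; debugging now steps through three frames
    # to reach the one line that actually does the work.
    def __init__(self, service):
        self._service = service

    def get(self, key):
        return self._service.get(key)
```

If a layer's purpose can't be stated in a sentence beyond "it calls the next one", it is a candidate for removal.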

Just Right

So we have established that there is a middle ground: just enough abstraction to realise the benefits, without creating so much that we swing towards the negatives that can bring. As with many aspects of software engineering, it is almost impossible to get this consistently right; any codebase you work in will on occasion veer to either side of the desired middle ground.

When adding an abstraction, try to identify clearly what detail it is designed to protect the caller from, how it will make the codebase easier to work in, and how many other layers of abstraction it will be part of. Having multiple layers of abstraction is not necessarily wrong if the purpose of each can be cleanly and quickly explained and understood.

Also learn to recognise the signs of a codebase having too little or too much abstraction. 

Do relatively simple changes in implementation detail seem to require edits at a large number of points? Are you put off from making changes to dependencies or interactions with lower-level systems because of the impact of the change? These types of problems are indicators of too little abstraction.

Do you find it difficult to follow the flow of a piece of functionality from end to end? Do you find it difficult to identify where changes need to be made when assessing a proposed change in functionality? These can be signs of too much abstraction in a codebase.

As software engineers, because of the inherent complexity in coding, we often aren't aiming for perfection but simply for something slightly better than adequate. An important factor in this is understanding the pros and cons of all the tools in the toolbox, as well as recognising the signs when any particular tool is being over-used. Abstraction is no different: it is a fundamental tool that any engineer must master, and mastery includes knowing when enough is enough.