Tuesday 14 May 2019

Goldilocks Abstraction



Abstraction is one of the more powerful weapons a software engineer has in his or her arsenal. Its proper application is fundamental to ensuring that code is engineered and not simply written.

A formal definition of abstraction might be:

"The process of removing physical, spatial, or temporal details in order to more closely attend to other details of interest."

A more practical description would simply be the hiding of implementation detail in order to make a system easier to understand, use and maintain. The moment we decide not to write code as one large procedural block, we are engaging in abstraction: modelling our problem domain, and the solution we have engineered, as distinct and hopefully well-formed elements.

However, whilst abstraction is fundamental to software engineering, it is also subject to diminishing returns and eventually tips into being detrimental to the quality of the code. Abstraction is good, but it is possible to have too much of a good thing.

Too Little

When a codebase has too little abstraction it can bombard the engineer working in it with detail. This can make it difficult to separate functionality from implementation, making it harder to identify and take advantage of re-use opportunities.

An area of code may be very close to what you require, but because it is fused with a particular implementation the opportunity to re-use it is lost. Too little abstraction is also likely to increase the complexity of maintenance: the impact of a required change in implementation spreads because so much of the codebase is directly exposed to that detail.
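As a minimal sketch of what that fusing can look like, consider the Python below. The function names and the two-column CSV layout are hypothetical, invented purely for illustration. The first function sums prices but can only ever do so from CSV text; the other two split the same behaviour so that the summing logic can be re-used and a change of storage format touches one place.

```python
import csv
import io
from typing import Iterable, List


def total_from_csv_text(csv_text: str) -> float:
    """Too little abstraction: the summing logic is fused with CSV parsing."""
    total = 0.0
    for row in csv.reader(io.StringIO(csv_text)):
        total += float(row[1])  # assumes the price lives in the second column
    return total


def order_total(prices: Iterable[float]) -> float:
    """The re-usable part: knows nothing about where the prices came from."""
    return sum(prices)


def prices_from_csv_text(csv_text: str) -> List[float]:
    """The storage detail, isolated so a format change touches one function."""
    return [float(row[1]) for row in csv.reader(io.StringIO(csv_text))]
```

Anything that can produce a sequence of prices can now call order_total, whereas total_from_csv_text is only useful to callers who happen to hold CSV text in that exact layout.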

Code with too little abstraction will very often be found to be breaking SOLID principles. A lack of adherence to the Single Responsibility Principle and the Interface Segregation Principle will lead to large, unwieldy classes. Breaking the Open-Closed Principle will, as already stated, lead to large-scale refactoring for even apparently simple changes.

As referenced in our more formal definition of abstraction, having too little of it exposes so much detail that it is harder to concentrate on the details you actually care about.

Too Much

So given that rather damning assessment of the impact of having too little abstraction, we may assume that we should be trying to implement more of it at every opportunity. In actual fact, having too much abstraction can lead to very similar problems.

A codebase with too much abstraction can also hide opportunities for re-use simply because it becomes increasingly difficult to identify what a piece of code is designed to do. This is usually because the abstraction has created a greater distance between the code and the problem space it was originally supposed to work in.

High levels of abstraction can also create problems when debugging code. As you try to step through a problem you seem to plunge ever deeper down a call stack whilst never appearing to get any nearer the code that is doing the actual work or the code that is manifesting the bug.

Too much abstraction can actually increase the amount of detail in a codebase that must be understood to work within it. It may hide implementation detail, but it creates large amounts of knowledge related specifically to the layers and layers of code that have been implemented. This will often create a high barrier to entry for new members of the team, or even for experienced engineers revisiting long-forgotten code.
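To make the layering problem concrete, here is a deliberately exaggerated sketch in Python. Every class name is hypothetical and chosen only for illustration. Three of the four layers do nothing except forward the call, yet each one is another stack frame to step through and another name to learn.

```python
class PriceRepository:
    def __init__(self, prices):
        self._prices = dict(prices)

    def find(self, name):
        return self._prices[name]  # the only line doing real work


class PriceService:
    def __init__(self, repository):
        self._repository = repository

    def get(self, name):
        return self._repository.find(name)  # pure pass-through


class PriceManager:
    def __init__(self, service):
        self._service = service

    def fetch(self, name):
        return self._service.get(name)  # pure pass-through


class PriceFacade:
    def __init__(self, manager):
        self._manager = manager

    def lookup(self, name):
        return self._manager.fetch(name)  # pure pass-through


def lookup_price(prices, name):
    """The same behaviour with no intermediate layers."""
    return prices[name]
```

A call to PriceFacade.lookup passes through four frames before reaching a dictionary access. That is not to say a single function is always the right answer, only that each layer should be able to justify the detail it hides.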

Just Right

So we have established that there is a middle ground: just enough abstraction to realise the benefits without creating so much that we swing towards the negatives it can bring. As with many aspects of software engineering it is almost impossible to get this exactly right. Any codebase you work in will on occasion veer to either side of the desired middle ground.

When adding an abstraction, try to identify clearly what detail it is designed to protect the caller from, how it will make the codebase easier to work within in the future, and how many other layers of abstraction it will be part of. Having multiple layers of abstraction is not necessarily wrong if the purpose of each can be cleanly and quickly explained and understood.
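One way to apply that test is to write the hidden detail down next to the abstraction itself. The Python sketch below uses hypothetical names (PriceSource, InMemoryPrices, basket_total) and assumes a single layer whose purpose fits in one sentence: it hides where prices are stored, and nothing else.

```python
from typing import Dict, Iterable, Protocol


class PriceSource(Protocol):
    """Hides one detail only: where prices are stored and how they are read."""

    def price_of(self, name: str) -> float: ...


class InMemoryPrices:
    """One concrete source; a file or database version would share the method."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self._prices = dict(prices)

    def price_of(self, name: str) -> float:
        return self._prices[name]


def basket_total(source: PriceSource, names: Iterable[str]) -> float:
    """Business logic depends on the one-method interface, not on storage."""
    return sum(source.price_of(name) for name in names)
```

If a second layer were proposed on top of PriceSource, the same question applies: what detail does it protect the caller from that this one does not?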

Also learn to recognise the signs of a codebase having too little or too much abstraction. 

Do relatively simple changes in implementation detail seem to lead to edits at a large number of points in the code? Are you put off from making changes to dependencies, or to interactions with lower-level systems, because of the impact of the change? These types of problems are indicators of too little abstraction.

Do you find it difficult to follow the flow of a piece of functionality from end to end? Do you find it difficult to identify where changes need to be made when assessing a proposed change in functionality? These can be signs of too much abstraction in a codebase.

As software engineers, because of the inherent complexity in coding, we often aren't aiming for perfection but simply for something slightly better than adequate. An important factor in this is understanding the pros and cons of all the tools in the toolbox, as well as recognising the signs that any particular tool is being over-used. Abstraction is no different: it is a fundamental tool that any engineer must master, but mastering it includes knowing when enough is enough.


