Monday, 28 January 2019

Unit of Work


The number of individual work items required to achieve a software engineering outcome can be staggering. If the execution of these tasks is to result in success, they need to be organised into a structure that avoids chaos.

Within agile teams we have developed a structure containing many different types of work item: features, themes, epics, stories, tasks and possibly many others are an everyday part of a team's lexicon and operation.

Whilst their definition can sometimes seem arbitrary or open to interpretation, work items shouldn't be seen as a mere collection of things to do. They must be written and grouped to achieve a purpose and have a defined outcome. Whilst this may seem obvious, it is surprising how often we lose sight of it.

Defined Value

The purpose of properly segregating what work needs to be completed isn't simply to ensure a team is kept occupied. It must also serve to help planning and act as documentation for what is going to be built, what has been built and why.

This means each item must explain its value to the business. Providing functionality for users is clearly valuable to any business, but constraining work items to being described only in these terms is limiting and doesn't acknowledge that there are other reasons to write code that benefit the business.

This needs to be equally true when we group work items together. As an example, it can be tempting to treat items such as epics simply as large collections of user stories, but such a loose definition can lead to the collection being superficial. No matter what the scale, completing an item of work should bring a self-contained benefit; it shouldn't imply a benefit that isn't actually achievable until more work is done.

Do No Harm

If a team is to follow a DevOps mentality, to ship continuously on an automated basis, then each work item must result in shippable code. This doesn't mean it needs to implement an entire feature; it may be that nothing the user can see or interact with changes, whilst still achieving the business benefit of moving towards something bigger.

However, it shouldn't leave the code base unstable or incomplete until further items are completed. If it does, that indicates the original division of work was wrong or that a different implementation approach was required.

Sometimes a dose of realism is required to accept that the required functionality demands a large amount of effort; dividing the work up into ever smaller chunks to try and change that situation won't achieve the goal any quicker.

A Plan Not An Idea

Work items shouldn't document ideas; they should document plans to achieve a stated and desirable outcome.

Generally, if you review an untidy backlog, many of the work items that have lingered for some time will be ideas that haven't made the journey from initial inception to work the team can and should do. There are many ways to document ideas, but they must be refined and pass through a filtering process before they are considered for execution by the team.

This shouldn't be seen as stifling creativity. Teams absolutely should be free to spitball ideas, but when it comes to spending some of the team's valuable effort and expertise on delivering one of them, it needs to be undertaken with a clear plan of what they will do and why it is worthwhile doing.

Nothing is hard and fast with agile: whilst we may have taken a decision to push back against large upfront planning, that doesn't mean we have adopted chaos. Agile does involve planning, but with a dose of realism about what can be controlled and what can't. It isn't possible to do this planning without understanding the work that will get you there. The effort of your team is a currency that you must plan to spend wisely and not squander on a series of unrelated tasks with no clear purpose.


Monday, 21 January 2019

Drowning Under The Waterfall


Whilst the software engineering fraternity has widely embraced agile methodologies, it has also struggled to let go of its waterfall roots. The behaviours that waterfall encourages have not been entirely abandoned and continue to creep into the ways that teams approach delivering code.

It is understandable why this is a difficult break-up: the benefits that waterfall claims to offer are intoxicating. It promises certainty and the ability to plan an exact strategy for success, but ultimately these are hollow promises that it cannot deliver.

It cannot deliver because the world it is intended to operate in doesn't exist.

Crystal Ball

A vital piece of equipment required to implement an effective waterfall strategy is a crystal ball. It requires stakeholders to have certainty over their requirements, with no gaps or uncertainties; it also requires developers to understand all the problems that will arise when trying to implement that vision.

The reason for this is that waterfall creates elongated feedback loops. It requires up-front design, then development and finally test. If at any point stakeholders are found to have missed or misunderstood a requirement, or developers not to have anticipated an implementation problem, the price paid in elapsed time is high.

Software development is far too complicated to implement a strategy based on certainty. This complication exists both in implementation and in defining the required outputs. Effective development strategies shorten feedback loops and encourage collaboration in an attempt to combat rethinking and lessen its impact.

Excluding the User

The big bang release approach encouraged by waterfall has the unintended consequence of sidelining end users. Stakeholders act as a proxy for the user, as they do in most development methodologies, but the issue with waterfall is that it takes away any allowance for them to be wrong.

A plan based on perfectly predicting users' wants and needs is high risk and likely to fail. A more realistic approach is to accept the possibility of being wrong and to structure releases to reduce the size and impact of any possible misstep.

Every release should be a learning experience, whilst equally allowing time and resource to implement those learnings. If each release is a wager on your understanding of your users, how much of your development team's effort and hard work will you risk?

Linear World

One of the most alluring aspects of waterfall is its claimed ability to predict outcomes over long periods of time, allowing a team's planning horizon to stretch far into the future. The issue with this is that it implies a linear and predictable understanding of all the factors that can influence the outcome of a project.

Very little about human behaviour is linear; this applies equally to the people charged with delivering a project and to the intended users once the project delivers.

The interesting aspect of this flaw is that the illusion of certainty in predictions is so attractive that we fail to let it go despite frequent evidence to the contrary. It has almost become a cliché for an IT project to be late and over budget, despite the fact that all these projects started with a carefully crafted plan that was supposed to mean this time would be different.

No-one wants to hear the answer "we don't know" or "we're not sure", but if uncertainty exists these answers invite the possibility of planning a contingency; "we got it wrong" offers no such hope.

Nobody can claim that any methodology has all the answers, but an advantage of agile is its willingness to embrace the things we don't know and structure our approach accordingly. As attractive as it may be to be seduced by a methodology that claims to hold all the cards, it's important to remember that we develop code in the real world and are therefore subject to its foibles and unpredictability. Learn to live with that and you'll start making better decisions.


Tuesday, 15 January 2019

The Art of Software Engineering



Software development is rightly categorised as an engineering discipline; there are many well established principles and practices that deliver more robust and scalable code. But compared to its contemporaries, such as structural or mechanical engineering, it is still a relatively young practice.

This, combined with the fact that coding can be a blank canvas with many possible ways to achieve an outcome, means it can also exhibit traits more akin to an art form.

Non-Scientific Measures

Developers will often describe code as being good or bad. This is very rarely a judgement based on measurement, and opinions on the quality of code are not always universal. These opinions can drive new proposals for solutions to common scenarios, providing new takes on old problems.

Whilst I'm sure this can happen in other engineering disciplines, it feels like software is unique in the way it can revisit old problems that already have solutions. To put this in context, structural engineers probably no longer debate how to build a wall; a correct solution has been found and we've moved on to other areas.

Correctness in software engineering can be more opaque; it is weighted by the opinions and beliefs of the engineer making the assessment.

This isn't necessarily a negative: a willingness to re-evaluate and even criticise can be a strength if it's used as a driver to ensure that all possible avenues have been explored and no ideas discounted.

Valued Characteristics

One of the things that can drive a change of view on the correctness of code is a change in the characteristics that we most value. Most languages and frameworks have many options for tooling that can quantify these measures, and different philosophies may choose to put more emphasis on one or the other.

These might be testability, reusability, speed or efficiency. Whilst, when questioned, engineers would recognise the benefits of all these things, they may choose to sacrifice one for another because they feel it's a better measure of correctness.

To continue our analogy, the valued characteristics of a wall are unlikely to change. It needs to be strong and durable; dissenting voices amongst engineers are likely to be few and far between and unlikely to gain traction.

An advantage of this flux is that it develops strategies and practices to optimise certain aspects of our code. Even if you decide that such an optimisation isn't the sole focus of your engineering effort, its existence is still an advantage when you feel that particular aspect of your software could be improved.

No Raw Material

Where software engineering's uniqueness is harder to question is in its lack of a raw material. Most other engineering disciplines take raw materials, whether it be bricks and mortar or steel and plastic, and turn them into the desired structures and assemblies.

The raw material of software engineering is the creativity, ingenuity and effort of the developers that practice it. The advantage of this is the freedom it gives to evolve quickly and achieve impressive things on minimal budgets; the disadvantage comes when mindsets assume closer alignment with more traditional engineering.

Scaling software development is a difficult, and possibly still unsolved, problem. If you need to build a wall then clearly having more people laying bricks will speed up the process or enable you to build a bigger wall. Taking the same approach to painting a portrait is likely to lead to an incoherent and unsatisfying outcome.

Realistically, software engineering exists somewhere between those two extremes. In the early days of a project adding more engineers can increase output whilst maintaining quality, but this relationship is not linear and will very quickly break down if the differences from traditional engineering aren't recognised.

It may be that these aspects of software engineering are related to its relatively young age; after another hundred years of software development maybe we will see less flux. Or it could be that software engineering has qualities that make it inherently unique in its approach. Rather than something to be feared, we should embrace the freedom our discipline gives us to try new things, evolve our thinking and change the shape of the code we build.


Sunday, 6 January 2019

Code Recycling


If you observe a group of software engineers discussing an area of code for long enough you will hear the word "reuse". We rightly hold the benefits of code reuse in high esteem: it can reduce development time, provide increased levels of reliability and increase the adoption of desired practices.

But is this the whole story? Are there disadvantages to reuse? What different forms can reuse take?

This is potentially a large area of discussion with almost philosophical overtones, but one potential approach is to analyse what is being reused and identify the pros and cons of each approach.

Reuse of Effort

The simplest form of reuse, and one often considered sinful, is copy and paste. This is where code is simply duplicated, and potentially modified, resulting in the same code existing in multiple places in a code base.

This allows the developer doing the copying to take advantage of the thought and effort expended by another to solve a shared problem. The downside comes from the duplication itself. Code is not automatically an asset to an organisation, because it needs to be managed and maintained; duplicating an area of code creates the potential to duplicate bugs and the effort of fixing them.

So code should never be copied?

Well, the answer to that question is almost always going to be yes, but as with many things it doesn't hurt to apply some pragmatism on occasion. Copying a small number of lines of code, when the effort of defining an abstraction and creating a dependency for other areas of code outweighs the benefit, may not always be the wrong thing to do.

The analysis of whether or not this is the wrong thing to do needs to take into account the number of times the code has been copied. Copying it once, when defining an abstraction might be problematic, might be justifiable; copying it any more times than that indicates that the opportunity for reuse, and its associated benefits, is presenting itself.
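
As a minimal sketch of the trade-off, using a hypothetical name-normalisation routine that had been copied into a second class: once the copy appears, extracting a shared helper turns duplication into reuse, so a bug fix reaches every caller.

    // Hypothetical example: this logic previously existed as a copy in two classes.
    public final class Names {
        private Names() {}

        // Single definition: a fix here reaches every caller.
        public static String normalise(String raw) {
            return raw == null ? "" : raw.trim().toLowerCase();
        }
    }

    // Call sites now depend on the abstraction rather than carrying a copy:
    // String display = Names.normalise(input);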

Reuse of Code

When we've made the assessment that copying the code would be wrong, we look to reuse the code itself. This can take many forms, whether it be creating a library or sharing source code.

This is where the benefits of reuse are realised. Reusing code decreases development time by providing ready-made solutions to common problems, and reusing the code in multiple places allows it to be battle hardened and honed over time.

So code reuse is always a good thing?

Well, the answer to this is usually going to be yes, but it isn't an absolute. Reusing code creates dependencies, and this doesn't come without its own potential headaches. A large web of dependencies can stifle a code base's ability to be refactored, maintained and to grow. It is no surprise that many of the SOLID principles relate to reducing the scope of individual areas of code, and therefore the scope of any dependency on them.

This shouldn't be taken as an argument against reuse but as a word of caution to think it through. Reuse code when the abstraction will hold and the opportunity for reuse is clear and unambiguous.

Reuse of Function

Ultimately we aren't actually trying to reuse code but functionality, and we have developed ways to achieve this that go beyond code reuse.

Software as a Service (SaaS) is commonplace among cloud providers, but the same approach can be taken within your own code base. Divide and conquer has long been an effective strategy for engineering well defined code bases, and techniques such as microservices have this at their heart.

A library doesn't have to be the only way to reuse functionality: REST APIs can also allow functionality to be reused whilst providing a really clear interface with the potential to evolve over time as requirements change.
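
A minimal sketch of the idea, assuming a hypothetical internal pricing service and using Java's standard HttpClient (available since Java 11): the functionality is consumed over HTTP rather than by linking a shared library, and the interface can evolve behind the URL.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PricingClient {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint: the reused functionality lives behind
            // a REST interface instead of inside a shared binary.
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://pricing.internal.example/quotes/42"))
                    .GET()
                    .build();

            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }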

This won't always be the correct approach and will come down to the scale of the functionality that needs to be reused. Having a microservice to perform operations on strings would rightly be deemed ridiculous, but caching or complex data manipulation may be more likely candidates.

Identifying and effectively implementing code reuse is a core aspect of software engineering. We have many tools available to achieve this, and whilst it will nearly always be the right thing to do, the potential pitfalls must also be considered and factored into the strategy.


Sunday, 16 December 2018

Language Lessons


Developers will very often define themselves by the programming language they are using to apply their skills. Based on the perceived strengths and weaknesses of different languages they will also very often judge their counterparts along similar lines.

To a large extent this viewpoint is misguided. The differences between languages in the same paradigm are often superficial, and the majority of developers will use multiple languages during their career.

The reason developers will utilise many different languages is because languages are an evolving tool; this evolution can be tracked through various generations as new concepts have been developed and different approaches postulated.

First Generation (1GL)

1GL languages, also known as machine language, are code that can be directly stored and executed by the CPU: in other words, 1s and 0s.

Programming in a 1GL language would often involve the manipulation of switches or other hardware and is almost unrecognisable next to modern programming techniques.

Second Generation (2GL)

2GL languages start to introduce constructs that allow developers to read and make sense of the code. They are formed by combining instructions that are executed directly by the CPU being targeted. However, they provide little to no abstraction from the steps taken to run the program; developers operating at this level are manipulating registers via simple mathematical and logical operations.

Coding at this level has the potential to be very fast and very efficient, but it is equally extremely complex, error prone and difficult to debug.

Code written at this level is also not portable; it will only run on the generation of processors it has been written for. While most developers will not operate at this level, the code they write using higher level languages is often compiled down to a 2GL program, with the subsequent binary file that is deployed representing the 1GL level.

Third Generation (3GL)

3GL languages introduced many of the basic programming constructs that all modern day developers are familiar with. This includes logical constructs such as if and else-if, and looping constructs such as for, while and do-while.

Whilst still describing what needs to happen when the program runs, it describes this behaviour in a more algorithmic fashion, as opposed to describing how the CPU should manipulate bits and bytes.

A certain subset of 3GL languages also gave rise to the concept of Object Oriented (OO) languages, which attempt to represent programs as collections of data and functionality that interact in a manner designed to model the problem space being solved.

Some 3GL languages also attempt to make code portable by being compiled to a 2GL language that runs on a virtual machine and is therefore not tied to a particular physical CPU.

Fourth Generation (4GL)

4GL languages attempt to improve upon 3GL by operating upon larger data constructs. Sometimes this distinction can be subtle, but 3GL languages often operate on relatively simple and low level data structures such as strings, booleans and value types.

Operating with higher level data constructs can make these languages less general purpose and often leads to 4GL languages being specialised around a particular purpose such as graphics or database manipulation.

Fifth Generation (5GL)

5GL languages attempt to shift the viewpoint of code from how an outcome should be achieved to describing what that outcome should be. Rather than a programmer coding an algorithm, they describe the constraints the program should operate in and the desired outcome.

The boundary between 5GL and 4GL languages is often blurred. Some 4GL languages also try to operate along the lines of what as opposed to how and are sometimes miscategorised as 5GL languages.
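
Java streams give a small taste of this shift, even though Java is a 3GL rather than a 5GL: the loop below spells out how the answer is computed step by step, whilst the stream version declares what the answer should be and leaves the steps to the library.

    import java.util.List;

    public class WhatVsHow {
        public static void main(String[] args) {
            List<Integer> values = List.of(3, 7, 2, 9, 4);

            // "How": an explicit algorithm manipulating state step by step.
            int sum = 0;
            for (int v : values) {
                if (v > 3) {
                    sum += v;
                }
            }

            // "What": declare the desired outcome and let the library
            // decide the mechanics.
            int declarative = values.stream()
                    .filter(v -> v > 3)
                    .mapToInt(Integer::intValue)
                    .sum();

            System.out.println(sum + " " + declarative); // 20 20
        }
    }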

It would be wrong to view this evolution of languages as each subsequent generation being better than its predecessor. Mostly the changes between generations are focussed on making the programmer's life easier and allowing them to describe algorithms in higher level terms, making code more readable to someone who understands the problem being solved.

Many coders still operate at the 2GL or lower 3GL level because of the potential to achieve higher performance and efficiency. A good example of this is the prevalence that languages such as C and C++ still have in fields such as graphics and gaming.

There is no language to rule them all. Eventually you may be required to solve a problem that requires a different approach, and this requires you to be comfortable going to the toolbox and selecting the correct tool for the job in hand, no matter what generation that may be.

Whatever generation of language you are using, don't define yourself by it. A programming language is a tool, and you would likely be as adept at applying your coding skill in many other languages.
                 

Monday, 10 December 2018

Threading Things Together


I would predict with a high degree of certainty that all software engineers reading this could relay war stories of issues caused by multi-threading.

Cast very much as a necessary evil, multi-threading seems to have a unique ability to cause confusion, with difficult-to-diagnose defects and perplexing behaviour. Because of this, the features of many popular languages are evolving to try and help developers deal with this complexity and find better routes around it.

So if we can't live without threads how can we learn to live with them?

Recognising Dragons

One of the best ways to deal with potential issues is to recognise them coming over the horizon and try and steer the code base away from them.

Chief among these potential issues are race conditions: events happening out of the expected order and therefore driving unexpected outcomes. These issues are nearly always caused by the mutation of data by multiple threads. Whenever you decide to transition an area of code to be multi-threaded, particular attention needs to be paid to the reading and writing of data; left unchecked, these areas will almost certainly cause you problems.
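
A minimal sketch of the failure mode, using a hypothetical shared counter: two threads perform an unguarded read-modify-write and increments are silently lost.

    public class RaceDemo {
        private static int counter = 0; // shared mutable state, unguarded

        public static void main(String[] args) throws InterruptedException {
            Runnable work = () -> {
                for (int i = 0; i < 100_000; i++) {
                    counter++; // read-modify-write: not atomic
                }
            };
            Thread a = new Thread(work);
            Thread b = new Thread(work);
            a.start();
            b.start();
            a.join();
            b.join();

            // Expected 200000, but interleaved updates lose increments.
            // Guarding the counter (e.g. with AtomicInteger) restores correctness.
            System.out.println(counter);
        }
    }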

Once your system goes multi-threaded you will very often end up with dependencies between threads, and this situation can all too easily end in deadlock: Thread A requires Thread B to complete its action, but Thread B cannot complete because it requires resources currently held by Thread A. Nobody can move forward and the system is deadlocked.
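
The classic sketch of that situation, assuming two hypothetical locks acquired in opposite orders; agreeing a single lock ordering across the code base is the usual prevention.

    public class DeadlockDemo {
        private static final Object lockA = new Object();
        private static final Object lockB = new Object();

        public static void main(String[] args) {
            // Thread 1 takes A then wants B; Thread 2 takes B then wants A.
            new Thread(() -> {
                synchronized (lockA) {
                    pause();
                    synchronized (lockB) { System.out.println("thread 1 done"); }
                }
            }).start();

            new Thread(() -> {
                synchronized (lockB) {
                    pause();
                    synchronized (lockA) { System.out.println("thread 2 done"); }
                }
            }).start();
            // Neither thread can proceed: the system is deadlocked.
        }

        private static void pause() {
            try { Thread.sleep(100); } catch (InterruptedException ignored) {}
        }
    }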

It's also important to realise that threads are not entirely an abstract construct; they have an underlying reliance on infrastructure and are therefore finite. It is consequently possible for your system to become starved of threads and hit a limitation in throughput.

Learning the Language

Multi-threading is definitely an area of software engineering where going your own way is not advisable. The complexity of an effective implementation, combined with the potential pitfalls, isn't conducive to devising custom solutions.

Let others take the strain by learning the conventions for multi-threading within your chosen language; these are likely the result of a large number of man hours by experts and will be demonstrably effective and robust.

It's likely that various approaches are possible, operating with varying degrees of complexity. As with many areas of software development, a healthy aversion to complexity will not steer you far wrong: always strive for the simplest possible solution.

Many languages are developing features to lift the concern of thread creation and management from the shoulders of developers; prime examples would be the introduction of async\await within .NET or CompletableFuture in Java. Whilst it will still be possible to cause yourself problems with these kinds of features, hopefully it's harder to blow your foot off.
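
As a small illustration of the CompletableFuture style, with a hypothetical slow call standing in for real I/O: the code describes a pipeline of work and recovery, whilst the library manages the threads it runs on.

    import java.util.concurrent.CompletableFuture;

    public class FutureDemo {
        public static void main(String[] args) {
            CompletableFuture<Integer> total = CompletableFuture
                    .supplyAsync(FutureDemo::fetchOrderCount) // runs on a pool thread
                    .thenApply(count -> count * 2)            // transform the result
                    .exceptionally(ex -> 0);                  // recover from failure

            System.out.println(total.join());
        }

        // Hypothetical stand-in for a slow I/O call.
        private static int fetchOrderCount() {
            return 21;
        }
    }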

Strategic Decisions

Even once you understand the constructs of your language it's still imperative that your codebase has a defined strategy around multi-threading.

Trying to analyse the behaviour of a code base where new threads may be put into operation at any moment is infinitely harder than assessing code that will transition to being multi-threaded in a predictable and prescribed manner.

If this strategy is well thought through then it can look to protect developers from some of the common pitfalls.

Any sufficiently complex software will require the use of multi-threading to deal with the throughput demands being placed on it. A healthy dose of fear around this aspect of code usually comes to all developers as their experience grows.

Use this fear as a catalyst for defining a strategy, and a healthy laziness in looking for language features to take the strain. Multi-threading may on occasion cause you sleepless nights, but it is possible to find a way to let the magic happen without disaster being around the corner.

Sunday, 4 November 2018

The Final Step


We all spend our days following the processes and practices of software development; the final step of all our hard work is deployment to production.

Viewed in its simplest terms this is merely swapping out our latest and greatest bits and bytes for our previous efforts; we've completed all our testing and can deploy with 100% confidence. Whilst this is the perfection we are aiming for, no development team is ever beyond making mistakes.

Software development is a non-trivial discipline and some of our deployments will unintentionally introduce problems that we will have to deal with. With this in mind what strategies can we use to make our lives easier?

Update in Place

The simplest possible approach is to deploy your new code to your existing infrastructure by replacing the currently running code. However, what this approach gains in simplicity it loses in the ability to deal with failure.

Firstly, your infrastructure is not in a controlled state. By definition, with this approach your servers are pets: long serving machines that are the product of multiple deployments. This makes it very difficult to recreate the exact infrastructure in light of a catastrophic failure, and it is also likely to make it harder to keep your lower environments in a state that is representative of your production environment.

Secondly, this approach makes it impossible to roll back your changes should the need arise. Yes, you can re-deploy your old code, but this simply represents another deployment to an environment that you have already lost faith in; it does not give you certainty that taking this action will definitely resolve your problems.

In light of these drawbacks we need an approach that provides more certainty and gives us an effective escape plan should the worst happen.

Blue\Green

One such approach is Blue\Green deployments.

With this approach you maintain two identical production environments, Blue and Green. At any moment in time one is your production environment and the other provides a staging post for your next release.

When the time comes you deploy into, for example, the Blue environment and complete the necessary commission testing to give you the confidence to release. At this point you switch your production traffic from pointing at your Green environment to your new and shiny Blue environment.

Because you completed your commission testing against these exact servers running this exact deployment of code, you can be confident that this final switch of traffic is relatively risk free. Should the worst happen and you need to roll back your release, you simply switch the traffic back to the Green environment. Because this is the exact environment that was previously dealing with your production traffic, this is a clean and precise roll back to a previous state.

Having two production environments, combined with an effective roll back strategy, also gives you the freedom to destroy, rebuild and reconfigure your infrastructure.

One note of caution: whilst traffic switching and duplication of environments may provide deployment benefits when it comes to your code, the same cannot be said of your databases. Duplicating your production databases isn't likely to be practical, and any roll back of your code cannot simply throw away data written in between deployments.

The only real tactic to combat this is to try and avoid database changes that are destructive or not backwards compatible.

Canary Deployments

A further refinement can be made to the Blue\Green approach: even with that approach you switch all your production traffic and put all of your users on the new code base in one fell swoop.

It is also possible to phase this approach to gradually move users across to the new release, this limits exposure if problems are found and allows them to be fixed prior to all your users being affected. 

The groups of users that are directed towards the new code could be random, users that meet certain criteria based on the make-up of the release, or users that may take a more forgiving view of any problems. The latter could be internal\staff users or users that have signed up to be part of beta releases in order to be the first to try out your new features.

As your confidence in the release grows you can dial up the percentage of users exposed to the new code.
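
A minimal sketch of how that dial might work, assuming a hypothetical router: users are bucketed deterministically so each individual always lands on the same side of the split while the percentage rises.

    public class CanaryRouter {
        private final int rolloutPercent; // dial this up as confidence grows

        public CanaryRouter(int rolloutPercent) {
            this.rolloutPercent = rolloutPercent;
        }

        // Deterministic per user: the same user always gets the same answer,
        // so their experience doesn't flip between requests.
        public boolean routeToNewRelease(String userId) {
            int bucket = Math.floorMod(userId.hashCode(), 100);
            return bucket < rolloutPercent;
        }
    }

Starting at a low percentage and raising it as the release proves itself limits how many users any undiscovered problem can reach.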

The same warning about databases applies equally here when you have the potential to have different users writing and reading different data.

In a DevOps environment deployment should never be nerve-wracking because you don't trust the process. Errors and bugs are a reality of life as a software engineer, but the mechanism you use to ship your software should be well understood and trusted.

The final hurdle should be no more daunting or arduous than any other barrier you faced whilst developing your release. Fear in your deployment process will encourage you to release less frequently and this doesn't benefit you or your users.

Sunday, 21 October 2018

The Value of Working


In software development we spend a large amount of time defining, debating and discovering how things work. It sounds like a simple enough question: does this software work or not? To a first approximation there can be a simple yes or no answer, but the reality is usually more complicated.

Software engineering is often full of debate about the rights and wrongs of a solution; sometimes these debates are critical, sometimes they are superficial.

Given that, how can we define when software is working?

But, why?

The fact that software is working should come with an explanation as to how. Anytime this explanation delves into magic or the unexplainable "just because", the code being fit for purpose can be called into question.

This is because the working state of software isn't binary; there can be degrees of working. Something may work but be insecure, something may work but be inefficient, something may work but only if you don't stray from the golden path.

No software is perfect, or potentially even truly finished, so some of these shortcomings may be acceptable. But without an effective explanation of how something works they will remain undiscovered, and that determination can never be made.

Obviously limits need to be placed on this explanation; it isn't necessary to understand how everything works from your code all the way down to the metal, at least not in most circumstances. But every line of code that came from your keystrokes should have a clear explanation to justify its existence.

Show Me

Every line of code that ever produced a defect was at one point looked upon by a developer and declared to be working. But working shouldn't be an opinion; it should be demonstrable by the presence of a passing test.
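
A minimal sketch, assuming JUnit 5 and a hypothetical Basket class: the test turns "it works" from an opinion into a repeatable, demonstrable claim.

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    // Hypothetical class under test: line prices held in pence.
    class Basket {
        private int total = 0;
        void add(int pence) { total += pence; }
        int total() { return total; }
    }

    class BasketTest {
        // The passing test documents and proves the expected behaviour.
        @Test
        void totalSumsLinePrices() {
            Basket basket = new Basket();
            basket.add(299);
            basket.add(150);
            assertEquals(449, basket.total());
        }
    }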

Legacy code is often defined as code without tests, a definition not necessarily related to age. The reason for this is that the fear legacy code generates comes from the inability to prove it is working. This limits the ability to refactor or evolve the code because of the risk that it will no longer fulfil its duties.

In order to maintain our faith in tests, they should evolve as our understanding of the code being tested evolves. Defects in production happen despite the fact that tests declared the code to be working; when the code is proven not to be working, the tests should evolve to catch that imperfection next time.

Enough is Enough

Code being declared to be working is not necessarily the end of the story; working is a minimum requirement. Is it secure? Will it scale? Can it be tested? These are all questions that may prolong the amount of attention that needs to be paid to the code being worked on.

While these things are important they can also lead to procrastination; the ability to recognise when something is good enough only comes with experience. Inexperience tends to push towards either stopping at the first solution or prematurely optimising for situations that may never arise.

Developing a more rounded attitude to this situation is born from a pragmatism that software needs to ship, combined with the scars of releasing before something was truly working. To this point, whether or not software continues to work post release is also something that shouldn't be taken for granted; continuing to monitor your software once it's in the hands of users is what will enable you to make a better judgement next time around.

Software engineering is a discipline where the state of readiness of the end product is not a universal truth and can be in the eye of the beholder. Often something can be deemed working because it has shipped; reality often eventually proves this fallacy wrong. To a greater or lesser extent no-one has ever shipped bug-free software. We shouldn't torment ourselves about that fact, but simply realise that working is a state in time; our job is to optimise our code such that it works for the longest possible period of time.

Sunday, 14 October 2018

Feature Proof


A software development team is in a constant cycle of delivering features; they are often the currency on which the team is judged, measured by the velocity at which features can be delivered into production.

But not all features are created equally, and they don't all turn out to be successful. So how can we tell a potentially effective feature from one that will turn out to be a waste of effort, or maybe even worse, potentially degrading to the overall experience?

These are difficult questions and this post won't provide a foolproof way to deliver great new features, but it does present some criteria on which you may want to judge a potential feature when it is first proposed to be the next big thing.

Two Faced

Any new feature must provide a benefit both to users and to the business; if either of these groups ends up dissatisfied then the feature will ultimately die.

A feature that only benefits the user, while potentially cool and fun, will struggle to justify the effort involved in delivering and maintaining it; there should always be a defined benefit to the business in delivering it.

This benefit can be indirect or subtle, and not every feature needs to deliver sales, but the benefit must be understood and, as we will move on to discuss, should be measurable. If the business benefit becomes too intangible then it can become difficult to determine success; if this happens too frequently it's easy to suddenly find yourself with a system that delivers no positive business outcomes.

A feature that only delivers business value will struggle to gain traction and will do harm by persuading your users that there is nothing for them to gain from engaging with your software. Eventually a critical mass of users will reach this conclusion and your user base will collapse.

A good feature could be controversially described as a bribe, or at least a situation where you and your users come to an informal, unspoken agreement: they will do what you want them to do in exchange for what you're prepared to offer them.

Verifiable Success

The success of a feature shouldn't be intangible or abstract; the success of a business is neither of these things, so the features you develop to achieve that success shouldn't be either.

Before a new feature enters development there should be a hypothesis on why it will be successful and how that success is going to be measured. As we've already discussed, success doesn't have to just be about the bottom line, but any success worth achieving should be measurable.

Basing success on measurable analytics gives you the freedom to explore less obvious ideas. Combine this with A\B testing and an effective mechanism to turn features on and off, and you will provide yourself a platform to take more risks with the features you decide to try.
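
A minimal sketch of deterministic A\B assignment, using hypothetical names: each user is stably allocated a variant, so measured outcomes can later be attributed to A or B.

    public class Experiment {
        // Deterministic: the same user sees the same variant on every visit,
        // which keeps the analytics comparison between A and B clean.
        public static String variantFor(String userId, String featureName) {
            int bucket = Math.floorMod((userId + ":" + featureName).hashCode(), 2);
            return bucket == 0 ? "A" : "B";
        }

        public static void main(String[] args) {
            // Record the variant alongside the success metric for comparison.
            System.out.println(variantFor("user-123", "new-checkout"));
        }
    }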

This also presents the opportunity to learn; the behaviour of large numbers of users is often unpredictable and counter intuitive. In this environment, deploying a feature that has no defined measure of success is akin to gambling that your knowledge of your user base is complete and comprehensive. How confident are you that this is the case?

Do No Harm

Each new feature comes on the heels of those that came before it; if you've been effective, these existing features will be delivering value both for you and your users. If this is the case then the ultimate failure would be for a new feature to degrade the performance of, or otherwise compromise, this value chain.

No new feature should put at risk the effective operation of what has come before. This isn't only about shortcomings in the feature itself; its development also shouldn't serve to distract development effort from fixing defects or inefficiencies that currently exist in production.

Users can become frustrated by a lack of features, but they become angry when features they do want to use fail them. Too often a release can be seen as underwhelming if it only contains defect fixes, but these releases can deliver the most value because they are fixing and improving features that you know users are already engaged with.

Feature development is an inexact science; if there was a guaranteed formula for delivering features then no enterprise would ever fail. That means the advice given in this post also comes with no guarantee, but hopefully it reinforces the fact that new features need thought, and a feature delivered for the sake of delivering a feature is unlikely to benefit anybody. Once again in software engineering we may have found an instance where less is more.

Sunday, 7 October 2018

Testing Outside the Box


Automated testing of some description is now commonplace within the majority of development teams. This can take many different forms: unit testing, integration testing or BDD based testing.

These different approaches are designed to test individual units of code, complete sub-systems or an entire feature. Many teams will also be automating non-functional aspects of their code via NFT and Penetration testing.

But does this represent the entire toolbox of automated testing techniques that can be utilised to employ a shift left strategy? Whilst they certainly represent the core of any effective automation strategy, if we think outside the box we can come up with more inventive ways to verify the correct operation of our code.

Randomised UI Testing

Using BDD inspired tests to verify the implementation of a feature usually involves automating interaction with the application's UI: simulating clicking, scrolling, tapping and data entry. This interaction will be based on how we intend users to navigate and interact with our UI; however, users are strange beasts and will often do things we didn't expect or cater for.

Randomised UI testing attempts to highlight potential issues when users do stray off script, by interacting with the UI in a random, non-structured way. The tests do not start out with a scenario or outcome they are trying to provoke or test for; instead they keep bashing on your UI for a prescribed period of time, hoping to break your application.

Sometimes these tests will uncover failures in how your application deals with non-golden path journeys. On occasion the issues found may be very unlikely to ever be triggered by real users, but they nonetheless highlight areas where your code could be more defensive or less wasteful with resources.

Mutation Testing

Unit tests are the first line of defence to prove that code is still doing the things it was originally intended to do, but how often do we verify that the tests can be relied upon to fail if the code they are testing does develop bugs?

This is the goal of mutation testing. Supporting tooling will deliberately alter the code being tested and then run your unit tests, in the hope that at least one test will fail and successfully highlight the fact that the code under test is no longer valid. If all your tests pass, then similar mutations could be introduced by developers and potentially not be caught.

The majority of tooling in this area makes subtle changes to the intermediate output produced by languages that compile to a virtual machine. This may involve swapping logical operators, mathematical operators, postfix and prefix operators or assignment operations.
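
A small illustration, using a hypothetical eligibility check shown at source level (real tools, as described above, work on the compiled output): only a boundary test separates the original from its mutant.

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    // Hypothetical code under test. A mutation tool might swap >= for >,
    // producing a mutant that behaves differently only at the boundary.
    class Eligibility {
        static boolean isEligible(int age) {
            return age >= 18; // mutant: age > 18
        }
    }

    class EligibilityTest {
        // Without this boundary case both the original and the mutant pass;
        // this test "kills" the mutant by failing against it.
        @Test
        void eighteenIsEligible() {
            assertTrue(Eligibility.isEligible(18));
        }
    }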

Issues highlighted by mutation testing enable you to improve your unit tests to ensure that they cover all aspects of the code they are testing. It's also possible that they will highlight redundant code that has no material impact and can therefore be removed.

Constrained Environment

Up until now we've concentrated on testing the functional aspects of code, but failures in non-functional aspects can be equally impactful to users. One approach to this is Non Functional Testing (NFT), which pours more and more load on your system in order to test it to breaking point.

Another approach can be to run your system under normal load conditions but within an environment that is constrained in some way.

This might be running with less memory than you have in your standard environment, or less CPU. It might mean running your software alongside certain other applications that will compete for resource or deliberately limiting bandwidth and adding latency to network calls.

In a mobile or IoT context this could take the form of running in patchy network conditions or with low battery power.

Although this style of testing can be automated, it doesn't necessarily produce pass\fail output; instead it allows you to learn about how your system reacts under adversity. This learning may highlight aspects of your system that aren't apparent when resources are plentiful or conditions are perfect.

It's also possible that this style of testing will show that you can reduce the resources of the environment you deploy your code into and benefit from cost savings.

The first goal of any automated testing approach should be to verify the correct operation of code, but with a little imagination it's possible to test other aspects of your code base and create feedback loops that drive continual improvement.

Effective development also involves treating all the code that you write equally; tests are as much a part of your production code as the binaries that are deployed into production.

Defects in these tests will allow defects into your production environment that will potentially impact users. Not only should care be taken in writing them, but they should be under the same focus of improvement as every other line of code that you write.