Sunday 16 December 2018

Language Lessons


Developers very often define themselves by the programming language they use to apply their skills. Based on the perceived strengths and weaknesses of different languages, they will also often judge their counterparts along similar lines.

To a large extent this viewpoint is misguided. The differences between languages in the same paradigm are often superficial, and the majority of developers will use multiple languages during their career.

Developers end up using many different languages because languages are an evolving tool. This evolution can be tracked through various generations as new concepts have been developed and different approaches postulated.

First Generation (1GL)

1GL languages, also known as machine language, are code that can be stored and executed directly by the CPU, in other words 1s and 0s.

Programming in a 1GL language would often involve the manipulation of switches or other hardware and bears almost no resemblance to modern programming techniques.

Second Generation (2GL)

2GL languages start to introduce constructs that allow developers to read and make sense of the code. They are formed by combining instructions that are executed directly by the CPU being targeted. However, they provide little to no abstraction from the steps taken to run the program; developers operating at this level are manipulating registers via simple mathematical and logical operations.

Coding at this level has the potential to be very fast and very efficient, but equally is extremely complex, error prone and difficult to debug.

Code written at this level is also not portable; it will only run on the generation of processors it has been written for. While most developers will not operate at this level, the code they write using higher level languages is often compiled down to a 2GL program, with the resulting binary file that is deployed representing 1GL machine code.

Third Generation (3GL)

3GL languages introduced many of the basic programming constructs that all modern day developers are familiar with. This includes conditional constructs such as if and else-if, and looping constructs such as for, while and do-while.

Whilst still describing what needs to happen when the program runs it describes this behaviour in a more algorithmic fashion as opposed to describing how the CPU should manipulate bits and bytes. 
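
As a brief and purely illustrative sketch, assuming Java as an example of a typical 3GL language, the hypothetical snippet below expresses an algorithm in terms of conditions and loops rather than individual CPU instructions.

// A minimal sketch of 3GL-style constructs: the algorithm is described
// with conditions and loops rather than registers and jumps.
public class AverageCalculator {

    public static double averageOfPositives(int[] values) {
        int sum = 0;
        int count = 0;
        for (int value : values) {          // looping construct
            if (value > 0) {                // conditional construct
                sum += value;
                count++;
            }
        }
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public static void main(String[] args) {
        System.out.println(averageOfPositives(new int[] { 3, -1, 4, 1, -5 }));
    }
}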

A certain division of 3GL languages also gave rise to the concept of Object Oriented (OO) languages which attempted to represent programs in terms of collections of data and functionality that interact in a manner designed to model the problem space being solved.

Some 3GL languages also attempted to make code portable by being compiled to a 2GL language that runs on a virtual machine and is therefore not tied to a particular physical CPU.

Fourth Generation (4GL)

4GL languages attempt to improve upon 3GL by operating upon larger data constructs. Sometimes this distinction can be subtle, but 3GL languages often operate on relatively simple and low level data structures such as strings, booleans and value types.

Operating with higher level data constructs can make these languages less general purpose and often leads to 4GL languages being specialised around a particular purpose such as graphics or database manipulation.

Fifth Generation (5GL)

5GL languages attempt to shift the viewpoint of code from how an outcome should be achieved to describing what that outcome should be. Rather than coding an algorithm, the programmer describes the constraints the program should operate within and the desired outcome.

The boundary between 5GL and 4GL languages is often blurred. Some 4GL languages also try to operate along the lines of what as opposed to how and are sometimes miscategorised as 5GL languages.

It would be wrong to view this evolution of languages as each subsequent generation being better than its predecessor. Mostly the changes between generations are focussed on making the programmer's life easier and allowing them to describe algorithms in higher level terms, making code more readable to someone that understands the problem being solved.

Many coders still operate at 2GL or lower 3GL level because of the potential to achieve higher performance and efficiency. A good example of this is the prevalence that languages such as C and C++ still have in fields such as graphics and gaming.

There is no language to rule them all. Eventually you may be required to solve a problem that demands a different approach, and this requires you to be comfortable going to the toolbox and selecting the correct tool for the job in hand, no matter what generation that may be.

Whatever generation of language you are using, don't define yourself by it. A programming language is a tool and you would likely be just as adept at applying your coding skills in many other languages.
                 

Monday 10 December 2018

Threading Things Together


I would predict with a high degree of certainty that all software engineers reading this could relay war stories of issues caused by multi-threading.

Cast very much as a necessary evil, multi-threading seems to have a unique ability to cause confusion with difficult to diagnose defects and perplexing behaviour. Because of this, features of many popular languages are evolving to try and help developers deal with this complexity and find better routes around it.

So if we can't live without threads how can we learn to live with them?

Recognising Dragons

One of the best ways to deal with potential issues is to recognise them coming over the horizon and try and steer the code base away from them.

Chief among these potential issues are race conditions: events happening out of the expected order and therefore driving unexpected outcomes. These issues are nearly always caused by the mutation of data by multiple threads. Whenever you decide to transition an area of code to be multi-threaded, particular attention needs to be paid to the reading and writing of data; left unchecked, these areas will almost certainly cause you problems.
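
As a minimal, purely illustrative sketch in Java, the unsynchronised counter below shows the classic shape of a race condition: two threads read, modify and write shared state, so increments can be lost.

// Illustrative only: two threads mutate shared state without synchronisation,
// so the final count is usually less than the 200000 you might expect.
public class RacyCounter {
    private static int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                count++; // read-modify-write: not atomic
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("Count: " + count); // rarely 200000
    }
}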

Once your system goes multi-threaded you will very often end up with some dependency between threads. This situation can all too easily end in a deadlock: Thread A requires Thread B to complete its action, but Thread B cannot complete because it requires resources currently being held by Thread A. Nobody can move forward and the system is deadlocked.
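
A hedged sketch of the same situation in code: each thread holds one lock and waits forever for the other's, which is exactly the Thread A / Thread B stand-off described above.

// Illustrative deadlock: one thread holds lockA and waits for lockB,
// the other holds lockB and waits for lockA. Neither can proceed.
public class DeadlockSketch {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            synchronized (lockA) {
                pause(100);
                synchronized (lockB) { System.out.println("A finished"); }
            }
        }).start();

        new Thread(() -> {
            synchronized (lockB) {
                pause(100);
                synchronized (lockA) { System.out.println("B finished"); }
            }
        }).start();
    }

    private static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }
}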

It's also important to realise that threads are not an entirely abstract construct; they have an underlying reliance on infrastructure and are therefore finite. It is possible for your system to become starved of threads and hit a limitation in throughput.

Learning the Language

Multi-threading is definitely an area of software engineering where going your own way is not advisable. The complexity of building an effective implementation, combined with the potential pitfalls, isn't conducive to devising custom solutions.

Let others take the strain by learning the conventions for multi-threading within your chosen language, these are likely the result of a large number of man hours by experts and will be demonstrably effective and robust.

It's likely that various approaches are possible, operating with varying degrees of complexity. As with many areas of software development, a healthy aversion to complexity will not steer you far wrong; always strive for the simplest possible solution.

Many languages are developing features to remove the concern around thread creation and management from the shoulders of developers; prime examples would be the introduction of async\await within .NET or CompletableFuture in Java. Whilst it will still be possible to cause yourself problems with these kinds of features, hopefully it's harder to blow your foot off.
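
As an example of leaning on these features rather than managing threads by hand, a minimal Java sketch using CompletableFuture might look like the following; fetchPrice and fetchStock are hypothetical stand-ins for real work.

import java.util.concurrent.CompletableFuture;

// A sketch of composing asynchronous work without creating or managing
// threads directly; the "remote calls" are hypothetical placeholders.
public class AsyncSketch {
    public static void main(String[] args) {
        CompletableFuture<Double> price = CompletableFuture.supplyAsync(AsyncSketch::fetchPrice);
        CompletableFuture<Integer> stock = CompletableFuture.supplyAsync(AsyncSketch::fetchStock);

        price.thenCombine(stock, (p, s) -> "Price " + p + ", stock " + s)
             .thenAccept(System.out::println)
             .join(); // wait for completion in this demo only
    }

    private static double fetchPrice() { return 9.99; } // pretend remote call
    private static int fetchStock()    { return 42; }   // pretend remote call
}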

Strategic Decisions

Even once you understand the constructs of your language it's still imperative that your codebase has a defined strategy around multi-threading.

Trying to analyse the behaviour of a code base where new threads may be put into operation at any moment is infinitely harder than assessing code that will transition to being multi-threaded in a predictable and prescribed manner.

If this strategy is well thought through then it can look to protect developers from some of the common pitfalls.

Any sufficiently complex software will require the use of multi-threading to deal with the throughput demands being placed on it. A healthy dose of fear around this aspect of code usually comes to all developers as their experience grows.

Use this fear as a catalyst for defining a strategy and a healthy laziness in looking for language features to take the strain. Multi-threading may on occasion cause you to have sleepless nights, but it is possible to find a way to let the magic happen without disaster being around the corner.

Sunday 4 November 2018

The Final Step


We all spend our days following the processes and practices of software development; the final step of all our hard work is deployment to production.

Viewed in its simplest terms this is merely swapping out our latest and greatest bits and bytes for our previous efforts; we've completed all our testing and can deploy with 100% confidence. Whilst this is the perfection we are aiming for, no development team is ever beyond making mistakes.

Software development is a non-trivial discipline and some of our deployments will unintentionally introduce problems that we will have to deal with. With this in mind what strategies can we use to make our lives easier?

Update in Place

The simplest possible approach is to deploy your new code to your existing infrastructure by replacing the currently running code. However what this approach gains in simplicity it loses in the ability to deal with failure.

Firstly, this means your infrastructure is not in a controlled state. By definition, with this approach your servers are pets; they are long serving machines that are the product of multiple deployments. This makes it very difficult to recreate the exact infrastructure in light of a catastrophic failure, and it is also likely to make it harder for you to keep your lower environments in a state that is representative of your production environment.

Secondly, this approach makes it impossible to roll back your changes should the need arise. Yes, you can re-deploy your old code, but this simply represents another deployment to an environment that you have already lost faith in; it does not give you the certainty that if you take this action your problems will definitely be resolved.

In light of these drawbacks we need an approach that provides more certainty and gives us an effective escape plan should the worst happen.

Blue\Green

One such approach is Blue\Green deployments.

With this approach you maintain two identical production environments, Blue and Green. At any moment in time one is your production environment and one provides a staging post for your next release.

When the time comes you deploy into, for example, the Blue environment and complete the necessary commission testing to give the confidence to release. At this point you switch your production traffic from pointing at your Green environment to your new and shiny Blue environment.

Because you completed your commission testing against these exact servers running this exact deployment of code, you can be confident that this final switch of traffic is relatively risk free. Should the worst happen and you need to roll back your release, you simply switch the traffic back to the Green environment. Because this is the exact environment that was previously dealing with your production traffic, this is a clean and precise roll back to a previous state.

Having two production environments, combined with an effective roll back strategy, also gives you freedom to destroy, rebuild and reconfigure your infrastructure.

One note of caution: whilst traffic switching and duplication of environments may provide deployment benefits when it comes to your code, the same cannot be said of your databases. Duplicating your production databases isn't likely to be practical, and any roll back of your code cannot simply throw away data written in between deployments.

The only real tactic to combat this is to try and avoid database changes that are destructive or not backwards compatible.

Canary Deployments

A further refinement can be made to the Blue\Green approach; even with that approach you switch all your production traffic and put all of your users on the new code base in one fell swoop.

It is also possible to phase this approach to gradually move users across to the new release, this limits exposure if problems are found and allows them to be fixed prior to all your users being affected. 

The groups of users that are directed towards the new code could be random, users that meet certain criteria based on the make-up of the release, or users that may take a more forgiving view of any problems. The latter could be internal\staff users or users that have signed up to be part of beta releases in order to be the first to try out your new features.

As your confidence in the release grows you can dial up the percentage of users exposed to the new code.
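
A hedged sketch of how that dial might work: each user is deterministically bucketed into a percentage range, so the same user always sees the same version while the rollout percentage grows. All names here are hypothetical.

// Illustrative canary routing: a user is sent to the new release only if
// their stable hash bucket falls below the current rollout percentage.
public class CanaryRouter {
    private final int rolloutPercentage; // 0..100, dialled up as confidence grows

    public CanaryRouter(int rolloutPercentage) {
        this.rolloutPercentage = rolloutPercentage;
    }

    public boolean routeToNewRelease(String userId) {
        int bucket = Math.floorMod(userId.hashCode(), 100);
        return bucket < rolloutPercentage;
    }

    public static void main(String[] args) {
        CanaryRouter router = new CanaryRouter(10); // expose roughly 10% of users
        System.out.println(router.routeToNewRelease("user-1234"));
    }
}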

The same warning about databases applies equally here when you have the potential to have different users writing and reading different data.

In a DevOps environment deployment should never be nerve-wracking because you don't trust the process. Errors and bugs are a reality of life as a software engineer, but the mechanism you use to ship your software should be well understood and trusted.

The final hurdle should be no more daunting or arduous than any other barrier you faced whilst developing your release. Fear in your deployment process will encourage you to release less frequently and this doesn't benefit you or your users.

Sunday 21 October 2018

The Value of Working


In software development we spend a large amount of time defining, debating and discovering how things work. It sounds like a simple enough question: does this software work or not? To a first approximation there can be a simple yes or no answer, but the reality is usually more complicated.

Software engineering is often full of debate about the rights and wrongs of a solution; sometimes these debates are critical, sometimes they are superficial.

Given that, how can we define when software is working?

But, why?

The fact that software is working should come with an explanation as to how. Anytime this explanation delves into magic or an unexplainable "just because", the code being fit for purpose can be called into question.

This is because the working state of software isn't binary, there can be degrees of working. Something may work but be insecure, something may work but be inefficient, something may work but only if you don't stray from the golden path.

No software is perfect, or potentially even truly finished, so some of these shortcomings may be acceptable, but without an effective explanation of how something works they will remain undiscovered and that determination can never be made.

Obviously limits need to be placed on this explanation; it isn't necessary to understand how everything works from your code all the way down to the metal, at least not in most circumstances. But every line of code that came from your key strokes should have a clear explanation to justify its existence.

Show Me

Every line of code that ever produced a defect was at one point looked upon by a developer and declared to be working. But working shouldn't be an opinion; it should be demonstrable by the presence of a passing test.
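
For example, a hypothetical sketch assuming JUnit 5 is available: the claim "the discount calculation works" stops being an opinion once a test like this passes as part of the build. Both classes here are illustrative stand-ins for your own code.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// A trivial piece of "production" code and the test that demonstrates,
// rather than merely asserts, that it is working.
class DiscountCalculator {
    double apply(double orderTotal) {
        return orderTotal >= 100.0 ? orderTotal * 0.9 : orderTotal;
    }
}

class DiscountCalculatorTest {
    @Test
    void tenPercentDiscountIsAppliedToOrdersOfOneHundredOrMore() {
        assertEquals(90.0, new DiscountCalculator().apply(100.0), 0.001);
    }
}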

Legacy code is often defined as code without tests, not necessarily code of a certain age. The reason for this is that the fear legacy code generates comes from the inability to prove it is working. This limits the ability to refactor or evolve the code because of the risk that it will no longer fulfil its duties.

In order to maintain our faith in tests they should evolve as our understanding of the code being tested evolves. Defects in production happen despite the fact that tests declared the code to be working; when this is proven to be the case the tests should evolve to catch this imperfection next time.

Enough is Enough

Code being declared to be working is not necessarily the end of the story; working is a minimum requirement. Is it secure? Will it scale? Can it be tested? These are all questions that may prolong the amount of attention that needs to be paid to the code being worked on.

While these things are important they can also lead to procrastination; the ability to recognise when something is good enough only comes with experience. Inexperience often tends to push towards either stopping at the first solution or prematurely optimising for situations that may never arise.

Developing a more rounded attitude to this situation is born from a pragmatism that software needs to ship, combined with the scars of releasing before something was truly working. To this point, whether or not software continues to work post release is also something that shouldn't be taken for granted; continuing to monitor your software once it's in the hands of users is what will enable you to make a better judgement next time around.

Software engineering is a discipline where the state of readiness of the end product is not a universal truth and can be in the eye of the beholder. Often something can be deemed working because it has shipped; reality usually proves this fallacy wrong eventually. To a greater or lesser extent, no-one has ever shipped bug-free software. We shouldn't torment ourselves about that fact but simply realise that working is a state in time; our job is to optimise our code such that it works for the longest possible period of time.

Sunday 14 October 2018

Feature Proof


A software development team is in a constant cycle of delivering features; features are often the currency on which the team is judged, measured by the velocity at which they can be delivered into production.

But not all features are created equally, and they don't all turn out to be successful. So how can we tell a potentially effective feature from one that will turn out to be a waste of effort, or maybe even worse, degrading to the overall experience?

These are difficult questions and this post won't provide a fool-proof way to deliver great new features, but it does present some criteria on which you may want to judge a potential feature when it is first proposed to be the next big thing.

Two Faced

Any new feature must provide a benefit both to users and to the business; if either of these groups ends up dissatisfied then the feature will ultimately die.

A feature that only benefits the user, while potentially cool and fun, will struggle to justify the effort involved in delivering and maintaining it; there should always be a defined benefit for the business in delivering it.

This benefit can be indirect or subtle, not every feature needs to deliver sales, but the benefit must be understood and, as we will move on to discuss, should be measurable. If the business benefit becomes too intangible then it can become difficult to determine success; if this happens too frequently it's easy to suddenly find yourself with a system that delivers no positive business outcomes.

A feature that only delivers business value will struggle to gain traction and will do harm by persuading your users that there is nothing for them to gain from engaging with your software. Eventually a critical mass of users will reach this conclusion and your user base will collapse.

A good feature could be controversially described as a bribe, or at least a situation where you and users come to an informal unspoken agreement, they will do what you want them to do in exchange for what you're prepared to offer them.

Verifiable Success

The success of a feature shouldn't be intangible or abstract, the success of a business is neither of these things so the feature you develop to achieve that success shouldn't be either.

Before a new feature enters development there should be a hypothesis on why it will be successful and how that success is going to be measured. As we've already discussed success doesn't have to just be about the bottom line but any success worth achieving should be measurable.

Basing success on measurable analytics gives you the freedom to explore less obvious ideas. Combine this with A\B testing and an effective mechanism to turn features on and off, and you will provide yourself a platform to take more risks with the features you decide to try.

This also presents the opportunity to learn, the behaviour of large numbers of users is often unpredictable and counter intuitive. In this environment deploying a feature that has no defined measure of success is akin to gambling that your knowledge of your user base is complete and comprehensive, how confident are you that this is the case?

Do No Harm

Each new feature comes on the heels of those that have come before it; if you've been effective these existing features will be delivering value both for you and your users. If this is the case then the ultimate failure would be for a new feature to degrade the performance of, or otherwise compromise, this value chain.

No new feature should put at risk the effective operation of what has come before. This isn't just about shortcomings in the feature itself; its development also shouldn't serve to distract effort from fixing defects or inefficiencies that currently exist in production.

Users can become frustrated by a lack of features, but they become angry when features they do want to use fail them. Too often a release can be seen as underwhelming if it only contains defect fixes, but these releases can deliver the most value because they are fixing and improving features that you know users are already engaged with.

Feature development is an inexact science; if there were a guaranteed formula for delivering features then no enterprise would ever fail. It also means that the advice given in this post comes with no guarantee, but hopefully it reinforces the fact that new features need thought, and a feature for the sake of delivering a feature is unlikely to benefit anybody. Once again in software engineering we may have found an instance where less is more.

Sunday 7 October 2018

Testing Outside the Box


Automated testing of some description is now commonplace within the majority of development teams. This can take many different forms: unit testing, integration testing or BDD based testing.

These different approaches are designed to test individual units of code, complete sub-systems or an entire feature. Many teams will also be automating non-functional aspects of their code via NFT and Penetration testing.

But does this represent the entire toolbox of automated testing tools that can be utilised to employ a shift left strategy? Whilst they certainly represent the core of any effective automation strategy if we think outside the box then we can come up with more inventive ways to verify the correct operation of our code.

Randomised UI Testing

Using BDD inspired tests to verify the implementation of a feature usually involves automating interaction with the application's UI, simulating clicking, scrolling, tapping and data entry. This interaction will be based on how we intend users to navigate and interact with our UI; however, users are strange beasts and will often do things we didn't expect or cater for.

Randomised UI testing attempts to highlight potential issues if users do stray off script by interacting with the UI in a random non-structured way. The tests do not start out with a scenario or outcome they are trying to provoke or test for, instead they keep bashing on your UI for a prescribed period of time hoping to break your application.

Sometimes these tests will uncover failures in how your application deals with non-golden path journeys. On occasion the issues they find may well be very unlikely to ever be triggered by real users, but they nonetheless highlight areas where your code could be more defensive or less wasteful with resources.

Mutation Testing

Unit tests are the first line of defence to prove that code is still doing the things it was originally intended to do, but how often do we verify that the tests can be relied upon to fail if the code they are testing does develop bugs?

This is the goal of mutation testing: supporting tooling will deliberately alter the code being tested and then run your unit tests, in the hope that at least one test will fail and successfully highlight the fact that the code under test is no longer valid. If all your tests pass then similar mutations could be introduced by developers and potentially not be caught.

The majority of tooling in this area makes subtle changes to the intermediate output (such as bytecode) that managed languages are compiled to. This may involve swapping logical operators, mathematical operators, prefix and postfix operators or assignment operations.
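
To make the idea concrete, here is a hypothetical mutant written out by hand in Java; in practice the tooling makes the change behind the scenes. If the existing tests still pass against the mutated version, the mutant "survives" and the tests need strengthening.

// Hypothetical illustration of a single mutant.
class EligibilityCheck {
    // Original code under test.
    boolean isEligible(int age) {
        return age >= 18;
    }

    // The kind of change a mutation tool might make: ">=" swapped for ">".
    // A suite that never tests the boundary (age 18) will keep passing,
    // so this mutant survives and exposes the gap in the tests.
    boolean isEligibleMutated(int age) {
        return age > 18;
    }
}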

Issues highlighted by mutation testing enable you to improve your unit tests to ensure that they cover all aspects of the code they are testing. It's also possible that they will highlight redundant code that has no material impact and can therefore be removed.

Constrained Environment

Up until now we've concentrated on testing the functional aspects of code, but failures in non-functional aspects can have an equal impact on users. One approach to this is Non Functional Testing (NFT), which pours more and more load on your system in order to test it to breaking point.

Another approach can be to run your system under normal load conditions but within an environment that is constrained in some way.

This might be running with less memory than you have in your standard environment, or less CPU. It might mean running your software alongside certain other applications that will compete for resource or deliberately limiting bandwidth and adding latency to network calls.

In a mobile or IoT context this could take the form of running in patchy network conditions or with low battery power.

Although this style of testing can be automated, it doesn't necessarily produce pass\fail output; instead it allows you to learn about how your system reacts under adversity. This learning may highlight aspects of your system that aren't apparent when resources are plentiful or conditions are perfect.

It's also possible that this style of testing will show that you can reduce the resources of the environment you deploy your code into and benefit from cost savings.

The first goal of any automated testing approach should be to verify correct operation of code, but with a little imagination it's possible to test other aspects of your code base and create feedback loops that drive continual improvement.

Effective development also involves treating all code that you write equally; tests are as much part of your production code as the binaries that are deployed into production.

Defects in these tests will allow defects into your production environment that will potentially impact users. Not only should care be taken in writing them, but they should be under the same focus of improvement as every other line of code that you write.


Sunday 30 September 2018

Adolescent Behaviour


The start-up holds an almost mythological place in the mindset of the technology industry, often perceived as a zen-like state with all elements in perfect balance producing perfect software.

This article is not meant to be an attack on the notion of a start-up, many large organisations can benefit from the principles and practices they observe.

Instead it presents some views of behaviours often observed in a start-up culture, not necessarily good or bad but simply a consequence of being new on the block with limited resources and a strong desire to be successful quickly.

Bend the Truth

Start-ups are often in a constant state of seeking funding, whether directly via venture capitalists or investors, or indirectly via trying to attract customers to provide an income stream. This means they are very often pitching, both formally and informally, they are trying to get their message across and advertise their capabilities and ambitions.

This may seem a controversial statement but sometimes they lie, and actually this is ok.

Especially in the very early days if a start-up was honest about their current capabilities they would have very little to pitch so it is natural that this gets embellished to present future goals as already being in the here and now.

The critical factor is that everyone needs to recognise that the truth is being bent.

Those being pitched to need to understand the nature of a start-up in this position and judge them more on whether they think they can get to the destination they are pitching and not whether they believe they have already arrived.

In turn the start-up should not present what's required to its developers as if this was always supposed to be the case, and should acknowledge that hard work and savvy will be required to meet expectations.

It is possible to sell a dream as reality and hit deadlines but all parties must acknowledge the nature of the pitch and agree to work together to make the lie true.

Building Front to Back

Another consequence of the constant need to be pitching is the hunt for the wow factor. No matter the level of engineering genius it is difficult to get people excited about an effective and scalable infrastructure design or testable and adaptable software architecture.

This leads to a natural concentration on the frontend over the backend, to developing an awe-inspiring UI\UX that teases at the functionality and possible ingenuity underneath.

In time if the venture is to be a success an equally effective backend will be joined with this snazzy and flashy front end but initially there will be growing pains as the first iteration creaks under the weight of the frontend ambitions.

Again the important factor here is an acknowledgement of reality, such that we don't overestimate what is achievable in this initial phase of the organisation's development, along with the fact that effort will need to be expended on building the capability that was hinted at.

Make Noise

The goal of many start-ups is not to scale their original idea into a Goliath; starting from zero and scaling out to infinity is an extremely difficult undertaking. That's why many are often looking for a shortcut, which frequently involves becoming part of a larger organisation.

A tactic for achieving this is to make noise about the possibilities your technology points at, to whet the appetite of other organisations and tempt them into taking a chance that this could be the next big thing.

This tends to lead to effort being put into the core technological aspects of a proposition and not necessarily into scaling it out. Scale is put off to another day, when the resources available to the organisation may have grown significantly and quickly, both financially and in terms of experience of taking things to the next level.

As with the previous points this approach isn't a problem providing all involved are in on it and honest about the objective of the initial short and medium term goals.

Not all the points made in this article are relevant or accurate for all start-ups. So-called unicorns, who start life with or soon achieve vast resources, are often not constrained by the issues mentioned here.

But this is a very different situation compared to a small band of people trying to build success from humble beginnings, which presents a unique challenge without a guaranteed blueprint for success. This leads to certain unique behaviours that, while they may at first seem questionable, are understandable given the circumstances and can lead to success despite the initially overwhelming odds.

Sunday 16 September 2018

Advancing the Art


Software Engineers have an intimate relationship with programming languages. Vigorous arguments and spirited discussions will ensue whenever the advantages of one over another are debated.

Sometimes these arguments, while enjoyable, revolve around technical nuance. Whilst different paradigms exist, functional, procedural or object oriented being examples, languages within each often exhibit similar features and functionality.

Most languages continue to evolve, introducing new ideas to try and stay relevant to developers and the industry. Sometimes these new features have a fundamental impact on the approaches that are taken to writing code. In this article I'll present what I feel to be some such features; they aren't necessarily recent but their impact cannot be overestimated.

Managed Runtimes

The difference between managed and unmanaged code lies less in the constructs of the language itself and more in the target they are compiled to.

Unmanaged code compiles to the instruction set of a physical machine, whereas managed code compiles to be run on an idealised virtual machine. Many examples of this exist but two of the most common are the Java Virtual Machine (JVM) and the .NET Common Language Runtime (CLR).

The original intention of this approach was to make code more portable, the so-called "write once, run anywhere" methodology, but it has also brought many additional benefits.

These benefits relate to both performance and security, but chief among them is the advent of garbage collection. Because of this, a whole generation of developers don't know the pain of managing memory: trying to match up allocations and deallocations, pointer arithmetic and heap exhaustion. Whilst it is still clearly possible to create memory leaks, many tools now exist to help with this.

The benefits of a managed runtime are continuing to be felt; emerging technologies such as WebAssembly are attempting to bring more language choices to web development, and the implications of these approaches and the impact they will have on software are probably still yet to be realised.

Asynchronous Programming

Phil Karlton is often credited with the quote: 

"There are only two hard things in Computer Science: cache invalidation and naming things"

Many have suggested adding to that list; one that would get my vote would be concurrency. As code evolves beyond the trivial there almost inevitably comes a point where we need to be able to do multiple things at once; equally inevitable are the problems we get ourselves into when we reach this point.

There was a time when developers were directly exposed to thread scheduling and expected to manage this complex and error prone area. Thankfully modern languages have provided abstractions to protect us from ourselves and allow concurrency to be approached in a safe manner.

These abstractions are usually hiding a large amount of complexity and they do not mean that concurrency is always completely safe and bug free, but as with managed memory they do allow developers to not expose themselves directly to this complexity and thus give them a fighting chance of getting things right.

Lambda Expressions

In the past all functionality had to be bound to an identifier, which made it difficult for functionality to be passed between entities. It's true that languages like C have function pointers, but even in this scenario the functionality itself is not unbound and can be a difficult concept to use effectively.

The emergence of lambda expressions has enabled variables relating to functionality to be defined that are truly first class citizens, this can lead to very expressive code that provides extremely readable solutions to what were once difficult problems.

A good example of such a technique is the .NET Language Integrated Query (LINQ) feature. Using this feature operations on large datasets that would have traditionally required potentially large blocks of complicated code can now be very simply expressed.
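
LINQ itself is a .NET feature, but the same idea is available in Java through lambdas and the streams API; the sketch below is purely illustrative of the query-like style this enables.

import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: filtering and transforming a dataset declaratively,
// rather than with hand-written loops and temporary collections.
public class StreamSketch {
    public static void main(String[] args) {
        List<String> names = List.of("Ada", "Grace", "Alan", "Barbara", "Edsger");

        List<String> result = names.stream()
                .filter(name -> name.length() > 3)   // a lambda passed around like data
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());

        System.out.println(result); // [ALAN, BARBARA, EDSGER, GRACE]
    }
}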

The ability to pass functionality around as easily as we do data opens up new techniques and approaches that previously would have been impossible, or at least would have made for inelegant and buggy code.

Reflection

It used to be that code was mostly geared around operating on various forms of data structures; with the invention of reflection it became possible for code to become meta, with code written to operate on other pieces of code.
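
A minimal, hypothetical Java sketch of code operating on code: inspecting a class at runtime and invoking a method chosen by name rather than by a compile-time call.

import java.lang.reflect.Method;

// Illustrative reflection: the method to call is looked up at runtime by name.
public class ReflectionSketch {

    public String greet(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) throws Exception {
        Object target = new ReflectionSketch();
        Method method = target.getClass().getMethod("greet", String.class);
        Object result = method.invoke(target, "world");
        System.out.println(result); // Hello, world
    }
}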

This can be a double edged sword; on occasion reflection can be used to get developers out of corners that they are responsible for painting themselves into, but it is a powerful technique that is sometimes reached for as a nuclear option when more efficient solutions could have been found.

Despite this, the addition of reflection to the toolbox was a big step forward for the art of software engineering and opened up whole new ways of thinking about problems and their solutions.

Many people reading this will have other language features that they feel should be on this list. My views on this are likely to be influenced by my own experience in development but this speaks to the diversity that exists within software engineering.

We should consider ourselves lucky to work in an industry where the raw materials available to us are constantly evolving and growing. These advances allow previous scars to heal, whether these be related to memory leaks, deadlocking or pointer confusion, the pain of dealing with these problems inspires us to solve them once and for all so that future generations of engineers don't have to experience the same frustrations.                                    

Sunday 9 September 2018

Rears on Seats


An adage that could be applied to many software projects would be something along the lines of from small beginnings large inefficient teams grow.

In the early days of a project a well focused, dedicated and small team can appear to achieve miracles in terms of the scale of what can be built and the time it takes to build it. When faced with such tremendous success it is a common and natural reaction to assume that if we grow this well performing team then we will be able to achieve even more in even less time.

Unfortunately this thinking is flawed, but despite this it's a lesson the software industry has failed to learn almost since its inception.

Software as a Commodity

Much of this flawed thinking comes from viewing software as a commodity: more engineers means more software, and more software means more functionality. The problem with this approach is that it views software as a raw material with intrinsic value, which in turn equates software development with a production line.

Engineers aren't simply producing chunks of software to package and ship to production, they are in fact trying to work as a team to craft and refine a single piece of software.

To continue the production line analogy, software engineers aren't all building individual cars, they are a race team trying to perfect a single race car. When viewed like this it should be clear that more craftsmen won't help complete the job any quicker or to any greater degree of quality.

Software has no intrinsic value; we aren't employing engineers to produce more of it, we are employing them to produce just enough for us to extract value, and to try and increase the amount of value we can squeeze out of any given quantity of code.

Scaling Capability Not Code

As an organisation engineering software it's important to understand, within the context of scaling, what it is you are trying to scale.

If this is simply the amount of code you can produce then increasing the number of engineers in your organisation will achieve it; if however you want to scale your capability to deliver reliable and robust functionality then the solution may be more subtle.

Software engineering resource is more akin to a sponge than a brick. It can be counter productive to try and line them all up to build a wall, instead you need to concentrate on squeezing more out of the resource you already have.

This doesn't mean working them harder; it means having faith in their abilities and asking them how they think they could be more efficient or more productive. At a certain point the answer will be that more engineers are required; reaching this point prematurely will ultimately have a much bigger negative impact.

Your engineers are experts in the development of software both in general and with regard to the software you currently have, they will also be experts on how your current processes can be improved and refined.

People Bring Overhead

Software engineers do not exist or work in isolation, software is developed by teams. As both the number of people in a team and the number of teams themselves grow this comes with an increase in certain overheads.

Effective communication between engineers will decline, more formal processes will grow and dynamism will take a back seat.

Of course this isn't inevitable, many organisations have large and successful teams. But effort is required to think about how to structure these teams to ensure that barriers don't grow between them.

Ineffective communication between teams is quite possibly the number one reason why large numbers of engineers can fail to be more than or even equal to the sum of their parts.

There are no hard and fast rules as to how to accomplish this, but trying to resist the temptation to throw resource at a problem will at least delay these headaches until the point where there is more certainty that a team needs to grow.

As with many things an important step when trying to scale software development is acceptance that you'll probably get it wrong. Recognising the signs that will highlight this will ensure that it isn't your wrongness that scales. The next step is in embracing the fact that imposing a scaling strategy is likely to be ineffective, instead, placing it in the hands of those that are being asked to scale has much more potential to produce the right answer.

Sunday 19 August 2018

Divide and Deploy


Certain patterns and practices can come and go within software engineering, either becoming defunct, disproven or otherwise going out of fashion. However, divide and conquer, as epitomised by principles such as Single Responsibility, has always been a strategy for delivering scalable and robust solutions.

It is the application of this mindset that led to the advent of microservices. This technique divides an application into a group of loosely coupled services, each offering a specific, cohesive and relatively narrow set of functionality.

Any architecture or pattern when described succinctly can sound overly simplified; the description in the previous paragraph doesn't enlighten the reader to the devil in the detail when it comes to the implementation and application of a pattern such as microservices.

This post also won't cover these topics in enough detail to fully arm the reader but it will attempt to point in the right direction.

What is Micro?

The effectiveness of any microservices approach will be reliant on developing the correct segregation of duties for each service, this provides the modularity that is the basis for the success of the pattern. The reason this aspect of the approach is non-trivial is because it is possible for microservices to be too small.

This can create increased cognitive load on those tasked with working on the system as well as creating a ripple effect when changes are required as each service becomes less independently deployable due to its increased dependencies on other services to achieve an outcome.

A good starting point can be to align your initial design to the agents of change within your organisation, this might be areas such as ordering, billing, customer or product discovery. This can enhance the benefit of the scope of change being contained whilst also allowing engineers and the rest of the business to have a common view of the world.

Another, more technical, analysis can be to assess which is more time consuming: to re-write or to re-factor? A sweet spot for a microservice, and an indication of its scope, is when it would be quicker to re-write than to re-factor whenever a wide reaching change is required.

This isn't to say that it is optimal to be continually re-writing services but the freedom offered by an architecture where this is the case can be a great aid to innovation and the effective implementation of change.

Micro-Communication

Once we have a suite of microservices how can they be drawn together to form a coherent and competent system whilst remaining loosely coupled? Two possible solutions to this conundrum are REST APIs and an Event Bus.

These two different approaches to communication between services fulfil the differences between explicit communication and a loose transmission of information.

On occasion two microservices will need to explicitly communicate, as an example an ordering service may need to communicate with a payment service and require an immediate response. 

In these situations services can present a RESTful API surface to allow this kind of interaction. Intrinsically linked to this approach is the concept of service discovery, there are various approaches that can be taken but essentially this involves microservices registering their capabilities and the APIs they offer to clients to allow a system to form dynamically as new services are introduced.

This kind of communication exists where interactions between services are necessarily pre-defined but sometimes it is desirable and advantageous to allow these interactions to be looser.

An event bus offers this capability by allowing services to broadcast the occurrence of certain events and allowing others to react to those events without having to be aware of the source.

Following a similar example the same ordering service on successful completion of an order might broadcast this as an event so that services that handle communication with customers can take the appropriate action. The loosely coupled nature of this communication allows behaviour to be easily changed without the need for refactoring in the service handling the initial action.       
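
As a hedged, in-process sketch of the idea (a real system would use a message broker over the network, and all names here are hypothetical), the ordering code below publishes an event and has no knowledge of who reacts to it.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A toy event bus: publishers broadcast events, subscribers react,
// and neither side knows about the other.
public class EventBusSketch {

    static final class OrderCompleted {
        final String orderId;
        OrderCompleted(String orderId) { this.orderId = orderId; }
    }

    static class EventBus {
        private final List<Consumer<OrderCompleted>> subscribers = new ArrayList<>();
        void subscribe(Consumer<OrderCompleted> subscriber) { subscribers.add(subscriber); }
        void publish(OrderCompleted event) { subscribers.forEach(s -> s.accept(event)); }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();

        // A notification "service" reacts without the ordering code knowing it exists.
        bus.subscribe(event -> System.out.println("Emailing customer about " + event.orderId));

        // The ordering "service" simply announces what happened.
        bus.publish(new OrderCompleted("order-42"));
    }
}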

Micro-Deployment

So now we have our collection of properly sized and loosely coupled services how do we manage their deployment into our environment?

A key element of microservice deployment is independence, a sure sign that a microservices architecture is flawed is when it is necessary to deploy a number of services concurrently. This speaks to strong coupling caused either by a tightly coupled domain or an incorrect segmentation of responsibilities. 

A lack of independence also negates the intrinsic promotion of a CI\CD approach that microservices when applied correctly can offer.

Independence of deployment when combined with a load balanced blue\green approach also creates an environment where changes can be gradually introduced and quickly rolled back if problems present themselves. This is in contrast to a monolithic architecture where a small problem in one area may result in working changes in other areas having to be rolled back.

Microservices is a pattern that comes more with a serving suggestion than a recipe. Its proper application cannot be applied via a cookie cutter methodology. There are certain aspects that are universal but a good architecture will only be designed by understanding how they can be applied to your particular situation whilst keeping an eye on the successful realisation of its benefits. Many of these benefits will enable you to make mistakes by giving you the freedom to reconfigure, refactor and realign your architecture without causing widespread disruption.


Sunday 5 August 2018

The Miracle of HTTP


The web is now all pervasive in the majority of people's lives; sometimes it's explicit, when we are surfing the web, and sometimes it's implicit, for example in the rise of the Internet of Things.

The complexity of the implementation of the web is on a scale that can cause you to wonder how this thing always keeps working. The perceived speed of advances in the capability of the web can lead you to think that the technology is constantly evolving and refining.

Whilst this may be true for the technologies that enable us to implement things on the web, many of the technologies that underpin the web and keep it working have remained surprisingly static since its inception.

One of these technologies is the Hypertext Transfer Protocol (HTTP).

What Did HTTP Ever Do For Me?

HTTP is a request\response protocol that allows clients, for example your desktop browser, to request data from a server, for example the server allowing you to read this blog.

It acts within the application layer of the Internet Protocol suite and handles the transfer of resources, or hypermedia, on top of a transport layer connection provided by protocols such as the Transmission Control Protocol (TCP).

It is a descriptive protocol that allows requests to describe what the user wants to do with the indicated resource. To this end each request uses an HTTP verb such as GET, POST, PUT or DELETE, and each response uses status codes such as OK, NOT FOUND or FORBIDDEN, to form an exchange that describes the interactions during a session.

Each request and response also contains certain Header Fields that can be used to further describe both the contents of the request\response and the parties sending them.

If you were to see an exchange of HTTP requests and responses between your browser and a web site, you would be surprised at how human readable, understandable and intuitive it is. This speaks volumes for the initial design of the protocol and its elegant simplicity.
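
For illustration, a simplified and abridged exchange might look like the following; the host and headers are hypothetical, but the shape is standard HTTP/1.1.

GET /index.html HTTP/1.1
Host: www.example.com
Accept: text/html

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8

<!doctype html>
<html>...</html>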

Without HTTP we wouldn't have a way to allow the enormous amount of data on the internet to be queried for, retrieved, uploaded or managed.

The Brief History of HTTP

Given the importance of HTTP you may think that it has been under constant revision and refinement as the use and the size of the web grows.

Actually this is not the case; the first version of the HTTP protocol (v0.9) was published in 1991. This was followed by v1.0 in 1996 and v1.1 in 1997, with v1.1 of the protocol still being the prevalent driving force behind the majority of web usage.

Over the intervening years there have been clarifications and refinements to v1.1, and the extensibility designed into the protocol has allowed improvements to be made, but the protocol delivering the majority of the benefits we derive from the web has remained largely unchanged since what most would consider the birth of the public use of the internet.

This really does go to show the benefit of designing for simplicity, if you develop robust and descriptive technology this can be the mother of invention when creative people pick up the tool you've provided them and combine it with their imagination.

What The Future Holds

So does all this mean that HTTP is done and dusted with all problems solved? Well no, our use of the web and the nature of the data we channel through it have changed since the 90's and while HTTP is able to service our demands there is scope for improvement.

HTTP/2, derived from Google's experiments with the SPDY protocol, is a major release and update to the HTTP protocol. It focuses on performance and efficiency improvements to adapt HTTP to the ways in which we now use the web.

Officially standardised in 2015 the backwards compatibility of HTTP/2 with HTTP v1.1 allows the adoption of the new protocol to spread without fundamentally breaking the web.

As of 2018 around 30% of the top 10 million web sites support HTTP/2 with most major browsers also providing support for the new version of the protocol.

The substantial effort involved in successfully advancing a fundamental protocol such as HTTP whilst not breaking the web speaks to the relatively slow progression of new versions of the protocol.

It's easy to take the web for granted without admiring the engineering that originally built the basis for all the future innovation that has subsequently changed all our lives. Anyone that has worked in a technological industry will speak to the difficulty of getting things right and the normality of deciding "this isn't going to work" and being forced to start over.

Whilst it would be misleading to assume this never happened during the development of HTTP, the small number of published versions of the protocol combined with the explosive and transformative impact of the web truly is something to behold and admire.

Next time you're using the web, or an app, or talking to your smart speaker, tip your hat to the engineering that not only made it all possible but also provided the stability for the evolution that got us to this point.


Monday 30 July 2018

The Start-Up Delusion


Many organisations of different sizes have a software engineering function, from a small band of developers to a large team of people covering many specialisms.

As organisations grow they can often pine for the days when they were smaller, looking back on these days and determining this is when they were most efficient, productive and successful.

Sometimes this nostalgia can be through rose tinted glasses, we lose sight of the cause of our previous success and this means we fail in trying to reproduce it.

What form do these delusions take? What mistakes do we make in analysing why things seemed better then?

Need for Speed

When we look back on our days as a smaller team we often refer to the speed that we feel we used to work and produce results. We castigate ourselves for not being able to produce results as quickly as we used to.

But to look at this in terms of speed is to misunderstand the cause of the productivity we now long for. The time taken to produce the code for any given feature does not change between a small team and a large team, the difference comes from the scope of the feature we are trying to implement.

The amount of scope a start-up will include in a feature is often dramatically less than will be attempted by a larger team. Start-ups want, and need, to get code into production quickly; to achieve this, edge cases are often dismissed and scalability is reduced to the next milestone without over-reaching for something that is well over the horizon.

Start-ups do not distort the mathematics of what is possible in a certain period of time; instead they embrace the certainty that if time is fixed, the only lever that can be adjusted is scope. They are masters of pragmatism, doing just enough to get over the line. As organisations grow, adjustments in scope become harder because you have a user base that expects more from you and will have a reduced tolerance for missing functionality.

Different or Better

Disruption is a quality we tend to hold in high regard, and this leads to frustration when we feel we are held back from being disruptive by our size. I believe a cause of this frustration is a misunderstanding of what it means to be disruptive.

Disruption is often equated with being different, but being different can simply be a pseudonym for being wrong. Disruption isn't just a function of a software engineering department and it isn't about features; it's about re-defining a marketplace and creating new ways to deliver value to users.

When your disruption is successful it will naturally start to transition towards normal, others will cotton onto your foresight and begin to copy.

In this situation continuing to be disruptive will become harder, instead you should look to iterate on your initial innovation and extract the rewards of being first.

There are only so many times you can change a marketplace before you slide into just trying to be different. But being different without an understanding of why that approach or new feature will be better, both in terms of user satisfaction and in terms of your ability to extract value, will eventually lead to mistakes and to you giving up your position as the leader in the market you created.

Successful start-ups just need one good idea that they can run with; their goal is to make noise and attract attention. As an organisation grows, focus will naturally change to wanting to realise the potential of all that hard work.

Barriers to Communication

As organisations grow they naturally start to build up processes and communication channels start to become more formal.

In 1967 Melvin Conway coined Conway's Law:

"organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."

If organisations do wish to return to their roots as a start-up they would do well to heed the message of Conway's Law.

Whilst each new process that's introduced may come from good intentions, the cumulative effect degrades both the frequency and the quality of communication between teams.

To return to the topic of speed: in organisations that consist of multiple development teams, this lack of communication will put the brakes on output like nothing else.

This is also a difficult process to reverse; having the approach of a start-up isn't just about the engineers' mindset, your whole organisation must be structured with that goal in mind.

It's tempting to think that start-ups have access to knowledge or magic that enables them to achieve the impossible or to work to a different set of rules.

In fact they are a demonstration of factors that we are all aware of and have within our control to master.

If an organisation truly wants to achieve this hallowed state then it isn't enough to simply tell your employees to think that way; your organisation, from top to bottom, must embrace these factors and not shy away from them.

Sunday 22 July 2018

Securing the Machine


Security is now a front-and-centre consideration when designing any software solution; the explosion in the ability of technology to help achieve goals applies equally to attackers as to the good guys.

We have also come to realise that not all threats are external; in recent times we have seen the emergence of the malicious insider, someone involved in delivering the software, or a user of it, who abuses their access to steal data or otherwise cause mischief.

This means as much consideration needs to be given to the security of the machine producing the software as to the software itself.

Privilege and Roles

For large systems the amount of infrastructure involved in building, testing and hosting them can be substantial, and many team members will need varying degrees of access to this infrastructure to perform a variety of tasks, ranging from the mundane to the complex and far-reaching.

It's easy in this situation to end up with super users who have access to do almost anything.

Aside from malicious intent this level of access also places a higher potential cost on any mistakes that may be made.

A role-based approach that promotes segregation of duties provides much more protection against any user abusing their access; it also ensures that multiple people need to be involved to complete the more dangerous or impactful tasks.

This approach also makes it easier to compose the correct level of access for a user without having to resort to granting super user status or granting high level access for one particular use case.

The lack of super users is also an advantage if an outside attacker manages to compromise the system or any of its users.
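
As a rough illustration, the sketch below shows how access might be composed from roles, with a segregation of duties check on the most impactful action; the role names, permissions and user structure are invented for the example rather than taken from any particular platform.

    # A minimal sketch of role-based access with segregation of duties.
    # The role names and permissions below are illustrative assumptions.
    ROLE_PERMISSIONS = {
        "developer": {"read_logs", "deploy_to_test"},
        "release_manager": {"approve_release"},
        "operator": {"deploy_to_production"},
    }

    def has_permission(roles, permission):
        # A user's access is composed from the roles they hold.
        return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)

    def deploy_to_production(deployer, approver):
        # Segregation of duties: two different people must be involved,
        # one holding the operator role and one holding release_manager.
        if deployer["name"] == approver["name"]:
            raise PermissionError("deployer and approver must be different people")
        if not has_permission(deployer["roles"], "deploy_to_production"):
            raise PermissionError("deployer cannot deploy to production")
        if not has_permission(approver["roles"], "approve_release"):
            raise PermissionError("approver cannot approve releases")
        print("deploying to production...")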

Secrets

Software involves many secrets: service credentials, connection strings and encryption keys, to name a few.

An effective infrastructure needs to include a mechanism for managing these secrets. This applies both within the development process, so that not every developer knows the production secrets for example, and in production, to prevent exposure following any attack.

Many solutions are available for these situations, the majority of which rely on the secrets being defined within the environment in which the software is running as opposed to being in the source code itself.

This then allows access to this area of the environment to be strictly controlled and, where possible, to be a one-way transaction for users, i.e. users enter secrets but only the software that needs them can ever get them out again.

This can even be taken a stage further, where secrets are randomly generated inside the environment so that no user ever needs to know them.
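
As a simple sketch of the basic idea, the snippet below reads a secret from the environment the software is running in rather than from source code; the variable name DATABASE_PASSWORD is an assumption made purely for the example.

    import os

    # A minimal sketch: the secret is defined in the environment the software
    # runs in, never committed to source control. DATABASE_PASSWORD is an
    # illustrative name.
    def get_database_password():
        password = os.environ.get("DATABASE_PASSWORD")
        if password is None:
            raise RuntimeError("DATABASE_PASSWORD is not set in this environment")
        return password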

Audit Trail

Despite all the measures that may be put in place it won't be possible to reduce to zero the possibility of a destructive change, whether that be deliberate or not.

When this happens it is paramount to answer two questions: exactly what has been compromised, and what needs to be done to revert the change?

The first can be answered by having automated audit logs that ensure all changes are recorded in terms of what was changed and by whom. We are used to this with source control systems like Git, and the approach is increasingly being adopted for infrastructure changes with tooling that promotes infrastructure as code.

A second advantage of this kind of tooling is the ease with which changes can be reverted; rather than having to discover and unpick changes, they can be reversed much like a commit to a code base might be.
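
In practice tooling such as Git or an infrastructure-as-code platform records this automatically, but as a rough sketch of the information such an audit trail needs to capture (the file name and record fields are illustrative assumptions):

    import datetime
    import getpass
    import json

    # A minimal, append-only audit record: what was changed, by whom, when,
    # and enough detail to revert it.
    AUDIT_LOG = "audit.log"

    def record_change(resource, action, detail):
        entry = {
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "user": getpass.getuser(),   # who made the change
            "resource": resource,        # what was changed
            "action": action,            # e.g. "update" or "delete"
            "detail": detail,            # context needed to revert the change
        }
        # Entries are only ever appended, never edited in place.
        with open(AUDIT_LOG, "a") as log:
            log.write(json.dumps(entry) + "\n")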

To return to our discussion about privilege and roles, the best defence against any destructive change, deliberate or not, is to ensure that major impactful changes cannot be instigated by individuals and instead require sign-off and validation from others.

An effective audit strategy also needs to be able to assess impact, whether by recording the data flowing in and out of a system, recording the errors being generated or flagging unusual system activity.

The purpose of this article isn't to make you see potential threats all around your development team; while malicious insiders can be very real, you are much more likely to be threatened by unintentional incompetence. People make mistakes, and limiting the potential security implications of those mistakes is just as important as defending yourself against someone who intends your system harm.

A sensible approach promoting least privilege, emphasising the importance of secrets and keeping an accurate system of record for changes will help in both scenarios. Don't see your team's role as just writing good, and secure, software; also see it as developing an equally good, and secure, machine that moves this code through to production.

Monday 9 July 2018

Defining Software Architecture



The role of software architecture can sometimes be difficult to define; it can be easy to demonstrate its absence when a project or a system is saddled with technical debt, but the impact when it's being done well can sometimes be missed because of the apparent simplicity it fosters.

However, software architecture should not be an isolated pursuit; it should be a tool to achieve the aims of an organisation, and when it is applied well the goals of the architect and the goals of the organisation are symbiotic.

It can sometimes be trite to attempt to simplify complex subjects into bullet points or pearls of wisdom. The points made here are not an attempt to do this; they are simply trying to help provide an explanation of the role of software architecture.

Modularity and Patterns

The creation of software is an expensive pursuit, and individuals skilled in its production are an expensive resource. This means that every last drop of value needs to be squeezed from their efforts; in short, there is an economic benefit to code re-use.

Every time a piece of code is re-used the value returned on the original effort to produce it is increased.

Code re-use is only viable when the opportunity for re-use can be seen and the code itself can be cleanly extracted. Good architecture provides structure and patterns for implementation that promote both of these aspects and make it more likely that code, and the value it provides, can be re-used.

Good architecture also provides a framework for code, both old and new, to be stitched together to form a coherent system that can meet objectives. Bad architecture breeds re-writes and duplication, where delivering value always comes at a high cost.

Flexibility

The priorities of a business are subject to change, the predictability of the frequency and direction of that change can vary.

Good architecture can adapt to these shifting sands and demonstrate a flexibility in the things it can achieve.

In part this is accomplished by the same modularity we've previously talked about: when a system is the sum of its parts, these parts can be re-worked and replaced without having to contemplate starting from scratch or undertaking major surgery on a code base.
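
A small sketch of what this modularity can look like in code: the rest of the system depends only on an interface, so an implementation can be re-used or replaced without major surgery. The payment-gateway names here are invented purely for illustration.

    from abc import ABC, abstractmethod

    # The system depends on this interface, not on any concrete implementation.
    class PaymentGateway(ABC):
        @abstractmethod
        def charge(self, amount_pence: int) -> bool:
            ...

    class LegacyGateway(PaymentGateway):
        def charge(self, amount_pence: int) -> bool:
            # the original implementation
            return True

    class NewProviderGateway(PaymentGateway):
        def charge(self, amount_pence: int) -> bool:
            # a replacement implementation; callers do not change
            return True

    def checkout(gateway: PaymentGateway, amount_pence: int) -> None:
        # checkout is written once and re-used with either implementation
        if not gateway.charge(amount_pence):
            raise RuntimeError("payment failed")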

The second, equally important, aspect to flexibility is that this change can be quickly deployed.

It is fast becoming accepted that the effectiveness of a software engineering function can be measured by the frequency with which it deploys code. Against this backdrop, an architecture that can't deliver a fast pace and an unrestricted frequency of deployments will be deemed ineffective and lacking.

It would not be an over-simplification to categorise good architecture as the smooth implementation of business change.

Containing Complexity

Writing software is a complicated business; it is incredibly easy to do it wrong. Coupled with this, the businesses and domains that it attempts to model and facilitate are often equally complex.

Without a good architecture to provide a path through this complexity software can become fragile, ineffective and ultimately a burden on the organisation it is designed to serve.

A good architecture recognises this tendency towards complexity and the problems that can bring. To counteract this it tries to simplify where possible and contain complexity where this isn't possible.

Understanding where complexity lies in a system allows care to be taken when modifying those areas, while allowing the speed of development in areas free from complexity to be increased.

A bad architecture breeds complexity and allows it to put the brakes on development activity; it makes it more likely that change will bring unintended consequences and acts as a multiplier for these effects, giving them an even bigger impact in the future.

As mentioned at the start of this post, software architecture can't be summed up in so few words as we have here; it is an activity that is very difficult to become proficient at and potentially cannot be mastered.

However, having certain goals in mind and being able to explain its benefits can help to garner an appreciation for the discipline and an understanding for the pitfalls of not giving it proper consideration.

Good architecture can be under-appreciated when it takes a complex situation and makes it appear straightforward; sometimes its presence is only felt when bad architecture lays out this complexity for all to see.