Monday 30 July 2018

The Start-Up Delusion


Organisations of many different sizes have a software engineering function, from a small band of developers to a large team covering many specialisms.

As organisations grow they can often pine for the days when they were smaller, looking back and concluding that this was when they were most efficient, productive and successful.

Sometimes this nostalgia is seen through rose-tinted glasses: we lose sight of the causes of our previous success, and so we fail when we try to reproduce it.

What form do these delusions take? What mistakes do we make in analysing why things seemed better then?

Need for Speed

When we look back on our days as a smaller team we often refer to the speed at which we feel we used to work and produce results. We castigate ourselves for not being able to produce results as quickly as we used to.

But to look at this in terms of speed is to misunderstand the cause of the productivity we now long for. The time taken to produce the code for any given feature does not change between a small team and a large team; the difference comes from the scope of the feature we are trying to implement.

The amount of scope a start-up will include in a feature is often dramatically less than a larger team will attempt. Start-ups want, and need, to get code into production quickly; to achieve this, edge cases are often dismissed and scalability is reduced to the next milestone, without overreaching for something that is well over the horizon.

Start-ups do not distort the mathematics of what is possible in a certain period of time; instead they embrace the certainty that if time is fixed the only lever that can be adjusted is scope. They are masters of pragmatism, doing just enough to get over the line. As organisations grow, adjustments in scope become harder, because you have a user base that expects more from you and has a reduced tolerance for missing functionality.

Different or Better

Disruption is a quality we tend to hold in high regard; this leads to frustration when we feel we are held back from being disruptive by our size. I believe a cause of this frustration is a misunderstanding of what it means to be disruptive.

Disruption is often equated with being different, but being different can simply be a synonym for being wrong. Disruption isn't just a function of a software engineering department, and it isn't about features; it's about redefining a marketplace and creating new ways to deliver value to users.

When your disruption is successful it will naturally start to transition towards normal, as others cotton on to your foresight and begin to copy.

In this situation continuing to be disruptive will become harder; instead you should look to iterate on your initial innovation and extract the rewards of being first.

There are only so many times you can change a marketplace before you slide into just trying to be different. Being different without an understanding of why an approach or new feature will be better, both in terms of user satisfaction and in terms of your ability to extract value, will eventually lead to mistakes and to you giving up your position as the leader in the market you created.

Successful start-ups just need one good idea that they can run with; their goal is to make noise and attract attention. As an organisation grows, focus will naturally shift to wanting to realise the potential of all that hard work.

Barriers to Communication

As organisations grow they naturally start to build up processes, and communication channels start to become more formal.

In 1967 Melvin Conway coined Conway's Law:

"organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."

If organisations do wish to return to their roots as a start-up they would do well to heed the message of Conway's Law.

Whilst each individual new process that's introduced may come from good intentions, the processes have a cumulative effect that degrades both the frequency and the quality of communication between teams.

To return to the topic of speed, this lack of communication in organisations that consist of multiple development teams will put the brakes on output like nothing else.

This is also a difficult process to reverse: having the approach of a start-up isn't just about the engineers' mindset, your whole organisation must be structured with that goal in mind.

It's tempting to think that start-ups have access to knowledge or magic that enables them to achieve the impossible or work to a different set of rules.

In fact they are a demonstration of factors that we are all aware of and that are within our control to master.

If an organisation truly wants to achieve this hallowed state then it isn't enough to simply tell employees to think that way; the organisation, from top to bottom, must embrace these factors and not shy away from them.

Sunday 22 July 2018

Securing the Machine


Security is now a front-and-centre consideration when designing any software solution; the explosion in the ability of technology to help achieve goals applies equally to attackers as to the good guys.

We have also come to realise that not all threats are external. Recent times have seen the emergence of the malicious insider: someone who plays a role in delivering software, or a user of it, who abuses their access to steal data or otherwise cause mischief.

This means as much consideration needs to be given to the security of the machine producing the software as to the software itself.

Privilege and Roles

For a large system the amount of infrastructure involved in building, testing and hosting it can be substantial. Many team members will need various degrees of access to this infrastructure to perform a variety of tasks, ranging from the mundane to the complex and far-reaching.

It's easy in this situation to end up with super users who have access to do almost anything.

Aside from the risk of malicious intent, this level of access also places a higher potential cost on any mistakes that may be made.

A role-based approach that promotes segregation of duties provides much more protection against any user abusing their access; it also ensures that multiple people need to be involved to complete the more potentially dangerous or impactful tasks.

This approach also makes it easier to compose the correct level of access for a user without having to resort to granting super user status or granting high level access for one particular use case.
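
As a rough sketch of what composing access from roles can look like, here is a minimal example in Python. The role names and permissions are hypothetical, not taken from any particular system.

    # Narrowly scoped roles, each granting only what a task requires.
    # The role names and permissions are hypothetical examples.
    ROLES = {
        "staging-deployer": {"staging:deploy", "staging:read-logs"},
        "db-reader": {"production:db-read"},
        "release-approver": {"production:approve-release"},
    }

    def permissions_for(assigned_roles):
        """The union of permissions granted by a user's roles."""
        granted = set()
        for role in assigned_roles:
            granted |= ROLES.get(role, set())
        return granted

    # A user's access is composed from roles, so no super user is needed,
    # and approving a release requires a second person with a different role.
    alice = ["staging-deployer", "db-reader"]
    assert "staging:deploy" in permissions_for(alice)
    assert "production:approve-release" not in permissions_for(alice)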

The lack of super users is also an advantage if an outside attacker manages to compromise the system or any of its users.

Secrets

Software involves many secrets: service credentials, connection strings and encryption keys, to name a few.

An effective infrastructure needs to include a mechanism for managing these secrets. This applies both within the development process, so that not every developer knows production secrets, and in production, to prevent exposure following any attack.

Many solutions are available for these situations, the majority of which rely on the secrets being defined within the environment in which the software is running as opposed to being in the source code itself.

Access to this area of the environment can then be strictly controlled and, where possible, made a one-way transaction for users, i.e. users enter secrets but only the software that needs them can ever get them out again.

This can even be taken a stage further, where secrets are randomly generated inside the environment so that no user ever needs to know them.
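
A minimal sketch of this environment-based approach in Python, assuming secrets are injected as environment variables; the variable name is a hypothetical example.

    import os
    import secrets

    # Read the secret from the environment at runtime; nothing sensitive
    # lives in the source code. The variable name is a hypothetical example.
    db_password = os.environ["APP_DB_PASSWORD"]

    # Taken a stage further: generate a secret inside the environment so
    # that no user ever needs to know it.
    session_signing_key = secrets.token_urlsafe(32)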

Audit Trail

Despite all the measures that may be put in place, it won't be possible to reduce the possibility of a destructive change, deliberate or not, to zero.

When this happens it is paramount to answer two questions: exactly what has been compromised, and what needs to be done to revert the change?

The first can be answered by having automated audit logs that ensure all changes are recorded in terms of what was changed and by whom. We are used to this with source control systems like Git, and the approach is being increasingly adopted for infrastructure changes with tooling that promotes infrastructure as code.

A second advantage of this kind of tooling is the ease with which changes can be reverted: rather than having to discover and unpick changes, they can be reversed much like a commit to a code base might be.
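
As a rough sketch of what an automated audit record might capture, in Python; the fields are illustrative rather than any particular tool's format.

    import json
    from datetime import datetime, timezone

    def record_change(log_path, actor, target, action, detail):
        """Append an audit entry recording who changed what, and when."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "actor": actor,    # who made the change
            "target": target,  # what was changed
            "action": action,  # e.g. "update" or "delete"
            "detail": detail,  # enough context to revert the change
        }
        with open(log_path, "a") as log:
            log.write(json.dumps(entry) + "\n")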

To return to our discussion about privilege and roles, the best defence against any destructive change, deliberate or not, is to ensure that major impactful changes cannot be instigated by individuals and instead require sign-off and validation from others.

An effective audit strategy also needs to be able to assess impact, whether by recording data flowing in and out of a system, recording the errors being generated, or flagging system activity that would not be deemed normal.

The purpose of this article isn't to try and make you see potential threats all around your development team; while malicious insiders can be very real, you are much more likely to be threatened by unintentional incompetence. People make mistakes, and limiting the potential security implications of these mistakes is just as important as defending yourself against someone who intends your system harm.

A sensible approach promoting least privilege, emphasising the importance of secrets and keeping an accurate system of record for changes will help in both scenarios. Don't see your team's role as just writing good, and secure, software; see it also as developing an equally good, and secure, machine that moves this code through to production.

Monday 9 July 2018

Defining Software Architecture



The role of software architecture can sometimes be difficult to define. It is easy to demonstrate its absence when a project or a system is saddled with technical debt, but its impact when done well can sometimes be missed because of the apparent simplicity it fosters.

However, software architecture should not be an isolated pursuit; it should be a tool to achieve the aims of an organisation. When this is applied well, the goals of the architect and the goals of the organisation are symbiotic.

It can sometimes be trite to attempt to simplify complex subjects into bullet points or pearls of wisdom. The points made here are not an attempt to do this; they are simply trying to help explain the role of software architecture.

Modularity and Patterns

The creation of software is an expensive pursuit, and individuals skilled in its production are an expensive resource. This means that every last drop of value needs to be squeezed from their efforts; in short, there is an economic benefit to code re-use.

Every time a piece of code is re-used the value returned on the original effort to produce it is increased.

Code re-use is only viable when the opportunity for re-use can be seen and the code itself can be cleanly extracted. Good architecture provides structure and patterns for implementation that promote both of these aspects and make it more likely that code, and the value it provides, can be re-used.
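
As a toy illustration in Python of logic extracted cleanly enough to be re-used; the domain and the function are invented for the example.

    # pricing.py - a small, cleanly extracted piece of logic.
    # The domain and function are invented for illustration.
    def apply_discount(amount, rate):
        """Pure, dependency-free logic any part of the system can re-use."""
        if not 0 <= rate <= 1:
            raise ValueError("rate must be between 0 and 1")
        return round(amount * (1 - rate), 2)

    # Both the checkout and the quoting feature re-use the same code, so
    # every call increases the return on the original effort to write it.
    checkout_total = apply_discount(100.00, 0.10)  # 90.0
    quote_total = apply_discount(250.00, 0.25)     # 187.5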

Good architecture also provides a framework for code, both old and new, to be stitched together to form a coherent system that can meet objectives. Bad architecture breeds re-writes and duplication, where value is only ever achieved at great cost.

Flexibility

The priorities of a business are subject to change, the predictability of the frequency and direction of that change can vary.

Good architecture can adapt to these shifting sands and demonstrate a flexibility in the things it can achieve.

In part this is accomplished by the same modularity we've previously talked about: when a system is the sum of its parts, those parts can be re-worked and replaced without having to contemplate starting from scratch or undertaking major surgery on a code base.

The second, equally important, aspect of flexibility is that this change can be quickly deployed.

It is fast becoming a fact that the effectiveness of a software engineering function can be measured by the frequency with which it deploys code. Against this backdrop, an architecture that can't deliver a fast pace and unrestricted frequency of deployments will be deemed ineffective and lacking.

It wouldn't be an over-simplification to categorise good architecture as the smooth implementation of business change.

Containing Complexity

Writing software is a complicated business; it is incredibly easy to do it wrong. Coupled with this, the businesses and domains it attempts to model and facilitate are often equally complex.

Without a good architecture to provide a path through this complexity software can become fragile, ineffective and ultimately a burden on the organisation it is designed to serve.

A good architecture recognises this tendency towards complexity and the problems that can bring. To counteract this it tries to simplify where possible and contain complexity where this isn't possible.

Understanding where complexity lies in a system allows care to be taken when modifying those areas, while letting development speed up in the areas freed from complexity.

A bad architecture breeds complexity and allows it to put the brakes on development activity. It makes it more likely that change will bring unintended consequences, and acts as a multiplier so that these effects have an even bigger impact in the future.

As mentioned at the start of this post, software architecture can't be summed up in as few words as we have here; it is an activity that is very difficult to become proficient at and potentially cannot be mastered.

However, having certain goals in mind and being able to explain its benefits can help to garner an appreciation for the discipline and an understanding for the pitfalls of not giving it proper consideration.

Good architecture can be undermined when it takes a complex situation and makes it appear straightforward; sometimes its presence is only felt when bad architecture lays out this complexity for all to see.


Monday 2 July 2018

RESTful Experience


Many different technologies have been devised to provide structure to the transfer of data from here to there and back again. These have ranged from the heavyweight to the lightweight.

REST, or Representational State Transfer to give it its full name, is now the prevalent mechanism when dealing with an API to retrieve or send data, with such APIs being described as RESTful.

What does it mean for an API to be RESTful? Is it simply the sending and receiving of JSON via HTTP requests?

Although REST does have some requirements around the plumbing of how requests are made and received, there are also philosophies that go deeper than this traffic-management aspect.

Stateless

All the information necessary to process a RESTful request should be contained in the request itself. This is to say that the server receiving the request should not need to use state about the current session in order to process the request.

This puts the power with the client to decide which APIs to call, and in whatever order; it ensures the API surface is unambiguous, with no required knowledge of the order APIs should be called in or of any possible side effects.

This statelessness should extend to authentication and authorisation, with each request containing the necessary information for both of those important factors to be fulfilled.
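
For example, each call can carry its own credentials alongside everything else the server needs. A minimal sketch using Python's requests library; the URL and the token are placeholders.

    import requests

    # The request is self-contained: what is being asked for and the
    # caller's credentials travel together, so the server needs no memory
    # of any previous request. The URL and token are placeholders.
    response = requests.get(
        "https://api.example.com/api/v1/customer/866823e5",
        headers={"Authorization": "Bearer <access-token>"},
    )
    customer = response.json()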

It's important to realise that this only applies to the state of the session and the processing of requests; the resources and data being accessed are of course subject to change between requests, and do have a concept of state.

Uniform Interface

REST APIs deal in the currency of resources. A resource can be almost any data item: it could represent a customer, a shopping basket, a book or a social media post.

Resources should be uniquely identifiable and have a representation in the system that is descriptive and processable.

Operations on these resources should be via standard HTTP verbs that describe the operation that is taking place.

  • GET: Read.
  • POST: Create.
  • PUT: Update/Replace.
  • DELETE: Remove.

The HTTP response codes returned from any of these requests should also relate to the state of the resource, for example:

  • 404: Resource with that unique identifier cannot be found.
  • 405: Not allowed, such as when a resource cannot be deleted or modified.
  • 409: Conflict, when a resource with that unique identifier already exists.
  • And so on...

The paths used when accessing these resources should also be self-explanatory and naturally readable.

  • GET /api/v1/customer/ - Return all customers.
  • GET /api/v1/customer/866823e5 - Return a specific customer.
  • GET /api/v1/customer/?lastname=smith - Return all customers with a last name of smith.
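
Putting the verbs, response codes and paths above together, here is a minimal sketch of how such an API might look using Python and Flask. The in-memory store and the customer fields are invented for illustration.

    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)
    customers = {}  # in-memory store, invented for illustration

    # GET /api/v1/customer/ - return all customers, optionally filtered.
    @app.route("/api/v1/customer/", methods=["GET"])
    def list_customers():
        lastname = request.args.get("lastname")
        matches = [c for c in customers.values()
                   if lastname is None or c.get("lastname") == lastname]
        return jsonify(matches)

    # GET /api/v1/customer/<id> - return a specific customer.
    @app.route("/api/v1/customer/<customer_id>", methods=["GET"])
    def get_customer(customer_id):
        if customer_id not in customers:
            abort(404)  # no resource with that unique identifier
        return jsonify(customers[customer_id])

    # POST /api/v1/customer/ - create a customer from the request body.
    @app.route("/api/v1/customer/", methods=["POST"])
    def create_customer():
        body = request.get_json()
        if body["id"] in customers:
            abort(409)  # conflict: that unique identifier already exists
        customers[body["id"]] = body
        return jsonify(body), 201

    # DELETE /api/v1/customer/<id> - remove the resource.
    @app.route("/api/v1/customer/<customer_id>", methods=["DELETE"])
    def delete_customer(customer_id):
        if customer_id not in customers:
            abort(404)
        del customers[customer_id]
        return "", 204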

All of this structure allows an API to be self-discoverable by a user familiar with the resources being represented.

The path can also be used to impose a versioning system, ensuring that when breaking changes must be made to how the API behaves, or to the representation of the data being returned, this is non-impactful for existing consumers of the API.

Cacheable and Layered

Much of what we've discussed allows REST APIs to implement certain characteristics to increase performance such as caching.

GET requests within a REST API should be cacheable by default, with standard HTTP mechanisms being used to control the lifetime of the cached data. This helps reduce latency whilst also reducing the strain being placed on the backend sources of the data.
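
For instance, a handler can declare the lifetime of its response through the standard Cache-Control header. A short sketch, again with Flask; the route and the five-minute lifetime are arbitrary choices.

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/api/v1/book/<book_id>", methods=["GET"])
    def get_book(book_id):
        response = jsonify({"id": book_id})
        # Clients and intermediaries may re-serve this representation for
        # up to five minutes before returning to the backend for fresh data.
        response.headers["Cache-Control"] = "public, max-age=300"
        return response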

Segregating an API based on resources allows for a certain hierarchy and layered architecture to be put in place.

This lends itself to the micro-services model, allowing systems to be composed of narrowly focused components, each dealing with a particular element of the domain being modelled.

REST has come to dominate the API landscape because the application of a few simple rules greatly simplifies the process of implementing an API, as well as reducing the barriers to entry for a consumer to get up and running with it.

On occasion it may be difficult to adhere to all the rules we've laid out, and it may be that how RESTful an API is becomes a measure rather than an absolute. But these situations should be rare, and once you have acquired a RESTful eye you will soon become adept at modelling your world according to its guidelines.