Sunday 20 October 2019

Responsibility Segregation



A consistent property of bad code is a lack of segregation between responsibilities. Relatively large classes will implement multiple facets of functionality and therefore be responsible for more than one aspect of a system.

This will lead to code that is difficult to follow, difficult to maintain and difficult to extend. Those large classes will be modified frequently because they are responsible for so many things, and if some of those changes are sub-optimal then technical debt gradually accumulates. To complete the vicious circle, this debt compounds the original problem, attracting yet more debt, and the downward spiral continues.

Command Query Responsibility Segregation (CQRS) is a design pattern focused on addressing this situation by defining clear responsibility boundaries and encouraging practitioners to ensure those boundaries aren't breached.

Commands and Queries

Within the CQRS pattern functionality is either a command or a query.

A query is a piece of functionality that, given a context, interrogates a data source and returns the requested data to the caller. Critically, a query should be free of side effects: it must not change the state of the underlying data in any way, which also makes it safe to repeat.

A command is more task driven: it is a piece of functionality that, given a context, changes the state of an underlying data source or system. Because of its inherent side effects it should not be used to return data to the caller, as that is the role of a query; instead it should return just the result of the downstream operation. In practice this restriction on what a command returns can be difficult to achieve, and more often than not some data is required to come back from the command, but the important aspect is that callers are aware that commands perform operations on data and therefore have side effects.
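
As a minimal sketch of these two shapes (the account store and class names are purely illustrative, not part of the pattern), queries and commands can be modelled as separate types with deliberately narrow interfaces:

    from dataclasses import dataclass

    # In-memory stand-in for a real data source.
    ACCOUNTS = {"alice": 100, "bob": 50}

    @dataclass
    class GetBalanceQuery:
        """A query: reads state, never modifies it."""
        account_id: str

        def execute(self) -> int:
            return ACCOUNTS[self.account_id]

    @dataclass
    class DepositCommand:
        """A command: modifies state, returns only the outcome."""
        account_id: str
        amount: int

        def execute(self) -> bool:
            if self.amount <= 0:
                return False  # the operation's result, not query data
            ACCOUNTS[self.account_id] += self.amount
            return True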

Although not explicitly part of the pattern, an effective CQRS implementation will also avoid chaining queries or commands together. The layers of abstraction this builds can make the code difficult to follow and understand, leading to unintended consequences when a caller doesn't realise the chain of events that will unfold. Instead, queries and commands should be composed by callers making individual and separate calls to each element in turn, potentially passing data between them and building an aggregated response to return upstream.
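
Continuing the hypothetical example above, composition then lives in the caller, which invokes each element in turn rather than having the command trigger the query behind the scenes:

    def deposit_and_report(account_id: str, amount: int) -> str:
        # The caller sequences the command and the query explicitly;
        # neither one calls the other internally.
        if not DepositCommand(account_id, amount).execute():
            return "deposit rejected"
        balance = GetBalanceQuery(account_id).execute()
        return f"new balance: {balance}"

    print(deposit_and_report("alice", 25))  # new balance: 125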

Database Origins

Although the last section presents the pattern in abstract terms, and CQRS can be applied to many different areas, the origins of the approach lie in the application of CRUD when dealing with databases.

Within this world there can be many advantages to treating reads and writes differently. Firstly, there can be advantages to using different models depending on whether data is being queried or modified; the shape in which data is best read rarely matches the shape in which it is best written. Also, the load presented by reads and writes is often not symmetrical, so being able to separate the workloads can bring performance and efficiency advantages.
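
As a hypothetical illustration of separate models, the write side might work with a normalised record that enforces invariants, while the read side is served from a flat, precomputed projection tailored to a particular screen:

    from dataclasses import dataclass

    # Write model: normalised and validated on mutation.
    @dataclass
    class Order:
        order_id: str
        line_prices: list[float]

        def add_line(self, price: float) -> None:
            if price < 0:
                raise ValueError("price cannot be negative")
            self.line_prices.append(price)

    # Read model: flat and precomputed, optimised for display.
    @dataclass(frozen=True)
    class OrderSummary:
        order_id: str
        line_count: int
        total: float

    def project(order: Order) -> OrderSummary:
        # In a real system this projection might be maintained
        # asynchronously, letting reads scale independently of writes.
        return OrderSummary(order.order_id, len(order.line_prices),
                            sum(order.line_prices))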

Having strong abstractions over the top of data access also enables more flexibility in the approach to underlying storage, with users protected from the nuances this may involve by the interface presented by commands and queries.

Advantages

First and foremost, the advantage of CQRS is the re-use that can be achieved by promoting the separation of concerns. When code does one thing and does it well, the opportunity for re-use is increased. Quite often when classes are bigger and do more, the functionality they offer will be almost what you need but not quite. This either leads to a new class being created with a slightly modified interface, causing duplication, or to the existing class being tweaked, further degrading its integrity.

Effectively segregated code is also likely to be easier to test, since its interface will be simpler and it will tend to have fewer dependencies.

Finally, the code base as a whole will be more understandable, with a clearer structure. New members of your team will quickly be able to assess what the code base is capable of by looking at the queries and commands that can be executed.

No one pattern can be a solution for all problems, but certain qualities of well constructed code should be promoted above all others. Segregation of responsibility is one of those qualities; promoting it, and ensuring its adoption and adherence, is almost the very definition of good architecture.

As a code base grows and evolves you will likely have to introduce additional concepts alongside commands and queries. Provided these new elements have a strong identity and a clearly defined role within your system, your code can grow whilst still maintaining the well defined structure that is a recipe for success.


Sunday 13 October 2019

Striding To Be Secure



Whenever software is deployed it is a virtual certainty that at some point it will come under some form of attack. This might be via a bot evaluating your infrastructure for the possibility of exploiting known vulnerabilities, or a concerted effort from hackers to make your code expose data and functionality it shouldn't.

Resources like the OWASP Top 10 can help you recognise common security mistakes, but each piece of software, and the use cases it implements, will present varying and sometimes specific security flaws. This means it can be a valuable exercise to take a step back and try to analyse your software from the point of view of an attacker.

STRIDE is a mnemonic that can help with this kind of threat analysis by identifying six categories of attack that hackers may try to perpetrate against your code.

Spoofing

An authenticated system will rely on some mechanism for a user to identify themselves. Spoofing is when an attacker is able to successfully identify themselves as another user. This doesn't necessarily mean cracking passwords; more often it means attacking the mechanism your system uses to prove, on subsequent requests, that authentication has taken place.

As a rather trivial example, if your system relied on users including an HTTP header indicating their user ID as a means of authentication, an attacker could spoof another user by simply including that user's ID in their requests.
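
A common counter is to have the server prove the identity claim itself, for example by signing the user ID with a secret only the server knows. A minimal sketch, assuming an HMAC-based token (the secret and token layout are invented for illustration):

    import hashlib
    import hmac

    SECRET = b"server-side-secret"  # never shipped to clients

    def issue_token(user_id: str) -> str:
        # Issued at login; a client cannot forge a token for another
        # user without knowing SECRET.
        sig = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
        return f"{user_id}:{sig}"

    def verify_token(token: str) -> str | None:
        user_id, _, sig = token.partition(":")
        expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
        # compare_digest avoids leaking information via timing.
        return user_id if hmac.compare_digest(sig, expected) else None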

Spoofing can actually occur before users even get to your code, via attacks such as DNS or TCP/IP spoofing in which attackers imitate your site to lure unsuspecting users into entering their information.

These attacks will generally be countered by careful analysis of authentication systems to ensure that identity cannot be falsified.

Tampering

Tampering occurs when an attacker is able to modify data in transit or at rest for a malicious purpose.

The various forms of injection attack represent the classic examples of tampering. This may be SQL injection, cross site scripting or any attack that allows an attacker to inject their own code into the application.

Tampering attacks can be addressed by adopting a healthy distrust of all input from the outside world and sanitising it before it gets anywhere near forming part of the execution path.
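
For SQL injection in particular, the standard defence is to never splice input into a query string, letting the driver bind parameters instead. A small sketch using Python's built-in sqlite3 module:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'a@example.com')")

    def find_user(name: str) -> list:
        # Unsafe alternative: f"... WHERE name = '{name}'" would let
        # name = "' OR '1'='1" return every row in the table.
        # The ? placeholder ensures name is treated as data, not SQL.
        return conn.execute(
            "SELECT * FROM users WHERE name = ?", (name,)).fetchall()

    print(find_user("' OR '1'='1"))  # [] -- the injection attempt fails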

Repudiation

Repudiation is the act of being able to deny that an act or operation took place. This will generally occur if your system does not have sufficient logging to track all user operations, or if it allows attackers to change or destroy logs in order to cover their tracks.

The defence against repudiation is the robust implementation of audit logging. This should cover all user interactions, but also the behaviour of your infrastructure and any other data source that can be used to forensically analyse what was happening in your system at any given point in time.
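
As an illustrative sketch, an append-only audit log can also be made tamper evident by chaining each entry to the hash of the previous one, so that altered or deleted entries break the chain (the entry fields here are invented):

    import hashlib
    import json
    import time

    audit_log: list[dict] = []

    def record(user: str, action: str) -> None:
        prev = audit_log[-1]["hash"] if audit_log else "genesis"
        entry = {"user": user, "action": action,
                 "at": time.time(), "prev": prev}
        # Each entry's hash covers its content plus the previous hash.
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        audit_log.append(entry)

    def verify() -> bool:
        prev = "genesis"
        for entry in audit_log:
            body = {k: v for k, v in entry.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if body["prev"] != prev or entry["hash"] != expected:
                return False  # the chain has been tampered with
            prev = entry["hash"]
        return True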

Information Disclosure

Information disclosure is perhaps the worst nightmare of any business whose systems come under attack; it occurs any time an attacker is able to view data that they shouldn't be allowed to see.

This can be caused by improper application of authorisation, insecure transport mechanisms, a lack of encryption, or a lack of segregation between elements of a system that allows hackers to jump from a non-critical element to a more critical part of the system.

Your system's production data needs to be treated with the utmost care and attention; access controls and authorisation must be robustly implemented to ensure that only entitled users are ever allowed to view or export data.
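
A sketch of that principle: every read path checks entitlement at the point of access, rather than assuming a caller was filtered somewhere upstream (the users, entitlements and records below are invented):

    ENTITLEMENTS = {"alice": {"reports"}, "bob": set()}
    REPORTS = {"q3": "confidential revenue figures"}

    class Forbidden(Exception):
        pass

    def get_report(user: str, report_id: str) -> str:
        # Authorisation is enforced where the data leaves the system.
        if "reports" not in ENTITLEMENTS.get(user, set()):
            raise Forbidden(f"{user} may not view reports")
        return REPORTS[report_id]

    print(get_report("alice", "q3"))  # allowed
    # get_report("bob", "q3") raises Forbidden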

Denial of Service

Denial of Service (DoS) attacks are unique in the sense that they are not necessarily aimed at extracting data from your system or causing it to execute specific functionality for an attacker; instead they are simply designed to stop your software offering its intended functionality to your user base.

They can take many different forms but generally involve presenting your code and your infrastructure with more work than they are capable of handling, causing your site to become unavailable to legitimate users, or so slow as to be useless to them.

The exact method of protection against these kinds of attacks will vary depending on your functionality and infrastructure but will usually depend on being able to effectively measure and categorise the traffic entering your system alongside the ability to deny and block suspicious traffic at the edge of your network.
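
At the application level, one common building block is a token bucket, which caps how fast any single client can present work. A minimal in-process sketch (real deployments enforce this at the network edge, per client or per IP):

    import time

    class TokenBucket:
        """Allows `rate` requests per second, with bursts up to `capacity`."""

        def __init__(self, rate: float, capacity: float):
            self.rate = rate
            self.capacity = capacity
            self.tokens = capacity
            self.last = time.monotonic()

        def allow(self) -> bool:
            now = time.monotonic()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False  # reject or queue the request

    bucket = TokenBucket(rate=10, capacity=20)
    print(bucket.allow())  # True until the burst allowance is spent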

Elevation of Privilege

Privilege elevation occurs whenever a user is able to perform operations that they shouldn't be able to perform based on their role within the system; these are generally higher level functions usually reserved for administrators.

These attacks will generally rely on an insecure authorisation mechanism. As an example, if a user's role is controlled via a query string element, an attacker can elevate their system privilege by simply inserting this element into their requests.
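
The fix is to derive the role from trusted server-side state rather than from anything the client sends. A hypothetical sketch contrasting the two (the session store and operations are invented):

    SESSIONS = {"token-123": {"user": "alice", "role": "viewer"}}

    def delete_account_insecure(params: dict) -> str:
        # Broken: trusts a client-supplied query string element.
        if params.get("role") == "admin":
            return "account deleted"
        return "forbidden"

    def delete_account_secure(session_token: str) -> str:
        # Safe: the role comes from the server's own session store.
        session = SESSIONS.get(session_token)
        if session and session["role"] == "admin":
            return "account deleted"
        return "forbidden"

    print(delete_account_insecure({"role": "admin"}))  # the attacker wins
    print(delete_account_secure("token-123"))          # forbidden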

We deploy our software into a dangerous world, and at some point it will come under attack. There is no silver bullet that means security can be deemed finished. You are involved in a constant battle with attackers, but you can often gain great insight into your system, and identify areas for improvement, by trying to think the way they think.


Sunday 6 October 2019

Getting to Production


If you're a server side developer the fruition of your efforts is achieved once your code gets to production. For a team to be effective and efficient it's important that this final stage of releasing is not a frightening proposition. Delivering value into production is the whole reason for your team to exist, so it should be a natural and unhindered consequence of your efforts.

Achieving this productive release chain requires a deployment strategy that inspires confidence by being slick and repeatable, with a get out of jail solution for when things don't work out quite as expected.

There is no one correct strategy; the choice will depend on the makeup of your team, the code you are writing and the nature of your production environment. Each possibility has its advantages and disadvantages and it will be up to you to decide what works best for your situation.

Redeploy

Perhaps the simplest strategy involves simply redeploying code to your existing servers. This will usually involve momentarily blocking incoming traffic whilst the new code is deployed and verified.

The advantage of this strategy is that it doesn't involve standing up any additional infrastructure to support the release and is a straightforward process to follow. However it does have disadvantages.

Firstly, it involves service downtime: you are not able to service requests whilst the release is in progress. Additionally, since the release must be tested after deployment, whilst traffic is being blocked, the testing is conducted under pressure that is not necessarily conducive to thoroughness.

The other major disadvantage is the inability to quickly roll back the release should a problem develop. You must either fix forward by deploying updated code or redeploy the previous release; both mean repeating the original release process of blocking traffic, deploying and testing.

Blue Green

Using this strategy you maintain two identical production stacks, one blue and one green. At any point in time one is your active stack serving production traffic and one is inactive and only serving internal traffic, or stood down entirely. The release process involves deploying to the inactive stack, conducting appropriate testing and then switching production traffic to be served from the newly deployed code.
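
As a toy sketch of the cut-over mechanics (the stack URLs and health check below are invented), the switch itself can be as small as atomically repointing a router at the other stack once it passes verification:

    stacks = {"blue": "https://blue.internal.example.com",
              "green": "https://green.internal.example.com"}
    active = "blue"  # currently serving production traffic

    def healthy(stack: str) -> bool:
        # Stand-in for real smoke tests run against the inactive stack.
        return True

    def release(new_stack: str) -> None:
        global active
        assert new_stack != active, "deploy to the inactive stack"
        if not healthy(new_stack):
            raise RuntimeError("verification failed; active stack untouched")
        active = new_stack  # the cut-over is a single pointer swap

    def route(path: str) -> str:
        return f"{stacks[active]}{path}"

    release("green")
    print(route("/orders"))  # now served by the green stack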

The advantage of this strategy is that it involves virtually no downtime and allows thorough testing to be completed, on the infrastructure that will serve the traffic, prior to the release and without impacting users. It also has the substantial benefit that should the worst happen and you need to roll back, this is achieved simply by switching traffic back to the previously active stack, which has not been modified since it was last serving production traffic.

The main disadvantage of this approach is the cost of maintaining two production stacks. However, taking advantage of modern cloud techniques means your inactive stack only needs to be up and running in the build up to a release, reducing the additional cost.

A second disadvantage appears if your application is required to maintain a large amount of state: the transition of this data between the two stacks needs to be managed in order to avoid disruption to users.

Canary

Canary deployments can be viewed as a variant of the Blue Green approach. Whereas a Blue Green deployment moves all production traffic onto the new code at once, a Canary deployment does this in a more gradual, phased manner.

The exact nature of the switch will depend on your technology stack. When deploying directly to servers it will likely take the form of applying a gradually increasing traffic weighting between the old and new stacks. As traffic is moved onto the new code, error rates and diagnostics can be monitored for issues and the weighting reversed if necessary.
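
A minimal sketch of that weighted routing (the weight, backend names and sample size are illustrative); raising the weight gradually shifts traffic onto the canary, and setting it to zero rolls back:

    import random

    canary_weight = 0.05  # start by sending 5% of traffic to the new code

    def pick_backend() -> str:
        # Weighted choice between the stable and canary stacks; monitor
        # error rates before raising the weight further.
        return "canary" if random.random() < canary_weight else "stable"

    sample = [pick_backend() for _ in range(10_000)]
    print(sample.count("canary") / len(sample))  # roughly 0.05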

This approach is particularly well suited to a containerised deployment where the population of containers can gradually be migrated to the new code via the normal mechanism of adding and removing containers from the environment.

The advantage of this approach is that it makes it far easier to trial new approaches and implementations without having to commit to routing all production traffic to the new code. Sophisticated techniques can be applied to the change of traffic weighting to route particular use cases or users to the new code whilst allowing the majority of traffic to be served from the tried and tested deployment.

However there can be disadvantages: the deployment of new code, and any subsequent roll back, is naturally slower, although this can depend on your technology stack. Depending on the nature of your codebase and functionality, having multiple active versions in production can also come with its own problems in terms of backwards compatibility and state management.

As with most decisions related to software engineering, whether any particular solution is right or wrong can be a grey area. The nature of your environment and code will influence the choice of approach; the important thing to consider is the properties that any solution should offer. These are things like the ability to quickly roll back to a working state and the ability to effectively monitor the impact of a release, along with factors such as speed and repeatability.

Keep these factors in mind when choosing your solution and you will enjoy many happy years of deploying to production.