A blog for musings about what good code looks like and the lives of the people who produce it.
Sunday, 25 April 2021
Docker Basics
In my previous post I covered the Kubernetes terminology that finally helped me understand its purpose and operation. In this post I decided to delve one layer deeper and cover the basics of Docker.
As with my previous post, I must add a disclaimer that I am not a Docker expert, so the descriptions below aren't meant to be authoritative; they are simply the explanations that made sense to me and aided my initial understanding of the topic.
Containers vs Virtual Machines
Prior to the advent of containers, of which Docker is one implementation, the most common deployment mechanism in cloud computing was the virtual machine.
Virtual machines provide full hardware virtualisation by means of a hypervisor; they include a full operating system install along with abstractions of the hardware interfaces. Software is installed on the machine as if it were a normal physical server, but since the hypervisor can support multiple virtual machines on a single physical server, the available resources can be used far more efficiently.
Containers do not try to provide an abstraction of a whole machine; instead they share the host operating system kernel while keeping individual containers isolated from one another. Not only is this a more efficient use of resources, but it allows software to be packaged, via container images, in an immutable format that includes all of its required dependencies.
Daemon and Clients
Docker follows a client-server architecture. The Docker daemon provides an API for interacting with Docker functionality and also manages the containers running on the host machine.
The Docker client provides users with a mechanism for interacting with the Docker daemon. It allows users to build, run, start and stop containers, and provides many other commands for building and managing Docker images and containers.
A second client, called Docker Compose, allows users to define, via a YAML file, the make-up of a whole system comprising multiple containers. It defines which containers should run along with configuration such as networking and attachment to storage.
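As a purely illustrative sketch, here is what a small Compose file might look like for a web application backed by a database; the service names, images, ports and volume are placeholders rather than recommendations:

```yaml
# docker-compose.yml - illustrative sketch, service and image names are placeholders
version: "3.8"
services:
  web:
    image: example/my-web-app:1.2.0     # application image pulled from a registry
    ports:
      - "8080:80"                       # expose container port 80 on host port 8080
    depends_on:
      - db
  db:
    image: postgres:13                  # database container
    volumes:
      - db-data:/var/lib/postgresql/data   # attach persistent storage
volumes:
  db-data:
```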
Images, Containers and Registries
A Docker image defines an immutable template for how to build a container. A powerful aspect of Docker is that it allows images to be based on other images, creating a layered approach to their construction. For example, the image for your container might start from an image of the operating system you want to work with, add the web server you want to use, and then add your application on top. These steps are defined in a Dockerfile that provides the instructions for how each layer should be built up to form the container image.
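To make that concrete, below is a minimal, illustrative Dockerfile for a hypothetical Python web application; the base image, file names and start command are placeholders:

```dockerfile
# Start from an existing operating system / language runtime image
FROM python:3.9-slim

# Each instruction below adds a new layer on top of the base image
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt   # install application dependencies
COPY . .

# Define how the container should start
CMD ["python", "app.py"]
```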
A container is a running instance of an image. When running a container you specify the image you want it to be based on plus any configuration it might need. The important aspect is that the container contains everything necessary for the application to run. As opposed to deployment to a virtual machine, which might rely on certain dependencies already being present, a container is self-contained and therefore highly portable.
A Docker registry is a means of storing and sharing images; it acts like a library of container images that can be updated as new versions are defined. When using Docker Compose to define the make-up of a system you will often specify which container to run by pointing at a particular image version within a registry.
Clearly a technology as complex as Docker has many intricacies that I haven't covered in this post. However, more advanced topics are always easier to approach once you have a sound understanding of the basics. Never try to tackle the task of understanding everything about an area of technology in one go; instead see it as a journey and accept that it may take some time for the knowledge to take hold. The explanations I've provided in this post helped me on that journey, and hopefully they can help you too.
Sunday, 18 April 2021
Kubernetes Basics
As containerisation became a more and more popular deployment choice, it was natural that tools would be developed to manage systems that may comprise large numbers of containers, each focusing on a different aspect of functionality.
Kubernetes is one such tool, providing an orchestration layer for containers that handles everything from lifecycles and scheduling to networking.
It took me quite some time to get to grips with the concepts behind Kubernetes; I think this was largely because the definitions and explanations online can vary greatly. Presented below are the definitions that finally enabled me to understand what Kubernetes is trying to do and how it goes about achieving it.
I am not a Kubernetes expert, so by no means am I presenting these explanations as definitive; all I hope is that they help someone else start their journey towards understanding the purpose and operation of Kubernetes.
Nodes
Nodes are the virtual machines that make up a Kubernetes cluster, providing the capacity to run and manage applications.
One node is designated the master and implements the control plane functionality to maintain and manage the cluster. The other nodes, orchestrated by the master, run the applications that the cluster is responsible for. Which nodes run which applications will vary depending on the nature of the applications alongside constraints such as CPU and memory usage.
Pods
A pod is the smallest deployment unit in Kubernetes. It can run one or more containers, and since Kubernetes treats the pod as a single unit, when it is started or stopped so are all the containers within it.
Whilst in theory a pod could comprise multiple container types, it is a common pattern for there to be a one-to-one relationship between a pod and a container, for example to provide an API or access to an underlying datastore.
Sometimes additional container types may be added to a pod to provide cross-cutting concerns to the main container. This will quite often follow the sidecar pattern and relate to functionality such as acting as a sink for logging or providing a proxy to a network interface.
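A minimal pod manifest, sketched here with placeholder names and an assumed example image, simply lists the containers the pod should run:

```yaml
# pod.yaml - illustrative sketch, names and image are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: my-api
spec:
  containers:
    - name: api                     # the main application container
      image: example/my-api:1.0.0
      ports:
        - containerPort: 8080
```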
Deployments
We said earlier that one of the functions of Kubernetes is to manage lifecycle and scheduling concerns; a deployment is how we indicate to Kubernetes how these things should be dealt with.
A deployment might define (a sketch of such a manifest follows this list):
- A pod and an associated container image.
- That a certain number of instances of the pod should be running at all times.
- CPU and memory requirements for each pod; this may also involve setting limits on the amount of resource pods should be allowed to consume.
- A strategy for how an update to pods should be managed.
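Sketched as a manifest with placeholder names and illustrative resource values, those points might translate into something like the following:

```yaml
# deployment.yaml - illustrative sketch only
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3                          # keep three instances of the pod running
  selector:
    matchLabels:
      app: my-api
  strategy:
    type: RollingUpdate                # how updates to the pods are rolled out
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: example/my-api:1.0.0  # the pod's container image
          resources:
            requests:
              cpu: "250m"              # resources each pod needs
              memory: "128Mi"
            limits:
              cpu: "500m"              # and the most it may consume
              memory: "256Mi"
```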
Kubernetes will attempt to ensure that the deployment always matches the state described. If your application crashes, then an unresponsive pod will be swapped out for a fresh one; if the amount of resource a pod is consuming increases, then an existing pod may be moved to a node with more available resource.
When you update your deployment to reference a new version of your container, Kubernetes will also manage the transition from the existing pods to new pods running your updated container.
Services
Now, with our application running in containers within pods, we need a way for other applications in the cluster to be able to take advantage of it.
We wouldn't want pods to have to communicate directly with other pods: not only would this cause problems from a networking point of view, since pods can come and go, but we also need a mechanism to ensure load is distributed across all the pods running the application.
Services within Kubernetes act a bit like a load balancer; they sit above a group of pods providing a consistent route to the underlying functionality. When a pod requires functionality implemented by another pod it sends a network request to a DNS entry defined by Kubernetes that represents the service endpoint.
Pods can now be freely added to and removed from the service, and pods don't need to be aware of one another in order to make use of each other's functionality.
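A service manifest, again using placeholder names, ties a stable DNS name and port to whichever pods currently match its selector:

```yaml
# service.yaml - illustrative sketch only
apiVersion: v1
kind: Service
metadata:
  name: my-api          # other pods reach the API via this DNS name
spec:
  selector:
    app: my-api         # route traffic to pods carrying this label
  ports:
    - port: 80          # port exposed by the service
      targetPort: 8080  # port the containers actually listen on
```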
Ingresses
Services provide an internal route to functionality provided by pods, but it's likely that we will want to make some of this functionality available outside the cluster.
An ingress exposes an HTTP endpoint outside of the cluster that points at an internal service. In this sense an ingress acts like a reverse proxy onto the internal load balancer provided by the service, allowing applications outside the cluster to invoke the underlying functionality.
An ingress can also provide other functionality such as path-based routing or SSL termination to present a consistent and secure interface to the world outside the cluster.
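An illustrative ingress manifest, with a placeholder host name, might route external traffic to the service defined above:

```yaml
# ingress.yaml - illustrative sketch only
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-api-ingress
spec:
  rules:
    - host: api.example.com           # external host name for the cluster
      http:
        paths:
          - path: /
            pathType: Prefix          # simple path-based routing
            backend:
              service:
                name: my-api          # the internal service defined earlier
                port:
                  number: 80
```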
This has been a whirlwind tour of the basic concepts within Kubernetes, and it is by no means exhaustive. I hope it enables you to understand the purpose of Kubernetes and aids your learning of the intricacies of an actual Kubernetes cluster. The devil is always in the detail, but an understanding of the fundamental concepts provides a solid base on which to build the rest of your knowledge.
Thursday, 1 April 2021
Creating Chaos
In software development chaos engineering is the process of running experiments against a system in order to build confidence in its ability to withstand unexpected conditions or changes in environment.
First developed in 2011 by Netflix as part of their adoption of cloud infrastructure, its underlying principles have been applied to many situations, but typically experiments include things such as:
- Deliberately causing infrastructure failures, such as bringing down application servers or databases.
- Degrading network conditions by introducing increased latency, packet loss or errors in essential services such as DNS.
In an attempt to automate these experiments Netflix developed a tool called Chaos Monkey to deliberately tear down servers within its production environment. The guarantee that they would see these kinds of failures helped foster an engineering culture of resilience and redundancy.
We may not all be brave enough to run these experiments within our production environment, but if we choose to experiment in the safety of a test environment then what principles should we be following?
Steady State Hypothesis
A secondary advantage of chaos engineering is the promotion of metrics within the system. If you are to run automated experiments against your system then you must be able to measure their impact to determine how the system coped. If the system behaviour was not observed to be ideal and changes are made, then metrics also act as validation that the situation has improved.
Before running an experiment you should define a hypothesis around what you consider the steady state of your system to be. This might involve error rates, throughput of requests or overall latency. As your experiment runs, these metrics will indicate whether your system is able to maintain this steady state despite the deterioration in the environment.
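As a minimal sketch of the idea, written in Python with entirely illustrative metric names and thresholds, a steady state check might look something like this:

```python
from dataclasses import dataclass

@dataclass
class MetricsSnapshot:
    """A hypothetical snapshot of the metrics your system already emits."""
    error_rate: float       # fraction of requests that failed
    p99_latency_ms: float   # 99th percentile request latency in milliseconds

# Illustrative thresholds defining the steady state hypothesis.
MAX_ERROR_RATE = 0.01       # no more than 1% of requests may fail
MAX_P99_LATENCY_MS = 500.0  # latency must stay under half a second

def steady_state_holds(snapshot: MetricsSnapshot) -> bool:
    """Return True if the system is still within its hypothesised steady state."""
    return (snapshot.error_rate <= MAX_ERROR_RATE
            and snapshot.p99_latency_ms <= MAX_P99_LATENCY_MS)

# During an experiment you would periodically feed real measurements in, e.g.:
print(steady_state_holds(MetricsSnapshot(error_rate=0.002, p99_latency_ms=310.0)))
```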
Vary Real World Events
It's important that the mechanisms you use to degrade the environment are representative of the real world events your system might have to cope with. We are not looking to simulate an event such as a server failing; we are actually going to destroy it.
How you choose to approach the make-up of the failures being introduced is likely to depend on the impact such an event could potentially have and/or the frequency at which you think such an event might occur.
The important consideration is that there should be some random element to the events. The reason for employing chaos engineering is to acknowledge the fact that for any reasonably complicated system it is virtually impossible to accurately predict how it will react. Things that you may have thought could not happen may turn out to be possible.
Automate Continual Experiments
As you learn to implement the principles of chaos engineering you may rely on manual experimentation as part of a test and learn approach. However, this can be an intensive process; the ultimate goal should be to develop the ability to run continual experiments by introducing a level of automation.
Many automated tools, including Chaos Monkey, now exist to aid this type of automation. Once you have an appreciation of the types of experiments you want to run, and are confident your system produces the metrics necessary to judge the outcome, then these tools should be used to run experiments regularly and frequently.
The principles of chaos engineering are finding new applications in many different aspects of software development, including topics such as system security, for example by deliberately introducing infrastructure that doesn't conform to security best practices in order to measure the system's response and its ability to enforce policy.
Not every system will lend itself to a chaos engineering approach; for example, an on-premise system where servers are not as easily destroyed as is the case in the cloud may limit the options for running experiments. There also needs to be consideration of the size of the potential blast radius for any experiment and a plan for returning to the previous environmental conditions should the system fail to recover.
Your system's reaction to a large number of the experiments you run will likely surprise you in both good and bad ways. As previously stated, for a system of any reasonable complexity it is unrealistic to expect to have an accurate view of how it behaves under all possible conditions; the experiments you run are a learning exercise to try and fill in these gaps in your knowledge and ensure you are doing all you can to make sure your system performs the role your users want it to.
Sunday, 10 January 2021
Everything in the Repo
Interaction with source control is a daily task for most developers; the idea of not managing source code in this way would seem unthinkable. The advantages that effective source control can give have led many to look to bring more of the material and information required to write, deploy and run software into the same standard development practices.
This idea has gone by many names; at WeaveWorks they have coined the term GitOps. Although their description of the process assumes a container based deployment using Kubernetes, the principles they define for an effective GitOps strategy could be applied to many different deployment scenarios.
The Entire System Described In The Repository
No matter the nature of the software you are writing it will need to be built and deployed. To achieve this most teams will have defined CI/CD pipelines to build and deploy the code to various deployment environments.
A GitOps strategy ensures that these pipelines, and the infrastructure they serve, are declared alongside the source code. By cloning the repo you should have access to all the information required to understand the system.
The Canonical Desired System State Versioned in Git
Once your entire system is under source control you have a single source of truth for its current state and for any previous state. Changes to CI/CD and infrastructure are tracked alongside the code of the application, allowing you to move back and forth in time and maintain a working system.
The most obvious advantage this gives is in dealing with an unintended breaking change to the application related to CI/CD or infrastructure changes. Without these things being under source control you have to follow a painful process of trying to understand the changes that have been made and defining a plan for undoing them or trying to fix forward. A GitOps strategy reduces this task to something as simple as a git revert or redeploying from a previous release branch.
Approved Changes That Can Be Automatically Applied To The System
When applying changes to an application's source code, developers are used to going through a review process before changes are applied. This may involve a peer review by another developer and/or, by following a shift left strategy, a series of automated tests to ensure correctness.
By following a GitOps strategy these processes can be applied to changes to CI/CD and infrastructure as well as code. As with any shift left strategy this reduces the chances of the team being impacted by changes that may inadvertently break pipelines, result in a non-working application after deployment, or unintentionally increase costs due to a misconfigured infrastructure change.
Software Agents to Ensure Correctness and Alert on Divergence
Your ability to follow this principle will vary based on your deployment model, but in essence, by having source control be the source of truth for your system, software can automatically detect when this doesn't match the reality of your deployment and make the appropriate changes.
Not only does this mean you get to see your changes reflected in your environments at a faster pace, it also decreases the time to recover from human error once the bad change set has been reversed.
When looking to apply these principles you will have to analyse how they can best be implemented for your application and the environments you deploy into. As with most philosophies there is no one size fits all approach; the degree to which you are applying these principles may be an intangible measure rather than an absolute. But as always, an appreciation of the benefits is key, and using this to guide your approach will maximise your effectiveness.
Sunday, 3 January 2021
Cryptographic Basics
Cryptography, while essential in modern software engineering, is a complicated subject. While there is no need to gain an understanding of the complex mathematics that underlie modern cryptographic techniques, a well rounded engineer should understand the available tools and the situations in which they should be used.
What is presented below is by no means an in depth examination of cryptography, but it is a primer on the topics that are likely to come up as you try to ensure your code base is well protected.
Encryption vs. Hashing
Encryption and hashing are probably the two primary applications of cryptography, but the use case for each is different.
Encryption is a two-way, i.e. reversible, process. In order to protect data either at rest or in transit, encryption can be applied such that only those who have the corresponding key can view the underlying data. Encryption is therefore used to protect data in situations where access to the data needs to be maintained but the data must also be protected from unauthorised disclosure.
Hashing is a one-way, i.e. irreversible, process. Taking data as an input, a hashing algorithm produces a unique digest that cannot be used to get back to the original data. Hashing is therefore used in situations where either the integrity of data needs to be verified or where the data being stored is so sensitive that only a representation of the data should be stored rather than the data itself. A common example of the latter would be the storage of passwords.
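A short Python sketch, using the standard library's hashlib module and throwaway example messages, illustrates both points: the digest cannot be used to get back to the original data, and a tiny change to the input produces a completely different digest.

```python
import hashlib

# The same input always produces the same fixed-size digest...
print(hashlib.sha256(b"transfer 100 pounds to account 1234").hexdigest())

# ...while a tiny change to the input produces a completely different digest,
# which is what makes hashes useful for verifying integrity.
print(hashlib.sha256(b"transfer 900 pounds to account 1234").hexdigest())
```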
Stream vs Block Ciphers
Encryption is implemented by the application of ciphers: algorithms that, given an input (referred to as plaintext), will output the same data in an encrypted form (referred to as ciphertext).
These ciphers are often categorised based on how they view the input data.
Stream ciphers view the data as a continuous stream of bits and bytes; they produce a corresponding stream of pseudo-random data that is combined with the input data to produce the encrypted output. A block cipher divides the data up into fixed-size blocks, using padding to ensure the overall size of the encrypted data is a whole number of these fixed-size blocks.
Stream ciphers have proven to be complicated to implement correctly, mainly because of their reliance on the unpredictability of the generated key stream. Because of this the most popular ciphers are mostly block ciphers, such as the Advanced Encryption Standard (AES).
While block ciphers are now the most widely used, attention also needs to be paid to the mode they are used in. The mode largely controls how the blocks are combined during the encryption process. When using Electronic Code Book (ECB) mode each block is encrypted separately and the results are simply concatenated to form the encrypted output. While this may seem logical it leads to weaknesses: when separate blocks contain the same data they will produce the same output, which can present an advantage to a possible attacker. For this reason other modes such as Cipher Block Chaining (CBC) combine each block with the previous one as the algorithm progresses, ensuring that even if blocks contain the same data they will produce different encrypted output.
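As a sketch of what this looks like in practice, the example below uses Python's cryptography package (an assumption about your toolchain) to encrypt and decrypt a message with AES in CBC mode and PKCS7 padding; the key, IV and message are throwaway illustrative values:

```python
import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)   # 256-bit AES key (illustrative; manage real keys carefully)
iv = os.urandom(16)    # fresh random initialisation vector for CBC mode

# Pad the plaintext up to a whole number of 16-byte blocks.
padder = padding.PKCS7(128).padder()
padded = padder.update(b"a secret message") + padder.finalize()

# Encrypt with AES in CBC mode so identical blocks give different ciphertext.
encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = encryptor.update(padded) + encryptor.finalize()

# Decryption reverses the process using the same key and IV.
decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
unpadder = padding.PKCS7(128).unpadder()
plaintext = unpadder.update(decryptor.update(ciphertext) + decryptor.finalize()) + unpadder.finalize()
print(plaintext)
```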
Cryptographic Hashing
As we discussed earlier a hashing function is a one-way function that produces a unique digest of a message. Not all hashing algorithms are explicitly designed for cryptographic purposes.
A cryptographic hashing function should have the following properties:
- It should be deterministic, meaning the same input message will always lead to the same digest.
- It should be a fast operation to compute the digest of a message.
- It should be computationally infeasible to generate a message that gives a specific digest.
- It should be computationally infeasible to find two messages that produce the same digest.
- A small change in the input message should produce a large change in the corresponding digest.
When an algorithm has these qualities it can be applied to provide digital signatures or Message Authentication Codes (MACs) to protect the integrity and authenticity of data either at rest or in transit.
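For instance, Python's standard library hmac module combines a hash function with a secret key to produce a MAC; the key and message below are purely illustrative:

```python
import hashlib
import hmac

secret_key = b"an-illustrative-shared-secret"   # in practice, a securely stored random key
message = b"amount=100&to=account-1234"

# The sender computes a tag over the message and attaches it...
tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()

# ...and the receiver recomputes the tag and compares in constant time.
received_tag = tag
print(hmac.compare_digest(tag, received_tag))   # True if the message is authentic and unmodified
```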
We said earlier that there is no need to understand the complex mathematics behind these cryptographic techniques; to take this a step further, it's important that you don't attempt to implement these techniques yourself. The complexity involved means the likelihood of making a mistake in the implementation is high, and this can lead to bugs that attackers can exploit to undermine the security you are trying to provide.
Instead you should view cryptography as a toolbox providing implements you can use to protect you and your users; the important thing is to learn which tool should be used for which job and to become an expert in its application.