Monday, 17 March 2025

The Virtual World

I don't have any statistics to back up this claim, but given the prevalence of cloud-native approaches I am willing to bet that there are more virtual machines in the world than real physical computers.

We take for granted that we can bring up a virtual machine within a few clicks, use it for a vast variety of different workloads and then spin it back down again. Servers have gone from things that need to be carefully maintained and looked after to ephemeral resources we can create and throw away.

The technology underpinning this ability is the hypervisor: a piece of software that provides an abstraction of physical resources such as CPU and memory to allow the creation of virtual machines. This allows a powerful host server to be utilised to provide a large number of isolated guest machines, increasing efficiency and productivity.
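
To make this a little more concrete, below is a minimal sketch of asking a hypervisor for a guest with a given amount of CPU and memory, using the libvirt Python bindings against a local KVM/QEMU host. The guest name and sizes are purely illustrative, and a real guest would also need a disk image and network interface in its definition.

    # Minimal sketch: create and then tear down a transient guest via libvirt.
    # Assumes the libvirt-python package and a local KVM/QEMU hypervisor.
    import libvirt

    GUEST_XML = """
    <domain type='kvm'>
      <name>demo-guest</name>
      <memory unit='MiB'>2048</memory>  <!-- 2 GiB of host memory presented to the guest -->
      <vcpu>2</vcpu>                    <!-- two virtual CPUs scheduled onto physical cores -->
      <os><type arch='x86_64'>hvm</type></os>
    </domain>
    """

    conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
    guest = conn.createXML(GUEST_XML, 0)    # define and start the guest in one call
    print(f"Started guest '{guest.name()}' with id {guest.ID()}")

    guest.destroy()                         # power the guest back off
    conn.close()

The point isn't the specific library; it's that the "hardware" the guest sees is just a few lines of configuration handed to the hypervisor.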

History of Virtualisation

Virtualisation had existed previously to the extent that multiple software applications could run concurrently on the same hardware. In the late 1960s IBM developed a research tool called SIMMON that took this a step further, allowing hardware resources themselves to be virtualised on mainframe computers. This was soon extended to cover operating system resources such as kernel tasks, and with that the idea of virtual machines built on top of real hardware was born.

For many years this virtualisation was the preserve of mainframe systems, until around 2005, when virtualisation for x86 systems started to gain momentum, helped along by hardware support such as Intel's VT-x and AMD's AMD-V extensions.

Originally the hypervisors being developed were complex and suffered from relatively poor performance, but as the technology advanced, the level of virtualisation that could support the cloud services we now take for granted started to emerge.

Types of Hypervisor

Hypervisors can be broadly categorised into two types: type one and type two.

Type one hypervisors run directly on the host machine's hardware and therefore eliminate the need for an underlying operating system. For this reason they are often referred to as native or bare metal hypervisors.

Type one hypervisors are very efficient and often more secure; they are typically used within data centres on powerful servers hosting a large number of virtual machines.

Type two hypervisors run on the host machine's operating system in the same way as any other application; they provide an abstraction of the host operating system's resources to create an isolated guest that runs as a process the host can interact with. For this reason they are often referred to as hosted hypervisors.

Type two hypervisors are less efficient than type one because the host operating system sits between them and the underlying hardware, preventing direct control of it. However, they are a more practical option when virtualisation is needed on a machine that isn't a server, such as a developer's laptop or workstation.

Benefits of Virtualisation

Aside from the ability to turn one server into multiple virtual machines, what other benefits does virtualisation bring?

As we've already touched on, virtualisation can greatly increase efficiency. Rather than hosting multiple applications on their own servers, each potentially not making full use of the available resources, every application can be hosted in its own virtual machine on the same server.

The ability for virtual machines to be created quickly and automatically also provides an element of scalability that doesn't exist when new physical servers need to be added to a farm. The fact that the underlying system is virtualised also provides an element of portability: the application can be hosted on any physical server capable of running a virtual machine to the same specification.

Coupled to this idea of portability are the concepts of snapshots and failure recovery. Since the environment an application is running in is software controlled, the current state of the virtualised hardware can be recorded as a snapshot, which can later be used to deal with failure by returning the system to a known good state.
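
Many hypervisor stacks expose this directly. As a rough illustration, here is a sketch using the libvirt Python bindings again; the guest and snapshot names are made up, and error handling is omitted.

    # Sketch: record a guest's state as a snapshot, then roll back to it later.
    import libvirt

    conn = libvirt.open("qemu:///system")
    guest = conn.lookupByName("demo-guest")

    # Capture the current state under a memorable name.
    guest.snapshotCreateXML(
        "<domainsnapshot><name>known-good</name></domainsnapshot>", 0)

    # ...later, after something has gone wrong, return to that known good state.
    snapshot = guest.snapshotLookupByName("known-good", 0)
    guest.revertToSnapshot(snapshot, 0)
    conn.close()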

Virtualisation also provides the ability to create tailored environments for legacy systems. Where an older application has requirements that are no longer compatible with modern hardware, virtualisation offers a means to create such an environment whilst still utilising a cloud-based architecture.

We sometimes take cloud computing for granted without thinking about the technology that underpins it. For the majority of applications it isn't really necessary to know how hypervisors enable them to be deployed into a cloud environment. But I still think it helps us be well rounded engineers to at least have knowledge of the layers in our stack, what they provide, and their limitations and strengths.

Monday, 10 March 2025

Distributing Problems

Patterns and practices in software engineering can be very cyclical; the names and terminology applied may change, but we often go back to old ideas that had previously been dismissed.

I think this is indicative of the fact that many problems in software development don't have perfect answers. It is often a balance of pros and cons, and sometimes the weighting applied to them varies over time, so we flip between competing approaches.

One of these areas relates to whether it is better to centralise or distribute. Should our applications be monolithic in nature or a collection of loosely coupled distributed parts? For a long time the argument has been seen to be won by distributed computing and the microservices approach. However, in recent times the monolithic approach has started to be seen as not always being the wrong idea.

This article shouldn't be seen as an argument against a distributed approach; it should be viewed more as an argument against premature optimisation. By understanding the drawbacks of distribution you can make a better judgement on whether it's the right approach for your application right now.

Interconnected Services

Distributed computing is a relatively broad term, but within the context of this article we are taking it to mean the pattern of dividing an application up into a collection of microservices.

Usually microservices are built to align to a single business domain or process, with the application being the sum of these parts communicating via a lightweight mechanism such as RESTful APIs or, increasingly, an event driven architecture.
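
As a toy illustration, a single microservice aligned to one business domain might look something like the following Flask sketch; the "orders" domain, the route and the in-memory store are all invented for the example.

    # Sketch: one small service owning the "orders" domain and exposing it
    # over a REST API. Other services interact with it only via this API
    # (or via events it emits), never by reaching into its data store.
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    orders = {}  # stand-in for the service's own private data store

    @app.post("/orders")
    def create_order():
        order = request.get_json()
        order_id = len(orders) + 1
        orders[order_id] = order
        return jsonify({"id": order_id, **order}), 201

    if __name__ == "__main__":
        app.run(port=5000)

The application as a whole is then the collection of services like this one, plus whatever glues their communication together.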

You can see from this that the term microservice is quite loosely defined; a lot of the issues created when applying this approach can often be traced back to the fact that defining the boundaries between microservices, and their relative size, is quite a hard problem.

The best explanation I've seen for this is that a microservice should be as easy to replace as to refactor, basically meaning microservices shouldn't be so large as to negate the option of starting again with their design. 

I think this idea is much easier to apply when starting with a blank sheet of paper. When splitting up an existing application it is often more pragmatic to not subdivide too quickly, as further splitting an existing service is often easier than trying to coalesce several services back into one once they've been split.

Fallacies of Distributed Computing

In 1994 L. Peter Deutsch at Sun Microsystems devised a list of seven fallacies of distributed computing, building on top of the work of Bill Joy and Dave Lyon.

The fallacies represent seven assumptions that often cause the architecture and development of a distributed system to head in the wrong direction. 

The first is that the Network is Reliable. This often leads to services not being written with network-related error handling in mind, meaning that when network errors do occur, services stall and become stuck, consuming resources while waiting for a response that isn't forthcoming.

The second and third, Latency is Zero and Bandwidth is Infinite, are related: both can cause developers to give little thought to the nature and volume of the data that is propagating through the network.

Number four is that the Network is Secure, which can lead to complacency where possible intrusion from malicious insiders isn't considered.

Number five is that Network Topology Doesn't Change, which, in a similar way to two and three, is indicative of us not thinking about the fact that the network our applications operate in is just as dynamic an element as our code.

Number six is that There is One Administrator; this can cause us to fail to recognise inconsistent or contradictory policies around network traffic and routing.

Number seven is that Transport Cost is Zero; here we need to factor into our thinking that an API call and the resultant transfer of data has a cost in terms of transmission time.

Strategies to Compensate

The fallacies described in the previous section shouldn't be seen as arguments for why distributed systems shouldn't be built; rather, they are things that should be considered when they are.

We often think that our software is deployed into a homogeneous environment with perfect conditions, but this is frequently not the case.

Errors in transport can occur, so we should have an effective strategy to detect them, retry calls if we believe this may lead to a successful outcome, and also have a strategy for when calls continue to fail, such as a circuit breaker, to avoid filling the network with requests that are unlikely to receive the desired response.
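
As a rough sketch of that combination (the URL, thresholds and timings are all invented for the example, and a real system would likely use an existing resilience library rather than hand-rolling this):

    # Sketch: retries with exponential backoff plus a very crude circuit breaker.
    import time
    import requests

    FAILURE_THRESHOLD = 5    # consecutive failed calls before the circuit opens
    COOL_DOWN_SECONDS = 30   # how long to fail fast once the circuit has opened

    failures = 0
    open_until = 0.0

    def call_service(url):
        global failures, open_until
        if time.monotonic() < open_until:
            return None                     # circuit open: fail fast, spare the network
        for attempt in range(3):
            try:
                response = requests.get(url, timeout=(3, 10))  # connect/read timeouts
                response.raise_for_status()
                failures = 0                # a success closes the circuit again
                return response.json()
            except requests.RequestException:
                time.sleep(2 ** attempt)    # back off before the next attempt
        failures += 1
        if failures >= FAILURE_THRESHOLD:
            open_until = time.monotonic() + COOL_DOWN_SECONDS
        return None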

We must realise that as load increases on our system, the size of the data we are passing between elements may start to be a factor in their performance. Even if each individual request or response is not large, in sufficient quantities their impact may be felt.

We have to maintain a healthy distrust of the elements of our system we are interacting with. A zero trust approach means we do not inherently trust any element simply because it is inside the network; all elements must properly authenticate and be authorised.
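
In code that often comes down to every service verifying the credentials on every call it receives, even from "internal" callers. A minimal sketch using the PyJWT library (the key, audience and scope names are illustrative):

    # Sketch: verify a signed token on every incoming request, then check the
    # caller is actually authorised for the action, regardless of where the
    # request came from.
    import jwt  # the PyJWT library

    def authenticate_request(token, public_key):
        try:
            claims = jwt.decode(
                token,
                public_key,
                algorithms=["RS256"],       # only accept the algorithm we expect
                audience="orders-service",  # token must be intended for this service
            )
        except jwt.InvalidTokenError as error:
            raise PermissionError("rejected: invalid credentials") from error
        if "orders:write" not in claims.get("scope", "").split():
            raise PermissionError("rejected: missing required scope")
        return claims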

We must also consider that when we subdivide our system into more elements we introduce a cost, in that those elements will need to communicate; this cost must be balanced against the benefit the change in architecture would bring.

These are only some of the things we need to think about with a distributed approach. This post is too short to cover them in great detail, but the main takeaway should be that a distributed approach isn't cost free and sometimes it might not offer advantages over a monolithic approach. Getting a distributed approach right is hard and not an exact science; many things need to be considered and some missteps along the way should be expected.

As with any engineering decision, it's not a matter of right or wrong; it's a grey area where pros and cons must be balanced.

Tuesday, 25 February 2025

Solid State World

It's often said that technology moves quickly. I'm actually of the opinion that it's our understanding of how to utilise technology that grows rapidly.

The invention of certain pieces of technology is transformative, so much so that in the initial stages we struggle to grasp their potential. Then over time the realisation of what's possible grows and more and more applications materialise.

The World Wide Web was one of those technological jumps forward, and Artificial Intelligence will undoubtedly, if it isn't already, be another notable point in history. However, I believe the true giant leap forward is often not talked about as much as it should be, despite underpinning everything that has come since and kick-starting a revolution that has lasted more than 75 years.

That giant leap forward was the invention of solid state electronics.

Triodes to Transistors

The early building block of electronics was the vacuum tube triode. These devices had applications across radar, radio, television and many other spheres. However, they were large and power-hungry, making it difficult to use them reliably in increasingly complex equipment.

The key to the miniaturisation of electronics was the invention of the transistor. The theory behind field effect devices was first formulated in the 1920s, but it wasn't until the late 1940s that the first practical transistors were built.

Many different scientists independently worked on the transistor concept; however, William Shockley, John Bardeen and Walter Brattain are widely credited with its invention, demonstrating the first working transistor at Bell Laboratories in 1947. The Metal Oxide Semiconductor Field Effect Transistor (MOSFET) followed in 1959, developed by Mohamed Atalla and Dawon Kahng, also at Bell Laboratories.

These transistors had many applications, one of which was the ability to act as a "digital" switch; with this, the era of the semiconductor and solid state electronics was born.

Printing with Light

Once the MOSFET had been invented, the next challenge was developing techniques to manufacture transistors reliably whilst also continuing their miniaturisation. The ability to produce transistors at a smaller size would enable them to be more densely packed as well as reducing power consumption.

The idea of photolithography, using light to print patterns into materials, had been around for some time. But in the late 1950s Jules Andrus at Bell Laboratories looked to use similar techniques to build solid state electronics (Moe Abramson and Stanislaus Danko of the US Army Signal Corps are also credited with inventing similar techniques in the same period).

Using this technique a semiconductor substrate is covered with a material called photoresist. A mask is then placed over the top such that only certain areas of the material are exposed to a light source. The exposed areas go through a chemical change that renders them either soluble or insoluble in a developer solution depending on the type of photoresist that is used. Finally the material then goes through an etching process to leave the desired pattern of the semiconductor based components.

Ultraviolet light is typically used in the process today, and much of the push towards ever smaller transistors has come from using ever decreasing wavelengths of light in the photolithography process.

Moore's Law

Gordon Moore was an American engineer who originally worked at Shockley Semiconductor Laboratory, a company founded by William Shockley, the co-inventor of the transistor. He would later go on to be a founding member of both Fairchild Semiconductor and Intel.

In 1965, while working at Fairchild Semiconductor, he was asked what he thought might happen to semiconductor technology over the next ten years. His answer to that question would eventually come to be known as Moore's Law.

Put simply, Moore's Law states that the number of transistors that can fit into a given size of integrated circuit (or "chip") will double roughly every two years.
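
To get a feel for the compounding effect, here is a quick back-of-the-envelope sketch in Python. The Intel 4004's roughly 2,300 transistors in 1971 is used purely as a convenient illustrative starting point.

    # Sketch: doubling every two years means a factor of 2 ** (years / 2).
    def moores_law(start_count, years):
        """Projected transistor count after 'years' of doubling every two years."""
        return start_count * 2 ** (years / 2)

    # ~2,300 transistors in 1971, projected forward 50 years...
    projected = moores_law(2_300, 2021 - 1971)
    print(f"{projected:,.0f}")  # ~77 billion, the same ballpark as today's largest chips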

Originally Moore predicted this would be the case for ten years; remarkably, it has continued to hold for at least 55 years, with industry voices only in the last couple of years starting to question whether we may now be reaching the limit of Moore's original prediction.

The fact that the industry has followed Moore's Law for so long has been a major driver of the continual increase in processing power that has catalysed innovation. It is now believed that the transistor is the most widely manufactured device in human history; in 2018 it was estimated that as many as 13 sextillion (13 followed by 21 zeros) had been manufactured.

The lives of virtually every human on the planet have been touched by solid state electronics and the technologies it underpins. It is fair to say that the birth of solid state electronics marked the birth of our technological age and changed the world forever. Of all the transformative technologies that have come since, and will continue to be developed in the future, I don't believe any will be as impactful as this initial giant leap forward.