Kubernetes: A Love Story
The life of application developers in today’s cloud-native world has been profoundly affected by two conceptual developments: containers and microservices. These have led to the creation and maturation of orchestration engines, which offer software developers more simplicity, velocity, reliability and agility in launching and managing cloud-first applications. As we move towards a cloud-native world, it is important to understand how software development will need to evolve and what effect these orchestration engines will have on our future. To that end, I will provide some insight into the motivations and underlying theory behind Google’s Kubernetes, the most popular open source container orchestration engine in use today. I will briefly summarize the historical moments that led to the creation of Kubernetes, focus on containers (the workhorse behind Kubernetes), describe what Kubernetes does and doesn’t do, and try to envision where this technology is going.
Our story begins in the late 1960s, when some smart cookies at IBM started to spend a lot of time and effort developing robust time-sharing solutions. Given how expensive computational resources were at the time, they wanted a way to share their machines among large groups of users. To do this, they came up with the concept of virtualization, which gave engineers the ability to simulate hardware (like CPUs, memory and hard drives) that behaved identically to the corresponding physical device. This led to the development of a new piece of software called a hypervisor, which allowed a physical host machine to operate multiple virtual machines (VMs) as guests, helping to maximize effective use of computing resources such as memory, network bandwidth, and CPU cycles… and there was much rejoicing.
This meant that one computer could be shared by a number of users, as they could each run a VM in parallel on the same physical machine, each VM with its own operating system (OS). More importantly, if a VM experienced an error, crash, or malware attack, the damage wouldn’t extend to any other VMs or other machines, so multiple people could share computing resources and act as if they owned the entire machine without having to worry about bad actors doing dangerous things. As time went on, the number of use cases for VMs continued to grow, with engineers using them to test how their software would run on different operating systems, storing snapshots of virtual machines as a way of backing up information, and doing things that they wouldn’t dare do with their own machine, like executing malicious code or visiting suspicious websites.
However, these VMs were only effective because historically most software applications were written using what is called a monolithic architecture, i.e. massive blocks of code. This meant that both the user-interface and data access code were combined into a single program, which introduces a number of issues when you try to operate an application at scale. With a monolithic architecture, every module in your codebase is tightly coupled, and if a single one fails the entire system can break down. If different parts of your application have conflicting resource requirements, it becomes hard to provision for them independently. And if you try to add new features, you need to make sure they integrate properly with everything that has come before. In the same way that professional sprinters find it much harder to shave the last few milliseconds off their 100-meter dash times, as applications scale, massive changes are needed to glean even small improvements in efficiency, and it became clear to large enterprises that this situation was untenable.
A new architecture was needed, and when the term ‘microservices’ was coined for that architecture in 2011, it sparked the transition from virtual machines to the workhorse of modern software applications: containers.
Microservices are a software development technique that arranges an application as a collection of loosely coupled services which communicate over a network to operate as a collective. The most intuitive way to think about them is to consider splitting up your monolith into a bunch of smaller applications: instead of having functions call each other, you rely on API calls between these smaller applications (called microservices) to transfer information and get the job done. Initially, virtual machines were used as the building blocks for running these microservices. Each virtual machine corresponded to a specific microservice, and you could spawn a lot of these to work in parallel on your existing hardware, but there was a problem. Virtual machines each require their own copy of an operating system, carry their own libraries, and simulate all of the hardware as if they were an independent computer, which means that they take up a lot of memory. The solution to this memory problem came in the form of containers.
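To make the shift from function calls to API calls concrete, here is a minimal sketch using only the Python standard library. Everything in it is hypothetical and chosen purely for illustration: the pricing logic, the 20% tax rate, the `/price/<amount>` route, and the function names. It shows the same computation reached first as an in-process function (the monolith) and then as a small HTTP service (the microservice).

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def price_with_tax(amount: float) -> float:
    """Monolith style: the logic is an ordinary in-process function.

    The 20% tax rate is a made-up value for illustration.
    """
    return round(amount * 1.2, 2)


class PricingHandler(BaseHTTPRequestHandler):
    """Microservice style: the same logic exposed over HTTP."""

    def do_GET(self):
        # Hypothetical route: /price/<amount>
        amount = float(self.path.rsplit("/", 1)[-1])
        body = json.dumps({"total": price_with_tax(amount)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def start_pricing_service() -> HTTPServer:
    """Run the service in a background thread on a free local port."""
    server = HTTPServer(("127.0.0.1", 0), PricingHandler)  # port 0: OS picks one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def price_via_api(port: int, amount: float) -> float:
    """What used to be a function call is now a network round trip."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", f"/price/{amount}")
        response = conn.getresponse()
        return json.loads(response.read())["total"]
    finally:
        conn.close()
```

Calling `price_via_api(server.server_address[1], 100.0)` on a server returned by `start_pricing_service()` gives the same answer as `price_with_tax(100.0)`, but it now passes through serialization, a socket, and a timeout, which is exactly where the networking and latency concerns of microservices come from.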
Containers work similarly to virtual machines, except that they don’t need their own copy of an operating system: they share the kernel of the host OS while keeping their processes isolated from one another. As with virtual machines, it is best practice for a container to represent a single microservice, but as illustrated in the image below they can provide the same functionality as VMs with far less memory:
Additionally, because they share a common operating system, it is much easier for containers to be updated with bug fixes, patches, and so on. To use containers, all one needs is an operating system with a container runtime and a sufficient amount of compute and memory, which makes them extremely portable across machines and the weapon of choice for hybrid cloud solutions. A hybrid cloud is a computing environment that uses a mix of on-prem (local data centers), private cloud (a cloud that is tailored to a company’s specific needs and is not publicly available), and public cloud services (AWS, Azure, GCP, etc.), and it is becoming the standard enterprise architecture. Containers work particularly well here because once they are packaged they run the same way on any host with a compatible operating system and runtime. Thus, even if a company has a hybrid cloud environment, it can deploy its containers in exactly the same way in each location, which is very powerful.
While the microservice architecture and the use of containers provide massive improvements in efficiency and allow for hybrid deployments, switching to this paradigm requires an inordinate amount of orchestration work between the various containers. Where one used to have a simple function call reaching out to a part of the code in the same monolithic application, we are now relying on API calls, and we need to deal with the headache of networking and latency. If a container dies we need to spawn a new one, we need to scale services on the fly and make sure that communication isn’t interrupted, and we need to be able to update our services in real time. This is where Kubernetes (k8s) comes in.
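The "if a container dies we need to spawn a new one" requirement can be pictured as a loop that keeps comparing what is running with what should be running, and then closes the gap. The sketch below is a toy model of that idea in plain Python. It is not Kubernetes code and does not use any Kubernetes API; the `Container` class, the `reconcile` function, and the `web-N` names are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

_ids = count()  # hypothetical source of unique container names


@dataclass
class Container:
    name: str
    healthy: bool = True


def reconcile(containers: List[Container], desired_replicas: int) -> List[Container]:
    """Return a container list that matches the desired state.

    1. Drop containers that failed their health check.
    2. Start replacements until the desired count is reached.
    3. Remove extras if the desired count was lowered.
    """
    running = [c for c in containers if c.healthy]
    while len(running) < desired_replicas:
        running.append(Container(name=f"web-{next(_ids)}"))
    return running[:desired_replicas]


if __name__ == "__main__":
    fleet = reconcile([], desired_replicas=3)      # initial rollout
    fleet[0].healthy = False                       # simulate a crash
    fleet = reconcile(fleet, desired_replicas=3)   # self-healing
    fleet = reconcile(fleet, desired_replicas=5)   # scaling up under load
    print([c.name for c in fleet])
```

The design point is that the operator only declares the desired number of replicas. Crashes and scaling are both handled by the same comparison, run over and over, rather than by separate one-off scripts.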
Kubernetes is a container orchestration engine that solves some of the problems inherent to the microservice architecture out of the box. It scales your services when traffic to your application spikes, guards your application against failures by constantly checking the health of nodes and containers and respawning them if needed, and manages the routing of traffic and load balancing across your application. It allows you to automatically roll out new software or roll back to a previous deployment in case of failure, supports canary deployments, which let you test your new deployment in parallel with your previous version before transitioning between the two, and supports a wide spectrum of programming languages and frameworks. In short, managing communication and information flow between services is an incredibly complex problem, and if you want to use a microservice architecture you will be forced to solve it. Unless a company is willing and able to implement a system similar to Kubernetes in-house, which would require significant investment and a world-class engineering team, Kubernetes is the de facto, and to some extent inescapable, choice.
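The canary idea described above, running the new version in parallel with the old one and shifting traffic only while it behaves, can be sketched in a few lines. This is a simplified model under stated assumptions, not how Kubernetes implements it: the `CanaryRouter` class, the `next_canary_percent` function, the 5% error threshold, and the 25-point promotion step are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class CanaryRouter:
    """Send a fixed share of requests to the new ("canary") version."""

    canary_percent: int  # 0-100: share of traffic for the canary
    seen: int = 0        # number of requests routed so far

    def route(self) -> str:
        # Deterministic split: in every block of 100 requests, the first
        # `canary_percent` go to the canary. A real router would more
        # likely spread them out, e.g. by random choice.
        slot = self.seen % 100
        self.seen += 1
        return "canary" if slot < self.canary_percent else "stable"


def next_canary_percent(
    current: int,
    canary_errors: int,
    canary_requests: int,
    max_error_rate: float = 0.05,  # hypothetical threshold
    step: int = 25,                # hypothetical promotion step
) -> int:
    """Promote the canary gradually while it is healthy, else roll back."""
    if canary_requests and canary_errors / canary_requests > max_error_rate:
        return 0  # roll back: all traffic returns to the stable version
    return min(100, current + step)


if __name__ == "__main__":
    router = CanaryRouter(canary_percent=10)
    hits = [router.route() for _ in range(1000)]
    print(hits.count("canary"), "of", len(hits), "requests hit the canary")
    print("next share:", next_canary_percent(10, canary_errors=0, canary_requests=100))
```

Keeping the routing decision separate from the promote-or-roll-back decision mirrors the text: the two versions run side by side, and the transition between them is a policy applied on top of observed health.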
Despite its advantages, Kubernetes is far from perfect. As a result, a number of third-party applications that address Kubernetes’ shortcomings have started to gain traction. These applications deal with some of the headaches that come up when enterprises attempt to create increasingly complex Kubernetes deployments, and fix inherent issues that one would expect to encounter when attempting to apply a ‘one size fits all’ piece of software to a unique business context. Some of the most commonly used apps deal with continuous integration and deployment (ensuring that services are up to date and can be deployed without downtime), monitoring of the myriad services that make up an enterprise architecture, managing configurations of these services, keeping deployments secure, and enabling efficient storage of information. This isn’t a short list, and one of the primary issues facing Kubernetes today is how difficult it is to implement properly. Everyone wants to transition towards containers and microservices, but the rush to adopt this shiny new technology has led to suffering and heartbreak for development teams that are ill-equipped to manage this complex framework. As such, one could argue that the most successful Kubernetes companies will be the ones that succeed in reducing the complexity of using Kubernetes in production.
Kubernetes offers an exciting opportunity for enterprises to piece together their architecture from a variety of third-party sources without having to spend as much time creating these disjointed pieces in-house. Unfortunately, many companies do not have the technical maturity or talent to deploy a fully functional and optimized Kubernetes instance, and this is a massive opportunity for both new entrants and incumbents to capitalize on. We are already starting to see incumbents like Amazon, Google and Microsoft deploy more closely managed Kubernetes services[1] in which an increasing number of the previously identified issues are solved out of the box, but enterprises are rightfully concerned that this will lock them into a single vendor that is hard to leave. On the opposite end of the spectrum, there are a number of startups trying to improve the current Kubernetes workflow by providing clean UI experiences for installing, updating, and delivering Kubernetes applications to enterprises that may not have the technical expertise to manage all of this themselves. These startups allow their clients to run their deployments on public clouds, on premises, and in hybrid environments, but they face stiff competition from the aforementioned gorillas in the room.
Though it is hard to predict who will drive the change to Kubernetes, one thing is clear: microservices are here to stay. They are our best response to the cloud-native world we are moving towards, and are based on breakthroughs that have underpinned modern computing for over half a century. We are far from an ideal world in which the complexity inherent in networking, storage, security, CI/CD, and monitoring is handled by one all-encompassing solution, but we are heading in that direction. Kubernetes will increase the velocity of software deployment, reduce the barrier to creation for new software engineers and enterprises, and make it easier than ever to turn ideas into reality. By merging together fully functional pieces of an application and running them on the cloud, the engineering world is once again setting itself apart as one of the few truly collaborative professional communities. Since Alan Turing first posited the idea of intelligent machines, few things have remained constant, but we have relentlessly moved towards a world in which the power of computation is available to more people, with less difficulty, for less money. What was once reserved for academics and researchers is now publicly available; where you once needed a degree to make a website, we now have services like Wix; and though you used to have to write software entirely by yourself, we now have open-source tools like Kubernetes to simplify your journey from zero to one. Kubernetes is the logical next step in an inevitable march towards a world in which software is fully democratized, and I for one could not be more excited.
[1] It’s important to note that all of these tech giants tried to create orchestration engines of their own, and Google’s Kubernetes has become the clear victor.
Sebastien Goddijn
Author