First things first. Kubernetes (and its variations) is great and, currently, the only logical choice for companies building microservices in the cloud (no matter what kind). This article has nothing to do with the technology itself. Instead, it’s a combination of insights I have gathered from working with customers embarking on Kubernetes projects on an almost daily basis over the last three years.
This article is not meant to dissuade or discourage you from embarking on a Kubernetes journey. Quite the opposite. It’s intended to serve as a thoughtful guide to adopting Kubernetes as the project stands today. So if you think the company you work for has already managed to overcome many of the issues raised below, Kubernetes was a wise choice. But for the rest of us, keep reading.
If you don’t know what Kubernetes (also referred to as K8s) is and what it was designed for, you should take a look here first before continuing.
As many of you know, I’m a Cloud Solution Architect focused on app modernization. This means that most of my day-to-day work is about understanding customers’ business and technical outcomes related to the migration of one or more applications to the cloud, and then advising on best practices.
I have seen tons of Kubernetes projects succeed. I have also watched just as many fail. How can two companies of equivalent size and equivalent financial and technical capacity experience such different results when implementing a modernization approach on top of Kubernetes?
I went through a deep analysis myself, and the conclusions I gathered are detailed below. Hopefully, this article will give you a baseline understanding of the considerations your company will need to reflect on to succeed in adopting Kubernetes.
Reason 1: Kubernetes is complex, very complex
Very, very, very complex. That’s not a bad thing per se, but it does mean that the team working around this environment must know what they’re doing. The results of not knowing what you are doing can be catastrophic.
There are a few reasons people (including yours truly) claim Kubernetes is “complex”.
Abstraction of a cloud platform
Think of it this way: Kubernetes is (or at least, was designed to be) nothing but a big abstraction of cloud platforms for applications running on top of containers.
Management-wise, Kubernetes lets you handle every aspect that most cloud platforms also manage: network, storage, compute, security, third-party components (extensibility), and so on. In other words, you can think of Kubernetes as a management engine encompassing all those aspects.
Obviously, such a level of management requires granularity, and granularity usually incurs additional complexity.
You probably don’t want to take that path, because all major cloud vendors offer robust managed Kubernetes clusters that spare you from caring deeply about the infrastructure side of the house. Still, you could, literally, spin up a Kubernetes cluster on top of existing on-premises infrastructure your company might have in place and still take advantage of features like self-healing, auto-scaling, etc.
But doing so brings with it a huge management burden. Once you assume that responsibility, there are lots of aspects that require manual intervention, as we will discuss in detail later on.
Kubernetes is a technology designed to orchestrate containers, targeting high availability, scalability and performance on top of clustered host machines. This means we are talking about distributed computing and all of its underlying aspects (storage, networking, memory and everything else).
Following a common architecture for most clustering technologies, Kubernetes introduces the concept of “master” and “worker” nodes.
At a high level, master nodes play the critical role of managing and orchestrating all the operational aspects of a given cluster (they really act as the brain of the cluster), while worker nodes, on the other hand, receive the workloads sent by the master(s) and then complete the jobs. Of course, every operation is tracked (and synchronized) in near real-time between the two.
Figure 1 will give you a high-level idea of what a Kubernetes cluster comprises when it comes to hosts and their logical grouping.
I’m not getting into the details of the components and how the communication happens behind the scenes because that’s not the intent of this article and everything is very well explained in the Kubernetes documentation. However, it should give you a sense of the complexity that surfaces when you start dealing with a K8s cluster at the host level. Here are some of the hosting concerns you now have to think about:
- What happens if a master node dies?
- What happens if, for some reason, multiple worker nodes fail?
- If you run multiple master nodes for redundancy, how do you manage backups of the etcd databases associated with each master?
- How do you patch the hosts under that cluster?
- The list goes on and on…
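To put a number on the first and third questions: etcd, the data store behind the master nodes, replicates data with a majority-based consensus protocol, so a cluster keeps accepting changes only while more than half of its members are healthy. The sketch below is plain arithmetic rather than any Kubernetes or etcd API, and the function names are made up for illustration.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} member(s): quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Going from three to four members does not buy any extra failure tolerance, which is why odd cluster sizes such as three or five are the usual choice.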
Orchestration of Containers
Ok, now that we have (barely) scratched the surface of the complexity involved with managing the hosts under the cluster, we can quickly talk about the complexity of the container orchestration process.
Kubernetes’ components (yes, there are several of them and they are usually referred to as agents) provide a very effective way to run containers in a highly available and highly scalable manner by leveraging an approach based on a “desired state” configuration. In other words, Kubernetes will always try to match an object’s current state with its desired state (defined when it was deployed).
As an example, let’s imagine we want to get our container up and running in the cluster. You deploy it using the “deployment” controller. Your container will be assigned to a pod, which will then be scheduled to run on a given worker node. A Kubernetes component (called Kubelet) will periodically monitor the status of the pod and send that information back to another component on the master that is in charge of deciding what to do with it. If the current state doesn’t match the desired state, K8s will leverage a controller called ReplicaSet to redeploy the troubled pod and bring it back to the “running” state.
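The desired-state idea described above can be sketched as a small reconciliation loop. This is a toy, in-memory model and not the real Kubernetes API: `ToyCluster`, the `web-N` pod names and the `reconcile` method are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """In-memory stand-in for a cluster; not the real Kubernetes API."""
    desired_replicas: int
    running_pods: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self) -> list:
        """Compare current state with desired state and return the actions taken."""
        actions = []
        # Too few pods: start replacements until the desired count is reached.
        while len(self.running_pods) < self.desired_replicas:
            name = f"web-{next(self._ids)}"
            self.running_pods.add(name)
            actions.append(f"start {name}")
        # Too many pods: stop the surplus.
        while len(self.running_pods) > self.desired_replicas:
            name = sorted(self.running_pods)[-1]
            self.running_pods.remove(name)
            actions.append(f"stop {name}")
        return actions


if __name__ == "__main__":
    cluster = ToyCluster(desired_replicas=3)
    print(cluster.reconcile())             # first pass starts three pods
    cluster.running_pods.discard("web-2")  # simulate a crashed pod
    print(cluster.reconcile())             # second pass starts one replacement
```

The point of the sketch is that nobody tells the cluster "restart web-2"; the loop only ever compares what exists with what was declared and closes the gap.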
That’s great, isn’t it? But to be able to do so, as you might imagine, Kubernetes relies on various components spread across both master and worker nodes, as you can see in Figure 2.
But what happens when one of these “agents” fails? Would you (or somebody on your team) feel comfortable troubleshooting those elements?
Besides, Kubernetes introduces a bunch of new concepts. These concepts change the traditional way of operating IT infrastructure quite a bit. Let’s get into just a few of them.
- Distributed Computing. To take full advantage of Kubernetes’ features and benefits, applications need to be (re)designed to work properly in a distributed fashion, meaning their parts need to be independent and sit in different “physical” locations within the cluster. You have to be able to deal with the implications of that (we will discuss some of them later on).
- Impact on application design. Application-wise, the most direct implication of moving to Kubernetes is a change in design. Concepts like API-first, stateless, microservices, bounded contexts, decoupling and more are very likely to be needed. Can you deploy a monolith in Kubernetes? Yes, technically speaking you could. But one of the biggest benefits of K8s would then be lost: the agility of scaling services independently. Could you host your stateful app in Kubernetes? Yes, you could. But again, if it wasn’t designed for it, it is not recommended.
- Big dev teams required. One of the great benefits of the microservices approach is that companies can scale product development across different teams in parallel and (supposedly) deliver value to the business more quickly. This suggests that microservices projects are suited to big development organizations, meaning multiple teams within the company. Well, if you are doing microservices with one single team… chances are you will find yourself disappointed.
- Communication. As part of its new concepts, Kubernetes introduces a new way for the apps running in pods within the cluster to communicate: Services. A Service is a network abstraction that allows pods to communicate with one another and with the outside world. But because Services present some limitations, especially when exposing pods to the outside, the concept of Ingress Controllers was added and has been gaining popularity. You’ll have to live with that too. Not to mention IP address assignment for pods (DHCP-like functionality), which is not built into Kubernetes; you have to rely on open-source projects like Calico, for instance, to get it done. It’s just a lot to take on. So now, your pod has an IP. Then, the service has its own IP. Then the node where your pod is running has its IP too. How does everything come together?
- Security. When you think about security in a traditional VM-based environment, things like WAFs and firewalls immediately come to mind. But you’re no longer running on top of VMs. You run on top of containers that live in a pod on top of Kubernetes’ own network abstraction. How does that translate within Kubernetes? How do you restrict access from one pod to another? How do you create restriction rules for inbound and outbound traffic? What changes if you have an ingress in place? How do you handle secrets? So many things to think about…
- Further concepts. There’s a lot more to take on, believe me. Namespaces, Labels, Volumes, Policies, Batch, Scaling, Service Discovery, Rollout and Rollback, Contexts, Resource Limits, DaemonSet, ReplicaSet, StatefulSet, Taints, the list goes on. The thing is, Kubernetes can get as complex as you want it to be.
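To make the “pod IP, service IP, node IP” question from the Communication bullet a little more concrete, here is a minimal sketch of how a Service finds its pods through labels. It is a toy model, not the Kubernetes API: the `Pod` and `Service` classes, the `endpoints` helper and all IP addresses are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str        # pod IP, assigned by the cluster network
    node_ip: str   # IP of the worker node hosting the pod
    labels: dict   # e.g. {"app": "web"}


@dataclass
class Service:
    name: str
    cluster_ip: str  # stable virtual IP that clients talk to
    selector: dict   # labels a pod must carry to receive traffic


def endpoints(service: Service, pods: list) -> list:
    """Return the IPs of pods whose labels satisfy every selector entry."""
    return [
        pod.ip
        for pod in pods
        if all(pod.labels.get(key) == value
               for key, value in service.selector.items())
    ]


if __name__ == "__main__":
    pods = [
        Pod("web-1", "10.244.1.5", "192.168.1.11", {"app": "web"}),
        Pod("web-2", "10.244.2.7", "192.168.1.12", {"app": "web"}),
        Pod("db-1", "10.244.2.9", "192.168.1.12", {"app": "db"}),
    ]
    web = Service("web", "10.96.0.20", {"app": "web"})
    print(f"{web.cluster_ip} -> {endpoints(web, pods)}")
```

Clients only ever see the service IP; the pod IPs behind it can change every time a pod is rescheduled, and the node IPs matter mostly to whoever operates the hosts.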
Reason 2: Level of maturity with microservices still low
I’ve touched on this briefly already, but let me be clear here: Kubernetes was designed (you can take a look here to understand the motivations behind its creation) to orchestrate microservices with horizontal scaling (rather than relying on big machines to do so). That fact automatically imposes an important group of “requirements,” if you will, on every company actively considering the adoption of Kubernetes as foundational infrastructure for applications.
- Stateless. When thinking about microservices, ideally, you design for stateless (applications which don’t preserve data in server memory). We do this for different reasons. For example, because each request is an isolated “entity” with a transient lifecycle, horizontal autoscaling of the underlying infrastructure (both containers and virtual servers) serving those requests becomes quicker and more effective. I could also point out the simplification of CI/CD and auto-healing, among others. The fact is, you should design for stateless rather than stateful (despite the fact that Kubernetes does support stateful services nowadays). That’s for sure.
- Communication. How does the architecture/development team design the communication between microservices? Is it direct (container-to-container)? What happens if, all of a sudden, communication breaks down? Is there a messaging technology in between? Is there a policy in place that restricts communication from one group of services to another?
- Database access. One of the fundamental principles of microservices-based applications is that every microservice talks to its own database, and only to its own database. There is no sharing of databases among microservices whatsoever. This means that a system with 100 microservices is going to rely on 100 different databases. Would you say your company has a solid plan for how to manage that complex set of databases? Will they run in the same cluster as the microservices or outside of it? If they are going to be external, have you thought about controlling access to them? How would you manage secrets, connection strings, and such? Will those databases run on the same cloud infrastructure as the microservices, or will they be spread across clouds? Again, a lot to digest.
- API Management. Most microservices-based applications nowadays are structured on top of RESTful APIs (HTTP-based), so having a strategy in place to handle HTTP requests becomes key. How would that work for your scenario in a Kubernetes cluster? Another piece of the cake.
I’ll stop there, having featured only the most critical aspects of microservices projects; otherwise, this article would become a deep dive into microservices, which is not its intent. Believe me, there is even more to think about here.
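The Stateless point is easier to see with two replicas of the same service answering alternating requests. The sketch below is a simplified illustration under assumptions: the class names are invented, and a plain dictionary stands in for whatever external database or cache the service would really use.

```python
class StatefulReplica:
    """Keeps visit counts in its own memory, so each replica sees a different total."""

    def __init__(self):
        self._visits = {}

    def handle(self, user: str) -> int:
        self._visits[user] = self._visits.get(user, 0) + 1
        return self._visits[user]


class StatelessReplica:
    """Holds no data itself; state lives in a store shared by all replicas."""

    def __init__(self, store: dict):
        self._store = store  # stand-in for an external database or cache

    def handle(self, user: str) -> int:
        self._store[user] = self._store.get(user, 0) + 1
        return self._store[user]


if __name__ == "__main__":
    # Two stateful replicas behind a load balancer disagree about the count.
    a, b = StatefulReplica(), StatefulReplica()
    print([a.handle("ana"), b.handle("ana")])

    # Two stateless replicas give a consistent answer whichever one is hit.
    shared = {}
    c, d = StatelessReplica(shared), StatelessReplica(shared)
    print([c.handle("ana"), d.handle("ana")])
```

With the stateless version, killing or adding a replica changes nothing for the caller, which is what makes horizontal scaling and self-healing cheap.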
Reason 3: DevOps reality still far from the ideal
I just mentioned that Kubernetes and microservices are deeply related, right? As part of the daily practice of a microservices project, you (your team) will be required to deploy dozens, hundreds, even thousands of microservices every day in a given Kubernetes cluster.
In fact, if there is one problem that microservices solve really well, it’s deployment. Especially if you adopt microservices for the right reasons. But again, this is not the focus of this article, so I’ll stop there.
I don’t need to drill in any further for you to understand that there is only one way to accomplish this, and that way is called DevOps and everything related to it. Some questions to consider here (please, look for the answers) would be:
- What’s your process for packaging, building and publishing microservices to the image repositories?
- What’s your process for testing out and validating images being generated by your Continuous Integration (CI) pipelines (assuming that there are some in place already) before they get put in production?
- Does your image repository support all the security standards needed to communicate safely with your Kubernetes cluster?
- Everything in Kubernetes is deployed as code (YAML files that are converted to JSON for cluster persistence and internal operation once they reach the Kubernetes agents). Have your DevOps engineers mastered Kubernetes concepts, configurations and objects well enough to write pipelines that deploy applications consistently and make the most of the technology’s features?
- What about security? Is there a DevSecOps approach in place? How does it relate to Kubernetes concepts and operation model?
- How will your team manage to have blue/green deployments in Kubernetes?
- What about canary releases? Remember, Kubernetes exists to provide a flexible, reliable and highly scalable way to host microservices. How are you planning to roll out new deployments in production without impacting running environments?
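As a small illustration of “everything is deployed as code”, the sketch below builds a minimal Deployment manifest as a plain dictionary, runs one pipeline-style sanity check on it, and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The helper names, the `orders` service and the image reference are hypothetical; only the manifest field names follow the `apps/v1` Deployment schema.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a minimal apps/v1 Deployment as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def selector_matches_template(manifest: dict) -> bool:
    """Sanity check: the selector must match the labels on the pod template."""
    spec = manifest["spec"]
    wanted = spec["selector"]["matchLabels"]
    actual = spec["template"]["metadata"]["labels"]
    return all(actual.get(key) == value for key, value in wanted.items())


if __name__ == "__main__":
    # Hypothetical service name and image reference, for illustration only.
    manifest = deployment_manifest("orders", "registry.example.com/orders:1.4.2", 3)
    if not selector_matches_template(manifest):
        raise SystemExit("selector does not match pod template labels")
    print(json.dumps(manifest, indent=2))
```

A real pipeline would run many more checks than this one, but the idea is the same: the manifest is data that can be generated, reviewed, tested and versioned like any other code.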
Reason 4: Low level of knowledge in Linux
Kubernetes is a Linux-based technology that orchestrates containers, which are also Linux-based technologies for the most part. This means that you and the team operating that cluster’s environment have to be fluent in Linux. Some aspects of it that are critical:
- Basic administrative commands. When working in a Kubernetes environment, you oftentimes need to reach for tools on top of the Linux OS to perform routine tasks. Some examples: double-checking the content of a file, editing its content, moving a file from one directory to another, changing its permissions, or verifying that the directory you are in is the correct one, among plenty of other things. Things like grep (the list goes on and on).
- Files and directories. The standard GNU/Linux operating system has a file management hierarchy that organizes the data stored on the computer. Identifying where specific data is stored is one of the first things you need to know to understand how the OS is organized, and the same goes for your container’s and Kubernetes’ files.
- Differences between Absolute and Relative Path. Would you say that you have the ability to successfully identify the differences between an absolute and relative path for a file or directory when you see one? This is going to be key to successfully work with containers and apps in Linux.
- Terminal editors. While learning by doing, your Linux-based study often requires you to make changes in order to create, update, and/or remove information from configuration files. Each of these actions can be achieved through the use of text editors made available on most Linux operating systems. They include: vi/vim, emacs, nano, among others.
- Understanding and working with files and permissions. One of the important aspects of working with Linux is the ability to easily change file and directory permissions on the fly. Linux offers a fine-grained way to do so via the chmod command, but you should be familiar with the different levels of permissions beforehand, as the work with Kubernetes will often require you to make a given script “executable” or a given object writeable, and so forth.
- The list goes on, but you get where we’re going with this, right?!
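Two of the points above, absolute versus relative paths and file permissions, can be checked with a few lines of code. This is a minimal sketch using only the Python standard library; the helper names and example paths are assumptions for illustration, and the permission part expects a Linux (POSIX) system.

```python
import stat
import tempfile
from pathlib import Path, PurePosixPath


def describe(path_text: str) -> str:
    """Absolute paths start at the root (/); relative ones depend on the current directory."""
    return "absolute" if PurePosixPath(path_text).is_absolute() else "relative"


def make_executable(path: Path) -> str:
    """Add the owner's execute bit (like `chmod u+x`) and return the new mode string."""
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR)
    return stat.filemode(path.stat().st_mode)


if __name__ == "__main__":
    # Example paths only; they do not need to exist for this check.
    for text in ("/etc/kubernetes/admin.conf", "manifests/deploy.yaml", "./run.sh"):
        print(f"{text}: {describe(text)}")

    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "run.sh"
        script.write_text("#!/bin/sh\necho hello\n")
        script.chmod(0o644)                       # start from rw-r--r--
        print("before:", stat.filemode(script.stat().st_mode))
        print("after: ", make_executable(script))
```

The mode string printed at the end uses the same notation as `ls -l`, so it is a handy way to get used to reading owner, group and other permissions.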
Reason 5: Does your team understand Helm?
Here is the thing: Helm is not a built-in construct within Kubernetes so you can do Kubernetes without Helm. But, experience shows that it is almost impossible to work well with Kubernetes without using Helm.
Why? Helm is a package manager for Kubernetes. It is the K8s equivalent of yum or apt. Helm deploys charts, which you can think of as packaged applications. A chart is a collection of all your versioned, pre-configured application resources, which can be deployed as one single unit. You can then deploy another version of the chart with a different set of configuration values, and the two can live side by side in the same cluster (please, see Figure 3 to understand how Helm interacts with a given Kubernetes cluster).
For those experimenting with landing zones from the Cloud Adoption Framework (CAF) and blueprints designed by all major cloud providers, a Helm chart is something, at least conceptually, fairly close to them. It allows you, for instance, to create a Master-Replica ecosystem for your MySQL databases running within the Kubernetes cluster as one single deployment unit.
Deployment-wise, it is hard to think of doing something really serious in Kubernetes nowadays without having Helm in place. So, if the team working on the Kubernetes environment is new to Helm, I would strongly recommend holding off a bit before starting to use it in production.
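The “one chart, different configurations” idea can be sketched in a few lines. To be clear about the assumptions: this is not how Helm is implemented (Helm renders Go templates from a chart’s templates and values files); the template text, default values and `render` function below are invented stand-ins that only mimic the concept.

```python
from string import Template

# Hypothetical, heavily simplified "chart": one template plus default values.
CHART_TEMPLATE = Template(
    "name: ${release}-mysql\n"
    "image: mysql:${version}\n"
    "replicas: ${replicas}\n"
)
DEFAULT_VALUES = {"version": "8.0", "replicas": 1}


def render(release: str, overrides: dict = None) -> str:
    """Merge default values with per-release overrides and fill in the template."""
    values = {**DEFAULT_VALUES, **(overrides or {}), "release": release}
    return CHART_TEMPLATE.substitute(values)


if __name__ == "__main__":
    # Two releases of the same "chart" with different configuration.
    print(render("staging"))
    print(render("prod", {"replicas": 3}))
```

Both rendered outputs come from the same packaged template, which is what lets two differently configured installations of one application coexist.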
Reason 6: Maturity with containers still low
Would you say your company has a good practice in place for containerizing applications? Meaning, does everybody on the team understand what it means to build an effective Dockerfile, designing an app for the “minimum” rather than for the “comfortable”? Do people know what a running container actually is? In case of a failure, are they able to troubleshoot at the container level?
These are only some of the questions you might ask yourself when it comes to containers. But there are also other aspects. For instance, Docker is the default container engine supported by Kubernetes as of today so your application probably runs on top of it, right? But, what happens if Docker leaves Kubernetes (say for some commercial reason in the future)?
This is not just wild conjecture. Starting with Kubernetes 1.20, Docker is officially deprecated as a container runtime. You can still run your Docker containers on top of Kubernetes and will probably be able to continue to do so at least until late 2021. But it will be gone at some point, and you need to be prepared for that change.
Specifically on that point, some of you might know that CRI-O and containerd (actually, containerd is the preferred one) are being considered as the next default engines within Kubernetes to replace Docker. There is a good article here explaining why the community behind Kubernetes is making that decision. In fact, cloud providers like Microsoft have already defined containerd as the official container engine in AKS.
The point here is that concerns with containerization in Kubernetes go far beyond just containerizing apps. Because Kubernetes is such a rapidly evolving technology, you will need to think about a robust containerization strategy that supports potential future changes, whatever they might be.
There are definitely other aspects and concepts that you should consider at the company level before getting started with Kubernetes, but I’ll leave those for another post. The ones outlined here are the most critical and are, in and of themselves, quite a bit to digest.
Something I have learned by going down this path for years now with customers is that Kubernetes does push a cultural change within both Development and Operations teams. Operationally speaking, it usually breaks down traditional ways of “how” we deliver code, design applications, as well as how we go about skilling ourselves and our teams on the various technologies involved.
If, after reflecting on the aspects I raised above, you think your company and internal teams are on track, that’s awesome! Chances are your Kubernetes initiative will succeed. Otherwise, holding off on Kubernetes for a while could end up being a wise decision.
Hope this helps!