Software applications used to be designed and deployed as monolithic applications, where a single instance performed all the business functions.
Most of these applications were for internal enterprise use: built for a single organization, and deployed and maintained on-premises.
These kinds of applications had (and continue to have) a fixed, predetermined set of parameters. A moderate amount of downtime was acceptable.
But today, software applications and digital products provide business-critical and complex services. Downtime is unacceptable, and the parameters, functions, and features are rarely (if ever) constant. Add to this the requirement of being available from everywhere: monolithic application structures just don't cut it any more. Workload, requirements, and rules of operations will change, and fast.
Kubernetes (and similar offerings out there) help mitigate many of these challenges. We’ll dive deeper into that here.
The size of the applications and the requirement of round-the-clock availability made on-premises hosting and maintenance extremely challenging. On-demand cloud computing solved that problem by providing the resources needed to run applications as a service. Zemoso continues to be a cloud-service-provider-agnostic product studio, though depending on the kind of product you're building, we might have strong preferences.
Cloud computing took care of the physical hardware problem; agile product development and deployment remained a challenge:
• Long build times. Extremely large code bases resulted in long build times, and therefore long downtime windows during deployment. But high availability is a critical requirement, and downtime is unacceptable.
• Slow development. Developing a large code base is slow. Since the entire code base is encompassed in a single instance, broken code from a single developer could block the work of the rest of the team. Agile software development advocated evolutionary development, early delivery, and continual improvement, but the high cost of deployment discouraged developers from embracing it.
• Poor resilience to failures. One error in performing a single business function could bring down the entire application since a single code base handles everything.
• Low resource utilization. When the workload of a particular business function suddenly increases, the entire monolithic application has to be scaled. This results in sub-optimal consumption of computing resources; in the cloud, redundant resource usage costs money.
The microservices style of architecture helped mitigate the above problems to a great extent.
In this style, the application is designed as a collection of loosely coupled services (smaller applications). Each service is responsible for a self-contained business function. These services communicate with each other whenever necessary and work together to meet all requirements.
Decomposing an application into different smaller services makes it easier to understand, develop, and test.
• Only a small service has to be rebuilt when its code is updated, reducing deployment time and enabling continuous delivery and deployment.
• The decomposition also makes applications more resilient to failures, as downtime is confined to the failing services.
• Workload fluctuations could be handled much more efficiently with optimum utilization of cloud resources.
Like any good thing, microservices architecture brought challenges of its own:
• Redundant usage of cloud resources. Microservices architecture has many moving parts and it isn't feasible to manually control the usage of all the cloud resources. Developers needed a solution that controlled utilization at a granular level.
• The nightmare of deployment operations. Microservices split a single application into many smaller ones. Communication between the services needs to be efficient and stable, and the host OS must have the required dependencies installed before deployment.
• The system now has multiple smaller applications instead of one. Monitoring the overall application is much more complex.
• This complexity grows with the number of services, and so does the task of executing deployment operations. A robust approach was needed to handle this deployment complexity.
The first key innovation that became an integral part of the solution was containerization: the practice of deploying applications by packaging them, along with their dependencies, into containers.
A container is a form of OS-level virtualization, in which the kernel allows multiple isolated user-space instances to exist side by side. Simply put, a container is a fully functional virtual computer running inside another computer (the host). Each container is isolated from other containers, and from the host.
Containers have their own file systems. They can’t see each other's processes, and their computational resource usage can be bounded. Docker, containerd, and rkt are examples of container systems.
• Decoupling of application and infrastructure: Applications directly deployed onto the host machine entangled the executables, configuration, libraries, and life cycles with each other and the host OS. Packaging each application and its dependencies into a self-sufficient isolated container decouples the application from the host OS and infrastructure. It allows deployment on any host, without worrying if the host has all the dependencies installed.
• Isolation of resources: Another important advantage of containerization is the isolation of resources. When all the applications are directly installed on the host, a fatal error in one application can bring down or corrupt other applications, or even crash the entire host in some cases. If applications are deployed in containers and multiple containers are running on a host, a fatal error inside one container won’t affect other containers or the host.
Containerization of applications adds great value and simplifies the process of developing, deploying, and maintaining distributed applications.
A container orchestration system allows the user to effectively manage the deployment of containerized applications.
Here’s a recap of some of our key requirements:
• Applications handling large-scale, complex operations must have minimal downtime.
• Applications should be resilient to large fluctuations in workload.
• Cloud resources must be utilized optimally; failing to do so results in incremental costs.
• Deployment operations must be performed reliably, to support evolutionary development.
Many teams eventually realized that for containerization to be truly effective, the next evolution was a container orchestration system that addressed all of these problems together: one that could make the system resilient and efficient while completely abstracting the complexity of deployment operations from the user.
Google's engineering team had worked on multiple internal projects to solve exactly these deployment problems, designing systems to automate the deployment, scaling, and management of containerized applications. Drawing on the rare, hard-won insights from running containers at scale, they later reshaped the concepts behind those internal projects to work with open-source technologies, and in 2014 Google open-sourced the result as Kubernetes.
What's Kubernetes?
According to official documentation:
Kubernetes (K8s) is an open-source system for automating the deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.
In our day-to-day work developing and deploying digital products for enterprises and startups, these capabilities help us create even more agile products:
• Kubernetes can run containerized applications of any scale without any downtime.
• It can self-heal containerized applications, making them resilient to unexpected failures.
• It can auto-scale containerized applications as per the workload, and ensure optimal utilization of cloud resources.
• Kubernetes greatly simplifies deployment operations, allowing them to be performed reliably with a couple of commands, as the sketch below shows.
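To make that concrete, here's a minimal sketch of a Kubernetes deployment. The names (webapp, webapp-service) and the image reference are hypothetical placeholders; the manifest structure itself is standard Kubernetes.

```yaml
# deployment.yaml — a hypothetical web application declared as a
# Deployment (three self-healing replicas) plus a Service that gives
# the replicas a stable, discoverable, load-balanced network identity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3                 # Kubernetes keeps three healthy copies running
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: example.com/webapp:1.0   # placeholder image
          ports:
            - containerPort: 8080
          resources:
            limits:           # bound the container's resource usage
              cpu: "500m"
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: webapp-service
spec:
  selector:
    app: webapp               # route traffic to the Deployment's pods
  ports:
    - port: 80
      targetPort: 8080
# Deploying and rolling back take just a couple of commands:
#   kubectl apply -f deployment.yaml
#   kubectl rollout undo deployment/webapp
```

If a replica crashes, the Deployment controller starts a new one automatically, and the Service load-balances across whichever replicas are currently healthy.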
The Cloud Native Computing Foundation (CNCF) — an open-source software foundation dedicated to making cloud-native computing universal and sustainable — now maintains Kubernetes.
Kubernetes orchestrates computing, networking, and storage infrastructure for the user. It eliminates the need for direct manual orchestration in a cluster and automates the orchestration process so that applications are highly available and compute resources are optimally utilized.
• Service discovery and load balancing: Kubernetes ensures efficient and reliable communication between the microservices within an application.
• Horizontal scaling: Kubernetes autoscales to absorb abrupt surges in workload, scaling up the replicas of a service during a surge and scaling them back down afterward to manage costs.
• Self-healing: Kubernetes automatically starts a new, healthy replica of a service when one goes down due to an error.
• Automated rollouts and rollbacks: Kubernetes takes care of deployment operations like rolling out new versions and rolling them back.
• Secret and configuration management: Kubernetes provides built-in mechanisms to store and manage configuration (like environment variables and database connections) across different environments (e.g., production, test, development). It also minimizes accidental leaks of sensitive configuration data.
• Storage orchestration: Kubernetes manages the storage an application requires. Storage management is separated into steps: storage is first provisioned, then a claim is made whenever an application in the cluster needs it (see the sketch after this list). Kubernetes integrates well with the storage solutions supported by cloud providers.
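As an illustration of that provision-then-claim flow, here's a minimal sketch, assuming a cluster with a default StorageClass and dynamic provisioning; the names (data-claim, db) and the image are hypothetical:

```yaml
# pvc.yaml — step 1: claim storage. With dynamic provisioning, the
# cluster allocates a matching volume from its storage backend.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
# step 2: a Pod mounts the claim, without knowing or caring which
# cloud disk or network share actually backs it.
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: db
      image: example.com/db:1.0   # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data        # the claimed volume appears here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim
```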
Kubernetes offers a large set of features, but several supporting functions fall outside its core. It integrates easily with external tools that provide those functions, helping ensure the product or application works as intended:
• Kubernetes doesn’t deploy source code or build the application. Continuous integration, delivery, and deployment (CI/CD) workflows aren't a core feature. Automation tools like Jenkins integrate with Kubernetes for such capabilities.
• Kubernetes doesn't provide application-level services, such as message buses, data processing frameworks, databases, caches, or storage systems. These components can run on Kubernetes, or can be accessed by applications running on Kubernetes via portable mechanisms, such as the Open Service Broker.
• Kubernetes doesn't dictate logging, monitoring, or alerting solutions. It ships with some integrations as proofs of concept, and external tools best suited to your particular use case are recommended. Fluentd is a good choice for logging, and Prometheus is a popular choice for monitoring and alerting. Helm and Envoy are also popular projects that work well with Kubernetes and simplify the workflow. Find more here: https://www.cncf.io/projects/.
Kubernetes has added great value to our product build and deployment efforts through its strong interoperability with external tools.
Once configured, Kubernetes greatly reduces the manual workload of deployment operations. It prioritizes flexibility and configurability over ease of use.
Configuring Kubernetes for a large production environment is a complex task. Correctly configuring and maintaining a production system requires nuanced expertise, and the learning curve is steep.
Many teams that tried to switch to Kubernetes abruptly, without proper training, faced difficulties and often switched back to their old solutions. If you are new to Kubernetes, account for time to learn the fundamentals before you embark on a digital transformation journey.
As a project grows, so does the amount of configuration required, and in large projects, maintaining and updating Kubernetes configuration can be challenging. Kustomize, a Kubernetes-native configuration management tool (now built into kubectl), addresses this, and many teams use external tools like Helm to manage Kubernetes configuration.
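For illustration, here's a minimal Kustomize sketch showing how environment-specific changes are layered over a shared base; the directory layout and the webapp names are hypothetical:

```yaml
# base/kustomization.yaml — the configuration shared by every environment
resources:
  - deployment.yaml
  - service.yaml
---
# overlays/production/kustomization.yaml — production-specific changes,
# layered over the base without copying or forking it
resources:
  - ../../base
replicas:
  - name: webapp        # bump the base Deployment to ten replicas in prod
    count: 10
images:
  - name: example.com/webapp
    newTag: "1.2.0"     # pin the production image tag
# Apply an overlay with kubectl's built-in Kustomize support:
#   kubectl apply -k overlays/production
```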
Kubernetes is very active. New releases are made frequently. Although rare, new changes may not be backward compatible and could cause production outages. Thus, a testing mechanism for detecting application-breaking changes after updates is essential.
Here are examples of other projects that tried to solve the same problems as Kubernetes.
The first, Netflix's microservices libraries (and Spring Cloud Netflix), didn't directly tackle the problem of container orchestration. However, there's enough overlap to merit scrutiny and comparison to Kubernetes.
A few years before the release of Kubernetes by Google, Netflix released a set of libraries that handled some of the challenges developers faced with microservices architecture.
These libraries were built from the experience and insights the Netflix engineering team gained while scaling their infrastructure on Amazon Web Services (AWS). The team behind the Java-based Spring Framework then built Spring Cloud Netflix on top of them, which simplified the process of integrating the Netflix libraries with Spring applications.
These releases quickly gained traction and were adopted by many teams struggling to get their microservices architecture right. They provided capabilities like service discovery (Eureka), routing (Zuul), and fault tolerance (Hystrix). But using these libraries had some drawbacks.
Most of them were written primarily in Java, which made them difficult to use from applications implemented in other languages and frameworks. The application code also had to incorporate the logic necessary to communicate with these libraries and services.
Kubernetes doesn't restrict the choice of language or framework. Applications deployed on Kubernetes are containerized and completely unaware of the deployment infrastructure. Kubernetes handles service discovery, routing, health monitoring, and more, allowing applications to focus on core business logic.
Spring Cloud Netflix sees little development activity today compared to Kubernetes, which is very active.
Docker Swarm is the cluster management and container orchestration solution that comes with Docker Engine.
Released in 2015, Docker Swarm is native to the Docker environment and doesn't require any additional software. It supports diverse functionality: service discovery, load balancing, rolling updates, scaling, and state reconciliation are all available in Docker Swarm.
Over the years, valuable features pioneered in Kubernetes have also been added to Docker Swarm. Docker Swarm is very simple to use and doesn’t require an extended learning period.
Docker Swarm sacrifices configurability and flexibility in favor of simplicity and ease of use. The large ecosystem of specialized external solutions doesn’t integrate as easily with Swarm, and few cloud providers support integrations with Swarm, unlike Kubernetes.
Swarm can also lag in making the most valuable features available; Kubernetes is considerably more active in rolling out new capabilities.
Docker Swarm can run only Docker containers, whereas Kubernetes supports numerous container systems, including Docker. And unlike Kubernetes, which is a fully community-driven project, Swarm is maintained by Docker, Inc.
If your team already uses Docker, is very comfortable with the Docker CLI, and isn't interested in integrations with external tools or cloud providers, you could consider Docker Swarm. Otherwise, choose Kubernetes.
Marathon is a container orchestration framework that runs on top of the Distributed Cloud Operating System (DC/OS), an open-source distributed operating system based on the Apache Mesos distributed-systems kernel.
Marathon can't run without Mesos or DC/OS underneath it, so only projects on that stack can use Marathon. Kubernetes, on the other hand, can run on DC/OS along with numerous other platforms and operating systems.
Marathon offers features like service discovery, load balancing, health checks, a CLI, and a GUI. But Kubernetes offers more mature versions of these features, along with many others that aren’t part of Marathon.
Marathon is native to DC/OS and takes advantage of that, performing operations there efficiently and effectively, while Kubernetes has better integration with external services and cloud providers. Use Marathon if DC/OS is essential to your project; otherwise, explore Kubernetes for the flexibility it offers.
The most important factor in which Kubernetes outperforms all its alternatives is development activity, which is a reliable measure of a project's longevity and the support behind it.
The fact that Kubernetes has such a strong developer community behind it is often the reason teams choose it for their production environments.
To summarize, anyone designing large-scale products that must remain highly available and fault-tolerant, while optimizing the consumption of computing resources, should consider Kubernetes for container orchestration. It'll be crucial to have an engineering team with the expertise to configure, maintain, and update Kubernetes.
Kubernetes enjoys the largest development activity and supports a large ecosystem of tools and services.
An earlier version of this blog was published on Medium by the author.