We are all familiar with the classic "It works on my machine" problem: a program runs on our system but fails on a fellow developer's system. This happens because of differences between environments. For example, our program may depend on a particular version of a library while the other developer has a different version of that library installed. One solution to this problem is the "container", which has become quite popular.
Whenever we hear the term "container", we tend to connect it to Docker or Kubernetes. But container technology has been around for quite a long time and is not tied to Docker or Kubernetes alone.
What is Containerization?
Containerization is the process of bundling an application together with everything it needs at run time (its libraries and other dependencies, configuration, environment settings, and so on) into a single unit called a "container". The most popular containerization ecosystem is Docker. There are also other container tools, such as rkt (originally named Rocket).
Containerization vs Virtualization
People often confuse containerization with virtualization and think of a container as some sort of lightweight VM. We can't entirely blame them, because this is how Docker marketed its container technology in its early days: “unlike a VM that starts in minutes, Docker containers start in about 50 milliseconds”. The two have been compared many times, so let’s try to understand the difference better.
If we want to run multiple applications or systems in isolation on a single server, we need either containers or virtual machines.
Virtualization Architecture
Let’s understand how virtual machine architecture works.
-
Infrastructure: First, we have a layer of infrastructure, which is simply your laptop, desktop, a server in your company’s data center, or a virtual server in the cloud (IaaS). It has a CPU, RAM, storage, and all the other resources that a typical system consists of.
-
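Host Operating System: On top of that infrastructure, we need an OS in order to use it. In VM architecture, this is known as the host OS. (Bare-metal, or type 1, hypervisors run directly on the hardware without a separate host OS; the stack described here is the hosted, or type 2, arrangement.)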
-
Hypervisor: This is software that creates and runs virtual machines. We can think of a VM as a self-contained computer packaged into a single file, but something has to run those files, and this is where the hypervisor comes in.
-
Guest OS: Suppose we want to set up 3 VMs in order to run 3 different applications in isolation. For that, we need to spin up 3 guest operating systems on top of our hypervisor. If each guest OS takes around 1 GB, then 3 GB of disk space is used up by the guest OSs alone. Each guest OS also consumes CPU and memory just to keep itself running, so resources are spent on overhead rather than on the applications.
-
Binaries/Libraries: Our application might have a few dependencies in the form of binaries and libraries. These must be installed in the guest machine for the application to run properly. Since each application can have different dependencies, each guest machine carries its own copy, which occupies further resources.
-
Application: Finally, our application runs on top, in isolation.
The machine on which the hypervisor runs is known as the host machine, and each VM is known as a guest machine.
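The guest OS arithmetic above can be written as a tiny function. This is only a sketch of the article's own example; the function name `guest_os_disk_overhead_gb` is made up for illustration, and the 1 GB figure is the rough estimate used above, not a measurement.

```python
def guest_os_disk_overhead_gb(num_vms: int, guest_os_size_gb: float) -> float:
    """Disk space used by the guest operating systems alone, in GB."""
    if num_vms < 0 or guest_os_size_gb < 0:
        raise ValueError("VM count and OS size must be non-negative")
    # Every VM carries its own full copy of a guest OS.
    return num_vms * guest_os_size_gb


if __name__ == "__main__":
    # The example from the text: 3 VMs, each with a guest OS of about 1 GB.
    print(guest_os_disk_overhead_gb(3, 1.0))  # 3.0
```

The overhead grows linearly with the number of VMs, before any application or library has been installed.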
Containerization Architecture
Now let's understand the architecture of containerization:
-
Infrastructure: As in VM architecture, we first have a layer of infrastructure, which is simply your laptop, desktop, a server in your company’s data center, or a virtual server in the cloud (IaaS). It has a CPU, RAM, storage, and all the other resources that a typical system consists of.
-
Operating System: On top of that infrastructure, we need an OS in order to use it.
-
Containerization Engine (in the case of Docker, the Docker Engine): Instead of a hypervisor, we have a containerization engine. This is software that helps create, run, and manage containers. It keeps containers isolated from one another and from the host OS, and it distributes resources among them.
-
Binaries/Libraries: Unlike VMs, containers don’t need a guest OS. So the dependencies required by our applications, in the form of binaries and libraries, sit directly inside our containers.
-
Application: Finally, our application runs on top, inside its own container (a Docker container, for example), in isolation.
Virtualization is basically running multiple machines on a single machine in isolation, whereas containerization consists of running multiple applications on a single machine in isolation.
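To make the difference between the two stacks concrete, here is a small sketch that models each architecture as an ordered list of the layers described above and works out which layers get repeated for every VM or container. The names `VM_STACK`, `CONTAINER_STACK`, and `per_instance_layers` are made up for illustration; this is a way to picture the layering, not how any hypervisor or container engine is implemented.

```python
# Layers listed from bottom (hardware) to top (application), as in the text.
VM_STACK = [
    "Infrastructure",
    "Host OS",
    "Hypervisor",
    "Guest OS",
    "Binaries/Libraries",
    "Application",
]
CONTAINER_STACK = [
    "Infrastructure",
    "Operating System",
    "Containerization Engine",
    "Binaries/Libraries",
    "Application",
]


def per_instance_layers(stack: list, isolation_layer: str) -> list:
    """Return the layers above the isolation layer.

    Everything above the hypervisor (or container engine) is duplicated
    for each VM (or container); everything at or below it is shared.
    """
    return stack[stack.index(isolation_layer) + 1:]


if __name__ == "__main__":
    print(per_instance_layers(VM_STACK, "Hypervisor"))
    # ['Guest OS', 'Binaries/Libraries', 'Application']
    print(per_instance_layers(CONTAINER_STACK, "Containerization Engine"))
    # ['Binaries/Libraries', 'Application']
```

The only difference between the two results is the guest OS, which is exactly the layer that containers avoid duplicating.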
Comparison between Containerization and Virtualization
-
Speed: Containers are lightweight in comparison to VMs and start almost immediately. Each VM has its own guest OS, which must boot before the application can run, so VMs take longer to start. In the words of Docker’s early marketing: “Unlike a VM that starts in minutes, Docker containers start in about 50 milliseconds”.
-
Level of abstraction: Containers are an abstraction of the operating system layer. Each container is basically an application that shares the OS kernel with other containers while existing in isolation (each one behaves as if it were the only one using the OS, but that’s not the case). VMs, on the other hand, are an abstraction of the hardware layer. Each VM represents a single physical machine, running on a server and sharing its infrastructure with other VMs (each one behaves as if it were the only one using the hardware resources, but that’s not the case).
-
Size: Container images are small, often measured in MBs, whereas VM images are generally large, often several GBs.
-
Resources: Containers consume fewer resources than VMs because they don’t need a separate OS for each container, and an OS by itself consumes a lot of disk space and other resources.
-
Security: VMs win on security and are generally more secure than containers. VMs are strongly isolated from each other and each has its own OS, so if a single VM is compromised, the others are usually not affected. Each VM can also use its own set of security protocols. Containers, on the other hand, are just applications with their runtime, running in isolation but sharing the same OS kernel. So if the OS gets compromised, or one of the applications corrupts the OS, all the applications are affected. They all have to follow the security protocols of the host machine.
-
Portability: Containers are easily portable because their images are small, whereas a VM image is much larger because it contains the complete OS along with the application, binaries, and libraries, which makes it less portable.
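The "level of abstraction" point can be observed directly. The sketch below only reports which kernel the current process sees; the helper name `kernel_info` is made up for illustration. The assumption is that we run it once on a Linux host and once inside a container on that same host: both runs should report the same kernel release, because the container shares the host kernel. Inside a VM it would report the guest OS kernel instead. (With Docker Desktop on macOS or Windows, containers run inside a Linux VM, so they report that VM's kernel rather than the host's.)

```python
import platform


def kernel_info() -> dict:
    """Report the OS family and kernel release seen by this process."""
    return {
        "system": platform.system(),           # e.g. "Linux"
        "kernel_release": platform.release(),  # kernel version string
    }


if __name__ == "__main__":
    # Run on the host and again inside a container, then compare the output.
    print(kernel_info())
```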
Use cases of Virtualization and Containerization
When we have an application that can use the full capacity of an OS, and security and isolation are among our priorities, we should go for VMs. If we want our applications to be portable and lightweight with quick boot times, we should go for containers. Both have their own advantages and disadvantages, so we should choose based on our use cases and requirements.
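Since boot time is one of the deciding factors, it can help to measure it in our own environment rather than rely on quoted figures. Below is a minimal sketch that times how long a command takes to start and finish. The helper name `time_command` is made up for illustration, and the commented-out Docker line assumes that Docker is installed and the `alpine` image has already been pulled; actual numbers will vary from machine to machine.

```python
import subprocess
import sys
import time


def time_command(cmd: list) -> float:
    """Run a command to completion and return the elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd,
        check=True,  # raise if the command fails
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    # Baseline: starting a plain process on this machine.
    baseline = time_command([sys.executable, "-c", "pass"])
    print(f"plain process: {baseline:.3f}s")

    # Assumption: Docker is installed and the alpine image is available locally.
    # container = time_command(["docker", "run", "--rm", "alpine", "true"])
    # print(f"container:     {container:.3f}s")
```

Timing a VM the same way would mean measuring from power-on until the guest OS is ready to run the application, which is the boot time discussed in the comparison above.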
Conclusion
In this article, we looked at the concept of containerization, which is what Docker is built on, and the problem that it solves. We saw why containerization is so often compared with virtualization, and then compared the architectures of the two. Finally, we compared them on factors such as speed, level of abstraction, size, resources, security, and portability.