Docker Containers - Theory and Use

First published — Aug 15, 2023
Last updated — Aug 19, 2023
#infrastructure #cloud #tools

Docker. Images and Containers. Description and use.

Introduction

Docker is a complete system for creating software images and running them in containers.

Images are binary snapshots of files and directories packaged together as a unit.

Containers are running instances of those images.

Containers are somewhat similar to virtual machines, but different. Virtual machines emulate complete hardware, so usually a whole operating system runs in them. Containers are also isolated, but they run under the host’s kernel. That means they don’t require a kernel or any other OS files to run – one could create an image consisting of just one statically-compiled executable file.

Being able to run containers in isolated environments under the host OS requires a lot of functionality provided by the kernel. Two primary groups of such functionality provided by the Linux kernel are Linux control groups and Linux namespaces.
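
On a Linux host, both mechanisms can be observed even without Docker, through the /proc filesystem. The following is a minimal sketch; the helper names show_namespaces and show_cgroup are made up for illustration, and the exact output depends on the kernel and cgroup version in use.

```shell
# Print the namespace types the current process belongs to (Linux only).
# Each entry in /proc/self/ns stands for one namespace: mnt, pid, net, uts, ipc, ...
show_namespaces() {
    if [ -d /proc/self/ns ]; then
        ls -1 /proc/self/ns
    else
        echo "no /proc/self/ns here (not a Linux system?)" >&2
        return 1
    fi
}

# Print the control group(s) the current process is assigned to.
show_cgroup() {
    if [ -r /proc/self/cgroup ]; then
        cat /proc/self/cgroup
    else
        echo "no /proc/self/cgroup here" >&2
        return 1
    fi
}

show_namespaces || true
show_cgroup || true
```

Running readlink /proc/self/ns/pid prints the identifier of the PID namespace. A process inside a container normally reports a different identifier than a process started directly on the host.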

As a consequence of running on the host kernel, it is not possible to run completely different operating systems or different architectures in containers; program files need to be compatible.

Docker is a higher-level technology that builds on top of numerous Unix and networking concepts. To understand it properly, you should be familiar with the basic topics described in Series - Basic.

Docker Components

We mentioned that Docker is a complete system. It consists of the following main components:

  • Dockerfile syntax, which allows one to define images using a procedural syntax. Images can be defined easily by referencing other, existing images and customizing them

  • Engine for building the images, which takes Dockerfiles as input and produces Docker images as output

  • Engine for running the containers, which runs images in isolated, container environments. Processes running in containers are visible to the host OS

  • Functionality for downloading Docker images, which allows downloading existing images from public or private (authenticated) Docker registries

  • Functionality for uploading Docker images to public or private Docker registries

Command docker is a single program to access all of the described functionality. Different command line options invoke different parts of functionality.

In addition, we could mention Docker Compose, a tool that has its config file in docker-compose.yml and allows running multiple containers as a group. That is useful, for example, if a service you want to run consists of a couple of separate containers.

And there is also Docker Swarm, which can centrally manage a fleet of machines running Docker. It is out of the scope of this article.

Why Use Docker?

There are a number of entirely different reasons why you might use Docker.

For example:

  • If the host OS is outdated and can’t be upgraded, a newer version of a component can be run in a Docker container

  • If you want to try some software, but not risk it making modifications to your host OS, you might restrict it to running in a container

  • Some software is complex to install, so its authors might prepare ready-made Docker containers. That can greatly reduce time to results

  • Container images (Docker or others) are often basic units of deployment in the cloud, such as in Kubernetes

Why Not Use Docker?

This section is not intended to steer you away from using Docker, but to help you position it in your mental map more precisely. Containers are sometimes ideal to deploy software and tinker with systems and concepts, but there could be concerns.

When you want to use some software on GNU/Linux, you usually don’t use containers, but install it via the host’s native package management tools (such as Debian’s apt). Then, you configure the software if needed, and run it. That is the default workflow. You are involved in the whole process, while also taking maximum advantage of all the effort that package maintainers have invested in:

  • Reviewing the licensing and quality of the software
  • Making it adhere to the distribution’s defined standards
  • Integrating it into the distribution’s standard procedures and tools
  • Documenting it and often providing configuration examples
  • Pre-configuring it and generally making it ready and easy to use

Docker images, on the other hand, can be created by anyone. Images are not verified or tuned by the distribution’s package maintainers, and the software present in them is not installed and configured manually by end users. Both steps are already done in advance by image authors.

That raises the following concerns:

  • Images may contain code or behavior that you would not approve of. Since images are bigger and less transparent than packages, you might start using them without knowing, or lack the determination necessary to audit and remove the offending parts

  • Software images, being pre-installed and pre-configured, can deliver functionality quickly. But if you rely only on images, you never learn how to install and configure the software yourself. That makes you potentially miss out on the features of original software that were not made available through the image, and in general reduces your level of skill

  • Images may serve to conveniently distribute and popularize software delivered under non-free licensing terms, which should be avoided whenever possible

  • Using software through images and containers, or through proprietary platforms on which they are deployed, might make you accustomed to using “software as a service”, rather than demanding to have full control and ownership of your software, data, and devices

Docker Installation

Docker

Docker installation is not a part of this article since it is very adequately covered elsewhere.

For Debian GNU/Linux-based systems, see Install Docker Engine on Debian and then return here.

Permissions

In a default scenario, Docker uses a simple permission model where all members of group docker are able to use it.

So our first task is to add the current user to group docker:

sudo adduser $USER docker

Adding user `user' to group `docker' ...
Done.

The operating system reads a user's group memberships at login. Thus, for the new group membership to be applied, you should completely log out of the system, and then log back in. However, that may be inconvenient, so in the meantime there are two commands that will make only the current shell apply the new group. The first makes docker the primary group, and the second restores your usual primary group while keeping docker in the group list. Each of them starts a new shell, so run them one after the other rather than chaining them with a semicolon:

newgrp docker
newgrp $USER

Alternatively, if that asks for a password (which it shouldn’t), there is another method:

sudo su - -c "su - $USER"

To confirm that you have the necessary privilege to use Docker, simply run id to check that “docker” is in the list of supplementary groups, and then run e.g. docker ps. If no direct error message is printed, you are OK.
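
The group check can also be scripted. This is a small sketch that only relies on the standard id command; the function name user_in_group is our own choice.

```shell
# Succeed if the group list of the current shell contains the given group.
user_in_group() {
    for g in $(id -nG); do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

if user_in_group docker; then
    echo "docker group is active in this shell"
else
    echo "docker group is not active yet; log out and back in, or use newgrp"
fi
```

Note that this reports the groups of the running shell, which is exactly what matters here: a membership that was added but not yet applied will show up as missing.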

Quick Start

Starting Containers

As mentioned, Docker is a system for creating software images and running those images in containers.

However, we do not have to build all images ourselves. Docker maintains a public registry of available images, and as soon as we reference an image that does not exist locally, Docker will connect to its public Internet registry and try to download it from there.

I find Docker’s eagerness to look up images remotely quite inconvenient – it only takes a 1-letter typo for Docker to not find an image locally and go download and run it from the Internet!
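
One way to guard against that is to forbid pulling when starting a container. The sketch below assumes a Docker version in which docker run supports the --pull option (to our knowledge, 20.10 and newer); the wrapper name run_local_only is made up for illustration.

```shell
# Start a container only from an image that already exists locally.
# --pull=never tells "docker run" not to contact a registry, so a
# mistyped image name results in an error instead of a download.
run_local_only() {
    docker run --pull=never "$@"
}
```

For example, run_local_only --rm hello-wrold (note the typo) would then fail with an error rather than fetching whatever image happens to carry that name.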

Let’s start by confirming that, in a clean installation, we do not have any containers or images. The following commands will just print empty results:

# Show running containers
docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

# Show all containers
docker container list -a
CONTAINER ID   IMAGE     COMMAND     CREATED          STATUS          PORTS     NAMES

# Show all images available locally
docker images
REPOSITORY    TAG       IMAGE ID       CREATED        SIZE

Now, knowing about Docker’s automatic lookup of images in the public registry, we can run our first container “hello-world”.

When we run the command, the first part of the output will inform us about downloading the image. The second part will be the actual message “Hello from Docker” – but it will be so confusingly big and unintuitive that it could easily be mistaken for a service announcement rather than an indication of success.

Here’s the first part of the output in which the image is being downloaded:

docker run --rm hello-world

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
719385e32844: Pull complete 
Digest: sha256:dcba6daec718f547568c562956fa47e1b03673dd010fe6ee58ca806767031d1c
Status: Downloaded newer image for hello-world:latest

And the second part which shows the actual “Hello from Docker!”:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

In short – it worked!

Now we can check our list of local images again. One image will be there:

docker images

REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world   latest    9c7a54a9a43c   3 months ago   13.3kB

The command docker ps, which shows running containers, will still be empty. That is because our container has started, printed the message, and exited, so there are no running containers at the moment:

docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Building Images

While we have the “hello-world” image at hand, let’s show the simplest possible customization and a build of our own Docker image.

We already mentioned that Docker commands for building images are found in files named Dockerfile, and that new images can be built on top of existing ones.

So let’s create a very simple image, based on “hello-world” and its program /hello that printed the welcome message.

To make our image a little different, we are going to start from “hello-world” and then add a new command /hello2 into it, which will just print a brief Hello, World! to the screen and exit, like the original image should have done.

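First, we need to create the hello2 program. If you have programmed in C, you will clearly recognize the following snippet as a C program. But in any case, type the following commands to install the compiler, create one .c file, and compile it. The resulting program will be in file hello2. The option -static matters: the “hello-world” image contains no shared C library, so a dynamically linked program would not be able to start in it: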

sudo apt install build-essential

echo -e '#include <stdio.h>\nint main() { printf("Hello, World!\\n");}' > hello2.c
gcc -o hello2 -static hello2.c

./hello2

Hello, World!

Now that we have our program, let’s create a Dockerfile for our new image.

FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]

The above lines specify that we want to use an existing image “hello-world” as a base, copy file hello2 from the host to /hello2 in the new image, and define the command (“CMD”) that will run every time we run this image.

Note that Dockerfiles only define how images will be built, not how they will be named or which version they will have; those options are passed at build time.

We can then build the image with:

docker build -f Dockerfile -t hello-world2 .  # (Don't forget the dot at the end)

Once the image is built, we can verify its presence in the local Docker cache:

docker images

REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world2  latest    d97789789d8d   4 seconds ago  775kB
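
Since names and versions are given at build time, the same Dockerfile can produce differently tagged images. The following is a sketch, not something the rest of the article depends on; the function name build_versioned and the version number used in the example call are our own assumptions.

```shell
# Build the Dockerfile in the current directory and tag the result
# both as NAME:VERSION and as NAME:latest.
# "docker build" accepts the -t option more than once.
build_versioned() {
    name=$1
    version=$2
    docker build -f Dockerfile -t "$name:$version" -t "$name:latest" .
}
```

Calling build_versioned hello-world2 1.0 would make docker images list the same image ID twice, once under tag 1.0 and once under tag latest.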

And we can now start a temporary container based on image “hello-world2”:

docker run --rm hello-world2

Hello, World!

If you remember, our image has already been configured to start /hello2 as the default command running in the container. Because of that, we did not have to explicitly say /hello2 on the command line.

But the image contains both /hello and /hello2. What if we wanted to run the original /hello?

When a command is present on the command line, it will override the image’s default “CMD”. So if we wanted to run /hello instead of /hello2 in our container, we would simply write:

docker run --rm hello-world2 /hello

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
...
...

Removing Images

The images “hello-world” and “hello-world2” are extremely simple. They consist of programs /hello and /hello2 which print welcome messages and exit. There is nothing else useful we can do with them, other than maybe inspecting them for the sake of practice, and then removing them from the local cache:

docker image inspect hello-world

{
		"Id": "sha256:9c7a54a9a43cca047013b82af109fe963fde787f63f9e016fdc3384500c2823d",
		"RepoTags": [
				"hello-world:latest"
		],
		...
		...
		...

docker image rm hello-world
docker image rm hello-world2

It is possible that the above commands will fail, saying:

Error response from daemon: conflict: unable to remove repository reference "hello-world2" (must force) - container ........... is using its reference image d97789789d8d

That simply means there exist containers which use this image, so the image cannot be removed. List containers and remove them before removing the images:

docker container list -a
docker container rm ...........

docker image rm hello-world
docker image rm hello-world2
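
Looking up the container IDs by hand can be avoided with the ancestor filter of docker ps. The helper below is a sketch under our own naming (remove_containers_of). Keep in mind that the filter may also match containers of images derived from the given one, and that docker rm refuses to remove running containers unless it is given -f.

```shell
# Remove all (stopped) containers that were created from the given image.
remove_containers_of() {
    image=$1
    # -a: include stopped containers, -q: print only the container IDs
    ids=$(docker ps -a -q --filter "ancestor=$image")
    if [ -n "$ids" ]; then
        # $ids is left unquoted on purpose so that each ID becomes one argument
        docker rm $ids
    else
        echo "no containers use $image"
    fi
}
```

After remove_containers_of hello-world2, the command docker image rm hello-world2 should no longer report a conflict.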

Lastly, working with Docker, you will quickly notice that its cache can easily fill gigabytes of disk space, so we will also show two space-related commands here. The first reports disk usage, and the second removes unused data. They will not make a difference with only a few images, but will come in handy in the future.

docker system df
docker system prune

Interacting with Containers

By default, for easier communication, Docker creates one default virtual network subnet and all containers get assigned an IP from that subnet so they can talk to each other.

The containers also have access to the host OS’ networking, so if the host machine is connected to the Internet, containers will be able to access it as well.

However, other than that, containers run with pretty much everything separate from other containers, including storage.

That is great for isolation, but may be a problem for durability of data. While it is quite normal to have long-lived containers, containers are often also created temporarily, and all their data is lost when containers are stopped and removed.

Similarly, container isolation may be a problem if we want software in the containers to interact with the outside system more.

There are two ways to enable that interaction:

  • The first one, of limited applicability, is to copy additional files or other required data into the container

  • The second, and more useful one, is to mount host OS directories (ad-hoc volumes) into the container, and/or expose container’s network ports to the host OS

Mounting disk volumes inside containers is done with option -v HOST_PATH:CONTAINER_PATH, and exposing ports is done with option -p HOST_PORT:CONTAINER_PORT.

Let’s show that in practice.

Network Services in Containers

We have seen the “hello-world” image in the previous chapter. The image did not exist locally, so it was automatically pulled from Docker’s public registry when we ran it.

That container did not require much interaction. All it did was print a welcome message and exit.

But now, to show network interaction with containers and set things up for other examples, we are going to explicitly download and then run a Docker image for Apache HTTP (web) server.

The image name is httpd:

docker pull httpd

To be useful, an HTTP server must of course be accessible. So we are going to run the container and route the host OS’ port 8080 to port 80 in the container. Port 80 is a standard port on which unencrypted (non-SSL) web servers are listening.

docker run -ti --rm -p 8080:80 httpd

With that command, the container will start in foreground mode.

We can now use a web browser to open http://0:8080/ and we will be greeted by Apache’s simple message “It works!”.

When you are done with the test, press Ctrl+c to terminate the process. Because of option --rm, the container will also be automatically removed.

Data in Containers

But, what about a more useful website? What if we had a personal or company website, and wanted to serve it from this container?

If you are familiar with the basics of the HTTP protocol, you know the original idea was that a client would request a particular URL on the server, that URL would map to some HTML file on disk, and the server would return the file contents to the user.

From the documentation on the Docker official image 'httpd', we see that Apache’s root directory for serving HTML files is /usr/local/apache2/htdocs/.

Therefore, the simplest thing we could do to serve our website instead of the default “It works!” would be to copy our files over the default ones.

Let’s do that now and confirm that it worked by seeing the message change from “It works!” to “Hello, World!”:

First, we will create a directory public_html/ containing one page for our new website:

mkdir public_html
echo "<html><body>Hello, World!</body></html>" > public_html/index.html

Then, we will create a separate Dockerfile, e.g. Dockerfile.apache, for our new image:

FROM httpd
COPY ./public_html/ /usr/local/apache2/htdocs/

Finally, we will build and run the image:

docker build -f Dockerfile.apache -t hello-apache2 .  # (Don't forget the dot at the end)

docker run -ti --rm --name test-website -p 8080:80 hello-apache2

Visiting http://0:8080/ will now show our own website and message “Hello, World!”.

We are done with the test so press Ctrl+c to terminate the process.

Data in External Volumes

The previous example works, but copying data into images is not very flexible. When data changes, we need to rebuild images and also restart containers using them.

As mentioned earlier, the solution is to mount host OS directories (ad-hoc volumes) into the container with option -v HOST_PATH:CONTAINER_PATH.

Since we already have our public_html/ directory, and mounting volumes does not require changing the original images, we can use the httpd image directly:

docker run -ti --rm -p 8080:80 -v ./public_html:/usr/local/apache2/htdocs/ httpd

Visiting http://0:8080/ will now show our new website and message “Hello, World!”.

But the example is not functionally equivalent to the previous one. This data is now “live”. If we modify file public_html/index.html or any other file in the public_html/ directory on the host OS, and visit it through the browser, we will immediately see the current content. (In some cases you might need to press Ctrl+r or F5, or Ctrl+Shift+r or Shift+click Reload, to make the browser refresh its cache.)
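
One practical detail about option -v: some Docker versions accept only absolute host paths and treat anything else as the name of a volume. If a relative path such as ./public_html is rejected on your system, it can be turned into an absolute one first. The helper name bind_mount_arg below is hypothetical.

```shell
# Print a value for "-v" in which the host directory is made absolute.
# Fails if the host directory does not exist.
bind_mount_arg() {
    host_dir=$1
    container_dir=$2
    abs=$(cd "$host_dir" && pwd) || return 1
    printf '%s:%s\n' "$abs" "$container_dir"
}
```

It could then be used as docker run -ti --rm -p 8080:80 -v "$(bind_mount_arg ./public_html /usr/local/apache2/htdocs/)" httpd.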

Furthermore, since we now have a long-running container, we can verify its presence in the output of docker ps:

docker ps

CONTAINER ID   IMAGE     COMMAND            CREATED      STATUS      PORTS                                    NAMES
66bb93476t99   httpd     "httpd-foreground" 1 hour ago   Up 1 hour   0.0.0.0:8080->80/tcp, :::8080->80/tcp    smart_williams

The last field, NAMES, shows a name randomly generated by Docker. It should be overridden with a purposely chosen name (option --name) when you intend containers to be of more than temporary use.

Running Commands in Containers

In containers, you can only run commands that exist in the underlying image.

As long as they exist, you can run them at container startup, or later, while the container is already running.

Let’s look at each option.

At Startup

From Dockerfile

There are two Dockerfile directives that define the default program to run in the container at startup – ENTRYPOINT and CMD.

The command that Docker will run by default is $ENTRYPOINT $CMD.

We have seen an example in our earlier Dockerfile:

FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]

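Additional examples (normally you would use one or the other; if both are present, the words from CMD are appended to ENTRYPOINT as arguments):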

ENTRYPOINT [ "/some/program", "--with-option", "123" ]
CMD [ "/some/program", "--with-option", "123" ]

Note that ENTRYPOINT and CMD above show the preferred “exec” syntax, but a “shell” syntax is also available. See more about that in e.g. the Docker documentation for ENTRYPOINT.

From Command Line

It is possible to override both ENTRYPOINT and CMD on the command line, at time of container startup.

One detail to keep in mind is that a command given on the command line overrides CMD, not ENTRYPOINT. To change ENTRYPOINT, you have to explicitly pass option --entrypoint ....

We have already seen above how we can specify custom commands to run in a container, e.g.:

docker run --rm hello-world2 /hello
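
The rules from this and the previous section can be summarized in a few lines of shell. This is only a model of the behavior described above, not Docker's actual implementation; the function effective_command is invented for the illustration and takes ENTRYPOINT and CMD as plain strings.

```shell
# Model of how the start command of a container is assembled:
#   $1 = ENTRYPOINT from the image (may be empty)
#   $2 = CMD from the image (may be empty)
#   remaining arguments = command given on the "docker run" command line
effective_command() {
    entrypoint=$1
    cmd=$2
    shift 2
    # A command on the command line replaces CMD, never ENTRYPOINT
    if [ "$#" -gt 0 ]; then
        cmd="$*"
    fi
    if [ -n "$entrypoint" ] && [ -n "$cmd" ]; then
        printf '%s %s\n' "$entrypoint" "$cmd"
    else
        printf '%s%s\n' "$entrypoint" "$cmd"
    fi
}

effective_command "" "/hello2"            # image default
effective_command "" "/hello2" /hello     # CMD overridden on the command line
effective_command "/some/program" "--with-option 123"
```

The three calls print /hello2, /hello and /some/program --with-option 123, respectively. Replacing ENTRYPOINT itself is done with the separate option, for example docker run --rm --entrypoint /hello hello-world2.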

In Runtime

Often we want to connect to containers that are currently running.

Let’s first start a container running Debian GNU/Linux:

docker run --name my_debian -ti --rm debian

root@5821b3a41434:/#

Then, in another terminal let’s run docker ps to confirm our container is running:

CONTAINER ID   IMAGE     COMMAND   CREATED          STATUS          PORTS     NAMES
5821b3a41434   debian    "bash"    15 seconds ago   Up 13 seconds             my_debian

Now with a running container, we can execute commands in it via docker container exec. Here is an example that shows disk space:

docker container exec -ti my_debian df -h

Filesystem      Size  Used Avail Use% Mounted on
overlay          15G  6.0G  7.6G  44% /
tmpfs            64M     0   64M   0% /dev
tmpfs           1.2G     0  1.2G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
/dev/xvda3       15G  6.0G  7.6G  44% /etc/hosts
tmpfs           1.2G     0  1.2G   0% /proc/asound
tmpfs           1.2G     0  1.2G   0% /proc/acpi
tmpfs           1.2G     0  1.2G   0% /proc/scsi
tmpfs           1.2G     0  1.2G   0% /sys/firmware

If there is a shell in the container, which the Debian image of course has, we can also run the shell directly:

docker container exec -ti my_debian /bin/bash

root@5821b3a41434:/#

The shell can be exited with command exit or by pressing Ctrl+d, as usual.
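
When docker container exec is called from a script, it can help to first check that the target container is really running. The sketch below uses docker container inspect with a format template for that; the function names container_is_running and exec_if_running are our own.

```shell
# Succeed if the named container exists and is currently running.
container_is_running() {
    state=$(docker container inspect -f '{{.State.Running}}' "$1" 2>/dev/null)
    [ "$state" = "true" ]
}

# Run a command in the container, but only if the container is running.
exec_if_running() {
    name=$1
    shift
    if container_is_running "$name"; then
        docker container exec "$name" "$@"
    else
        echo "container $name is not running" >&2
        return 1
    fi
}
```

For example, exec_if_running my_debian df -h behaves like the command shown earlier (without the -ti options), but reports a clear error if my_debian has already exited.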

Hybrid

It is completely fine to combine both approaches, specifying a command to run at startup as well as connecting and running additional commands later, at runtime.

In many container images that implement some client/server applications, such as databases, it is customary that their Docker image will start the server by default, but if you want to start a client, then you override the command to just start the client.

You can do this by running docker run IMAGE_NAME [CMD] twice, once without and once with the command manually specified. This will run the same image twice, in two separate containers, and you will be able to confirm this with docker ps.

However, you can also run the second command with docker container exec CONTAINER CMD and it would have a similar, but different effect. It would run the second command in the first container, rather than starting two separate containers.
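
The difference can be tried out with the httpd image from the earlier examples. The following sketch wraps the steps in a function so that nothing runs until it is called; the container name hybrid_demo is made up, and httpd -v (which prints the Apache version) merely stands in for a "client" command.

```shell
run_vs_exec_demo() {
    # Container 1: started in the background with the image's default CMD (the server)
    docker run -d --name hybrid_demo httpd
    # Container 2: same image, but CMD overridden; a separate, temporary container
    docker run --rm httpd httpd -v
    # The same command again, this time executed inside container 1
    docker container exec hybrid_demo httpd -v
    # Clean up container 1
    docker rm -f hybrid_demo
}
```

While the function runs, docker ps in another terminal would show hybrid_demo throughout, and the second container only for the brief moment in which it prints the version.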

Automatic Links

The following links appear in the article:

1.
2. Series - Basic - /series/basic
3. Install Docker Engine on Debian - https://docs.docker.com/engine/install/debian/
4. Docker Documentation for ENTRYPOINT - https://docs.docker.com/engine/reference/builder/#entrypoint
5. Linux Control Groups - https://en.wikipedia.org/wiki/Cgroups
6. Linux Namespaces - https://en.wikipedia.org/wiki/Linux_namespaces
7. Docker Official Image 'Httpd' - https://hub.docker.com/_/httpd