If you are a full-stack developer, you need to know about Docker. This technology is a cornerstone of DevOps and makes your deployments faster: with it, you can push the new version of your app to your servers with a single click. The dockerfile is the linchpin of all this, and in this dockerfile tutorial we will see how to write one.
Do you want to know more about DevOps? Check out DevOps for Full Stack developers.
Introduction to the Dockerfile
Dockerfile is a pillar component of Docker. Hence, we should spend a few words on Docker before discussing the dockerfile itself. In this section, we will discuss Docker, the dockerfile, and why it is crucial.
What is Docker?
Docker is probably the most popular containerization technology out there. It allows us to create and run containers. This may sound simple, but it is not if you are new to containers. So, what is a container?
A container is a lightweight version of a virtual machine. It behaves pretty much like a virtual machine, but it is much lighter to run: we see containers using as little as 50MB of RAM! While this may sound incredible, there is no real trick behind it. It is just that a virtual machine replicates an entire OS, while containers do not – they share the host's kernel and only package what runs on top of it.
Another feature of containers is that they are stateless and pre-configured. Imagine you want to run an Apache web server on a virtual machine. First, you would create the VM, and then install Apache on it. After that, your VM with Apache is ready, and you can turn it off and on at will. Containers have a different paradigm. Each container carries a definition of what it should look like in its final state, so when you turn it on, it comes up with all the software and dependencies already installed. And whatever you save on the container’s disk will be lost when the container is powered down. Stateless, in fact, means that it does not preserve its state (its disk).
All of this means that we can use containers almost like virtual machines. I would say like virtual machines, but better. Some people are turned off by the stateless part, but that is actually a benefit. In fact, it forces you to separate the data layer (database) from your app. You can prepare special rules to have some containers preserve their disk, but that is advanced stuff.
What is a dockerfile?
The dockerfile is the configuration file for docker. It is just that, a file with the exact name dockerfile (with no extension), and it tells docker how to create a container.
When we have a dockerfile, we can run docker build to create the container image from it. The container image is a snapshot of our container, shut down and ready to be deployed. Then, we can take this image and deploy it – that is, create one or more live, running container instances from that image.
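For reference, building an image from a dockerfile might look like this (just a sketch; the tag my-node-app is a placeholder):
# Build the image from the dockerfile in the current folder ("my-node-app" is a placeholder tag)
# -f points docker at our file, since by default it looks for a file named "Dockerfile"
docker build -f dockerfile -t my-node-app .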
We can go even beyond that. In fact, we can publish our image to a docker registry, either public (Docker Hub) or private. Much like npm dependencies in Node.js or pip dependencies in Python, we can then download the image from the cloud and run it.
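As a rough sketch, publishing and pulling the image could look like this (my-dockerhub-user and my-node-app are placeholder names):
# Tag the local image for your registry account (placeholder names)
docker tag my-node-app my-dockerhub-user/my-node-app:1.0
# Push it to Docker Hub (you need to docker login first)
docker push my-dockerhub-user/my-node-app:1.0
# On any other machine, pull the image so it can be run there
docker pull my-dockerhub-user/my-node-app:1.0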
So, the dockerfile is the first step in defining what the container of our application should look like. Even if a dockerfile can rely on external files, for simplicity you should try to keep all the configuration of the container inside the dockerfile itself. And, obviously, keep the source code of your app next to the dockerfile.
More on Docker…
In this dockerfile tutorial – you know this by now – we will see how to create a dockerfile. We won’t focus on how to download or deploy images with docker. If you want to dive deeper and understand more facets of docker, you can read this tutorial on How to use Docker.
Tutorial: Create a Dockerfile
We can start by clearly defining what we want to accomplish: we want to containerize our application. Imagine you have a Node.js backend application, and you want to create a container that runs it. Our goal is to load the container with all the configuration and source code it needs, so that the app runs as soon as the container is powered on.
Where to place the dockerfile
The first step is to create the dockerfile itself. That is it: name it just dockerfile, without extension. (Note that docker build looks for a file named Dockerfile, with a capital D, by default; with a lowercase name, you can point docker to the file using the -f flag.) This file should live at your project root, in the same place where you have your package.json (in the case of this Node.js project).
.
├── README.md
├── dockerfile <-- DOCKERFILE IS HERE
├── node_modules
├── package-lock.json
├── package.json
├── spec
└── src
As easy as that: just place an empty file named dockerfile in your project root and we are good to go.
Decide the OS
Our container will have to run an operating system of some sort. In 99.99% of cases, we want this operating system to be a Linux distro, because it is open source and lightweight.
The dockerfile tells docker how to create our container image, and docker processes it top-down. The instructions at the top are run first, the ones at the bottom later. Hence, the very first instruction is to tell Docker on which OS we want to base our container.
We can do this with the FROM command. This command simply says “we are basing our container image on this other container image”. As such, you want to provide the name of a container image with its version. Yes, another container image. This means we do not have to use a barebone OS – we can use another image with some software preinstalled.
Since our app runs on Node.js, we can leverage an image that already has it installed. We picked Node.js version 12, and the image that gives us that is node:12. Hence, our first line will look like this:
FROM node:12
If you want to go barebone, you could, but you would have to install Node.js yourself in that case. Anyway, you can find many images on Docker Hub, including barebone OS images. For example, if you want Debian you can go with FROM debian:stretch.
Since we can inherit only from one image, you can have only one FROM statement in your dockerfile.
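If you did go barebone, the start of the dockerfile might look roughly like this (just a sketch: the NodeSource setup script and package names are illustrative, not part of this tutorial's setup):
FROM debian:stretch
# Starting from a barebone OS means installing Node.js ourselves (illustrative commands)
RUN apt-get update && \
    apt-get install -y curl gnupg ca-certificates && \
    curl -fsSL https://deb.nodesource.com/setup_12.x | bash - && \
    apt-get install -y nodejs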
Working Directory
A good next step is to define our working directory. That is the place where we want to move our cursor to run all subsequent commands. If you were to prepare a VM yourself from the command line, this is the equivalent of cd /your/working/directory. This step is optional but highly recommended.
Besides affecting the build, this also sets the working directory for the live container. This means that whenever we tell docker to run a CLI command inside our container, it will run from this folder by default.
To define the workdir, we simply use the WORKDIR command. If you are using the node:12 image, the convention is to place and run a Node.js app in /usr/src/app, so this is where we should go.
WORKDIR /usr/src/app
Tweaking with dependencies
If you are running a simple Node.js app, you may not need this step. In fact, this would not be needed if you are following this tutorial. Yet, it is a crucial step when creating most containers, so we should spend a few words.
With the RUN command, you can define a set of commands to run in the CLI when creating the container image. That is, you are effectively preconfiguring it – and the dockerfile is all about pre-configuration. For example, you may need some special dependencies. You can use a backslash (\) to spread multiple commands over multiple lines within a single RUN instruction.
The following example installs the Java Runtime Environment.
RUN \
apt-get update && \
apt-get upgrade -y && \
apt-get install -y default-jre
We gave no command in the first line (just \) for better readability.
Another command you may want to use is ENV, to define environment variables. You want to define only static environment variables here, that is, the ones you need to make your dependencies work. You should not define application variables here, such as links to the database. Instead, you should inject those into the container when you run it – not when creating its image.
In any case, getting back to our JRE example, we can define the JAVA_HOME variable like so.
ENV JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/jre"
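Application variables, on the other hand, are better passed when the container is started. A minimal sketch, assuming a hypothetical DB_URL variable and the placeholder image name my-node-app:
# Inject runtime configuration at startup instead of baking it into the image
docker run -e DB_URL="postgres://db.example.com/mydb" my-node-app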
Using your own files
This is where the magic happens. In fact, so far we have just created a container image that is an empty box. It is able to run a Node.js app – any Node.js app – but it comes with no app. We need to put our source code in the right place.
We can do so with the COPY command. This command copies a file or folder from your system into the container image. That is the reason why it was so important to have our dockerfile at the root of our project: when copying, source paths are relative to the position of our dockerfile, while destination paths on the container side are relative to the folder defined with WORKDIR. If you do not define any, it falls back to the one defined in your FROM image.
We generally want to copy our package.json and package-lock.json, our entry point (index.js), the src folder, and the node_modules folder. Note that we can use wildcards in the path (package*.json).
COPY package*.json ./
COPY index.js ./
COPY .babelrc ./
COPY src ./src
COPY node_modules ./node_modules
We also copied .babelrc because our app is using Babel to run ES6 JavaScript.
Note that we have two ways to go. The first, the one we used here, is to run npm install on our system and then copy the resulting node_modules folder into the container. The second approach would be to copy only the package*.json files and then run the installation inside the container (RUN npm install).
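For reference, that second approach would look roughly like this (a sketch, not the route we take in this tutorial):
# Alternative (sketch): copy only the manifests and install dependencies inside the image
COPY package*.json ./
RUN npm install
COPY index.js .babelrc ./
COPY src ./src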
These two may sound equivalent, but the first is much better. Why? If you have dependencies from private repositories, you do not need to put the repository credentials inside the container image. Even better, you may have a DevOps pipeline that runs tests, and it does so on the local system. So, if your tests succeed, you copy exactly the dependencies that have been tested, and you are 100% sure they will work.
Expose your app
At this point, our app is ready to run, but our container would still be useless: we need to enable some external network access. In other words, we want to give people a way to reach our container.
When deploying an application on a Docker host, users will have to traverse many layers before reaching your app. While your app sits inside your container, your users are out there in the wild wild Internet. To reach your app, they will have to traverse the Internet, your firewall, a load balancer inside your docker infrastructure, and then finally your container.
Everything else is out of scope here, but as of now your container is airtight and won’t let any connection in. We need to expose a TCP port: port 80, the standard HTTP port. That is easy:
EXPOSE 80
Obviously, EXPOSE is just one side of the story. It is the equivalent of making a little hole in the container shell so that users can enter through it. Whether we have something inside ready to accept connections is another story. In this case we do, because our app will listen on port 80 (and we should make sure of that). If you are running Express.js, pay attention to where you call app.listen().
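Keep in mind that EXPOSE mostly documents which port the container listens on; when you actually run the container, you still publish the port to the host. A minimal sketch, with my-node-app as a placeholder image name:
# Map port 8080 on the host to port 80 inside the container
docker run -p 8080:80 my-node-app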
What about HTTPS? Normally, containers expose HTTP. Then, you have a load balancer that exposes HTTPS to the users and proxies the connection to the container.
Running your app
With all this, we have created something like a snapshot: the final state we want our container to be in. That is not enough; we need to go a little further. Specifically, we need to tell docker which command to run when the container image is started as an actual live container.
We can provide any CLI command here, but we have one constraint: the command will run in the foreground, and each container is meant to run just one process. You can have an image capable of running several different processes, but you will have to spin off one container from that image for each process you want to run.
This may sound like a limitation, but it is consistent with the statelessness of containers. By running a single process per container, we tie the container to that process: if the process fails, the container crashes, and if the container crashes, our orchestrator can take care of starting up a new one.
So, instead of thinking about containers as stateless virtual machines, think of them as processes with an environment onboard.
To define which command to run, we use the CMD command. We provide it as an array of strings: the executable followed by its arguments. In the following example, we run npm run start.
CMD ["npm", "run", "start"]
Our Complete Dockerfile
In this dockerfile tutorial, we saw all the major components of our dockerfile. For your convenience, you can find them all here together, ready for copy-and-paste.
FROM node:12
WORKDIR /usr/src/app
COPY package*.json ./
COPY index.js ./
COPY .babelrc ./
COPY src ./src
COPY node_modules ./node_modules
EXPOSE 80
CMD ["npm", "run", "start"]
Pro Tip: Single File
At the very beginning of this dockerfile tutorial, we saw that you should strive to have everything fit inside your dockerfile. That is, everything but your actual application source code.
What do we mean by that? If you followed this tutorial, it may seem hard or even impossible to have something outside your dockerfile. Yet, things become more obvious if we move to configuring a more traditional web server such as Apache or Tomcat.
Those web servers rely on a configuration file to run, which from the standpoint of our app is not part of the application. It is part of the environment. Since it is an environment configuration, you should include it in your dockerfile.
Specifically, you should not COPY that web server configuration file, because it would mean your dockerfile relies on an external file (the one you copy, indeed). Instead, you should echo (or cat) its content directly into the right place from a RUN instruction, so that the body of the file is contained within the dockerfile.
An example will make things much clearer.
# Don't do this
COPY server.xml ./server.xml
# Do this instead (heredocs inside RUN require BuildKit, i.e. a reasonably recent Docker)
RUN cat <<EOF > server.xml
<?xml version="1.0" encoding="UTF-8"?>
<Server port="80" shutdown="SHUTDOWN">
  <!-- Rest of XML Config -->
</Server>
EOF
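If your Docker version does not support heredocs inside RUN, a printf-based sketch achieves the same result:
# Write the configuration file line by line, entirely from within the dockerfile
RUN printf '%s\n' \
    '<?xml version="1.0" encoding="UTF-8"?>' \
    '<Server port="80" shutdown="SHUTDOWN">' \
    '  <!-- Rest of XML Config -->' \
    '</Server>' > server.xml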
In Conclusion
Congratulations, you made it! By now, you should have all the tools needed to properly containerize your application. Not only that, but you should have the tools to do it in a good way.
At first, putting your app into a container may seem like unnecessary complexity, but the more you realize how complex maintaining an app actually is, the more you will see that containers are a lifesaver.