
Saturday, 7 May 2016

Why the order of the Docker commands in the Dockerfile matters

A Docker image is built from the directives in a Dockerfile. Docker images are actually built in layers, and each layer is part of the image's filesystem: a layer either adds to or overrides what the previous layers provide. Each Docker directive, when run, results in a new layer, and the layer is cached unless Docker is instructed otherwise. When the container starts, a final read-write layer is added on top to store changes made while the container runs. Docker uses a union filesystem to merge the individual layers into the final filesystem.

When a Docker image is being built, Docker reuses the cached layers where it can. If a layer is invalidated (one of the invalidation criteria is the checksum of the files present in the layer, so a layer is invalidated when one of its files has changed since the last build), Docker reruns every directive from the invalidated one onwards to recreate up-to-date layers. This makes Docker efficient and fast, and it also means the order of commands is important.
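The checksum criterion mentioned above can be illustrated without Docker at all; this is a conceptual sketch (the file name and contents are only illustrative), showing why a changed file forces the COPY layer, and everything after it, to be rebuilt:

```shell
# Docker decides whether a COPY/ADD layer can be reused by comparing a
# checksum of the copied files; a changed file means a changed checksum
# and therefore an invalidated layer. The same idea in plain shell:
demo_dir=$(mktemp -d)
echo '{ "name": "my_application" }' > "$demo_dir/package.json"
sum_first_build=$(sha256sum "$demo_dir/package.json" | cut -d' ' -f1)

# the file changes between two builds
echo '{ "name": "my_application", "version": "1.0.1" }' > "$demo_dir/package.json"
sum_second_build=$(sha256sum "$demo_dir/package.json" | cut -d' ' -f1)

if [ "$sum_first_build" != "$sum_second_build" ]; then
    echo "package.json changed - COPY layer and all later layers are rebuilt"
fi
rm -rf "$demo_dir"
```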

Example:


WRONG

WORKDIR /var/www/my_application            container directory where RUN and CMD will be executed

COPY .  /var/www/my_application            copies all application files, including package.json, into the container
RUN ["npm", "install"]                     installs nodejs dependencies - rerun whenever the COPY layer above is invalidated
RUN ["node_modules/bower/bin/bower.js", "install", "--allow-root"]   installs frontend dependencies - also rerun on every COPY invalidation
RUN ["node_modules/gulp/bin/gulp.js"]      runs the task runner - a change to any application file invalidates the COPY layer and forces all three RUN directives to rerun


CORRECT


WORKDIR /var/www/my_application            container directory where RUN and CMD will be executed

COPY package.json  /var/www/my_application this layer is only invalidated when package.json changes
RUN ["npm", "install"]                     installs nodejs dependencies

COPY bower.json  /var/www/my_application   this layer is only invalidated when bower.json changes
RUN ["node_modules/bower/bin/bower.js", "install", "--allow-root"]  installs frontend dependencies

COPY .  /var/www/my_application            copies all application files into the container

RUN ["node_modules/gulp/bin/gulp.js"]      runs the task runner

Wednesday, 4 May 2016

Dockerizing a nodejs application requiring AWS authentication

Background


To dockerize an application (to run an application in a Docker container) means to run it in a light, portable virtual environment (the container) that contains everything the application needs, so the environment and dependencies do not have to be set up by hand, whether locally or on a virtual machine.

Docker works in a client/server mode. We need to have the Docker daemon (server) running in the background, and we use the Docker CLI (command line interface, the client) to send Docker directives (build an image, run it, inspect containers, inspect the image, log into the container and more) to the daemon. Docker commands can be run either by root/a sudoer, or it is possible to create a docker group, add one's user to that group and run the docker CLI as that user.
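The group setup mentioned above uses standard system commands (they must be run as root, and the user has to log out and back in before the new group membership takes effect):

```shell
# create the docker group and add the current user to it,
# so docker can be run without sudo afterwards
sudo groupadd docker
sudo usermod -aG docker $USER
# log out and back in, then verify:
docker run hello-world
```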

A Docker container provides a stripped-down version of the Linux OS; a Docker image provides the application dependencies. We say we load the image into the container and run the application, for which the image was built, in the container.

The order of the Docker commands in the Dockerfile matters:

The image is built from the directives found in the Dockerfile. Docker images are built in layers, and each layer is part of the image's filesystem: a layer either adds to or overrides what the previous layers provide. Each Docker directive, when run, results in a new layer, and the layer is cached unless Docker is instructed otherwise. When the container starts, a final read-write layer is added on top to store changes made while the container runs. Docker uses a union filesystem to merge the individual layers into the final filesystem.

When a Docker image is being built, Docker tries to reuse the previously cached layers. If a layer is invalidated (one of the invalidation criteria is the checksum of the files present in the layer, so a layer is invalidated when one of its files has changed since the last build), Docker reruns every directive from the invalidated one onwards to recreate up-to-date layers. This makes Docker efficient and fast, but it also means the order of commands is important.

Example:


WRONG

WORKDIR /var/www/my_application            container directory where RUN and CMD will be executed

COPY .  /var/www/my_application            copies all application files, including package.json, into the container
RUN ["npm", "install"]                     installs nodejs dependencies - rerun whenever the COPY layer above is invalidated
RUN ["node_modules/bower/bin/bower.js", "install", "--allow-root"]   installs frontend dependencies - also rerun on every COPY invalidation
RUN ["node_modules/gulp/bin/gulp.js"]      runs the task runner - a change to any application file invalidates the COPY layer and forces all three RUN directives to rerun

CORRECT

WORKDIR /var/www/my_application            container directory where RUN and CMD will be executed

COPY package.json  /var/www/my_application this layer is only invalidated when package.json changes
RUN ["npm", "install"]                     installs nodejs dependencies

COPY bower.json  /var/www/my_application   this layer is only invalidated when bower.json changes
RUN ["node_modules/bower/bin/bower.js", "install", "--allow-root"]  installs frontend dependencies

COPY .  /var/www/my_application            copies all application files into the container

RUN ["node_modules/gulp/bin/gulp.js"]      runs the task runner


Essential commands:

Command to create the docker image:

[sudo] docker build -t image-name .

[sudo] docker run [-t -i] image-name [ls -l]

  docker run ...... runs the container
  -t ............... creates a pseudo terminal with stdin and stdout
  -i ............... interactive
  image-name ....... image to run in the container
  ls -l ............ command to run interactively in the container

Commands to run an image in a container:

[sudo] docker run ubuntu /bin/echo 'Hello everybody'
[sudo] docker run -t -i ubuntu /bin/bash           (creates an interactive bash session that allows exploring the container)

Terminology


  • docker container ..........  provides a basic Linux operating system
  • docker image ...............  loading the image into the container extends the container basics with the dependencies required to provide the desired functionality, eg running a web application, or running, populating and maintaining a database
  • Dockerfile ..................... contains the docker instructions/commands for creating the image
  • Docker hub/registry ...... the Docker Engine (providing the core docker technology) makes it possible for people to share software by uploading created images

Requirements

  1. download the docker software to be able to run the docker daemon and the CLI binary, which allow working with docker containers and docker images
  2. create a docker image which, after loading into a Docker container, sets up the container for running an application
  3. run the image in the docker container. There is a public docker registry (or it is possible to have a private one) from where an already existing image can be downloaded and used as is, or used as the basis for creating a customized one.

Download and install Docker

  1. installation instructions for Linux based Operating Systems:
    1. https://docs.docker.com/engine/installation/

Implement the application


                   TODO

Dockerize the application


Create a docker image build file called Dockerfile


FROM node:argon
ENV appDir ${appDir:-/var/www/}

RUN mkdir -p ${appDir}
WORKDIR ${appDir}

COPY package.json ${appDir}/
RUN ["npm", "install"]

COPY bower.json ${appDir}/
RUN ["./node_modules/bower/bin/bower", "install", "--allow-root"]

COPY . ${appDir}
RUN ["./node_modules/gulp/bin/gulp.js"]

CMD ["npm", "start"]

EXPOSE 9090

FROM node:argon
we base our custom image on a suitable existing image from the Docker registry/hub. We could use the ubuntu image and download and install a particular nodejs version ourselves as part of the image build, but it is more convenient to use an image already created for this particular use. For other nodejs targeted images, see the Docker hub.

ENV appDir ${appDir:-/var/www/}
sets an environment variable determining our application root (defaulting to /var/www/ when appDir is not already set).

RUN mkdir -p ${appDir}
the docker RUN directive executes a shell command; here it creates the application root directory.

WORKDIR ${appDir}
the docker WORKDIR directive sets the directory in which the subsequent RUN, CMD and COPY directives will be executed.

COPY package.json ${appDir}/
copies a file from our local directory, where we shall run the docker build command, into the application root directory in the docker container.

RUN ["npm", "install"]
installs the application nodejs dependencies into ${appDir}/node_modules in the container.

COPY bower.json ${appDir}/
copies bower.json into the container, so the following layer is only invalidated when bower.json changes.

RUN ["./node_modules/bower/bin/bower", "install", "--allow-root"]
root will be the user running the commands (unless we specify another user with the docker USER directive), and bower refuses to run as root unless we specifically allow it. The relative path ./node_modules/bower/bin/bower has to be used because bower is not installed globally in the node:argon image; we downloaded bower as a nodejs dependency, so we have access to its binary in the node_modules directory.

COPY . ${appDir}
copies the rest of the application files into the application root directory in the docker container.

RUN ["./node_modules/gulp/bin/gulp.js"]
now runs the default gulp task.

CMD ["npm", "start"]
there can be only one effective CMD directive in a Dockerfile. It either starts the application, as in our case, or, if ENTRYPOINT is specified (ENTRYPOINT sets the executable to run when the container starts), it supplies the default command line arguments for that executable.

Example:

                # Default Memcached run command arguments
                CMD ["-u", "root", "-m", "128"]

                # Set the entrypoint to the memcached binary
                # (exec form is required for the CMD arguments to be appended)
                ENTRYPOINT ["memcached"]
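Putting the two directives together shows the interplay (a sketch; the image name is illustrative). Note that ENTRYPOINT must be in the exec (JSON array) form: the shell form would ignore CMD entirely:

```dockerfile
# ENTRYPOINT fixes the executable started in the container;
# CMD supplies its default arguments.
ENTRYPOINT ["memcached"]
CMD ["-u", "root", "-m", "128"]
```

`docker run image-name` then executes `memcached -u root -m 128`, while `docker run image-name -m 64` replaces the CMD arguments and executes `memcached -m 64`.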

EXPOSE 9090
documents the port on which the application listens inside the container (the port still has to be published with -p when running the container).


Create .dockerignore file to exclude files from being added to the image

.git
.gitignore
.gitattributes
node_modules
bower_components
Dockerfile
.dockerignore
*.bak
*.orig
*.md
*.swo
*.swp
*.js.*

node_modules and bower_components dependencies will be downloaded and installed during the image build, so they do not need to be sent to the docker daemon as part of the build context.

Create a docker image of your application (the command must be run in the directory containing the Dockerfile):

           [sudo] docker build -t [username/]image-name .

Run the application in the docker container

           [sudo] docker run -v "/home/tamara/.aws:/root/.aws" -p 3000:9999 -it image-name

             -v /home/tamara/.aws:/root/.aws ..... binds a local directory to the container
             -p 3000:9999 ........................ the application port exposed by the image, 9999, is accessible locally/from outside of the container on 3000

The application communicates with an AWS S3 bucket. The AWS authentication resides in the .aws subdirectory of the user running the application in the form of the file /home/tamara/.aws/credentials. The dockerized application is run by root in the container (unless we dictate otherwise by using the docker USER command), so we bind, at runtime, the local /home/tamara/.aws to the container directory /root/.aws.
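The mounted directory is expected to contain the standard AWS shared credentials file. An illustrative layout with placeholder values (use your own keys):

```ini
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```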

Literature

Docker documentation
How To Create Docker Containers Running Memcached