For a few years our software has been deployed to Google Cloud’s AppEngine Flex environment, which behind the scenes creates a Docker image, pushes it to the GCP Container Registry and then deploys it to a compute engine instance managed by App Engine.
The time had come to take control of the Docker build to increase the flexibility of the deployment options. In GCP there are four Compute options where we can deploy a docker image:
- App Engine Flex
- Compute Engine
- Cloud Run
- Kubernetes (GKE)
Each option has its own trade-offs in features and pricing, so us and our customers could choose what best suits their requirements, and even change the service deployed to over time. Another Kubernetes based option would be to deploy to Anthos.
How to create an optimal production Docker image required some research to remember what the key considerations are when building an image, which include:
- Security - A slimmer base image, minimal extra files and running the app as a non-root user reduces the surface attack area.
- Size - A smaller image means less time when starting a new container. However sometimes extra steps are required in the build process to ensure the final image is the smallest possible
- Speed - The ordering of operations in the Dockerfile is crucial to allow the optimal caching of layers that contribute to an image to reduce build time.
The final code for this example can be found here.
Our test app will be a basic NestJS starter app, which we can create by running:
$ npm i -g @nestjs/cli $ nest new api-server
We’ll start with a simple naive approach, and work our way through to the final optimised to explain the steps.
The goal will be to run the build in a GitLab CI pipeline, so the node_modules folder won’t exist when the repository is checked out in the build. For now we will run it locally to explain some concepts as we update the Dockerfile.
Let’s create the naive Dockerfile in /api-server
FROM node:12 COPY /package.json ./yarn.lock ./ COPY /src ./src RUN yarn install RUN npm run build CMD ["node", "dist/main.js"]
docker build -t test/api-server .
You’ll notice the first line of the output is
Sending build context to Docker daemon 219.8MB
The docker client sends all the files in the context to the docker daemon. As we ran docker command with the final
. arg (i.e. the current directory) then the context will be all the files in the current directory.
We can exclude files that we don’t need to send by defining a .dockerignore file, which has the same format as a .gitignore file.
We can simulate being on a build server that has checked out the code by ignoring the node_modules we have already installed by creating a file named
.dockerignore with the contents:
When we run the docker build command again the output is now
Sending build context to Docker daemon 835.1kB
Finally note that the
COPY /package.json ./yarn.lock ./ command copies the files from the docker context to the image, not from the file system the docker cli was executed from. If you ever get an error of missing a file to copy make sure you haven’t excluded it from the context in the .dockerignore file.
Building from GitLab CI
Now we’ve tested built the image locally the next step is to build it with GitLab CI and push the image to the GCP Container Registry. Create a
gitlab-ci.yaml file with the contents:
image: node:12 variables: GCP_PROJECT_ID: 'gcp-nodejs-docker' APP_NAME: 'api-server' stages: - build-test - build-docker-image build-test: stage: build-test script: - cd ./api-server - yarn install --silent - npm run build - npm run test artifacts: paths: - api-server/node_modules/ - api-server/build/ build-docker-image: image: docker:19.03.1 stage: build-docker-image services: - docker:19.03.1-dind script: - cd api-server # Login using a service account created for the CI build. Define SERVICE_ACCOUNT_KEY as a file variable in the CI settings - docker login -u _json_key --password-stdin https://gcr.io < $SERVICE_ACCOUNT_KEY - docker build -t gcr.io/$GCP_PROJECT_ID/$APP_NAME:$CI_COMMIT_SHA . - docker push gcr.io/$GCP_PROJECT_ID/$APP_NAME:$CI_COMMIT_SHA
Before you commit and push this you will need to:
- Create a service account named gitlab-ci and grant it the Storage Admin role so it can push to the Container Registry, and Cloud Run Admin so it can deploy to Run
- Download a key for the service account and copy the contents to a GitLab CI file variable named $SERVICE_ACCOUNT_KEY
- Enable Google Container Registry API for the project
This build pipeline gives us a baseline of a simple implementation, which took 6min 14seconds 445.5 MB
Weighing in at almost half a gigabyte means more time building, uploading and downloading the image. Let’s see how small we can get it…
Optimising the image
The default node image contains all the tools one might need in development which adds significantly to the image size.
Some lightweight options include the slim or alpine versions of the node image and Minimial Ubuntu which have the own differences i.e. pros and cons. Google has a lightweight distroless image we can use as our base image.
Lets update our Dockerfile to start with
Attempting to build this will result in the error, first as Yarn isn’t installed, and second there isn’t even a shell to run Yarn from.
Normally we would fix this issue by creating a multi-stage build. As the CI build runner is already a version controller script running in a docker container, we will do the required preparation in the CI build script.
In our gitlab-ci.yaml file update the
build-test stage to be the following:
build-test: stage: build-test script: - cd ./api-server - yarn install --frozen-lockfile - npm run build - npm run test - rm -rf node_modules - yarn install --frozen-lockfile --production artifacts: paths: - api-server/node_modules/ - api-server/build/
If we were using npm instead of yarn the equivalent script would be
script: - cd ./api-server - npm ci - npm run build - npm run test - npm prune --production
By declaring the node_modules and build artifact paths they will be available in the next build stage where we don’t have the node tools available to create them, as the
build-docker-image stage container is created from the ‘docker:19.03.1’ image.
If we wanted to run the Docker build locally we will need to include the node_module dir in the docker context. So remove the node_modules exclusion from the .dockerignore file. A common way to build a .dockerignore is to specify all the extra files we want to ignore, like tsconfig.json, src/ etc, or we can ignore everything by default and then specify what we want to include. .dockerignore
# Ignore everything ** # Then use overrides to specify what we include !/package.json !/yarn.lock !/build/** !/node_modules/**
Finally update our Dockerfile to
FROM gcr.io/distroless/nodejs COPY /package.json ./yarn.lock ./ COPY /node_modules ./node_modules COPY /dist ./dist CMD ["node", "dist/main.js"]
Optimise for caching the build layers
In our initial Docker file the line
COPY /node_modules ./node_modules is the equivalent replacement for
RUN yarn install, and
COPY /dist ./dist the equivalent for
COPY /src ./src.
The ordering of the operations has been changed to optimise the caching of the layers. The changes to the file system from each command are stored separately in layers and cached.
If a layer changes then it invalidates the cache of any layers after it.
The rule of thumb is to copy the files that change the least often first.
During the course of development the src/dist files would change more often than the package.json/node_modules, so we copy the dist folder last. You can find numerous articles online going into more details of the layers and caching if you wish the research this topic further.
Testing the production image
With the CI build succeeding and pushing our image to GCP Container Registry now we need to test if we actually built our image correctly. To quickly test it two options are:
- Build the docker image locally and run it with the docker CLI
- From Container Registry deploy the image to Cloud Run
When this image is deployed it failed with the error
Error: Cannot find module '/node' at Function.Module._resolveFilename ...
The distroless Node image is configured differently to other images like the base node image or the launcher.gcr.io/google/nodejs image, in that the CMD arguments will be passed directly to the node executable, instead of a shell.
So we need to update the final line of the Dockerfile to
If we redeploy again to Cloud Run we can see how long it takes for the new version to be deployed in the logs
From the ‘Cloud Run ReplaceService’ log message to the ‘Listening on port 8080’ message is 11 seconds.
Keeping our image size down means lets bytes to pull over the network, a smaller docker container to initialise, which increases the responsiveness of a cluster to serve requests when it needs to auto-scale up or replace a crashed node.
Instead of manually deploying to Cloud Run let have our CI build do the deployment. Add the
deploy-run stage and
RUN_REGION variable as per the following changes to gitlab-ci.yaml:
variables: GCP_PROJECT_ID: 'gcp-nodejs-docker' APP_NAME: 'api-server' RUN_REGION: 'us-central1' stages: - build-test - build-docker-image - deploy-run ... deploy-run: image: google/cloud-sdk:latest stage: deploy-run script: - gcloud auth activate-service-account --key-file=$SERVICE_ACCOUNT_KEY # Use APP_NAME as the Run SERVICE_NAME - gcloud run deploy $APP_NAME --project=$GCP_PROJECT_ID --image gcr.io/$GCP_PROJECT_ID/$APP_NAME:$CI_COMMIT_SHA --platform managed --allow-unauthenticated --region=us-central1 when: manual
--platform-managed means the Run service will be deployed on a Google managed cluster, with the alternative to deploy on your own Anthos service.
--allow-unauthenticated allows the service to be publicly accessible without authentication.
After manually running the stage we find this error:
Consulting the Cloud Run documentation we need to grant the Service Account User role to the Cloud Run service account.
or by the command line
gcloud iam service-accounts add-iam-policy-binding \ [PROJECT_NUMBER]-firstname.lastname@example.org \ --member="[SERVICE_ACCOUNT]" --role="roles/iam.serviceAccountUser"
You can simply re-run the failed
deploy-run stage in GitLab CI and then should see your Run service deployed with the URL to access and test the service.
Deploying to App Engine
Before you deploy the docker image to AppEngine you will need to:
- Enable the App Engine Admin API and App Engine Flex API.
- To your CI service account add the App Engine Admin and Cloud Build Editor roles.
- Initialise the App Engine service.
The new stage to add to the gitlab-ci.yaml file is:
stages: - build-test - build-docker-image - deploy-run - deploy-appengine ... deploy-appengine: image: google/cloud-sdk:latest stage: deploy-appengine script: - cd api-server - gcloud auth activate-service-account --key-file=$SERVICE_ACCOUNT_KEY - gcloud app deploy --project=$GCP_PROJECT_ID --image-url gcr.io/$GCP_PROJECT_ID/$APP_NAME:$CI_COMMIT_SHA when: manual
App Engine needs an
app.yaml deployment descriptor file, so create it in /app-server with the basics for now:
env: flex runtime: custom # Use the custom runtime and not nodejs as we build our own docker image. threadsafe: true manual_scaling: instances: 1
Push the changes and start the manual CI step once the build-docker-image stage has completed. Then you should now see your App Engine service.
Deploying to Kubernetes
The 3rd deployment option with our docker image is to GKE. Just as App Engine needs a deployment descriptor, so does Kubernetes.
Create the file
apiVersion: apps/v1 kind: Deployment metadata: name: api-deployment spec: selector: matchLabels: app: api-server replicas: 1 template: metadata: labels: app: api-server env: production app.kubernetes.io/name: api-server app.kubernetes.io/instance: api-server-1 app.kubernetes.io/version: "0.0.1" app.kubernetes.io/component: server app.kubernetes.io/part-of: api app.kubernetes.io/managed-by: gitlab-ci spec: containers: - name: api-server image: gcr.io/gcp-nodejs-docker/api-server:$TAG ports: - containerPort: 8080 env: - name: PORT value: "8080" - name: LOG_LEVEL value: "info" - name: NODE_ENV value: "production" --- apiVersion: v1 kind: Service metadata: name: api-service spec: type: LoadBalancer ports: - port: 80 targetPort: 8080 protocol: TCP selector: app: api-server
The labels starting with app.kubernetes.io/ come from the recommended labels be applied on every resource object.
This file is just a template to generate the actual deployment.yaml file as we need to replace the $TAG token. Our CI script will use
sed to output the deployment.yaml file with the required label.
variables: ... K8_CLUSTER_NAME: 'cluster-1' K8_CLUSTER_ZONE: 'us-central1-c' stages: ... - deploy-gke ... deploy-gke: image: google/cloud-sdk:latest stage: deploy-gke script: - cd api-server - gcloud auth activate-service-account --key-file=$SERVICE_ACCOUNT_KEY - gcloud container clusters get-credentials $K8_CLUSTER_NAME --project=$GCP_PROJECT_ID --zone $K8_CLUSTER_ZONE - sed 's/$TAG/'"$CI_COMMIT_SHA"'/g' deployment-template.yaml > deployment.yaml - kubectl apply -f deployment.yaml when: manual
Create a basic GKE cluster, either in the console or through the command line.
gcloud beta container clusters create cluster-1 \ --zone us-central1-c \ --release-channel regular
Next add the Kubernetes Engine Developer role to the gitlab-ci service account, so it has permission to perform the deployment.
Now we can push the changes and run the manual
deploy-gke CI stage. If all went to plan then you should see the service in the console, and the IP address you can access it with.
Deploying to Compute Engine
GCP also has support for deploying containers to compute engine instances. We won’t go into an example of this one, but will leave it up to you to implement and test if you so wish.
With docker becoming an ubiquitous way to package applications it’s becoming more important for application developers to have some understanding of it’s core concepts. Google Cloud Platform gives us a number of ways to deploy docker images, each with their own pricing model and other considerations. For example at the time of writing managed Cloud Run doesn’t support VPC connections. With this sample code you can continue to build out your own deployment pipeline to suit your requirements.
You can get the source code for this demo app at https://gitlab.com/apporchestra/gcp-nodejs-docker/