Use Docker tags to document the lifecycle of an image

Defreyne, Denis

A Docker image typically goes through a lifecycle:

built: the image is successfully built
tested: the image has successfully passed tests (e.g. integration tests, security tests)
deployed to staging: the image is deployed into the staging environment
deployed to production: the image is deployed into the production environment

It’s advantageous to add a new tag when the image progresses to a new stage in the lifecycle.

As an example, take the Git commit SHA 0b3da7f:

If an image is built from this Git commit, the Docker image can be tagged as built-0b3da7f.
Once tests have passed, it can be tagged additionally as tested-0b3da7f (note that previously added tags are kept).
Once deployed to staging, it can be tagged as deployed-staging-0b3da7f.
Once deployed to production, it can be tagged as deployed-production-0b3da7f.
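The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function names and the repository name example/app are made up for this note, and the second function merely builds the argument list for docker tag rather than running it.

```python
# Lifecycle stages, in the order an image passes through them.
STAGES = ("built", "tested", "deployed-staging", "deployed-production")


def lifecycle_tag(stage: str, sha: str) -> str:
    """Return the tag for a lifecycle stage, e.g. 'tested-0b3da7f'."""
    if stage not in STAGES:
        raise ValueError(f"unknown lifecycle stage: {stage!r}")
    return f"{stage}-{sha}"


def promote_command(repository: str, sha: str, from_stage: str, to_stage: str) -> list[str]:
    """Build the `docker tag` invocation that adds the next lifecycle tag.

    `docker tag` only adds another name for the image, so the tag for the
    earlier stage is kept.
    """
    source = f"{repository}:{lifecycle_tag(from_stage, sha)}"
    target = f"{repository}:{lifecycle_tag(to_stage, sha)}"
    return ["docker", "tag", source, target]
```

For example, promote_command("example/app", "0b3da7f", "built", "tested") yields the arguments for docker tag example/app:built-0b3da7f example/app:tested-0b3da7f. Running the command and pushing the new tag to a registry are left out of this sketch.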

Advantages for cleanup

The list of tags can be used to clean up stale Docker images. For each Docker image, the following can be checked:

Does the image have a tag like deployed-production-…, and is it one of the 10 most recent images tagged like deployed-production-…? If so, keep it.
Does the image have a tag like deployed-staging-…, and is it one of the 20 most recent images tagged like deployed-staging-…? If so, keep it.
Does the image have a tag like tested-…, and is it less than 30 days old? If so, keep it.
Is the image younger than 5 days? If so, keep it.
Otherwise, delete it.
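The retention rules above could be expressed roughly as follows. This is a sketch under assumptions: the Image record is a hypothetical stand-in for whatever a registry or docker image listing returns, and "most recent" is judged by image creation time, which the rules above do not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Image:
    """Hypothetical record of one image: its ID, creation time, and tags."""
    image_id: str
    created: datetime
    tags: tuple[str, ...]


def _has_prefix(image: Image, prefix: str) -> bool:
    return any(tag.startswith(prefix) for tag in image.tags)


def _most_recent(images: list[Image], prefix: str, count: int) -> set[str]:
    """IDs of the `count` newest images carrying a tag with this prefix."""
    tagged = [img for img in images if _has_prefix(img, prefix)]
    tagged.sort(key=lambda img: img.created, reverse=True)
    return {img.image_id for img in tagged[:count]}


def images_to_delete(images: list[Image], now: datetime) -> list[Image]:
    """Apply the four keep rules; whatever matches none of them is deleted."""
    keep = _most_recent(images, "deployed-production-", 10)
    keep |= _most_recent(images, "deployed-staging-", 20)
    for img in images:
        age = now - img.created
        if _has_prefix(img, "tested-") and age < timedelta(days=30):
            keep.add(img.image_id)
        if age < timedelta(days=5):
            keep.add(img.image_id)
    return [img for img in images if img.image_id not in keep]
```

The function only decides what to delete; actually removing the images is a separate step, which makes it easy to do a dry run first.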

Advantages for safety

When deploying a Git commit, the deploy script can construct the appropriate Docker tag to deploy. For example, to perform a staging deploy, concatenate tested- with the (short) Git SHA, e.g. tested-0b3da7f. Pull the Docker image with that tag and deploy it.

The deployment will fail if no Docker image with such a tag exists. In other words, the deployment will fail if the Docker image did not pass tests.
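A minimal sketch of that check, with made-up names: available_tags stands in for the tags that exist in the registry, and ImageNotTestedError is a hypothetical exception. In a real deploy script, a failing docker pull would likely play the same role.

```python
class ImageNotTestedError(RuntimeError):
    """Raised when no image with the expected tested-… tag exists."""


def staging_deploy_tag(sha: str) -> str:
    """Concatenate 'tested-' with the Git SHA, e.g. 'tested-0b3da7f'."""
    return f"tested-{sha}"


def resolve_staging_image(repository: str, sha: str, available_tags: set[str]) -> str:
    """Return the image reference to deploy, or fail if it was never tagged as tested."""
    tag = staging_deploy_tag(sha)
    if tag not in available_tags:
        raise ImageNotTestedError(
            f"no image tagged {tag!r} in {repository}; did it pass tests?"
        )
    return f"{repository}:{tag}"
```

With the tags {"built-0b3da7f", "tested-0b3da7f"}, the staging deploy of 0b3da7f resolves to example/app:tested-0b3da7f. With only built-0b3da7f present, it raises instead of deploying an untested image.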

Some cautionary notes:

Such a deploy script should probably use the full Git SHA rather than the abbreviated one used in the examples, since an abbreviated SHA can become ambiguous as the repository grows.
Hotfixes might need to be taken into account. While this added safety is nice, waiting for an image to be tagged as tested can make hotfixes too slow to deploy.

Use outside of Docker

This approach is not limited to Docker. While I have not done so myself, I imagine it could work for publishing builds on an S3 bucket or similar.

How it could work (assuming 0b3da7f is the Git commit SHA of the build, which itself is a .tar.gz file):

Create a bucket with directories builds and tags.
When a build is published, store it in builds/0b3da7f.tar.gz. Create the file tags/built-0b3da7f and store in it the string builds/0b3da7f.tar.gz.
When a build is tested, create the file tags/tested-0b3da7f and store in it the string builds/0b3da7f.tar.gz.
When a build is deployed to staging, create the file tags/deployed-staging-0b3da7f and store in it the string builds/0b3da7f.tar.gz.
When a build is deployed to production, create the file tags/deployed-production-0b3da7f and store in it the string builds/0b3da7f.tar.gz.
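To make the layout concrete, here is a rough sketch that uses a local directory as a stand-in for the bucket, since I have not tried this against S3 itself. The function names are hypothetical; with a real bucket, the file writes and reads would become object uploads and downloads.

```python
from pathlib import Path


def publish_build(bucket: Path, sha: str, archive: bytes) -> str:
    """Store the build as builds/<sha>.tar.gz and create the built-<sha> tag."""
    key = f"builds/{sha}.tar.gz"
    path = bucket / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(archive)
    add_tag(bucket, "built", sha)
    return key


def add_tag(bucket: Path, stage: str, sha: str) -> Path:
    """Create tags/<stage>-<sha> whose content is the path of the build."""
    tag_path = bucket / "tags" / f"{stage}-{sha}"
    tag_path.parent.mkdir(parents=True, exist_ok=True)
    tag_path.write_text(f"builds/{sha}.tar.gz")
    return tag_path


def resolve_tag(bucket: Path, stage: str, sha: str) -> Path:
    """Follow a tag file to the build it points at; fails if the tag is missing."""
    key = (bucket / "tags" / f"{stage}-{sha}").read_text()
    return bucket / key
```

A deploy would then call resolve_tag(bucket, "tested", sha), which raises FileNotFoundError when the build was never tagged as tested, mirroring the failed pull in the Docker case.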

To facilitate cleanup, it might be necessary to add the date to each tag, e.g. tags/deployed-production-20210222-0b3da7f, with 20210222 being the current date (February 22nd, 2021). With the date in the tag name, it becomes possible to parse out the date and delete stale builds.
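Parsing the date back out could look roughly like this. The pattern assumes tag names of the form stage-YYYYMMDD-sha, as in the example above; the helper names and the max_age_days threshold are made up for illustration.

```python
import re
from datetime import date, datetime, timedelta

# stage, then an 8-digit date, then a hexadecimal Git SHA
TAG_PATTERN = re.compile(r"^(?P<stage>[a-z-]+)-(?P<date>\d{8})-(?P<sha>[0-9a-f]+)$")


def parse_dated_tag(name: str) -> tuple[str, date, str]:
    """Split e.g. 'tags/deployed-production-20210222-0b3da7f' into its parts."""
    basename = name.rsplit("/", 1)[-1]  # drop a leading 'tags/' if present
    match = TAG_PATTERN.match(basename)
    if match is None:
        raise ValueError(f"not a dated tag: {name!r}")
    tagged_on = datetime.strptime(match["date"], "%Y%m%d").date()
    return match["stage"], tagged_on, match["sha"]


def is_stale(name: str, today: date, max_age_days: int) -> bool:
    """True if the date in the tag name is more than max_age_days before today."""
    _, tagged_on, _ = parse_dated_tag(name)
    return today - tagged_on > timedelta(days=max_age_days)
```

A cleanup job could list the tags directory, keep the builds referenced by tags that are not stale, and delete the rest.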

Note last edited November 2023.
