Docker & git flow — A symbiosis
In this blog post I want to outline how well docker can integrate with git flow.
The essential idea is that we craft docker tags from git branch names. Combined with a continuous delivery pipeline, this gives us a flexible yet low-effort versioning framework which allows our image consumers to control the tradeoff between stability and continuous change rollout.
First let’s start with a common property of docker image tags and git branches: Both are mutable.
Git branches can be altered through commits (technically we are moving the branch pointer to another commit).
For docker tags this might be less obvious, but unlike maven release versions, for example, they are not bound to the same image for their whole lifetime: You may publish a newly built docker image under a tag which already points to an existing image. For example, you may rebuild and publish your tomcat-application-image:1.0 under the same tag 1.0, even if that image already exists.
This choice of making docker tags mutable has been made deliberately. It allows implicit rollout of security fixes to all image consumers. Imagine that the developers of tomcat realize that they have an outdated system library in their image. The mutable tagging system allows them to republish a patched version of, let’s say, tomcat:9-jre8, and each consumer with a Dockerfile inheriting FROM tomcat:9-jre8 will automatically receive the patch the next time they rebuild their image.
Some sysadmins might feel a cold shiver down the spine hearing of implicit change rollout, and they have a valid point: Any implicit application update may lead to unexpected and hard-to-pinpoint errors in production.
The good news is: Docker still has the concept of immutable image references. Just use the digest to pin the exact binary blob you need.
docker pull baseimage:mutable-tag@sha256:immutabledigest123456789
Consequently, it’s up to the consumer of the image to decide: Do I prefer being implicitly up to date with security fixes or being absolutely certain of never introducing regressions? You may read more on mutable tags and how to work with them in the renovatebot article listed under further reading.
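To make the difference between a mutable tag and a pinned digest tangible, here is a minimal Python sketch that splits an image reference into its parts and reports whether it is pinned. The function names split_reference and is_pinned are made up for this post, and the parsing is deliberately simpler than what docker itself does.

```python
from typing import Optional, Tuple


def split_reference(ref: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split 'name[:tag][@digest]' into (name, tag, digest); missing parts are None.

    Hypothetical helper for illustration, not part of any docker tooling.
    """
    name_and_tag, _, digest = ref.partition("@")
    # Only a colon after the last '/' separates a tag. Earlier colons belong
    # to a registry host with a port, e.g. 'localhost:5000/app'.
    last_segment_start = name_and_tag.rfind("/") + 1
    colon = name_and_tag.find(":", last_segment_start)
    if colon == -1:
        name, tag = name_and_tag, None
    else:
        name, tag = name_and_tag[:colon], name_and_tag[colon + 1:]
    return name, tag, digest or None


def is_pinned(ref: str) -> bool:
    """A reference is pinned when it carries a digest."""
    return split_reference(ref)[2] is not None


print(split_reference("baseimage:mutable-tag@sha256:immutabledigest123456789"))
# ('baseimage', 'mutable-tag', 'sha256:immutabledigest123456789')
print(is_pinned("tomcat:9-jre8"))
# False
```

A check like this could run in a pipeline to warn about base images that are referenced by tag only, if your team prefers pinned references.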
Let’s get back to git flow. In git flow, as described by Vincent Driessen in 2010, we manage application releases in the form of dedicated branches in git. For example, we might have a branch named release/1.4. The interesting thing is: This makes releases inherently mutable. As soon as a new commit appears in a release branch, the already existing release is modified. Compared to immutable maven release versions this is a big shift. Driessen specifies that this is intended, but only for bugfixes. Here we have a remarkable parallel to docker image tags: They are mutable for the exact same reason.
All this becomes really meaningful when you think of continuous delivery. In continuous delivery we have the principle:
Each commit is a deployment candidate
We do not decide whether something is going to be deployed before the commit (by setting a new fixed version number in our build system). We decide it right before pushing it to production, once it has passed all other stages and steps: Compile, unit tests, static code analysis, isolated application tests, deployment on staging environments, integration tests, customer acceptance tests, and whatever else you might have established. Only when all of these have passed can we decide whether that binary makes it to production.
Many of the steps above can be automated easily for every git branch. You may use jenkins or gitlab pipelines to name just two of many options out there.
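The principle that every commit is a candidate until a stage rejects it can be sketched in a few lines of Python. This is only an illustration: run_pipeline and the stage checks are invented names, and a real jenkins or gitlab pipeline would express the same thing in its own configuration format.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that receives the commit id and says pass/fail.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failure.

    Returns (deployable, names of the stages that passed).
    """
    passed: List[str] = []
    for name, check in stages:
        if not check(commit):
            return False, passed
        passed.append(name)
    return True, passed


# Placeholder checks; real stages would compile, test and deploy.
stages: List[Stage] = [
    ("compile", lambda commit: True),
    ("unit tests", lambda commit: True),
    ("integration tests", lambda commit: commit != "bad"),
]

print(run_pipeline("abc123", stages))
# (True, ['compile', 'unit tests', 'integration tests'])
print(run_pipeline("bad", stages))
# (False, ['compile', 'unit tests'])
```

Note that nothing in the commit itself says whether it is a release. That decision falls out of the pipeline result.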
So what’s the link with docker tagging now? The trick is: You may use git branch information to craft the docker tags in your automated delivery pipeline. Just tell your build pipeline to start on each commit, build a docker image and publish it with the following tags:
- application-image:develop (on the develop branch)
- application-image:release_1.4 (on the release/1.4 branch)
- application-image:feature_refactor (on the feature/refactor branch)
- application-image:git-commit-revision-digest (always)
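The tagging scheme above can be sketched as a small Python helper. The names branch_to_tag and image_tags are hypothetical, and the example commit id is a placeholder. The only hard facts used are docker's tag rules: a tag may contain letters, digits, underscores, periods and dashes, must not start with a period or a dash, and is limited to 128 characters. That is why the slash in release/1.4 has to be replaced.

```python
import re
from typing import List


def branch_to_tag(branch: str) -> str:
    """Turn a git branch name into a string that is valid as a docker tag."""
    # Replace every character that is not allowed in a tag, notably '/'.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "_", branch)
    # A tag must not start with '.' or '-'; fall back to '_' if nothing is left.
    tag = tag.lstrip(".-") or "_"
    # Tags are limited to 128 characters.
    return tag[:128]


def image_tags(image: str, branch: str, commit: str) -> List[str]:
    """Return the two tags to publish for one build: branch tag and commit tag."""
    return [f"{image}:{branch_to_tag(branch)}", f"{image}:{commit}"]


print(image_tags("application-image", "release/1.4", "0a1b2c3"))
# ['application-image:release_1.4', 'application-image:0a1b2c3']
```

In a pipeline, the branch and commit would typically come from the CI system or from git itself, for example via git rev-parse HEAD for the commit id.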
By doing so you allow the consumers of your image to “sign up” for a specific release line of your application. They can choose between three kinds of reference, from most to least mutable:
- referencing the image by branch name (might be changed)
- referencing the image by git-revision (will almost never change)
- referencing the image by image-blob-digest (truly immutable)
A risk-sensitive Operations-team might decide to stick to image-blob-digests and accept the burden of frequent manual updates of that digest. On a staging environment you might choose to point to a release-branch. On a dev environment you might point to the develop-branch and risk that things break more frequently. Temporarily you could decide to switch to a feature-branch-tag to check how your changes interact with external dependencies within that environment.
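As a rough illustration of these three levels, the following Python sketch classifies an image reference by how stable it is. The function name reference_stability is made up, and it rests on one loud assumption: the pipeline publishes the commit tag as the full 40-character hexadecimal git revision. A branch whose name happens to look like that would be misclassified.

```python
import re


def reference_stability(ref: str) -> str:
    """Classify an image reference into the three levels from the list above."""
    if "@sha256:" in ref:
        return "immutable (image digest)"
    # Look for a tag only in the last path segment, so that a registry port
    # such as 'localhost:5000/app' is not mistaken for a tag.
    last_segment = ref.rsplit("/", 1)[-1]
    tag = last_segment.rsplit(":", 1)[-1] if ":" in last_segment else "latest"
    # Assumption: commit tags are full 40-character git revisions.
    if re.fullmatch(r"[0-9a-f]{40}", tag):
        return "almost immutable (git revision)"
    return "mutable (branch tag)"


print(reference_stability("application-image:develop"))
# mutable (branch tag)
print(reference_stability("application-image:" + "a" * 40))
# almost immutable (git revision)
print(reference_stability("application-image:release_1.4@sha256:immutabledigest123456789"))
# immutable (image digest)
```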
Another type of consumer might be someone extending FROM your docker image. That person has the same control over which release line they base their image on.
You may ask yourself whether this approach will lead to a flood of images. Indeed, it will potentially build a new docker image on each commit. However, docker’s layering saves a lot of disk space: If you push one image under two different tags, they both refer to the same physical blobs. Even if you modify one of the image layers to create a new image, the other layers won’t be stored twice. This greatly reduces the total size of the blob store.
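Why the blob store stays small can be shown with a toy model in Python. It is a heavy simplification and all names in it are invented: a real registry stores compressed layer archives plus manifests, but the core idea of addressing each layer by the hash of its content is the same.

```python
import hashlib
from typing import Dict, List


class BlobStore:
    """Toy content-addressed store: each distinct layer is kept once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}      # digest -> layer content
        self.tags: Dict[str, List[str]] = {}   # tag -> layer digests

    def push(self, tag: str, layers: List[bytes]) -> None:
        digests = []
        for layer in layers:
            digest = "sha256:" + hashlib.sha256(layer).hexdigest()
            # A layer that is already known is not stored a second time.
            self.blobs.setdefault(digest, layer)
            digests.append(digest)
        # Pushing an existing tag simply moves it to the new layer list.
        self.tags[tag] = digests

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


store = BlobStore()
base = b"os-layer" * 1000                        # 8000 bytes shared by all images
store.push("app:develop", [base, b"app-v1"])
store.push("app:commit-1", [base, b"app-v1"])    # same image, second tag
store.push("app:commit-2", [base, b"app-v2"])    # only the top layer differs

print(len(store.tags), len(store.blobs), store.stored_bytes())
# 3 3 8012
```

Three tags end up costing barely more than one image, because the large base layer is stored only once.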
In addition, the docker community has identified the need for powerful housekeeping tools, and vendors and open source developers are addressing it.
Let me add as a final note that none of this is new. Git flow dates back to 2010; “Continuous Delivery” has been a buzzword for at least that long, and people have long been trying to make it work with maven; docker has been around since 2013, which is fairly young, but it is based on mature and proven Linux kernel technology.
In this blog post I hope to have shown how docker and git flow can be combined to implement a flexible yet efficient approach to continuous delivery.
A side note: This way of tagging docker images may also be useful for branching models other than git flow, as long as your docker tags are generated from the branch names.
Further reading:
- philipphauer.de describes a very similar approach, without the part on git-flow branches. He raises the good question of whether the scheme is also suitable for libraries / reusable application components.
- renovatebot.com explains how docker’s mutable tags have caused many Node.js applications to stop working all of a sudden and how this can be prevented.
- cloudbees.com explains how to move away from traditional maven-semver-releases to versions based on the current build id (docker out of scope).
- axelfontaine.com: the inventor of flyway on the limitations of the maven-release-plugin and how to use the git-revision as a maven artifact version (docker out of scope).