Making builds reliable and reproducible
1. Introduction
The goal of this blog post is to present a set of practices that significantly improves the reliability and reproducibility of software builds. The common theme of these practices is to introduce proper versioning of all dependencies, from packages to tools and compilers.
The post is structured as follows. In Section 2 we present the problem of builds becoming unreliable over time through some common scenarios. In Section 3 we discuss how to lock the version of your package dependencies, then in Section 4 we discuss how to make your build scripts self-contained, and finally we look at how to make the compiler and runtime explicit in Section 5. We conclude the post in Section 6.
2. The problem of unreliable and irreproducible builds
As a software developer, you will encounter a certain class of bugs that tends to slowly sneak its way into your code base when working on software that has one or more of these characteristics:
- written by more than one person,
- over a longer period of time, and
- in several phases.
While most bugs - both syntactic and semantic ones - are usually caused by the code written by you or your colleagues, this other class of bugs cannot immediately be caught by your compiler or another test case. Specifically, the class of bugs we are talking about are the ones that occur when a code base is compiled, tested, or built on different machines or at different points in time, where either:
- the dependencies installed by the package manager,
- the external tools used to test the code, or
- the compiler used to compile the code,
have slowly drifted from the versions originally used to build and test the project. A bug of this kind usually manifests itself as either:
- a – potentially very subtle – semantic bug,
- a failing test, or
- a broken build.
To illustrate this problem, we use a (slightly exaggerated) example from the real world, where we actually managed to experience all three types of bugs at different points in time in the same codebase. The concrete project was a web app that was:
- implemented in AngularJS using bower to handle dependencies, and where
- build and test scripts were written in node.js using grunt to define and manage build steps, and where npm was used to handle the dependencies of the build tool chain.
The cause and subsequent effect of each of the three types of bugs played out as follows:
- The first bug appeared because several of the AngularJS dependencies had been declared with a bounded major version number, ^1.2, rather than a specific version number, 1.2.3, or at least a bounded minor version number, ~1.2, in the bower.json file. The effect was that over time these dependencies started to cause bugs at runtime, such as the classic "undefined is not a function", or failing tests, as these AngularJS dependencies started to introduce (unintended) breaking changes in their newer minor versions, which would then be installed the next time someone, like our build server, built the project from scratch.
- The second bug appeared when the grunt CLI, grunt-cli, introduced breaking changes in its newest version, as it was still running version 0.x.y, and it subsequently turned out that each developer had installed their own version of the grunt CLI globally on their machine with npm -g install grunt-cli, rather than use the one specified in the project’s package.json. Given that we used grunt for building, testing, and packaging our code, and a few other things, it was no surprise that the immediate effect was that our grunt tasks started breaking left and right, making us unable to build our AngularJS web app.
- The third and final bug appeared when node.js introduced a breaking change in their runtime API, on which one of our dependencies in package.json depended. The effect was that when we tried to run our build script, after having updated the node.js runtime, the script printed out the not so helpful error message "function not defined in nodejs version 8.0",[1] with no indication as to which function it was trying to invoke.
In each of the three cases above, the inevitable solution was to spend a non-trivial amount of time doing detective work, trying to figure out which package, tool, and runtime version, respectively, could successfully build and test the project.[2]
The causes of the three bugs above can be boiled down to:
- Relying on a loosely defined version of a package,
- relying on a globally installed version of a tool, and
- relying on a globally installed version of a compiler or runtime environment.
These problems are all symptoms of the “Works on my machine” lifestyle and can have a frustrating impact on your colleagues, your build server, and your future self, as they steal a non-trivial amount of time when they appear at some random point in the future (usually close to a deadline) where you may not have that time to spend tracking down the cause.
In the next three sections we look at concrete practices that can help prevent this class of bugs, while also discussing how different languages handle this issue.
3. Locking down dependencies
As mentioned in the previous section, the first (and most obvious) cause of dependency-related bugs is “relying on a loosely defined version of a package”. In the AngularJS case, the problem arose because we specified our dependencies in the bower.json file with just a bounded major version, using the ^ notation, which then broke the application when angular version 1.5 was released with breaking changes. Fixing the issue is simple: specify a concrete version, 1.4.0, or a bounded minor version,[3] ~1.4, for each dependency.
Unfortunately, this does not protect us against the case where one of our dependencies has defined one of its own dependency versions too loosely.
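To make the difference between the notations concrete, here is a minimal sketch of how ^ and ~ ranges admit newer versions. It is a simplification and not the resolver that bower or npm actually use: it assumes plain major.minor.patch versions with a major version of at least 1, and the function names are made up for this illustration.

```javascript
// Simplified sketch of how caret (^) and tilde (~) ranges admit newer versions.
// Assumption: versions are plain "major.minor[.patch]" with major >= 1; real
// resolvers also handle 0.x versions, pre-releases, wildcards, and more.

// Parse "1.4" or "1.4.2" into [major, minor, patch]; missing parts become 0.
function parseVersion(text) {
  const parts = text.split(".").map(Number);
  while (parts.length < 3) parts.push(0);
  return parts;
}

// Compare two parsed versions; returns a negative, zero, or positive number.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Does `version` satisfy `range`, where range is "^x.y[.z]", "~x.y[.z]", or exact?
function satisfies(version, range) {
  const v = parseVersion(version);
  const operator = range[0];
  if (operator !== "^" && operator !== "~") {
    return compareVersions(v, parseVersion(range)) === 0;
  }
  const lower = parseVersion(range.slice(1));
  // ^ keeps the major version fixed, ~ keeps both major and minor fixed.
  const upper =
    operator === "^" ? [lower[0] + 1, 0, 0] : [lower[0], lower[1] + 1, 0];
  return compareVersions(v, lower) >= 0 && compareVersions(v, upper) < 0;
}

console.log(satisfies("1.5.0", "^1.4")); // true: a new minor version slips in
console.log(satisfies("1.5.0", "~1.4")); // false: only patch updates allowed
console.log(satisfies("1.4.9", "~1.4")); // true
console.log(satisfies("1.5.0", "1.4.0")); // false: exact version only
```

The first line of output is the situation from our AngularJS project: a range such as ^1.4 quietly accepts 1.5.0 the next time the project is installed from scratch.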
To illustrate the problem of locking down nested dependencies, we shift our focus from the bower package manager to the package.json file and the npm package manager. Going back to the AngularJS project, we now specify each of our dependencies with a concrete version number.
Unfortunately, it turns out that the jasmine-reporters dependency also uses ^ when specifying some of its own dependencies.
Fortunately, to help fix the problem of locking down the versions of nested dependencies, newer versions of npm have introduced the concept of a lock file, package-lock.json, which records the exact versions of all dependencies, both direct and nested, when running npm install.
This also makes it less risky to use the ~ notation when specifying dependencies, as the package-lock.json file will contain information about the exact versions actually installed.
Not only does specifying exact versions and using lock files make your builds more reliable, it also makes it easier to update versions, as you can see the whole diff of changed (nested) versions in the package-lock.json file. Likewise, it also makes your builds more secure, as you can pinpoint exactly which versions you are running on your server(s) in the rare case where an npm package becomes compromised.
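As a rough illustration of pinpointing installed versions, the sketch below lists the exact versions recorded in an already parsed package-lock.json. It assumes the layout used by lockfileVersion 2 and 3, where a top-level packages object maps install paths to entries with a version field. The function name, the sample content, and the nested package name are hypothetical.

```javascript
// Sketch: list the exact versions recorded in a parsed package-lock.json.
// Assumption: lockfileVersion 2 or 3, where "packages" maps install paths such
// as "node_modules/foo" to entries that carry the resolved "version".
function lockedVersions(lock) {
  const result = {};
  for (const [installPath, entry] of Object.entries(lock.packages || {})) {
    if (installPath === "") continue; // "" is the root project itself
    // Nested installs look like "node_modules/a/node_modules/b"; keeping the
    // full path keeps two copies of the same package distinguishable.
    result[installPath] = entry.version;
  }
  return result;
}

// Hypothetical lock file content; names and versions are illustrative only.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "example-app", version: "1.0.0" },
    "node_modules/jasmine-reporters": { version: "2.3.2" },
    "node_modules/jasmine-reporters/node_modules/example-nested-dep": {
      version: "0.5.1",
    },
  },
};

console.log(lockedVersions(sampleLock));
```

In a real project the input would come from something like JSON.parse(fs.readFileSync("package-lock.json", "utf8")), and the resulting list can be compared against a security advisory.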
Lock files are not just a JavaScript-specific concept, but can be found in many other languages, such as Elixir, where dependencies are specified in a mix.exs file. There, a dependency can be either specific or bounded, ~>, and is then resolved to a specific version in the accompanying mix.lock lock file.
Finally, some languages take it a step further, like Elm, where the package file, elm.json, requires exact versions of all dependencies, and the package manager enforces semantic versioning of packages, such that:
- changing the type signature of an already exposed type or function forces the package author to update the major version,
- exposing a new type or function forces the package author to update the minor version,
- making changes to types or functions not exposed via the package API forces the package author to update the patch version.
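The three rules above can be sketched as a small decision function. This is only an illustration of the rule as stated here, not how the Elm package manager is implemented; the shape of the changes object and both function names are assumptions made for this example.

```javascript
// Sketch of the version-bump rule listed above, not Elm's actual implementation.
// `changes` is a hypothetical summary of the API diff between two releases.
function requiredBump(changes) {
  if (changes.changedExposed > 0) return "major"; // existing signatures changed
  if (changes.addedExposed > 0) return "minor"; // new types or functions exposed
  return "patch"; // only internals changed
}

// Apply the bump to a "major.minor.patch" version string.
function bumpVersion(version, bump) {
  const [major, minor, patch] = version.split(".").map(Number);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

const bump = requiredBump({ changedExposed: 0, addedExposed: 2 });
console.log(bump, bumpVersion("1.4.2", bump)); // minor 1.5.0
```

Because the bump is derived from the API diff rather than chosen by the author, a consumer can trust that a minor or patch update does not change the signatures they already use.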
As demonstrated in this section we can achieve a higher level of confidence in our builds by diligently specifying the exact version of all of our package dependencies, thus ensuring that we will not be caught off guard by breaking changes in a dependency. In the next section we examine how to make our builds more reliable by making build and test scripts self-contained.
4. Making build scripts self-contained
Having learned how to properly version all of our dependencies, the next step is to make sure that our test and build scripts are not “relying on a globally installed version of a tool” but on the properly versioned tools that we have specified, i.e. making sure that everything needed to build the source code and run the tests on another machine is properly specified.
If we return to our AngularJS example, we originally had a set of scripts defined in our package.json file that exhibited the second cause of dependency-related bugs, as they referred to the globally installed versions of both bower and grunt (and technically also npm, but we address that in Section 5), which resulted in tests failing and the build pipeline breaking.
In order to fix the issue, we do the following:
- Explicitly add bower and grunt to our list of dependencies in the package.json file,
- and make our scripts use the local versions installed in the node_modules folder of the project,
thus ensuring that whenever npm test is executed, be it on our own development laptop or the company build server, the same version of bower and grunt is used.
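One way to picture "use the local versions" is the sketch below, which resolves a tool from the project's own node_modules/.bin folder (where npm places the executables of locally installed packages) and fails loudly when the tool is missing. The helper names are hypothetical, and the sketch ignores platform details such as the .cmd wrappers used on Windows.

```javascript
const path = require("path");
const fs = require("fs");

// Sketch: resolve a build tool from the project's own node_modules/.bin folder
// instead of relying on whatever happens to be installed globally.
function localToolPath(projectRoot, tool) {
  return path.join(projectRoot, "node_modules", ".bin", tool);
}

// Returns the local path, or throws with a hint when the tool is not installed.
function requireLocalTool(projectRoot, tool) {
  const candidate = localToolPath(projectRoot, tool);
  if (!fs.existsSync(candidate)) {
    throw new Error(
      `${tool} is not installed locally; add it to package.json and run npm install`
    );
  }
  return candidate;
}

console.log(localToolPath(process.cwd(), "grunt"));
```

Scripts started through npm run already have node_modules/.bin on their PATH, so a helper like this is mainly useful for custom build scripts that launch tools themselves.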
While this issue of having to use 3rd party tools for building and testing is especially prevalent for JavaScript projects, it can also occur for non-JavaScript projects like Elm, where a project still might rely on tools installed via npm, like elm-test, in which case it is also worth considering adding proper versioning of any tool dependencies and making the test scripts self-contained.
As demonstrated in this section, we can achieve an even greater level of confidence in our builds by combining the lesson from the previous section, of versioning our dependencies, with the practice of locally installing all needed 3rd party tools in our current project folder, thus making our build and test scripts self-contained. In the next section we take the final step and look at how to lock down the compiler and runtime environment to further improve the reliability and reproducibility of our builds.
5. Making compiler and runtime versions explicit
The final step on our road towards more reliable and reproducible builds is to avoid “relying on a globally installed version of a compiler or runtime environment” by making these explicit in each of our projects.
As mentioned in Section 2, one of the problems we faced in the AngularJS project was that our build script would all of a sudden print the error message "function not defined in nodejs version 8.0" and exit without further explanation. While the error message did indicate that the error had been introduced by changing the version of the nodejs runtime, it did not indicate which function it was trying to invoke that was now gone.
Fortunately, this issue can also be fixed by introducing proper versioning to our project. Specifically, we introduced the asdf tool, which is a “CLI tool that can manage multiple language runtime versions on a per-project basis” similar to what nvm does for nodejs, gvm does for go, and rbenv does for ruby.
Thus, using asdf we could make sure that all of our different projects across all different machines, both development and CI, would be using the exact same versions of compilers and runtime environments when running our build and test scripts.
Without going into the practical details of how to set up asdf, the general idea of asdf is to add a .tool-versions file to your project that contains the version number(s) of the runtime(s) and compiler(s) needed to start the project. This is then enforced by installing a shim in the user’s favorite shell that picks the proper runtime or compiler based on the content of the .tool-versions file, whenever the user makes a call to such a runtime or compiler. In the specific case of our AngularJS project, the .tool-versions file simply contains the needed nodejs version, which is then installed onto the user’s machine when running asdf install before running npm install or similar.
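To give a feel for what such a file expresses, the following sketch parses the text of a .tool-versions file and checks the running node.js version against it. It assumes one tool name followed by a version per line, with # starting a comment. The function names and the sample content (including the version numbers) are hypothetical, and this is not how asdf itself is implemented.

```javascript
// Sketch: parse the contents of a .tool-versions file and compare the pinned
// nodejs version with the running one. Assumption: one "<tool> <version>"
// entry per line, with "#" starting a comment.
function parseToolVersions(text) {
  const versions = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments and whitespace
    if (line === "") continue;
    const [tool, version] = line.split(/\s+/);
    versions[tool] = version;
  }
  return versions;
}

// True when no nodejs version is pinned or the pinned version matches exactly.
function checkNodeVersion(toolVersionsText, actualVersion) {
  const expected = parseToolVersions(toolVersionsText).nodejs;
  return expected === undefined || expected === actualVersion;
}

// Hypothetical file content; the version numbers are illustrative only.
const sample = "nodejs 10.15.3\nelm 0.19.0  # frontend compiler\n";
console.log(parseToolVersions(sample));
console.log(checkNodeVersion(sample, process.versions.node));
```

A check like this could run at the top of a build script as an extra safety net on machines where the asdf shim is not installed.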
Besides the AngularJS project, we have also used this .tool-versions technique to specify the versions of erlang and elixir in some of our microservices, and nodejs and elm for some of our newer frontends, as asdf supports a simple plugin system that makes it easy to support new languages and command-line tools using the .tool-versions file.
In this section, we have shown how to properly version our compiler and runtime dependencies using a .tool-versions file and how to enforce these versions using asdf. In the next section, we conclude this post.
6. Conclusion
In this blog post we presented a set of practices for significantly improving the reliability and reproducibility of software builds. These practices focused on:
- making all dependency versions explicit and freezing these versions in a lock file,
- including any needed build tools as properly versioned dependencies, and
- making any compiler and runtime needed to build or test a project explicit using a .tool-versions file and enforceable by asdf.
A final note: making builds truly reliable is not a trivial task and therefore the above principles are not an exhaustive list of what can be done to achieve this goal. Most notably, there is also the large topic of running scripts and servers in containers such as docker, which we have not covered.
[1] The specific wording of the error message escapes me but it was about as helpful as the example message above.
[2] Note that the time spent doing detective work can dramatically increase if more than one of the three types of bugs occur simultaneously.
[3] This presumes that your dependencies do not introduce breaking changes in their patch versions.