One of the main reasons we built CODE-RADE was to ensure that only the good stuff gets into the application repository. We want site maintainers and administrators to rest assured that mounting this repository is safe and reliable, which is why we place such a high premium on testing.
The CODE-RADE pipeline
We tentatively compile things based on scripts in a repository, then test them in an ephemeral but reproducible environment. Since CODE-RADE builds each component atomically, this environment needs to be augmented with the products of previous builds. Those artifacts must therefore be persisted across builds: once one part of the repository has been thoroughly tested, we keep it as a building block for whatever comes downstream, so that we don't have to rebuild the entire chain from scratch.
This is easy if you're on a single machine: just put things in a dedicated directory and refer to it via an environment variable. If you're working with containers to provide a clean build environment, you can use the data volume pattern[^1], which also works well when you're building on a single site. Data in the CI build and test phases are persisted using data volumes, as shown in Figure 1.
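A minimal sketch of the data volume pattern with the Docker CLI might look like the following; the volume and image names are illustrative, not the actual CODE-RADE configuration:

```shell
# Create a named volume to hold build artifacts across container runs
# (volume and image names here are purely illustrative).
docker volume create code-rade-artifacts

# Build step: mount the volume so compiled products outlive the container.
docker run --rm -v code-rade-artifacts:/artifacts build-image ./build.sh

# Test step: a fresh, clean container sees the same persisted artifacts.
docker run --rm -v code-rade-artifacts:/artifacts test-image ./test.sh
```

Because the volume lives outside any single container, each ephemeral build container starts clean while still linking against previously tested products.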
However, what about the case where you want to build “in the cloud”, or in geographically distributed places?
Building in the cloud
The heart of CODE-RADE is the CVMFS repository, which acts as a content distribution network for the quality products we build. How these products get into that repository is not prescribed; we make only the qualitative constraint that they need to be tested. Typically, this testing is done by automated actions triggered by changes to a change-controlled repository. The tool that does this is usually referred to as a "continuous integration" (CI) server - in our case Jenkins, one of the most widely used, but in principle any one would do. This also means that our Jenkins server at ci.sagrid.ac.za does not, by design, have a monopoly on the right to push products into the CVMFS repository! Any participating CI server could do this, as long as there is some kind of co-ordination between them regarding who gets to build which job, and whether the tests are coherent.
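For readers unfamiliar with CVMFS publishing, the push from a CI server to the repository roughly amounts to a transaction on the release manager (stratum 0). This is a hedged sketch - the repository name and paths are assumptions, not the actual CODE-RADE layout:

```shell
# On the CVMFS release manager node. Repository name and target path
# are illustrative placeholders.
cvmfs_server transaction code-rade.example.org

# Copy the tested build products into the repository tree.
cp -r build-output/. /cvmfs/code-rade.example.org/apps/

# Publish the transaction, making the new content visible to all clients.
cvmfs_server publish code-rade.example.org
```

Whichever CI server performs these steps, the clients mounting the repository see a single coherent view, which is what makes the "any participating CI server" model feasible.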
We are reaching the limits of what a single build host can do. Previously, we would build artifacts and tar them up into tarballs, which could be picked off the shelf, unpacked into the build environment and linked against - keeping that approach could be one option.
Another would be to re-use the CVMFS infrastructure and have a dedicated CI repository, which could be mounted just as in deployment.
Both of these approaches should be used in subsequent releases of the CODE-RADE infrastructure.
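The tarball option is simple enough to demonstrate end to end. The sketch below simulates packaging a build's products and unpacking them into a fresh build environment; all file and directory names are illustrative:

```shell
set -euo pipefail

# Simulate the "tarball on a shelf" option: package one build's artifacts...
mkdir -p artifacts/lib
echo "compiled product" > artifacts/lib/libexample.so
tar czf artifact-bundle.tar.gz -C artifacts .

# ...then unpack them into a clean build environment on another host,
# ready to be linked against by the next build in the chain.
mkdir -p build-env
tar xzf artifact-bundle.tar.gz -C build-env
ls build-env/lib/libexample.so
```

In a distributed setup, the tarball would of course travel via shared storage or HTTP rather than the local filesystem, but the pick-up and unpack steps are the same.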
References and Footnotes
[^1]: Actually, we're using the now-deprecated "data container" pattern and still need to move to the data-volume pattern. Docker is not for the slow, yo.