Automated Software Maintenance
This is a guest post by Kevin Burnett, DevOps Lead at Rosetta Stone. |
Have you experienced that thing where you make a change in an app, and when you go to check on the results of the build, you find an error that really doesn’t seem relevant to your change? And then you notice that your build is the first in over a year. And then you realize that you have accidentally become the subject matter expert in this app.
You have no clue what change caused this failure or when that change occurred. Did one Jenkins agent become a snowflake server, accruing cruft on the file system that is not cleaned up before each build? Did some unpinned external dependency upgrade in a backwards-incompatible fashion? Did the credentials the build plan was using to connect to source control get rotated? Did a dependent system go offline? Or - and I realize that this is unthinkable - did you legitimately break a test?
Not only is this type of archaeological expedition often a bad time for the person who happened to commit to this app ("No good deed goes unpunished"), but it’s also unnecessary. There’s a simple way to reduce the cognitive load it takes to connect cause and effect: build more frequently.
One way we achieve this is by writing scripts to maintain our apps. When we
build, the goal is that an equivalent artifact should be produced unless there
was a change to the app in source control. As such, we pin all of our
dependencies to specific versions. But we also don’t want to languish on old
versions of dependencies, whether internal or external. So we also have an
auto-maintain
script that bumps all of these versions and commits the result.
I’ll give an example. We use docker to build and deploy our apps, and each app
depends on a base image that we host in a docker registry. So a Dockerfile
in
one of our apps would have a line like this:
FROM our.registry.example.com/rosettastone/sweet-repo:jenkins-awesome-project-sweet-repo-5
We build our base images in Jenkins and tag them with the Jenkins $BUILD_TAG
,
so this app is using build 5 of the rosettastone/sweet-repo
base image.
Let’s say we updated our sweet-repo
base image to use ubuntu 16.04 instead of 14.04
and this resulted in build 6 of the base image. Our auto-maintain script takes
care of upgrading an app that uses this base image to the most recent version.
The steps in the auto-maintain script look like this:
-
Figure out what base image tag you’re using.
-
Find the newest version of that base image tag by querying the docker registry.
-
If necessary, update the
FROM
line in the app’sDockerfile
to pull in the most recent version.
We do the same thing with library dependencies.
If our Gemfile.lock
is referencing an old library, running auto-maintain
will update things.
The same applies to the Jenkinsfile
for each app. If we decide to implement a new policy where we
discard old builds, we update auto-maintain
so that it will bring each app into
compliance with the policy, by changing, for example, this Jenkinsfile
:
pipeline {
agent { label 'docker' }
stages {
stage('commit_stage') {
steps {
sh('./bin/ci')
}
}
}
}
to this:
pipeline {
agent { label 'docker' }
options {
buildDiscarder(logRotator(numToKeepStr: '100'))
}
stages {
stage('commit_stage') {
steps {
sh('./bin/ci')
}
}
}
}
We try to account for these sorts of things (everything that we can) in our
auto-maintain
script rather than updating apps manually, since this reduces the
friction in keeping apps standardized.
Once you create an auto-maintain
script (start small), you just have to run it.
We run ours based on both "actions" and "non-actions." When an internal library
changes, we kick off app builds, so a library’s Jenkinsfile
might look like
this:
pipeline {
agent { label 'docker' }
stages {
stage('commit_stage') {
steps {
sh('./bin/ci')
}
}
stage('auto_maintain_things_that_might_be_using_me') {
steps {
build('hot-project/auto-maintain-all-apps/master')
}
}
}
}
When auto-maintain
updates something in an app, we have it commit the change
back to the app, which in turn triggers a build of that app, and—if all is
well—a production deployment.
The only missing link then for avoiding one-year build droughts is to get around
the problem where auto-maintain
isn’t actually updating anything in a certain app.
If no dependencies are changing, or if the technology in question is not
receiving much attention, auto-maintain
might not do anything for an
extended period of time, even if the script is run on a schedule using
cron
. For those cases, putting
a cron
trigger in the Pipeline for each app will ensure that builds still happen periodically:
pipeline {
agent { label 'docker' }
triggers {
cron('@weekly')
}
stages {
stage('commit_stage') {
steps {
sh('./bin/ci')
}
}
}
}
In most cases, these periodic builds won’t do anything different from the last
build, but when something does break, this strategy will allow you to decide
when you find out about it (by making your cron @weekly
, @daily
, etc)
instead of letting some poor developer find out about it when they do
something silly like commit code to an infrequently-modified app.
Kevin will be
presenting
more on this topic at
Jenkins World in August,
register with the code |