
package management and the devops revolution

All this talk over the last few years about devops skirts around one particularly cynical take on traditional system administration: that it slows product development down. That error-proofing everything is unnecessary, that it's pure overhead and catastrophizing, at the expense of shipping new features. For many, certain failure modes aren't worth preventing, and a whole lot of these tradeoffs are being made in the deployment space in order to ship new features, faster.

I think the core of this comes down to package management.

This is, by far, the biggest point of friction I've experienced. Certainly there are other places where traditional system administration is being improved or automated -- like better monitoring, better provisioning, etc. But I think that developers are looking for a more progressive approach to package management while systems administrators are a little more conservative.

In the broadest of strokes, packaging is the act of taking software and making the installation process repeatable. In turn, this means we generate an artifact once and install it on a collection of servers, which means the state of each system is inspectable, which means we can ensure that every server is consistent. Tools like apt and yum are the usual things we think of -- and they're very mature package managers -- but even something like AMIs or tarballs could do the trick, depending on your needs.
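As a rough illustration of the "build once, install everywhere" idea, here's a minimal sketch: a hypothetical Python script that pushes a prebuilt .deb to a list of hosts over SSH and installs it with dpkg. The artifact name and host list are made up for the example; the point is that the build happens exactly once and every server gets the same bits.

    import subprocess
    import sys

    # Hypothetical inputs: a .deb built once on a build host, and the fleet to install it on.
    ARTIFACT = "myapp_1.2.3_amd64.deb"
    HOSTS = ["app01.example.com", "app02.example.com", "app03.example.com"]

    def install(host: str) -> None:
        # Copy the same artifact to every host...
        subprocess.run(["scp", ARTIFACT, f"{host}:/tmp/{ARTIFACT}"], check=True)
        # ...and install it; dpkg records exactly what's on the box, so the state is inspectable.
        subprocess.run(["ssh", host, "sudo", "dpkg", "-i", f"/tmp/{ARTIFACT}"], check=True)

    if __name__ == "__main__":
        failures = []
        for host in HOSTS:
            try:
                install(host)
            except subprocess.CalledProcessError:
                failures.append(host)
        if failures:
            sys.exit(f"install failed on: {', '.join(failures)}")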

It means that you're not re-compiling, re-building, re-executing your build steps on every single server. Putting aside the time it takes, you're ultimately multiplying the number of times you're doing an operation, increasing the likelihood of failure. (There's also a security angle here, in that some would argue that production hosts shouldn't have the "build" toolchain available: either installed or over the network. I personally agree with them, but I acknowledge there's tons of debate on this point.)
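To put a rough number on that multiplication (the figures here are illustrative assumptions, not measurements): if a from-source build succeeds 99% of the time, the chance that it succeeds on every host in a 200-machine fleet is only about 13%.

    # Back-of-the-envelope: repeating a fallible build on every host compounds the risk.
    # The 0.99 success rate and 200-host fleet are illustrative assumptions.
    p_build_success = 0.99
    hosts = 200

    fleet_wide_success = p_build_success ** hosts
    print(f"P(all {hosts} per-host builds succeed) = {fleet_wide_success:.3f}")  # ~0.134

    # Build once, install everywhere: only one build has to succeed.
    print(f"P(single build succeeds) = {p_build_success:.3f}")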

But it's slow and painful.

If you're relying on your upstream provider to give you packages, you're pretty much restricted to an out-of-date existence. Bugs are fixed in newer versions! Github is social coding! We add features to the libraries we use! Waiting on someone to give you packages is a losing proposition. When we optimize for multiple deploys a day, waiting a month or a year for your upstream to provide new functionality to you is a non-starter. If you want the latest and greatest -- and you often do -- you need to figure out how to package it yourself.

Using pip or gem or cpanm is easy. Taking that output, packaging it, figuring out how to make it relocatable, etc., isn't very easy. Making sure that pip installs the exact same thing every time isn't easy. (You might specify exact versions of all your dependencies, but do those dependencies do the same?) Packaging each individual module takes forever. Packaging your app with its dependencies is bulky. Tools like fpm get us a lot closer, but it's still an extra step.
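One concrete version of the "do those dependencies do the same?" problem: even with a pinned top-level requirement, the transitive tree can drift between installs. Here's a small sketch (the package names and version pins are made up; in practice you'd generate the pin set with pip freeze) that compares what's actually installed against an expected set of exact pins, using the standard library:

    # Sketch: verify that what pip actually installed matches an exact, known-good pin set.
    # The pins below are illustrative examples, not recommendations.
    from importlib import metadata

    EXPECTED = {
        "requests": "2.31.0",
        "urllib3": "2.0.7",      # transitive dependency -- pin it too, or it can drift
        "certifi": "2023.11.17",
    }

    def check_pins(expected: dict[str, str]) -> list[str]:
        problems = []
        for name, wanted in expected.items():
            try:
                installed = metadata.version(name)
            except metadata.PackageNotFoundError:
                problems.append(f"{name}: not installed")
                continue
            if installed != wanted:
                problems.append(f"{name}: have {installed}, want {wanted}")
        return problems

    if __name__ == "__main__":
        for problem in check_pins(EXPECTED):
            print(problem)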

When "npm install" works most of the time, it's easy to start relying on it all of the time. When you're already deploying multiple times a day, it's easy to not want to make any particular installation repeatable. When you can replace any machine and just re-run your setup script, it's easy to not trap and handle failures in that process. But resiliency and convergence mean more than just "if I try this enough times, eventually it will work".

Over the last decade, I think the pendulum has swung back and forth from "package everything as an rpm and write an init script" to "just run it out of your homedir with screen". I don't like swinging between the extremes -- neither is great. Finding a healthy balance, somewhere in between, is what devops is about. It's about finding a way to move fast while still maintaining a rapidly-deployable, operable system.
