NeverSawUs

This talk is a gold mine:

  • Impress your friends by casually using "afferent" and "efferent" in conversation
  • Don't miss "Reach for the banana, get the gorilla", which might be my new favorite turn of phrase. (Immediately applicable, too: at $dayjob today, we're porting a WordPress server to the new Terraform-based infrastructure setup. We only use WordPress for its WYSIWYG editor!)
  • Discover the one weird reason bash uses fi to close if statements.

Some links I drew from the talk, which I'll be reading (or re-reading) in the coming week:


My day job involves paying down a small mountain of tech debt: the previous infrastructure team was laid off in January, shortly before COVID was a going concern. (Ha, at the time I was worried work would stress me out!)

So, since early February I've been pair programming with ceej on a daily basis. She's observed, accurately, that left unchecked we will gleefully burn each other out by working too much -- we both find the work fun, and ratchet each other up into working more.

Looking back at the work we've gotten done in the last 4 months, I'm both impressed and a little scared. Time started turning into soup before COVID, if I'm being honest. We started by ditching our previous attempt at a new deploy system, which was stymied by, err, organizational problems.

We wrote a deployment system inspired by npm's deploy system -- a git flow that acts on pushes to deploy/<cluster name>. Along the way I've had to learn a lot about ASP.NET and Windows Server, Terraform, and GitHub Actions. We've been writing a lot of Bash, which is handily one of my favorite languages, and a little bit of Rust for tooling. I'm happy with the result! Deploys used to take thirty minutes, sometimes an hour or more. Now they take minutes. Our infrastructure costs a third of what it used to.
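To make the "acts on pushes to deploy/<cluster name>" idea concrete, here is a minimal sketch in Python (not the Bash and Rust we actually wrote) of the first step such a flow needs: deciding whether a pushed ref is a deploy branch and, if so, which cluster it names. The function name and the "staging" cluster are made up for illustration; the only outside assumption is that the CI system hands you the full ref, the way GitHub Actions does with refs/heads/<branch> on a push.

```python
DEPLOY_PREFIX = "refs/heads/deploy/"


def cluster_for_ref(ref):
    """Return the cluster name for a deploy branch ref, or None otherwise.

    Hypothetical helper: the real system's naming rules are not shown here.
    """
    if not ref.startswith(DEPLOY_PREFIX):
        return None  # an ordinary branch push, nothing to deploy
    cluster = ref[len(DEPLOY_PREFIX):]
    return cluster or None  # "refs/heads/deploy/" alone names no cluster


if __name__ == "__main__":
    for ref in ("refs/heads/deploy/staging", "refs/heads/main"):
        print(ref, "->", cluster_for_ref(ref))
```

Everything after that (building, shipping, rolling instances) hangs off the cluster name, which is what makes "git push" a workable deploy interface.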

Really, our success here is down to breaking down the wall between infrastructure and developers and adopting a (now-classic) devops approach. This let us identify wins at the level of application changes by observing the behavior of the system in the real world.

To illustrate: a particularly pernicious one-line bug in a C# dependency injector, coupled with a lack of communication between developers and infrastructure, resulted in our system maintaining 40,000 Redis connections at a time, forcing us onto the most expensive ElastiCache instance. The wall that existed between devs and infrastructure kept either side from identifying the problem: we were creating a new Redis connection pool on every request to any endpoint. The cause? The Redis pool manager was attached to the application as a "scoped lifestyle" dependency -- that is, instantiated once per request -- and it was indirectly depended upon by every endpoint. Changing the lifestyle of the object fixed the bug and allowed us to drop to a reasonably sized ElastiCache instance. Sorry, AWS.
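For anyone who hasn't been bitten by this, here is a toy sketch of the failure mode. It is Python, not the real C#, and the Container, RequestScope, and RedisPool classes are invented for illustration rather than taken from any actual injector library. It only assumes the usual meaning of the two lifestyles: "scoped" builds one instance per request, "singleton" builds one instance for the whole application.

```python
class RedisPool:
    """Stand-in for an expensive connection pool; counts how many get built."""
    created = 0

    def __init__(self):
        RedisPool.created += 1


class Container:
    """Toy dependency injector supporting 'scoped' and 'singleton' lifestyles."""

    def __init__(self):
        self._factories = {}   # name -> (factory, lifestyle)
        self._singletons = {}  # name -> instance shared across all requests

    def register(self, name, factory, lifestyle):
        self._factories[name] = (factory, lifestyle)

    def begin_request(self):
        return RequestScope(self)


class RequestScope:
    """Holds the instances that live only as long as one request."""

    def __init__(self, container):
        self._container = container
        self._scoped = {}

    def resolve(self, name):
        factory, lifestyle = self._container._factories[name]
        if lifestyle == "singleton":
            cache = self._container._singletons
        else:
            cache = self._scoped
        if name not in cache:
            cache[name] = factory()
        return cache[name]


def pools_created(lifestyle, requests):
    """Simulate some requests and report how many pools were constructed."""
    RedisPool.created = 0
    container = Container()
    container.register("redis_pool", RedisPool, lifestyle)
    for _ in range(requests):
        scope = container.begin_request()
        scope.resolve("redis_pool")
        scope.resolve("redis_pool")  # a second dependent in the same request
    return RedisPool.created


if __name__ == "__main__":
    print("scoped:   ", pools_created("scoped", 1000))
    print("singleton:", pools_created("singleton", 1000))
```

With the scoped registration, the number of pools grows with the number of requests; with the singleton registration it stays at one. The fix is the single word passed to register, which is roughly the shape of the one-line change described above.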

It's probably the most value I've ever introduced at a company with a one-line change. So that's been good for my ego.

This has been both the easy part and the hard part, though -- hard, in that it took major organizational changes to get here; easy, in that "running server processes on an EC2 instance" is a fairly well-understood problem, even on Windows. Now, we're looking at rewriting a half-decade-old (or older?) C# monolith. We're nearly out of the woods with the infrastructure problem -- we've dutifully turned our technical debt into documentation debt, as we rewrite the system to match the needs of our current engineering team.

I can't shake the feeling that we've only just started hiking up the mountain, though. This application is layered, but each layer is attached to the next like velcro. Methods hand database IDs to each other in lieu of passing objects; methods may expect to be invoked by multiple callers for different purposes at different layers. I found an entire class dedicated to making integers positive. A critical delivery modality is, essentially, type-punned on top of a completely different delivery modality. ("What if you built a pizza delivery company by repurposing fleet planning software built for ice cream delivery trucks?")

Everyone I know needs a vacation, but no one can leave their house.