In my role as co-CEO of Trunk, I’ve had the opportunity to speak with many engineering leaders across the industry. This gives me insight into how engineering organizations manage their tech debt, the solutions they have used, and some lessons we can learn from them.
Monolith vs microservices
Let’s start with the monolith architecture. The explosion of microservices in software development practice would lead us to believe that monoliths should be avoided at all costs. I often hear from highly successful companies that in year 10 of operation they begin a massive project to break apart their Ruby or Python monolith. I have also seen smaller orgs, much earlier in their life cycles, break every piece of code into a microservice to avoid this 10-years-down-the-road problem.
I think a monolith is actually beneficial early in the life of a service. There is a huge advantage to building everything in a single box. Deployment is simpler. Versioning is simpler. You build and ship everything at once. When you need to scale you can easily deploy more instances of the monolith, assuming a reasonable sharding solution.
Premature optimization
The main lesson here is that starting with a monolith is neither good nor bad. Just because a monolith will one day need to be decomposed doesn’t mean that it isn’t good tech to take on. Multiple billion-dollar companies were built on the backs of a monolith. Building microservices at the start can be an exercise in premature optimization. If your startup is optimizing for scale that you want to one day have, then that is energy and time you aren’t spending on actually building features for customers. Needing to refactor in 10 years is a problem, but it’s a problem you want to have, because it means you found product-market fit, your customer base has scaled up, and you can now afford to fix the problem. Tech debt can be good!
I would argue that every line of code is essentially some form of tech debt. Languages fall out of favor. This year’s trendy web framework will be very stale in a couple of years. It’s easy to fall into the trap of building for the future with the right technology. We must remember that at the end of the day, customers do not care how the product was built. They don’t care what tech stack you used, how modular your architecture is, or what UI toolkit you built on top of. Customers only care how the product works. They care that your product is reliable, and that you can continue to innovate to make it better.
Debt has value
That last part, being able to innovate, will often be the driving force behind a project to pay down tech debt. We can look at tech debt like we would traditional financial debt: businesses use debt to create leverage; leverage to move faster than they could otherwise. Engineers do the same thing. We build prototypes that the product team is so excited about that the prototype quickly becomes the product and gets shipped out the door to happy customers. On the flip side, if we take on debt for a project that flops we can write off the debt without paying it back.
Now let’s consider some tech debt that must be paid down. The most common case is deprecated dependencies. Many teams had massive Python tech debt that had to be paid when they were forced to migrate from Python 2 to Python 3. This was tech debt that could no longer be deferred. Similarly, AWS and other cloud providers will sunset older versions of a service API, requiring their customers to upgrade.
Finally, our security tools are constantly screaming at us - for good reason - to upgrade the dependencies in our package.json, requirements.txt, and Cargo.toml files. These are all cases of tech debt that must be paid down ASAP when the pain becomes too great, but not before. Migrating a codebase the minute Python 3 arrived would have been a mistake. It was the correct decision to wait a while, just not too long. The challenge is knowing when to migrate, and keeping the code running in the meantime.
Summary
We need to pause and ask ourselves what actual benefits we are getting out of rewriting working code or optimizing early on by using the right technology. When your engineers are pushing to rewrite your service stack in Rust, because everyone loves Rust, or to migrate your tests from Jest to Mocha, stop and consider the actual value gained first.
As a basic rubric, I would say the most important tech debt to pay down is the rough spots in the system that are significantly slowing down your engineering cycles: things like flaky tests, slow build/deploy pipelines, and brittle scripts for checking and linting your codebase. In other words, any place where an engineer expects their computing system to work, but instead has to babysit it. That is the right place to start burning down your tech debt.
We just so happen to build a bunch of tools at Trunk that facilitate exactly that. Check replaces those brittle linter, formatting, and security scripts. Merge smooths and accelerates your build pipeline. And Flaky Tests (now in early access) lets you identify, quarantine, and fix buggy tests that slow down your engineers. I hope you’ll check out our tools and give us your feedback. Happy Coding!