Trunk - Trunk Check: Hold-the-Line

Building a universal linter means not only managing the discovery, installation, caching, and execution of the right tool for every file in your codebase, but freeing your organization to enable new rules and checks without bringing engineering to a standstill.

Picture this scenario:

Being a good citizen of your codebase, you decide to finally rename your Foo class to something less generic; a quick find-and-replace should do the trick. You commit, push, send it for code review, and pat yourself on the back.
You check the CI run for your change and discover that your team’s linters, static analyzers, and formatters are complaining about all sorts of pre-existing issues in the files that your innocent change touched.
You skim the pre-existing issues and realize you have two options: (1) give up on the change, or (2) fix all the pre-existing issues, which would mean increasing the scope of your change, going down a bunch of rabbit holes, and possibly breaking something.

Which choice do you think most engineers make?

Which choice do you think most engineers should make?

A lot of engineering teams run into this exact problem: they decide they want to level up their code quality or consistency so they start running new checking tools on modified files or turn on new rules in existing tools, only discover that there are tons of pre-existing issues and without a way to ignore those issues, they can’t afford to turn on a new tool or enable a new rule.

Enter trunk check’s hold-the-line feature: with hold-the-line we figure out which issues already exist in your codebase. When you run those tools using trunk check, we complain if and only if your change contains new issues.

The Devil is in the Details

The solution that a lot of tools reach for¹ is to run a linter on the set of modified files², and then filter the output to only the modified lines³. Indeed, a number of tools go with this approach:

clang-tidy supports a –-line-filter flag,
lint-diff wraps eslint with a modified line filter, and
darker wraps mypy, pylint, and flake8 with a modified line filter.

Unfortunately, this approach relies on the flawed assumption that new lint issues can only appear on the modified lines. Consider this Python code:

1def foo():
2  s = 'asdf'
3  print(s)

And let’s say we decide we don’t want to print this anymore, so we take it away:

1def foo():
2  s = 'asdf'

If you compare the flake8findings from before and after the change, you’ll see that even though we didn’t change line 2, flake8 is unhappy with line 2 in the second version!

1foo.py:2:5: F841 local variable 's' is assigned to but never used

It’s easy to see how this might happen in a real scenario: you delete some code or you shuffle some stuff around, and you miss something in the process. And of course, this isn’t the only way you can end up with new lint issues on unmodified lines: there are plenty of other scenarios.

OK, so filtering for modified lines isn’t good enough — but what is the right way to do this?

Rinse / Wash / Repeat

The solution we’ve come up with is to run linters twice for each modified file — once on the upstream and once on the version in your current working tree — and then diff the results to figure out which lint issues already existed in your mainline and which are new.

Sounds straightforward enough, right?

Of course, no solution is ever complete without introducing new problems of its own:

How do you reliably lint the upstream?
What if we can’t lint the upstream?
What if you modify the configuration for a linter?

To lint the upstream, we prepare what we call a “shadow” tree, which we populate with your upstream git branch; we also cache these lint results, because there’s no point in re-running linters on already committed code.

But what if running a linter on your upstream fails? This isn’t just a hypothetical: we’ve found that there are a number of ways this can happen: say, for example, malformed code somehow got committed into your repo (maybe automation went haywire).

(This is the type of error handling that we take pride in doing well; one of the many qualities which distinguish great tools from good tools, after all, is good error handling.)

In these scenarios, you still want to see the lint results for your current, but you still also want to avoid getting spammed by existing lint issues. To handle this, we fall back to treating the modified lines as a lint issue filter: although we know there are shortcomings to this approach, it’s better than nothing, and perfect should not be the enemy of good.

The final question, dealing with modified linter configuration, is the trickiest but also the one: we actually run both the upstream and current lint actions using your current linter config, and discard the upstream’s linter config.

This sounds surprising, but we promise that there’s a good reason for this!

Let’s say that you’ve ended up with a mix of naming conventions for your constants — some use CamelCase, others use UPPER_SNAKE_CASE — and you realize that your linter needs you to tell it which one to enforce, so you update your linter config to enforce UPPER_SNAKE_CASE.

That doesn’t actually mean you want to go and fix every single CamelCase constant in your codebase right now; all you really want is to require that new code, starting today, follows your new rule. The existing issues in your codebase can be dealt with over time.

Of course, that means you also need some system to actually address them over time. That’s one of the reasons Trunk Check integrates with GitHub: we want to let you triage the existing formatting, lint, and security issues in your repository and decide how you want to burn those down.

We also have work planned to teach hold-the-line to detect when you’ve moved code, so that you can refactor your code without having your linters nag you. Stay tuned for updates!

Interested in building the next generation of developer tooling with us? We’re hiring!

What bothers you the most about your developer experience? Join the Trunk Community Slack and let us know.

¹ Another suboptimal solution we’ve seen is selectively disabling your linter (c.f. Rubocop), either by excluding specific files or disabling specific rules.

² Running linters on the set of modified files — technically speaking, the set of modified targets — is actually not a straightforward problem: although many linters operate on a per-file basis, there are a number of linters which don’t. clang-tidy, for example, operates on translation units (even though the CLI pretends to operate on files), golangci-lint can’t resolve types when run on a per-file basis, and terraform validate runs on directories, not single files.

³ Running a linter on only the modified lines sounds like another alternative here, but this is rarely if ever possible. Most linters operate on ASTs, and with only the modified lines you don’t have information to build one.