
From Black Box to Transparent Debugging

By Eli Schleifer, June 7, 2023
CI Analytics

Welcome Trunk’s CI Debugger

CI systems can fail, resulting in a laborious process of troubleshooting. Logs are examined, hypotheses are tested, and changes are applied in the hopes of a successful fix.

Trunk’s CI Debugger allows you to monitor and address issues on your CI machines in real-time, so you can move quickly to the rest of your tasks.

CI Debugger makes it easy for engineers to diagnose and fix issues quickly. No more slow, frustrating printf() debugging cycles. Get real-time, hands-on debugging right in your CI system, no matter where it runs. Regardless of why your CI is failing, CI Debugger lets you peek inside, take control, and fix things in a breeze.

Let’s go through why we’ve built this tool, and how you can start to use it with any CI system today.

What is happening?!?

Debugging a failing CI run is difficult and time-consuming, largely because CI machines are ephemeral. You spin up a machine, run your tests and your builds, and then the machine disappears.

When your CI pipeline is green and all systems are go, no problem. But when your tests fail, the problems start.

The machine disappears and your only clue to what went wrong is a log file. You comb through the logs to work out what happened, then change the code in an attempt to fix the problem, without any real certainty that the fix will work.

Then you repeat the process of spinning up a machine, seeing it fail, and getting more logs. It’s printf debugging on an industrial scale.

And things go wrong in CI pipelines all the time:

  • Missing dependencies: You had a required package on your local machine during development but that dependency is missing from the CI machine.

  • Incorrect paths: If your builds depend on assets across different directories, you might have the relative paths set incorrectly.

  • Flaky tests: You might have known issues with your tests that you need to get past.

  • Single-machine failures: Single machines can fail during large jobs. If you are training a data model that gets sharded out to 60 different machines, it only takes one of those machines to fail for the whole pipeline to fail.

  • Residual processes: If you have semi-ephemeral machines (as we use for our builds), then residual processes can be left after tests, memory can get crammed, machines can slow, and then your tests fail.
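
Two of the failure modes above, missing dependencies and incorrect paths, can at least be surfaced early with a preflight step. The following is only a rough sketch of that idea and has nothing to do with how CI Debugger works internally; the function name `preflight` and its arguments are made up for illustration.

```python
import shutil
from pathlib import Path


def preflight(required_tools, required_paths):
    """Return a list of problems found; an empty list means nothing obvious is missing.

    Hypothetical helper for illustration only.
    """
    problems = []
    for tool in required_tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"missing executable: {tool}")
    for path in required_paths:
        if not Path(path).exists():
            problems.append(f"missing path: {path}")
    return problems
```

A check like this only catches what you thought to list in advance, which is exactly why interactive access to the failing machine is still needed for everything else.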

In a traditional setup, developers don’t have real-time access to the CI environment. They can’t interact with the tests as they run, or change the environment on the fly to test hypotheses. This black-box nature of the CI systems is limiting.

Each iteration to fix a problem could take anywhere from several minutes to several hours, depending on the size of the project and the nature of the issue. During this time, the pipeline is blocked, delaying other developers and slowing down the entire development process. It can become a huge drain on time and resources.

This is the conundrum we face: CI, the very tool we use to increase speed and efficiency, becomes a bottleneck when things don’t go as planned.

This traditional model of CI debugging just doesn’t work. What’s needed is a way to step into your tests on the CI machine and debug in the same way you would locally. That’s what we’ve built.

Using Trunk’s CI Debugger to fix a test in real-time

Debugging locally is an interactive process. Making the same true for remote CI machines is the foundation of Trunk’s CI Debugger.

The key features of the CI Debugger include:

  1. Breakpoints: You can set up breakpoints in your CI job which trigger under certain conditions. When these conditions are met, the CI Debugger posts notifications about the failure, allowing you to connect to a debugging session for the issue.

  2. Live access: Once connected, you can access the machine, run commands, and download files. You can retry the failed command, make changes to the system, and rerun the test.

  3. Override exit codes: If you decide that the error isn’t critical, you can change the exit code to prevent it from blocking the entire release pipeline.

  4. Rules: You can set up rules for breakpoints to activate under certain conditions. For instance, you can set a rule to trigger a breakpoint when the job is being run by a specific author.

  5. Security: The CI Debugger communicates securely with the CI system, so you can make changes and fix issues without compromising the security of your system.

CI Debugger works on Jenkins, GitHub Actions–any CI system you use. Wherever there’s a terminal, CI Debugger will work.

Here we’ll run through an example using GitHub Actions. Our first step is to set up the breakpoint configuration:

name: Pull Request
on:
  - pull_request

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Install Trunk
        run: curl -fsSL | bash

      - name: Testing
        uses: trunk-io/breakpoint@v1
        with:
          breakpoint_id: unit-tests
          run: ./
          trunk_token: ${{ secrets.INSERT_TRUNK_API_TOKEN }}

While your workflow may be different, the key information for the debugger is in the last three lines:

  • breakpoint_id identifies the breakpoint and supplies its context. In this case, it selects the unit-tests rule set.

  • run is the command we’re going to wrap inside our breakpoint. In this case it is our shell command, but it could be Jest, gtest, mocha, or any testing/building/compilation step.

  • trunk_token authenticates against the chain of custody.

Like any other breakpoint, these breakpoints have rules we can enable and disable as needed. You can set up the rules for each breakpoint in the CI Debugger rules editor:

So within the unit-tests rule set, we say “if exit code is not equal to 0, break and send a message to the tester.”
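
To make the shape of that rule concrete, here is a minimal sketch of "wrap a command, and break if the exit code is not 0". It is a conceptual illustration only, not Trunk's implementation: `should_break`, `run_with_breakpoint`, and the `notify` callback are hypothetical names, and a real breakpoint would hold the job open rather than simply return.

```python
import subprocess
import sys


def should_break(exit_code: int) -> bool:
    """The rule from the text: break whenever the exit code is not 0."""
    return exit_code != 0


def notify_stderr(message: str) -> None:
    """Stand-in for 'send a message to the tester'."""
    print(message, file=sys.stderr)


def run_with_breakpoint(command, notify=notify_stderr) -> int:
    """Run the wrapped command and notify if the rule matches.

    `command` is an argument list such as ["pytest", "-q"].
    """
    result = subprocess.run(command)
    if should_break(result.returncode):
        notify(f"breakpoint hit: {command!r} exited with {result.returncode}")
    return result.returncode
```

Keeping the rule as its own small function mirrors the idea that rules can be enabled, disabled, or swapped without touching the wrapped command.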

With that set, we can run our tests. If the test does fail, in this case exiting with an exit code of 3, it will hit the breakpoint. Instead of getting a message in GitHub that your tests have failed, you get a message saying “A breakpoint has triggered, click here to connect to the session”:

Clicking on the link will launch into a secure SSH connection to the CI machine, and open up a debugging session for you:

You are now live on the machine. Type ls and you’ll list what’s in the directory. Type retry and you can rerun the test. If the test is flaky, it may pass on the retry; if not, you’ll get the full output of the tests in the terminal as it happens.
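
The reasoning behind retry is simple enough to sketch: after a failure, rerun the test a few times; if any rerun passes, the failure was probably flaky, and if none do, it is likely a real bug. The snippet below is a hedged illustration of that logic only. `classify_failure` and its `run_test` callable are hypothetical and are not part of CI Debugger.

```python
def classify_failure(run_test, attempts=3):
    """Rerun an already-failed test up to `attempts` times.

    `run_test` is any callable that returns an exit code (0 means pass).
    Returns "flaky" if some rerun passes, "consistent" if every rerun fails.
    """
    for _ in range(attempts):
        if run_test() == 0:
            return "flaky"
    return "consistent"
```

A "consistent" result is the signal to stay in the session and dig into the full test output rather than retrying again.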

Type download error.log and we’ll upload an encrypted version of error.log and print a secure download link to the terminal.

While you are in the terminal doing all this work, what’s happening on GitHub? Nothing. As far as GitHub is concerned, the job is still running and it’s waiting for the checks to complete:

So you really are inside the CI pipeline. You have paused it and are now able to interact fully with your code to investigate the issues.

At this point, you might just need the pipeline to continue. You can set exit 0 and then continue, which will release the machine and give you a successful test. (You can control who can use these overrides, and you get a complete audit trail of how your team has been debugging.)
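
One way to picture an exit-code override with permissions and an audit trail is the small model below. It is an assumption-laden sketch, not a description of how CI Debugger stores or enforces anything: the `DebugSession` class, its fields, and the user names in the usage note are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple


@dataclass
class DebugSession:
    """Hypothetical model of a paused CI step whose exit code can be overridden."""

    original_exit_code: int
    allowed_users: FrozenSet[str]
    override: Optional[int] = None
    # Each entry is (user, action, requested exit code).
    audit_log: List[Tuple[str, str, int]] = field(default_factory=list)

    def set_exit(self, user: str, code: int) -> None:
        """Record the attempt, and apply the override only for permitted users."""
        if user not in self.allowed_users:
            self.audit_log.append((user, "denied", code))
            raise PermissionError(f"{user} may not override exit codes")
        self.override = code
        self.audit_log.append((user, "override", code))

    def final_exit_code(self) -> int:
        """The code the CI step reports once the session continues."""
        return self.original_exit_code if self.override is None else self.override
```

With a session created as DebugSession(3, frozenset({"alice"})), a call to set_exit("alice", 0) makes final_exit_code() return 0, while the same call from any other user raises PermissionError and is still written to the audit log.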

This is a toy example. But consider one of the problems from above–missing dependencies. If a test fails because of a missing dependency, you click the link, enter the live session and download the missing library. Then just retry directly in the terminal. Missing another? Do it again. Another? Do it again.

All the while, GitHub (or however you are running your tests) is just humming along. You now have the ability to stop time as far as your tests are concerned, and completely change anything in the system to help make your tests or builds run as effectively as possible.

Let’s stop guessing

Debugging in a run->fail->log->fix->run->fail-> cycle is a massive waste of time and resources for engineering teams. In an age where continuous integration plays a pivotal role in our development process, efficiency and real-time solutions are non-negotiable.

If an engineer were printf debugging locally, they’d be considered a noob. Yet we think nothing of debugging vital CI pipelines for huge applications in exactly the same way. The Trunk CI Debugger fixes that. It brings to your CI system the same tools we use on our local machines to make things work and to figure out why they are broken. It lets engineers do what they do best: get into the code, find the problem, and fix it properly. It’s a tool designed to take control, dissect problems, and implement effective solutions. In doing so, it makes your team more effective and more efficient.

Check out our CI Debugger here.

Try it yourself or
request a demo
