
Continuous Benchmarking eBPF in Rust with Bencher 

A look at using Bencher, an open source continuous benchmarking tool, to track both our micro- and macro-benchmarks to catch performance regressions in CI.
Jul 21st, 2023 7:36am
Featured image from Bakhtiar Zein on Shutterstock.

This is the fifth of a five-part series. Read Part 1, Part 2, Part 3 and Part 4.

In this series we learned what eBPF is, the tools to work with it and why eBPF performance is important. We created a basic eBPF XDP program, line by line in Rust using Aya. Then we went over how to evolve a basic eBPF XDP program to new feature requirements. Next, we refactored both our eBPF and userspace source code to be more testable and added both micro-benchmarks and macro-benchmarks. In this installment, we will look at using Bencher, an open source continuous benchmarking tool, to track both our micro- and macro-benchmarks to catch performance regressions in CI. All of the source code for the project is open source and available on GitHub.

Now that we have created both micro- and macro-benchmarks, we can easily see the performance improvements and regressions our changes make when working locally. For the same reasons that unit tests are run in CI to prevent feature regressions, benchmarks should also be run in CI to prevent performance regressions. This is called continuous benchmarking.

Continuous benchmarking is not a new concept. Companies such as Microsoft, Facebook, Apple, Amazon, Netflix, Google and many more have all created internal continuous benchmarking tools. However, Bencher is the first open source, self-hostable continuous benchmarking tool that works out of the box for any programming language. This is accomplished through the use of benchmark harness adapters. If there isn’t already a benchmark harness adapter for your preferred tool, please open an issue on GitHub.

In order to keep things simple, we’re going to be using the hosted SaaS version of Bencher for the rest of this post, Bencher Cloud. We will also be using GitHub Actions to run our benchmarks. Some folks mistakenly believe that you can’t run benchmarks in CI.

Most benchmarking harnesses use the system wall clock to measure latency or throughput. This is very helpful, as these are the exact metrics that we as developers care the most about. However, general-purpose CI environments are often noisy and inconsistent when measuring wall clock time. When performing continuous benchmarking, this volatility adds unwanted noise into the results.

There are a few options for handling this:

  • Relative benchmarking
  • Dedicated CI runners, like AWS Bare Metal
  • Switching benchmark harnesses to one that counts instructions as opposed to wall time, like Iai (see the sketch below).

Or simply embrace the chaos! Continuous benchmarking doesn’t have to be perfect. Yes, reducing the volatility and thus the noise in your continuous benchmarking environment will allow you to detect ever finer performance regressions. However, don’t let perfect be the enemy of good here.

You might think that a ~25% variance in results is a lot. But ask yourself, can your current development process detect a factor of two or even a factor of ten performance regression before it affects your users? Probably not. A doubling of latency (say, from 100µs to 200µs) still stands out clearly above ±25% noise.

Even with all of the noise from a CI environment, tracking wall clock benchmarks can still pay great dividends in catching performance regressions before they reach your customers in production. Over time, as your software performance management matures, you can build from there. In the meantime, just use your regular CI.
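For reference, if you did later want to try the instruction-counting route from the list above, a benchmark under Iai looks roughly like the following. This is a minimal sketch based on Iai's documented usage: the fibonacci function is just a placeholder for real parsing logic, and the bench target would also need harness = false in Cargo.toml plus Valgrind installed on the runner.

use iai::black_box;

// Placeholder for the code under test; in our case this would be something
// like the userspace parsing logic from earlier parts of this series.
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

// Iai counts instructions (via Valgrind/Cachegrind) instead of wall time,
// which makes results far more stable on noisy, shared CI runners.
fn iai_fibonacci() -> u64 {
    fibonacci(black_box(10))
}

iai::main!(iai_fibonacci);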

Now, let’s get continuous benchmarking set up for “Fun XDP.” To follow along, just fork Bencher on GitHub. Then sign up for an account on Bencher Cloud. Once your email is verified and you’ve logged in, you’ll need to create a project: click the “+ Add” button. Name the project “Fun XDP,” leave the URL blank and leave the visibility set to “Public.”

Once your project is created, take note of the slug that was generated for it. It should be something like fun-xdp-12345. You will need it shortly. Next, you need to create an API token to use in CI. Select “API Tokens” from the menu bar, click the “+ Add” button, and create a new API token. Just input a name and leave the time to live (TTL) blank.

Take note of the new API token; you need to add it to the repository secrets for your fork of Bencher on GitHub. The URL for this looks something like “https://github.com/USERNAME/bencher/settings/secrets/actions,” where USERNAME is replaced with your GitHub username. Name the secret “BENCHER_API_TOKEN” and set the value to the API token you just created.

Next, clone the repository and replace all of the jobs inside of .github/workflows/bencher.yml with a new job to run and track our micro-benchmarks:

  1. micro_benchmarks:
  2.   name: Continuous Micro-Benchmarking with Bencher
  3.   runs-on: ubuntu-latest
  4.   env:
  5.     BENCHER_PROJECT: fun-xdp-12345
  6.     BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
  7.     BENCHER_ADAPTER: rust_criterion
  8.   steps:
  9.     - uses: actions/checkout@v3
  10.     - uses: bencherdev/bencher@v0.2.46
  11.     - name: Run Micro-Benchmarks with Bencher
  12.       run: |
  13.         cd examples/ebpf/ebpf-common
  14.         bencher run \
  15.         --if-branch "$GITHUB_REF_NAME" \
  16.         --else-if-branch "$GITHUB_BASE_REF" \
  17.         --else-if-branch main \
  18.         --err \
  19.         "cargo bench"

Going line by line:

  1. Create a GitHub Actions job, micro_benchmarks.
  2. Give the job a name.
  3. Run on the latest Ubuntu image, ubuntu-latest.
  4. Set environment variables.
  5. Set the BENCHER_PROJECT environment variable to the project slug that we noted earlier. This should be something like fun-xdp-12345.
  6. Set the BENCHER_API_TOKEN environment variable to the repository secret that we set earlier, ${{ secrets.BENCHER_API_TOKEN }}.
  7. Set the BENCHER_ADAPTER environment variable to the exact benchmark harness adapter that we will be using, rust_criterion.
  8. Next, list the steps in this job.
  9. Check out your source code, uses: actions/checkout@v3
  10. Install the Bencher CLI using the GitHub Action, uses: bencherdev/bencher@v0.2.46
  11. This step will track your benchmarks with Bencher.
  12. Run the following commands.
  13. Navigate to the ebpf-common crate.
  14. Run the bencher run CLI command.
  15. There are several options for setting the project branch; see branch selection for a full overview. The provided command uses GitHub Actions default environment variables and tries to use the current branch data if it already exists, --if-branch "$GITHUB_REF_NAME".
  16. Otherwise, if there is a pull request (PR) target branch that already exists in Bencher, create a clone of its data and thresholds, --else-if-branch "$GITHUB_BASE_REF".
  17. Otherwise, create a clone of the main branch data and thresholds, --else-if-branch main.
  18. Set the command to fail if an alert is generated (we will talk about alerts shortly), --err.
  19. Run your benchmarks and generate a report from the results, cargo bench.
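
As a reminder of what the rust_criterion adapter consumes: it parses the output that cargo bench prints when running a Criterion benchmark harness. The actual micro-benchmarks were written in Part 4; the sketch below only shows the general shape, and the parse_port function and benchmark name are illustrative, not the real ones.

use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Illustrative stand-in for the packet-parsing logic benchmarked in Part 4.
fn parse_port(bytes: &[u8]) -> u16 {
    u16::from_be_bytes([bytes[0], bytes[1]])
}

fn bench_parse_port(c: &mut Criterion) {
    let packet = [0x1Fu8, 0x90]; // port 8080 in network byte order
    c.bench_function("parse_port", |b| b.iter(|| parse_port(black_box(&packet))));
}

// Criterion prints its results to stdout, which is what the rust_criterion
// adapter in the job above turns into metrics for Bencher.
criterion_group!(benches, bench_parse_port);
criterion_main!(benches);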

Then let’s add a second job to run and track our macro-benchmarks:

  1. macro_benchmarks:
  2.   name: Continuous Macro-Benchmarking with Bencher
  3.   runs-on: ubuntu-latest
  4.   env:
  5.     BENCHER_PROJECT: fun-xdp-12345
  6.     BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
  7.     BENCHER_ADAPTER: json
  8.   steps:
  9.     - uses: actions/checkout@v3
  10.     - uses: bencherdev/bencher@v0.2.46
  11.     - name: Run Macro-Benchmarks with Bencher
  12.       run: |
  13.         cd examples/ebpf/ebpf
  14.         bencher run \
  15.         --if-branch "$GITHUB_REF_NAME" \
  16.         --else-if-branch "$GITHUB_BASE_REF" \
  17.         --else-if-branch main \
  18.         --err \
  19.         --file "../target/results.json" \
  20.         "cargo bench"

The only differences here are:

  • The job ID and name (lines 1 and 2)
  • Using the json adapter (line 7)
  • The name of the step (line 11)
  • Navigating to the ebpf userspace agent crate (line 13)
  • Loading the custom benchmark harness output from a file (line 19)
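
Because the macro-benchmark uses a custom harness, Bencher can’t parse its stdout; instead, the harness writes its results to a JSON file that bencher run picks up via --file. The sketch below is a rough, hypothetical illustration of producing that file with serde_json: the benchmark name, measure key and value are illustrative assumptions, and the real harness from Part 4 (plus the Bencher JSON adapter docs) defines the actual output.

use serde_json::json;
use std::fs;

fn main() -> std::io::Result<()> {
    // Hypothetical measurement; the real number would come from timing the
    // eBPF XDP program end to end in the macro-benchmark.
    let nanos: f64 = 4_500.0;

    // Bencher Metric Format: benchmark name -> measure -> metric value.
    // The "latency" measure and "value" field are assumptions based on the
    // Bencher JSON adapter docs; check those docs for the exact schema.
    let results = json!({
        "fun_xdp_macro": {
            "latency": {
                "value": nanos
            }
        }
    });

    fs::write("../target/results.json", results.to_string())?;
    Ok(())
}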

Now commit your changes and push them up to GitHub. Your benchmarks should then run, and the results should be uploaded to Bencher Cloud. In the Bencher Cloud UI, select your “Fun XDP” project. You should see a list of the most recent Reports, which should have just a single entry. Click on it. The perf chart should now display your results.

Boom! You are now tracking your benchmarks with Bencher. 🎉 This is a massive improvement. No more ephemeral benchmark results! Better yet, we are also set up to catch performance regressions in CI. Remember that --err flag we added to our bencher run CLI command? That will fail the job if an alert is generated. Alerts are generated when a statistical threshold is exceeded. Whenever you create a project, a default branch (main), default testbed (localhost) and a statistical threshold for the pair of them is automatically created.

If we were to replay the development history for “Fun XDP,” but with Bencher in place, we could have caught our performance regression when it was still a pull request. The Fizz (1) and FizzBuzz (2) features were nearly identical performance-wise. However, the FizzBuzzFibonacci (3) feature would have caused a statistically significant performance regression, exceeding the default threshold and generating an alert.

In summary, we have learned what eBPF is and why it is useful. Using Rust, we created a basic eBPF XDP program and then evolved it to add new features. The last of these new features caused a major performance regression. To prevent this from happening again, we refactored our code to be more testable and added both micro- and macro-benchmarks. To track these benchmarks, we added continuous benchmarking to our project using Bencher. With continuous benchmarking in place, we can detect and prevent performance regressions before they make it to production.
