
Continuous Benchmarking eBPF in Rust with Bencher 

A look at using Bencher, an open source continuous benchmarking tool, to track both our micro- and macro-benchmarks to catch performance regressions in CI.
Jul 21st, 2023 7:36am
Featured image from Bakhtiar Zein on Shutterstock.

This is the fifth of a five-part series. Read Part 1, Part 2, Part 3 and Part 4.

In this series we learned what eBPF is, the tools to work with it and why eBPF performance is important. We created a basic eBPF XDP program, line by line in Rust using Aya. Then we went over how to evolve a basic eBPF XDP program to new feature requirements. Next, we refactored both our eBPF and userspace source code to be more testable and added both micro-benchmarks and macro-benchmarks. In this installment, we will look at using Bencher, an open source continuous benchmarking tool, to track both our micro- and macro-benchmarks to catch performance regressions in CI. All of the source code for the project is open source and available on GitHub.

Now that we have created both micro- and macro-benchmarks, we can easily see the performance improvements and regressions our changes make when working locally. For the same reasons that unit tests are run in CI to prevent feature regressions, benchmarks should also be run in CI to prevent performance regressions. This is called continuous benchmarking.

Continuous benchmarking is not a new concept. Companies such as Microsoft, Facebook, Apple, Amazon, Netflix, Google and many more have all created internal continuous benchmarking tools. However, Bencher is the first open source, self-hostable continuous benchmarking tool that works out of the box for any programming language. This is accomplished through the use of benchmark harness adapters. If there isn’t already a benchmark harness adapter for your preferred tool, please open an issue on GitHub.

In order to keep things simple, we’re going to be using the hosted SaaS version of Bencher for the rest of this post, Bencher Cloud. We will also be using GitHub Actions to run our benchmarks. Some folks mistakenly believe that you can’t run benchmarks in CI.

Most benchmarking harnesses use the system wall clock to measure latency or throughput. This is very helpful, as these are the exact metrics that we as developers care the most about. However, general-purpose CI environments are often noisy and inconsistent when measuring wall clock time. When performing continuous benchmarking, this volatility adds unwanted noise into the results.

There are a few options for handling this:

  • Relative benchmarking
  • Dedicated CI runners, like AWS Bare Metal
  • Switching benchmark harnesses to one that counts instructions as opposed to wall time, like Iai (see the sketch below).

Or simply embrace the chaos! Continuous benchmarking doesn’t have to be perfect. Yes, reducing the volatility and thus the noise in your continuous benchmarking environment will allow you to detect ever finer performance regressions. However, don’t let perfect be the enemy of good here.

You might think that a ~25% variance in results is a lot. But ask yourself, can your current development process detect a factor of two or even a factor of ten performance regression before it affects your users? Probably not. A doubling of latency (say, from 100µs to 200µs) still stands out clearly above ±25% noise.

Even with all of the noise from a CI environment, tracking wall clock benchmarks can still pay great dividends in catching performance regressions before they reach your customers in production. Over time, as your software performance management matures, you can build from there. In the meantime, just use your regular CI.
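For reference, if you did later want to try the instruction-counting route from the list above, a benchmark under Iai looks roughly like the following. This is a minimal sketch based on Iai's documented usage: the fibonacci function is just a placeholder for real parsing logic, and the bench target would also need harness = false in Cargo.toml plus Valgrind installed on the runner.

use iai::black_box;

// Placeholder for the code under test; in our case this would be something
// like the userspace parsing logic from earlier parts of this series.
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

// Iai counts instructions (via Valgrind/Cachegrind) instead of wall time,
// which makes results far more stable on noisy, shared CI runners.
fn iai_fibonacci() -> u64 {
    fibonacci(black_box(10))
}

iai::main!(iai_fibonacci);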

Now, let’s get continuous benchmarking set up for “Fun XDP.” To follow along, just fork Bencher on GitHub. Then sign up for an account on Bencher Cloud. Once your email is verified and you’ve logged in, you’ll need to create a project: click the “+ Add” button. Name the project “Fun XDP,” leave the URL blank and leave the visibility set to “Public.”

Once your project is created, take note of the slug that was generated for it. It should be something like fun-xdp-12345. You will need it shortly. Next, you need to create an API token to use in CI. Select “API Tokens” from the menu bar, click the “+ Add” button, and create a new API token. Just input a name and leave the time to live (TTL) blank.

Take note of the new API token; you need to add it to the repository secrets for your fork of Bencher on GitHub. The URL for this looks something like “https://github.com/USERNAME/bencher/settings/secrets/actions,” where USERNAME is replaced with your GitHub username. Name the secret “BENCHER_API_TOKEN” and set the value to the API token you just created.

Next, clone the repository and replace all of the jobs inside of .github/workflows/bencher.yml with a new job to run and track our micro-benchmarks:

  1. micro_benchmarks:
  2.   name: Continuous Micro-Benchmarking with Bencher
  3.   runs-on: ubuntu-latest
  4.   env:
  5.     BENCHER_PROJECT: fun-xdp-12345
  6.     BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
  7.     BENCHER_ADAPTER: rust_criterion
  8.   steps:
  9.     - uses: actions/checkout@v3
  10.     - uses: bencherdev/bencher@v0.2.46
  11.     - name: Run Micro-Benchmarks with Bencher
  12.       run: |
  13.         cd examples/ebpf/ebpf-common
  14.         bencher run \
  15.         --if-branch "$GITHUB_REF_NAME" \
  16.         --else-if-branch "$GITHUB_BASE_REF" \
  17.         --else-if-branch main \
  18.         --err \
  19.         "cargo bench"

Going line by line:

  1. Create a GitHub Actions job, micro_benchmarks.
  2. Give the job a name.
  3. Run on the latest Ubuntu image, ubuntu-latest.
  4. Set environment variables.
  5. Set the BENCHER_PROJECT environment variable to the project slug that we noted earlier. This should be something like fun-xdp-12345.
  6. Set the BENCHER_API_TOKEN environment variable to the repository secret that we set earlier, ${{ secrets.BENCHER_API_TOKEN }}.
  7. Set the BENCHER_ADAPTER environment variable to the exact benchmark harness adapter that we will be using, rust_criterion.
  8. Next, list the steps in this job.
  9. Check out your source code, uses: actions/checkout@v3
  10. Install the Bencher CLI using the GitHub Action, uses: bencherdev/bencher@v0.2.46
  11. This step will track your benchmarks with Bencher.
  12. Run the following commands.
  13. Navigate to the ebpf-common crate.
  14. Run the bencher run CLI command.
  15. There are several options for setting the project branch; see branch selection for a full overview. The provided command uses GitHub Actions default environment variables and tries to use the current branch data if it already exists, --if-branch "$GITHUB_REF_NAME".
  16. Otherwise, if there is a pull request (PR) target branch that already exists in Bencher, create a clone of its data and thresholds, --else-if-branch "$GITHUB_BASE_REF".
  17. Otherwise, create a clone of the main branch data and thresholds, --else-if-branch main.
  18. Set the command to fail if an alert is generated (we will talk about alerts shortly), --err.
  19. Run your benchmarks and generate a report from the results, cargo bench.
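
As a reminder of what the rust_criterion adapter consumes: it parses the output that cargo bench prints when running a Criterion benchmark harness. The actual micro-benchmarks were written in Part 4; the sketch below only shows the general shape, and the parse_port function and benchmark name are illustrative, not the real ones.

use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Illustrative stand-in for the packet-parsing logic benchmarked in Part 4.
fn parse_port(bytes: &[u8]) -> u16 {
    u16::from_be_bytes([bytes[0], bytes[1]])
}

fn bench_parse_port(c: &mut Criterion) {
    let packet = [0x1Fu8, 0x90]; // port 8080 in network byte order
    c.bench_function("parse_port", |b| b.iter(|| parse_port(black_box(&packet))));
}

// Criterion prints its results to stdout, which is what the rust_criterion
// adapter in the job above turns into metrics for Bencher.
criterion_group!(benches, bench_parse_port);
criterion_main!(benches);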

Then let’s add a second job to run and track our macro-benchmarks:

  1. macro_benchmarks:
  2.   name: Continuous Macro-Benchmarking with Bencher
  3.   runs-on: ubuntu-latest
  4.   env:
  5.     BENCHER_PROJECT: fun-xdp-12345
  6.     BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
  7.     BENCHER_ADAPTER: json
  8.   steps:
  9.     - uses: actions/checkout@v3
  10.     - uses: bencherdev/bencher@v0.2.46
  11.     - name: Run Macro-Benchmarks with Bencher
  12.       run: |
  13.         cd examples/ebpf/ebpf
  14.         bencher run \
  15.         --if-branch "$GITHUB_REF_NAME" \
  16.         --else-if-branch "$GITHUB_BASE_REF" \
  17.         --else-if-branch main \
  18.         --err \
  19.         --file "../target/results.json" \
  20.         "cargo bench"

The only differences here are:

  • The job ID and name (lines 1 and 2)
  • Using the json adapter (line 7)
  • The name of the step (line 11)
  • Navigating to the ebpf userspace agent crate (line 13)
  • Loading the custom benchmark harness output from a file (line 19)
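
Because the macro-benchmark uses a custom harness, Bencher can’t parse its stdout; instead, the harness writes its results to a JSON file that bencher run picks up via --file. The sketch below is a rough, hypothetical illustration of producing that file with serde_json: the benchmark name, measure key and value are illustrative assumptions, and the real harness from Part 4 (plus the Bencher JSON adapter docs) defines the actual output.

use serde_json::json;
use std::fs;

fn main() -> std::io::Result<()> {
    // Hypothetical measurement; the real number would come from timing the
    // eBPF XDP program end to end in the macro-benchmark.
    let nanos: f64 = 4_500.0;

    // Bencher Metric Format: benchmark name -> measure -> metric value.
    // The "latency" measure and "value" field are assumptions based on the
    // Bencher JSON adapter docs; check those docs for the exact schema.
    let results = json!({
        "fun_xdp_macro": {
            "latency": {
                "value": nanos
            }
        }
    });

    fs::write("../target/results.json", results.to_string())?;
    Ok(())
}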

Now commit your changes and push them up to GitHub. Your benchmarks should then run, and the results should be uploaded to Bencher Cloud. In the Bencher Cloud UI, select your “Fun XDP” project. You should see a list of the most recent Reports, which should have just a single entry. Click on it. The perf chart should now display your results.

Boom! You are now tracking your benchmarks with Bencher. 🎉 This is a massive improvement. No more ephemeral benchmark results! Better yet, we are also set up to catch performance regressions in CI. Remember that --err flag we added to our bencher run CLI command? That will fail the job if an alert is generated. Alerts are generated when a statistical threshold is exceeded. Whenever you create a project, a default branch (main), default testbed (localhost) and a statistical threshold for the pair of them is automatically created.

If we were to replay the development history for “Fun XDP,” but with Bencher in place, we could have caught our performance regression when it was still a pull request. The Fizz (1) and FizzBuzz (2) features were nearly identical performance-wise. However, the FizzBuzzFibonacci (3) feature would have caused a statistically significant performance regression, exceeding the default threshold and generating an alert.

In summary, we have learned what eBPF is and why it is useful. Using Rust, we created a basic eBPF XDP program and then evolved it to add new features. The last of these new features caused a major performance regression. To prevent this from happening again, we refactored our code to be more testable and added both micro- and macro-benchmarks. To track these benchmarks, we added continuous benchmarking to our project using Bencher. With continuous benchmarking in place, we can detect and prevent performance regressions before they make it to production.
