ABProf: An Accurate Statistical Profiling Harness

ABProf is a tool to compare which of two programs is faster. It can prove that a particular optimization worked, which is my job these days.

We talked about the easy-to-use version of ABProf recently. Let's talk about the more powerful and accurate version now. It's a bit harder to use, but often worth it.

To follow this post, first install abprof: "gem install abprof" will generally do it.

A Simple Example

Let's look at the simple version of the harness code. The below example sets up a block for sampling -- ABProf will run it repeatedly to check it for speed.

#!/usr/bin/env ruby

require "abprof"

puts "ABProf example: sleep 0.1 seconds"

ABProf::ABWorker.iteration do
sleep 0.001
end

ABProf::ABWorker.start

This is written in ABProf's worker DSL (Domain-Specific Language), which is different from its benchmark DSL.

This DSL is used to set up a worker process for ABProf. Normally ABProf will run two workers and use them to check which of two processes is faster, unless you run it "bare" and supply a block of Ruby. Bare, it will run the block from ABProf's own process. Running in a single process is fast but limits you to a single version of Ruby in-process (and a few other things.)

So: the benchmark DSL sets up a test between two workers (command lines, number of trials to run, desired accuracy, etc.), or you can just specify all that from the command line. The worker DSL sets up a single worker, or you can skip it and get worse accuracy when measuring.

Why worse accuracy? Because being able to do just one specific action is better than doing that plus having to start up a new process. The most accurate measurement is the one that does nearly nothing except measuring. Also, just running what you need instead of spinning up a new process is a bit faster. So without harness code, abprof takes longer to converge, and you wait longer for your benchmarking.

If you wanted to do an A/A test of sleep.rb against itself, you'd run abprof this way:

abprof examples/sleep.rb examples/sleep.rb

You can also pick how many trials to run, how many times to run the block per trial and many other options. Run "abprof --help" for the full list and default values. Here's an example:

abprof --pvalue=0.001 --min-trials=10 examples/sleep.rb examples/for_loop_10k.rb

By default, abprof runs your code block or your program a fair number of times -- 10 before it starts measuring, 10 per trial run. For something very slow (e.g. 5 minutes per run) you'll probably want fewer, while for something fast like a tight "for" loop you'll probably want many more. 10 times isn't many when you're benchmarking small, tight code.

The two commands, abprof and abcompare, are very similar. If you run abprof in "bare" mode with two command-line programs, you're getting the same thing as abcompare.

A More In-Depth Example

The example above is pretty simple - the block runs a single iteration. We can get better accuracy by running several at once, or by having the code measure in a specific way. ABProf supports both of these modes by changing the method used to supply the block. In the above, that method is called "iteration" and it times very simply. If you call the method "iteration_with_return_value" and return a measurement, you can time it yourself.

Here's an example:

#!/usr/bin/env ruby

require "abprof"

puts "ABProf example: sleep 0.1 seconds"

ABProf::ABWorker.iteration_with_return_value do
t1 = Time.now# Can also be a counter
  sleep 0.01
  (Time.now - t1)# Return measurement
end

ABProf::ABWorker.start

This can be useful if you need to do setup or teardown for every iteration, or if you want to measure a counter rather than time, or if you have some kind of custom timer that you want to do yourself rather than using Ruby's built-in Time.now() for timing, as ABProf does.

Be careful - ABProf will tell you it's returning the lower command. When the measurements are time, lower means faster. But if you return a counter or other higher-is-faster performance measurement, ABProf will consistently return the lower result, which may not be what you expect. That's why ABProf keeps printing "faster?" with a question mark to the terminal. If you return something other than a time, lower may not mean faster.

Randomly: don't use an explicit "return" in the block. It's a block, not a lambda, so "return" won't do what you want. Just add the return value as the final one, as in the example above.

Conclusion

So: want better accuracy than with abcompare? Want to take out process startup time as a factor? Use abprof.

Want to do your own timing, or otherwise control it more? Use the "iteration_with_return_value" method instead of just "iteration", and return a value. Also, use abprof.

Want to just quickly, roughly see if something is a lot faster? You can keep using abcompare. But it'll take longer to be sure, so if you're going to keep doing it, use abprof.