Wrk: Does It Matter If It Has Native No-Keepalive?
I wrote about a load-tester called Wrk a little while ago. Wrk is unusual among load testers in that it doesn’t have an option for turning off HTTP KeepAlive. HTTP 1.1 defaults to KeepAlive, and it helps performance significantly… But shouldn’t you allow testing with both? Some intermediate software might not support KeepAlive, and HTTP 1.0 only supports it as an optional extension. Other load testers normally allow turning it off. Shouldn’t Wrk allow it too?
Let’s explore that, and run some tests to check how “real” No-KeepAlive performs.
In this post I’m measuring with Rails Simpler Bench, using 110 60-second batches of HTTP requests with a 5-second warmup for each. It’s this experiment configuration file, but with more batches.
Does Wrk Allow Turning Off KeepAlive?
First off, Wrk has a workaround. You can supply the “Connection: Close” header, which asks the server to kill the connection when it’s finished processing the request. To be clear, that will definitely turn off KeepAlive: if the server closes the connection after processing each and every request, there is no KeepAlive. Wrk also claims in the bug report that you can do it with their Lua scripting. I don’t think that’s true, since Wrk’s Lua API doesn’t seem to have any way to directly close a connection. And supplying the header on the command line is easy, while writing correct Lua is harder. You could set the header in Lua, but that’s not any better or easier than doing it on the command line, unless you want to somehow do it conditionally, only some of the time.
(Wondering how to make no-KeepAlive happen, practically speaking? wrk -H "Connection: Close" will do it.)
Is it the same thing? Is supplying a close header the same as turning off KeepAlive?
Mostly yes, but not quite 100%.
When you supply the “close” header, you’re asking the server to close the connection afterward. Let’s assume the server does that since basically any correct HTTP server will.
But when you turn off KeepAlive on the client, the client closes the connection itself rather than waiting to detect that the server has closed the socket. So: it’s about who initiates the socket close. Technically wrk will also just keep going with the same connection if the server somehow doesn’t correctly close the socket… But that’s more of a potential bug than an intentional difference.
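If you want to see that distinction concretely, here’s a rough Ruby sketch of the two styles. This is purely illustrative - wrk itself is written in C, and the host, port, and path below are placeholders, not anything from RSB or wrk:

```ruby
require "socket"

HOST = "127.0.0.1"   # placeholders - point these at your own server
PORT = 4567
PATH = "/"

# Server-initiated close: send "Connection: close" and read until EOF,
# which arrives when the *server* closes the socket.
def fetch_with_server_close
  sock = TCPSocket.new(HOST, PORT)
  sock.write("GET #{PATH} HTTP/1.1\r\nHost: #{HOST}\r\nConnection: close\r\n\r\n")
  response = sock.read   # blocks until the server hangs up
  sock.close             # just cleaning up our already-finished end
  response
end

# Client-initiated close: no special header. We read exactly one response,
# then *we* hang up, even though the server would happily keep the socket open.
def fetch_with_client_close
  sock = TCPSocket.new(HOST, PORT)
  sock.write("GET #{PATH} HTTP/1.1\r\nHost: #{HOST}\r\n\r\n")
  data = +""
  data << sock.readpartial(4096) until data.include?("\r\n\r\n")
  headers, body = data.split("\r\n\r\n", 2)
  length = headers[/^Content-Length:\s*(\d+)/i, 1].to_i
  body << sock.read(length - body.bytesize) if body.bytesize < length
  sock.close             # the client initiates the close this time
  headers + "\r\n\r\n" + body
end
```

Either way, every request pays for a fresh TCP connection. The only difference is which side decides the conversation is over.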
It’s me writing this, so you may be wondering: does it make a performance difference?
Difference, No Difference, What’s the Difference?
First off, does KeepAlive itself make a difference? Absolutely. And like any protocol-level difference, how much you care depends on what you’re measuring. If you spend 4 seconds per HTTP request, the overhead from opening the connection seems small. If you’re spending a millisecond per request, suddenly the same overhead looks much bigger. Rails, and even Rack, have pretty nontrivial overhead, so I’m going to answer in those terms.
Yeah, KeepAlive makes a big difference.
Specifically, here’s RSB running a simple “hello, world” Rack route, with KeepAlive on and with it turned off via the header-based hack:
Config | Throughput (reqs/sec) | Std Deviation |
---|---|---|
wrk w/ no extra header | 13577 | 302.8 |
wrk -H "Connection: Close" | 10185 | 263.4 |
That’s in the general neighborhood of 30% faster with KeepAlive. Admittedly, this is an example with tiny, fast routes and minimal network overhead. But more network overhead may actually make KeepAlive look even better, relatively speaking, because without KeepAlive the client has to open a new network connection for every request.
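For context, the kind of “hello, world” Rack route being measured here looks roughly like this - a sketch in the same spirit, not RSB’s actual code:

```ruby
# config.ru - a minimal "hello, world" Rack app, similar in spirit to the
# tiny route benchmarked above (not RSB's exact code). Run it with `rackup`.
run lambda { |env|
  [200, { "Content-Type" => "text/plain" }, ["Hello, world!"]]
}
```

With a route that small, almost all the per-request cost is protocol and framework overhead, which is exactly why the KeepAlive difference shows up so clearly.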
So whether or not “hack no-KeepAlive” versus “real no-KeepAlive” makes a difference, “KeepAlive” versus “no KeepAlive” definitely makes a big difference.
What About Client-Disconnect?
Turning off KeepAlive isn’t normally a hard feature to add to a client. The logic for “no KeepAlive” is really simple: close the connection after each request. What if we compare client-closed versus server-closed no-KeepAlive?
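To show how little logic that is on the client side, here’s a hedged sketch using Ruby’s Net::HTTP rather than wrk’s C - the URL is a placeholder, and this only illustrates the shape of the logic, not wrk’s implementation:

```ruby
require "net/http"

uri = URI("http://127.0.0.1:4567/")   # placeholder URL

# With KeepAlive: one TCP connection, reused for every request.
Net::HTTP.start(uri.host, uri.port) do |http|
  100.times { http.get(uri.path) }
end

# Without KeepAlive: open a fresh connection for each request,
# then close it as soon as the response comes back.
100.times do
  http = Net::HTTP.new(uri.host, uri.port)
  http.start
  http.get(uri.path)
  http.finish
end
```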
I’ve written a very small patch to wrk to turn off KeepAlive with a command-line switch. There’s also a much older PR to wrk that does this using the same logic, so I didn’t file mine separately — I don’t think this change will get upstreamed.
In fact, just in case I broke something, I wound up testing several different wrk configurations with varying results… These are all using the RSB codebase, with 5 different variants for the wrk command line.
Below, I use “new_wrk” to mean my patched version of wrk, while “old_wrk” is wrk without my --no-keepalive patch.
wrk command | Throughput (reqs/sec) | Std Deviation |
---|---|---|
old_wrk | 13577 | 302.8 |
old_wrk -H "Connection: Close" | 10185 | 263.4 |
new_wrk | 13532 | 310.9 |
new_wrk --no-keepalive | 7087 | 108.3 |
new_wrk -H "Connection: Close" | 10193 | 261.7 |
I see a couple of interesting results here. First off, there should be no difference between old_wrk and new_wrk for the normal and header-based KeepAlive modes… And that’s what I see. If I don’t turn on the new command line arg, the differences are well within the margin of measurement error (13577 vs 13532, 10185 vs 10193.)
However, the new client-disconnected no-KeepAlive mode is around 30% slower than the “hacked” server-disconnected no-KeepAlive! That means it gets only a bit more than half the throughput of the KeepAlive runs - nearly 50% slower! I strongly suspect what’s happening is that the server-disconnected no-KeepAlive mode winds up sending the “close” header alongside the request data, while a client-side disconnect winds up making a whole extra network round trip.
A Very Quick Ruby Note - Puma and JRuby
Most of my posts here are related to work I’m doing on Ruby, and this one is no exception. Still, you might reasonably ask if there’s anything Ruby-specific here. Mostly there isn’t - this is experimenting on a load tester, with a Ruby server just there to test against. However, there’s one very important Ruby-specific note for those of you who have been reading carefully.
Puma has some interesting KeepAlive-related bugs, especially in combination with JRuby. If you find yourself getting unreasonably slow results, especially with Puma and/or JRuby, try toggling KeepAlive on and off.
The Puma and JRuby folks are both looking into it. Indeed, I found this bug while working with the JRuby folks.
Conclusions
There are several interesting takeaways here, depending on your existing background.
KeepAlive speeds up a benchmark a lot; if there’s no reason to turn it off, keep it on
wrk doesn’t have a ‘real’ way to turn off KeepAlive (most load testers do)
you can use a workaround to turn off KeepAlive for wrk… and it works great
if you turn off KeepAlive, make sure you’re still getting not-utterly-horrible performance
be careful combining Puma and/or JRuby with KeepAlive - test your performance
And that’s what I have for this week.