Rails Speed with Ruby 2.4.0 and Current Discourse

My recent benchmarking blog posts looked at how Rails and Discourse performance changed from Ruby 2.0.X to 2.3.X. But they have a glaring omission: they stop at Ruby 2.3.4 and at Discourse 1.5.0 (vintage March 2016). That covers a lot of the post-Ruby-2.0 performance improvements, but what's changed between 2.3.4 and the latest Ruby?

Unfortunately, the Discourse version we used, 1.5.0, doesn't support Ruby 2.4 or higher. That's a risk of using a real app for benchmarking. New Discourse only supports Ruby 2.3 or 2.4. So: let's look at current Discourse's speed on Ruby 2.3.4 and on Ruby's head-of-master in Git.

Runs

As you may remember from previous posts, Rails Ruby Bench runs a series of consecutive requests about as fast as Puma can manage on an EC2 m4.2xlarge dedicated instance. So let's compare times for full runs between Ruby 2.3.4 and 2.5.0 (head-of-master).

I'm seeing roughly 10%-15% lower time-per-run with the newer Ruby. Going from Ruby 2.0.0 to 2.3.4 was about a 30% speedup, so an extra 10% or 15% on top of that isn't bad.

You can also visualize these results as the total change in throughput -- that's the number of requests/second until the slowest load thread finishes, so it emphasizes the longest-running requests:

This also shows a speedup of around 10% to 15%. It's based on exactly the same numbers, so that's no surprise.
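As a rough sketch of that throughput calculation: total requests divided by the time at which the slowest load thread finishes. This is not the code from process.rb. The `throughput` method and the `:requests` and `:elapsed` field names are made up for illustration, and the numbers are invented sample data, not benchmark results.

```ruby
# Illustrative only: throughput as requests/second, measured up to the
# moment the slowest load thread finishes. The hash keys are hypothetical.
def throughput(threads)
  total_requests = threads.sum { |t| t[:requests] }
  slowest_finish = threads.map { |t| t[:elapsed] }.max
  total_requests / slowest_finish.to_f
end

# Invented sample data: two load threads, one slower than the other.
sample_threads = [
  { requests: 100, elapsed: 40.0 },
  { requests: 100, elapsed: 50.0 },
]

puts throughput(sample_threads)
```

Because the denominator is the slowest thread's finish time, one thread stuck on a few long requests drags down the whole figure, which is why this view emphasizes the longest-running requests.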

And finally, let's look at individual request times. You may recall from previous comparisons that different Ruby versions have different effects on the fastest and slowest requests -- so let's compare 2.3.4 to 2.5.0 by various request speeds...

As with earlier transitions, everything slow speeds up. You can't see sub-median requests here, but 1) they're very fast and 2) they speed up, but only a little. Ruby 2.3.4 is on the left, Ruby head-of-master is on the right.

As with earlier posts, this shows that Ruby head-of-master is incrementally speeding up nearly everything. Unlike some of the earlier Ruby versions, slower requests did not get disproportionate improvement here. Even the very fast requests (e.g. 5th percentile) sped up a tiny bit. One thing you can read into that: it's not primarily about the garbage collector speeding up, or about a few unusual slow operations. Most of Ruby has increased in speed by a small percentage, pretty uniformly.
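To make "pretty uniformly" concrete, here is a minimal sketch of a per-percentile comparison between two sets of request times. It is not the benchmark's actual processing code. The method names, the nearest-rank percentile rule, and the sample data are all assumptions chosen for illustration.

```ruby
# Nearest-rank percentile over a list of request times (seconds).
# Illustrative only; this is not the code from process.rb.
def percentile(times, pct)
  sorted = times.sort
  return nil if sorted.empty?
  index = ((pct / 100.0) * (sorted.length - 1)).round
  sorted[index]
end

# Percent reduction in request time at each percentile, old Ruby vs. new Ruby.
def speedup_by_percentile(old_times, new_times, pcts = [5, 50, 90, 99])
  pcts.map do |pct|
    old_t = percentile(old_times, pct)
    new_t = percentile(new_times, pct)
    [pct, (100.0 * (old_t - new_t) / old_t).round(1)]
  end.to_h
end

# Made-up sample data: every request is uniformly 10% faster on the new Ruby.
old_times = (1..100).map { |i| i / 100.0 }
new_times = old_times.map { |t| t * 0.9 }

p speedup_by_percentile(old_times, new_times)
```

If the improvement came mostly from the garbage collector or a handful of slow operations, you'd expect the high percentiles to show a much larger reduction than the low ones. A roughly flat result across percentiles is what a uniform speedup looks like.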

Conclusions

With support for Ruby 2.4.0 and higher, this brings Rails Ruby Bench support to the present day. It also lets us check whether particular optimizations help Rails speed with a real application. Look for more of that in the future.

As for Ruby 2.4: if you're not using it, you're missing out on about 10%-15% extra speed in Rails. And if you're using a Ruby older than 2.3.4, you're missing even more speed!

If you hear somebody say "yeah, but these optimizations don't affect I/O-bound applications like most Rails apps," you now have a comprehensive answer: Ruby 2.0 to 2.4 has decreased request times for Rails by around 40% combined, and even more for slower requests. And by all indications, more speed is coming in the future.
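The "around 40% combined" figure can be sanity-checked with a little arithmetic. This sketch assumes that both the roughly 30% figure (Ruby 2.0.0 to 2.3.4) and the 10%-15% figure (2.3.4 to head-of-master) are reductions in time, and that they compound multiplicatively. The `combined_reduction` method is a made-up helper, not part of the benchmark.

```ruby
# Combine successive fractional time reductions multiplicatively.
# Assumption: each figure is a reduction in time relative to the previous version.
def combined_reduction(*reductions)
  remaining = reductions.reduce(1.0) { |acc, r| acc * (1.0 - r) }
  1.0 - remaining
end

low  = combined_reduction(0.30, 0.10) # 1 - 0.70 * 0.90
high = combined_reduction(0.30, 0.15) # 1 - 0.70 * 0.85

puts format("combined reduction: %.1f%% to %.1f%%", low * 100, high * 100)
```

Under those assumptions the combined reduction lands between roughly 37% and 40%, consistent with the rounded figure above.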

Methodology and Footnotes

For the last post, I switched from using a t2.2xlarge EC2 instance to an m4.2xlarge instance. The latter is slightly slower but supports dedicated placement, so I don't have to worry about noisy neighbors (that is, other VMs on the same hardware affecting my benchmark speed). I expect to stay with the m4.2xlarge for the foreseeable future. If you see modestly fewer requests per second or more milliseconds per request than in older posts, that's probably why. This shouldn't change the relative speed of different Ruby versions significantly; it's just a small multiplier on the graph scale.

Last post and this post both use dedicated EC2 instances instead of shared. Thus, the change to m4.2xlarge.

As always, my benchmark code is on GitHub, and the Ruby and Discourse code are standard and open-source. You can contact me for any of my benchmark JSON files. Data processing is done via process.rb in the benchmark repository. Graph output is now via Rickshaw. I don't put full source for graphs in the repository since it's repetitive and a little tedious; contact me if you want it. You can find an example Rickshaw output template in the graph directory of the repository. All Rickshaw output is based on variations of that template.