Don't make your users wait for GC

One of the weaknesses of Ruby is garbage collection. Although some improvements have been made (tunable GC settings, lazy sweep, bitmap marking), it is still a simple stop-the-world GC. This can be problematic especially for large applications. We have a rather large application running on Passenger 3 using REE 1.8.7 with tuned GC. As the application grew, the GC performance has been degrading until we were faced with the following situation where GC averages 130 ms per request.

It was time to address our obvious GC problem.

We could have continued to tune the GC and optimize memory usage to reduce the impact of GC on response times. But why? Why have the GC impact the response time at all? Instead, we decided to trigger GC after one request finishes but before the next request starts. This is not a new idea. It is actually generically referred to as out of band work (OOBW). It has been implemented by Unicorn and discussed here. However, it is not supported by Passenger. So we decided to add OOBW support to Passenger.

Our patch allows the application to respond with an additional header, namely X-Passenger-OOB-Work. When Passenger sees this header, it will stop sending new requests to that application process and tell that application process to perform the out of band work. The application registers oob_work callback for the work it wants done using Passenger's event mechanism. When the work is done, the application process resumes handling of normal requests.

All the application code can be packaged in an initializer,

PhusionPassenger.on_event(:oob_work) do
 t0 =
 GC.start "Out-Of-Bound GC finished in #{ - t0} sec"

class RequestOOBWork
 def initialize(app, frequency)
   @app = app
   @frequency = frequency
   @request_count = 0

 def call(env)
   @request_count += 1
   status, headers, body =
   if @request_count % @frequency == 0
     headers['X-Passenger-Request-OOB-Work'] = 'true'
   [status, headers, body]

Rails.application.config.middleware.use RequestOOBWork, 5

After experimentation, we decided to trigger GC after every 5 requests. If you set the parameter too high, then GC will continue to occur in the middle of requests. If you set the parameter too low, then application processes will be busy performing their out-of-band GC causing Passenger to spawn more workers.

With these changes in place, our graphs show 10x improvement in GC average per request,

The patch we implemented and deployed to production is for Passenger 3. It is available here.

Passenger 4 is currently under active development. In fact, beta 1 has already been released. We've ported our patch to Passenger 4. It is merged and expected to be included in beta 2.