Ruby Method Lookup, RubyVM.stat and Global State

In Ruby, it can sometimes be a bit involved figuring out where a constant or a method comes from - not just for you, but for Ruby, too! When you call "foo," is it foo from your current class? One of its parents? A module included from a parent class? Somewhere else?

And of course, Ruby has a few objects whose methods are available just about everywhere - Object, BasicObject and Kernel.

When I wrote about the Global Method Cache, we touched on that just a little. Let's go a bit deeper, shall we?

Better yet, we'll learn about some things that you should carefully *not* do to keep your performance good, and how you check.

Method Caching and Cache Invalidation

Methods on three specific Ruby objects are special - BasicObject, Object and Kernel. BasicObject is the root of the inheritance tree - even the simplest Ruby objects inherit from it. And it includes Kernel, so everything does. Object is slightly more full-featured than BasicObject, but it's basically the same thing. Nearly every object inherits from all three of these.

As a result, they get special hardcoding in Ruby's performance code. Ruby keeps a count of how many times you have defined methods on those three objects. It calls that count your “Global Method State.” And it tags all method cache entries with that number. Which means that if you add a new method on any of them, all the old method cache entries get invalidated...

That's good! You don't want to use stale entries that would point to the wrong method definition.

Also, that's bad. All your old cache entries are gone, and now you have to look them up again. That can be slow.

If it happens during the setup at the beginning of your Rails app... Well, sure. That sounds like Rails, defining methods in all sorts of places. Any cache you set up before it's done with that is wasted, sure. Sounds like Rails. Also, sounds fine.

But if you do that in your inner loop, that's really bad. It means you're looking up all your methods fresh, every time, even methods that don't share a name with the method you're defining on Object (or Kernel, or BasicObject.) That's no good.

Aaron Patterson has also written more about Global Method State and its relatives, if you want more details.

Constants and Cache Invalidation: Global Constant State

Ruby has really involved constant lookup. In the same way as for global methods, making a new constant makes it hard to tell exactly what constant to use. Ruby has a really similar solution.

When you define a new constant, it invalidates your old constant caching - Ruby has to look everything up again. The state number Ruby uses to track this is called the Global Constant State.

Your Global Cache Detective: RubyVM.stat

In CRuby, you can't "fix" the cache-invalidation problem except by not defining methods on Object, Kernel or BasicObject in your inner loop. If you need to define new methods constantly, do it in some other class.

But how can you tell if you — or one of your dependencies — are doing that?

Remember Ruby's GC.stat? In the same way, Ruby has a RubyVM.stat that tracks changes to (what it calls) "global_method_state".

2.5.0 :002 > RubyVM.stat
 => {:global_method_state=>137, :global_constant_state=>1115, :class_serial=>7109}

Each of these tracks an internal quantity for the VM. :global_method_state is the one we’ve just been talking about. :global_constant_state is the counter for constants. And class_serial is complicated to explain — but basically, for each individual class, they keep a similar counter, and this number assigns the value across all classes. Unlike global_method_state, it’s not a big performance problem if you reset it. Mostly you can ignore class_serial.

Great! But How Do I Use It?

You might reasonably ask, “what would I do with this?” If you’re finding your app slows down unaccountably at some times, or you’re worried that it might, it’s one more thing you can check - has global_method_state increased a lot? Or has global_constant_state increased a lot? Either case might mean that you’re doing something slow that busts your caches. Ruby’s Struct has had this problem, for instance.

If you’re using Rails, you might consider doing this in a Rack middleware - if either of those increases during a request, that may be a sign that you’re using something slow and you’ll want to look into that.