Tuesday Oct 14, 2008

Ruby Objects on the JRuby Heap at Runtime

I wrote an entry last week about using jhat to get a static snapshot of the Ruby objects on the JRuby heap. The next step is to try getting similar information from a running application, with periodic updates. In other words, look inside the JRuby code as it runs on the JVM and find out the number of Ruby objects that have been allocated, broken down by the Ruby class for each object.

The tool I chose this time was BTrace. The "B" stands for "Bytecode" and BTrace provides a "roll your own" profiling capability. In other words, you get complete control over exactly what gets instrumented in your Java application. The trade-off, in comparison to other profiling tools, is you have to do more of the work. Specifically, you have to write scripts that control the instrumentation. Luckily the scripting language is familiar: Java, along with some annotations that BTrace provides.

BTrace is a simple command line utility that just needs the process ID of the JVM to instrument and the name of a script file (you can provide the .java source file or use btracec to create a .class file). In order to experiment with it, I needed a Ruby program that would create lots of objects. I wrote this simple application:

class Every

class EveryOther

puts "Hello World"
i = 0;
all = Array.new
some = Array.new
while (true) 
  some[i] = EveryOther.new if (i%2==0)
  all[i] = Every.new
  puts i if (i%1000000==0)

It just has an endless loop that allocates objects. I am using JRuby 1.1 to run it and I pass -J-Xmx2048m on the JRuby command line so that the program can run for quite a while before using up all the heap space and exiting with an OutOfMemoryError.

My BTrace script is:

package org.jruby;

import com.sun.btrace.annotations.\*;
import static com.sun.btrace.BTraceUtils.\*;
import com.sun.btrace.aggregation.Aggregation;
import com.sun.btrace.aggregation.AggregationFunction;
import com.sun.btrace.aggregation.AggregationKey;

 \* Create a histogram of RubyObject instances, 
 \* keyed by RubyClass classId.
 \* @author Gregg Sporar
public class RubyClassHistogram {
   private static Aggregation sum = newAggregation(AggregationFunction.SUM);

    \* Instrument the public RubyObject contstructor 
        type="void (org.jruby.Ruby, org.jruby.RubyClass)"
    public static void onNewObject(org.jruby.RubyObject self,
                                   org.jruby.Ruby ruby,
                                   org.jruby.RubyClass rubyClass) {
        String classId = rubyClass.classId;
        AggregationKey key = newAggregationKey(classId);
        addToAggregation(sum, key, 1);

    public static void print() {
        printAggregation("RubyObject Instances Created:", sum);

BTrace calls my print() method every 4 seconds to display the current values in the histogram. As an example:

RubyObject Instances Created:
  String                                                           18
  EveryOther                                                  2934364
  Every                                                       5868731

Please note:

  • The histogram is of the number of objects created, not the number of objects that are currently alive on the heap. With the toy sample application that I used, those numbers are the same, but with real applications the number of live objects is usually smaller than the total number of objects that have been created. I do not have much knowledge of the internals of JRuby, but I am guessing the BTrace script above could be enhanced to also instrument the method in JRuby that removes a Ruby object from the JRuby heap.
  • BTrace requires JDK 6 or higher.
  • BTrace is still under development - use it at your own risk. From what I have seen and heard, it appears to be robust and stable but I do not know if I would use it on a production system. In particular, I do not have enough experience with it yet to have a good feel for the amount of overhead that it would introduce when used with a script like the one shown here.

Tuesday Oct 07, 2008

Counting Ruby Objects on the JRuby Heap

I got an email the other day from Michael Yuan. He had been working with some folks who were running an application on JRuby and they thought they might be seeing memory usage problems. Michael put me in contact with Greg Fodor, the CTO of Adtuition who sent me this email:

Our main concern here is that we very well might have memory leaking per request or generally over time but with off-the-shelf java heap profilers it's incredibly difficult to make heads or tails of what is sitting on the virtual ruby heap. For example, if we are leaking objects of a certain class, in the profilers I've used, there's no easy way for me to see an aggregate count of instances of that Ruby class. All Ruby objects in JRuby are in the JVM as instances of RubyObject (or RubyString, etc for some primitives) so in order to even determine the runtime class of a RubyObject I need to drill through the heap through some data structures.

In my experience with native Matz ruby I was able to use profiling tools to quickly identify which types of instances were leaking and this often was sufficient information to begin tracking down the bug. Just being able to get a "Ruby object histogram" would be a great starting point, and the Java profiling tools will shine once we can figure out which particular instances are worth looking at.

Hmmm... a "Ruby object histogram" for an application that is running on JRuby. Most of the tools available for looking at the Java heap are not going to help much for creating something like that. To make sure I was not overlooking any tools/techniques specific to the JRuby world itself, I sent an email to Charlie Nutter, who confirmed that there is "... no simple way ... to explore Ruby objects."

Charlie also provided confirmation for the basic approach that would be needed to parse the Java heap in order to find the class names of the underlying Ruby objects: "Each [RubyObject] should have a metaClass field that represents the RubyClass; and the RubyClass would have a classId string field that's probably enough to make that determination."

Using a sample binary heap dump file from Charlie, I looked around a bit using Profile > Load Heap Dump in the NetBeans IDE. That makes it easy to see the RubyClass for any one RubyObject:

If you know which RubyObject you want to examine, this is great. Unfortunately, you usually don't know that ahead of time. And even though the heap was small (about 40mb), there were almost 4000 instances of RubyClass. So in order to create a "Ruby object histogram" I needed a tool that can do queries on the contents of the heap.

There might be another way, but the tool I chose was jhat, which is only available in JDK 6 (or higher). The good news is that jhat can read a binary heap dump file that was created by a Java 1.4 or Java 5 JVM. So step one in this process is: get a binary heap dump file from the JVM that is running your JRuby application. Charlie provided one for me, but having the JVM create one for your application as it runs is not too hard. Details are available in the article described here.

Start the jhat utility and pass it the binary heap dump file name as a parameter. Note that for larger heap dump files, you will also need to pass jhat a maximum value that it can use for its own heap. This is because jhat uses heap space to store information about the binary heap dump that it reads. So for example,  passing -J-Xmx512m on the command line would give jhat 512mb of heap to use for parsing the file. Please note that this is a potential Achilles' heel for this technique - jhat needs large amounts of heap space, so its ability to read really large binary heap dump files is limited.

After jhat starts up successfully, use your browser to interact with it at http://localhost:7000. The final link on that page will read "Execute Object Query Language (OQL) query." Click that link to display the form where you can type in an OQL query:

Admittedly, it is not the most user-friendly or intuitive interface. And grok'ing how OQL operates takes a bit of effort. OQL is a query language for the heap that allows you to use JavaScript and provides some built-in functions. Click the "OQL Help" link for more information, but essentially what you want to put in the input field is an expression that returns a string. The built-in "select" operation will do that for you, but for something like this I needed to write a function that returns a string and then invoke that function.

The initial inspiration for this script came from an OQL script that Sundar created for an article that he and I wrote with Frank Kieviet. It took me a while to get this version working because of a bug that I ran into in jhat, but with Sundar's help (and a sort function I found online), I was able to get jhat to display a histogram based on the RubyClass values found in each of the RubyObject instances (truncated for brevity):

It's not particularly pretty. And of course it is only a static picture of the contents of the heap. And there is no information about relationships between the objects, etc. But it is a "Ruby object histogram." :-)

The source code for the script (ready to be copy/pasted into the jhat form) is below. I showed the results to Charlie Nutter before I posted this blog entry and his response was "It would also be worth including the other Ruby\* objects as well...RubyString, RubyFile, etc."  I'll leave that as an exercise for the reader.

// find RubyObject instances and create a histogram
// that shows the breakdown of their metaClass.classId values
(function findRubyObjects() {

// use the OQL count() function to iterate over all org.jruby.RubyObject
// instances.  Note that the second parameter to count() is an
// internal function that is used to create the histogram
var histo =new Object();
var total = count(heap.objects("org.jruby.RubyObject",false),
function(it) {
// disregard any RubyObject that has a null metaClass
if (it.metaClass !=null && it.metaClass.classId !=null) {
var rc = it.metaClass.classId.value.toString();
if (rcin histo) {
else {
return true;
else {
return false;

// create a table with the sorted output
var str =" Total RubyObjects found: " + total +"<br><table border=1>";
  str += sortAssoc(histo);
  str += "</table>";

// return value is what jhat displays in the user's browser
return str;

// helper function for sorting the results
// in descending order - a modified version
// of the solution shown in http://bytes.com/forum/post579606-2.html
function sortAssoc(aInput) {
var aTemp = [];
var htmlTable ="";
for (var sKeyin aInput) {
      aTemp.push([sKey, aInput[sKey]]);
    aTemp.sort(function () {return arguments[0][1] > arguments[1][1]});

for (var nIndex = aTemp.length-1; nIndex >=0; nIndex--) {
      htmlTable += "<tr>";
      htmlTable += "<td>" + aTemp[nIndex][0] +
"</td><td>" + aTemp[nIndex][1] +

return htmlTable;



Gregg Sporar


« October 2008