Tuesday Oct 07, 2008

Counting Ruby Objects on the JRuby Heap

I got an email the other day from Michael Yuan. He had been working with some folks who were running an application on JRuby, and they thought they might be seeing memory usage problems. Michael put me in contact with Greg Fodor, the CTO of Adtuition, who sent me this email:

Our main concern here is that we very well might have memory leaking per request or generally over time but with off-the-shelf java heap profilers it's incredibly difficult to make heads or tails of what is sitting on the virtual ruby heap. For example, if we are leaking objects of a certain class, in the profilers I've used, there's no easy way for me to see an aggregate count of instances of that Ruby class. All Ruby objects in JRuby are in the JVM as instances of RubyObject (or RubyString, etc for some primitives) so in order to even determine the runtime class of a RubyObject I need to drill through the heap through some data structures.

In my experience with native Matz ruby I was able to use profiling tools to quickly identify which types of instances were leaking and this often was sufficient information to begin tracking down the bug. Just being able to get a "Ruby object histogram" would be a great starting point, and the Java profiling tools will shine once we can figure out which particular instances are worth looking at.

Hmmm... a "Ruby object histogram" for an application that is running on JRuby. Most of the tools available for looking at the Java heap are not going to help much for creating something like that. To make sure I was not overlooking any tools/techniques specific to the JRuby world itself, I sent an email to Charlie Nutter, who confirmed that there is "... no simple way ... to explore Ruby objects."

Charlie also provided confirmation for the basic approach that would be needed to parse the Java heap in order to find the class names of the underlying Ruby objects: "Each [RubyObject] should have a metaClass field that represents the RubyClass; and the RubyClass would have a classId string field that's probably enough to make that determination."
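
The field path Charlie describes can be sketched in plain JavaScript. The objects below are hand-built stand-ins for what jhat exposes, and the rubyClassName helper and sample data are hypothetical (not part of JRuby or jhat). The extra .value hop is an assumption borrowed from the script at the end of this entry, which reads the classId string the same way.

```javascript
// Hypothetical helper: follow RubyObject -> metaClass -> classId
// and return the Ruby class name, or null if any link is missing.
function rubyClassName(obj) {
  if (obj == null || obj.metaClass == null || obj.metaClass.classId == null) {
    return null;
  }
  // Assumption: classId is a string object whose "value" field holds
  // the characters. In this stand-in it is a plain JavaScript string.
  return obj.metaClass.classId.value.toString();
}

// Stand-in objects shaped like the heap objects described above.
var user = { metaClass: { classId: { value: "User" } } };
var orphan = { metaClass: null };

var names = [rubyClassName(user), rubyClassName(orphan)];
```

Here names holds the class name for the first object and null for the second, which has no metaClass to follow.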

Using a sample binary heap dump file from Charlie, I looked around a bit using Profile > Load Heap Dump in the NetBeans IDE. That makes it easy to see the RubyClass for any one RubyObject:

If you know which RubyObject you want to examine, this is great. Unfortunately, you usually don't know that ahead of time. And even though the heap was small (about 40 MB), there were almost 4000 instances of RubyClass. So in order to create a "Ruby object histogram" I needed a tool that can run queries against the contents of the heap.

There might be another way, but the tool I chose was jhat, which is only available in JDK 6 (or higher). The good news is that jhat can read a binary heap dump file that was created by a Java 1.4 or Java 5 JVM. So step one in this process is: get a binary heap dump file from the JVM that is running your JRuby application. Charlie provided one for me, but having the JVM create one for your application as it runs is not too hard. Details are available in the article described here.

Start the jhat utility and pass it the binary heap dump file name as a parameter. Note that for larger heap dump files, you will also need to pass jhat a maximum value that it can use for its own heap, because jhat uses heap space to store information about the binary heap dump that it reads. For example, passing -J-Xmx512m on the command line gives jhat 512 MB of heap to use for parsing the file. Please note that this is a potential Achilles' heel for this technique: jhat needs large amounts of heap space, so its ability to read really large binary heap dump files is limited.

After jhat starts up successfully, use your browser to interact with it at http://localhost:7000. The final link on that page will read "Execute Object Query Language (OQL) query." Click that link to display the form where you can type in an OQL query:

Admittedly, it is not the most user-friendly or intuitive interface, and grokking how OQL operates takes a bit of effort. OQL is a query language for the heap that allows you to use JavaScript and provides some built-in functions. Click the "OQL Help" link for more information, but essentially what you want to put in the input field is an expression that returns a string. The built-in "select" operation will do that for you, but for something like this I needed to write a function that returns a string and then invoke that function.
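
The "function that returns a string" pattern looks roughly like this. It is a minimal sketch with made-up data rather than a real heap query; the name buildReport and the sample counts are illustrative only.

```javascript
// Wrap the work in a function expression and invoke it immediately;
// the string it returns is the value of the whole expression.
var report = (function buildReport() {
  // Placeholder data; a real query would compute this from the heap.
  var counts = { Foo: 3, Bar: 1 };
  var html = "<table border=1>";
  for (var name in counts) {
    html += "<tr><td>" + name + "</td><td>" + counts[name] + "</td></tr>";
  }
  html += "</table>";
  return html;
})();
```

Inside jhat you would paste just the parenthesized function call, without the var report = assignment, so that the returned HTML is what the browser displays.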

The initial inspiration for this script came from an OQL script that Sundar created for an article that he and I wrote with Frank Kieviet. It took me a while to get this version working because of a bug that I ran into in jhat, but with Sundar's help (and a sort function I found online), I was able to get jhat to display a histogram based on the RubyClass values found in each of the RubyObject instances (truncated for brevity):

It's not particularly pretty. And of course it is only a static picture of the contents of the heap. And there is no information about relationships between the objects, etc. But it is a "Ruby object histogram." :-)

The source code for the script (ready to be copy/pasted into the jhat form) is below. I showed the results to Charlie Nutter before I posted this blog entry and his response was "It would also be worth including the other Ruby* objects as well...RubyString, RubyFile, etc." I'll leave that as an exercise for the reader.


// find RubyObject instances and create a histogram
// that shows the breakdown of their metaClass.classId values
(function findRubyObjects() {

  // use the OQL count() function to iterate over all org.jruby.RubyObject
  // instances.  Note that the second parameter to count() is an
  // internal function that is used to create the histogram
  var histo = new Object();
  var total = count(heap.objects("org.jruby.RubyObject", false),
    function(it) {
      // disregard any RubyObject that has a null metaClass or classId
      if (it.metaClass != null && it.metaClass.classId != null) {
        var rc = it.metaClass.classId.value.toString();
        if (rc in histo) {
          histo[rc]++;
        }
        else {
          histo[rc] = 1;
        }
        return true;
      }
      else {
        return false;
      }
    });

  // create a table with the sorted output
  var str = " Total RubyObjects found: " + total + "<br><table border=1>";
  str += sortAssoc(histo);
  str += "</table>";

  // return value is what jhat displays in the user's browser
  return str;

  // helper function for sorting the results
  // in descending order - a modified version
  // of the solution shown in http://bytes.com/forum/post579606-2.html
  function sortAssoc(aInput) {
    var aTemp = [];
    var htmlTable = "";
    for (var sKey in aInput) {
      aTemp.push([sKey, aInput[sKey]]);
    }
    // sort ascending by count; the comparator must return a number,
    // not a boolean, for the ordering to be reliable
    aTemp.sort(function (a, b) { return a[1] - b[1]; });

    // walk the sorted array backwards to emit rows in descending order
    for (var nIndex = aTemp.length - 1; nIndex >= 0; nIndex--) {
      htmlTable += "<tr>";
      htmlTable += "<td>" + aTemp[nIndex][0] +
                   "</td><td>" + aTemp[nIndex][1] +
                   "</td></tr>";
    }

    return htmlTable;
  }

})();
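
As a starting point for the exercise Charlie suggested, one option is to pass true as the second argument to heap.objects(), which asks jhat to include instances of subclasses. The sketch below is untested against a real heap dump and rests on an assumption: that RubyString, RubyFile, and the other Ruby* classes extend org.jruby.RubyObject in your JRuby version and carry a usable metaClass. The names tallyRubyClasses, heapApi, and countFn are hypothetical. The function takes jhat's heap object and count() function as parameters only so that it can be tried with stubs outside jhat.

```javascript
// Hypothetical variant of the tally: heapApi and countFn stand in for
// jhat's built-in "heap" object and count() function.
function tallyRubyClasses(heapApi, countFn) {
  var histo = {};
  // true = also visit instances of RubyObject subclasses (assumption:
  // RubyString, RubyFile, etc. are such subclasses)
  var total = countFn(heapApi.objects("org.jruby.RubyObject", true),
    function (it) {
      if (it.metaClass == null || it.metaClass.classId == null) {
        return false;
      }
      var rc = it.metaClass.classId.value.toString();
      if (Object.prototype.hasOwnProperty.call(histo, rc)) {
        histo[rc]++;
      } else {
        histo[rc] = 1;
      }
      return true;
    });
  return { total: total, histo: histo };
}

// Stubs with made-up data so the sketch can be tried outside jhat.
var stubHeap = {
  objects: function (className, includeSubtypes) {
    return [
      { metaClass: { classId: { value: "String" } } },
      { metaClass: { classId: { value: "String" } } },
      { metaClass: { classId: { value: "File" } } },
      { metaClass: null }
    ];
  }
};
function stubCount(items, predicate) {
  var n = 0;
  for (var i = 0; i < items.length; i++) {
    if (predicate(items[i])) { n++; }
  }
  return n;
}

var result = tallyRubyClasses(stubHeap, stubCount);
```

Inside jhat the call would be tallyRubyClasses(heap, count), with result.histo passed to the sortAssoc helper from the script above to build the table.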

About

Gregg Sporar
