Sundararajan's Weblog

    September 30, 2005

Querying Java heap with OQL

Guest Author

This is a continuation of the "What's in my Java heap?" post. Let us look at more OQL (Object Query Language) examples. But first, we will digress into the built-in functions supported in OQL.

The built-in functions in OQL fall into the following categories:
  • Functions that operate on individual Java objects
    1. sizeof(o) -- returns size of Java object in bytes
    2. objectid(o) -- returns unique id of Java object
    3. classof(o) -- returns Class object for given Java object
    4. identical(o1, o2) -- returns (boolean) whether two given objects are identical
      or not (essentially objectid(o1) == objectid(o2)). Do not use simple JavaScript
      reference comparison for Java objects!
    5. referrers(o) -- returns array of objects referring to given Java object
    6. referees(o) -- returns array of objects referred to by given Java object
    7. reachables(o) -- returns array of objects directly or indirectly referred to
      from given Java object (transitive closure of referees of given object)
  • Functions that operate on arrays
    1. contains(array, expr) -- returns (boolean) whether the array contains an element that satisfies given expression.
      The expression can refer to the built-in variable 'it', which is the element currently being iterated over
    2. count(array, [expr]) -- returns number of elements satisfying given expression
    3. filter(array, expr) -- returns a new array containing elements satisfying given expression
    4. map(array, expr) -- returns a new array that contains results of applying given expression to
      each element of input array
    5. sort(array, [expr]) -- sorts the given array. Optionally accepts a comparison expression to use;
      if not given, sort uses numerical comparison
    6. sum(array) -- sums all elements of array

    As you can see, most of the array functions accept an expression, usually a boolean one, which can refer to the current element through the it variable. This allows operating on arrays without explicit loops -- the built-in functions loop through the array and apply the expression to each element.
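To make the role of the it variable concrete, here is a minimal sketch in plain JavaScript of how such array functions could evaluate an expression string against each element. The names compile, oqlFilter, oqlMap, oqlCount, oqlContains and oqlSum are hypothetical stand-ins invented for this illustration; they are not jhat's actual implementation, and they only assume that each expression is evaluated with it bound to the current element.

```javascript
// Hypothetical stand-ins for the OQL array built-ins (illustration only).
// Turn an expression string into a function whose single parameter is 'it'.
function compile(expr) {
  return new Function("it", "return (" + expr + ");");
}

// Keep the elements for which the expression is truthy.
function oqlFilter(array, expr) {
  return array.filter(compile(expr));
}

// Build a new array from the value of the expression for each element.
function oqlMap(array, expr) {
  return array.map(compile(expr));
}

// Count matching elements; with no expression, count every element.
function oqlCount(array, expr) {
  return expr === undefined ? array.length : oqlFilter(array, expr).length;
}

// True if at least one element satisfies the expression.
function oqlContains(array, expr) {
  return array.some(compile(expr));
}

// Add up all elements of a numeric array.
function oqlSum(array) {
  return array.reduce(function (a, b) { return a + b; }, 0);
}

var sizes = [16, 24, 8, 40];
var large = oqlFilter(sizes, "it > 20");         // elements greater than 20
var doubledTotal = oqlSum(oqlMap(sizes, "it * 2"));
```

In this sketch the caller never writes a loop: the helper iterates and the expression string only says what to do with each element.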

There is also a built-in object called heap, which has various useful methods. For more details on the built-in functions and objects, refer to the "OQL Help" link on jhat's OQL page.
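The warning attached to identical(o1, o2) in the list above can be illustrated with a small plain-JavaScript sketch. It rests on an assumption that is not stated in this post: that script code sees each Java object through a wrapper, and that two wrappers for the same heap object need not be the same JavaScript object. The wrap function and the id value are hypothetical; objectid and identical are simplified stand-ins for the built-ins of the same name.

```javascript
// Hypothetical wrapper factory: each call creates a fresh script-side
// wrapper around the same underlying heap object (an assumption).
function wrap(heapObject) {
  return { id: heapObject.id, target: heapObject };
}

// Simplified stand-ins for the OQL built-ins of the same name.
function objectid(w) {
  return w.id;
}
function identical(w1, w2) {
  return objectid(w1) == objectid(w2);
}

var underlying = { id: "0x1f00" };   // made-up object id
var w1 = wrap(underlying);
var w2 = wrap(underlying);

var sameReference = (w1 === w2);     // compares the wrappers, not the heap object
var sameObject = identical(w1, w2);  // compares object ids
```

Under that assumption, reference comparison reports two different objects while the id-based comparison correctly reports one.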

Now, let us see some interesting queries.

Select all objects referred by a SoftReference:

select f.referent from java.lang.ref.SoftReference f
where f.referent != null

referent is a private field of the java.lang.ref.SoftReference class (actually a field inherited from java.lang.ref.Reference; you may use javap -p to find such fields!). The where clause filters out the SoftReferences that have been cleared (i.e., whose referent is null).

Show referents that are not referred to by any other object, i.e., the referent is reachable only through that soft reference:

select f.referent from java.lang.ref.SoftReference f
where f.referent != null && referrers(f.referent).length == 1

Note the use of the referrers built-in function to find the referrers of a given object. Because referrers returns an array, the result supports the length property.

Let us refine the above query. We want to find all objects that are referred to only by soft references, but we don't care how many soft references refer to them, i.e., we allow more than one soft reference to refer to the same object.

select f.referent from java.lang.ref.SoftReference f
where f.referent != null &&
filter(referrers(f.referent), "classof(it).name != 'java.lang.ref.SoftReference'").length == 0

Note that the filter function filters the referrers array using a boolean expression. In the filter condition, we check that the class name of the referrer is not java.lang.ref.SoftReference. If the filtered array contains at least one element, then we know that f.referent is referred to from some object that is not of type java.lang.ref.SoftReference! Requiring the filtered array's length to be 0 therefore keeps only the referents whose referrers are all soft references.
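The same "filter, then check for an empty result" logic can be tried outside jhat with a small plain-JavaScript sketch. Here the referrers of an object are represented simply as an array of class names; that representation and the function name onlySoftlyReferenced are assumptions made for this illustration.

```javascript
// Illustration only: referrers are modeled as an array of class names.
var SOFT = "java.lang.ref.SoftReference";

// True when every referrer is a SoftReference, however many there are.
function onlySoftlyReferenced(referrerClassNames) {
  // Keep the referrers that are NOT soft references...
  var others = referrerClassNames.filter(function (name) {
    return name != SOFT;
  });
  // ...and accept the object only if none are left.
  return others.length == 0;
}

var twoSoftRefs = onlySoftlyReferenced([SOFT, SOFT]);
var softAndStrong = onlySoftlyReferenced([SOFT, "java.util.HashMap"]);
```

The first call accepts an object held by two soft references; the second rejects an object that is also held by a HashMap.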

Find all finalizable objects (i.e., objects of a class that overrides the java.lang.Object.finalize() method):

select f.referent from java.lang.ref.Finalizer f
where f.referent != null

How does this work? When an instance of a class that overrides the finalize() method is created (a potentially finalizable object), the JVM registers the object by creating an instance of java.lang.ref.Finalizer. The referent field of that Finalizer object refers to the newly created "to be finalized" object. (Note that this depends on an implementation detail!)

Find all finalizable objects and the approximate size of the heap retained because of them:

select { obj: f.referent, size: sum(map(reachables(f.referent), "sizeof(it)")) }
from java.lang.ref.Finalizer f
where f.referent != null

Ah! That looks really complex, but it is actually simple. I use a JavaScript object literal to select multiple values in the select expression (the obj and size properties). reachables finds the objects reachable from a given object. map creates a new array from the input array by applying the given expression to each element, so the map call in this query creates an array holding the size of each reachable object. The sum built-in adds all elements of an array. So, we get the total size of the objects reachable from a given object (f.referent in this case).

Why do I say approximate size? The HPROF binary heap dump format does not account for the actual bytes used in a live JVM. Instead, it uses the minimal size needed to hold the data: for example, 2 bytes for a char, 1 byte for a boolean and so on. A live JVM may use more. For example, JVMs may align smaller data types such as char and use 4 bytes instead of 2 bytes. Also, JVMs tend to use one or two header words with each object. None of this is accounted for in the HPROF dump.
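The reachables, map and sum pipeline can be sketched in plain JavaScript over a tiny made-up object graph. Everything here is an assumption for illustration: mockHeap, its sizes, and this reachables function are hypothetical, and the sketch assumes the starting object itself is not counted among its reachable objects, which this post does not specify.

```javascript
// A tiny mock heap: each entry has a size in bytes and the ids of its referees.
var mockHeap = {
  a: { size: 16,  refs: ["b", "c"] },
  b: { size: 24,  refs: ["c"] },
  c: { size: 8,   refs: ["a"] },   // cycle back to a
  d: { size: 100, refs: [] }       // not reachable from a
};

// Transitive closure of referees, excluding the starting object (assumption).
function reachables(id) {
  var seen = {};
  var stack = mockHeap[id].refs.slice();
  var result = [];
  while (stack.length > 0) {
    var cur = stack.pop();
    if (cur === id || seen[cur]) continue;  // skip the start object and repeats
    seen[cur] = true;
    result.push(cur);
    stack = stack.concat(mockHeap[cur].refs);
  }
  return result;
}

// Equivalent of sum(map(reachables(obj), "sizeof(it)")) on the mock heap.
function reachableSize(id) {
  return reachables(id)
    .map(function (r) { return mockHeap[r].size; })
    .reduce(function (a, b) { return a + b; }, 0);
}

var sizeFromA = reachableSize("a");  // sizes of b and c: 24 + 8
```

The seen set is what keeps the traversal from looping forever on the cycle between a and c, and object d is never visited because nothing reachable from a refers to it.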

That's all for now! We will see more interesting OQL queries in the future. Stay tuned...


Comments ( 10 )
  • Gordon Mohr Tuesday, November 22, 2005
    Neat stuff! Any way to get the _total_ (or _average_) of everything reachable from a query's result set, rather than a row per result?

    For example, based on the Finalizer example, something like:

    select sum(*)
    from { select sum(map(reachables(f.referent), "sizeof(it)"))
    from java.lang.ref.Finalizer f
    where f.referent != null }

  • A. Sundararajan Tuesday, November 22, 2005
    Nested queries have not been implemented. But there is a built-in object called "heap" that has many useful methods. One of them is "objects", which accepts a class, an instanceof flag and a filter. In other words, this method gives you the same power as select-from-where, so it is possible to use that. Alternatively, since OQL is just a cover over JavaScript, you can even write procedural code using heap and the other built-in functions and compute whatever is needed.
  • Gordon Mohr Wednesday, November 23, 2005
    Aha. I see what you mean -- the select/from/where is just syntactic sugar on the built-in functions.

    But, the behavior I'm seeing is still mysterious.

    For example, I'm unclear why this...

     select sum(heap.objects("sun.util.calendar.ZoneInfo"),"sizeof(it)")

    ...returns a concatenated string of all instance sizes, rather than a true total.

    Because of that, the closest I can get to the result I want is to keep a running average, and ignore all but the last value reported, like this:

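    function tally(obj) {
      // Lazily initialize the running totals on first use.
      if (typeof tally.total == 'undefined') {
        tally.total = 0;
        tally.count = 0;
      }
      tally.total += sum(map(reachables(obj), "sizeof(it)"));
      tally.count++;
      // Running average over all objects seen so far.
      return tally.total / tally.count;
    }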

    But, it seems like there should be a simpler way...

    Thanks for your help!

  • A. Sundararajan Thursday, November 24, 2005
    Yes, it is a bug in sum function :-( I am sorry.
    I'll try to fix this as soon as I can.
  • Peter Arrenbrecht Monday, June 26, 2006
    This is extremely cool! I have been thinking along similar lines (heap query) for a while, but as a tool integrated within an IDE's debugger. Does Mustang offer an API such as what JHAT can use for the heap dump into the live heap, for debuggers?
  • qw Sunday, September 17, 2006
  • David Hunnisett Friday, March 30, 2007
    Hi I have a really really simple question.
    I am trying to find a doc / spec for objectid
    any ideas where one might be ?
  • A. Sundararajan Friday, March 30, 2007
    Hi David Hunnisett: objectid and other built-in functions for OQL are specified in the OQL help which can be accessed within the jhat's OQL page. "objectid" returns a unique string id for a given object and can be used to compare for identity.
  • C. Simpson Wednesday, April 18, 2007
    How do you select only new objects in OQL when jhat is started with a baseline file and a new file? There is a canned query page to include the count of new objects so I know there must be something that discriminates old and new objects but I could find no function or field in the OQL docs that identify an object as new.
  • A. Sundararajan Thursday, April 19, 2007
    Hi C. Simpson: Actually, the canned queries do not go through the OQL interface. These queries directly operate on top of the Java classes of jhat. OQL interface was added in JDK 6. Yes, you are right. There is no function in OQL to check new objects from given baseline. Sorry. Will try to address that in future.