Monday May 21, 2007

Collector names for GarbageCollectorMXBean MXBean

While trying to use GarbageCollectorMXBean to obtain GC information about a running process, I had to look around to find out the correct name strings for various Garbage Collectors.

This is what I did to obtain an instance of GarbageCollectorMXBean:
  MBeanServer server = ManagementFactory.getPlatformMBeanServer();
  GarbageCollectorMXBean bean = ManagementFactory.newPlatformMXBeanProxy(server,
             GC_BEAN_NAME, GarbageCollectorMXBean.class);

where GC_BEAN_NAME is the ObjectName for uniquely identifying the MXBean for a garbage collector within an MBeanServer and it is of the form:
    java.lang:type=GarbageCollector,name=collector's name  

Now, what is the correct string for collector's name?

Here are the name strings for various collectors in VM:

You can run a simple java program with different collectors and using the below code snippet, can see the full object names of all the available GarbageCollectorMXBean MXBeans.
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
Set names = mbs.queryNames(null, null);
System.out.println(names.toString().replace(", ",

Sunday May 13, 2007

Automatic Imports addition using NetBeans

NetBeans has a very useful keyboard shortcut: Alt+Shift+F. Upon pressing these keys, NetBeans adds all the required import statements at the top of the source file.

Friday May 11, 2007

JVMTI agent to print reference paths

To get started on JVMTI please read JVMTI reference guide JVMTI reference guide and a very good article on JVMTI Programming here

Here I will briefly describe how I used JVMTI to write an agent which prints the reference paths of instances of a given class in the running process.

The global data structure used in this agent:
/\* Global agent data structure \*/
typedef struct {
    /\* JVMTI Environment \*/
    jvmtiEnv      \*jvmti;

    /\* Data access Lock \*/
    jrawMonitorID  lock;

    /\* Fields to store the Options passed to the agent\*/
    char\* classname;   
    int  max_count;    

    /\* boolen to indicate if dump is in progress \*/	
    jboolean dumpInProgress;
    JavaVM\* jvm;
    jclass  klass;
    jlong  klassTag;
} GlobalAgentData;

This agent prints these reference paths when a SIGQUIT signal is sent to the process. For this, In Agent_OnLoad() function, set the callback for DataDumpRequest and enable the event notification for JVMTI_EVENT_DATA_DUMP_REQUEST event.
JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM \*jvm, char \*options, void \*reserved)
  callbacks.DataDumpRequest = &dataDumpRequest;

With the above code, when the process receives SIGQUIT (ctrl+\\ on solaris and ctrl+break on windows) signal, dataDumpRequest() function of this agent would get invoked.

Let's see how this function looks:
static void JNICALL dataDumpRequest(jvmtiEnv \*jvmti)
  enter_critical_section(jvmti); {
    if ( !gdata->dumpInProgress ) {
      gdata->dumpInProgress = JNI_TRUE;
      gdata->klassTag = 1;
      gdata->jvmti->SetTag(gdata->klass, gdata->klassTag);
      jint count = 0;
      void\* user_data = NULL;

      // iterate over all the reachable objects in heap
      jvmti->IterateOverReachableObjects(&heap_root_callback, &stack_ref_callback,
                 &object_ref_callback, user_data);

      // print ref paths
      RefPaths\* list = ref_paths;
      int max = gdata->max_count;

      printf("Reference paths of instances of %s ....\\n", gdata->classname);
      while ((list != NULL) && (max >= 0) ) {
        ObjectInfo\* object = (ObjectInfo\*)list->path;
        printf("\\n\\nReference Path:");
        list = list->next;

      // unset tags
      jvmti->IterateOverReachableObjects(&heap_root_callback, &stack_ref_callback,
            &object_ref_clean_callback, user_data);

      // delete object info list
      ObjectInfoList\* list1 = objList;
      while (list1) {
        ObjectInfoList\* node = list1;
        list1 = list1->next;
      objList = NULL;
      // delete ref paths list
      list = ref_paths;
      RefPaths\* path;

      while (list != NULL) {
        path = list;
        list = list->next;
      ref_paths = NULL;

      gdata->klassTag = 1;
      gdata->dumpInProgress = JNI_FALSE;
  } exit_critical_section(jvmti);	
Here, we first set the tag of the class gdata->klass which was set to the classname passed as parameter to the agent. Then we call IterateOverReachableObjects function to go over all the reachable objects in heap. IterateOverReachableObjects function iterates over the root objects and all objects that are directly and indirectly reachable from the root objects. In dataDumpRequest we are making two calls to IterateOverReachableObjects, first one to create a list of reachable objects and their referrers and the second one to unset the tags set in the first call.

IterateOverReachableObjects accepts three callbacks, callback which gets called when any root object is visited, callback which gets called when any stack object is visited and a callback when any object reference is visited. This sample agent does not do anything in the root and stack callbacks.

Now let's see what the object reference callback does:
static jvmtiIterationControl JNICALL
object_ref_callback(jvmtiObjectReferenceKind reference_kind,
		jlong class_tag, jlong size, jlong\* tag_ptr,
		jlong referrer_tag, jint referrer_index, void \*user_data)
  ObjectInfo\* obj_info = NULL;
  // if this object's tag is set which can be for the class we want to create
  // reference paths for or if we are visiting this object again.
  if (\*tag_ptr != NULL) { 
    if (gdata->klassTag == 1) {
      if (\*tag_ptr == gdata->klassTag) {
        obj_info = new ObjectInfo();
	memset(obj_info, 0 , sizeof(ObjectInfo));
	obj_info->size = size;
	obj_info->visited = 1;
	obj_info->kind = reference_kind;

	\*tag_ptr = (jlong)(ptrdiff_t)(void\*)obj_info;
	gdata->klassTag = \*tag_ptr;
    } else {
      obj_info = (ObjectInfo\*)\*tag_ptr;
      if (obj_info->visited == 1)
  // if tag is not present, then create ObjectInfo and set it as it's tag.
  else {
    obj_info = new ObjectInfo();
    memset(obj_info, 0 , sizeof(ObjectInfo));
    obj_info->size = size;
    obj_info->visited = 1;
    obj_info->kind = reference_kind;

    \*tag_ptr = (jlong)(ptrdiff_t)(void\*)obj_info;

    //Add the new ObjectInfo to ObjectInfo's list
    if (objList == NULL) {
      objList = new ObjectInfoList();
      objList->obj = obj_info;
      objList->next = NULL;
    } else {
      ObjectInfoList\* list = objList;
      while (list->next != NULL) {
        list = list->next;
      ObjectInfoList\* objinfo = new ObjectInfoList();
      objinfo->obj = obj_info;
      objinfo->next = NULL;

      list->next = objinfo;
  // If this object has the referrer then put it in the referrers list
  if (referrer_tag != NULL) {
    if (obj_info->referrers == NULL) {
      obj_info->referrers = new Referrer();
      obj_info->referrers->tag = referrer_tag;
      obj_info->referrers->refereeTag = \*tag_ptr;
      obj_info->referrers->next = NULL;
    } else {
      Referrer\* referrer = obj_info->referrers;
      while (referrer->next != NULL) {
        referrer = referrer->next;
      Referrer\* ref = new Referrer();
      ref->tag = referrer_tag;
      ref->refereeTag = \*tag_ptr;
      ref->next = NULL;
      referrer->next = ref;			

  if (class_tag == gdata->klassTag) {


Here the ObjectInfo class represents an object in the heap. We tag every visited reference with the ObjectInfo pointer and when it's refree is visited, the referer_tag contains this ObjectInfo pointer. So by using tag_ptr and referer_tag, this callback creates list of reference paths (RefPaths) of all the instances of the given class.

After returning from the callback, function dfsPrintRefPaths prints prints these reference paths. Then we make another call to IterateOverReachableObjects to unset the tags we had set in the first iteration.

The full code of this agent can be downloaded from here.

How this works:
Build ref_paths.cpp into a shared library. Copy the library refpaths.dll or in the current working directory or set the LD_LIBRARY_PATH on solaris and PATH env variable on Windows to the loaction of this library.

Start your java program with options:
java -agentlib:refpaths=classname="java/lang/String",max=40 TestClass

where -agentlib:refpaths specifies that 'refpaths' is the jvmti based shared library
and following that are the arguments passed to this agent.
o classname - name of the class for which we want to print the ref-paths
o max - the maximum number of instances

The above commnad will print the 40 ref-paths of instances of java/lang/String class.

To print the ref-paths at any point, send Sigquit signal to the running process
and it will print the paths though which the requested instances are reachable.

The output of ref-paths printed by this agent looks like this:
Reference Path:
- Instance Field: /E:/jre1.5.0_10/lib/ext/dnsns.jar
- Array Element: file:/E:/jre1.5.0_10/lib/ext/dnsns.jar
- Instance Field: [Ljava.lang.Object;@1c78e57
- Instance Field: [file:/E:/jre1.5.0_10/lib/ext/dnsns.jar, file:/E:/jre1.5.0_10/lib/ext/localedata.jar, file:/E:/jre1.5.0_10/lib/ext/sunjce_provider.jar, file:/E:/jre1.5.0_10/lib/ext/sunpkcs11.jar]
- Instance Field: sun.misc.URLClassPath@5224ee
- Instance Field: sun.misc.Launcher$ExtClassLoader@92e78c
- ClassLoader: sun.misc.Launcher$AppClassLoader@a39137
- class TestClass

Reference Path:
- Instance Field: file:/E:/jre1.5.0_10/lib/ext/sunpkcs11.jar!/
- Instance Field: jar:file:/E:/jre1.5.0_10/lib/ext/sunpkcs11.jar!/
- Array Element: sun.misc.URLClassPath$JarLoader@f6a746
- Instance Field: [Ljava.lang.Object;@15ff48b
- Instance Field: [sun.misc.URLClassPath$JarLoader@affc70, sun.misc.URLClassPath
- Instance Field: sun.misc.URLClassPath@5224ee
- Instance Field: sun.misc.Launcher$ExtClassLoader@92e78c
- ClassLoader: sun.misc.Launcher$AppClassLoader@a39137
- class TestClass

where any entry is reachable through the entry below it.

Thursday May 10, 2007

jmap with CMS

Hotspot Servicability Agent, the backbone of jmap tool had a bug due to which jmap was not able to generate correct histogram and heap dump of applications running with Concurrent Mark Sweep collector. This bug 6311411 has been fixed in 5.0u12 and 6.0u2. So starting with these releases, jmap works fine with CMS collector.

Thursday Mar 23, 2006

Understanding CMS GC Logs

CMS GC with -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps prints a lot of information. Understanding this information can help in fine tuning various parameters of the application and CMS to achieve best performance.

Let's have a look at some of the CMS logs generated with 1.4.2_10:

39.910: [GC 39.910: [ParNew: 261760K->0K(261952K), 0.2314667 secs] 262017K->26386K(1048384K), 0.2318679 secs]

Young generation (ParNew) collection. Young generation capacity is 261952K and after the collection its occupancy drops down from 261760K to 0. This collection took 0.2318679 secs.

40.146: [GC [1 CMS-initial-mark: 26386K(786432K)] 26404K(1048384K), 0.0074495 secs]

Beginning of tenured generation collection with CMS collector. This is initial Marking phase of CMS where all the objects directly reachable from roots are marked and this is done with all the mutator threads stopped.

Capacity of tenured generation space is 786432K and CMS was triggered at the occupancy of 26386K.

40.154: [CMS-concurrent-mark-start]

Start of concurrent marking phase.
In Concurrent Marking phase, threads stopped in the first phase are started again and all the objects transitively reachable from the objects marked in first phase are marked here.

40.683: [CMS-concurrent-mark: 0.521/0.529 secs]

Concurrent marking took total 0.521 seconds cpu time and 0.529 seconds wall time that includes the yield to other threads also.

40.683: [CMS-concurrent-preclean-start]

Start of precleaning.
Precleaning is also a concurrent phase. Here in this phase we look at the objects in CMS heap which got updated by promotions from young generation or new allocations or got updated by mutators while we were doing the concurrent marking in the previous concurrent marking phase. By rescanning those objects concurrently, the precleaning phase helps reduce the work in the next stop-the-world “remark” phase.

40.701: [CMS-concurrent-preclean: 0.017/0.018 secs]

Concurrent precleaning took 0.017 secs total cpu time and 0.018 wall time.

40.704: [GC40.704: [Rescan (parallel) , 0.1790103 secs]40.883: [weak refs processing, 0.0100966 secs] [1 CMS-remark: 26386K(786432K)] 52644K(1048384K), 0.1897792 secs]

Stop-the-world phase. This phase rescans any residual updated objects in CMS heap, retraces from the roots and also processes Reference objects. Here the rescanning work took 0.1790103 secs and weak reference objects processing took 0.0100966 secs. This phase took total 0.1897792 secs to complete.

40.894: [CMS-concurrent-sweep-start]

Start of sweeping of dead/non-marked objects. Sweeping is concurrent phase performed with all other threads running.

41.020: [CMS-concurrent-sweep: 0.126/0.126 secs]

Sweeping took 0.126 secs.

41.020: [CMS-concurrent-reset-start]

Start of reset.

41.147: [CMS-concurrent-reset: 0.127/0.127 secs]

In this phase, the CMS data structures are reinitialized so that a new cycle may begin at a later time. In this case, it took 0.127 secs.

This was how a normal CMS cycle runs. Now let us look at some other CMS log entries:

197.976: [GC 197.976: [ParNew: 260872K->260872K(261952K), 0.0000688 secs]197.976: [CMS197.981: [CMS-concurrent-sweep: 0.516/0.531 secs]
(concurrent mode failure): 402978K->248977K(786432K), 2.3728734 secs] 663850K->248977K(1048384K), 2.3733725 secs]

This shows that a ParNew collection was requested, but it was not attempted because it was estimated that there was not enough space in the CMS generation to promote the worst case surviving young generation objects. We name this failure as “full promotion guarantee failure”.

Due to this, Concurrent Mode of CMS is interrupted and a Full GC is invoked at 197.981. This mark-sweep-compact stop-the-world Full GC took 2.3733725 secs and the CMS generation space occupancy dropped from 402978K to 248977K.

The concurrent mode failure can either be avoided by increasing the tenured generation size or initiating the CMS collection at a lesser heap occupancy by setting CMSInitiatingOccupancyFraction to a lower value and setting UseCMSInitiatingOccupancyOnly to true. The value for CMSInitiatingOccupancyFraction should be chosen appropriately because setting it to a very low value will result in too frequent CMS collections.

Sometimes we see these promotion failures even when the logs show that there is enough free space in tenured generation. The reason is 'fragmentation' - the free space available in tenured generation is not contiguous, and promotions from young generation require a contiguous free block to be available in tenured generation. CMS collector is a non-compacting collector, so can cause fragmentation of space for some type of applications. In his blog, Jon talks in detail on how to deal with this fragmentation problem:

Starting with 1.5, for the CMS collector, the promotion guarantee check is done differently. Instead of assuming that the promotions would be worst case i.e. all of the surviving young generation objects would get promoted into old gen, the expected promotion is estimated based on recent history of promotions. This estimation is usually much smaller than the worst case promotion and hence requires less free space to be available in old generation. And if the promotion in a scavenge attempt fails, then the young generation is left in a consistent state and a stop-the-world mark-compact collection is invoked. To get the same functionality with UseSerialGC you need to explicitly specify the switch -XX:+HandlePromotionFailure.

283.736: [Full GC 283.736: [ParNew: 261599K->261599K(261952K), 0.0000615 secs] 826554K->826554K(1048384K), 0.0003259 secs]
GC locker: Trying a full collection because scavenge failed
283.736: [Full GC 283.736: [ParNew: 261599K->261599K(261952K), 0.0000288 secs]

Stop-the-world GC happening when a JNI Critical section is released. Here again the young generation collection failed due to “full promotion guarantee failure” and then the Full GC is being invoked.

CMS can also be run in incremental mode (i-cms), enabled with -XX:+CMSIncrementalMode. In this mode, CMS collector does not hold the processor for the entire long concurrent phases but periodically stops them and yields the processor back to other threads in application. It divides the work to be done in concurrent phases in small chunks(called duty cycle) and schedules them between minor collections. This is very useful for applications that need low pause times and are run on machines with small number of processors.

Some logs showing the incremental CMS.

2803.125: [GC 2803.125: [ParNew: 408832K->0K(409216K), 0.5371950 secs] 611130K->206985K(1048192K) icms_dc=4 , 0.5373720 secs]
2824.209: [GC 2824.209: [ParNew: 408832K->0K(409216K), 0.6755540 secs] 615806K->211897K(1048192K) icms_dc=4 , 0.6757740 secs]

Here, the scavenges took respectively 537 ms and 675 ms. In between these two scavenges, iCMS ran for a brief period as indicated by the icms_dc value, which indicates a duty-cycle. In this case the duty cycle was 4%. A simple calculation shows that the iCMS incremental step lasted for 4/100 \* (2824.209 - 2803.125 - 0.537) = 821 ms, i.e. 4% of the time between the two scavenges.

Starting with 1.5, CMS has one more phase – concurrent abortable preclean. Abortable preclean is run between a 'concurrent preclean' and 'remark' until we have the desired occupancy in eden. This phase is added to help schedule the 'remark' phase so as to avoid back-to-back pauses for a scavenge closely followed by a CMS remark pause. In order to maximally separate a scavenge from a CMS remark pause, we attempt to schedule the CMS remark pause roughly mid-way between scavenges.

There is a second reason why we do this. Immediately following a scavenge there are likely a large number of grey objects that need rescanning. The abortable preclean phase tries to deal with such newly grey objects thus reducing a subsequent CMS remark pause.

The scheduling of 'remark' phase can be controlled by two jvm options CMSScheduleRemarkEdenSizeThreshold and CMSScheduleRemarkEdenPenetration. The defaults for these are 2m and 50% respectively. The first parameter determines the Eden size below which no attempt is made to schedule the CMS remark pause because the pay off is expected to be minuscule. The second parameter indicates the Eden occupancy at which a CMS remark is attempted.

After 'concurrent preclean' if the Eden occupancy is above CMSScheduleRemarkEdenSizeThreshold, we start 'concurrent abortable preclean' and continue precleanig until we have CMSScheduleRemarkEdenPenetration percentage occupancy in eden, otherwise we schedule 'remark' phase immediately.

7688.150: [CMS-concurrent-preclean-start]
7688.186: [CMS-concurrent-preclean: 0.034/0.035 secs]
7688.186: [CMS-concurrent-abortable-preclean-start]
7688.465: [GC 7688.465: [ParNew: 1040940K->1464K(1044544K), 0.0165840 secs] 1343593K->304365K(2093120K), 0.0167509 secs]
7690.093: [CMS-concurrent-abortable-preclean: 1.012/1.907 secs]
7690.095: [GC[YG occupancy: 522484 K (1044544 K)]7690.095: [Rescan (parallel) , 0.3665541 secs]7690.462: [weak refs processing, 0.0003850 secs] [1 CMS-remark: 302901K(1048576K)] 825385K(2093120K), 0.3670690 secs]

In the above log, after a preclean, 'abortable preclean' starts. After the young generation collection, the young gen occupancy drops down from 1040940K to 1464K. When young gen occupancy reaches 522484K which is 50% of the total capacity, precleaning is aborted and 'remark' phase is started.

Note that in 1.5, young generation occupancy also gets printed in the final remark phase.

For more detailed information and tips on GC tuning, please refer to the following documents:




« July 2016