Friday Feb 29, 2008

A Short java_crw_demo User Guide

The java_crw_demo library is provided as a native code (C) demo of a BCI library that can instrument class files. It is just a demo, but an operating demo in that it has been used in the hprof VM agent and various JVM TI demo agents delivered in the OpenJDK JVM TI Demo Sources (typically the built versions of these demos are in the JDK installed on your system in the demo/jvmti directory). The complete code to java_crw_demo can be found in the OpenJDK Mercurial Repository. In particular, the #include file java_crw_demo.h is of primary interest.

A complete description of the class file format can be found in Chapter 4 of the Java Virtual Machine Specification (or look at the Wikipedia entry on the class file format). Only part of the class file needs to be modified for basic instrumentation: the Code attribute (including the max_stack field), the constant pool, the exception table, the LineNumberTable, the LocalVariableTable, the LocalVariableTypeTable, the StackMapTable, and any StackMap attributes. The java_crw_demo library does not add methods or fields, and it does not change the exception data except to adjust the pc offsets in the exception table. In fact, the basic table changes are just pc or byte offset adjustments. Any instrumentation more detailed than this would best be done with something other than a demo library like java_crw_demo; something like ASM or BCEL would make much more sense.
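As a rough illustration of what "adjusting the pc offsets" means, here is a minimal sketch that relocates exception table entries after bytes have been injected into a method. The struct, the map array, and the function name are hypothetical and are not taken from java_crw_demo. The sketch assumes the instrumenter has already built a table that maps each old bytecode offset to its new offset.

```c
#include <assert.h>
#include <stddef.h>

/* One exception table entry as stored in the Code attribute. */
typedef struct {
    unsigned short start_pc;    /* first covered offset (inclusive)     */
    unsigned short end_pc;      /* end of the covered range (exclusive) */
    unsigned short handler_pc;  /* offset of the handler                */
    unsigned short catch_type;  /* constant pool index, left untouched  */
} ExceptionEntry;

/* Hypothetical helper: map[old_pc] is the new offset of the instruction
 * that used to start at old_pc, and map[code_len] is the new code length. */
void
relocate_exception_table(ExceptionEntry *table, size_t count,
                         const unsigned short *map, size_t code_len)
{
    size_t i;

    for (i = 0; i < count; i++) {
        assert(table[i].start_pc   <  code_len);
        assert(table[i].end_pc     <= code_len);
        assert(table[i].handler_pc <  code_len);
        table[i].start_pc   = map[table[i].start_pc];
        table[i].end_pc     = map[table[i].end_pc];
        table[i].handler_pc = map[table[i].handler_pc];
    }
}
```

The same kind of lookup would apply to the start_pc fields of the LineNumberTable and the local variable tables. Only the offsets move; the catch type and the other constant pool indexes stay as they were.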

Please keep in mind this is just a demo library, and fairly primitive in what it can do. Its functionality was driven by what hprof needed to do, e.g. instrument method entry and exit, and instrument memory allocations.

Basically there are just two functions in this library:

void java_crw_demo(
         unsigned              class_number,
         const char *          name,
         const unsigned char * file_image,
         long                  file_len,
         int                   system_class,
         char *                tclass_name,
         char *                tclass_sig,
         char *                call_name,
         char *                call_sig,
         char *                return_name,
         char *                return_sig,
         char *                obj_init_name,
         char *                obj_init_sig,
         char *                newarray_name,
         char *                newarray_sig,
         unsigned char **      pnew_file_image,
         long *                pnew_file_len,
         FatalErrorHandler     fatal_error_handler,
         MethodNumberRegister  mnum_callback);

char * java_crw_demo_classname(
         const unsigned char * file_image,
         long                  file_len,
         FatalErrorHandler     fatal_error_handler);

The java_crw_demo_classname function is used to extract the class name from a class file. In some cases classes are loaded into the VM without a name (see the defineClass method in java.lang.ClassLoader).
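To give a feel for what extracting the class name involves, here is a minimal sketch that walks the constant pool of a class file image and copies out the name referenced by this_class. It is not the library's implementation: the names sketch_classname, cp_offset, and rd_u2 are made up for this example, and it reports failure with a return value instead of calling a FatalErrorHandler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Class files are big-endian; read an unsigned 16-bit value. */
static unsigned
rd_u2(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | (unsigned)p[1];
}

/* Offset of constant pool entry `want` (1-based), or of the first byte
 * after the pool when want equals constant_pool_count; -1 if malformed. */
static long
cp_offset(const unsigned char *img, long len, unsigned want)
{
    long     pos = 10;  /* magic(4) + minor(2) + major(2) + count(2) */
    unsigned i   = 1;

    if (len < 10 || want == 0 || want > rd_u2(img + 8)) return -1;
    while (i < want) {
        if (pos >= len) return -1;
        switch (img[pos]) {
        case 1:                             /* Utf8: u2 length + bytes */
            if (pos + 3 > len) return -1;
            pos += 3 + (long)rd_u2(img + pos + 1);
            i   += 1;
            break;
        case 5: case 6:                     /* Long, Double: two slots */
            pos += 9; i += 2; break;
        case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
            pos += 5; i += 1; break;
        case 15:                            /* MethodHandle */
            pos += 4; i += 1; break;
        case 7: case 8: case 16: case 19: case 20:
            pos += 3; i += 1; break;
        default:
            return -1;
        }
    }
    return (i == want && pos <= len) ? pos : -1;
}

/* Copy the internal name of this_class (e.g. "java/lang/Object") into out.
 * Returns 0 on success, -1 on a malformed image or a too-small buffer.
 * The bytes are copied as stored (modified UTF-8, same as ASCII for
 * plain names). */
int
sketch_classname(const unsigned char *img, long len, char *out, size_t out_len)
{
    long     end, cls, utf;
    unsigned count, this_class, name_index, name_len;

    assert(img != NULL && out != NULL && out_len > 0);
    if (len < 10 || img[0] != 0xCA || img[1] != 0xFE ||
                    img[2] != 0xBA || img[3] != 0xBE) return -1;
    count = rd_u2(img + 8);
    end   = cp_offset(img, len, count);     /* first byte after the pool */
    if (end < 0 || end + 4 > len) return -1;
    this_class = rd_u2(img + end + 2);      /* skip access_flags (u2) */
    if (this_class == 0 || this_class >= count) return -1;
    cls = cp_offset(img, len, this_class);
    if (cls < 0 || cls + 3 > len || img[cls] != 7) return -1;
    name_index = rd_u2(img + cls + 1);
    if (name_index == 0 || name_index >= count) return -1;
    utf = cp_offset(img, len, name_index);
    if (utf < 0 || utf + 3 > len || img[utf] != 1) return -1;
    name_len = rd_u2(img + utf + 1);
    if (utf + 3 + (long)name_len > len || name_len + 1 > out_len) return -1;
    memcpy(out, img + utf + 3, name_len);
    out[name_len] = '\0';
    return 0;
}
```

The sketch rescans the pool for each lookup to keep the code short. A real parser would more likely record the offset of every entry in one pass.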

The java_crw_demo function is the one you call with a class file image to get back an instrumented class file image. The arguments are described below:

  • unsigned class_number
    A unique identifying number for this class in your agent (you get to define what this will mean). This number will be given back to you when the instrumentation code is executed for method calls and method returns. If you are not doing method call or method return instrumentation, this has little value. It is assumed that you would have some kind of table in the agent code that could map this class number to the class name and its method tables if needed.
  • const char * name
    The name of the class in the form "java/lang/Object".
  • const unsigned char * file_image
    The class file image.
  • long file_len
    The number of bytes in the file_image.
  • int system_class
    Set to non-zero if this class is one that is loaded very early in the VM startup. Great care is needed when modifying these classes during VM startup.
  • char * tclass_name
    The name of the Tracker class that will have the static methods we will call as part of the instrumentation code.
  • char * tclass_sig
    The class signature for the Tracker class.
  • char * call_name
    The name of the static method in the Tracker class that will be used for method entries or indications of method calls.
  • char * call_sig
    The method signature for the call_name method.
  • char * return_name
    The name of the static method in the Tracker class that will be used for method exits or indications of method returns.
  • char * return_sig
    The method signature for the return_name method.
  • char * obj_init_name
    The name of the static method in the Tracker class that will be used for object allocations.
  • char * obj_init_sig
    The method signature for the obj_init_name method.
  • char * newarray_name
    The name of the static method in the Tracker class that will be used for array allocations.
  • char * newarray_sig
    The method signature for the newarray_name method.
  • unsigned char ** pnew_file_image
    If instrumentation happens, this will be set to point to the new instrumented class file image, which is in malloc() space.
  • long * pnew_file_len
    The length of the new class file image returned in *pnew_file_image.
  • FatalErrorHandler fatal_error_handler
    If non-NULL, provides a function to call when fatal errors are encountered while parsing or creating the new class file image.
  • MethodNumberRegister mnum_callback
    If non-NULL, provides a callback function to get access to the method names and signatures in the class. The callback receives the class number you supplied, plus arrays of method names and signatures, plus a count of those methods. The method numbers (the index into the arrays is the method number) are baked into the instrumentation and passed to the instrumented method calls.
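The two callbacks and the class-number table mentioned above might be sketched as follows. The typedefs are written out here as I understand them from java_crw_demo.h, so treat them as an assumption and check them against your copy of the header. The table layout and the names class_table, register_methods, method_name, and fatal_error are hypothetical. The sketch also assumes the name and signature arrays are only valid during the callback, which is why it copies them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to match the typedefs in java_crw_demo.h; verify before use. */
typedef void (*FatalErrorHandler)(const char *message, const char *file, int line);
typedef void (*MethodNumberRegister)(unsigned cnum, const char **names,
                                     const char **sigs, int count);

/* Hypothetical agent-side table, indexed by the class_number you chose. */
#define MAX_CLASSES 1024
typedef struct {
    int    method_count;
    char **names;           /* names[mnum] is the method name */
    char **sigs;            /* sigs[mnum] is its signature    */
} ClassInfo;

ClassInfo class_table[MAX_CLASSES];

static char *
copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL) strcpy(copy, s);
    return copy;
}

/* Candidate mnum_callback: remember the methods of class cnum. */
void
register_methods(unsigned cnum, const char **names, const char **sigs, int count)
{
    ClassInfo *info;
    int        i;

    assert(names != NULL && sigs != NULL);
    if (cnum >= MAX_CLASSES || count <= 0) return;
    info        = &class_table[cnum];
    info->names = calloc((size_t)count, sizeof(char *));
    info->sigs  = calloc((size_t)count, sizeof(char *));
    if (info->names == NULL || info->sigs == NULL) return;
    for (i = 0; i < count; i++) {
        info->names[i] = copy_string(names[i]);
        info->sigs[i]  = copy_string(sigs[i]);
    }
    info->method_count = count;
}

/* Map the (class number, method number) pair that the instrumented code
 * reports back to a method name; NULL if the pair is unknown. */
const char *
method_name(unsigned cnum, unsigned mnum)
{
    if (cnum >= MAX_CLASSES) return NULL;
    if (mnum >= (unsigned)class_table[cnum].method_count) return NULL;
    return class_table[cnum].names[mnum];
}

/* Candidate fatal_error_handler: report where parsing failed and stop. */
void
fatal_error(const char *message, const char *file, int line)
{
    fprintf(stderr, "java_crw_demo error: %s (%s:%d)\n", message, file, line);
    abort();
}
```

With something like this in place, register_methods and fatal_error would be passed as the last two arguments of java_crw_demo, and the Tracker methods could call method_name to turn the numbers they receive back into names.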

It's assumed that the JVM TI agent code would request some kind of class load event; a good example is the heapTracker demo. When it gets a ClassFileLoadHook event (JVMTI_EVENT_CLASS_FILE_LOAD_HOOK), it effectively passes the class image to java_crw_demo:

newImage = NULL;
newLength = 0;
java_crw_demo(cnum, classname,
              class_data, class_data_len, systemClass,
              "HeapTracker", "LHeapTracker;",
              NULL, NULL,
              NULL, NULL,
              "newobj", "(Ljava/lang/Object;)V",
              "newarr", "(Ljava/lang/Object;)V",
              &newImage, &newLength,
              NULL, NULL);

This call only does instrumentation for object and array allocations and doesn't use the callbacks.
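What happens to newImage after the call is worth a sketch. Because the new image is in malloc() space, the agent has to copy it into memory the VM can own and then free the original. The helper below shows that hand-off with a stand-in allocator so that it stays self-contained. The names VmAllocator, demo_alloc, and hand_off_image are hypothetical; in a real agent the allocator would wrap the JVM TI Allocate function, and the outputs would be the new_class_data and new_class_data_len arguments of the event.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the VM's allocator; a real agent would wrap JVM TI Allocate. */
typedef unsigned char *(*VmAllocator)(long size);

/* Demo allocator used only to exercise the sketch outside a VM. */
unsigned char *
demo_alloc(long size)
{
    return malloc((size_t)size);
}

/* Hypothetical helper: copy the instrumented image into VM-owned memory.
 * Returns 1 if a new image was handed off, 0 if the class is left as is.
 * The malloc()'ed image from java_crw_demo is always freed. */
int
hand_off_image(unsigned char *new_image, long new_len, VmAllocator vm_alloc,
               unsigned char **out_data, long *out_len)
{
    int replaced = 0;

    assert(vm_alloc != NULL && out_data != NULL && out_len != NULL);
    if (new_image != NULL && new_len > 0) {
        unsigned char *vm_space = vm_alloc(new_len);
        if (vm_space != NULL) {
            memcpy(vm_space, new_image, (size_t)new_len);
            *out_data = vm_space;   /* the VM owns and later frees this copy */
            *out_len  = new_len;
            replaced  = 1;
        }
    }
    free(new_image);                /* free(NULL) is a no-op */
    return replaced;
}
```

If java_crw_demo did not instrument the class, the outputs are left alone, which corresponds to telling the VM to keep the original class bytes.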

This demo doesn't use java_crw_demo as fully as hprof does: the hprof Tracker class is a complete Tracker class, while the HeapTracker class is just a partial tracking class. The VM agent needs to implement and register the native methods for these Tracker classes.

The Tracker class doesn't have to use native methods, but since hprof was a native code agent, and most VM agents are native code, somehow the information captured via the class file instrumentation needs to get back into the native agent anyway.

A pure Java VM agent via the Java agent mechanisms is probably a better way to go, but at this time I don't have a simple demo of the Java agent.


