Profiling jdbc related method calls with DTrace

The Sun Hotspot VM has build-in facilities for method tracing using DTrace. Each Java method entry and exit now has a probe that can fire. Every non arbitrary Java program most likely does many many method calls. Therefore the expected overhead should be a factor to consider. While running a benchmark on a Glassfish application server environment using PostgreSQL as the database we were trying to understand the time spent in the JDBC driver being used.

Here I try to describe how a package specific profiler is implemented through DTrace. In order to catch these firing probes access to the Java pid is needed. The generic way to create clauses around these probes is:

hotspot$1:::method-entry
and
hotspot$1:::method-return
Here $1 is the pid of the Java process of interest. These probes are specific for a process. Therefor the Java process needs to be started before the DTrace script can run (successfully). In order to not miss any method calls the Java process is best started with the following options in place:
-XX:+UnlockDiagnosticVMOptions
-XX:+PauseAtStartup
-XX:+ExtendedDTraceProbes
The ExtendedDTraceProbes is needed to enable the two mentioned method probes. The UnlockDiagnosticVMOption is needed to enable the PauseAtStartup parameter handling. The PauseAtStartup is needed to stop the VM before the Java program is started, but after the VM has been initialized. The little thing here is that all the VM probes are part of the libjvm.so library. Only once this library is loaded and .init-ted will the probes exists and can we enable the probes specified by our clauses.

The directory from where the VM is started will contain a (empty) file called vm.paused. where is the process ID of the VM. Once this file exists will the VM be in stop-ped state and can your DTrace script be started. If your DTrace script is compiled and active remove the vm.paused. file. This will automagically activate the VM after which it will "just" run to completion.

More details about the probes the hotspot provider gives you can be found at Keith his blog:
http://blogs.sun.com/kamg/category/Hotspot

Here I will describe my attempt to create a "profiler" that can focus on the use (that is (CPU)time consumption) of the methods in a certain package. A note of caution is called for. The code presented below will give a measurable performance degradation. I believe it should not be used in a production environment.

We had an interest to understand which part of a JDBC driver were consuming most of the time. The JDBC drivers packages of interest were org/postgresql/jdbc2 and org/postgresql/jdbc3 as being used from a Glassfish installation. This installation was stressed in a benchmark environment where we had a need to understand where time was spent.

In general a method will call other methods, that most of time will time yet other methods. Not only did we needed an understanding of which method consumed most of the time, but also if this was due to this method or due to one or more other methods that were called by it. This separates the data into two groups: inclusive and exclusive numbers. I think the following idea should be used during the implementation:

target$1:::method-entry
{
    time[depth] = timestamp;        /\* remember when we started           \*/
    depth++;                        /\* one extra step in the call tree    \*/
    includeMeasurement[depth] = 0;  /\* time consumption due to calls to   \*/
                                    /\* other methods originating from this\*/
                                    /\* method is set to zero. Hey we just \*/
                                    /\* started \*/
}

target$1:::method-return
/depth > 0/
{
    depth--;
    delta = timestamp - time[depth];/\* amount of time we spend in this    \*/
    inclusive += delta;             /\* method all the time spend in this  \*/
                                    /\* method (every invocation)          \*/
    includeMeasurement[depth]       /\* total time spend, including methods\*/
        += delta;                   /\* called by me, that must be sub-    \*/
                                    /\* stracted to get the exclusive time \*/
                                    /\* of the method that called me       \*/
    delta -=                        /\* all time on my account reduced with\*/
        includeMeasurement[depth+1];/\* the total time spend by others that\*/ 
                                    /\* I called will give me my exclusive \*/
    includeMeasurement[depth+1] = 0;/\* time to prevent this amount be used\*/
                                    /\* once more in some future           \*/
    exclusive += delta;             /\* My exclusive time                  \*/
}
There are some gotcha's here. Any VM process is multithreaded. To ensure minimal overhead but proper handling of multiple competing threads there is a need to use the DTrace "self->" construct. Both variables inclusive and exclusive are fine candidates to be implemented by an aggregation. The variable delta is typically a clause local variable, while most of the others should be thread local variables. In order to implement this we also need to specify type of the variables used.

I hope the following is a correct implementation:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option destructive
#pragma D option defaultargs
#pragma D option bufsize=16m
#pragma D option aggrate=100ms
#pragma D option dynvarsize=16m

self string package;

/\* class, method and signature together uniquely define a method.
(assuming a single namespace) However concatenating these into
a single string has my preference but strjoin() is only valid in
clause context. Looks to me I need to keep them separately.
If you know a better way, please do inform me
\*/

/\* all these arrays are indexed by calling depth. This to not 
miscalculate when recursive calls are made.
\*/
self string class[int];
self string method[int];
self string signature[int];

self int    started[int];
self int    incl[int];
self int    vincl[int];
self int    depth;

BEGIN
{
        startedAt = timestamp;
}

/\* the arguments specify the class, method, and signature. 
   Currently only interested in the package name. This 
   is the base name of the class.
\*/
hotspot$1:::method-entry
{
        this->str_ptr = (char\*) copyin(arg1, arg2+1);
        this->str_ptr[arg2] = '\\0';
        self->class[self->depth] = (string) this->str_ptr;

        self->package = dirname(self->class[self->depth] );
}

/\*  The clause specified above gets us our package
    name so we can filter. Only for the stuff of
    interest do we need the class, method and 
    signature. And the time (vtime) everything
    started for this method at this call depth.
\*/
hotspot$1:::method-entry
/  self->package == "org/postgresql/jdbc2"
|| self->package == "org/postgresql/jdbc3"
/
{
        this->str_ptr = (char\*) copyin(arg3, arg4+1);
        this->str_ptr[arg4] = '\\0';
        self->method[self->depth] = (string) this->str_ptr;

        this->str_ptr = (char\*) copyin(arg5, arg6+1);
        this->str_ptr[arg6] = '\\0';
        self->signature[self->depth] = (string) this->str_ptr;

        self->started[self->depth] = timestamp;
        self->vstarted[self->depth]= vtimestamp;

        self->depth++;
        self->incl[ self->depth ] = 0;
        self->vincl[ self->depth ] = 0;
}

hotspot$1:::method-return
/self->depth > 0/
{
        self->depth--;

        /\* always nice to know how much time elapsed in 
           context of the amount of calls being handled.
           Puts things in a certain perspective.
        \*/
        @calls[ self->class[self->depth]
              , self->method[self->depth]
              , self->signature[self->depth]
              ] = count();

        this->delta  = timestamp  - self->started [self->depth];

        /\* This one is easy. Just accumulate it all \*/
        @inclusive[ self->class[self->depth]
                  , self->method[self->depth]
                  , self->signature[self->depth]
                  ] = sum(this->delta);
 
        self->incl[self->depth] += this->delta;

        /\* I used everything exept what has been used by those
           called by me \*/
        this->delta -= self->incl[ self->depth + 1 ];

        /\* its been used, and must be reset to zero \*/
        self->incl[ self->depth + 1 ] = 0;

        @exclusive[ self->class[self->depth]
                  , self->method[self->depth]
                  , self->signature[self->depth]
                  ] = sum(this->delta);

        /\* and the same story for the vtimestamp measurements \*/

        this->delta = vtimestamp - self->vstarted[self->depth];
        @vinclusive[ self->class[self->depth]
                   , self->method[self->depth]
                   , self->signature[self->depth]
                   ] = sum(this->delta);

        self->vincl[self->depth] += this->delta;
        this->delta -= self->vincl[ self->depth + 1 ];

        self->vincl[ self->depth + 1 ] = 0;

        @vexclusive[ self->class[self->depth]
                   , self->method[self->depth]
                   , self->signature[self->depth]
                   ] = sum(this->delta);
}

syscall::rexit:entry
/pid == $1/
{
        exit(0);
}

:::END
{
        printf("Total time (sec): %d\\n", (timestamp - startedAt)/1000000000 );
        normalize(@inclusive, 1000);
        normalize(@exclusive, 1000);
        normalize(@vinclusive, 1000);
        normalize(@vexclusive, 1000);
        printa("C %8@d %s/%s %s\\n", @calls);
        printa("TI %8@d %s/%s %s\\n", @inclusive);
        printa("TE %8@d %s/%s %s\\n", @exclusive);
        printa("VI %8@d %s/%s %s\\n", @vinclusive);
        printa("VE %8@d %s/%s %s\\n", @vexclusive);
}
It might take some time to grasp everything. I tested it with the following method call flow:

A() {
    consume(15);
    B();
    consume(20);
    C();
    consume(5);
    B();
    consume(5;
}
B() {
    consume(5);
    C();
    consume(5);
    C();
    consume(10);
}
C() {
    consume(10);
}
Each consume() is an inline chunk of code that consumes that amount of time. This should lead to the following:
     Incl   Excl
A:    135     45
B:     80     40
C:     50     50				5 times called, each 10

Findings

Below some tables with the findings of a benchmark run. These tables show the Top-5 of Count, Ela delta and CPU delta. The Ela columns show the result of the timestamp measurements, the CPU columns that of the results of the vtimestamp measurements. The Incl stand for inclusive, Excl for exclusive. The method signature is shortened by removing the package name part "org/postgresql/" from it.

Top-5 Count
Count Ela Incl Ela Excl Ela delta CPU Incl CPU Exlc CPU delta Method signature
53255123311233110359035900jdbc2/AbstractJdbc2ResultSet/ toInt(Ljava/lang/String;)I
57814024120010666613453422108983112277jdbc2/AbstractJdbc2ResultSet/ getFixedString(I) Ljava/lang/String;
114437043396433960372537250jdbc2/AbstractJdbc2ResultSet/ trimString(ILjava/lang/String;) Ljava/lang/String;
11455652631672139914917624355206213734jdbc2/AbstractJdbc2ResultSet/ getString(I) Ljava/lang/String;
119874950701507010387238720jdbc2/AbstractJdbc2ResultSet/ checkColumnIndex(I)V

Top-5 Elapsed Inclusive
Count Ela Incl Ela Excl Ela delta CPU Incl CPU Exlc CPU delta Method signature
134422739612552948432446218272635jdbc3/AbstractJdbc3Connection/
prepareStatement(Ljava/lang/String;II)
Ljava/sql/PreparedStatement;
10552684984386744631028102048762jdbc2/AbstractJdbc2Statement
$StatementResultHandler
handleResultRows
(Lcore/Query;[Lcore/Field;Ljava/util/Vector;
Lcore/ResultCursor;)V
13442214441241973102439967425117163jdbc2/AbstractJdbc2Connection/
prepareStatement(Ljava/lang/String;)
Ljava/sql/PreparedStatement;
57814024120010666613453422108983112277jdbc2/AbstractJdbc2ResultSet/
getFixedString(I)Ljava/lang/String;
11455652631672139914917624355206213734jdbc2/AbstractJdbc2ResultSet/
getString(I)Ljava/lang/String;

Top-5 Elapsed Delta
Count Ela Incl Ela Excl Ela delta CPU Incl CPU Exlc CPU delta Method signature
10552684984386744631028102048762jdbc2/AbstractJdbc2Statement
$StatementResultHandler/
handleResultRows
(Lcore/Query;[Lcore/Field;Ljava/util/Vector;
Lcore/ResultCursor;)V
134422739612552948432446218272635jdbc3/AbstractJdbc3Connection/
prepareStatement(Ljava/lang/String;II)
Ljava/sql/PreparedStatement;
11455652631672139914917624355206213734jdbc2/AbstractJdbc2ResultSet/
getString(I)Ljava/lang/String;
13442214441241973102439967425117163jdbc2/AbstractJdbc2Connection/
prepareStatement(Ljava/lang/String;)
Ljava/sql/PreparedStatement;
57814024120010666613453422108983112277jdbc2/AbstractJdbc2ResultSet/
getFixedString(I)Ljava/lang/String;

Top-5 CPU Elapsed
Count Ela Incl Ela Excl Ela delta CPU Incl CPU Exlc CPU delta Method signature
134422739612552948432446218272635jdbc3/AbstractJdbc3Connection/
prepareStatement(Ljava/lang/String;II)
Ljava/sql/PreparedStatement;
105499520741796334111480614703336jdbc2/AbstractJdbc2Statement/
executeQuery()Ljava/sql/ResultSet;
13442214441241973102439967425117163jdbc2/AbstractJdbc2Connection/
prepareStatement(Ljava/lang/String;)
Ljava/sql/PreparedStatement;
57814024120010666613453422108983112277jdbc2/AbstractJdbc2ResultSet/
getFixedString(I)Ljava/lang/String;
11455652631672139914917624355206213734jdbc2/AbstractJdbc2ResultSet/
getString(I)Ljava/lang/String;

Top-5 CPU Delta
Count Ela Incl Ela Excl Ela delta CPU Incl CPU Exlc CPU delta Method signature
134422739612552948432446218272635jdbc3/AbstractJdbc3Connection/
prepareStatement(Ljava/lang/String;II)
Ljava/sql/PreparedStatement;
105499520741796334111480614703336jdbc2/AbstractJdbc2Statement/
executeQuery()Ljava/sql/ResultSet;
11455652631672139914917624355206213734jdbc2/AbstractJdbc2ResultSet/
getString(I)Ljava/lang/String;
13442214441241973102439967425117163jdbc2/AbstractJdbc2Connection/
prepareStatement(Ljava/lang/String;)
Ljava/sql/PreparedStatement;
57814024120010666613453422108983112277jdbc2/AbstractJdbc2ResultSet/
getFixedString(I)Ljava/lang/String;

From these top-5 lists it can be concluded that, in the context of this benchmark, the three methods of interest are org/postgresql/jdbc2 package. The methods are: getString, getFixedString both from theAbstractJdbc2ResultSet class and the prepareStatement from the class AbstractJdbc2Connection. The next step in this conquest is to only check what is happening with respect to these three methods.

Comments:

This seems a complete step backwards in terms of software performance engineering for Java applications. DTrace is a fine technology but its application to JDBC performance analysis is woefully lacking and shows clear the main difference between a performance management solution and a nice instrumentation technology. Java EE performance management is all about context - context that is lacking in looking at the invocation of a JDBC code. Context in relation to JDBC includes: JDBC URL, SQL, Parameters, Transaction History (Paths/Patterns), XA XID. An effective performance management solution combines many aspects (metrics, resources, paths, code, parameters) into a complete performance model. This is nowhere to be seen above and what's worse no one would even attempt to use DTrace is such a way when there are many better alternatives. DTrace has its strengths but it is not contextual and certainly cannot associated contexts across tiers in any production environment (except for relatively weak temporal analysis). Monitoring Sun’s Application Servers http://blog.jinspired.com/?p=43 JDBInsight Blog Entries http://blog.jinspired.com/?cat=22 Older JDBInsight Articles http://www.jinspired.com/products/jxinsight/insights.html regards, William Louth JXInsight Product Architect JINSPIRED "Performance monitoring and problem management for Java EE, SOA, and Grid Computing" http://www.jinspired.com

Posted by William Louth on July 19, 2007 at 05:33 AM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

paulvandenbogaard

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today