Tuesday Feb 18, 2014

Finite Number of Fat Locks in JRockit

Introduction

JRockit has a hard limit on the number of fat locks that can be "live" at once. While this limit is very large, the use of ever larger heap sizes makes hitting this limit more likely. In this post, I want to explain what exactly this limit is and how you can work around it if you need to.

Background

Java locks (AKA monitors) in JRockit come in one of two varieties: thin and fat. (We'll leave recursive and lazy locking out of the conversation for now.) For a detailed explanation of how we implement locking in JRockit, I highly recommend reading chapter 4 of JR:TDG (Oracle JRockit: The Definitive Guide). But for now, all you need to understand is the basic difference between thin and fat locks. Thin locks are lightweight locks with very little overhead, but any thread trying to acquire a thin lock must spin until the lock is available. Fat locks are heavyweight and have more overhead, but threads waiting for them can queue up and sleep while waiting, saving CPU cycles. As long as there is only very low contention for a lock, thin locks are preferred. But if there is high contention, then a fat lock is ideal. So normally a lock will begin its life as a thin lock and only be converted to a fat lock once the JVM decides there is enough contention to justify it. This conversion between thin and fat is known as inflation and deflation.

Limitation

One of the reasons we call fat locks "heavyweight" is that we need to maintain much more data for each individual lock. For example, we need to keep track of any threads that have called wait() on it (the wait queue) and also any threads that are waiting to acquire the lock (the lock queue). For quick, constant-time access to this lock information, we store it in an array that we'll call the monitor array. Each object that corresponds to a fat lock holds an index into this array. We store this index value in a part of the object header known as the lock word. The lock word is a 32-bit value that contains several flags related to locking (and the garbage collection system) in addition to the monitor array index value (in the case of a fat lock). After the 10 flag bits, there are 22 bits left for our index value, limiting the maximum size of our monitor array to 2^22, or space to keep track of just over 4 million fat locks.
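
To make the arithmetic concrete, here is a tiny sketch. The exact bit layout of the lock word is an internal detail and is only assumed here for illustration; all that matters is that 22 index bits give us 2^22 slots.

=== MonitorIndexMath.java (illustration only)
public class MonitorIndexMath {

    // Assumed layout, for illustration only: a 32-bit lock word with
    // 10 flag bits, leaving 22 bits for the monitor array index.
    static final int FLAG_BITS  = 10;
    static final int INDEX_BITS = 32 - FLAG_BITS;          // 22
    static final int INDEX_MASK = (1 << INDEX_BITS) - 1;   // 0x3FFFFF

    public static void main(String[] args) {
        System.out.println("monitor array slots: " + (1 << INDEX_BITS)); // 4194304
        System.out.println("maximum index:       " + INDEX_MASK);        // 4194303
    }
}
===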

Now for a fat lock to be considered "live", meaning it requires an entry in the monitor array, its object must still be on the heap. If the object is garbage collected or the lock is deflated, its slot in the array will be cleared and made available to hold information about a different lock. Note that because we depend on GC to clean up the monitor array, even if the object itself is no longer part of the live set (meaning it is eligible for collection), the lock information will still be considered "live" and cannot be recycled until the object actually gets collected.

So what happens when we use up all of the available slots in the monitor array? Unfortunately, we abort and the JVM exits with an error message like this:

===
[ERROR] JRockit Fatal Error: The number of active Object monitors has overflowed. (87)
[ERROR] The number of used monitors is 4194304, and the maximum possible monitor index 4194303
===

Want to see for yourself? Try the test case below. One way to guarantee that a lock gets inflated by JRockit is to call wait() on it. So we'll just keep calling wait() on new objects until we run out of slots.

=== LockLeak.java
import java.util.LinkedList;
import java.util.List;

public class LockLeak extends Thread {

    static List<Object> list = new LinkedList<Object>();

    public static void main(String[] arg) {
        boolean threadStarted = false;
        for (int i = 0; i < 5000000; i++) {
            Object obj = new Object();
            synchronized (obj) {
                list.add(0, obj);
                if (!threadStarted) {
                    (new LockLeak()).start(); // start the notifier thread once
                    threadStarted = true;
                }
                try {
                    obj.wait(); // calling wait() forces JRockit to inflate this lock
                } catch (InterruptedException ie) {} // eat Exception
            }
        }
        System.out.println("done!"); // you must not be on JRockit!
        System.exit(0);
    }

    // Notifier thread: wakes the main thread so it can move on to the next object.
    public void run() {
        while (true) {
            Object obj = list.get(0);
            synchronized (obj) {
                obj.notify();
            }
        }
    }

}
===

(Yes, this code is not even remotely thread safe. Please don't write code like this in real life and then blame whatever horrible fate befalls you on me. Think of this code as for entertainment purposes only. You have been warned.)

Resolution

While this may seem like a very serious limitation, in practice even the most demanding applications are very unlikely to hit it. The good news is, even if you do have a system that runs up against this limit, you should be able to tune around the issue without too much difficulty. The key point is that GC is required to clean up the monitor array. The more frequently you collect your heap, the sooner "stale" monitor information (lock information for an object that is no longer part of the live set) will be removed.

As an example, one of our fellow product teams here at Oracle recently hit this limit while using a 50GB heap with a single-space collector. By enabling the nursery (switching to a generational collector), they were able to completely avoid the issue. By proactively collecting short-lived objects, they avoided filling up the monitor array with entries for dead objects (which would otherwise have to wait for a full GC to be removed).
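
On JRockit, that switch looks something along the lines of the command below (the heap size here just mirrors their setup; adjust both options to your own environment):

$ java -Xgc:gencon -Xmx50g MyApp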

Another possible solution is to set the -XX:FatLockDeflationThreshold option to a value below the default of 50 to deflate fat locks more aggressively. While this does work well for simple test cases like LockLeak.java above, I believe that more aggressive garbage collection is more likely to resolve any real-world issue without a negative performance impact.
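
For example (the value 25 is arbitrary and purely illustrative, assuming the usual -XX:<name>=<value> syntax; as noted above, the default is 50):

$ java -XX:FatLockDeflationThreshold=25 MyApp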

Either way, we have never seen anyone hit this problem who was not able to tune around the limitation very easily. It is hard to imagine that any real system will ever need more than 4 million fat locks at once. But in all seriousness, given JRockit's current focus on stability and the lack of a use case that requires more, we are almost certainly never going to make the significant (read: risky) changes that removing or expanding this limit would require. The good news is that HotSpot does not seem to have a similar limitation.

Conclusion

You are very unlikely to ever see this issue unless you are running an application with a very large heap, a lot of lock contention, and very infrequent collections. By tuning the JVM to collect the dead objects that correspond to fat locks sooner, for example by enabling a young collector, you should be able to avoid this limit easily. In practice, no application today (or in the near future) will really need over 4 million fat locks at once. As long as you help the JVM prune the monitor array frequently enough, you should never even notice this limit.

Sunday Feb 02, 2014

Inflation System Properties

I wanted to write a quick post about the two inflation-related system properties: sun.reflect.inflationThreshold and sun.reflect.noInflation. There seems to be a lot of confusion on Oracle's forums (and the rest of the net) regarding their behavior. Since neither of these properties is officially documented by us, I thought an informal explanation here might help some people.

There are a ton of good resources out on the net that explain inflation in detail and why we do it. I won't try to duplicate the level of detail of those efforts here. But just to recap:

There are two ways for Java reflection to invoke a method (or constructor) of a class: JNI or pure-Java. JNI is slow to execute (mostly because of the transition overhead from Java to JNI and back), but it incurs zero initial cost because we do not need to generate anything; a generic accessor implementation is already built-in. The pure-Java solution runs much faster (no JNI overhead), but has a high initial cost because we need to generate custom bytecode at runtime for each method we need to call. So ideally we want to only generate pure-Java implementations for methods that will be called many times. Inflation is the technique the Java runtime uses to try and achieve this goal. We initially use JNI by default, but later generate pure-Java versions only for accessors that are invoked more times than a certain threshold. If you think this sounds similar to HotSpot method compilation (interpreter before JIT) or tiered compilation (c1 before c2), you've got the right idea.

This brings us to our two system properties that influence inflation behavior:

sun.reflect.inflationThreshold

This integer specifies the number of times a method will be accessed via the JNI implementation before a custom pure-Java accessor is generated. (default: 15)

sun.reflect.noInflation

This boolean will disable inflation (the default use of JNI before the threshold is reached). In other words, if this is set to true, we immediately skip to generating a pure-Java implementation on the first access. (default: false)
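
For example, to skip the JNI stage entirely and generate a pure-Java accessor on first use:

$ java -Dsun.reflect.noInflation=true MyApp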

There are a few points I would like to make about the behavior of these two properties:

1. noInflation does NOT disable the generation of pure-Java accessors; it disables use of the JNI accessor. This behavior is the exact opposite of what many users assume based on the name. In this case, "inflation" does not refer to the act of generating a pure-Java accessor; it refers to the two-stage process of using JNI first to try and avoid the overhead of generating a pure-Java accessor for a method that may only be called a handful of times. Setting this property to true means you don't want to use JNI accessors at all and always want to generate pure-Java accessors.

2. Setting inflationThreshold to 0 does NOT disable the generation of pure-Java accessors. In fact, it has almost the exact opposite effect! If you set this property to 0, then on the first access the runtime will determine that the threshold has already been crossed and will generate a pure-Java accessor (which will be used starting from the next invocation). Apparently, IBM's JDK interprets this property differently, but on both of Oracle's JDKs (OracleJDK and JRockit) and OpenJDK, 0 will not disable generation of Java accessors; it will almost guarantee it. (Note that because the first invocation will still use the JNI accessor, any value of 0 or less behaves the same as a setting of 1. If you want to generate and use a pure-Java accessor from the very first invocation, setting noInflation to true is the correct way to do it.)

So there is no way to completely disable the generation of pure-Java accessors using these two system properties. The closest we can get is to set inflationThreshold to some very large value. The property is a Java int, so why not use Integer.MAX_VALUE ((2^31)-1)?

$ java -Dsun.reflect.inflationThreshold=2147483647 MyApp

This should hopefully meet the needs for anyone looking to prevent continuous runtime generation of pure-Java accessors.
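
If you would like to watch inflation happen for yourself, here is one quick and entirely unofficial way to do it: reflection frames show up in stack traces, so you can see whether a call went through the JNI accessor (sun.reflect.NativeMethodAccessorImpl) or a generated pure-Java one (sun.reflect.GeneratedMethodAccessorN). The demo class below is mine, not part of the JDK, and the frame names reflect the current sun.reflect internals, so treat it as a throwaway experiment. Try running it as-is, then again with -Dsun.reflect.noInflation=true or a different -Dsun.reflect.inflationThreshold value and compare where the switch happens.

=== InflationDemo.java
import java.lang.reflect.Method;

public class InflationDemo {

    // Called reflectively; prints any sun.reflect.* frames on the current stack
    // so we can see which accessor implementation dispatched the call.
    public static void traceMe() {
        for (StackTraceElement frame : new Exception().getStackTrace()) {
            if (frame.getClassName().startsWith("sun.reflect.")) {
                System.out.println("  " + frame.getClassName());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Method m = InflationDemo.class.getMethod("traceMe");
        for (int i = 1; i <= 20; i++) {
            System.out.println("invocation " + i + ":");
            m.invoke(null);
        }
    }
}
===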

For those of you interested in all of the (not really so) gory details, the following source files (from OpenJDK) correspond to most of the behavior I have described above:

jdk/src/share/classes/sun/reflect/ReflectionFactory.java
jdk/src/share/classes/sun/reflect/NativeMethodAccessorImpl.java
jdk/src/share/classes/sun/reflect/DelegatingMethodAccessorImpl.java
jdk/src/share/classes/sun/reflect/NativeConstructorAccessorImpl.java
jdk/src/share/classes/sun/reflect/DelegatingConstructorAccessorImpl.java

As with anything undocumented, you should not rely on the behavior of these options (or even their continued existence). The idea is that you should not normally need to set these properties or even understand what inflation is; it should be transparent and "just work" right out of the box. The inflation implementation could change at any point in the future without notice. But for now, hopefully this post will help prevent some of the confusion out there.

Tuesday Jan 21, 2014

JRockit R28 issue with exact profiling and generics

Some users of JRockit R28 may have noticed problems using Mission Control's exact profiling on methods that use Java's generics facility. Invocation counters for these methods would simply not respond; calling each method would fail to increment the corresponding call count.

For exact method profiling in R28, we replaced our homegrown bytecode instrumentation solution with Apache's Byte Code Engineering Library (BCEL). A version of BCEL was already included in the JDK as an internal component of JAXP, and using BCEL helped provide a cleaner code generation pipeline.

The catch was that while the internal version of BCEL contained in the JDK is very stable and works fine for the very narrow use cases JAXP requires, it ran into trouble when used to instrument arbitrary code, as needed by the Mission Control Console's profiling tool.

One of those issues was that support for Java generics was never fully implemented in BCEL. In particular, instrumenting a method with generic code could produce bytecode with inconsistencies between the LocalVariableTable (LVT) attribute and the LocalVariableTypeTable (LVTT) attribute (see the class file format specification for details). Thankfully, this issue was found and fixed (in development branches) by the BCEL project:

Bug 39695 - java.lang.ClassFormatError: LVTT entry for 'local' in class file org/shiftone/jrat/test/dummy/CrashTestDummy does not match any LVT entry

Unfortunately, the JDK's version of BCEL predated this fix. So when JRockit tried to instrument such methods using BCEL, the new method's bytecode would be invalid and fail subsequent bytecode validation, leaving the original, uninstrumented version of the method in place.
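
In case the two attributes are unfamiliar, here is roughly what produces them. The class below is purely illustrative (it has nothing to do with the affected product code): compile it with -g and the local variable names gets an entry in the LocalVariableTable with its erased type (java.util.List) plus a matching entry in the LocalVariableTypeTable with its generic signature (List<String>); javap -v should show both tables. It is exactly this pairing that the instrumented bytecode could leave inconsistent.

=== GenericLocals.java
import java.util.ArrayList;
import java.util.List;

public class GenericLocals {

    // With -g, the local "names" appears in both the LVT (erased type)
    // and the LVTT (generic signature).
    static int countNonEmpty(String[] input) {
        List<String> names = new ArrayList<String>();
        for (String s : input) {
            if (s != null && !s.isEmpty()) {
                names.add(s);
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println(countNonEmpty(new String[] { "a", "", "b" }));
    }
}
===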

While I briefly considered adding code on the JRockit side to work around the BCEL issue, it seemed that fixing the version of BCEL in the JDK itself was The Right Thing to do here. Unfortunately for me, the fix for bug 39695 was based on a version of BCEL that was much more recent than the one contained in the JDK, so I needed to port over a lot of other code to get things working.

JDK-8003147: port fix for BCEL bug 39695 to our copy bundled as part of jaxp

My port of the BCEL fix and other needed code went into 5u45, 6u60 and 7u40. Note that for Java 5, our only option was to put the fix into a CPU release, as we no longer provide non-CPU releases for Java 5. This means that the exact version of JRockit this fix made it into depends on the major Java version: R28.2.7 for Java 5 and R28.2.9 for Java 6. As the LVT/LVTT attributes are normally only included in debug builds, recompiling the affected classes without the -g flag should also be a viable workaround for users of earlier releases.
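
Concretely, that workaround amounts to something like the following (file name purely illustrative). Note that javac's default of -g:lines,source already omits the local variable tables; -g:none goes further and also drops line numbers and the source file name, which will make stack traces less useful:

$ javac -g:none MyClass.java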

Hopefully, not too many users were impacted by this issue. As explained very well in JR:TDG, sampling-based profiling (like that provided by Flight Recorder) is almost always a better choice than exact profiling. But this story is interesting for another reason: it is a great example of how depending on internal (not officially documented) classes within the JDK is almost always a really bad idea (*1). Even we have been known to get bit.

(*1) Upcoming modularization of the Java Class Library is expected to do a better job preventing outside use of internal JDK classes. Not that it would have helped here.

Wednesday Jan 15, 2014

JRockit R27.8.1 and R28.3.1 versioning

As part of today's CPU release, JRockit R27.8.1 and R28.3.1 are now available to our support customers for download from MOS. (If you don't have a support contract, upgrading to Java 7 update 51 is the way to go.)

I just wanted to post a brief comment about our versioning scheme. It seems that many people have noticed that we have increased the "minor" version numbers for both R27 and R28. For example, R28 suddenly went from R28.2.9 to R28.3.1. Please let me assure you: these are just ordinary maintenance releases, not feature releases. There is zero significance to the jump in minor version number.

The reasoning behind the jump is simple: fear of breaking stuff. For as long as we have used the Rxx.y.z versioning scheme for JRockit, y and z have been single digits. For better or worse, version strings are often read and parsed by all sorts of tools, scripts, and sometimes even the Java applications themselves. While R28.2.10 may have been the most intuitive choice for today's release, we didn't want to risk breaking anyone's system that somehow depended on these numbers being single digits. So why R28.3.1 as opposed to R28.3.0? We thought that a dot zero release would sound too much like a feature release, so to help emphasize the fact that this is just another maintenance release, we went to .1 instead of .0. R27 had an even bigger sudden jump, from R27.7.7 to R27.8.1. This was done to synchronize the last version digits between R27 and R28 to make it easier to tell what versions were released at the same time (and hence contain the same security fixes).
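
To see why that fear is not entirely unfounded, consider a hypothetical tool or script that compares version strings lexicographically. Compared as plain strings, "R28.2.10" sorts before "R28.2.9", so the newer release would look older:

=== VersionCompare.java
public class VersionCompare {
    public static void main(String[] args) {
        // Naive lexicographic comparison, as a hypothetical tool might do it.
        String released = "R28.2.9";
        String next     = "R28.2.10";
        // Prints a positive number: "R28.2.9" > "R28.2.10" as strings,
        // so the newer release would appear "older" to such a tool.
        System.out.println(released.compareTo(next));
    }
}
===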

We have actually done this once before, when R27 jumped from R27.6.9 to R27.7.1. Because so many JRockit users had already moved on to R28 by then, that bump seems to have gotten a lot less attention than today's release.

So in summary, all recent JRockit releases (R27.6.1 and later for R27; R28.2.1 and later for R28) are maintenance releases. If you are still using JRockit, please plan to upgrade as soon as possible to get these important fixes. (Or even better, make the move to Java 7!)

Monday Jan 13, 2014

<< "Hello, Blog!" MSGBOX >>

Welcome to my new work blog!

For those of you that don't know me, a quick introduction:

I am a member of the Java SE Sustaining Engineering team which is made up of the former Sun and BEA (JRockit) Java sustaining teams. My work is divided between our two JVMs (HotSpot and JRockit) and the various Java client technologies (Java2D, AWT, Swing, JavaWS, applets, Java Sound, etc.). Currently, most of my time is still spent working on JRockit. I am based out of Oracle's Akasaka, Tokyo office. In my spare time, I enjoy programming.

My plan is to post regularly about anything interesting I come across related to Java or even software development in general. There will probably be a lot of posts about JRockit for the immediate future, but I am also looking forward to talking more about HotSpot as the JRockit user base continues the move to JDK7 and beyond.

One of the most fun and exciting things about working on Java is that our users are programmers, just like us. Interacting with the people at user group meetings, JavaOne, and other conventions is always a complete blast. I started this blog to have another outlet to communicate with my fellow programmers.

Thank you for visiting my new blog! I look forward to your comments and feedback.