Java @Contended annotation to help reduce false sharing

See this posting by Aleksey Shipilev for details -- @Contended is something we've wanted for a long time. The JVM provides automatic layout and placement of fields. Usually it'll (a) sort fields by descending size to improve footprint, and (b) pack reference fields so the garbage collector can process a contiguous run of reference fields when tracing. @Contended gives the program a way to provide more explicit guidance with respect to concurrency and false sharing. Using this facility we can sequester hot, frequently written shared fields away from other mostly read-only or cold fields. The simple rule is that read-sharing is cheap, and write-sharing is very expensive. We can also pack together fields that tend to be written by the same thread at about the same time.
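
As a rough illustration of the grouping idea, here is a minimal sketch. The QueueStats class and its fields are hypothetical, made up for this example. The annotation declared at the top is only a local stand-in so the sketch compiles on any JDK, and it has no effect on layout. The real annotation is sun.misc.Contended in JDK 8 (it lives in an internal JDK package in later releases), and HotSpot ignores it in application classes unless the JVM is started with -XX:-RestrictContended.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in so the sketch compiles anywhere. It does NOT change layout.
// Swap in the JDK's own @Contended to get real padding.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.TYPE})
@interface Contended {
    String value() default ""; // group name; fields in the same group are kept together
}

// Hypothetical statistics object shared by one producer and one consumer thread.
class QueueStats {
    // Read-mostly fields: read-sharing is cheap, so leave them to the default layout.
    final String name;
    final int capacity;

    // Written together by the producer thread, so they share one group.
    @Contended("producer") volatile long enqueued;
    @Contended("producer") volatile long lastEnqueueNanos;

    // Written by the consumer thread, so it goes in a different group,
    // isolated from the producer's fields.
    @Contended("consumer") volatile long dequeued;

    QueueStats(String name, int capacity) {
        this.name = name;
        this.capacity = capacity;
    }

    // Called only by the producer (single writer, so a plain increment is enough).
    void recordEnqueue(long nowNanos) {
        enqueued++;
        lastEnqueueNanos = nowNanos;
    }

    // Called only by the consumer.
    void recordDequeue() {
        dequeued++;
    }

    long depth() {
        return enqueued - dequeued;
    }
}
```

With the real annotation in place, the intent is that the two producer fields land together in their own padded region, the consumer field lands in another, and the final fields stay wherever the default layout puts them.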

More generally, we're trying to influence relative field placement to minimize coherency misses. In a simple single-threaded environment, fields that are accessed closely together in time should be placed proximally in space to promote cache locality. That is, temporal locality should condition spatial locality. That having been said, when threads are accessing our fields concurrently we have to be careful to avoid false sharing and excessive invalidation from coherence traffic. As such, we try to cluster or otherwise sequester fields that tend to be written at approximately the same time by the same thread onto the same cache line. Note that there's a tension at play: if we try too hard to minimize single-threaded capacity misses then we can end up with excessive coherency misses when running in a parallel environment. In native C/C++ code it's fairly typical for programmers to use informed concurrency-aware structure layout. @Contended should give us the same capability in Java, although in native code the binding of fields to offsets happens at compile-time, while it happens at load-time for Java. It's worth pointing out that in the general case there is no single optimal layout for both single-threaded and multithreaded environments. And the ideal layout problem itself is NP-hard.
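
To make the tension concrete, here is a small sketch that compares two threads writing adjacent fields of one object against two threads writing fields that are padded apart by hand. This is the manual-padding idiom rather than @Contended itself. All class names are hypothetical, and the sketch assumes 64-byte cache lines and that the JVM places superclass fields ahead of subclass fields, neither of which is guaranteed. Treat the timings as a crude indication only, since there is no warmup or statistical care here.

```java
// Padding on both sides of the hot field, built with a class hierarchy so the
// JVM's field sorting is less likely to move the padding away from the field.
class PadBefore { long p0, p1, p2, p3, p4, p5, p6; }                       // 56 bytes ahead
class HotValue extends PadBefore { volatile long value; }                  // the hot field
class PaddedCounter extends HotValue { long q0, q1, q2, q3, q4, q5, q6; }  // 56 bytes behind

// Two independently written fields that will most likely share a cache line.
class PlainPair { volatile long a; volatile long b; }

class FalseSharingDemo {
    // Runs both writers concurrently and returns the elapsed wall-clock time.
    static long timeNanos(Runnable w1, Runnable w2) {
        Thread t1 = new Thread(w1);
        Thread t2 = new Thread(w2);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    // Returns { nanos with adjacent fields, nanos with padded fields }.
    static long[] compare(long iters) {
        PlainPair pair = new PlainPair();
        long adjacent = timeNanos(
            () -> { for (long i = 1; i <= iters; i++) pair.a = i; },
            () -> { for (long i = 1; i <= iters; i++) pair.b = i; });

        PaddedCounter x = new PaddedCounter();
        PaddedCounter y = new PaddedCounter();
        long padded = timeNanos(
            () -> { for (long i = 1; i <= iters; i++) x.value = i; },
            () -> { for (long i = 1; i <= iters; i++) y.value = i; });

        // Sanity check: every writer completed its loop.
        if (pair.a != iters || pair.b != iters || x.value != iters || y.value != iters) {
            throw new IllegalStateException("writer did not complete");
        }
        return new long[] { adjacent, padded };
    }

    public static void main(String[] args) {
        long[] t = compare(50_000_000L);
        System.out.println("adjacent fields: " + t[0] / 1_000_000 + " ms");
        System.out.println("padded fields:   " + t[1] / 1_000_000 + " ms");
    }
}
```

Note the cost on the other side of the trade-off: each PaddedCounter carries 112 bytes of dead space, which is exactly the kind of footprint and capacity-miss penalty that hurts a single-threaded program.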

Ideally, a JVM would employ hardware monitoring facilities to detect sharing behavior and change the layout on the fly. That's a bit difficult as we don't yet have the right plumbing to provide efficient and expedient information to the JVM. Hint: we need to disintermediate the OS and hypervisor. Another challenge is that raw field offsets are used in the unsafe facility, so we'd need to address that issue, possibly with an extra level of indirection.
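
To show why raw offsets get in the way, here is a minimal sketch of how code typically uses them. The Point class is hypothetical. The sketch relies on sun.misc.Unsafe, which is an unsupported API: it compiles with warnings, and recent JDK releases have deprecated these memory-access methods, so treat this as an illustration of the pattern rather than something to copy.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical class whose layout we peek into.
class Point {
    long x = 7;
    long y = 9;
}

class RawOffsetDemo {
    static long readYViaOffset() {
        try {
            // Usual reflective route to the Unsafe singleton.
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Unsafe unsafe = (Unsafe) theUnsafe.get(null);

            // The offset is queried once. Real code usually caches it in a
            // static final field and reuses it for the life of the program.
            long yOffset = unsafe.objectFieldOffset(Point.class.getDeclaredField("y"));

            // From here on, the access is just "object base + raw offset".
            return unsafe.getLong(new Point(), yOffset);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Once a number like yOffset has escaped into user code, the JVM can no longer move the field without silently breaking every cached copy of that offset, which is why some form of indirection would be needed before layouts could change on the fly.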

Finally, I'd like to be able to pack final fields together as well, as those are known to be read-only.


Hmm, I like the idea of @Contented, presumably to go with relaxed memory semantics and lazy eval? B^>

I also like the idea of @Contended and the layout dynamically chosen to fit the CPU/cache properties is kewl.



Posted by Damon Hart-Davis on November 24, 2012 at 01:35 PM EST #

Hi Damon, Even on a system with a strong sequentially consistent memory model we'll be exposed to issues with false sharing. That is, our concern isn't particularly related to strong or weak memory models.
Regards, -Dave

Posted by guest on November 24, 2012 at 03:22 PM EST #

Sorry, I was mainly ribbing you about the typo in your title, though I *do* also like the real thrust of this feature (and it's something I can tease my C++ addicted friends about with their old-fashioned static compilation)...



Posted by Damon Hart-Davis on November 24, 2012 at 03:26 PM EST #

Thanks for catching the embarrassing typo, which I just fixed! Like you, I wonder exactly what the semantics of "@contented" would mean (:>). Regards, -Dave

Posted by guest on November 24, 2012 at 03:32 PM EST #


Many local variables need not go through GC, as they could be reclaimed when they go out of scope. Is it possible to introduce this feature?


Posted by guest on April 23, 2013 at 12:31 PM EDT #


If Java introduced defined GC timing and overload relative to code location, it would reduce the dependence on GC-related issues for real-time / embedded systems programming. Proper escape analysis can determine which objects can be deleted at which points in the code.

I am thinking it would be great if the following annotations were introduced:
GCAtBlockExit - for loops the GC happens when the
GCOnGCOfContainingObject - GC on GC of aggregate / composing object
GCOnStatementCompletion - GC parameters and return value (if not assigned) of a call when the function returns. Need to support statement level annotations.
GCOnLastResort - Only GC at last resort before outofmemory exception - class level
ExcludeFinalising - prevent finalizers running on the object - class level
GCUsingNewThread(Priority = n)
GCUsing(GCSystem) - all objects created in the specified scope will be GCed using specified GC - annotations need to support static references
GCSuspend - a synchronous object can be used so that the GC threads do not run while the code block is executing


Posted by Suminda Dharmasena on May 03, 2013 at 04:36 AM EDT #

In the penultimate paragraph, you raise a concern about the unsafe facility using raw field offsets. JEP 159 Enhanced Class Redefinition allows for adding and removing fields. Wouldn't JEP 159 have to deal with the unsafe facility as well?

Posted by Nathan Reynolds on June 02, 2013 at 05:41 PM EDT #

Regarding variables going out of scope, escape analysis should provide the desired benefit. But escape analysis is tricky and the JIT is not always able to prove non-escape.

Regards, -Dave

Posted by guest on June 03, 2013 at 10:18 AM EDT #

Hi Nathan, The raw offsets used by unsafe present a number of challenges. First, as mentioned above, we can't easily change object layout on the fly. Second, as you mentioned, it interacts poorly with JEP-159. I'm not certain what the expected outcomes would be regarding redefinition. That is, the JEP-159 implementation might check, for instance, whether offsets have been queried or whether unsafe code using such accesses is currently extant and reachable, and if so, deny the edit. But then again unsafe is just that, so responsibility may fall on the users of unsafe to properly hide their types.


Posted by guest on June 03, 2013 at 10:28 AM EDT #

What tool can I use to view false sharing and its remedy? I didn't know how to use Oracle Studio Analyzer for Java.


Posted by Mohan Radhakrishnan on November 07, 2013 at 08:01 AM EST #

Hi Mohan,

Good question. There are some conservative static tools that detect that false sharing might be taking place, but they don't really have the ability to gauge the magnitude of the issue. They're also vulnerable to false positives, so you have to sort through the output. (Static race detection operates on similar ideas.) There are also runtime approaches that try to tag areas (regions, objects, lines, pages, etc.) with the ID of the last thread to write, and then try to intuit false sharing based on the ID of threads that subsequently read or write that region. The overheads and probe effect can be rather large with such approaches. Based on what I've seen, most of the viable approaches use hardware performance counters based on coherence events, and statistical sampling (with precise or semi-precise traps) to try to identify pieces of code that are generating lots of coherence traffic. This captures both true intentional sharing and false sharing. I'd recommend starting with O S Analyzer. Once you've found a suspected site in the code, you can usually change the data structure layout (@Contended or other approaches) or change the access frequency to the hot fields.


Posted by guest on November 07, 2013 at 10:29 AM EST #


Dave is a senior research scientist in the Scalable Synchronization Research Group within Oracle Labs.

