Thursday Mar 26, 2009

Garbage First - the G1 garbage collector

I think it would be fair to say that OpenDS "raises the bar" when it comes to Java application performance. Many applications have performance requirements associated with one or two of the following criteria:
  • Throughput: the average number of requests handled per second
  • Response time: the average time taken to handle a single request
  • Variability: the variation, or predictability, of response times

For example, an application server such as Glassfish typically has very strong requirements for throughput, but much weaker requirements for response time and variability: a web service must be able to handle a huge volume of traffic, but users will not be too bothered if their request is handled in 10ms, 100ms, or even 5 seconds. Many real-time applications have weak throughput requirements but, by nature, have very strong variability requirements: when an event occurs it must be handled immediately and predictably (e.g. there should be no unpredictable pauses due to thread scheduling, disk IO, DB check pointing, etc).

OpenDS has relatively strong performance requirements for all three criteria: on suitable hardware we need to handle tens of thousands of requests per second per server, with average response times in the millisecond range, and with soft real-time response time variability. I believe that we have satisfied the first two criteria; however, the last one, variability, has posed some problems for us. Variability can be caused by many things, the most obvious of which, for Java applications, is garbage collection.

We recommend that people use the CMS garbage collector, which is enabled using the -XX:+UseConcMarkSweepGC JVM option and is recommended by Sun for server applications. However, CMS is not without its problems: typically it performs a stop-the-world (STW) garbage collection every few seconds, which may last over 100ms in some cases. Occasionally (e.g. once a day) it may perform a full GC: an STW collection that, depending on the size of the application, may last several seconds. These kinds of pauses are unacceptable to some OpenDS users, and this is where G1 enters the picture.

Garbage First, or G1 as it is commonly referred to, is Sun's next-generation soft real-time garbage collector. G1 still performs STW pauses, but gives the user much more control over their duration and frequency.

Here are the potential benefits of G1 to OpenDS:

  1. No long-duration stop-the-world full GCs,
  2. The ability to split GC pauses into more frequent but smaller pauses,
  3. GC pause times that are decoupled from heap size.

In fact, points 2 and 3 above mean that G1 is more cache-friendly, which is great for database applications such as OpenDS.

The OpenDS developers are in a fortunate position: we work for Sun and can collaborate closely with the HotSpot JVM team. To that end, I have had the great opportunity to work very closely with Tony Printezis (a.k.a. "Mr GC") and Laurent Daynes over the past few months. This collaboration is very exciting for both us and the HotSpot JVM team: we get to test and feed our requirements directly to the G1 developers, and they get to test G1 against a Real World application on big hardware that pushes Java, and G1, to the limits.

This collaboration has already been enormously productive for the G1 team:

  • We have identified various scalability issues (bottlenecks and contention points) using a specially instrumented G1-based JVM. None of the issues identified so far look like design problems, and they should be fixed in the coming months,
  • We have been able to test experimental changes to G1. For example, we found that for big JVMs (e.g. 32GB) bigger region sizes yielded a huge performance improvement. Tony and Laurent had suspected this would be the case and, with the help of OpenDS, we could prove it.

The benefits to the OpenDS team are obvious:

  • Early experience with this exciting new garbage collector,
  • A feel for how OpenDS will behave with G1 and great support from the guys who are developing it!
  • We can adapt future development of OpenDS based on our findings.

The great news is that G1 looks like a very promising way forward for us. I can already see the potential benefits - even under heavy write load with really big caches, the pause times remain stable and short-lived, something we have never been able to achieve before.

An experimental version of G1 will be available in JDK6u14 (coming soon). Unfortunately, I don't think that it will include the improvements arising as a result of the G1 - OpenDS collaboration work, as these are arriving too late in the release cycle.
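For reference, this is roughly how the two collectors are enabled on the command line. The G1 flags below are the experimental ones documented for the early builds; the flag names and defaults may change before release, and the application jar shown is just a placeholder:

```shell
# CMS, our current recommendation:
java -XX:+UseConcMarkSweepGC -jar opends.jar

# G1, experimental in JDK 6u14. The two pause goal flags ask the
# collector for at most 50ms of GC pause within any 200ms interval:
java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=200 \
     -jar opends.jar
```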

Wednesday Feb 04, 2009

New ASN1 library brings performance boost to OpenDS

One of my fellow OpenDS core team developers, Bo Li, has spent the last two or three months busily refactoring our ASN1 APIs. Tonight his work is finally complete and we have merged it onto the trunk in time for OpenDS 2.0.

It's never easy to refactor low-level APIs like these, nor is it, at first glance, very rewarding, since you do not have a shiny new feature to play around with when your work is complete.

So what was the point in refactoring these APIs? What do they bring us? Well, before going into too many details, they say a picture is worth a thousand words, so here goes:

[Graph: average etime (ms), refactored ASN1 library vs trunk]

In the graph above, the vertical axis represents the average time taken in milliseconds for a search operation to complete. The horizontal axis represents the size in kilobytes of the entry being returned to the user. We return the entire entry which, in the case of the 4 KB entry, contains around 80 attributes.

As you can see, the refactored ASN1 library (the red line) brings a significant performance improvement: the time taken for a search operation to complete has approximately halved when compared against the original ASN1 library (the blue line).

The old ASN1 libraries, whilst relatively easy to use, were not very efficient and, in particular, forced the developer to waste enormous amounts of memory allocating ASN1Element objects in order to construct complex ASN1 sequence structures. This inefficiency was further compounded during serialization which, for search results containing possibly many attribute values (the ASN1 "leaf" elements), was particularly costly.

To see why, let's take a look at an LDAP search entry response message:

  LDAPMessage ::= SEQUENCE
  {
    messageID           MessageID,
    searchResEntry      [APPLICATION 4] SEQUENCE
    {
      objectName        LDAPDN,
      attributes        SEQUENCE OF
      {
        SEQUENCE
        {
          type          AttributeDescription,
          vals          SET OF
          {
            value       AttributeValue
          }
        }
      }
    },
    controls            [0] Controls OPTIONAL
  }

The innermost elements are the individual attribute values contained by the entry. These represent the real content of the entry - a 4KB entry will contain 4KB of attribute values. As you can see in the illustration above, the attribute values are nested 5 levels deep in the LDAP search entry message structure.

The old ASN1 libraries would copy each ASN1 element within an ASN1 sequence twice during serialization: once to determine the element's content length and once to concatenate the content onto the element header. This meant that each attribute value ended up duplicated 11 times (5 levels x 2 copies, plus the original).
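The arithmetic above can be sketched with a toy simulation of the old two-pass scheme. This is illustrative only, not the real ASN1Element code: each enclosing level copies its content once to compute the length and once more to build the final byte array.

```java
public class OldAsn1CopyCost {
    static long bytesCopied = 0;

    // Old-style serialization of one element: the content is copied once
    // to compute its length and once more to build the final array.
    static byte[] encodeElement(byte[] content) {
        byte[] lengthPass = content.clone();                  // pass 1: length computation
        bytesCopied += lengthPass.length;
        byte[] out = new byte[2 + content.length];            // toy 1-byte tag + 1-byte length
        System.arraycopy(content, 0, out, 2, content.length); // pass 2: header + content
        bytesCopied += content.length;
        return out;
    }

    // Wrap a payload in `depth` nested levels and return how many times the
    // payload's worth of bytes was copied (rounded; headers are noise).
    static long copyFactor(int depth, int payloadSize) {
        bytesCopied = 0;
        byte[] encoded = new byte[payloadSize];
        for (int i = 0; i < depth; i++) {
            encoded = encodeElement(encoded);
        }
        return Math.round((double) bytesCopied / payloadSize);
    }

    public static void main(String[] args) {
        // 5 enclosing levels, as in the LDAP search entry structure:
        // 10 copies, plus the original payload, makes 11 instances in memory.
        System.out.println(copyFactor(5, 4096)); // prints 10
    }
}
```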

A similar story applies when decoding LDAP messages using the old ASN1 libraries.

In addition, our principal back-end implementation also stores entries using an ASN1-encoded format similar to the LDAP search entry message structure. So reading a single entry from the back-end, decoding it, and then serializing it would consume approximately 20 times the amount of memory required by the entry itself.

Our new ASN1 library uses a streaming-based design and typically performs at most one copy during encoding and decoding. We have split encoding and decoding into two distinct interfaces:

  org.opends.server.protocols.asn1.ASN1Reader
  org.opends.server.protocols.asn1.ASN1Writer

A factory class, ASN1, is used to create new ASN1Readers and ASN1Writers from various underlying data sources and data sinks respectively.

Here's an example of how the ASN1Writer interface should be used:

  long msgID = 123; 
  Entry entry = ...;
  OutputStream stream = ...;

  ASN1Writer writer = ASN1.getWriter(stream);

  writer.writeStartSequence();
  {
    // LDAP message
    writer.writeInteger(msgID);
    writer.writeStartSequence();
    {
      // Search entry
      writer.writeOctetString(entry.getDN().toString());
      writer.writeStartSequence();
      for (Attribute attribute : ...)
      {
        // Attribute list
        writer.writeStartSequence();
        {
          // Attribute
          writer.writeOctetString(attribute.getNameWithOptions());
          writer.writeStartSet();
          for (AttributeValue value : attribute)
          {
            // Attribute value
            writer.writeOctetString(value.getValue());
          }
          writer.writeEndSet();
        }
        writer.writeEndSequence();
      }
      writer.writeEndSequence();
    }
    writer.writeEndSequence();
  }
  writer.writeEndSequence();

Decoding follows a similar pattern using the ASN1Reader interface.
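The streaming idea itself can be illustrated with a self-contained sketch. To be clear, this is not the OpenDS API: it is a toy tag-length-value codec showing how writing values straight through to the underlying stream keeps the payload to a single copy in each direction. Real BER lengths are variable-length, and nested sequences of unknown length need the buffering tricks that the real library provides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class StreamingTlv {
    // Write tag, length, then the value straight through to the stream:
    // the payload is copied exactly once on the way out.
    static void writeOctetString(DataOutputStream out, byte[] value) throws IOException {
        out.writeByte(0x04);        // ASN.1 OCTET STRING tag
        out.writeInt(value.length); // toy fixed 4-byte length (real BER lengths vary)
        out.write(value);
    }

    // Read tag and length, then fill the result array directly:
    // the payload is copied exactly once on the way in.
    static byte[] readOctetString(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();
        if (tag != 0x04) {
            throw new IOException("expected OCTET STRING, got tag " + tag);
        }
        byte[] value = new byte[in.readInt()];
        in.readFully(value);
        return value;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeOctetString(new DataOutputStream(buffer), "cn=example".getBytes("UTF-8"));
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(new String(readOctetString(in), "UTF-8")); // prints cn=example
    }
}
```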

Part of this work also required that we provide some simple APIs for manipulating byte arrays. We have implemented a design similar to the JDK CharSequence, String, and StringBuilder classes. Specifically, we now have the following classes and interfaces defined in org.opends.server.types:

  • ByteSequence - this is similar to CharSequence and provides a read-only interface for manipulating sequences of bytes.
  • ByteString - this is similar to String and provides an immutable sequence of bytes.
  • ByteStringBuilder - this is similar to StringBuilder and provides methods for incrementally constructing a byte sequence.
  • ByteSequenceReader - this is similar to StringReader and provides methods for incrementally decoding content from a byte sequence.
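To give a flavor of the CharSequence-style split, here is a minimal self-contained sketch. The names mirror those above, but the real classes have a much richer API; the resizing strategy and method signatures here are my own invention:

```java
import java.util.Arrays;

// Read-only view of a sequence of bytes, analogous to CharSequence.
interface ByteSequence {
    int length();
    byte byteAt(int index);
    byte[] toByteArray();
}

// Mutable builder for incrementally constructing a byte sequence,
// analogous to StringBuilder.
public class ByteStringBuilder implements ByteSequence {
    private byte[] buffer = new byte[16];
    private int length = 0;

    public ByteStringBuilder append(byte[] bytes) {
        if (length + bytes.length > buffer.length) {
            // Grow geometrically so repeated appends stay cheap on average.
            buffer = Arrays.copyOf(buffer,
                    Math.max(buffer.length * 2, length + bytes.length));
        }
        System.arraycopy(bytes, 0, buffer, length, bytes.length);
        length += bytes.length;
        return this;
    }

    public int length() { return length; }
    public byte byteAt(int index) { return buffer[index]; }
    public byte[] toByteArray() { return Arrays.copyOf(buffer, length); }

    public static void main(String[] args) {
        ByteStringBuilder b = new ByteStringBuilder();
        b.append("cn=".getBytes()).append("example".getBytes());
        System.out.println(b.length()); // prints 10
    }
}
```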

I believe we now have a pretty solid low level API stack in OpenDS which should prove to be stable and capable of meeting all our requirements for the foreseeable future (famous last words!).

Please feel free to give us any feedback.

Saturday Jan 24, 2009

Welcome to my new blog!

Thank you for taking the time to visit my shiny new blog.

My name is Matthew Swift and I am a software engineer working on the OpenDS Project - Sun's open source next generation LDAP Directory Server.

Come back again if you are interested in understanding the internals of OpenDS, upcoming improvements to the core server, and the inside story on what we're doing, why we're doing it, and how we're doing it.

See you soon!

