Sunday Nov 20, 2005

Java Heap Sizing: How do I size my Java heap correctly?

Proper heap sizing is key to good Java application performance. Whether you are running a client or a server application, if your system is running low on heap and spending a lot of time in garbage collection, you'll want to investigate adjusting your heap size. You also don't want to set the heap so large that it impacts other applications running on the system. This edition of JVM Performance Tuning Basics covers general heap and generation tuning steps, and new features in JDK 5.0.

h2. Java SE 5.0 Ergonomics

Ergonomics for servers was first introduced in Java SE 5.0. It has greatly reduced tuning time for server applications, particularly for heap sizing and advanced GC tuning. In many cases, specifying no tuning options at all when running on a server is the best tuning you can do. Server ergonomics is enabled when running on a server class machine on Solaris, Linux, and 64-bit Windows. It is disabled by default when running on 32-bit Windows.

Ergonomics selects the following:

* Throughput garbage collector and Adaptive Sizing (-XX:+UseParallelGC)
* Initial heap size of 1/64 of physical memory up to 1Gbyte
* Maximum heap size of 1/4 of physical memory up to 1Gbyte
* Server runtime compiler (-server)

To enable server ergonomics on 32-bit Windows, use the following flags (varying the heap size as needed):

-server -Xmx1g -XX:+UseParallelGC

h2. Identify how much Java heap your application needs

There are several ways to identify how much heap your application is using. Java SE has a suite of monitoring tools such as jstat and jconsole. There are Brian Doherty's jvmstat tools, in particular visualgc. Then there are the tried and true -verbosegc and -XX:+PrintGCDetails. For this example I'll use -verbosegc. For details on GC implementation and logging output, take a look here:

* Details on Garbage Collection Tuning with Java SE 5.0
* Examples of verbosegc and -XX:+PrintGCDetails

h3. -verbosegc

The first step in investigating GC performance problems is looking at -verbosegc output.
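As a complement to the command-line flag, the standard java.lang.management API (available since Java SE 5.0) can request verbose memory output and report current heap usage from inside a running JVM. The sketch below is illustrative only: the class name HeapReport and its summary format are assumptions of this example, and what setVerbose(true) actually prints is implementation dependent.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of the JDK): reports heap usage via the management API.
final class HeapReport {

    // Formats a MemoryUsage snapshot in kilobytes; getMax() is -1 when the maximum is undefined.
    static String summary(MemoryUsage u) {
        String max = u.getMax() < 0 ? "undefined" : (u.getMax() / 1024) + "K";
        return (u.getUsed() / 1024) + "K used of "
                + (u.getCommitted() / 1024) + "K committed (max " + max + ")";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // Ask the JVM for verbose memory output at runtime; the output is implementation dependent.
        memory.setVerbose(true);
        System.out.println(summary(memory.getHeapMemoryUsage()));
    }
}
```

A report like this is only a snapshot; the GC log remains the better record of how heap usage changes over time.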
The following example is a server application with a heap size fixed at 80mb. The server compiler is specified and the default garbage collectors are used. In this case J2SE 1.4.2 is running, so the default is the serial collector.

java -server -Xms80m -Xmx80m my.serverApp

[GC 55974K->35946K(79232K), 0.0269796 secs]
[GC 57834K->36306K(79232K), 0.0278222 secs]
[GC 58194K->36669K(79232K), 0.0264892 secs]
[GC 58557K->37044K(79232K), 0.0223606 secs]
[GC 58932K->37400K(79232K), 0.0262330 secs]

[GC 59288K->37803K(79232K), 0.0271792 secs]
[GC 59691K->38097K(79232K), 0.0283054 secs]
[GC 59985K->38516K(79232K), 0.0276064 secs]
[GC 60404K->38847K(79232K), 0.0244366 secs]
[GC 60735K->43570K(79232K), 0.0732041 secs]
[GC 65458K->56730K(79232K), 0.1476127 secs]
[Full GC 78618K->61524K(79232K), 0.8851303 secs]
[Full GC 79231K->61898K(79232K), 0.9426240 secs]
[Full GC 79231K->62263K(79232K), 0.9828957 secs]
[Full GC 79231K->59527K(79232K), 1.0334212 secs]
[Full GC 79231K->59906K(79232K), 0.9298369 secs]
[Full GC 79231K->60014K(79232K), 0.8833146 secs]
[Full GC 79231K->60124K(79232K), 0.8293863 secs]
[Full GC 79231K->59615K(79232K), 0.8944206 secs]
[Full GC 79231K->59679K(79232K), 0.9169885 secs]
[Full GC 79231K->59626K(79232K), 0.9366790 secs]
[Full GC 79231K->59697K(79232K), 0.8613183 secs]
[Full GC 79231K->59594K(79232K), 0.9114757 secs]
[Full GC 79231K->59654K(79232K), 0.9987619 secs]
[Full GC 79231K->59654K(79232K), 1.0146781 secs]
[Full GC 79231K->59661K(79232K), 0.9687409 secs]

The first 11 lines of verbose GC output are young generation garbage collections. In each line, the first number is the amount of heap in use before the GC and the second number is the amount in use afterwards. The third number is the overall heap size, and the fourth is the time spent in the GC operation.

[GC 60404K->38847K(79232K), 0.0244366 secs]

# 60404K (heap in use before the GC)
# 38847K (heap in use after the GC)
# 79232K (overall heap size)
# 0.0244366 secs (time spent in the GC)

Note that the second number continues to grow during the first 11 GCs. This is an indication that many objects are being promoted to the old (tenured) generation. There are several reasons this may occur. The first is the obvious one: most objects continue to be live and are properly tenured. The second is that the young generation is not large enough to allow transient objects to die in the young generation.

Also note that eventually young GCs cease and only Full GC operations occur thereafter. When running the serial collector, there must be enough free space in the tenured generation to allow full promotion of the young generation plus one survivor space. This is known as the young generation guarantee. If there isn't enough space and the guarantee is not upheld, then only Full GCs occur.

Increasing the size of the heap to 128mb is enough to uphold the young generation guarantee and avoids the majority of the Full GCs.

java -server -Xms128m -Xmx128m my.serverApp

[GC 109920K->75933K(126720K), 0.0533675 secs]
[GC 110877K->76916K(126720K), 0.0437944 secs]
[GC 111860K->77818K(126720K), 0.0490449 secs]
[GC 112762K->78812K(126720K), 0.0482215 secs]
[GC 113756K->79810K(126720K), 0.0444408 secs]
[GC 114754K->80759K(126720K), 0.0502736 secs]
[GC 115703K->81657K(126720K), 0.0435275 secs]
[GC 116601K->82629K(126720K), 0.0521527 secs]
[GC 117573K->83564K(126720K), 0.0443587 secs]
[GC 118508K->84501K(126720K), 0.0438583 secs]
[GC 119445K->85492K(126720K), 0.0556998 secs]
[GC 120436K->86412K(126720K), 0.0437702 secs]
[GC 121356K->87402K(126720K), 0.0478918 secs]
[Full GC 122346K->59749K(126720K), 0.9128712 secs]
[GC 94693K->63029K(126720K), 0.0415602 secs]
[GC 97973K->64037K(126720K), 0.0442277 secs]
[GC 98981K->65123K(126720K), 0.0538927 secs]
[GC 100067K->66058K(126720K), 0.0509740 secs]
[GC 101002K->66971K(126720K), 0.0529873 secs]
[GC 101915K->67931K(126720K), 0.0432661 secs]
[GC 102875K->68896K(126720K), 0.0468042 secs]
[GC 103840K->69864K(126720K), 0.0515457 secs]
[GC 104808K->70787K(126720K), 0.0435953 secs]
[GC 105731K->71789K(126720K), 0.0438197 secs]
[GC 106733K->72799K(126720K), 0.0520742 secs]
[GC 107743K->73692K(126720K), 0.0528108 secs]
[GC 108636K->74622K(126720K), 0.0531088 secs]
[GC 109566K->75533K(126720K), 0.0523352 secs]
[GC 110477K->76456K(126720K), 0.0532375 secs]
[GC 111400K->77423K(126720K), 0.0434274 secs]
[GC 112367K->78398K(126720K), 0.0435165 secs]
[GC 113342K->79417K(126720K), 0.0537748 secs]
[GC 114361K->80287K(126720K), 0.0432627 secs]
[GC 115231K->81156K(126720K), 0.0422614 secs]
[GC 116100K->82170K(126720K), 0.0427083 secs]
[GC 117114K->83087K(126720K), 0.0528816 secs]
[GC 118031K->84140K(126720K), 0.0488751 secs]
[GC 119084K->85046K(126720K), 0.0533192 secs]
[GC 119990K->86011K(126720K), 0.0542484 secs]
[GC 120955K->86914K(126720K), 0.0451865 secs]
[GC 121858K->87861K(126720K), 0.0435737 secs]
[GC 122805K->88683K(126720K), 0.0457787 secs]
[GC 123627K->89739K(126720K), 0.0540739 secs]
[Full GC 124683K->59854K(126720K), 0.9496498 secs]
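
The four fields described earlier can also be pulled out of a log programmatically, which helps when a log is too long to read by eye. The following is a minimal sketch that assumes the serial-collector line format shown in the logs above; the class name GcEvent and its members are hypothetical, and other collectors or logging flags produce different formats.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a JDK class): one parsed line of serial-collector -verbosegc output.
final class GcEvent {
    // Matches lines such as "[GC 60404K->38847K(79232K), 0.0244366 secs]".
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;    // true for "Full GC", false for a young generation "GC"
    final long beforeK;    // heap in use before the collection
    final long afterK;     // heap in use after the collection
    final long heapK;      // overall heap size
    final double seconds;  // time spent in the collection

    private GcEvent(boolean full, long beforeK, long afterK, long heapK, double seconds) {
        this.full = full;
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.heapK = heapK;
        this.seconds = seconds;
    }

    // Returns the parsed event, or null if the line is not in the expected format.
    static GcEvent parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcEvent("Full GC".equals(m.group(1)),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)), Double.parseDouble(m.group(5)));
    }

    // Kilobytes freed by this collection.
    long reclaimedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        String[] log = {
            "[GC 60404K->38847K(79232K), 0.0244366 secs]",
            "[Full GC 78618K->61524K(79232K), 0.8851303 secs]",
        };
        int fullCount = 0;
        double totalPause = 0.0;
        for (String line : log) {
            GcEvent e = parse(line);
            if (e == null) {
                continue; // skip anything that is not a GC line
            }
            if (e.full) {
                fullCount++;
            }
            totalPause += e.seconds;
        }
        System.out.println(fullCount + " full GCs, " + totalPause + " secs total pause");
    }
}
```

Counting Full GC events and watching whether afterK keeps rising across young collections gives the same two signals discussed above: back-to-back Full GCs and steady promotion into the tenured generation.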
h2. Next Topic: Young Generation Sizing

Monday Sep 26, 2005

JVM Runtime Compilers: -client or -server

The following is the first edition of a series of articles on JVM performance tuning basics. The suggestions are generally applicable to J2SE releases starting with 1.3. I'll try to note when release defaults differ, and I plan to give you many reasons to upgrade to our latest releases.

h2. JVM Runtime Compilers: Client or Server?

The Sun JVM has two runtime compilers. Here's a brief summary of each and when to use it.

h2. Client Compiler

The client compiler is for use with client applications, generally short lived, where application startup time and memory footprint are most important. Many features in J2SE 5.0 address these requirements. A good example is class data sharing. The client compiler is the default for J2SE 1.3 through 1.4.2. J2SE 5.0 changes this with the addition of server ergonomics (see below).

h3. When do I use the client compiler?

Use the client runtime compiler when application startup and memory footprint are most important to you.

h3. How do I use the client compiler?

J2SE releases 1.3 to 5.0: specify -client as the first JVM tuning option.

Example:

> java -client my.clientApp

Note: The client compiler (-client) is the default, out of the box compiler choice for J2SE releases 1.3 to 1.4.2, and for J2SE 5.0 on Windows (32-bit).

h2. Server Compiler

The server compiler is targeted at long lived applications where runtime performance and throughput are essential, and startup speed and memory footprint are not.

h3. When do I use the server compiler?

It's important to use -server for all server applications. For older JVMs (1.3 to 1.4.1) be sure you're running the latest update release. It's also worth trying the server compiler for your long lived IDEs (Netbeans) or cpu intensive trading applications. Use cases where runtime performance is important are broad and include both client and server applications.

h3. How do I use the server compiler?

J2SE releases 1.3 to 5.0: specify -server as the first JVM tuning option.
Example:

> java -server my.serverApp

Note: The server compiler (-server) is generally not on by default and must be specified on the command line for J2SE releases 1.3 to 1.4.2. Server ergonomics changes this default in J2SE 5.0.

h2. J2SE 5.0 Server Ergonomics

The server ergonomics feature in Tiger modifies the default JVM tuning options and enables the server compiler if you're running on a server class machine. In J2SE 5.0 a server-class machine is defined as a machine with 2 or more physical processors and 2 or more Gbytes of physical memory. A single hyperthreaded cpu is counted as one physical processor.

h2. Upgrade to J2SE 5.0

There are many, many reasons to upgrade to J2SE 5.0 and to the latest update, J2SE 5.0_05. See the J2SE 5.0 Performance Whitepaper for details on Tiger performance improvements. Below are a few of the features recently added in J2SE 5.0_05.

h3. Large Page Support for Solaris, Linux, and Windows

J2SE 5.0_05 adds large page support to Linux and Windows. To enable it on these platforms use: -XX:+UseLargePages. This feature is on by default on Solaris 10. For more details see: (bottom of the page)

h3. System.arraycopy optimizations on SPARC and x64

System.arraycopy optimizations first identified on x64 platforms have now been extended to include SPARC platforms as well.

h3. Synchronization Performance Improvements

The J2SE 5.0_05 server compiler includes simple lock coarsening. This is the first phase of our uncontended synchronization performance improvements coming in upcoming update releases.

h3. Continued J2SE API Performance Improvements

J2SE 5.0_05 includes many improvements to the J2SE API. This is part of Sun's continued Java performance work, and you should expect to see more improvements in future updates as well.
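
Returning to server ergonomics: the server-class test described above, together with the default heap sizes listed in the Java Heap Sizing article (1/64 and 1/4 of physical memory, each capped at 1 Gbyte), can be written down in a few lines of Java. This is a simplified sketch of the rules as these articles state them, not the JVM's actual detection code; the class and method names are hypothetical, and the caller is assumed to supply the physical processor count and memory size.

```java
// Hypothetical helper (not a JDK API): the ergonomics rules as stated in these articles.
final class ErgonomicsRules {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Server-class per the J2SE 5.0 definition: 2+ physical processors and 2+ Gbytes of memory.
    static boolean isServerClass(int physicalCpus, long physicalMemoryBytes) {
        return physicalCpus >= 2 && physicalMemoryBytes >= 2 * GB;
    }

    // Default initial heap under server ergonomics: 1/64 of physical memory, capped at 1 Gbyte.
    static long defaultInitialHeap(long physicalMemoryBytes) {
        return Math.min(physicalMemoryBytes / 64, GB);
    }

    // Default maximum heap under server ergonomics: 1/4 of physical memory, capped at 1 Gbyte.
    static long defaultMaxHeap(long physicalMemoryBytes) {
        return Math.min(physicalMemoryBytes / 4, GB);
    }

    public static void main(String[] args) {
        int cpus = 2;            // assumed physical processor count, for illustration only
        long physical = 4 * GB;  // assumed physical memory, for illustration only
        System.out.println("server class: " + isServerClass(cpus, physical));
        System.out.println("initial heap: " + defaultInitialHeap(physical) / MB + "m");
        System.out.println("max heap: " + defaultMaxHeap(physical) / MB + "m");
    }
}
```

Note that Runtime.getRuntime().availableProcessors() reports logical processors, so a single hyperthreaded cpu may show up as two there even though the definition above counts it as one.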


