By sbohne on Jan 12, 2007
The reason is that the Mem Usage accounting charges each process for memory that is shared with other processes. For example, although there's only one copy in physical memory, DLL read-only memory usage is charged against every process that has loaded a particular DLL.
We can't fault Task Manager for this, because the measurement is accurate from the perspective of a single process. It becomes misleading, however, when the per-process numbers are added up to estimate the actual cost of multiple processes.
So how do we get the right number? The answer is to subtract out all shared memory that's counted more than once. Although Task Manager doesn't provide an easy way to do this, there's a tool called vadump.exe that does precisely what we want.
With the "-o" (dump current working set) and "-s" (summary info only) options, vadump displays working set information for a single process. Here is the output for a typical medium-sized graphical Java application. (Some output is omitted for the sake of clarity.)

    Category                   Total            Private   Shareable  Shared
                               Pages   KBytes   KBytes    KBytes     KBytes
    Grand Total Working Set    8841    35364    23940     8964       2460

"Total KBytes" corresponds exactly with the Task Manager Mem Usage column. But now we have additional information as well. "Shareable KBytes" is memory that is potentially shareable but is not currently shared, because only this process is using it. "Shared KBytes" is memory that is actually shared with one or more other processes.
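If you want to pull these numbers out of the vadump summary programmatically, a small parser is enough. The following is only a sketch: the class name `WorkingSetSummary` and its methods are made up for illustration, and it assumes the summary row consists of the label "Grand Total Working Set" followed by five whitespace-separated integers in the column order shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the five numbers on vadump's "Grand Total Working Set" row. */
final class WorkingSetSummary {
    // Assumed row layout: label, then Pages, Total, Private, Shareable, Shared (all KBytes except Pages).
    private static final Pattern ROW = Pattern.compile(
            "Grand Total Working Set\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)");

    final long pages;
    final long totalKb;
    final long privateKb;
    final long shareableKb;
    final long sharedKb;

    WorkingSetSummary(long pages, long totalKb, long privateKb, long shareableKb, long sharedKb) {
        this.pages = pages;
        this.totalKb = totalKb;
        this.privateKb = privateKb;
        this.shareableKb = shareableKb;
        this.sharedKb = sharedKb;
    }

    /** Finds the summary row anywhere in the given text and extracts its five numbers. */
    static WorkingSetSummary parse(String vadumpOutput) {
        Matcher m = ROW.matcher(vadumpOutput);
        if (!m.find()) {
            throw new IllegalArgumentException("no 'Grand Total Working Set' row found");
        }
        return new WorkingSetSummary(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Long.parseLong(m.group(5)));
    }

    /** Sanity check: private + shareable + shared should add up to the total. */
    boolean isConsistent() {
        return privateKb + shareableKb + sharedKb == totalKb;
    }

    public static void main(String[] args) {
        WorkingSetSummary s = parse("Grand Total Working Set 8841 35364 23940 8964 2460");
        System.out.println(s.totalKb + " KB total, " + s.sharedKb
                + " KB shared, consistent=" + s.isConsistent());
    }
}
```

The consistency check is worth keeping: in the output above, the Private, Shareable and Shared columns sum exactly to Total KBytes, so a mismatch would suggest the row was parsed with the wrong column order.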
In this example, not much is shared because we're only running one Java process. When another instance of the same application is launched, the vadump output becomes:

    Category                   Total            Private   Shareable  Shared
                               Pages   KBytes   KBytes    KBytes     KBytes
    Grand Total Working Set    8841    35364    23908     92         11364

Since two copies of the same Java application are running, almost all the shareable memory is in fact shared.
So let's figure out the total footprint using both methods. First, using the method we know to be incorrect, summing up Task Manager's Mem Usage numbers gives us just over 69M as the combined footprint. Now let's calculate the actual resident set footprint of these two processes.
To calculate the true footprint, we simply subtract out the shared charge for instances beyond the first. For n instances of the same application, the formula looks like this:

    True Footprint = (Mem Usage * n) - (Shared KBytes * (n - 1))

For the two application instances in this example, the true working set footprint is just under 58M. That's a significant difference: if we don't account for shared memory, the footprint estimate is off by almost 20%! And the error only grows as you add more instances.
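To make the arithmetic concrete, here is a minimal sketch of the formula in Java, using the numbers from the two-instance vadump output above (Mem Usage 35364 KB, Shared KBytes 11364 KB). The class and method names are invented for this example, and it assumes every instance reports the same Mem Usage and Shared KBytes.

```java
/** Hypothetical helper that applies the footprint formula for n identical instances. */
final class FootprintEstimate {

    /** Naive estimate: sum Task Manager's Mem Usage over all n instances. */
    static long naiveFootprintKb(long memUsageKb, int n) {
        return memUsageKb * n;
    }

    /** True Footprint = (Mem Usage * n) - (Shared KBytes * (n - 1)). */
    static long trueFootprintKb(long memUsageKb, long sharedKb, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // Shared memory is charged once per process; keep one charge, drop the other n - 1.
        return memUsageKb * n - sharedKb * (n - 1);
    }

    public static void main(String[] args) {
        long memUsageKb = 35364; // "Total KBytes" from vadump, same as Mem Usage
        long sharedKb = 11364;   // "Shared KBytes" with two instances running
        int n = 2;

        long naive = naiveFootprintKb(memUsageKb, n);
        long actual = trueFootprintKb(memUsageKb, sharedKb, n);
        double overestimatePct = 100.0 * (naive - actual) / actual;

        System.out.println("naive = " + naive + " KB, true = " + actual + " KB");
        System.out.println("overestimate = " + overestimatePct + " %");
    }
}
```

With these inputs the naive sum is 70728 KB (just over 69M) and the true footprint is 59364 KB (just under 58M), so the naive figure overstates the real cost by roughly 19%.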
It's also important to realize that even though our example uses two copies of the same application, the footprint analysis translates very well to different Java apps. For example, when running two different medium-sized graphical Java applications, the Shared KBytes for each are 11084 and 11076 - almost as good as the example above. This rule of thumb holds up for most Java programs. To confirm it, you can look at the vadump verbose output and verify at a per-DLL level precisely what is being shared between the Java processes.
Where exactly does the sharing come from? Two places, mostly. The first is shareable DLL regions such as the text section. The second is the class data sharing feature, which shares read-only class metadata between processes. For typical applications, each of these contributes roughly equally to the total amount of shareable memory.
So this entry tries to demonstrate that the multiprocess Java memory usage situation on Windows may be better than you think. If it's still not good enough, you might check out Chris Oliver's JApplication, which uses classloader-based isolation to host multiple (well-behaved) applications in a single JVM instance.
Another good reference for those interested in multiprocess JVM footprint is Dave Dice's MVM blog entry.