Swat Trace Facility (STF), high IOPS, and memory problems during STF Analyze.
By Henk Vandenbergh-Oracle on Feb 12, 2009
During the STF Analyze phase, STF reads and interprets all the trace data generated by TNF. The trace probes describe I/O starts and I/O completions, but TNF frequently generates duplicate I/O completion probes and sometimes does not generate a completion probe at all. STF therefore needs to keep all I/O start and I/O completion probes in memory so that it can identify and ignore these duplicate probes.
After three minutes STF finally says: OK, three minutes is long enough to wait for any duplicate data to show up, or for an (infrequent) I/O completion that never gets generated, and it starts aging the retained probes out of memory.
This means, however, that if any I/O operation takes longer than three minutes, its I/O start probe is already gone when the I/O completion finally shows up. That I/O is then completely lost to STF.
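The aging behavior described above can be sketched in a few lines of Python. This is only a toy model, not STF's actual code: the class name `ProbeTracker`, its methods, and the use of an `io_id` to match a start with its completion are all assumptions made for illustration.

```python
from collections import OrderedDict


class ProbeTracker:
    """Toy model of start/completion matching with aging (hypothetical, not STF code)."""

    def __init__(self, aging_limit=180.0):
        self.aging_limit = aging_limit     # seconds a probe is retained in memory
        self.starts = OrderedDict()        # io_id -> time of start probe, oldest first
        self.completions = OrderedDict()   # io_id -> time of first completion probe
        self.response_times = {}           # io_id -> completion time minus start time
        self.duplicates = 0                # duplicate completion probes that were ignored
        self.lost = 0                      # completions with no retained start probe

    def _age_out(self, now):
        # Drop every retained probe that is older than the aging limit.
        for retained in (self.starts, self.completions):
            while retained:
                io_id, seen_at = next(iter(retained.items()))
                if now - seen_at <= self.aging_limit:
                    break
                del retained[io_id]

    def on_start(self, io_id, now):
        self._age_out(now)
        self.starts[io_id] = now

    def on_completion(self, io_id, now):
        self._age_out(now)
        if io_id in self.completions:
            # Same completion seen before: a duplicate probe, ignore it.
            self.duplicates += 1
            return
        if io_id not in self.starts:
            # The start probe was aged out (or never seen): this I/O is lost.
            self.lost += 1
            return
        self.completions[io_id] = now
        self.response_times[io_id] = now - self.starts[io_id]
```

In this sketch, an I/O that starts at second 10 and completes at second 200 is counted as lost with a 180-second limit, because its start probe has already been aged out by the time the completion arrives.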
You may think that three-minute I/O response times never happen, and I agree with that. However, in our lab environments people do a lot of fancy things trying to break our hardware and software to make sure that our customers ultimately get a product that is as good as can be. I used to have the aging limit set to 30 seconds, but there were occasions when this was not enough, so I increased it to 180 seconds (three minutes).
This all worked fine when running a ‘decent’ amount of IOPS.
Storage devices, however, are becoming faster, especially since the arrival of solid-state devices (SSD). And now the very high IOPS are starting to create memory problems for STF: keeping three minutes' worth of data for 30,000 IOPS in memory starts becoming a problem (30,000 IOPS × 180 seconds × 2 probes per I/O (start + completion) = about 10.8 million probes kept in memory).
There are two ways to work around this:
- Increase the Java heap space using the STF ‘Settings’ tab from -Xmx1024m to -Xmx3500m. Sometimes that is still not enough (3500m appears to be a 32-bit Java limit; I have never run STF in a 64-bit environment).
- Lower the probe aging in STF. Use the ‘Settings’ tab again, and enter ‘-a30’ as a batch_prm parameter for 30-second aging, or a lower value if you know for sure that no I/O response time in your environment is ever longer than the value you specify.
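To see how much the second workaround helps, here is a small back-of-the-envelope calculation in Python. The function `retained_probes` is a hypothetical helper, not part of STF, and it assumes a steady I/O rate with exactly one start probe and one completion probe per I/O. It counts probes only; the memory used per probe is not stated here, so no byte figures are derived.

```python
def retained_probes(iops, aging_seconds, probes_per_io=2):
    """Probes held in memory at steady state (start + completion per I/O)."""
    return iops * aging_seconds * probes_per_io


# 30,000 IOPS with the default 180-second aging versus '-a30'.
print(retained_probes(30_000, 180))  # 10800000
print(retained_probes(30_000, 30))   # 1800000
```

Under these assumptions, dropping the aging limit from 180 to 30 seconds cuts the number of retained probes by a factor of six.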