Swat Trace Facility (STF), high IOPS, and memory problems during STF Analyze.
By Henk Vandenbergh on Feb 12, 2009
During the STF Analyze phase, STF reads and interprets all the trace data generated by TNF. The trace probes describe I/O starts and I/O completions, but TNF frequently generates duplicate I/O completion probes and sometimes does not generate a completion probe at all. STF therefore needs to keep all I/O start and I/O completion probes in memory so that it can identify and ignore the duplicates.
After three minutes STF finally says: OK, if any duplicate data shows up or if any (infrequent) I/O completion does not get generated, a three-minute wait is enough. It then starts aging the retained probes out of memory.
This means however that if there are any I/O operations that take longer than three minutes, the I/O start probe is already gone when the I/O completion finally shows up. This I/O then is completely lost to STF.
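The aging behavior described above can be illustrated with a minimal sketch. This is not STF's actual code: the class `ProbeTracker`, its method names, and the assumption that probes arrive in timestamp order and carry a unique I/O id are all hypothetical, chosen only to show how a completion is lost once its start probe has been aged out.

```python
from collections import OrderedDict


class ProbeTracker:
    """Hypothetical sketch of start/completion matching with probe aging."""

    def __init__(self, aging_seconds=180):
        self.aging_seconds = aging_seconds
        # io_id -> [start_ts, completion_ts or None], oldest start first.
        self.records = OrderedDict()
        self.completed = []   # (io_id, response_time_seconds)
        self.duplicates = 0   # repeated completion probes that were ignored
        self.lost = 0         # completions with no retained start probe

    def _age_out(self, now):
        # Drop every retained I/O whose start is older than the aging window.
        while self.records:
            io_id, (start_ts, _) = next(iter(self.records.items()))
            if now - start_ts <= self.aging_seconds:
                break
            self.records.popitem(last=False)

    def on_start(self, io_id, ts):
        self._age_out(ts)
        self.records[io_id] = [ts, None]

    def on_completion(self, io_id, ts):
        self._age_out(ts)
        record = self.records.get(io_id)
        if record is None:
            self.lost += 1          # start probe already aged out (or never seen)
        elif record[1] is not None:
            self.duplicates += 1    # duplicate completion probe: ignore it
        else:
            record[1] = ts
            self.completed.append((io_id, ts - record[0]))


tracker = ProbeTracker(aging_seconds=180)
tracker.on_start("io-1", 0)
tracker.on_completion("io-1", 1)
tracker.on_completion("io-1", 2)     # duplicate, ignored
tracker.on_start("io-2", 10)
tracker.on_completion("io-2", 200)   # 190 seconds later: start is gone, I/O is lost
```

In this sketch the completed I/O stays in memory until it ages out, which is what makes duplicate detection possible and also what drives the memory use.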
You may think that three-minute I/O response times never happen, and I agree with that. However, in our lab environments people do a lot of fancy things trying to break our hardware and software to make sure that our customers ultimately get a product that is as good as can be. I used to have 30 seconds set as the aging limit, but there were occasions when this was not enough, so I increased it to 180 seconds (three minutes).
This all worked fine when running a ‘decent’ amount of IOPS.
Storage devices, however, are becoming faster, especially since the arrival of solid-state devices (SSD). And now the very high IOPS are starting to create memory problems for STF: keeping three minutes' worth of data for 30,000 IOPS in memory starts becoming a problem (30,000 IOPS * 180 seconds * 2 probes (start + completion) = roughly 10.8 million probes kept in memory). There are two ways to work around this:
- Increase the Java heap space using the STF ‘Settings’ tab from -Xmx1024m to -Xmx3500m. Sometimes even that is not enough (3500m is about the limit for 32-bit Java; I have never run STF in a 64-bit environment).
- Lower the probe aging in STF. Use the ‘Settings’ tab again, and enter ‘-a30’ as a batch_prm parameter for 30-second aging, or use a lower value if you know for sure that no I/O response time in your environment is ever longer than the value you specify.
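To get a feel for how much the aging value matters, the probe count can be estimated with a few lines of Python. This is only the arithmetic from the text; the function name `retained_probes` is made up for illustration, and the sketch says nothing about how many bytes STF actually needs per probe.

```python
def retained_probes(iops, aging_seconds, probes_per_io=2):
    """Estimate probes held in memory: one start plus one completion per I/O."""
    return iops * aging_seconds * probes_per_io


# Default three-minute aging versus the '-a30' setting, both at 30,000 IOPS.
default_aging = retained_probes(30_000, 180)
short_aging = retained_probes(30_000, 30)

print(f"180-second aging: {default_aging:,} probes")  # 10,800,000
print(f" 30-second aging: {short_aging:,} probes")    # 1,800,000
```

Dropping the aging from 180 to 30 seconds cuts the number of retained probes by a factor of six, at the cost of losing any I/O that takes longer than 30 seconds.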