The Wonders of ZFS Storage
Performance for your Data

Tuning the knobs

Roch Bourbonnais
Principal Performance Engineer
As experience builds up, we're finding a few knobs in ZFS that we want to experiment with. As we gain a better understanding of them, our aim is that tuning them will not be necessary in the future, and there is already work in progress to offset the need to tune them. But for those ZFS users who live on the bleeding edge of performance, I figured this ztune script could come in handy.

The script works by tuning the in-kernel values of some internal parameters. It then runs an export/import sequence on the specified pool, which becomes tuned. After that, the script resets the in-kernel values to the ZFS defaults. This means that the tunings will not persist across reboots, or even across a subsequent export/import sequence.
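The sequence above can be sketched as a few lines of shell. This is only an illustration of the mechanism, not the actual ztune script; the tunable name `zfs_vdev_max_pending` and its default of 35 are taken from the discussion below, and everything must run as root on a non-production machine.

```shell
#!/bin/sh
# Sketch of the tune / re-import / reset cycle (illustrative only).
POOL="$1"

# 1. Poke the in-kernel value of an internal tunable via mdb.
#    0t prefixes a decimal value; /W writes a 32-bit word.
echo "zfs_vdev_max_pending/W 0t10" | mdb -kw

# 2. Export and re-import the pool so it picks up the tuned value.
zpool export "$POOL"
zpool import "$POOL"

# 3. Restore the ZFS default so the change does not outlive this pool.
echo "zfs_vdev_max_pending/W 0t35" | mdb -kw
```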

We've seen the need to tune two parameters: the vdev-level prefetch size and the maximum number of pending I/Os per vdev.

The low-level prefetch causes problems when it occupies the I/O channel for no benefit, i.e. when we end up never using the prefetched data. The default value is 64K, but 16K or even less appears to be a good value when a workload is read-intensive to non-cacheable data (working set bigger than memory).
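Assuming the vdev prefetch size is governed by the `zfs_vdev_cache_bshift` tunable (a power-of-two shift, so 16 corresponds to the 64K default; this name is my assumption, not confirmed by the post), shrinking it to 16K might look like:

```shell
# Illustrative only: shrink vdev-level prefetch from 64K (2^16)
# to 16K (2^14) for read-intensive, non-cacheable workloads.
echo "zfs_vdev_cache_bshift/W 0t14" | mdb -kw
```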

The max pending parameter can also cause problems. When working with volumes that map to multiple spindles, the default value may be too low for write-throughput scenarios. Although I am a bit skeptical it would help, on those types of volumes, if faced with disappointing throughput, one could try increasing the value. The default is 35, but one could try 100 or so. It would be very interesting to hear about it if you stumble on an occasion where this helps.
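For the throughput experiment described above, the poke (using the `zfs_vdev_max_pending` tunable, default 35) would be along these lines:

```shell
# Illustrative only: deepen the per-vdev queue on wide volumes
# (many spindles behind one vdev) to probe write throughput.
echo "zfs_vdev_max_pending/W 0t100" | mdb -kw
```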

More likely, in my mind, the default value can cause extra latency for critical I/Os (log writes and reads). During a transaction group sync, each device becomes saturated with that number of I/Os, causing service times to be fairly large; this occurs in a cyclic fashion, commonly on a 5-second beat. When latency is more important than throughput, tuning the value down to 10 or less should bring better performance.
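The latency-oriented tuning goes the other way, shortening the same per-vdev queue so critical log I/Os spend less time behind a saturated device:

```shell
# Illustrative only: shorten the per-vdev queue (default 35) so
# latency-critical log reads/writes queue behind fewer I/Os.
echo "zfs_vdev_max_pending/W 0t10" | mdb -kw
```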

Be warned that the script mucks around (via mdb -kw) with unstable kernel definitions; it has the potential to crash an OS, so be extra prudent and don't use it in production.


And remember, "You can tune a filesystem, but you can't tune a fish".

Join the discussion

Comments ( 2 )
  • Sean Tuesday, September 19, 2006
    Cool, though lets hope this script has a rather short life span as more self-tuning gets into zfs ;-)
  • Jess Tuesday, September 19, 2006
    Did you write this next year? The date in the comments of the script says Sept 2007.