ZIL, SSD, and Other Fun Acronyms
By user12674320 on Jan 10, 2009
The ZFS Intent Log (or ZIL) is always written to non-volatile storage.
The ZIL allows the file system to recover from crashes without data
loss. In a 7000 Series with Write Optimized SSD, the ZIL is stored on the
Write Optimized SSD; otherwise it is stored in the disk pool. Either
way, it is also available in system memory. The ZIL is flushed to the disk pool
periodically (this is called a
Transaction Group Commit).
In a 7410 cluster, if a failover occurs under normal conditions, the pool is imported by the alternate node, the ZIL is replayed against the pool, and the pool comes online ready for use. You can think of the Write Optimized SSD and ZIL as our NVRAM if that helps, but we don't need batteries.
If the ZIL is stored on a single SSD and that device fails, the system has a window to flush the ZIL from memory to disk (the Transaction Group Commit I mentioned earlier). Typically in the 7000 Series, this flush happens every 1-5 seconds, but it can take up to 30 seconds on an extremely busy system. Once the data is flushed from memory to disk, the system will use the disk pool to store the ZIL for the next transaction group. This window is the only time in a 7000 Series where there is a chance of data loss. We mitigate this risk by mirroring the Write Optimized SSDs in the system.
ZFS performance on asynchronous writes is good, and SSD is not required in these configurations (although it will help improve performance and is recommended). In configurations that require synchronous writes, however (many iSCSI configurations, NFS with O_DSYNC, etc.), Write SSD is almost mandatory.
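To make "synchronous write" concrete, here is a minimal client-side sketch in Python of the kind of O_DSYNC write mentioned above. The function names and the file name are made up for illustration, and the fallback to O_SYNC on platforms without O_DSYNC is my assumption, not something specific to the 7000 Series. With this flag, each write() is supposed to return only after the data has reached stable storage, which is the kind of latency-sensitive write the Write SSD is there to absorb.

```python
import os
import tempfile

# O_DSYNC is a POSIX flag. Fall back to O_SYNC (or no flag at all) where the
# platform does not define it; this fallback is an assumption for portability.
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def write_synchronously(path, payload):
    """Write payload with synchronous-write semantics and return bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | SYNC_FLAG, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        return written
    finally:
        os.close(fd)


def demo():
    """Write one record to a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sync_demo.dat")  # hypothetical file name
        count = write_synchronously(path, b"committed record\n")
        with open(path, "rb") as handle:
            return count, handle.read()
```

Pointing the path at an NFS mount instead of a temporary directory would, under these assumptions, generate the synchronous write traffic discussed here; without the flag, the same writes would be asynchronous.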
Write SSD Sizing Rules of Thumb:
-Each device supports about 9,000-10,000 write IOPS (sequential writes stream directly to disk for better performance)
-If devices are mirrored, they only count for 1x write IOPS (i.e., two devices at 9,000 IOPS each, when mirrored together, support 9,000 IOPS total)
-If aiming to support No Single Point of Failure configurations, more trays with fewer SSDs per tray will have higher usable capacities. Clusters only allow SSDs in pairs.
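The first two rules of thumb can be turned into a rough calculation. The sketch below is only an illustration: the function name and parameters are hypothetical, it uses the conservative 9,000 IOPS per device figure from the list, and it ignores tray layout and any other limits a real configuration would have.

```python
import math

# Conservative per-device figure taken from the rules of thumb above.
IOPS_PER_DEVICE = 9000


def write_ssds_needed(target_write_iops, mirrored=True, iops_per_device=IOPS_PER_DEVICE):
    """Estimate how many Write SSDs are needed for a target write IOPS load.

    A mirrored pair counts as a single device for IOPS purposes, so a
    mirrored configuration needs twice as many physical devices.
    """
    if target_write_iops <= 0:
        return 0
    # Number of IOPS "units" (single devices or mirrored pairs) required.
    units = math.ceil(target_write_iops / iops_per_device)
    return units * 2 if mirrored else units


if __name__ == "__main__":
    for target in (9000, 25000):
        print(target, write_ssds_needed(target, mirrored=False),
              write_ssds_needed(target, mirrored=True))
```

For example, a target of 25,000 write IOPS works out to three devices unmirrored or six devices as three mirrored pairs. Since mirrored results are always even, they also happen to fit the pairs-only rule for clusters.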