An epiphany about ZFS
By DaveLevy on Feb 28, 2006
The highlight of yesterday's conference for me was a presentation about ZFS. How long am I going to hang out for a British pronunciation? The preso was delivered by Dave Brittle, Lori Alt & Tabriz Leman.
While much of the material delivered yesterday was standard "Dog & Pony" fare, this version stayed away from the administrative management interface. While it mentioned the ideological substitution of pool for volume, it concentrated on the transactional nature of filesystem updates, the versioning this enables, and "bringing the ZFS goodness to slash".
Somehow I suddenly get it. ZFS revolutionises the storage of disk data blocks and their metadata. It writes new blocks before deleting old ones and so can roll back if the write fails. This also allows versioning to occur: the old superblock becomes a snapshot master superblock. The placement of parity data in the meta blocks (as opposed to creating additional leaf node blocks) means that error correction is safer and offers richer functionality.
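To make the "write new blocks first, then switch" idea concrete, here is a toy sketch in Python. It is only an illustration of the copy-on-write principle as I understand it, not how ZFS is actually implemented; the class `CowStore` and all of its method names are invented for this example.

```python
class CowStore:
    """Toy copy-on-write block store (hypothetical, not the ZFS design)."""

    def __init__(self):
        self.blocks = {}     # block id -> data; blocks are never overwritten in place
        self.next_id = 0
        self.root = {}       # live "superblock": file name -> block id
        self.snapshots = {}  # snapshot label -> an old root that we kept

    def _alloc(self, data):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def write(self, name, data):
        new_id = self._alloc(data)   # 1. write the new block first
        new_root = dict(self.root)   # 2. build a new root that points at it
        new_root[name] = new_id
        self.root = new_root         # 3. switch roots in a single step

    def read(self, name, snapshot=None):
        root = self.snapshots[snapshot] if snapshot else self.root
        return self.blocks[root[name]]

    def snapshot(self, label):
        # Keeping a reference to the old root is all a snapshot needs here.
        self.snapshots[label] = self.root

    def rollback(self, label):
        self.root = self.snapshots[label]


store = CowStore()
store.write("motd", "version 1")
store.snapshot("before-patch")
store.write("motd", "version 2")

live = store.read("motd")
old = store.read("motd", snapshot="before-patch")

store.rollback("before-patch")
after_rollback = store.read("motd")
```

Because a write never touches an existing block or an existing root, the snapshot costs nothing beyond remembering the old root, and rollback is just a matter of pointing back at it.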
It seems to me that this technology will enable a sedimentation process to occur, and that much of a DBMS's functionality can migrate to the operating system (or in this case the file system). I say "much" because when I first started working with DBMSs (i.e. in the last century), they often used the filesystem and often didn't use write-ahead logs. By bringing this DBMS functionality to the file system, a process started by the adoption of direct & async I/O, the ZFS designers have closed a loop and borrowed from the DBMS designers' learning curve. Only the DBMS can "know" if two blocks are part of the same "success unit", but ZFS can implement a success unit and should begin to weaken the need for a write-ahead log. It will also enable the safe(r) use of open source databases.
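A "success unit" can be sketched in the same toy style: a group of block updates that becomes visible all at once or not at all. This is my own illustration under simplifying assumptions; `TxnStore`, `commit` and the `fail_on` parameter (which simulates a write error) are hypothetical names, not part of ZFS or any DBMS.

```python
class TxnStore:
    """Toy all-or-nothing update of several blocks (hypothetical sketch)."""

    def __init__(self):
        self.root = {}  # committed state: block name -> contents

    def commit(self, updates, fail_on=None):
        """Apply every update in `updates`, or none of them."""
        staged = dict(self.root)  # work on a private copy of the state
        for name, data in updates.items():
            if name == fail_on:   # simulate a write error part-way through
                raise IOError(f"write of {name!r} failed")
            staged[name] = data
        self.root = staged        # one switch makes the whole group visible


db = TxnStore()
db.commit({"account_a": 100, "account_b": 0})

try:
    # Move 50 from a to b, but the second write fails.
    db.commit({"account_a": 50, "account_b": 50}, fail_on="account_b")
except IOError:
    pass

state_after_failure = dict(db.root)
```

The failed transfer leaves the committed state exactly as it was, with no log to replay. What the file system cannot do on its own is decide which blocks belong in `updates`; that knowledge still has to come from the DBMS.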
The versioning feature of the file system, when certified for use as a root file system, will enable much safer and faster patching; it will enable snapshot and rollback. If system managers use these features to adopt a faster software technology refresh, then innovation will come to the data centre faster, since newer code is better quality and should contain useful new features. Disk cloning, snapshot and rollback will also enable the rapid spawning of Solaris Containers. Fantastic.
We are also released from the tyranny of the partition table, something that for the last 15 years has required a volume manager.
Despite these fantastic advances, when it becomes available, it'll be a V1.0 product, so care will be needed. Certainly, the authors seem to have some humility about this, but with Solaris Express, we can get hold of it now and begin acceptance and confidence testing. A final really great feature is that ZFS has been donated/incorporated into OpenSolaris.
This stuff should be available as an update in Solaris 10, maybe sometime over the summer, and I'm going to get hold of an "Express" version for my laptop.