By Darren Moffat-Oracle on Nov 15, 2010
The main goal of encryption is to make the (presumably sensitive) cleartext data indistinguishable from random data. Good file system encryption usually aims to have the same plaintext encrypt to different ciphertext, at least when written at a different "location", even if the same key is used. One way to achieve that is to derive the initialisation vector (IV) somehow from where the blocks of the file are stored on disk. In this respect the encryption support in ZFS is no different: by default we derive the IV from a combination of which dataset/object the block belongs to and when (in which transaction) it was written. This means that the same block of plaintext data written to a different file in the same filesystem will get a different IV and thus different ciphertext. Since ZFS is copy-on-write and we use the transaction identifier, it also means that if we "overwrite" the same block of a file at a later time it still ends up with a different IV and thus different ciphertext. Each encrypted dataset in ZFS has a different set of data encryption keys (see my earlier post on assured delete for more details on that), so across datasets both the IV and the encryption key change, and we have a really high level of confidence of getting different ciphertext when the same data is written to different datasets.
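To make the location-plus-transaction idea concrete, here is a minimal Python sketch. It is not the ZFS implementation: the `derive_iv` function, its parameters and the toy HMAC-based stream cipher are all assumptions made for illustration (real ZFS uses a proper block cipher mode and its own IV layout). It only shows that the same key and plaintext give different ciphertext once the IV depends on where and when the block was written.

```python
import hashlib
import hmac


def derive_iv(dataset_id: int, object_id: int, block_id: int, txg: int) -> bytes:
    """Hypothetical IV derivation: hash the block's logical location and its transaction."""
    material = f"{dataset_id}:{object_id}:{block_id}:{txg}".encode()
    return hashlib.sha256(material).digest()[:12]  # 96-bit IV


def toy_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 keystream). Illustration only, not real ZFS crypto."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(plaintext):
        keystream += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    # XOR is its own inverse, so the same call also decrypts.
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


key = b"k" * 32
block = b"same plaintext block" * 10

c1 = toy_encrypt(key, derive_iv(1, 10, 0, 100), block)
c2 = toy_encrypt(key, derive_iv(1, 11, 0, 100), block)  # different file, same transaction
c3 = toy_encrypt(key, derive_iv(1, 10, 0, 101), block)  # same file, later transaction

print(c1 != c2, c1 != c3)
```

Because the toy cipher is a plain XOR keystream, calling `toy_encrypt` again with the same key and IV recovers the plaintext.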
The goal of deduplication in storage is to coalesce matching disk blocks into a smaller number of copies (ideally 1, but in ZFS that number depends on the value of the copies property on the dataset and the pool-wide dedupditto property, so it could be more than 1). Given the above description of how we do encryption, it would seem that encryption and deduplication are fundamentally at odds with each other, and usually that is true.
When we write a block to disk in ZFS it goes through the ZIO pipeline, and in doing so a number of transforms are optionally applied to the data: compression -> encryption -> checksum -> dedup -> raid.
The deduplication step uses the checksums of the blocks to find suitable matches. This means it is acting on the already compressed and encrypted data. Also, in ZFS, deduplication matches are searched for across all datasets in the pool that have dedup=on.
So we have very little chance of getting any deduplication hits with encrypted datasets because of how the IV is generated and the fact that each dataset has its own set of encryption keys. In fact not getting hits with deduplication is actually a good test that we are using different keys and IVs and thus getting different ciphertext for the same plaintext.
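The ordering described above can be sketched in a few lines of Python. This is a simplified model, not ZFS code: the raid step is left out, the dedup table is a plain dictionary, and `toy_encrypt` and `write_block` are hypothetical helpers that stand in for the real cipher and pipeline. It shows why dedup, which only sees checksums of ciphertext, finds no match when the key or IV differs.

```python
import hashlib
import hmac
import zlib


def toy_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 keystream). Illustration only, not real ZFS crypto."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, keystream))


def write_block(ddt: dict, key: bytes, iv: bytes, plaintext: bytes) -> bool:
    """Run compress -> encrypt -> checksum -> dedup; return True on a dedup hit."""
    compressed = zlib.compress(plaintext)
    ciphertext = toy_encrypt(key, iv, compressed)
    checksum = hashlib.sha256(ciphertext).hexdigest()  # checksum of the ciphertext
    hit = checksum in ddt
    ddt[checksum] = ddt.get(checksum, 0) + 1  # reference count per unique block
    return hit


ddt = {}  # pool-wide dedup table, modelled as checksum -> refcount
data = b"A" * 4096
key_a, key_b = b"a" * 32, b"b" * 32
iv_one, iv_two = b"1" * 12, b"2" * 12

first = write_block(ddt, key_a, iv_one, data)   # first copy: no hit
second = write_block(ddt, key_b, iv_two, data)  # same plaintext, other key and IV: no hit
third = write_block(ddt, key_a, iv_one, data)   # same key and IV: hit

print(first, second, third, len(ddt))
```

Only the third write matches, because it is the only one that produces identical ciphertext.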
So encryption=on + dedup=on is pointless, right?
Not so with ZFS. I wasn't happy about giving up on deduplication for encrypted datasets, so we found a solution. It has some restrictions, but I think they are reasonable and realistic ones.
Within what I'll call a "clone family", i.e. a set of datasets that are all clones of the same original dataset or clones of those clones, we share data encryption keys in the default case, because the datasets share data (again, see my earlier post on assured delete for info on the data encryption keys). So I found a method of generating the IV such that within the "clone family" we will get dedup hits for the same plaintext. For this to work you must not run 'zfs key -K' on any of the clones and you must not pass '-K' to 'zfs clone' when you create your clones. Note that this applies only to snapshots/clones, not to child datasets; by that I mean nothing breaks with child datasets, you just won't get deduplication matches.
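The post does not say how the IV is generated for this case. Purely as an illustration, the sketch below assumes one plausible construction: derive the IV from a keyed hash (HMAC) of the plaintext, using a key that the whole clone family shares. The function names, the key layout and the toy cipher are all hypothetical and should not be read as the actual ZFS method.

```python
import hashlib
import hmac


def content_iv(iv_key: bytes, plaintext: bytes) -> bytes:
    """Assumed construction: IV = truncated HMAC of the plaintext under a family-wide key."""
    return hmac.new(iv_key, plaintext, hashlib.sha256).digest()[:12]


def toy_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 keystream). Illustration only, not real ZFS crypto."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, keystream))


def block_checksum(keys: tuple, plaintext: bytes) -> str:
    """Checksum of the ciphertext, which is what the dedup step compares."""
    enc_key, iv_key = keys
    iv = content_iv(iv_key, plaintext)
    return hashlib.sha256(toy_encrypt(enc_key, iv, plaintext)).hexdigest()


family_keys = (b"E" * 32, b"I" * 32)   # shared by the master and its clones
rekeyed_keys = (b"X" * 32, b"Y" * 32)  # stands in for a clone given new keys

block = b"patched library contents" * 100
master = block_checksum(family_keys, block)
clone = block_checksum(family_keys, block)
rekeyed = block_checksum(rekeyed_keys, block)

print(master == clone)    # same keys, same plaintext: dedup match
print(master == rekeyed)  # different keys: no match
```

Under this assumption the IV no longer depends on location or transaction, so identical blocks within the family are visibly identical on disk. That is exactly the property dedup needs, and it is also why the match disappears as soon as a clone gets its own keys.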
So no, it isn't pointless, and what's more, for some configurations it will actually work really well. A common use case for a configuration that does work well is a set of virtualisation images (maybe filesystems for local Zones, or ZVOLs shared over iSCSI for OVM or similar) that are all derived from the same original master using zfs clones and that all get patched/updated with pretty much the same set of patches/updates. This is a case where clones+dedup work well for the unencrypted case, and one which, as shown above, can still work well even when encryption is enabled.
The usual deployment caveats with ZFS deduplication still apply, i.e. it is block-based and it works best when you have lots of available DRAM and/or L2ARC for caching the DDT. ZFS encryption doesn't add any additional requirements to this.
So we can happily do this type of thing, and have it "work as expected":
$ zfs create -o compression=on -o encryption=on -o dedup=on tank/builds
...
$ zfs create tank/builds/master
...
$ zfs clone tank/builds/master@1 tank/builds/project-one
...
$ zfs clone tank/builds/master@1 tank/builds/project-two