Assured delete with ZFS dataset encryption
By DarrenMoffat on Nov 15, 2010
Need to be assured that your data is inaccessible after a certain point in time?
Many government agency and private sector security policies allow you to achieve that if the data is encrypted and you can show, with an acceptable level of confidence, that the encryption keys are no longer accessible. The alternative is overwriting all the disk blocks that contained the data, which is time consuming, very expensive in IOPS, and in a copy-on-write filesystem like ZFS actually very difficult to achieve. So this is often done only on full disks as they come out of production use for recycling/repurposing, but even that isn't ideal in a complex RAID layout.
In some situations (compliance and privacy are common reasons) it is desirable to have an assured delete of a subset of the data on a disk (or whole storage pool). Having the encryption policy and key management at the ZFS dataset (file system / ZVOL) level allows us to provide assured delete via key destruction at a much smaller granularity than full disks. It also means that, unlike full disk encryption, we can do this on a subset of the data while the disk drives remain live in the system.
If the subset of data matches a ZFS file system (or ZVOL) boundary we can provide this assured delete via key destruction; remember, ZFS filesystems are very cheap to create.
Let's start with the simple case of a single encrypted file system:
$ zfs create -o encryption=on -o keysource=raw,file:///media/keys/g projects/glasgow
$ zfs create -o encryption=on -o keysource=raw,file:///media/keys/e projects/edinburgh
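The raw key files referenced above (e.g. /media/keys/g) must exist before the datasets are created. A minimal sketch of generating a 256-bit raw wrapping key file; a temporary directory stands in here for a separate key filesystem such as /media/keys:

```shell
# Create a 32-byte (256-bit) raw wrapping key file of the form
# expected by a raw file keysource. KEYDIR is a stand-in for a
# separate key filesystem like /media/keys.
KEYDIR=$(mktemp -d)
head -c 32 /dev/urandom > "$KEYDIR/g"
wc -c < "$KEYDIR/g"    # should report 32
```

Keeping the key file on a filesystem separate from the pool holding the encrypted data is what makes destroying it meaningful later.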
After some time we decide we want to make projects/glasgow completely inaccessible. The simplest way is to destroy the wrapping key, in this case /media/keys/g, and then destroy the projects/glasgow dataset. The data on disk will still be there until ZFS starts reusing those blocks, but since we have destroyed /media/keys/g (which I'm assuming here is on some separate file system) we have a high level of assurance that the encrypted data can't be recovered even by reading "below" ZFS at the disk blocks directly.
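Destroying the key file should itself be more than a plain rm, since an ordinary unlink leaves the key bytes on the key filesystem. A sketch of overwriting then unlinking with GNU shred (the overwrite passes are only meaningful on a simple, non-copy-on-write key filesystem; a temporary file stands in for the real /media/keys/g here):

```shell
# Overwrite the wrapping key file in place, then unlink it.
# Note: shred's in-place overwrite is not effective on CoW or
# journaled-data filesystems, which is one reason to keep key
# files on a simple filesystem or removable device.
KEYFILE=$(mktemp)
head -c 32 /dev/urandom > "$KEYFILE"
shred -u "$KEYFILE"    # overwrite contents, then remove the file
```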
I'd recommend a tiny additional step just to make sure that the last version of the data encryption keys (which are stored wrapped on disk in the ZFS pool) are not encrypted by anything the user/admin knows:
$ zfs key -c -o keysource=raw,file:///dev/random projects/glasgow
$ zfs key -u projects/glasgow
$ zfs destroy projects/glasgow
While re-wrapping the keys with a key the user/admin doesn't know doesn't add a huge amount of security on its own, it makes the administrative intent much clearer and at least allows the user to assert that they did not know the wrapping key at the point the dataset was destroyed.
If we have clones the situation is slightly more complex, since clones share their data encryption key with their origin: because they share data written before the clone was branched off, the clone needs to be able to read both the shared and its unique data as if they were its own.
We can make sure that the unique data in a clone uses a different data encryption key than the origin does from the point the clone was taken:
... time passes, data is placed in projects/glasgow ...
$ zfs snapshot projects/glasgow@1
$ zfs clone -K projects/glasgow@1 projects/mungo
By passing '-K' to 'zfs clone' we ensure that any unique data in projects/mungo uses a different data encryption key from projects/glasgow. This means we can use the same operations as above to provide assured delete for the unique data in projects/mungo even though it is a clone.
Additionally we could run 'zfs key -K projects/glasgow' so that any new data written to projects/glasgow after the projects/mungo clone was taken uses a different data encryption key as well. Note however that this is not atomic, so I would recommend making projects/glasgow read-only before taking the snapshot, even though normally this isn't necessary. The full sequence then becomes:
$ zfs set readonly=on projects/glasgow
$ zfs snapshot projects/glasgow@1
$ zfs clone -K projects/glasgow@1 projects/mungo
$ zfs set readonly=off projects/mungo
$ zfs key -K projects/glasgow
$ zfs set readonly=off projects/glasgow
If projects/glasgow is not marked read-only there is a risk that data could be written to it after the snapshot is taken and before the 'zfs key -K' completes. This may be more than is necessary in some cases, but it is the safest method.