ZFS and Solaris: storage optimization for the cloud
By Thierry Manfe on Dec 20, 2009
Cloud computing has been one of the most discussed topics of the year, and the discussion is not over, because what is really being debated is how we will access computing and storage resources in the future. Even famous French intellectuals are giving their opinions and making predictions. The future will tell how accurate those predictions are.
What is usually less discussed is the technology behind cloud computing, though it is no secret that virtualization plays a key role. Cloud data-centers will be loaded with virtual machines, each potentially requiring as much disk-space as a complete operating system (OS), which can run to many gigabytes. How much disk-space does a virtual machine image (vdi) really consume? The only good answer is: too much. Too much because the OS is part of the infrastructure, as opposed to the software that really brings value to the user, which sits at the application level. In short, a good infrastructure reduces its resource consumption to the minimum while bringing flexibility and availability.
This is where ZFS and Solaris come into the picture. ZFS, which already offers on-the-fly compression, has just been enriched with deduplication, while Solaris comes with a really light-weight virtualization technology: zones.
ZFS COMPRESSION AND DEDUPLICATION
Let's get straight to some numbers:
- I have an OpenSolaris 2009.06 vdi (running in VirtualBox, by the way) on my disk. A default OpenSolaris install consumes less than 2.5GBytes, but this image includes extra software. The size of the vdi, as returned by ls -hl, is 4.8GBytes.
- When I activate ZFS compression on the file-system where the vdi is located, the real space consumed goes down to 2.8GBytes (the real space is returned by du -hs).
- With compression off, I activate ZFS deduplication on the ZFS pool that holds the file-system (dedup is available in build 128 of OpenSolaris). From there, I duplicate the vdi using the VBoxManage clonehd command, a typical operation in a cloud environment. Note that this command does not leverage ZFS cloning capabilities, so without deduplication the space allocated to the vdi should double.
Before duplication zpool list returns:
$ zpool list
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool    93G  83.0G  9.99G  89%  1.20x  ONLINE  -
After the vdi duplication:
$ zpool list
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool    93G  83.1G  9.86G  89%  1.96x  ONLINE  -
The pool allocation grew by 0.1GBytes (100MB) instead of 4.8GBytes: I am saving 98% of the additional disk-space. (Note that I also tried dedup with compress=on, but the saving was only about 28%.)
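For readers who want to check the arithmetic behind the 98% figure, here is a minimal Python sketch. It only reuses numbers already shown: the data rows of the two zpool list outputs and the 4.8GBytes reported by ls -hl. The helper names alloc_gb and saving_percent are made up for this illustration, and since ALLOC is rounded to 0.1G the result is approximate.

```python
# Data rows copied from the two zpool list outputs shown earlier.
BEFORE = "rpool 93G 83.0G 9.99G 89% 1.20x ONLINE -"
AFTER = "rpool 93G 83.1G 9.86G 89% 1.96x ONLINE -"
VDI_SIZE_GB = 4.8  # size of the vdi as reported by ls -hl


def alloc_gb(row: str) -> float:
    """Return the ALLOC column (third field) of a zpool list row, in GBytes.

    Hypothetical helper: it assumes the value is expressed in G, as it is here.
    """
    value = row.split()[2]
    if not value.endswith("G"):
        raise ValueError(f"expected a size in G, got {value!r}")
    return float(value[:-1])


def saving_percent(actual_growth: float, expected_growth: float) -> float:
    """Share of the expected growth that was not actually allocated."""
    return 100.0 * (1.0 - actual_growth / expected_growth)


growth = alloc_gb(AFTER) - alloc_gb(BEFORE)
saving = saving_percent(growth, VDI_SIZE_GB)
print(f"pool grew by {growth:.1f}G, saving {saving:.0f}%")
```

Run as-is, this prints "pool grew by 0.1G, saving 98%", which matches the figure quoted above.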
Now I create a Solaris zone. A zone is another type of virtual machine, based on virtualization features available in Solaris (and also in OpenSolaris). The zone vdi is created from the Solaris image that hosts the zone (OK, a zone does not have a vdi per se, but the objective here is only to measure disk-space consumption). I create a "sparse" zone, which means that it shares part of its image with the host OS. The zone disk-image takes 976MBytes versus more than 4GBytes for the Solaris host (to be precise, I ran this experiment on a Nevada distribution; on OpenSolaris a default zone takes 237MB). That already reduces disk-space consumption by more than 75%.
From there, I create a second zone by cloning the first one with the zoneadm clone command. Interestingly, since both zones are located on a ZFS file-system, this command leverages the ZFS cloning capabilities. Yes, nice integration between zones and ZFS: instead of duplicating the first file-system, ZFS creates a clone of it that shares most of its underlying blocks with the original. The resulting zone vdi takes 7.60M. Compared with the 4GBytes of the host OS, that is a saving of 99.8%.
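The two zone figures can be checked the same way. This is a small sketch under two assumptions of mine: sizes are in binary units (1GByte = 1024MBytes), and the host image is taken as exactly 4GBytes, although the text says "more than 4GBytes", so the real savings are slightly higher. The constant and function names are illustrative only.

```python
MB_PER_GB = 1024  # assumption: binary units

HOST_MB = 4 * MB_PER_GB   # lower bound for the Solaris host image
SPARSE_ZONE_MB = 976      # disk-image of the sparse zone
CLONED_ZONE_MB = 7.60     # zone created with zoneadm clone on ZFS


def saving_percent(used_mb: float, baseline_mb: float) -> float:
    """Disk-space saved relative to the baseline, as a percentage."""
    return 100.0 * (1.0 - used_mb / baseline_mb)


sparse_saving = saving_percent(SPARSE_ZONE_MB, HOST_MB)
clone_saving = saving_percent(CLONED_ZONE_MB, HOST_MB)

print(f"sparse zone: {sparse_saving:.1f}% saved")
print(f"cloned zone: {clone_saving:.1f}% saved")
```

With these inputs the sparse zone comes out at 76.2% and the cloned zone at 99.8%, consistent with the "more than 75%" and 99.8% quoted in the text.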
Whichever virtualization technology I use, ZFS now provides features that save a lot of disk-space when it comes to vdi. Solaris zones show that their reputation as a light-weight technology is well deserved.
Whether the future of the cloud is public, private, or delivered as an appliance, there is no doubt that ZFS and Solaris will be part of the picture.