This may sound counterintuitive, but turning on ZFS compression not only saves space, it also improves performance. This is because the time it takes to compress and decompress the data is less than the time it takes to read and write the uncompressed data to disk (at least on newer laptops with multi-core chips).
To turn on compression simply run:
pfexec zfs set compression=on rpool
All the child datasets of rpool will inherit this setting. For example:
bleonard@opensolaris:~$ zfs get compression rpool/export/home
NAME               PROPERTY     VALUE  SOURCE
rpool/export/home  compression  on     inherited from rpool/export
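If you'd like to double-check which datasets picked the setting up, a recursive query walks all the children of the pool (just a quick sketch):
zfs get -r compression rpool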
Note that only new data written after turning on compression is affected. Your existing data is left in its uncompressed state.
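If you want older files stored compressed as well, they have to be rewritten after the property is set, for example by copying a file and moving the copy back into place. A minimal sketch (the file name here is just a made-up example):
cp ~/Documents/big.log ~/Documents/big.log.tmp
mv ~/Documents/big.log.tmp ~/Documents/big.log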
To check the compression you're achieving on a dataset, use the compressratio property:
bleonard@opensolaris:~$ zfs get compressratio rpool/export/home
NAME               PROPERTY       VALUE  SOURCE
rpool/export/home  compressratio  1.00x  -
I'm seeing 1.00x because I just enabled compression. Over time, as I write new data, this number will increase.
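If you'd like to keep an eye on it, you can also ask for the space used alongside the ratio in one query (both are standard dataset properties):
zfs get used,compressratio rpool/export/home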
This part is optional, but it will give you a better feel for how compression works.
Start by creating a new (therefore, empty) file system, ensuring compression is off (otherwise it will inherit the setting from rpool):
bleonard@opensolaris:~$ pfexec zfs create -o compression=off rpool/test
bleonard@opensolaris:~$ cd /rpool/test
Copy a file into the test file system, turn on compression, and copy it again:
bleonard@opensolaris:/rpool/test$ pfexec cp /platform/i86pc/kernel/amd64/unix unix1
bleonard@opensolaris:/rpool/test$ pfexec zfs set compression=on rpool/test
bleonard@opensolaris:/rpool/test$ pfexec cp /platform/i86pc/kernel/amd64/unix unix2
Check the compression ratio:
bleonard@opensolaris:/rpool/test$ pfexec zfs get compressratio rpool/test
NAME        PROPERTY       VALUE  SOURCE
rpool/test  compressratio  1.28x  -
Check the difference in file size:
bleonard@opensolaris:/rpool/test$ du -hs u*
1.7M    unix1
936K    unix2
Also note the difference between using du and ls:
bleonard@opensolaris:/rpool/test$ ls -lh
total 2.6M
-rwxr-xr-x 1 root root 1.6M 2009-04-28 13:32 unix1*
-rwxr-xr-x 1 root root 1.6M 2009-04-28 13:33 unix2*
ls does not show the compressed file size! See "ZFS Compression, when du and ls appear to disagree" for a great explanation of this.
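If you want both numbers from a single command, the GNU ls used here can print the allocated size next to the logical size (a quick sketch; -s adds the allocated-size column):
ls -lsh /rpool/test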
Finally, delete unix1, which was not compressed, and notice how the compression ratio for the file system rises accordingly:
bleonard@opensolaris:/rpool/test$ pfexec rm unix1
bleonard@opensolaris:/rpool/test$ pfexec zfs get compressratio rpool/test
NAME        PROPERTY       VALUE  SOURCE
rpool/test  compressratio  1.77x  -
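Roughly speaking, compressratio is the uncompressed size of the data divided by the space it actually occupies, so you can sanity-check it against the du output above (metadata and rounding account for the small differences):
with unix1 and unix2: (1.7M + 1.7M) / (1.7M + 0.936M) ≈ 1.3
with only unix2: 1.7M / 0.936M ≈ 1.8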
When you're done, clean up:
bleonard@opensolaris:/rpool/test$ cd
bleonard@opensolaris:~$ pfexec zfs destroy rpool/test
Yes, the file can be copied to another medium, such as a USB stick. I think this is one of the reasons ls shows the uncompressed file size, so you'll understand how much space you'll actually need when moving the file.
Thanks for the pointer on UFS compression.
/Brian
fiocompress works for sparc and x86; the bits were being built only for sparc. BTW, dcfs supports only on-the-fly decompression, whereas ZFS supports on-the-fly compression and decompression. That is, any file that was compressed with fiocompress, then deleted and recreated, will be a normal file. ZFS files do not suffer this fate. Moinak and others are looking at enhancing dcfs to support on-the-fly compression.
And just to show the improvement in performance, check out the following test results on an AMD Duron(TM) processor @ 1100 MHz with 368 megabytes of RAM. The disks are a WD2000BB-00D and an ST3200822A set up in a ZFS mirror.
This is not a fast machine!
But....
# zfs set compression=off export/test
# time dd if=/dev/zero of=500MB.file bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 50.2898 s, 10.4 MB/s
real 0m50.313s
user 0m0.018s
sys 0m5.598s
# ls -l
total 604499
-rw-r--r-- 1 root root 524288000 2009-05-25 17:56 500MB.file
# du -sh 500MB.file
491M 500MB.file
And now for the test with compression...
# zfs set compression=on export/test
# time dd if=/dev/zero of=500MB.file bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 10.5315 s, 49.8 MB/s
real 0m10.561s
user 0m0.018s
sys 0m5.274s
# ls -l
total 1
-rw-r--r-- 1 root root 524288000 2009-05-25 17:57 500MB.file
# du -sh 500MB.file
512 500MB.file
Compression off:-
# zfs set compression=off export/test
# time dd if=/dev/zero of=500MB.file bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 51.821 s, 10.1 MB/s
real 0m51.842s
user 0m0.016s
sys 0m5.543s
Compression on:-
# zfs set compression=on export/test
# time dd if=/dev/zero of=500MB.file bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 13.5011 s, 38.8 MB/s
real 0m13.532s
user 0m0.018s
sys 0m5.378s
So with compression on it took 10-13 seconds, versus 50+ seconds with compression off, to create the 500MB file. Sure, /dev/zero is easily compressed, but it still demonstrates the value of spending CPU cycles rather than disk I/O.
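If anyone wants a less flattering version of this test, the same timing trick works with real, mixed data instead of zeros; a rough sketch (paths and dataset name are just examples, and man pages won't compress anywhere near as well as /dev/zero):
# zfs set compression=off export/test
# time cp -r /usr/share/man /export/test/man.off
# zfs set compression=on export/test
# time cp -r /usr/share/man /export/test/man.on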
Philip, awesome. Thanks for the proof on the performance side.
Did I discover a bug in these tests?
Setup: one pool, two hard drives in a stripe set.
In one terminal I have zpool iostat testpool 5 running.
In the other I'm running:
time dd if=/dev/zero of=/testpool/test1.img
This shows only 30M in iostat.
Now I used mkfile 10G /testpool/test2.img and I get 60M in iostat.
Now I used cat /dev/zero > /testpool/test3.img and I get 60M.
So I believe there is a big difference between using something like > and dd.
Could it be because the default block size of dd is not so optimal for ZFS?
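Maybe I should re-run dd with a bigger block size, something closer to ZFS's 128K default recordsize (the count below just recreates roughly 10GB):
time dd if=/dev/zero of=/testpool/test1.img bs=128k count=80000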
You just tested the perfect case for compression. Even a C64 can write several GB/s of zeros to floppy if you run-length encode them first. ZFS can store zeros even faster on your old laptop if you just create a sparse file by seeking over the desired number of bytes.
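For example, something like this never writes the zeros at all, so du will report next to nothing for it (the path is just a placeholder):
mkfile -n 10g /testpool/sparse.img
du -h /testpool/sparse.img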