Wednesday Apr 25, 2007

ZFS, Compression, and You

One of ZFS's strong points is its ability to self-tune. It monitors the activity on the disks that it's managing, and adjusts internally for the workload being asked of it. There are a few things that ZFS cannot know, and therefore can't really tune for. These are presented as properties.

Here is a typical list of properties (as of build 62) for a filesystem or a pool, with a short note on each line.

# zfs get all tank                                   # This is the command you enter
NAME  PROPERTY       VALUE                  SOURCE   # The column headers
tank  type           filesystem             -        # The type can be a filesystem or a volume. Pools are by default filesystems
tank  creation       Mon Apr 23 14:51 2007  -        # This is when the pool was created. See 'zfs history'.
tank  used           209K                   -        # Indicates how much of the pool is in use
tank  available      66.9G                  -        # Indicates how much remains within the pool
tank  referenced     23K                    -        # How much data is used by the pool itself, excluding filesystems and volumes within it.
tank  compressratio  1.00x                  -        # Current compression ratio. A value of 1.00x means compression is off or not having any effect.
tank  mounted        yes                    -        # Is this filesystem mounted anywhere?
tank  quota          none                   default  # What is the quota on this filesystem?
tank  reservation    none                   default  # Is there any space reserved for this filesystem alone?
tank  recordsize     128K                   default  # Current size of the data blocks. One of the few tunables
tank  mountpoint     /tank                  default  # Where is the filesystem mounted?
tank  sharenfs       off                    default  # Is this filesystem shared via NFS?
tank  checksum       on                     default  # Is block checksumming on?
tank  compression    off                    default  # Is compression on?
tank  atime          on                     default  # Should the access time be updated when files are read?
tank  devices        on                     default  # Are device nodes allowed in this filesystem?
tank  exec           on                     default  # Are executables allowed to be run from this filesystem?
tank  setuid         on                     default  # If 'off', then the setuid bit is ignored in this filesystem
tank  readonly       off                    default  # Readonly or read-write filesystem?
tank  zoned          off                    default  # Is this filesystem managed from a local zone?
tank  snapdir        hidden                 default  # Should the .zfs directory show up in ls -a output?
tank  aclmode        groupmask              default  # Controls chmod(2)'s ACL modification
tank  aclinherit     secure                 default  # Controls ACL inheritance
tank  canmount       on                     default  # Is this filesystem mountable?
tank  shareiscsi     off                    default  # Should this filesystem be shared as an iSCSI target?
tank  xattr          on                     default  # Are extended attributes enabled?
tank  copies         1                      default  # How many copies (1-3) of data blocks should be made?

Any property in the list above whose SOURCE column is not a hyphen can be modified. For example, you can change the mountpoint of the pool:

# zfs set mountpoint=/newmount tank

If the /newmount directory doesn't exist, it will be created, and the pool and its datasets will be remounted at the new location. For most properties, the change you make has taken effect by the time the zfs command exits. The two that matter for this discussion are exceptions: compression and copies only affect future writes to the filesystem. (The same is true of checksum and recordsize.)
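As a small sketch of the SOURCE-column rule above: the helper below (the name settable_properties is made up for this example) reads tab-separated "property, source" pairs on stdin and prints the properties whose source is not a hyphen. It assumes your zfs get supports -H (no headers, tab-separated output) and -o to pick columns.

```shell
# Hypothetical helper: print the names of properties that can be set.
# Input: one "property<TAB>source" pair per line, for example from
#   zfs get -H -o property,source all tank
settable_properties() {
    # A SOURCE of "-" marks a read-only property; keep everything else.
    awk -F '\t' '$2 != "-" { print $1 }'
}
```

To try it against a live pool, pipe the zfs get command shown in the comment into the function.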

What this means, of course, is that you can't suddenly get space from a filesystem just by turning on compression. Only new files that are written after compression is enabled will get the benefit. So we're stuck, right? Well, there's a workaround, but it's a little involved, and you need enough space to make a duplicate of the filesystem you're compressing.

Let's create a filesystem for Joe. Joe's got files that are compressible, and he'd like to use a compressed filesystem. But oops, the filesystem was created uncompressed:

# zpool create tank c0t0d0s0           # Create a pool called "tank" on device c0t0d0s0
# zfs list -r tank                     # Let's see some stats
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   106K  66.9G    18K  /tank       # We're using 106K out of 67G
# zfs get compression tank             # Is compression on?
NAME  PROPERTY     VALUE     SOURCE
tank  compression  off       default   # Answer: nope!
# zfs create tank/joefs                # Now let's create a filesystem for Joe to use. We'll call it joefs because we like to use our imagination

And now we email Joe and say "your filesystem is ready! It's on /tank/joefs". Joe happily creates files. We'll simulate that with mkfile:

# mkfile 2g /tank/joefs/mylargefile.data        # Create a 2 gigabyte file. mkfile's files are all zeros
# zfs list -r tank                              # Now let's see how much space we're using
NAME         USED  AVAIL  REFER  MOUNTPOINT
tank        2.00G  64.9G    19K  /tank          # Oops, we're using 2G of the pool
tank/joefs  2.00G  64.9G  2.00G  /tank/joefs    # And it's all in Joe's filesystem
# zfs get compression tank/joefs                # Did we forget to turn on compression?
NAME        PROPERTY     VALUE       SOURCE
tank/joefs  compression  off         default    # Yep, compression is off.

OK, so turning compression on won't change anything. Let's prove that:

# zfs set compression=on tank/joefs             # Enable compression
# zfs get compression tank/joefs                # Did we do it correctly?
NAME        PROPERTY     VALUE       SOURCE
tank/joefs  compression  on          local      # Yes, compression is on
# zfs list -r tank                              # How much space did we save?
NAME         USED  AVAIL  REFER  MOUNTPOINT
tank        2.00G  64.9G    19K  /tank
tank/joefs  2.00G  64.9G  2.00G  /tank/joefs    # Answer: None at all

The workaround might not be suitable for all environments, as it involves taking down the mountpoint temporarily. The steps we need to take are to create a new filesystem with compression on, snapshot Joe's filesystem, send Joe's filesystem to the new compressed filesystem, delete Joe's filesystem, and then rename the copy to the original. Step by step:

# zfs create -o compression=on tank/compressedfs    # Create a new filesystem, and this time remember to turn on compression!
# zfs snapshot tank/joefs@snap                      # Snapshot Joe's filesystem so that it can be sent
# zfs get -r compression tank                       # Check to see if it's set
NAME               PROPERTY     VALUE              SOURCE
tank               compression  off                default    # It's still off for our pool
tank/compressedfs  compression  on                 local      # Our new filesystem has it turned on, as expected
tank/joefs         compression  on                 local      # The flag hasn't changed for this filesystem - it's still on
tank/joefs@snap    compression  -                  -
# zfs send tank/joefs@snap | zfs receive tank/compressedfs/tmpjoe    # Now we get to transfer the snapshot to the compressed filesystem. We're calling it "tmpjoe" because it's just a temporary. This will take quite some time - we're copying 2 gigabytes through stdout/stdin, compressing it, and writing it to disk
# zfs list -r tank                                  # Now let's see what the result of all that is
NAME                            USED  AVAIL  REFER  MOUNTPOINT
tank                           2.00G  64.9G    21K  /tank                        # Still using 2 gig
tank/compressedfs                37K  64.9G    19K  /tank/compressedfs           # Our new filesystem is only using 37K! Files that are all nulls are easy to compress
tank/compressedfs/tmpjoe         18K  64.9G    18K  /tank/compressedfs/tmpjoe    # The temporary copy of joefs is using only 18K. Compression really works.
tank/compressedfs/tmpjoe@snap      0      -    18K  -
tank/joefs                     2.00G  64.9G  2.00G  /tank/joefs
tank/joefs@snap                    0      -  2.00G  -
# ls -l /tank/joefs /tank/compressedfs/tmpjoe/    # Let's take a look at the two directories and see if the files are there
/tank/compressedfs/tmpjoe/:
total 1                                           # We're only taking up one block in the compressed filesystem
-rw------T   1 root     root     2147483648 Apr 24 14:59 mylargefile.data    # and yet we've got the full 2 gigabyte file here

/tank/joefs:
total 4194865                                     # That's a lot of blocks
-rw------T   1 root     root     2147483648 Apr 24 14:59 mylargefile.data    # And it's the same size as the compressed filesystem
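As a rough check on those two "total" lines, and assuming ls counts in 512-byte blocks (which is my understanding of Solaris ls, but treat it as an assumption), the block counts can be turned back into bytes. The function name blocks_to_bytes is invented for this sketch.

```shell
# Hypothetical helper: convert an ls "total" block count to bytes,
# assuming 512-byte blocks.
blocks_to_bytes() {
    echo $(( $1 * 512 ))
}
```

With the uncompressed listing, blocks_to_bytes 4194865 gives 2147770880, which is the 2147483648-byte file plus 287232 bytes of other blocks. The compressed copy's single block comes to 512 bytes.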

What's the state of compression on our filesystems? There's an easy way to find out:

# zfs get -r compression tank
NAME                           PROPERTY     VALUE                          SOURCE
tank                           compression  off                            default
tank/compressedfs              compression  on                             local
tank/compressedfs/tmpjoe       compression  on                             inherited from tank/compressedfs    # A-ha. This means that tmpjoe was automatically compressed because tank/compressedfs had compression on and tmpjoe inherited that.
tank/compressedfs/tmpjoe@snap  compression  -                              -
tank/joefs                     compression  off                            default
tank/joefs@snap                compression  -                              -

Our plan now is to rename tank/compressedfs/tmpjoe to tank/joefs, but we'll still need to address the compression issue. If we just rename it, then tank/joefs will still inherit the compression=off value from tank, so any new files created in /tank/joefs will be uncompressed. Not to put too fine a point on it, but the files that are in tmpjoe will not be decompressed during the rename operation: compression (and 'copies') only affects future writes, not existing data.
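One way to spot datasets in this position before renaming them is to look for a SOURCE that starts with "inherited from". The sketch below (inherited_datasets is a made-up name) reads tab-separated "name, source" pairs, again assuming zfs get accepts -H and -o.

```shell
# Hypothetical helper: print datasets whose value is inherited from a
# parent, and so may change if the dataset is moved elsewhere.
# Input: one "name<TAB>source" pair per line, for example from
#   zfs get -H -r -o name,source compression tank
inherited_datasets() {
    awk -F '\t' '$2 ~ /^inherited from / { print $1 }'
}
```

Fed the listing above, it would print only tank/compressedfs/tmpjoe.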

So instead of inheriting the compression value, let's set it directly:

# zfs set compression=on tank/compressedfs/tmpjoe
# zfs get -r compression tank
NAME                           PROPERTY     VALUE                          SOURCE
tank                           compression  off                            default
tank/compressedfs              compression  on                             local
tank/compressedfs/tmpjoe       compression  on                             local    # No longer inherited, it has now got its own ('local') value for compression
tank/compressedfs/tmpjoe@snap  compression  -                              -
tank/joefs                     compression  off                            default
tank/joefs@snap                compression  -                              -

Now it will keep its value, no matter what dataset it is part of, and we can safely go through the steps of deleting the old filesystem and renaming the new one:

# zfs destroy -r tank/joefs                          # This is the nerves-inducing step, of course. If you want, you can do a 'zfs rename tank/joefs tank/saved-joefs' or something like that. Just remember to delete it once you're convinced that the rest of the steps worked
# zfs rename tank/compressedfs/tmpjoe tank/joefs     # And now our old filesystem is back! And compressed!
# zfs list -r tank
NAME                USED  AVAIL  REFER  MOUNTPOINT
tank                202K  66.9G    21K  /tank
tank/compressedfs    18K  66.9G    18K  /tank/compressedfs
tank/joefs           33K  66.9G    18K  /tank/joefs
tank/joefs@snap      15K      -    18K  -
# zfs destroy tank/joefs@snap                        # Cleaning up the now-unneeded snapshot
# zfs list -r tank
NAME                USED  AVAIL  REFER  MOUNTPOINT
tank                172K  66.9G    21K  /tank
tank/compressedfs    18K  66.9G    18K  /tank/compressedfs    # No harm in leaving this around - it may be useful and only takes up 18K of disk space
tank/joefs           18K  66.9G    18K  /tank/joefs
# zfs get -r compression tank                        # Double-checking that we've got everything we wanted
NAME               PROPERTY     VALUE              SOURCE
tank               compression  off                default
tank/compressedfs  compression  on                 local
tank/joefs         compression  on                 local      # Et voilà!
# ls -l /tank/joefs
total 1
-rw------T   1 root     root     2147483648 Apr 24 14:59 mylargefile.data
# od -x /tank/joefs/mylargefile.data 
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
20000000000

Not pretty, but it's definitely doable.
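For reference, here is the whole sequence in one place, as a sketch rather than a tool. The function (recompress_plan is an invented name) only prints the commands in the order used above, so you can read them over before running anything. The "tmp-" prefix and the "snap" snapshot name are choices made for this example.

```shell
# Hypothetical helper: print, but do not run, the commands from the
# walkthrough. $1 is the dataset to recompress, $2 the compressed parent.
recompress_plan() {
    src=$1                            # e.g. tank/joefs
    parent=$2                         # e.g. tank/compressedfs
    tmp="$parent/tmp-${src##*/}"      # temporary copy under the parent
    printf '%s\n' \
        "zfs create -o compression=on ${parent}" \
        "zfs snapshot ${src}@snap" \
        "zfs send ${src}@snap | zfs receive ${tmp}" \
        "zfs set compression=on ${tmp}" \
        "zfs destroy -r ${src}" \
        "zfs rename ${tmp} ${src}" \
        "zfs destroy ${src}@snap"
}
```

Running recompress_plan tank/joefs tank/compressedfs lists the seven steps for Joe's filesystem. As noted earlier, you may prefer to swap the destroy -r step for a rename to a holding name until you're convinced the copy is good.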
