A faster ZFS snapshot massacre

I moved the ZFS snapshot script into the office and started running it on our build system. Being cautious with other people's data, I ran the clean-up script in “do nothing” mode so I could be sure it was not removing snapshots that it should keep. After a while running like this we had over 150,000 snapshots across 114 file systems, which meant that zfs list was taking a long time to run.

So long, in fact, that the clean-up script was not making forward progress against snapshots being created every 10 minutes. So I now have a new clean-up script. It is functionally identical to the old one but a lot faster. Unfortunately I have since cleaned out the snapshots, so the times below are not what they were (zfs list alone was taking 14 minutes); the difference is still easy to see.

When run with the option to do nothing, the old script takes:

# time /root/zfs_snap_clean > /tmp/zfsd2

real    2m23.32s
user    0m21.79s
sys     1m1.58s
#

And the new:

# time ./zfs_cleanup -n > /tmp/zfsd

real    0m7.88s
user    0m2.40s
sys     0m4.75s
#

which is a result: roughly 18 times faster in elapsed time.


As you can see, the new script is mostly a nawk script and, more importantly, it calls the zfs command only once to get all the information about the snapshots:


#!/bin/ksh -p
#
# Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License, Version 1.0 only
# (the "License").  You may not use this file except in compliance
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#	Script to clean up snapshots created by the script from this blog
#	entry:
#
#	http://blogs.sun.com/chrisg/entry/cleaning_up_zfs_snapshots
#
#	or using the command given in this entry to create snapshots when
#	users mount a file system using SAMBA:
#
#	http://blogs.sun.com/chrisg/entry/samba_meets_zfs
#
#	Chris.Gerhard@sun.com 23/11/2006
#

PATH=$PATH:$(dirname $0)

while getopts n c
do
	case $c in
	n) DO_NOTHING=1 ;;
	\?) echo "$0 [-n] [filesystems]"
		exit 1 ;;
	esac
done
shift $(($OPTIND - 1))
if (( $# == 0))
then
	set - $(zpool list -Ho name)
fi


export NUMBER_OF_SNAPSHOTS_boot=${NUMBER_OF_SNAPSHOTS:-10}
export DAYS_TO_KEEP_boot=${DAYS_TO_KEEP:-365}

export NUMBER_OF_SNAPSHOTS_smb=${NUMBER_OF_SNAPSHOTS:-100}
export DAYS_TO_KEEP_smb=${DAYS_TO_KEEP:-14}

export NUMBER_OF_SNAPSHOTS_month=${NUMBER_OF_SNAPSHOTS:-24}
export DAYS_TO_KEEP_month=365

export NUMBER_OF_SNAPSHOTS_day=${NUMBER_OF_SNAPSHOTS:-$((28 * 2))}
export DAYS_TO_KEEP_day=${DAYS_TO_KEEP:-28}

export NUMBER_OF_SNAPSHOTS_hour=$((7 * 24 * 2))
export DAYS_TO_KEEP_hour=$((7))

export NUMBER_OF_SNAPSHOTS_minute=$((100))
export DAYS_TO_KEEP_minute=$((1))


zfs get -Hrpo name,value creation $@ | sort -r -n -k 2 |\
	nawk -v now=$(convert2secs $(date)) -v do_nothing=${DO_NOTHING:-0} '
function ttg(time)
{
	return (now - (time * 24 * 60 * 60));
}
BEGIN {
	time_to_go["smb"]=ttg(ENVIRON["DAYS_TO_KEEP_smb"]);
	time_to_go["boot"]=ttg(ENVIRON["DAYS_TO_KEEP_boot"]);
	time_to_go["minute"]=ttg(ENVIRON["DAYS_TO_KEEP_minute"]);
	time_to_go["hour"]=ttg(ENVIRON["DAYS_TO_KEEP_hour"]);
	time_to_go["day"]=ttg(ENVIRON["DAYS_TO_KEEP_day"]);
	time_to_go["month"]=ttg(ENVIRON["DAYS_TO_KEEP_month"]);
	number_of_snapshots["smb"]=ENVIRON["NUMBER_OF_SNAPSHOTS_smb"];
	number_of_snapshots["boot"]=ENVIRON["NUMBER_OF_SNAPSHOTS_boot"];
	number_of_snapshots["minute"]=ENVIRON["NUMBER_OF_SNAPSHOTS_minute"];
	number_of_snapshots["hour"]=ENVIRON["NUMBER_OF_SNAPSHOTS_hour"];
	number_of_snapshots["day"]=ENVIRON["NUMBER_OF_SNAPSHOTS_day"];
	number_of_snapshots["month"]=ENVIRON["NUMBER_OF_SNAPSHOTS_month"];
} 
/.*@.*/ {
	split($1, a, "@");
	split(a[2], b, "_");
	if (number_of_snapshots[b[1]] != 0 &&
		++snap_count[a[1], b[1]] > number_of_snapshots[b[1]] &&
		time_to_go[b[1]] > $2) {
		str = sprintf("zfs destroy %s\n", $1);
		# Print via "%s" so a "%" in a snapshot name is not taken as a format.
		printf("%s", str);
		if (do_nothing == 0) {
			system(str);
		}
	}
}'
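
To see the retention rule at work without touching a pool, here is a minimal sketch that runs the same test over made-up data. It is not the script above: the function name select_expired, the sample snapshot names and the timestamps are all invented for illustration, there is a single keep/days pair instead of one per snapshot class, "now" is passed in as a plain number rather than taken from convert2secs, and plain awk stands in for nawk. It only prints the destroy commands, as the -n option does.

```shell
#!/bin/sh
# Hypothetical dry run of the retention rule on made-up data.
# select_expired is an illustrative name, not part of the original script.
select_expired() {
	# $1 = "now" in epoch seconds, $2 = snapshots to keep, $3 = days to keep
	sort -r -n -k 2 | awk -v now="$1" -v keep="$2" -v days="$3" '
	/@/ {
		split($1, a, "@");	# a[1] = file system, a[2] = snapshot name
		split(a[2], b, "_");	# b[1] = snapshot class, e.g. "minute"
		cutoff = now - (days * 24 * 60 * 60);
		# Destroy only when over the count AND older than the cutoff.
		if (++count[a[1], b[1]] > keep && cutoff > $2)
			printf("zfs destroy %s\n", $1);
	}'
}

# Sample input in the "name<TAB>creation" form that zfs get -Hp prints.
printf 'tank/home@minute_a\t1000000\ntank/home@minute_b\t1150000\ntank/home@minute_c\t1200000\ntank/home\t900000\n' |
	select_expired 1200000 1 1
# prints: zfs destroy tank/home@minute_a
```

With one snapshot to keep and one day of history, minute_c survives because it is within the count, minute_b survives because it is younger than a day, and only minute_a meets both conditions. The file system line has no "@" and is ignored.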

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com

Search

Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today