MySQL backup soon becomes an important matter when the database is
used in production. The pain-point comes from the fact that while backuping the database is not available to respond
to client requests anymore. With mysqldump - the
standard tool for performing MySQL backups - and a large database the operation can go over many tenth of
minutes if not hours. If I am running my business on line this is simply not
classical approach to workaround this problem
is to take advantage of MySQL replication. I set up a master/slave
configuration where the slave acts as copy of the master. Then, when
needed, I run mysqldump on the
slave without any service interruption on the master.
But ZFS snapshosts bring a new straightforward approach that avoids the pain and the complexity of a master/slave replication.
are a key feature of ZFS that allows me to save a copy of any ZFS file
system in less than a second. Yesterday, Sept 27th I first created
ZFS filesystem for my database, and since I decided
to backup my database everyday I also created a first snapshot. I am using InnoDB, so I made sure that both the data and the logs are located in the myDB filesystem. Today,
Sept 28th, I created a second snapshot. Here are the commands:
Yesterday, September 27th
# zfs create pool/myDB
# zfs snapshot pool/myDB@Sept27.2009
Today, September 28th
# zfs snapshot pool/myDB@Sept28.2009
that as long as the dataset does not change, the snapshot does not use
any disk space, so since there was no change to the database when I
created the first snapshot zfs list reporst me a null USED space :
# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
pool/myDB@Sept27.2009 0 - 667M
commands are simple, but what about the performance? If I run the
commands just as is, the filesystem that I save takes 667M on disk. On my laptop I
have much more data in my home directory. Let's see how long it takes me to
snapshot this data.
# zfs list rpool/export/home/tmanfe
NAME USED AVAIL REFER MOUNTPOINT
rpool/export/home/tmanfe 47.8G 22.5G 28.1G /export/home/tmanfe
# time zfs snapshot firstname.lastname@example.org
I am able to snapshot 47.9GBytes of data in 0.37seconds.
Good enough. As long as I can stay bellow a second, this will certainly
go unnoticed by the users. By the way, should my dataset increase, the
snapshot time will stay the same.
Now, what do I need to do with MySQL?
my data to be properly and consistently saved, I need to make sure that
it gets written to disk. To do this, MySQL provides me with the command
FLUSH and its TABLES WITH READ LOCK option. About this option the documentation says: "Closes all open tables and locks all tables for all
databases with a read lock until you explicitly release the
lock by executing UNLOCK TABLES
. This is a very convenient way to get backups
if you have a file system such as Veritas that can take
snapshots in time". Note that the documentation should be modified to add ZFS as another filesystem that supports snapshots. What FLUSH TABLES WITH READ LOCK really does is:
Since I am using InnoDB I also need to do a couple of things:
No such thing is
required with MyISAM.
Then, I create the ZFS snapshot and I release the tables with the command UNLOCK TABLES. I am done with MySQL.
I need to run and synchronize commands at both the shell level and in MySQL. More specifically, after executing the FLUSH in MySQL, I must not close the connection, otherwise the locks will be freed. An easy way to meet these requirements is to use a Python script:
dbh = MySQLdb.connect(host='localhost', user='mysqdl-admin', passwd=mysql_pass)
dbh.query('FLUSH TABLES WITH READ LOCK;')
os.system('zfs snapshot pool/myDB@Sept29.2009 )
snapshot I created is located on the same disks and server than my
original data. To put a real backup system in place I need to transfer
the snapshot to another backup server or storage system. Once again ZFS
provides me with all I need. The zfs send command enables me to send a snapshot somewhere else. The only prerequisite is that ZFS must be available on my backup server in order to take advantage of the snapshot that it receives.
Here is the exact syntax:
# zfs send -Ri pool/myDB@Sept27.2009 pool/myDB@Sept28.2009 | ssh backupHost “zfs receive -Fd pool”
With the -i option, only the difference between my two snapshots is sent to a system named backupHost.
This option enables me to do incremental backups. Ahead of incremental
backups I need to send a first complete snapshot to the backup host. I
did it yesterday, when creating my very first snapshot:
# zfs send -R pool/myDB@Sept27.2009 | ssh backupHost “zfs receive -Fd pool”
Note that the stream generated by zfs send replicates the filesystem, not the snapshot.
The stream is then funneled into an ssh command that connects to the backup host. Once connected, the zfs receive
command is executed that takes the content of the stream - i.e. the
filesytem - and copies it into a zfs pool that is local to the backup
I am done.