Running OpenSolaris and Zones in the Amazon Cloud - Part 2

Introduction

In Part 1 of this series of tutorials on OpenSolaris and Zones in AWS we described a method for creating zones within an EC2 instance running OpenSolaris.

This is Part 2 of the series, where we describe a method for backing up the zones using ZFS snapshots, sending a copy to a secondary EBS volume, and then taking an EBS snapshot of the secondary volume. We also show how to recover zones from our ZFS snapshots, and how to recover from the secondary EBS volume if our primary EBS volume fails.

In Part 3 we will explain how to save a fully configured environment using an AMI and EBS snapshots, which can then be cloned and brought up in minutes.

Disclaimer: while the procedures described in this document have been tested, it is the responsibility of the reader to verify that these procedures work in their environment.

Prerequisites

  • Basic understanding of AWS EC2, including managing EBS volumes and snapshots.
  • Basic understanding of OpenSolaris, including ZFS.
  • Zones up and running as described in Part 1 of this tutorial.

Example EC2 environment

As described in Part 1, I created three EBS volumes, one for shared software, one for zones storage, and another one for zones backup. The EC2 environment is displayed below.

[Figure: AWS EC2 Environment]


Our goal is to perform the following steps:

  • Create ZFS snapshots for zone1 and zone2.

  • Send these snapshots to our backup EBS volume.

  • Create an EBS snapshot of our backup EBS volume.

I create ZFS snapshots on an hourly basis, and then create my EBS snapshot once a day. Pick the schedule that works for you, and make sure that you test the restore process!

Before we run our first backup, my ZFS environment for zone storage and zone backup is shown below.

root:~# zfs list -r zones
NAME                   USED  AVAIL  REFER  MOUNTPOINT
zones                 4.55G  3.26G    22K  /zones
zones/zone1           3.91G  3.26G    22K  /zones/zone1
zones/zone1/ROOT      3.91G  3.26G    19K  legacy
zones/zone1/ROOT/zbe  3.91G  3.26G  3.90G  legacy
zones/zone2            658M  3.26G    22K  /zones/zone2
zones/zone2/ROOT       658M  3.26G    19K  legacy
zones/zone2/ROOT/zbe   658M  3.26G   650M  legacy
root:~#
root:~# zfs list -r zones-backup
NAME           USED  AVAIL  REFER  MOUNTPOINT
zones-backup    70K  15.6G    19K  /zones-backup
root:~#

Backup Operations

I wrote a simple Perl script to create ZFS snapshots and send them to a backup ZFS pool. You can get a copy of the script at: zsnap-backup.pl. Please review the script before using it to ensure it is suitable for your environment.

The script takes the following arguments.

root:~/bin# ./zsnap-backup -h

usage:

    zsnap-backup [-q] -t full|inc|list -f file_system
      -s source_pool -d dest_pool

      -q : run in quiet mode
      -t : type of backup to run:
             full : send complete snapshot to destination pool
             inc  : send incremental snapshot to destination pool
             list : list snapshots on source and destination pools
      -f : name of ZFS file system to backup
      -s : name of the source ZFS pool
      -d : name of the backup ZFS pool

Before I run my first backup, let's run the zsnap-backup script with the list option.

root:~/bin# ./zsnap-backup -t list -s zones -d zones-backup -f zone1
cannot open 'zones-backup/zone1': dataset does not exist
cannot open 'zones-backup/zone1': dataset does not exist
Source File System List
=========================
Name     : zones/zone1
Last Snap:
Name     : zones/zone1/ROOT
Last Snap:
Name     : zones/zone1/ROOT/zbe
Last Snap:
=========================
Dest File System List
=========================
=========================
Source Snapshot List
=========================
=========================
Destination Snapshot List
=========================
=========================
root:~/bin#

The "dataset does not exist" messages tell me that the zones-backup/zone1 dataset does not yet exist on the destination pool. It will be created when the first full backup is run.

The first backup that I perform needs to be a "full" backup, which sends a complete copy of the ZFS snapshots to the zones-backup pool. In the example below, we back up our zone1 and zone2 file systems.

root:~/bin# ./zsnap-backup -t full -s zones -d zones-backup -f zone1
cannot open 'zones-backup/zone1': dataset does not exist
cannot open 'zones-backup/zone1': dataset does not exist
Sending: zone1@20090921-230804
root:~/bin#
root:~/bin# ./zsnap-backup -t full -s zones -d zones-backup -f zone2
cannot open 'zones-backup/zone2': dataset does not exist
cannot open 'zones-backup/zone2': dataset does not exist
Sending: zone2@20090921-230848
root:~/bin#

The time to complete the full backup depends on the size of the dataset. It could take a while, so be patient. Let's list our backups again with the list option.

root:~/bin# ./zsnap-backup -t list -s zones -d zones-backup -f zone1
Source File System List
=========================
Name     : zones/zone1
Last Snap: zones/zone1@20090921-230804
Name     : zones/zone1/ROOT
Last Snap: zones/zone1/ROOT@20090921-230804
Name     : zones/zone1/ROOT/zbe
Last Snap: zones/zone1/ROOT/zbe@20090921-230804
=========================
Dest File System List
=========================
Name     : zones-backup/zone1
Last Snap: zones-backup/zone1@20090921-230804
Name     : zones-backup/zone1/ROOT
Last Snap: zones-backup/zone1/ROOT@20090921-230804
Name     : zones-backup/zone1/ROOT/zbe
Last Snap: zones-backup/zone1/ROOT/zbe@20090921-230804
=========================
Source Snapshot List
=========================
zones/zone1@20090921-230804
zones/zone1/ROOT@20090921-230804
zones/zone1/ROOT/zbe@20090921-230804
=========================
Destination Snapshot List
=========================
zones-backup/zone1@20090921-230804
zones-backup/zone1/ROOT@20090921-230804
zones-backup/zone1/ROOT/zbe@20090921-230804
=========================
root:~/bin#

We see from the example above that we have created snapshots on the source pool zones and sent them to the destination pool zones-backup.
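For readers who want to see what a full backup involves at the ZFS level, here is a minimal sketch of the equivalent manual steps. It is not the zsnap-backup script itself: the helper names, the timestamp format (inferred from the snapshot names above), and the -d receive flag are assumptions. The function only prints the commands so that they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: prints the ZFS commands for a full backup instead of running them.
# Assumptions: helper names, snapshot name format, and receive flags are
# illustrative and are not taken from zsnap-backup.

# Build a snapshot name such as 20090921-230804.
snap_stamp() {
    date +%Y%m%d-%H%M%S
}

# usage: full_backup_cmds source_pool dest_pool file_system snapshot_name
full_backup_cmds() {
    src="$1/$3@$4"
    # -r snapshots the file system and all of its descendants
    echo "zfs snapshot -r $src"
    # -R sends the file system, its descendants, and their snapshots;
    # receive -d drops the source pool name and recreates the rest under dest
    echo "zfs send -R $src | zfs receive -d $2"
}

full_backup_cmds zones zones-backup zone1 "$(snap_stamp)"
```

With -d on the receiving side, zones/zone1 arrives as zones-backup/zone1, which matches the destination names in the listing above.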

Now we will perform an incremental backup. Take a look at the script and you will see that we use the -R and -I options of the zfs send command. These cause an incremental replication stream to be generated.

root:~/bin# ./zsnap-backup -t inc -s zones -d zones-backup -f zone1
Sending: zone1@20090921-231140
root:~/bin#
root:~/bin# ./zsnap-backup -t inc -s zones -d zones-backup -f zone2
Sending: zone2@20090921-231158
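As a rough illustration of what an incremental run involves, the sketch below prints the kind of pipeline that the -R and -I options of zfs send produce. The helper name and the -d receive flag are assumptions, and the script's real invocation may differ.

```shell
#!/bin/sh
# Sketch only: prints the pipeline for an incremental backup.
# Assumptions: helper name and receive flags are illustrative.

# usage: inc_backup_cmds source_pool dest_pool file_system last_sent_snap new_snap
inc_backup_cmds() {
    fs="$1/$3"
    # Take a new recursive snapshot to send
    echo "zfs snapshot -r $fs@$5"
    # -I sends every snapshot between the last one sent and the new one;
    # -R applies this to the file system and all of its descendants
    echo "zfs send -R -I $fs@$4 $fs@$5 | zfs receive -d $2"
}

inc_backup_cmds zones zones-backup zone1 20090921-230804 20090921-231140
```

If the destination file systems have been modified since the last receive, zfs receive may reject the stream unless -F is added, which rolls the destination back to its most recent snapshot first.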

The incremental backup should complete in seconds. Use the list option for each file system to see the results. You can now schedule the incremental backups to run on a regular basis using cron if desired.

At this point we have created ZFS snapshots of our zones and sent a copy to our backup pool which resides on a separate EBS volume.

In the next step, we will create an EBS snapshot of the volume that stores our zones-backup pool. You could use the AWS Management Console, Elasticfox, or a similar tool to perform the EBS snapshot step. In our example below, we provide the command-line version. This assumes that you have the EC2 API tools installed in your environment and that you know the ID of the volume where your zones-backup pool is located.

In our example, the EBS volume id is: vol-f546b69c

I have found that it is best to first export the ZFS pool before creating the EBS snapshot. Remember to import after the EBS snapshot command is run. If you have your incremental backups running in cron, schedule the EBS snapshot to occur at a time when you know that the zsnap-backup command will not be running.

root:~/bin# zpool export zones-backup
root:~/bin# ec2-create-snapshot vol-f546b69c
SNAPSHOT        snap-74bd0b1d   vol-f546b69c    pending 2009-09-22T06:14:52+0000
root:~/bin#
root:~/bin# zpool import zones-backup
root:~/bin# ec2-describe-snapshots | egrep snap-74bd0b1d
SNAPSHOT   snap-74bd0b1d   vol-f546b69c  completed  2009-09-22T06:14:52+0000  100%
root:~/bin#
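The export, snapshot, and import sequence above can be wrapped in a small script so that the pool is re-imported even when the snapshot request fails. This is a minimal sketch: the function names and the DRY_RUN switch are assumptions, and by default it only prints the commands.

```shell
#!/bin/sh
# Sketch: export the backup pool, snapshot its EBS volume, and always re-import.
# Assumptions: function names and the DRY_RUN switch are illustrative.
# DRY_RUN defaults to 1, so the commands are printed rather than executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: ebs_snapshot_pool pool_name volume_id
ebs_snapshot_pool() {
    pool=$1
    volume=$2
    run zpool export "$pool" || return 1
    # Request the EBS snapshot while the pool is exported
    run ec2-create-snapshot "$volume"
    status=$?
    # Re-import even if the snapshot request failed
    run zpool import "$pool" || return 1
    return $status
}

ebs_snapshot_pool zones-backup vol-f546b69c
```

Setting DRY_RUN=0 in the environment makes the script execute the commands instead of printing them.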

At this point we have completed the following:

  • Created ZFS snapshots of our zones

  • Sent a copy of the snapshots to a backup ZFS pool

  • Created an EBS snapshot of the volume which contains our backup ZFS pool

Restore Operations

In general, there are three scenarios that we want to recover from:

  • User deleted a file or a handful of files in the zone
  • User did something catastrophic such as "rm -r *" while in the /usr directory
  • The EBS volume containing the zones ZFS pool crashed

In the examples included below, we explain how to recover from each of these scenarios. Please see Working With ZFS Snapshots for more information.

Copying Individual Files From a Snapshot

It is possible to copy individual files from a snapshot by changing into the hidden .zfs directory of the file system that has been snapped. In our example we have created a couple of test files. After creation of these files a ZFS snapshot has been created. We then delete these files by accident and want to recover them. In the global zone we can restore the deleted files.

For example, we deleted two files, test1 and test2, by accident from zone1. We can then restore from our ZFS snapshot as shown below.

zone1

root@zone1:/var/tmp# ls -l test*
test*: No such file or directory
root@zone1:/var/tmp#

global zone

root:~# cd /zones/zone1/root/.zfs/snapshot/20090921-232400/var/tmp
root:~# ls test*
test1  test2
root:~# cp test* /zones/zone1/root/var/tmp

zone1

root@zone1:~# cd /var/tmp
root@zone1:/var/tmp# ls test*
test1  test2
root@zone1:/var/tmp#
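The copy from the hidden .zfs directory can also be wrapped in a small helper. The sketch below is one possible shape: the function name and argument layout are assumptions, and it is meant to be run from the global zone.

```shell
#!/bin/sh
# Sketch: copy files out of a ZFS snapshot back into the live file system.
# Assumption: helper name and argument layout are illustrative.

# usage: restore_from_snapshot fs_root snapshot_name relative_dir file...
restore_from_snapshot() {
    fs_root=$1
    snap=$2
    rel_dir=$3
    shift 3
    snap_dir="$fs_root/.zfs/snapshot/$snap/$rel_dir"
    if [ ! -d "$snap_dir" ]; then
        echo "snapshot directory not found: $snap_dir" >&2
        return 1
    fi
    for f in "$@"; do
        # -p preserves the mode and timestamps of the snapshot copy
        cp -p "$snap_dir/$f" "$fs_root/$rel_dir/" || return 1
    done
}

# Example, using the paths from the transcript above:
# restore_from_snapshot /zones/zone1/root 20090921-232400 var/tmp test1 test2
```

The helper refuses to copy anything when the named snapshot does not contain the requested directory, which guards against a mistyped snapshot name.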

Rolling Back to a Previous Snapshot

We can use rollback to recover an entire file system. In our example, we accidentally remove the entire contents of the /usr directory. To recover, we will halt the zone, perform the rollback, and then boot the zone.

Make sure you know what you are doing and have a snapshot to roll back to before performing the following. You also need to know the name of the snapshot that you want to roll back to. We will roll back to the latest snapshot for each zone1 file system. We also perform the rollback on our zones-backup pool to keep the two pools synchronized.

zone1

root@zone1:/usr# cd /usr
root@zone1:/usr# rm -rf *
root@:/usr# ls
-bash: /usr/bin/ls: No such file or directory
-bash: /usr/bin/hostname: No such file or directory

global zone

root:~# zoneadm -z zone1 halt
root:~#
root:~/bin# zfs rollback zones/zone1@20090921-232400
root:~/bin# zfs rollback zones/zone1/ROOT@20090921-232400
root:~/bin# zfs rollback zones/zone1/ROOT/zbe@20090921-232400
root:~/bin#
root:~/bin# zfs rollback zones-backup/zone1@20090921-232400
root:~/bin# zfs rollback zones-backup/zone1/ROOT@20090921-232400
root:~/bin# zfs rollback zones-backup/zone1/ROOT/zbe@20090921-232400
root:~/bin#
root:~/bin# zoneadm -z zone1 boot

zone1

root@zone1:~# cd /usr
root@zone1:/usr# ls
X         dict      kernel    man       perl5     sadm      spool
adm       games     kvm       net       platform  sbin      src
bin       gnu       lib       news      preserve  sfw       tmp
ccs       has       local     old       proc      share     xpg4
demo      include   mail      opt       pub       snadm
root@zone1:/usr#
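Because each file system in the zone1 hierarchy is rolled back separately, the commands can be generated from a list of dataset names. The sketch below only prints the commands; the helper name is an assumption.

```shell
#!/bin/sh
# Sketch: print a "zfs rollback" command for every dataset name read from stdin.
# Assumption: helper name is illustrative. On a live system the names could
# come from: zfs list -H -o name -r zones/zone1

# usage: rollback_cmds snapshot_name < dataset_names
rollback_cmds() {
    snap=$1
    while read -r dataset; do
        # Skip blank lines
        [ -n "$dataset" ] || continue
        echo "zfs rollback $dataset@$snap"
    done
}

# Dataset names as shown earlier by "zfs list -r zones"
printf '%s\n' zones/zone1 zones/zone1/ROOT zones/zone1/ROOT/zbe |
    rollback_cmds 20090921-232400
```

Note that zfs rollback only accepts the most recent snapshot of a file system. Rolling back to an older one requires the -r option, which destroys the snapshots taken after it.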

Recovering from an EBS Volume Failure

Next, we provide an example of how to recover from a failure of the EBS volume where the zones data is stored.

It is important that prior to performing these steps you know what you are doing and have performed at least a full backup of the zones file systems.

In our example scenario, the EBS volume for the zones ZFS pool has been corrupted somehow. We will perform the following steps to recover from our zones-backup pool.

  • Halt the zones

  • Export the zones pool

  • Export the zones-backup pool

  • Import the zones-backup pool renaming it to "zones"

global zone

root:~# zoneadm -z zone1 halt
root:~# zoneadm -z zone2 halt
root:~#
root:~# zpool export zones
root:~# zpool export zones-backup
root:~#
root:~# zpool import zones-backup zones
root:~#
root:~# zoneadm -z zone1 boot
root:~# zoneadm -z zone2 boot

In effect, we have booted our zones from the zones-backup pool by importing it under the name zones. To get our backup process going again, we can delete the old zones EBS volume, create a new EBS volume to replace zones-backup, and create a new zones-backup pool on this new volume.

We can then perform a full backup as described above and re-start our regular incremental backup schedule.
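The rebuild steps described above might look roughly like the following with the EC2 API tools. This is a sketch that only prints the commands: the helper name is an assumption and every identifier passed to it is a hypothetical placeholder that must be replaced with values from your own environment.

```shell
#!/bin/sh
# Sketch only: prints the commands for replacing the failed volume and
# recreating the backup pool. All arguments below are placeholders.

# usage: rebuild_backup_cmds old_volume size_gb avail_zone new_volume instance device disk
rebuild_backup_cmds() {
    # The old volume must be detached before it can be deleted
    echo "ec2-detach-volume $1"
    echo "ec2-delete-volume $1"
    # Create the replacement volume; its ID is reported by this command
    echo "ec2-create-volume -s $2 -z $3"
    echo "ec2-attach-volume $4 -i $5 -d $6"
    # Create the new backup pool on the disk backed by the new volume
    echo "zpool create zones-backup $7"
}

rebuild_backup_cmds OLD_VOLUME_ID SIZE_GB AVAILABILITY_ZONE NEW_VOLUME_ID INSTANCE_ID DEVICE DISK
```

Once the new pool exists, a full zsnap-backup run for each zone repopulates it.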

About

Sean ODell
