Thursday Feb 20, 2014

Pre-work for upcoming Solaris 11 Hands on Workshops

Over the next few weeks, I will be hosting several Solaris 11 hands-on workshops. Some of these will be public events at an Oracle office, while others will be private sessions for a specific customer. If you are planning on attending any of these sessions, there are a few items of pre-work that will help us get the workshop started on time.

Enable VT-x

If you will be using VirtualBox to run your workshop lab guest, the hardware virtualization feature for your CPU must be enabled. For AMD systems, AMD-V is on by default and there may not be a setting to turn it off. For Intel systems, this is controlled by a BIOS setting, and almost always defaults to disabled. The BIOS setting varies from vendor to vendor, but is generally found in the System or CPU settings. If you don't see it there, try looking in security. If you still can't find it, search for your brand of laptop and "enable vt" using your favorite search engine.

On newer Intel systems, you may be given choices for CPU virtualization (VT-x) and data/IO (VT-d). You only need to enable VT-x. Some laptops will require a complete power cycle after changing this setting, including removing the battery.
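
If your laptop happens to be running Linux, one quick sanity check is to look for the virtualization flags in /proc/cpuinfo. This is only a rough sketch - the flag tells you that the CPU supports the feature, not that the BIOS setting is enabled - but if nothing comes back at all, no BIOS setting will help.
$ egrep -o 'vmx|svm' /proc/cpuinfo | sort -u
If it prints vmx (Intel) or svm (AMD), the CPU itself is capable and the rest is just finding the right BIOS menu.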

If you have a company laptop that does not allow you to change the BIOS settings, you might ask your employer if they can provide you one for the day that is not locked down.

Note: Enabling hardware virtualization is a requirement to complete the workshop.

Download and Install VirtualBox

Since this will be primarily a hands on lab, you are encouraged to bring a laptop. The labs will all be run in a Solaris guest machine, so your laptop will also need a virtualization application, such as VMware or VirtualBox. We recommend VirtualBox and will be supplying the lab materials as a VirtualBox guest appliance. You can download VirtualBox for free at VirtualBox.org. Binaries are available for Windows, MacOS, Solaris and most Linux distributions.

After installing VirtualBox, you should also install the VirtualBox Extension Pack. It is not required for the lab, but should you continue to use the guest machine after the workshop, you may find some of its features very useful.
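
Once VirtualBox is installed, the lab appliance can be imported either from the GUI (File > Import Appliance) or from the command line. A minimal sketch, assuming the appliance file we hand out is named something like solaris11-lab.ova (the actual file name will be in the workshop instructions):
$ VBoxManage import solaris11-lab.ova
$ VBoxManage list vms
The Extension Pack installs the same way on every platform, using whichever version matches your VirtualBox release:
$ VBoxManage extpack install Oracle_VM_VirtualBox_Extension_Pack-4.3.8.vbox-extpack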

Don't Forget Your Power Adapters

Since you will be running Solaris as a guest operating system, your host's power management features might not be very effective and you may find yourself with a drained battery before the morning is over. Please remember to bring your power adapter and cables. An external mouse, while not required, is generally a welcome device as you cut and paste text between windows.

That should be about it. Please leave a comment if you have any questions. I am looking forward to seeing you at one of these, or a future Solaris event.


Friday Jan 13, 2012

Live Upgrade, /var/tmp and the Ever Growing Boot Environments

Even if you are a veteran Live Upgrade user, you might be caught by surprise when your new ZFS root pool starts filling up, and you have no idea where the space is going. I tripped over this one while installing different versions of StarOffice and OpenOffice and forgot that they left a rather large parcel behind in /var/tmp. When recently helping a customer through some Live Upgrade issues, I noticed that they were downloading patch clusters into /var/tmp and then I remembered that I used to do that too.

And then stopped. This is why. What follows has been added to the list of Common Live Upgrade Problems, as Number 3.

Let's start with a clean installation of Solaris 10 10/09 (u8).

# df -k /
Filesystem                       kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10x_u8wos_08a      20514816 4277560 13089687    25%    /

So far, so good. Solaris is just a bit over 4GB. Another 3GB is used by the swap and dump devices. That should leave plenty of room for half a dozen or so patch cycles (assuming 1GB each) and an upgrade to the next release.

Now, let's put on the latest recommended patch cluster. Note that I am following the suggestions in my Live Upgrade Survival Guide, installing the prerequisite patches and the LU patch before actually installing the patch cluster.

# cd /var/tmp
# wget patchserver:/export/patches/10_x86_Recommended-2012-01-05.zip
# unzip -qq 10_x86_Recommended-2012-01-05.zip

# wget patchserver:/export/patches/121431-69.zip
# unzip 121431-69

# cd 10_x86_Recommended
# ./installcluster --apply-prereq --passcode (you can find this in README)

# patchadd -M /var/tmp 121431-69

# lucreate -n s10u8-2012-01-05
# ./installcluster -d -B s10u8-2012-01-05 --passcode

# luactivate s10u8-2012-01-05
# init 0

After the new boot environment is activated, let's upgrade to the latest release of Solaris 10. In this case, it will be Solaris 10 8/11 (u10).

Yes, this does seem like an awful lot is happening in a short period of time. I'm trying to demonstrate a situation that really does happen when you forget something as simple as a patch cluster clogging up /var/tmp. Think of this as one of those time lapse video sequences you might see in a nature documentary.

# pkgrm SUNWluu SUNWlur SUNWlucfg
# pkgadd -d /cdrom/sol_10_811_x86  SUNWluu SUNWlur SUNWlucfg
# patchadd -M /var/tmp 121431-69

# lucreate -n s10u10-baseline
# echo "autoreg=disable" > /var/tmp/no-autoreg
# luupgrade -u -s /cdrom/sol_10_811_x86 -k /var/tmp/no-autoreg -n s10u10-baseline
# luactivate s10u10-baseline
# init 0
As before, everything went exactly as expected. Or so I thought, until I logged in the first time and checked the free space in the root pool.
# df -k /
Filesystem                       kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10u10-baseline     20514816 10795038 2432308    82%    /
Where did all of the space go ? Back-of-the-napkin calculations: 4.5GB (s10u8) + 4.5GB (s10u10) + 1GB (patch set) + 3GB (swap and dump) = 13GB. 20GB pool - 13GB used = 7GB free. But there's only 2.4GB free ?

This is about the time that I smack myself on the forehead and realize that I put the patch cluster in /var/tmp. Old habits die hard. This is not a problem, I can just delete it, right ?

Not so fast.

# du -sh /var/tmp
 5.4G   /var/tmp

# du -sh /var/tmp/10*
 3.8G   /var/tmp/10_x86_Recommended
 1.5G   /var/tmp/10_x86_Recommended-2012-01-05.zip

# rm -rf /var/tmp/10*

# du -sh /var/tmp
 3.4M   /var/tmp

Imagine the look on my face when I check the pool free space, expecting to see 7GB.
# df -k /
Filesystem                      kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10u10-baseline    20514816 5074262 2424603    68%    /

We are getting closer. At least my root filesystem size is reasonable (5GB vs 11GB). But the free space hasn't changed at all.

Once again, I smack myself on the forehead. The patch cluster is also in the other two boot environments. All I have to do is get rid of them too, and I'll get my free space back.

# lumount s10u8-2012-01-05 /mnt
# rm -rf /mnt/var/tmp/10_x86_Recommended*
# luumount s10u8-2012-01-05

# lumount s10x_u8wos_08a /mnt
# rm -rf /mnt/var/tmp/10_x86_Recommended*
# luumount s10x_u8wos_08a
Surely, that will get my free space reclaimed, right ?
# df -k /
Filesystem                    kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10u10-baseline  20514816 5074265 2429261    68%    /

This is when I smack myself on the forehead for the third time in one afternoon. Just getting rid of them in the boot environments is not sufficient. It would be if I were using UFS as a root filesystem, but lucreate will use the ZFS snapshot and cloning features when used on a ZFS root. So the patch cluster is in the snapshot, and the oldest one at that.
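
You don't have to take my word for it - ZFS will show you exactly where the space is being charged. A quick sketch using the dataset names from this example (snapshot names, and availability of the space accounting properties, will vary with your release):
# zfs list -t snapshot -r rpool/ROOT
# zfs list -o space -r rpool/ROOT
The USEDSNAP column in the second command is space that deleting files from the live boot environments will never give back.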

Let's try this all over again, but this time I will put the patches somewhere else that is not part of a boot environment. If you are thinking of using root's home directory, think again - it is part of the boot environment. If you are running out of ideas, let me suggest that /export/patches might be a good place to put them.
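
If you want to make that a habit, it only takes a moment to give the patches a dataset of their own, safely outside of any boot environment. A minimal sketch, assuming a root pool named rpool that already has an rpool/export dataset:
# zfs create rpool/export/patches
# zfs list rpool/export/patches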

Doing the exercise again, with the patches in /export/patches, I get similar results (to be expected), but with one significant difference. This time the patches are in a shared ZFS dataset (/export) and can be deleted.

# lustatus
Boot Environment           Is       Active Active    Can    Copy      
Name                       Complete Now    On Reboot Delete Status    
-------------------------- -------- ------ --------- ------ ----------
s10x_u8wos_08a             yes      no     no        yes    -         
s10u8-2012-01-05           yes      no     no        yes    -         
s10u10-baseline            yes      yes    yes       no     -         

# df -k /
Filesystem                      kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10u10-baseline    20514816 5184578 2445140    68%    /


# df -k /export
Filesystem                      kbytes    used   avail capacity  Mounted on
rpool/export                  20514816 5606384 2445142    70%    /export

This time, when I delete them, the disk space will be reclaimed.
# rm -rf /export/patches/10_x86_Recommended*

# df -k /
Filesystem                      kbytes    used   avail capacity  Mounted on
rpool/ROOT/s10u10-baseline    20514816 5184578 8048050    40%    /

Now, that's more like it. With this free space, I can continue to patch and maintain my system as I had originally planned - estimating a few hundred MB to 1.5GB per patch set.

The moral to the story is that even if you follow all of the best practices and recommendations, you can still be tripped up by old habits when you don't consider their consequences. And when you do, don't feel bad. Many best practices come from exercises just like this one.


Monday Nov 23, 2009

Great Lakes OpenSolaris Users Group - Nov 2009

I would like to thank Chip Bennett and all of the fine folks from Laurus Technologies for hosting the November meeting of the Great Lakes OpenSolaris Users Group (GLUG), especially on such short notice. It was a pleasure coming back, and I enjoyed meeting up with some old friends and making some new ones.

We had a rather nice discussion around recent enhancements to ZFS. As promised, I have posted my slides for your review. Please let me know if you have any trouble downloading them or if you find any confusing or erroneous bits.

I appreciate all of the folks that turned out as well as those that connected to the webcast. I hope to see all of you again at a future meeting.

Sunday Nov 22, 2009

Taking ZFS deduplication for a test drive

Now that I have a working OpenSolaris build 128 system, I just had to take ZFS deduplication for a spin, to see if it was worth all of the hype.

Here is my test case: I have 2 directories of photos, totaling about 90MB each. And here's the trick - they are almost complete duplicates of each other. I downloaded all of the photos from the same camera on 2 different days. How many of you do that ? Yeah, me too.

Let's see what ZFS can figure out about all of this. If it is super smart we should end up with a total of 90MB of used space. That's what I'm hoping for.

The first step is to create the pool and turn on deduplication from the beginning.
# zpool create -f scooby -O dedup=on c2t2d0s2
This will use sha256 for determining if 2 blocks are the same. Since sha256 has such a low collision probability (something like 1x10^-77), we will not turn on automatic verification. If we were using an algorithm like fletcher4, which has a higher collision rate, we should also perform a complete block compare before allowing the block removal (dedup=fletcher4,verify).
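
If you do want the belt-and-suspenders approach, verification can be requested when the pool is created, just like above. A quick sketch reusing the same device - dedup=verify is shorthand for sha256 plus a byte-for-byte compare, and, on builds that allow it, dedup=fletcher4,verify is the weaker-checksum combination mentioned above:
# zpool create -f scooby -O dedup=verify c2t2d0s2
# zpool create -f scooby -O dedup=fletcher4,verify c2t2d0s2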

Now copy the first 180MB (remember, this is 2 sets of 90MB which are nearly identical sets of photos).
# zfs create scooby/doo
# cp -r /pix/Alaska* /scooby/doo
And the second set.
# zfs create scooby/snack
# cp -r /pix/Alaska* /scooby/snack
And finally the third set.
# zfs create scooby/dooby
# cp -r /pix/Alaska* /scooby/dooby
Let's make sure there are in fact three copies of the photos.
# df -k | grep scooby
scooby               74230572      25 73706399     1%    /scooby
scooby/doo           74230572  174626 73706399     1%    /scooby/doo
scooby/snack         74230572  174626 73706399     1%    /scooby/snack
scooby/dooby         74230572  174625 73706399     1%    /scooby/dooby


OK, so far so good. But I can't quite tell if the deduplication is actually doing anything. With all that free space, it's sort of hard to see. Let's look at the pool properties.
# zpool get all scooby
NAME    PROPERTY       VALUE       SOURCE
scooby  size           71.5G       -
scooby  capacity       0%          -
scooby  altroot        -           default
scooby  health         ONLINE      -
scooby  guid           5341682982744598523  default
scooby  version        22          default
scooby  bootfs         -           default
scooby  delegation     on          default
scooby  autoreplace    off         default
scooby  cachefile      -           default
scooby  failmode       wait        default
scooby  listsnapshots  off         default
scooby  autoexpand     off         default
scooby  dedupratio     5.98x       -
scooby  free           71.4G       -
scooby  allocated      86.8M       -
Now this is telling us something.

First notice the allocated space. Just shy of 90MB. But there's 522MB of data (174MB x 3). But only 87MB used out of the pool. That's a good start.

Now take a look at the dedupratio. Almost 6. And that's exactly what we would expect, if ZFS is as good as we are led to believe. 3 sets of 2 duplicate directories is 6 total copies of the same set of photos. And ZFS caught every one of them.
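
One small detail worth calling out: dedup is a dataset property, so the -O dedup=on at pool creation simply set it on the top-level dataset and let scooby/doo, scooby/snack and scooby/dooby inherit it. You can confirm that with:
# zfs get -r dedup scooby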

So if you want to do this yourself, point your OpenSolaris package manager at the dev repository and wait for build 128 packages to show up. If you need instructions on using the OpenSolaris dev repository, point the browser of your choice at http://pkg.opensolaris.org/dev/en/index.shtml. And if you can't wait for the packages to show up, you can always .


Tuesday Mar 24, 2009

Nice OpenSolaris 2008.11 training materials from CZOSUG event

I don't normally just post about something that someone else did. That is what RSS aggregators and search engines are for. Occasionally something comes across an email discussion list that you just have to pass along. This is one of those times.

Roman Strobl, Martin Man and Lubos Kocman (leaders of the Czech OpenSolaris User Group) put together a very nice OpenSolaris training day and have posted all of the materials from the event. This is an excellent OpenSolaris overview and tutorial - nicely paced and a good amount of content.

Thanks to Roman, Lubos and Martin for making this available.

Tuesday Mar 17, 2009

Time-slider saves the day (or at least a lot of frustration)

As I was tidying up my Live Upgrade boot environments yesterday, I did something that I thought was terribly clever but had some pretty wicked side effects. While linking up all of my application configuration directories (firefox, mozilla, thunderbird, [g]xine, staroffice), I got blindsided by the GNOME messaging client, pidgin - or more specifically, by one of the migration assistants from GAIM to pidgin.

As a quick background, Solaris, Solaris Express Community Edition (SXCE), and OpenSolaris all have different versions of the GNOME desktop. Since some of the configuration settings are incompatible across releases, the easy solution is to keep separate home directories for each version of GNOME you might use. Which is fine until you grow weary of setting your message filters for Thunderbird again, or forget which Firefox has that cached password for the local recreation center that you only use once a year. Pretty quickly you come up with the idea of a common directory for all shared configuration files (dot directories, collections of pictures, video, audio, presentations, scripts).

For one boot environment you do something like
$ mkdir /export/home/me
$ for dotdir in .thunderbird .purple .mozilla .firefox .gxine .xine .staroffice .wine .staroffice* .openoffice* .VirtualBox .evolution bin lib misc presentations
> do
> mv $dotdir /export/home/me
> ln -s /export/home/me/$dotdir   $dotdir
> done
And for the other GNOME home directories you do something like
$ for dotdir in .thunderbird .purple .mozilla .firefox .gxine .xine .staroffice .wine .staroffice* .openoffice* .VirtualBox .evolution bin lib misc presentations
> do
> mv $dotdir ${dotdir}.old
> ln -s /export/home/me/$dotdir   $dotdir
> done
And all is well. Until......

I booted into Solaris 10 and fired up pidgin, thinking I would get all of my accounts activated and the default chatrooms started. Instead I was met by a rather nasty note that I had incompatible GAIM entries and it would try to convert them for me. What it did was wipe out all of my pidgin settings. And sure enough, when I looked into the shared directory, .purple contained all new and quite empty configuration settings.

This is where I am hoping to get some sympathy, since we have all done things like this. But then I remembered I had started time-slider earlier in the day (from the OpenSolaris side of things).
$ time-slider-setup
And there were my .purple files from 15 minutes ago, right before the GAIM conversion tools made a mess of them.
$ cd /export/home/.zfs/snapshot
$ ls
zfs-auto-snap:daily-2009-03-16-22:47
zfs-auto-snap:daily-2009-03-17-00:00
zfs-auto-snap:frequent-2009-03-17-11:45
zfs-auto-snap:frequent-2009-03-17-12:00
zfs-auto-snap:frequent-2009-03-17-12:15
zfs-auto-snap:frequent-2009-03-17-12:30
zfs-auto-snap:hourly-2009-03-16-22:47
zfs-auto-snap:hourly-2009-03-16-23:00
zfs-auto-snap:hourly-2009-03-17-00:00
zfs-auto-snap:hourly-2009-03-17-01:00
zfs-auto-snap:hourly-2009-03-17-02:00
zfs-auto-snap:hourly-2009-03-17-03:00
zfs-auto-snap:hourly-2009-03-17-04:00
zfs-auto-snap:hourly-2009-03-17-05:00
zfs-auto-snap:hourly-2009-03-17-06:00
zfs-auto-snap:hourly-2009-03-17-07:00
zfs-auto-snap:hourly-2009-03-17-08:00
zfs-auto-snap:hourly-2009-03-17-09:00
zfs-auto-snap:hourly-2009-03-17-10:00
zfs-auto-snap:hourly-2009-03-17-11:00
zfs-auto-snap:hourly-2009-03-17-12:00
zfs-auto-snap:monthly-2009-03-16-11:38
zfs-auto-snap:weekly-2009-03-16-22:47

$ cd zfs-auto-snap:frequent-2009-03-17-12:15/me/.purple
$ rm -rf /export/home/me/.purple/*
$ cp -r * /export/home/me/.purple

(and this is really, really important)
$ mv $HOME/.gaim $HOME/.gaim-never-to-be-heard-from-again

Log out and back in to refresh the GNOME configuration settings, and everything is as it should be. OpenSolaris time-slider is just one more reason I'm glad it is my daily driver.
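
If you want to make sure time-slider is on the job before you actually need it, the snapshot schedules show up as ordinary SMF services. A quick check (service names as they appear on OpenSolaris 2008.11 and later):
$ svcs time-slider
$ svcs -a | grep auto-snapshot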


Monday Mar 02, 2009

Alaska and Oregon Solaris Boot Camps

A big thanks to all who attended the Solaris Boot Camps in Juneau, Fairbanks, Portland and Salem. I hope that you found the information useful. And thanks for all of the good questions and discussion.

Here are the materials that were used during the bootcamp.

Please send me email if you have any questions or want to follow up on any of the discussions.

Thanks again for your attendance and continued support for Solaris.


Monday Feb 18, 2008

ZFS and FMA - Two great tastes .....

Our good friend Isaac Rozenfeld talks about the Multiplicity of Solaris. When talking about Solaris I will use the phrase "The Vastness of Solaris". If you have attended a Solaris Boot Camp or Tech Day in the last few years you get an idea of what we are talking about - when we go on about Solaris hour after hour after hour.

But the key point in Isaac's multiplicity discussion is how the cornucopia of Solaris features work together to do some pretty spectacular (and competitively differentiating) things. In the past we've looked at combinations such as ZFS and Zones or Service Management, Role Based Access Control (RBAC) and Least Privilege. Based on a conversation last week in St. Louis, let's consider how ZFS and Solaris Fault Management (FMA) play together.

Preparation

Let's begin by creating some fake devices that we can play with. I don't have enough disks on this particular system, but I'm not going to let that slow me down. If you have sufficient real hot swappable disks, feel free to use them instead.
# mkfile 1g /dev/dsk/disk1
# mkfile 1g /dev/dsk/disk2
# mkfile 512m /dev/dsk/disk3
# mkfile 512m /dev/dsk/disk4
# mkfile 1g /dev/dsk/spare1

Now let's create a couple of zpools using the fake devices. pool1 will be a 1GB mirrored pool using disk1 and disk2. pool2 will be a 512MB mirrored pool using disk3 and disk4. Device spare1 will spare both pools in case of a problem - which we are about to inflict upon the pools.
# zpool create pool1 mirror disk1 disk2 spare spare1
# zpool create pool2 mirror disk3 disk4 spare spare1
# zpool status
  pool: pool1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk1   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors

  pool: pool2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool2       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk3   ONLINE       0     0     0
            disk4   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors

So far so good. If we were to run a scrub on either pool, it would complete almost immediately. Remember that unlike hardware RAID disk replacement, ZFS scrubbing and resilvering only touch blocks that contain actual data. Since there is no data in these pools (yet), there is little for the scrubbing process to do.
# zpool scrub pool1
# zpool scrub pool2
# zpool status
  pool: pool1
 state: ONLINE
 scrub: scrub completed with 0 errors on Mon Feb 18 09:24:16 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk1   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors

  pool: pool2
 state: ONLINE
 scrub: scrub completed with 0 errors on Mon Feb 18 09:24:17 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool2       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk3   ONLINE       0     0     0
            disk4   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors

Let's populate both pools with some data. I happen to have a directory of scenic images that I use as screen backgrounds - that will work nicely.

# cd /export/pub/pix
# find scenic -print | cpio -pdum /pool1
# find scenic -print | cpio -pdum /pool2

# df -k | grep pool
pool1                1007616  248925  758539    25%    /pool1
pool2                 483328  248921  234204    52%    /pool2

And yes, cp -r would have been just as good.

Problem 1: Simple data corruption

Time to inflict some harm upon the pool. First, some simple corruption. Writing some zeros over half of the mirror should do quite nicely.
# dd if=/dev/zero of=/dev/dsk/disk1 bs=8192 count=10000 conv=notrunc
10000+0 records in
10000+0 records out 

At this point we are unaware that anything has happened to our data. So let's try accessing some of the data to see if we can observe ZFS self healing in action. If your system has plenty of memory and is relatively idle, accessing the data may not be sufficient. If you still end up with no errors after the cpio, try a zpool scrub - that will catch all errors in the data.
# cd /pool1
# find . -print | cpio -ov > /dev/null
416027 blocks

Let's ask our friend fmstat(1m) if anything is wrong ?
# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.1   0   0     0     0      0      0
disk-transport           0       0  0.0  366.5   0   0     0     0    32b      0
eft                      0       0  0.0    2.6   0   0     0     0   1.4M      0
fmd-self-diagnosis       1       0  0.0    0.2   0   0     0     0      0      0
io-retire                0       0  0.0    1.1   0   0     0     0      0      0
snmp-trapgen             1       0  0.0   16.0   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  620.3   0   0     0     0      0      0
syslog-msgs              1       0  0.0    9.7   0   0     0     0      0      0
zfs-diagnosis          162     162  0.0    1.5   0   0     1     0   168b   140b
zfs-retire               1       1  0.0  112.3   0   0     0     0      0      0

As the guys in the Guinness commercial say, "Brilliant!" The important thing to note here is that the zfs-diagnosis engine has run several times, indicating that there is a problem somewhere in one of my pools. I'm also running this on Nevada, so the zfs-retire engine has also run, kicking in a hot spare due to excessive errors.
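
If you are curious about the raw telemetry that fed those engines, the individual error events are kept in their own log and can be examined with fmdump. A quick sketch (the ereport classes and timestamps will obviously differ on your system):
# fmdump -e
# fmdump -eV | more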

So which pool is having the problems ? We continue our FMA investigation to find out.
# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 18 09:56:24 d82d1716-c920-6243-e899-b7ddd386902e  ZFS-8000-GH    Major    

Fault class : fault.fs.zfs.vdev.checksum

Description : The number of checksum errors associated with a ZFS device
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/ZFS-8000-GH for more information.

Response    : The device has been marked as degraded.  An attempt
              will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.


# zpool status -x
  pool: pool1
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver in progress, 44.83% done, 0h0m to go
config:

        NAME          STATE     READ WRITE CKSUM
        pool1         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            spare     DEGRADED     0     0     0
              disk1   DEGRADED     0     0   162  too many errors
              spare1  ONLINE       0     0     0
            disk2     ONLINE       0     0     0
        spares
          spare1      INUSE     currently in use

errors: No known data errors

This tells us all that we need to know. The device disk1 was found to have quite a few checksum errors - so many in fact that it was replaced automatically by a hot spare. The spare was resilvering and a full complement of data replicas would be available soon. The entire process was automatic and completely observable.

Since we inflicted harm upon the (fake) disk device ourselves, we know that it is in fact quite healthy. So we can restore our pool to its original configuration rather simply - by detaching the spare and clearing the error. We should also clear the FMA counters and repair the ZFS vdev so that we can tell if anything else is misbehaving in either this or another pool.
# zpool detach pool1 spare1
# zpool clear pool1
# zpool status pool1
  pool: pool1
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Feb 18 10:25:26 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk1   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors


# fmadm reset zfs-diagnosis
# fmadm reset zfs-retire
# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.5   0   0     0     0      0      0
disk-transport           0       0  0.0  223.5   0   0     0     0    32b      0
eft                      1       0  0.0    4.6   0   0     0     0   1.4M      0
fmd-self-diagnosis       4       0  0.0    0.6   0   0     0     0      0      0
io-retire                1       0  0.0    1.1   0   0     0     0      0      0
snmp-trapgen             4       0  0.0    8.8   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  372.7   0   0     0     0      0      0
syslog-msgs              4       0  0.0    5.4   0   0     0     0      0      0
zfs-diagnosis            0       0  0.0    1.4   0   0     0     0      0      0
zfs-retire               0       0  0.0    0.0   0   0     0     0      0      0


# fmdump -v -u d82d1716-c920-6243-e899-b7ddd386902e
TIME                 UUID                                 SUNW-MSG-ID
Feb 18 09:51:49.3025 d82d1716-c920-6243-e899-b7ddd386902e ZFS-8000-GH
  100%  fault.fs.zfs.vdev.checksum

        Problem in: 
           Affects: zfs://pool=pool1/vdev=449a3328bc444732
               FRU: -
          Location: -

# fmadm repair zfs://pool=pool1/vdev=449a3328bc444732
fmadm: recorded repair to zfs://pool=pool1/vdev=449a3328bc444732

# fmadm faulty

Problem 2: Device failure

Time to do a little more harm. In this case I will simulate the failure of a device by removing the fake device. Again we will access the pool and then consult fmstat to see what is happening (are you noticing a pattern here????).
# rm -f /dev/dsk/disk2
# cd /pool1
# find . -print | cpio -oc > /dev/null
416027 blocks

# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.5   0   0     0     0      0      0
disk-transport           0       0  0.0  214.2   0   0     0     0    32b      0
eft                      1       0  0.0    4.6   0   0     0     0   1.4M      0
fmd-self-diagnosis       4       0  0.0    0.6   0   0     0     0      0      0
io-retire                1       0  0.0    1.1   0   0     0     0      0      0
snmp-trapgen             4       0  0.0    8.8   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  372.7   0   0     0     0      0      0
syslog-msgs              4       0  0.0    5.4   0   0     0     0      0      0
zfs-diagnosis            0       0  0.0    1.4   0   0     0     0      0      0
zfs-retire               0       0  0.0    0.0   0   0     0     0      0      0

Rats - the find was satisfied entirely from cache left over from the last example. As before, should this happen, proceed directly to zpool scrub.
# zpool scrub pool1
# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.5   0   0     0     0      0      0
disk-transport           0       0  0.0  190.5   0   0     0     0    32b      0
eft                      1       0  0.0    4.1   0   0     0     0   1.4M      0
fmd-self-diagnosis       5       0  0.0    0.5   0   0     0     0      0      0
io-retire                1       0  0.0    1.0   0   0     0     0      0      0
snmp-trapgen             6       0  0.0    7.4   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  329.0   0   0     0     0      0      0
syslog-msgs              6       0  0.0    4.6   0   0     0     0      0      0
zfs-diagnosis           16       1  0.0   70.3   0   0     1     1   168b   140b
zfs-retire               1       0  0.0  509.8   0   0     0     0      0      0

Again, hot sparing has kicked in automatically. The evidence of this is the zfs-retire engine running.
# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 18 11:07:29 50ea07a0-2cd9-6bfb-ff9e-e219740052d5  ZFS-8000-D3    Major    
Feb 18 11:16:43 06bfe323-2570-46e8-f1a2-e00d8970ed0d

Fault class : fault.fs.zfs.device

Description : A ZFS device failed.  Refer to http://sun.com/msg/ZFS-8000-D3 for
              more information.

Response    : No automated response will occur.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.

# zpool status -x
  pool: pool1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress, 4.94% done, 0h0m to go
config:

        NAME          STATE     READ WRITE CKSUM
        pool1         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            disk1     ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              disk2   UNAVAIL      0     0     0  cannot open
              spare1  ONLINE       0     0     0
        spares
          spare1      INUSE     currently in use

errors: No known data errors

As before, this tells us all that we need to know. A device (disk2) has failed and is no longer in operation. Sufficient spares existed and one was automatically attached to the damaged pool. Resilvering completed successfully and the data is once again fully mirrored.

But here's the magic. Let's repair the device - again simulated with our fake device.
# mkfile 1g /dev/dsk/disk2
# zpool replace pool1 disk2
# zpool status pool1 
  pool: pool1
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 4.86% done, 0h1m to go
config:

        NAME               STATE     READ WRITE CKSUM
        pool1              DEGRADED     0     0     0
          mirror           DEGRADED     0     0     0
            disk1          ONLINE       0     0     0
            spare          DEGRADED     0     0     0
              replacing    DEGRADED     0     0     0
                disk2/old  UNAVAIL      0     0     0  cannot open
                disk2      ONLINE       0     0     0
              spare1       ONLINE       0     0     0
        spares
          spare1           INUSE     currently in use

errors: No known data errors

Get a cup of coffee while the resilvering process runs.
# zpool status
  pool: pool1
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Feb 18 11:23:13 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk1   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
        spares
          spare1    AVAIL   


# fmadm faulty

Notice the nice integration with FMA. Not only was the new device resilvered, but the hot spare was detached and the FMA fault was cleared. The fmstat counters still show that there was a problem, and the fault report still exists in the fault log for later interrogation.
# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.5   0   0     0     0      0      0
disk-transport           0       0  0.0  171.5   0   0     0     0    32b      0
eft                      1       0  0.0    3.6   0   0     0     0   1.4M      0
fmd-self-diagnosis       6       0  0.0    0.6   0   0     0     0      0      0
io-retire                1       0  0.0    0.9   0   0     0     0      0      0
snmp-trapgen             6       0  0.0    6.8   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  294.3   0   0     0     0      0      0
syslog-msgs              6       0  0.0    4.2   0   0     0     0      0      0
zfs-diagnosis           36       1  0.0   51.6   0   0     0     1      0      0
zfs-retire               1       0  0.0  170.0   0   0     0     0      0      0

# fmdump
TIME                 UUID                                 SUNW-MSG-ID
Feb 16 11:38:16.0976 48935791-ff83-e622-fbe1-d54c20385afc ZFS-8000-GH
Feb 16 11:38:30.8519 9f7f288c-fea8-e5dd-bf23-c0c9c4e07233 ZFS-8000-GH
Feb 18 09:51:49.3025 2ac4568f-4040-cb5d-f3b8-ae3d69e7d713 ZFS-8000-GH
Feb 18 09:56:24.8029 d82d1716-c920-6243-e899-b7ddd386902e ZFS-8000-GH
Feb 18 10:23:07.2228 7c04a6f7-d22a-e467-c44d-80810f27b711 ZFS-8000-GH
Feb 18 10:25:14.6429 faca0639-b82b-c8e8-c8d4-fc085bc03caa ZFS-8000-GH
Feb 18 11:07:29.5195 50ea07a0-2cd9-6bfb-ff9e-e219740052d5 ZFS-8000-D3
Feb 18 11:16:44.2497 06bfe323-2570-46e8-f1a2-e00d8970ed0d ZFS-8000-D3


# fmdump -V -u 50ea07a0-2cd9-6bfb-ff9e-e219740052d5
TIME                 UUID                                 SUNW-MSG-ID
Feb 18 11:07:29.5195 50ea07a0-2cd9-6bfb-ff9e-e219740052d5 ZFS-8000-D3

  TIME                 CLASS                                 ENA
  Feb 18 11:07:27.8476 ereport.fs.zfs.vdev.open_failed       0xb22406c635500401

nvlist version: 0
        version = 0x0
        class = list.suspect
        uuid = 50ea07a0-2cd9-6bfb-ff9e-e219740052d5
        code = ZFS-8000-D3
        diag-time = 1203354449 236999
        de = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = fmd
                authority = (embedded nvlist)
                nvlist version: 0
                        version = 0x0
                        product-id = Dimension XPS                
                        chassis-id = 7XQPV21
                        server-id = arrakis
                (end authority)

                mod-name = zfs-diagnosis
                mod-version = 1.0
        (end de)

        fault-list-sz = 0x1
        fault-list = (array of embedded nvlists)
        (start fault-list[0])
        nvlist version: 0
                version = 0x0
                class = fault.fs.zfs.device
                certainty = 0x64
                asru = (embedded nvlist)
                nvlist version: 0
                        version = 0x0
                        scheme = zfs
                        pool = 0x3a2ca6bebd96cfe3
                        vdev = 0xedef914b5d9eae8d
                (end asru)

                resource = (embedded nvlist)
                nvlist version: 0
                        version = 0x0
                        scheme = zfs
                        pool = 0x3a2ca6bebd96cfe3
                        vdev = 0xedef914b5d9eae8d
                (end resource)

        (end fault-list[0])

        fault-status = 0x3
        __ttl = 0x1
        __tod = 0x47b9bb51 0x1ef7b430

# fmadm reset zfs-diagnosis
fmadm: zfs-diagnosis module has been reset

# fmadm reset zfs-retire
fmadm: zfs-retire module has been reset

Problem 3: Unrecoverable corruption

As those of you who have attended one of my Boot Camps or Solaris Best Practices training classes know, House is one of my favorite TV shows - the only one that I watch regularly. And this next example would make a perfect episode. Is it likely to happen ? No, but it is so cool when it does :-)

Remember our second pool, pool2. It has the same contents as pool1. Now, let's do the unthinkable - let's corrupt both halves of the mirror. Surely data loss will follow, but the fact that Solaris stays up and running and can report what happened is pretty spectacular. But it gets so much better than that.
# dd if=/dev/zero of=/dev/dsk/disk3 bs=8192 count=10000 conv=notrunc
# dd if=/dev/zero of=/dev/dsk/disk4 bs=8192 count=10000 conv=notrunc
# zpool scrub pool2

# fmstat
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0    0.5   0   0     0     0      0      0
disk-transport           0       0  0.0  166.0   0   0     0     0    32b      0
eft                      1       0  0.0    3.6   0   0     0     0   1.4M      0
fmd-self-diagnosis       6       0  0.0    0.6   0   0     0     0      0      0
io-retire                1       0  0.0    0.9   0   0     0     0      0      0
snmp-trapgen             8       0  0.0    6.3   0   0     0     0    32b      0
sysevent-transport       0       0  0.0  294.3   0   0     0     0      0      0
syslog-msgs              8       0  0.0    3.9   0   0     0     0      0      0
zfs-diagnosis         1032    1028  0.6   39.7   0   0    93     2    15K    13K
zfs-retire               2       0  0.0  158.5   0   0     0     0      0      0

As before, lots of zfs-diagnosis activity. And two hits to zfs-retire. But we only have one spare - this should be interesting. Let's see what is happening.
# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 18 09:56:24 d82d1716-c920-6243-e899-b7ddd386902e  ZFS-8000-GH    Major    
Feb 18 13:18:42 c3889bf1-8551-6956-acd4-914474093cd7

Fault class : fault.fs.zfs.vdev.checksum

Description : The number of checksum errors associated with a ZFS device
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/ZFS-8000-GH for more information.

Response    : The device has been marked as degraded.  An attempt
              will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 16 11:38:30 9f7f288c-fea8-e5dd-bf23-c0c9c4e07233  ZFS-8000-GH    Major    
Feb 18 09:51:49 2ac4568f-4040-cb5d-f3b8-ae3d69e7d713
Feb 18 10:23:07 7c04a6f7-d22a-e467-c44d-80810f27b711
Feb 18 13:18:42 0a1bf156-6968-4956-d015-cc121a866790

Fault class : fault.fs.zfs.vdev.checksum

Description : The number of checksum errors associated with a ZFS device
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/ZFS-8000-GH for more information.

Response    : The device has been marked as degraded.  An attempt
              will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.

# zpool status -x
  pool: pool2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed with 602 errors on Mon Feb 18 13:20:14 2008
config:

        NAME          STATE     READ WRITE CKSUM
        pool2         DEGRADED     0     0 2.60K
          mirror      DEGRADED     0     0 2.60K
            spare     DEGRADED     0     0 2.43K
              disk3   DEGRADED     0     0 5.19K  too many errors
              spare1  DEGRADED     0     0 2.43K  too many errors
            disk4     DEGRADED     0     0 5.19K  too many errors
        spares
          spare1      INUSE     currently in use

errors: 247 data errors, use '-v' for a list

So ZFS tried to bring in a hot spare, but there were insufficient replicas to be able to reconstruct all of the data. But here is where it gets interesting. Let's see what zpool status -v says about things.
# zpool status -v
  pool: pool1
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Feb 18 11:23:13 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk1   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
        spares
          spare1    INUSE     in use by pool 'pool2'

errors: No known data errors

  pool: pool2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed with 602 errors on Mon Feb 18 13:20:14 2008
config:

        NAME          STATE     READ WRITE CKSUM
        pool2         DEGRADED     0     0 2.60K
          mirror      DEGRADED     0     0 2.60K
            spare     DEGRADED     0     0 2.43K
              disk3   DEGRADED     0     0 5.19K  too many errors
              spare1  DEGRADED     0     0 2.43K  too many errors
            disk4     DEGRADED     0     0 5.19K  too many errors
        spares
          spare1      INUSE     currently in use

errors: Permanent errors have been detected in the following files:

        /pool2/scenic/cider mill crowds.jpg
        /pool2/scenic/Cleywindmill.jpg
        /pool2/scenic/csg_Landscapes001_GrandTetonNationalPark,Wyoming.jpg
        /pool2/scenic/csg_Landscapes002_ElowahFalls,Oregon.jpg
        /pool2/scenic/csg_Landscapes003_MonoLake,California.jpg
        /pool2/scenic/csg_Landscapes005_TurretArch,Utah.jpg
        /pool2/scenic/csg_Landscapes004_Wildflowers_MountRainer,Washington.jpg
        /pool2/scenic/csg_Landscapes!idx011.jpg
        /pool2/scenic/csg_Landscapes127_GreatSmokeyMountains-NorthCarolina.jpg
        /pool2/scenic/csg_Landscapes129_AcadiaNationalPark-Maine.jpg
        /pool2/scenic/csg_Landscapes130_GettysburgNationalPark-Pennsylvania.jpg
        /pool2/scenic/csg_Landscapes131_DeadHorseMill,CrystalRiver-Colorado.jpg
        /pool2/scenic/csg_Landscapes132_GladeCreekGristmill,BabcockStatePark-WestVirginia.jpg
        /pool2/scenic/csg_Landscapes133_BlackwaterFallsStatePark-WestVirginia.jpg
        /pool2/scenic/csg_Landscapes134_GrandCanyonNationalPark-Arizona.jpg
        /pool2/scenic/decisions decisions.jpg
        /pool2/scenic/csg_Landscapes135_BigSur-California.jpg
        /pool2/scenic/csg_Landscapes151_WataugaCounty-NorthCarolina.jpg
        /pool2/scenic/csg_Landscapes150_LakeInTheMedicineBowMountains-Wyoming.jpg
        /pool2/scenic/csg_Landscapes152_WinterPassage,PondMountain-Tennessee.jpg
        /pool2/scenic/csg_Landscapes154_StormAftermath,OconeeCounty-Georgia.jpg
        /pool2/scenic/Brig_Of_Dee.gif
        /pool2/scenic/pvnature14.gif
        /pool2/scenic/pvnature22.gif
        /pool2/scenic/pvnature7.gif
        /pool2/scenic/guadalupe.jpg
        /pool2/scenic/ernst-tinaja.jpg
        /pool2/scenic/pipes.gif
        /pool2/scenic/boat.jpg
        /pool2/scenic/pvhawaii.gif
        /pool2/scenic/cribgoch.jpg
        /pool2/scenic/sun1.gif
        /pool2/scenic/sun1.jpg
        /pool2/scenic/sun2.jpg
        /pool2/scenic/andes.jpg
        /pool2/scenic/treesky.gif
        /pool2/scenic/sailboatm.gif
        /pool2/scenic/Arizona1.jpg
        /pool2/scenic/Arizona2.jpg
        /pool2/scenic/Fence.jpg
        /pool2/scenic/Rockwood.jpg
        /pool2/scenic/sawtooth.jpg
        /pool2/scenic/pvaptr04.gif
        /pool2/scenic/pvaptr07.gif
        /pool2/scenic/pvaptr11.gif
        /pool2/scenic/pvntrr01.jpg
        /pool2/scenic/Millport.jpg
        /pool2/scenic/bryce2.jpg
        /pool2/scenic/bryce3.jpg
        /pool2/scenic/monument.jpg
        /pool2/scenic/rainier1.gif
        /pool2/scenic/arch.gif
        /pool2/scenic/pv-anzab.gif
        /pool2/scenic/pvnatr15.gif
        /pool2/scenic/pvocean3.gif
        /pool2/scenic/pvorngwv.gif
        /pool2/scenic/pvrmp001.gif
        /pool2/scenic/pvscen07.gif
        /pool2/scenic/pvsltd04.gif
        /pool2/scenic/banhall28600-04.JPG
        /pool2/scenic/pvwlnd01.gif
        /pool2/scenic/pvnature08.gif
        /pool2/scenic/pvnature13.gif
        /pool2/scenic/nokomis.jpg
        /pool2/scenic/lighthouse1.gif
        /pool2/scenic/lush.gif
        /pool2/scenic/oldmill.gif
        /pool2/scenic/gc1.jpg
        /pool2/scenic/gc2.jpg
        /pool2/scenic/canoe.gif
        /pool2/scenic/Donaldson-River.jpg
        /pool2/scenic/beach.gif
        /pool2/scenic/janloop.jpg
        /pool2/scenic/grobacro.jpg
        /pool2/scenic/fnlgld.jpg
        /pool2/scenic/bells.gif
        /pool2/scenic/Eilean_Donan.gif
        /pool2/scenic/Kilchurn_Castle.gif
        /pool2/scenic/Plockton.gif
        /pool2/scenic/Tantallon_Castle.gif
        /pool2/scenic/SouthStockholm.jpg
        /pool2/scenic/BlackRock_Cottage.jpg
        /pool2/scenic/seward.jpg
        /pool2/scenic/canadian_rockies_csg110_EmeraldBay.jpg
        /pool2/scenic/canadian_rockies_csg111_RedRockCanyon.jpg
        /pool2/scenic/canadian_rockies_csg112_WatertonNationalPark.jpg
        /pool2/scenic/canadian_rockies_csg113_WatertonLakes.jpg
        /pool2/scenic/canadian_rockies_csg114_PrinceOfWalesHotel.jpg
        /pool2/scenic/canadian_rockies_csg116_CameronLake.jpg
        /pool2/scenic/Castilla_Spain.jpg
        /pool2/scenic/Central-Park-Walk.jpg
        /pool2/scenic/CHANNEL.JPG



In my best Hugh Laurie voice, trying to sound very Northeastern American: that is so cool! But we're not even done yet. Let's take this list of files and restore them - in this case, from pool1. Operationally this would be from a backup tape or nearline backup cache, but for our purposes, the contents in pool1 will do nicely.

First, let's clear the zpool error counters and return the spare disk. We want to make sure that our restore works as desired. Oh, and clear the FMA stats while we're at it.
# zpool clear pool2
# zpool detach pool2 spare1

# fmadm reset zfs-diagnosis
fmadm: zfs-diagnosis module has been reset

# fmadm reset zfs-retire   
fmadm: zfs-retire module has been reset

Now individually restore the files that have errors in them and check again. You can even export and reimport the pool, and you will find a very nice, happy, and thoroughly error-free ZFS pool. The rather unpleasant gnashing of zpool status -v output with awk has been reduced to a rough sketch below, for sanity's sake.
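
The idea is simply to pull the damaged file names out of zpool status -v and copy known-good versions over the top of them. This sketch assumes, as in this exercise, that the pristine copies live under /pool1 at the same relative paths - a real restore would pull from backup media instead:
# zpool status -v pool2 | nawk '/\/pool2\//' | sed 's/^ *//' | while read file
> do
> src=`echo "$file" | sed 's|^/pool2|/pool1|'`
> cp "$src" "$file"
> done
With the files back in place, one more scrub should come back clean.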
# zpool scrub pool2
# zpool status pool2
  pool: pool2
 state: ONLINE
 scrub: scrub completed with 0 errors on Mon Feb 18 14:04:56 2008
config:

        NAME        STATE     READ WRITE CKSUM
        pool2       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            disk3   ONLINE       0     0     0
            disk4   ONLINE       0     0     0
        spares
          spare1    AVAIL   

errors: No known data errors

# zpool export pool2
# zpool import pool2
# dircmp -s /pool1 /pool2

Conclusions and Review

So what have we learned ? ZFS and FMA are two great tastes that taste great together. No, that's chocolate and peanut butter, but you get the idea. One more great example of Isaac's Multiplicity of Solaris.

That, and I have finally found a good lab exercise for the FMA training materials. Ever since Christine Tran put the FMA workshop together, we have been looking for some good FMA lab exercises. The materials reference a synthetic fault generator that is not available to the public (for obvious reasons). I haven't explored the FMA test harness enough to know if there is anything in there that would make a good lab. But this exercise that we have just explored seems to tie a number of key pieces together.

And of course, one more reason why Roxy says, "You should run Solaris."


Thursday Jun 21, 2007

Updated Solaris Bootcamp Presentations

I've had a great time traveling around the country talking about Solaris. It's not exactly a difficult thing - there's plenty to talk about. Many of you have asked for copies of the latest Solaris update, virtualization overview and ZFS deep dive. Rather than have you dig through a bunch of old blog entries about bootcamps from 2005, here they are for your convenience.



I hope this will save you some digging through http://mediacast.sun.com and tons of old blogs.

In a few weeks I'll post a new "What's New in Solaris" which will have some really cool things. But we'll save that for later.

About

Bob Netherton is a Principal Sales Consultant for the North American Commercial Hardware group, specializing in Solaris, Virtualization and Engineered Systems. Bob is also a contributing author of Solaris 10 Virtualization Essentials.

This blog will contain information about all three, but is primarily focused on topics for Solaris system administrators.

Please follow me on Twitter or Facebook, or send me email.
