Monday Nov 17, 2008

Veritas Storage Foundation for Windows on X4540

[This article was originally posted on Nov 17th and updated on Nov 21st.]

I have done a lot of work with Solaris and ZFS on the Sun Fire X45x0 server, but I am now working on something a little different: a project where I am running Microsoft Windows Server 2003 on a Sun Fire X4540.

For the volume management I am running Veritas Storage Foundation for Windows and I wanted to share my experiences.

Before you Install Storage Foundation

ONLY IF this is a new system and/or if this is the first time you have installed Windows on this system do the following after you have installed Windows and BEFORE you install Storage Foundation:

1. Bring up the Windows Storage Manager and make sure it finds all 48 disks.

2. Delete any partitions on disks other than the boot disk. THIS DESTROYS ANY DATA on the disks. You don't have to do anything to disks with no partitions on them; just the act of the Windows LVM finding them makes them available to Windows.

Why do this? The X4540 ships with Solaris pre-installed and a pre-configured ZFS storage pool. Veritas Storage Foundation for Windows is essentially Veritas Volume Manager. We install Windows over the top of Solaris, but the rest of the disks are untouched. Under Windows, when Veritas Volume Manager scans the disks it sees that something is on them, plays safe and marks them as unusable. I found no way to override this behavior. The standard Windows LVM does not have this issue and allows you to tidy the disks up. You need to follow this procedure before you install Storage Foundation, as it replaces the Windows LVM.

Using Storage Foundation

I really liked using this software. It was easy to install (though the install takes a long time) and easy to configure and use. The X4540 has 48 disks, which can be challenging to manage, but the Veritas software makes this relatively easy. The wizard for creating volumes is great: it allows you to manually select disks so that you are striping or mirroring across controllers, for example. You can also let the software decide, but I prefer to maintain manual control of how my volumes are laid out.

This is the Disk View:

Disk View

This is the Volume View:

Volume View

Here are a couple of shots of the Volume Wizard. The first one shows the disk selection page, where there is the option to allow the software to Autoselect...though I went for Manual:

Volume Wizard

This shows the page where you choose the type of volume to create:

Volume Wizard

One interesting thing I learnt from this UI is how Windows maps the disks in the X4540. Windows presents the disks as Disk0->Disk47 which is not very informative if you wish to build volumes across controllers. Via the Veritas GUI I was able to see that the six SAS controllers in the X4540 are mapped as P0->P5 and then we have eight disks on each controller T0->T7. C and L are always 0. You can see this in the first screenshot of the Volume Wizard.

I built a RAID-10 Volume and a RAID-5 Volume. To help me plan, I printed out a copy of my Sun Fire X4540 Disk Planner (which was designed for Solaris) and changed the column labels to P0->P5 and the row labels to T0->T7. I labeled the boxes Disk0->Disk47: starting at the top left, I worked down the first column, then returned to the top of the next column and worked down that, and so on.
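The numbering I used on the planner can be sketched in a few lines of shell. This is only an illustration: disk_to_pt is a made-up helper name, and it assumes that Windows numbers the disks down each controller column in order, eight targets per port, which is how I labeled the planner.

```shell
#!/bin/sh
# Map a Windows disk number (0-47) to the port/target pair shown in the
# Veritas GUI. ASSUMPTION: disks are numbered down each controller
# column, eight targets per port, as on the relabeled planner.
disk_to_pt() {
    n=$1
    if [ "$n" -lt 0 ] || [ "$n" -gt 47 ]; then
        echo "disk number must be 0-47" >&2
        return 1
    fi
    echo "P$((n / 8)) T$((n % 8))"
}

disk_to_pt 0     # first disk on the first controller
disk_to_pt 13    # a disk on the second controller
disk_to_pt 47    # last disk on the last controller
```

Check a few disks against the Veritas GUI before relying on this when you lay out volumes across controllers.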

A Final Note: Disk Write Caches

The X4540 ships with the disk write caches on. The Volume Manager will warn you about this. Disk write caches are volatile and you could lose data in the event of a power outage. If you don't have UPS protection then you can switch the disk write caches off using the Windows Device Manager. Note that Solaris ZFS is cool with disk write caches on, as it flushes them out when it periodically syncs the file system.

Thursday Sep 25, 2008

Recipe for a ZFS RAID-Z Storage Pool on Sun Fire X4540

[Update Sept 26th: I have revised this from the initial posting on Sept 25th. The hot spares have been laid out in a tidier way and I have included an improved script which is a little more generalized.]

Almost a year ago I posted a Recipe for Sun Fire X4500 RAID-Z Config with Hot Spares. Now we have the new Sun Fire X4540, which has a different disk controller numbering and more bootable disk slots, so I have revisited this.

Using my Sun Fire X4540 Disk Planner, I first worked out how I wanted it to look....


The server has six controllers, each with 8 disks. In the planner, the first controller is c0, but the controller numbering will not start at c0 in all cases: if you installed Solaris off an ISO image they will run from c1->c6; if Solaris is installed with Jumpstart then they will run c0->c5; and in one case I have seen the first controller as c4. Whatever the first controller is, the others will follow in sequence.
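As a small illustration of "the others will follow in sequence", here is a sketch that prints the six controller names for any first controller. controller_names is a hypothetical helper, and it needs a POSIX shell such as ksh or bash, because the legacy Solaris /bin/sh lacks $(( )) and ${var#pattern}.

```shell
#!/bin/sh
# Print the six controller names given the first one, ASSUMING the other
# five follow in sequence. Needs a POSIX shell (ksh, bash).
controller_names() {
    case $1 in
        c[0-9]|c[1-9][0-9]) ;;
        *) echo "expected a controller name such as c0" >&2; return 1 ;;
    esac
    first=${1#c}          # strip the leading "c" to get the number
    names=""
    i=0
    while [ "$i" -lt 6 ]; do
        names="$names c$((first + i))"
        i=$((i + 1))
    done
    echo "${names# }"     # drop the leading space
}

controller_names c1   # prints: c1 c2 c3 c4 c5 c6
```

Compare the result with what the format command reports on your system before trusting it.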

I assumed that mirrored boot disks are desirable, so I allocated two disks for the OS.

ZFS is happy with stripes of dissimilar lengths in a pool, but I like all the stripes in a pool to be the same length, so I allocated hot spares across the controllers to enable me to build eight 5-disk RAID-Z stripes. There is one hot spare per controller.

This script creates the pool as described above. The required arguments are the desired name of the pool and the name of the first controller. It does a basic check to see that you are on a Sun Fire X4540.

#! /bin/sh
#set -x
#Make ZFS storage pools on a Sun Fire X4540 (Thor).
#This WILL NOT WORK on Sun Fire X4500 (Thumper) as
#the boot disk locations and controller numbering
#are different.
#Need two arguments:
# 1. name of pool
# 2. name of first controller e.g c0

prtdiag -v | grep -w X4540 > /dev/null 2>&1
if [ $? -ne 0 ] ; then
        echo "This script can only be run on a Sun Fire X4540."
        exit 1
fi

case $# in
        2)#This is a valid argument count
        ZPOOLNAME=$1
        CFIRST=$2
        ;;
        *) #An invalid argument count
        echo "Usage: `basename ${0}` zfspoolname first_controller_number"
        echo "Example: `basename ${0}` tank c0"
        exit 1;;
esac

#The numbering of the disk controllers will vary,
#but will most likely start at c0 or c1.
#The settings below assume that the other five
#controllers follow the first one in sequence.

case $CFIRST in
        c0)
        Cntrl0=c0; Cntrl1=c1; Cntrl2=c2; Cntrl3=c3; Cntrl4=c4; Cntrl5=c5
        ;;
        c1)
        Cntrl0=c1; Cntrl1=c2; Cntrl2=c3; Cntrl3=c4; Cntrl4=c5; Cntrl5=c6
        ;;
        *)
        echo "This script cannot work if the first controller is ${CFIRST}."
        echo "If this is the correct controller then edit the script to add"
        echo "settings for first controller = ${CFIRST}."
        exit 1
        ;;
esac

# Create pool with 8 x RAIDZ.4+1 stripes
# 6 Hot spares are staggered across controllers
# We skip ${Cntrl0}t0d0 and ${Cntrl1}t1d0 as they are assumed to be boot disks
zpool create -f ${ZPOOLNAME} \
raidz ${Cntrl1}t0d0 ${Cntrl2}t0d0 ${Cntrl3}t0d0 ${Cntrl4}t0d0 ${Cntrl5}t0d0 \
raidz ${Cntrl0}t1d0 ${Cntrl2}t1d0 ${Cntrl3}t1d0 ${Cntrl4}t1d0 ${Cntrl5}t1d0 \
raidz ${Cntrl0}t2d0 ${Cntrl1}t2d0 ${Cntrl3}t2d0 ${Cntrl4}t2d0 ${Cntrl5}t2d0 \
raidz ${Cntrl0}t3d0 ${Cntrl1}t3d0 ${Cntrl2}t3d0 ${Cntrl4}t3d0 ${Cntrl5}t3d0 \
raidz ${Cntrl0}t4d0 ${Cntrl1}t4d0 ${Cntrl2}t4d0 ${Cntrl3}t4d0 ${Cntrl5}t4d0 \
raidz ${Cntrl0}t5d0 ${Cntrl1}t5d0 ${Cntrl2}t5d0 ${Cntrl3}t5d0 ${Cntrl4}t5d0 \
raidz ${Cntrl1}t6d0 ${Cntrl2}t6d0 ${Cntrl3}t6d0 ${Cntrl4}t6d0 ${Cntrl5}t6d0 \
raidz ${Cntrl0}t7d0 ${Cntrl2}t7d0 ${Cntrl3}t7d0 ${Cntrl4}t7d0 ${Cntrl5}t7d0 \
spare ${Cntrl2}t2d0 ${Cntrl3}t3d0 ${Cntrl4}t4d0 ${Cntrl5}t5d0 ${Cntrl0}t6d0 ${Cntrl1}t7d0

#End of script
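If you want to see the layout before touching any disks, the sketch below prints the stripe and spare lists without running zpool. It is not part of the script above. print_layout is a hypothetical helper built on one observation about that layout: row tN leaves out the controller at position N mod 6, and the disk left out is a boot disk on rows t0 and t1 and a hot spare on the other rows. It also assumes the controllers follow the first one in sequence, and it needs a POSIX shell (ksh or bash).

```shell
#!/bin/sh
# Print (do not create) the 8 x 4+1 RAID-Z layout with six hot spares.
# ASSUMPTIONS: controllers are numbered consecutively from the first one,
# and row tN leaves out the controller at position N mod 6.
print_layout() {
    first=${1#c}
    spares=""
    row=0
    while [ "$row" -le 7 ]; do
        skip=$((row % 6))
        line="raidz"
        idx=0
        while [ "$idx" -le 5 ]; do
            disk="c$((first + idx))t${row}d0"
            if [ "$idx" -eq "$skip" ]; then
                # rows t0 and t1 leave out a boot disk, not a spare
                if [ "$row" -ge 2 ]; then
                    spares="$spares $disk"
                fi
            else
                line="$line $disk"
            fi
            idx=$((idx + 1))
        done
        echo "$line"
        row=$((row + 1))
    done
    echo "spare$spares"
}

print_layout c1
```

With c1 as the first controller, the output can be compared line by line against the zpool status listing for the pool.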

In the example below, I run the script to create a storage pool called tank, and my first controller is c1.

root@isv-x4500a # tank c1

This is how it looks...

root@isv-x4540a # zpool status

root@isv-x4500a # zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c2t5d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
            c5t5d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t6d0  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
            c5t6d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t7d0  ONLINE       0     0     0
            c3t7d0  ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
            c5t7d0  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
          c3t2d0    AVAIL   
          c4t3d0    AVAIL   
          c5t4d0    AVAIL   
          c6t5d0    AVAIL   
          c1t6d0    AVAIL   
          c2t7d0    AVAIL   

errors: No known data errors

I have used this layout on my systems for over a year now in the labs, pounding the heck out of it. The first two controllers are marginally less busy as they each support both a boot disk and a hot spare, but I have seen very even performance across all the data disks.

So far, I have not lost a disk, so I am probably way over-cautious with my hot spares...famous last words :-). If you want to reduce the number of hot spares to four, it is easy to modify the script by taking spares and adding them to the stripes. Since the first two controllers are marginally less loaded than the other controllers, I recommend you modify the script to extend the stripes on rows t6 and t7, as below. You need to make this decision up front before building the pool, as you cannot change the length of a RAID-Z stripe once the pool is built.

The zpool create command in the script would now look like this. The modified lines are the t6 and t7 stripes and the spare list.

# Create pool with 6 x RAIDZ.4+1 stripes & 2 x RAIDZ.5+1 stripes
# 4 Hot spares are staggered across controllers
# We skip ${Cntrl0}t0d0 and ${Cntrl1}t1d0 as they are assumed to be boot disks
zpool create -f ${ZPOOLNAME} \
raidz ${Cntrl1}t0d0 ${Cntrl2}t0d0 ${Cntrl3}t0d0 ${Cntrl4}t0d0 ${Cntrl5}t0d0 \
raidz ${Cntrl0}t1d0 ${Cntrl2}t1d0 ${Cntrl3}t1d0 ${Cntrl4}t1d0 ${Cntrl5}t1d0 \
raidz ${Cntrl0}t2d0 ${Cntrl1}t2d0 ${Cntrl3}t2d0 ${Cntrl4}t2d0 ${Cntrl5}t2d0 \
raidz ${Cntrl0}t3d0 ${Cntrl1}t3d0 ${Cntrl2}t3d0 ${Cntrl4}t3d0 ${Cntrl5}t3d0 \
raidz ${Cntrl0}t4d0 ${Cntrl1}t4d0 ${Cntrl2}t4d0 ${Cntrl3}t4d0 ${Cntrl5}t4d0 \
raidz ${Cntrl0}t5d0 ${Cntrl1}t5d0 ${Cntrl2}t5d0 ${Cntrl3}t5d0 ${Cntrl4}t5d0 \
raidz ${Cntrl0}t6d0 ${Cntrl1}t6d0 ${Cntrl2}t6d0 ${Cntrl3}t6d0 ${Cntrl4}t6d0 ${Cntrl5}t6d0 \
raidz ${Cntrl0}t7d0 ${Cntrl1}t7d0 ${Cntrl2}t7d0 ${Cntrl3}t7d0 ${Cntrl4}t7d0 ${Cntrl5}t7d0 \
spare ${Cntrl2}t2d0 ${Cntrl3}t3d0 ${Cntrl4}t4d0 ${Cntrl5}t5d0
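To put a number on what the two extra disks buy you, here is a back-of-the-envelope sketch. data_disks is a hypothetical helper. It assumes single-parity RAID-Z, where each stripe gives up roughly one disk's worth of space to parity, and it ignores file system overhead.

```shell
#!/bin/sh
# Rough count of data disks in a set of single-parity RAID-Z stripes.
# ASSUMPTION: each stripe loses about one disk's worth of space to parity.
data_disks() {   # usage: data_disks <stripes> <disks_per_stripe>
    echo $(( $1 * ($2 - 1) ))
}

six_spares=$(data_disks 8 5)                                  # 8 x (4+1)
four_spares=$(( $(data_disks 6 5) + $(data_disks 2 6) ))      # 6 x (4+1) + 2 x (5+1)
echo "6 hot spares: $six_spares data disks"
echo "4 hot spares: $four_spares data disks"
```

Both layouts still account for all 48 disks: 32 data + 8 parity + 6 spares + 2 boot, or 34 data + 8 parity + 4 spares + 2 boot.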

Tuesday Sep 23, 2008

Sun Fire X4540 Disk Planner

The SunFire X4500 (commonly known as Thumper) got a facelift a few months ago and the new version is the SunFire X4540. The X4540 has to a large degree been re-architected, and has new CPUs, more memory and a new I/O subsystem. There are still 48 disks, but the controller numbering is different and we now have four bootable disk slots vs only two in the X4500. 

I need to draw a picture when planning ZFS storage pools with so many disks, so I just uploaded my Sun Fire X4540 disk planner in PDF and OpenOffice formats. There is no rocket science here, it just helps you draw a picture, but I find it useful. It is an update of a similar doc I created for the X4500.

Friday Apr 11, 2008

OpenSolaris as a StorageOS - The Week That Everything Worked First Time

This has been an extraordinary week...everything I have tried to do has worked first time.

To summarise this week's activities, I have set up and then blogged on:

Configuring the OpenSolaris CIFS Server in Workgroup Mode
Configuring the OpenSolaris CIFS Server in Domain Mode
Solaris CIFS Windows SID Mapping - A First Look
Configuring the OpenSolaris Virus Scanning Services for ZFS Accessed via CIFS Clients

All in all, a very good week :-)

Monday Jan 14, 2008

Using the Sun Fire X4500 as Network Attached Archival Storage

A paper about a project I completed recently has been published on BigAdmin. The paper is "Configuring the Sun Fire X4500 Server as Network Attached Archival Storage for Symantec Enterprise Vault". The aim of the project was to see if the Sun Fire X4500 could be used as NAS in a Symantec Enterprise Vault environment. The findings were that this worked very well, and performance was excellent!

Software used on the Sun Fire X4500 was SAMBA, Sun StorageTek Storage Archive Manager, and Solaris ZFS. The project involved functional and performance testing, both of which are part of Symantec's comprehensive Self Certification program for Partners.

I went the extra mile, and wrote a configuration tool for the solution called x4500samconfig. The use of the tool is discussed in the paper, and it is available for download here. If you take an X4500 with Solaris 10 8/07 and the unconfigured Storage Archive Manager 4.6 packages installed, this script will have you up and running in minutes. The script configures ZFS, Storage Archive Manager and SAMBA. Some of what the script does "under the hood" is discussed here.

I am going to take the opportunity here to thank Sun's BigAdmin team: the process of publishing an article like this is quite lengthy. Initially, a paper is written by an Engineer (me in this case) for Sun Internal consumption only...this is pretty straightforward, requiring peer review only. Once the internal version is complete, the paper can be submitted for external publication on BigAdmin, and at this point the document must be legally reviewed and professionally edited, which can take some time...and it is the BigAdmin team who do this. Thanks, BigAdmin Team!

Tuesday Nov 13, 2007

Building a SAM File System on a Sun Fire X4500 using a ZFS Volume

The volume management layer of SAM can only stripe data for performance, not availability. When working with the Sun Fire X4500 you need to use a volume manager to create a logical volume or volumes on which to build a SAM file system if you want the file system to survive the failure of a disk.

By far the easiest way to do this in my experience is to create a ZFS storage pool and carve a ZFS Volume (ZVOL) out of it. You can then build the SAM file system on the ZVOL. You could use multiple ZVOLs and have SAM stripe across them, but I prefer to have a single volume and have ZFS take care of the striping rather than have two volume managers fighting with each other.

Here is an example of how to set this up.

1. Create a ZFS storage pool

I will call it samfspool0. I have provided a few recipes for creating ZFS storage pools on the Sun Fire X4500 in the past; the one I default to is this one.

2. Allocate a ZFS Volume from the Storage Pool

You can use the ZFS GUI to do this, but I will show you how to do it on the command line. I want to create a 7 TB volume called samfs0zvol with a block size of 16KB. The block size does not have to be 16KB; that is just what I am using in this example.

root# zfs create -V 7.0TB -b 16KB samfspool0/samfs0zvol

The volume appears as a ZFS file system without a mount point when you list it: 

root# zfs list samfspool0/samfs0zvol
samfspool0/samfs0zvol  57.0M  7.13T  57.0M  -

3. Add an entry in the /etc/opt/SUNWsamfs/mcf file

Note the path to the ZVOL is under /dev/zvol/dsk/<poolname>

# Equipment                           Eq   Eq    Family  Device  Additional
# Identifier                          Ord  Type  Set     State   Parameters
# ----------                          ---  ----  ------  ------  ----------
samfs0                                1    ms    samfs0  on
/dev/zvol/dsk/samfspool0/samfs0zvol   2    md    samfs0  on
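If you build these entries often, the two lines can be generated rather than typed. The sketch below is only an illustration: mcf_lines is a hypothetical helper, not part of SAM. It assumes a single-device ms file system, whitespace-separated fields, and that equipment ordinals 1 and 2 are not already used elsewhere in your mcf file.

```shell
#!/bin/sh
# Print the two mcf lines for a single-device "ms" file system on a ZVOL.
# mcf_lines is a hypothetical helper, not part of SAM.
# ASSUMPTION: equipment ordinals 1 and 2 are free in the target mcf file.
mcf_lines() {   # usage: mcf_lines <family_set> <pool> <zvol>
    printf '%s 1 ms %s on\n' "$1" "$1"
    printf '/dev/zvol/dsk/%s/%s 2 md %s on\n' "$2" "$3" "$1"
}

mcf_lines samfs0 samfspool0 samfs0zvol
```

Review the output before adding it to /etc/opt/SUNWsamfs/mcf, adjusting the ordinals if the file already has entries.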

4. Reconfigure SAM

Force SAM to read in the configuration file and then make the file system with a DAU of 16KB to align with the block size of the ZVOL.

root# samd config
root# sammkfs -a 16K samfs0
sammkfs: samfs0: One or more partitions exceeds 1 TB in size
sammkfs: file system samfs0 will not mount on 32 bit Solaris and
sammkfs: some earlier versions of Solaris
Building 'samfs0' will destroy the contents of devices:
Do you wish to continue? [y/N]y

You are done. Note that File System Manager, supplied with SAM, does not currently allow you to create a SAM file system on a ZVOL.

For an overview of SAM, and File System Manager, look here.


Tim Thomas

