Sunday Apr 06, 2008

What does the home server do?

I was recently asked what the home server serves. So here is the list:

  1. NAS server. NFS and CIFS (via Samba). There is a single Windows system in the house, which is increasingly rarely switched on, and NFS for the two laptops that frequent the network. All of it is served from ZFS on two 400GB drives with literally thousands of snapshots (44,170 of them). Space is beginning to get short thanks to the 10 megapixel SLR camera, so in the not too distant future a disk upgrade will be required.

  2. Sun Ray server. There are (currently) three Sun Rays. One acts as a photo frame and has no keyboard or mouse. The other two provide real interactive use. I can foresee a situation where we have two more Sun Rays.

  3. Email server. SMTP and IMAP via exim and imapd respectively. Clearly this implies SpamAssassin and an antivirus scanner, ClamAV.

  4. SlimServer. I've just run up a SlimServer to get better access to internet radio stations. Having a radio player that I can hook up to the hi-fi that is not DAB, i.e. crap1, would be good. I feel a Squeezebox coming soon.

Prior to the CPU upgrade the system would struggle to cope just occasionally, and every time I ran up VirtualBox, even when using the Fair Share Scheduler. Since the upgrade it has not had any problems with all of us using it.




1It is nice to see that I am not alone in realising DAB is crap.

Friday Apr 04, 2008

Sun Ray resource management.

One of the great benefits of running Sun Rays at home is having the sessions always there. Just plug in the card and you get your session back as if you were never away. However, that also allows you to leave an application chewing CPU cycles while you are away. So to keep the interactive experience as good as possible I employ the same techniques described in the “Using Solaris Resource Manager With Sun Ray” blueprint. For a long while I've wondered why IT don't do this; the keepers of our Sun Ray servers do, and it works a treat. Which is a good thing when you share a Sun Ray server with Tim.

Instead of setting the number of shares to a specific value I use a multiplier, so that those active on a Sun Ray get 10 times the number of shares that they would get by default. While this works well it still leaves a significant load on the system from certain applications, specifically flash animations left running, endlessly playing the games that were being played when the user's card was removed. The fair share scheduler does its thing to make CPU allocation fair, but the memory use of those otherwise idle firefox sessions is significant.
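For reference, the boost itself is just a pair of prctl calls from the Sun Ray connect and disconnect scripts; a minimal sketch, where the user.alice project name is only an illustration and the per-user projects are the ones the blueprint sets up:

# card in: give the user's project ten times the single share a project gets by default
/usr/bin/prctl -n project.cpu-shares -r -v 10 -i project user.alice

# card out: back to the default
/usr/bin/prctl -n project.cpu-shares -r -v 1 -i project user.alice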

So I've taken a leaf out of the BOFH's book and apply some special sanctions to those idle firefox processes. Alas I may not get a job with the BOFH, as my sanctions are simply to pstop(1) the copies of firefox associated with the user and DISPLAY when they detach, and then prun(1) them when the user reconnects. I wondered about using memory resource caps to limit the memory, but that would leave the system's rcapd(1M) battling the memory usage of firefox processes which are not displaying anything anyway. In the unlikely event that any of the users are using their firefox sessions to simulate nuclear fission or crack SSL, and so would rather they kept running, I'm sure they will get back to me.

So the script I have for doing this is slightly more complex than the one from the blueprint, since it has to err on the side of caution when stopping users' firefox sessions. To do that it uses pargs(1) to make sure that the firefox sessions really are for this display. In practice I am the only person who might remote-display a firefox session from here, and even that is unlikely, but it is the principle. The impact on the system of not trying to run all the disconnected firefox sessions is amazing.
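The stop half of it boils down to a loop like this sketch (not the real script: the process name and the way the display is passed in are assumptions):

#!/bin/ksh -p
# Suspend the firefox processes that belong to this user and whose
# DISPLAY matches the detaching Sun Ray session (passed as $1, e.g. ":12").
display=$1
for pid in $(pgrep -u "$USER" firefox-bin)
do
        # pargs -e prints the process environment; only touch processes
        # that really are on this display.
        if pargs -e $pid | grep "DISPLAY=$display" > /dev/null
        then
                pstop $pid              # and prun $pid when the card comes back
        fi
done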

Thursday Mar 27, 2008

Dual Core hits home server

I bit the bullet and bought a new CPU for the home server. It now has an AMD Athlon 64 X2 5000+ Socket AM2 2.6GHz Energy Efficient L2 1MB (2x512KB):

: pearson FSS 2 $; /usr/sbin/psrinfo -v
Status of virtual processor 0 as of: 03/27/2008 08:00:38
  on-line since 03/27/2008 07:47:52.
  The i386 processor operates at 2600 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 03/27/2008 08:00:38
  on-line since 03/27/2008 07:48:00.
  The i386 processor operates at 2600 MHz,
        and has an i387 compatible floating point processor.
: pearson FSS 3 $; 

So far so good. Obviously powernow no longer works, so this is running at full power all the time, which is less than ideal, but the performance should be, and so far is, considerably better than the single 2.2GHz CPU it replaces.

With the exception of PowerNow, which is not supported on this dual-core CPU, Solaris works flawlessly, as expected.

Tuesday Mar 25, 2008

Automatically opening a USB disk on Sun Ray

One of my users had a bit of a hissy fit today when she plugged her USB thumb drive into the Sun Ray and it did nothing. That is, it did nothing visible. Behind the scenes the drive had been mounted somewhere, but there was no realistic way she could know this.

So I need a way to get the file browser to open when the drive is inserted. A quick google finds “"USB Drive" daemon for Sun Ray sessions”, which looks like the answer. The problem I have with this is that it polls to see if there is something mounted. Given my users never log out, this would mean it running on average every second. Also the 5 second delay just does not take into account the attention span of a teenager.

There has to be a better way.

My solution is to use dtrace to see when the file system has been mounted and then run nautilus with that directory.

The great thing about Solaris 10 and later is that I can give the script just the privilege that allows it to run dtrace, without handing out access to the world. Then of course the script can give that privilege away once it no longer needs it.

So I came up with this script. Save it; mine is in /usr/local, which in turn is a symbolic link to /tank/fs/local. Then add an entry to /etc/security/exec_attr, substituting the correct absolute path (i.e. one with no symbolic links in it) in the line:

Basic Solaris User:solaris:cmd:::/tank/fs/local/bin/utmountd:privs=dtrace_kernel

This gives the script just enough privileges to allow it to work. It then drops the extra privilege so that when it runs nautilus it has no extra privileges.
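The core of the script looks something like the sketch below. It is not the real utmountd: it assumes the Sun Ray storage daemon puts its mounts under /tmp/SUNWut/mnt/<user>, and it only uses the kernel's domount routine as a "something was just mounted" trigger, but it shows the shape of the thing.

#!/bin/ksh -p
# Sketch of utmountd, not the real script: watch for mounts via dtrace,
# then open any newly arrived directory under the user's Sun Ray mount
# point in nautilus. The mount path and the probe used are assumptions.
mntdir=/tmp/SUNWut/mnt/$LOGNAME
seen=/tmp/.utmountd_seen.$$
ls "$mntdir" 2>/dev/null > "$seen"

/usr/sbin/dtrace -q -n 'fbt::domount:return { printf("mount\n"); }' |
while read event
do
        sleep 1                         # give the mount a moment to settle
        ls "$mntdir" 2>/dev/null | while read dir
        do
                grep "^${dir}\$" "$seen" > /dev/null && continue
                echo "$dir" >> "$seen"
                # open the new drive, stripped of the dtrace privilege we were given
                /usr/bin/ppriv -e -s A-dtrace_kernel \
                    /usr/bin/nautilus "$mntdir/$dir" &
        done
done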

Then you just have to arrange for users to run the script when they login using:

pfexec /usr/local/bin/utmountd

I have done this by creating a file called /etc/dt/config/Xsession.d/utmountd that contains these lines:


pfexec /usr/local/bin/utmountd &
trap "kill $!" EXIT

I leave making this work for users of CDE as an exercise for the reader.

Saturday Dec 22, 2007

Preparing to move off samba onto the native CIFS.

First, following the instructions on the OpenSolaris.org page that describe how to set up the smb service, I set it up on my laptop just to try and get a feel for the beast. To say it was easy is an understatement, although I have much to learn and I'm not sure it is quite ready to inflict on my users.

Anyway it does allow me to start the process: first by editing pam.conf, and then the most unpopular part, expiring all the passwords so that all the users generate new smb passwords. Once they have all done that I can think about moving over.
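For reference the pam.conf change is the single line from those instructions, so that password changes also generate the SMB password hashes (check the current CIFS documentation rather than trusting my memory):

other   password required       pam_smb_passwd.so.1     nowarn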

The only issue I think I have is that it is unclear to me at this point whether the smb shares will cross mount points like NFSv4 does with mirror mounts, which is the current behaviour via Samba. If not, that is going to be a major stumbling block.

Friday Nov 23, 2007

Memory upgrade

With the addition of the photo frame and a second Sun Ray the home server was beginning to struggle. The ideal (realistic) solution would be a multi-core CPU, but the lack of power management on those, at least the ones that will fit in the existing system, puts me off. However, since most of the performance issues have been due to lack of memory, there was a way out.

If you recall, the motherboard in the system is an ASUS M2NPV-VM, which has four memory slots; however the system was supplied with a Zalman 8000 Low Profile Cooler which interferes with one memory slot, reducing the number of available slots to two (the slots have to be used in pairs). The reason given for the optional cooler is to make the system quieter. So the only ways to increase the memory would be to replace the existing DIMMs with larger 2GB DIMMs, or to swap the CPU fan for the default one, which I still have, and then add the extra memory. The alternative would be to unteach the family that it is OK just to pull your card out of the system when you walk away, and what would be the fun of that?

It seems like a no brainer to me. I can always put the system in another room so the noise does not get me down.

So I have now added 4GB of RAM, giving 6GB in total, and put the AMD cooler on the system. So far so good:

Last login: Fri Nov 23 17:37:41 2007 from gmp-ea-fw-1.sun
Sun Microsystems Inc.   SunOS 5.11      snv_78  October 2007
: pearson FSS 1 $; prtconf | head -3
System Configuration:  Sun Microsystems  i86pc
Memory size: 6111 Megabytes
System Peripherals (Software Nodes):
: pearson FSS 2 $; batstat -t
Thermal zone: \\_TZ_.THRM, temperature = 40C, critical = 75C
        Active Cooling: 73
: pearson FSS 3 $; 

Once again battling with the hardware really brought home just how fantastic the design of Sun hardware is. This could have been designed by Citroen1 given how difficult it is to replace bits. Even if I had not had to remove the CPU fan to add the memory, I would still have had to remove all the disks, the main power supply and the fan power cable just to add a DIMM.


What is surprising is that the system is very much quieter now than it used to be, even with the original AMD fan.



1 I once, well twice, owned and/or leased Citroen Xantia cars. While driving up the motorway on the way home one day, the day after the car had been serviced, the “stop or you will die” light came on on the dashboard as the engine started making some alarming noises. So I pulled over sharpish onto the hard shoulder, popped the bonnet and took a look. No problem spotting the issue: one of the spark plugs had come out, still attached to the high tension lead. I made a mental note not to return to that garage to service the car, but was not that worried. I had a tool kit in the boot and so could just refit the spark plug and be on my way. That is, until I saw where the plug had come from, a deep, deep hole which those cunning French engineers have made so difficult to access that a normal spark plug socket won't fit down there. I rang the AA, thankful that I was a member, and waited. I did not have to wait long. I explained the problem to the AA man, who looked at me as if I were a fool, possibly I am, after all I am the one driving, or not, the Citroen, and he said the immortal words: “No problem, I'll have you going in 2 minutes”. 30 minutes later he had managed to get the spark plug in, but not tight; the car would run, though, so he followed me to a Citroen garage who did indeed have the special tool required to fit spark plugs and also employed a double-jointed Frenchman (he may not actually have been French, but you get the picture) capable of using it. Suffice to say the phrases “ease of maintenance” and “Citroen cars” are not ones you will often hear together, unless there is also the word “nightmare” in the sentence. That said, this was the first of my Citroens so it did not put me off. Though now I think about it, the last car was replaced by a bike.

Friday Jun 01, 2007

Rolling incremental backups

Why do you take back ups?

  • User error

  • Hardware failure

  • Disaster Recovery

  • Admin Error

With ZFS using redundant storage and plenty of snapshots my data should be safe from the first two. However that still leaves two ways all my data could be lost if I don't take some sort of back up.

Given the constraints I have my solution is to use my external USB disk containing a standalone zpool and then use zfs send and receive via this script to send all the file systems I really care about to the external drive.

To make this easier I have put all the filesystems into another “container” filesystem which has the “nomount” option set so it is hidden from the users. I can then recursively send that file system to the external drive. Also to stop the users getting confused by extra file systems appearing and disappearing I have set the mount point on the external disk to “none”.
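Hiding the copies on the external disk is just a property on the receiving dataset; with my dataset names that is something like:

zfs set mountpoint=none removable/safe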

The script only uses the snapshots that are prefixed “day” (you can change that with the -p (prefix) option), which reduces the amount of work the script does. Backing up the snapshots that happen every 10 minutes on this system does not seem worthwhile for a job I will run once a day or so.

The really cool part of this is that once I had the initial snapshot on the external drive, every backup from now on will be incremental. A rolling incremental backup. How cool is that?

# time ./zfs_backup tank/fs/safe removable/safe

real    12m10.49s
user    0m11.87s
sys     0m12.32s
# zfs list tank/fs/safe removable/safe
NAME             USED  AVAIL  REFER  MOUNTPOINT
removable/safe  78.6G  66.0G    18K  none
tank/fs/safe    81.8G  49.7G    18K  /tank/fs
# 

The performance is slightly disappointing due to the number of transport errors reported by the usb2scsa layer, but the data really is on the disk so I am happy.

Currently I don't have the script clearing out the old snapshots but I will get that going later. The idea of doing this over ssh to a remote site is compelling, when I can find a suitable remote site.
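The incremental step at the heart of the script amounts to something like this (a sketch rather than the script itself; the snapshot and child file system names are invented, and a real remote site would need the receiving pool set up first):

# first run only: seed the external pool with a full stream
zfs send tank/fs/safe/pics@day_2007_05_31 | zfs receive removable/safe/pics

# every run after that sends only what changed since the previous day snapshot
zfs send -i @day_2007_05_31 tank/fs/safe/pics@day_2007_06_01 | zfs receive removable/safe/pics

# and the remote version just puts ssh in the pipe
zfs send -i @day_2007_05_31 tank/fs/safe/pics@day_2007_06_01 | ssh backuphost zfs receive tank/backup/pics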

Thursday May 03, 2007

How many disks should a home server have? (I'm sure that was a song.)

My previous post failed to answer all the questions.

Specifically how many disks should a home server contain?

Now I will gloss over the obvious answer of zero (all your data should be on the net, managed by a professional organisation), not least because I would not trust someone else with my photos, however good they claim to be. Also any self-respecting geek will have a server at home with storage, which at the moment means spinning rust.

Clearly you need more than one disk for redundancy, and you have already worked out that the only sensible choice of file system is ZFS; you really don't want to lose your data to corruption. It is also reasonable to assume that this system will have six drives or fewer. At the time of writing you can get a Seagate 750GB SATA drive for £151.56 including VAT, or a 320GB one for £43.99.

Here is the table showing the number of disks that can fail before you suffer data loss:

Number of disks  Mirror  Mirror with hot spare  RaidZ  RaidZ with hot spare  Raidz2  Raidz2 with hot spare
2                1       N/A                    N/A    N/A                   N/A     N/A
3                N/A     2*                     1      N/A                   N/A     N/A
4                1**     N/A                    1      2*                    2       N/A
5                N/A     2*                     1      2*                    2       3
6                1**     N/A                    1      2*                    2       3

* To not suffer data loss, the second drive to fail must not fail while the hot spare is being resilvered.

** This is the worst case of both disks that form a mirror failing. It is possible that you could lose more than one drive and keep the data.

Richard has some more numbers about mean time before data loss and the performance of various configurations from a more commercial point of view, including a 3 way mirror.

Now let's look at how much storage you get:

Number of disks of size X GB  Mirror  Mirror with hot spare  RaidZ  RaidZ with hot spare  Raidz2  Raidz2 with hot spare
2                             X       N/A                    N/A    N/A                   N/A     N/A
3                             N/A     X                      2X     N/A                   N/A     N/A
4                             2X      N/A                    3X     2X                    2X      N/A
5                             N/A     2X                     4X     3X                    3X      2X
6                             3X      N/A                    5X     4X                    4X      3X

The power consumption will be pretty much proportional to the number of drives, as will the noise and the cost of purchase. For the Seagate drives I looked at, the power consumption was identical for the 320GB and 750GB drives.

Since my data set would easily fit on a 320GB disk (at the time of purchase), and that was the most economic choice at that point, I chose the two-way mirror. Also, raidz2 was not available then.

If I needed the space offered by 2X disks or more I would choose raidz2, as that gives the best redundancy.

So the answer to the question is “it depends” but I hope the above will help you to understand your choices.
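For completeness, the configurations in those tables are all one-line zpool commands; for example (the device names are just placeholders):

# two disks: the simple two-way mirror this server uses
zpool create tank mirror c1t0d0 c1t1d0

# six disks: raidz2, which survives any two failures
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

# a hot spare can be added to either
zpool add tank spare c1t6d0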


Tuesday May 01, 2007

Choosing a server for home

I get asked often enough how I chose my home system, and why I chose a single CPU system with just two disks, that I'm going to put down the thought process here.

My priorities were in order:

  1. Data integrity.

    The data is important: my photographs, the kids' homework and the correspondence the family members have, both in the form of letters and emails.

  2. Total cost of ownership.

    I have a realistic expectation that I will write the system off over 5 or more years, so the cost of the power it uses is important.

  3. Quantity of disk space.

    I needed at least 60GB to move off the Qubes, so add in a bit of a safety margin and 300GB should keep us going for 5 years (unless some new technology like a helmet camera starts eating through the disk as if it cost nothing).

  4. Noise.

    It sits in a room where I work. I am used to using a Sun Ray which is silent so having it as quiet as possible is the ideal.

  5. Physical size.

    It was to replace a pair of Qube 3s, which were beautifully small and sit on a window ledge. So a tower was not practical.

Storage

Given these constraints there was just one choice of file system, ZFS, so the system just had to support at least two disks so that they could be mirrored. The cost of running more than two disks, combined with the fact that 300GB drives are affordable, and that the form factor of the box made having more disks less appealing, meant that two disks won out, even though more would have been faster. More spindles == more performance, mostly.

I'm not regretting this decision, despite the fact that the system can be extremely unresponsive when doing a live upgrade. Since both drives contain the mirrors of both boot environments, when running lumake to copy one boot environment to the other the disk performance is terrible as the heads seek back and forth; the same is true during an install, as I have the DVD image on the same disks. Putting the image on the external USB drive does help, but the problem is not really bad enough that I bother. Having four disks would have mitigated this. I'm hopeful that when we get native command queuing support in the sata driver this will improve slightly. Having root on ZFS should eventually eliminate the lumake copy step, as that will be a clone.

Mother Board

The choice of motherboard was defined by a desire to support 4GB of RAM, so that if the system turned into a Sun Ray server, which it has, I would have the option of a reasonable amount of memory to support two or three users, and by support for SATA disks, since the price/performance of those drives fitted my storage requirements. Obviously it had to be able to boot from those drives.

There was no need for good graphics, but two network ports had to be available, so if it came with one on board that would be ideal. Gigabit networking would also help. I put all of those variables in and one of my blogless colleagues suggested the ASUS M2NPV-VM, which was built around chipsets that the release of OpenSolaris that was current at the time should support. The only exception was the Nvidia graphics driver, which at that time was not available. However, since I did not need graphics, that was not an issue. It had an on-board gigabit network port, so even with the addition of a second network card there are still free slots if I need them.

CPU

The choice of CPU was based on cost and the knowledge that Casper's powernow driver does not support multiple CPUs or multi-core CPUs. For reasons of getting under the budget I had, I chose the AMD Athlon(tm) 64 Processor 3500+ Socket AM2, which will run at 17.8 Watts when running at 1GHz.

I know of people who are successfully using the following CPUs with powernow on this motherboard; however, this does not constitute any kind of guarantee:

CPU                   Earliest BIOS version known to work
AMD Athlon 64 3800+   0705 01/02/2007 (my colleague thinks it first started working on the 0303 firmware, but is not certain)
AMD Athlon 64 3500+   0705 01/02/2007
AMD Athlon 64 3000+   0705 01/02/2007

If you know of other CPUs that work with the PowerNow driver on this motherboard let me know and I will update the table.

The components not appearing in this blog

The system has a DVD-RW, but that was simply whatever the supplier of the CPU, motherboard and case happened to ship. The thing I don't have that might surprise some is a tape drive. Since I state that the number one goal was data integrity, you would think a good backup would seem to be a requirement. However I have found that my external USB disk drive, combined with being able to back up ZFS snapshots to DVD (hint: set the quota on your file systems to less than the size of a DVD to make backups easier), means that while I know rebuilding would be very hard, I'm sure I have all the photographs safe. My children's homework has such a short life span that anything other than the snapshots is unlikely to help.
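That quota hint is a one-liner per file system; something along these lines (the snapshot name is made up):

# keep each file system small enough that a full send fits on a DVD
zfs set quota=4g tank/fs/users/user1

# a snapshot stream can then be dumped to a file and burnt to DVD
zfs send tank/fs/users/user1@day_2007_05_01 > /tank/fs/downloads/user1.zfs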


Wednesday Apr 18, 2007

Backing up laptop using ZFS over iscsi to more ZFS

After the debacle of the reinstall of my laptop, with the zpool having to be rebuilt and “restored” using zfs send and zfs receive, I thought I would look for a better backup method. One that did not involve being clever with partitions on an external USB disk that are “ready” for when the whole disk is using ZFS.

The obvious solution is a play on one I had played with before: store one half of the pool on another system. So welcome to iSCSI.

ZFS volumes can now be shared using iscsi. So on the server, create a volume with the “shareiscsi” property set to “on” and enable the iscsi target:

# zfs get shareiscsi tank/iscsi/pearson
NAME                PROPERTY    VALUE               SOURCE
tank/iscsi/pearson  shareiscsi  on                  inherited from tank/iscsi
# svcadm enable svc:/system/iscsitgt
# 
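Creating such a volume in the first place is equally short; roughly this (the 11GB size just matches the target that shows up on the client below):

# the property is set once on the parent and inherited by every volume below it
zfs set shareiscsi=on tank/iscsi
zfs create -V 11g tank/iscsi/pearson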

Now on the client tell the iscsi initiator where the server is:


5223 # iscsiadm add discovery-address 192.168.1.20
5224 # iscsiadm list discovery-address            
Discovery Address: 192.168.1.20:3260
5225 # iscsiadm modify discovery --sendtargets enable
5226 # format < /dev/null
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0d0 <DEFAULT cyl 3791 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0
       1. c10t0100001731F649B400002A004625F5BEd0 <SUN-SOLARIS-1-11.00GB>
          /scsi_vhci/disk@g0100001731f649b400002a004625f5be
Specify disk (enter its number): 
5227 # 

Now attach the new device to the pool. I can see some security would be a good thing here to protect my iscsi pool; more on that later.


5229 # zpool status newpool                                       
  pool: newpool
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: scrub completed with 0 errors on Wed Apr 18 12:30:43 2007
config:

        NAME        STATE     READ WRITE CKSUM
        newpool     ONLINE       0     0     0
          c0d0s7    ONLINE       0     0     0

errors: No known data errors
5230 # zpool attach newpool c0d0s7 c10t0100001731F649B400002A004625F5BEd0
5231 # zpool status newpool                                              
  pool: newpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.02% done, 8h13m to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        newpool                                     ONLINE       0     0     0
          mirror                                    ONLINE       0     0     0
            c0d0s7                                  ONLINE       0     0     0
            c10t0100001731F649B400002A004625F5BEd0  ONLINE       0     0     0

errors: No known data errors
5232 # 

The 8 hours to complete the resilver turns out to be hopelessly pessimistic and is quickly reduced to a more realistic, but still overly pessimistic, 37 minutes. All of this over what is only a 100Mbit ethernet connection from this host. I'm going to try this on the Dell that has a 1Gbit network to see if that improves things even further. (Since the laptop has just been upgraded to build 62 the pool “needs” to be upgraded. However, since an upgraded pool could not then be imported on earlier builds, I won't upgrade the pool version until both boot environments are running build 62 or above.)

I am left wondering how useful this could be in the real world. As a “nasty hack” you could have your ZFS-based NAS box serving out volumes to your clients, which then have zpools in them. Then on the NAS box you can snapshot and back up the volumes, which would actually give you a backup of the whole of the client pool, something many people want for disaster recovery reasons. Which is in effect what I have here.




Monday Feb 26, 2007

Reverting to build 55

My home server is back at build 55. After reading the heads-up message about ZFS on build 58 I wasted no time in sending the system back to build 55, which was the last release I had installed prior to build 58. I miss the improvements in gnome that have turned up in the later releases, but data integrity trumps everything.

Hopefully I will be able to get to build 59 (which despite what the heads up says I am told should contain the fix) later this week.


Wednesday Feb 07, 2007

Home Server powernow now operational

Powernow has not been working on my home server. It was fine at 2200MHz, 1800MHz or 1000MHz, but if it ever tried to run at 2000MHz the system would crash, so I have been pretty much doing powernow by hand, sticking to 2200MHz when I am using the system or 1000MHz when I am not, saving about 40 Watts when it is quiet.

Recently ASUS released a new BIOS version for the M2NPV-VM motherboard I have, and this has now fixed the problem so that powernow works perfectly. If you have this motherboard and an Athlon 3500+ CPU then upgrade to BIOS version 0705 and powernow will work. Well, it does for me.

Tuesday Jan 16, 2007

Squid

I have finally installed a transparent caching proxy server on the home server, mainly as it provides an easy way to block unsuitable sites from the kids.

Adding this line to ipnat.conf (recall my internal network is on nge0 and the internet lives on rtls0) redirects the web traffic:

rdr nge0 0.0.0.0/0 port 80 -> 127.0.0.1 port 8080 tcp

The proxy server is listening on port 8080.
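Redirecting the packets is only half the job; Squid also has to be told that connections arriving on that port were intercepted. With the Squid 2.6 I built that is a single squid.conf line (the exact directive varies between Squid versions):

http_port 8080 transparent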


Then I took the source to squid and built it with these options:


CC=cc ./configure --prefix=/opt/squid --enable-ipf-transparent --enable-ssl

Whilst I could have used the package from blastwave.org, I would like to wean the system off blastwave packages as they pull in lots of duplicated libraries when used on Solaris 10, or as in my case Nevada.


The squid cache is being stored in its own ZFS file system, /tank/squid/cache, as you would expect; however, thanks to the way I have laid out the file systems it does not get snapshotted, so it won't chew through disk.


Then, using the work that Trev has done, I now have a working manifest and start script.



Monday Nov 27, 2006

Sun Ray Server @ home

I finally got around to installing the Sun Ray software on the new home server. This was less simple than I had hoped. For some reason the configuration script would not add the correct settings to the dhcp server; however, I followed the instructions over on the Think Thin blog for adding the settings manually. Once I did this the Sun Ray would still not work, getting a 26 D error, which if I understood it right meant that the appliance was not connecting to the X server.

Guess who had disabled the X server on the system? Well, I did not need an X server on a headless system so I had disabled it. Thankfully restarting it is a snip:

# svcadm enable svc:/application/graphical-login/cde-login:default

and suddenly the Sun Ray burst into life; running build 53 it flies along. Now I'm off to eBay to see if I can pick up some appliances, no bidding them up now.



Thursday Nov 23, 2006

A faster ZFS snapshot massacre

I moved the zfs snapshot script into the office and started running it on our build system. Being a cautious type when it comes to other people's data, I ran the clean-up script in “do nothing” mode so I could be sure it was not cleaning snapshots that it should not. After a while running like this we had over 150,000 snapshots of 114 file systems, which meant that zfs list was now taking a long time to run.

So long, in fact, that the clean-up script was not actually making forward progress against snapshots being created every 10 minutes. So I now have a new clean-up script. This is functionally identical to the old one but a lot faster. Unfortunately I have now cleaned out the snapshots, so the times are not what they were (zfs list was taking 14 minutes), however the difference is still easy to see.

When run with the option to do nothing, the old script:

# time /root/zfs_snap_clean > /tmp/zfsd2

real    2m23.32s
user    0m21.79s
sys     1m1.58s
#

And the new:

# time ./zfs_cleanup -n > /tmp/zfsd

real    0m7.88s
user    0m2.40s
sys     0m4.75s
#

which is a result.


As you can see the new script is mostly a nawk script and, more importantly, it only calls the zfs command once to get all the information about the snapshots:


#!/bin/ksh -p
#
# Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License, Version 1.0 only
# (the "License").  You may not use this file except in compliance
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#	Script to clean up snapshots created by the script from this blog
#	entry:
#
#	http://blogs.sun.com/chrisg/entry/cleaning_up_zfs_snapshots
#
#	or using the command given in this entry to create snapshots when
#	users mount a file system using SAMBA:
#
#	http://blogs.sun.com/chrisg/entry/samba_meets_zfs
#
#	Chris.Gerhard@sun.com 23/11/2006
#

PATH=$PATH:$(dirname $0)

while getopts n c
do
	case $c in
	n) DO_NOTHING=1 ;;
	\?) echo "$0 [-n] [filesystems]"
		exit 1 ;;
	esac
done
shift $(($OPTIND - 1))
if (( $# == 0))
then
	set - $(zpool list -Ho name)
fi


export NUMBER_OF_SNAPSHOTS_boot=${NUMBER_OF_SNAPSHOTS:-10}
export DAYS_TO_KEEP_boot=${DAYS_TO_KEEP:-365}

export NUMBER_OF_SNAPSHOTS_smb=${NUMBER_OF_SNAPSHOTS:-100}
export DAYS_TO_KEEP_smb=${DAYS_TO_KEEP:-14}

export NUMBER_OF_SNAPSHOTS_month=${NUMBER_OF_SNAPSHOTS:-24}
export DAYS_TO_KEEP_month=365

export NUMBER_OF_SNAPSHOTS_day=${NUMBER_OF_SNAPSHOTS:-$((28 * 2))}
export DAYS_TO_KEEP_day=${DAYS_TO_KEEP:-28}

export NUMBER_OF_SNAPSHOTS_hour=$((7 * 24 * 2))
export DAYS_TO_KEEP_hour=$((7))

export NUMBER_OF_SNAPSHOTS_minute=$((100))
export DAYS_TO_KEEP_minute=$((1))


zfs get -Hrpo name,value creation $@ | sort -r -n -k 2 |\
	nawk -v now=$(convert2secs $(date)) -v do_nothing=${DO_NOTHING:-0} '
function ttg(time)
{
	return (now - (time * 24 * 60 * 60));
}
BEGIN {
	time_to_go["smb"]=ttg(ENVIRON["DAYS_TO_KEEP_smb"]);
	time_to_go["boot"]=ttg(ENVIRON["DAYS_TO_KEEP_boot"]);
	time_to_go["minute"]=ttg(ENVIRON["DAYS_TO_KEEP_minute"]);
	time_to_go["hour"]=ttg(ENVIRON["DAYS_TO_KEEP_hour"]);
	time_to_go["day"]=ttg(ENVIRON["DAYS_TO_KEEP_day"]);
	time_to_go["month"]=ttg(ENVIRON["DAYS_TO_KEEP_month"]);
	number_of_snapshots["smb"]=ENVIRON["NUMBER_OF_SNAPSHOTS_smb"];
	number_of_snapshots["boot"]=ENVIRON["NUMBER_OF_SNAPSHOTS_boot"];
	number_of_snapshots["minute"]=ENVIRON["NUMBER_OF_SNAPSHOTS_minute"];
	number_of_snapshots["hour"]=ENVIRON["NUMBER_OF_SNAPSHOTS_hour"];
	number_of_snapshots["day"]=ENVIRON["NUMBER_OF_SNAPSHOTS_day"];
	number_of_snapshots["month"]=ENVIRON["NUMBER_OF_SNAPSHOTS_month"];
} 
/.*@.*/ {
	split($1, a, "@");
	split(a[2], b, "_");
	if (number_of_snapshots[b[1]] != 0 &&
		++snap_count[a[1], b[1]] > number_of_snapshots[b[1]] &&
		time_to_go[b[1]] > $2) {
		str= sprintf("zfs destroy %s\n", $1);
		printf(str);
		if (do_nothing == 0) {
			system(str);
		}
	}
}'


Tuesday Nov 14, 2006

Home Server questions answered

Rayson Ho asked the following questions in response to my posting listing my home server configuration.

1) How noisy is the power supply under Solaris??

Well, it is sat less than a metre away from me and I can live with it easily. I would prefer to put the whole system in another room, or the loft, or somewhere out of the way and use it all remotely, since it is a server, but the hassle of sorting out network cabling and the fact that the layout of the house may change soon means I have not.

2) Could you get the built-in ethernet interface to run at full speed??

Yes. The users even noticed that life was faster when I finally bought the gigabit hub.

3) Is sound working??

I've not even tried. Just to answer you I looked in /dev/sound and it is empty but I have made no attempt to see if there is something I can tweak to get sound working. I don't need or want sound on my server.

Lastly, what is the difference between using the SATA framework and legacy mode? Would I get better performance with the new framework??

I don't think the SATA framework buys you any performance at all. It allows you to do things like hot-plugging, which is not possible with the case I have, so it is not an issue.


Monday Nov 13, 2006

Good Morning Build 52

Build 52 hit the Sun Ray server:

: estale.eu IA 1 $; uname -a
SunOS estale 5.11 snv_52 sun4u sparc SUNW,Sun-Fire
: estale.eu IA 2 $;

and all seems well.


However, at home all was not well when I upgraded my home server. The laptops had been fine, except for the message about the now nonexistent pfil service failing; disabling that service removed the irritating but harmless warning. On the server there were two issues:

  1. The dhcp service was not working

  2. The web server was not starting

The failed dhcp service resulted in a quick about-face to build 51, as the users would not stand not having a computer. However, after they had left the house and before I had real work to do, and thanks to Casper confirming that his dhcp server was working on build 52, I tracked the problem down: it turned out to be a misconfiguration of the firewall. Exactly why this worked in build 51 is a mystery. Adding this line:

pass in quick on nge0 proto udp from any to 192.168.254.20 port = bootps keep state

to the /etc/ipf/ipf.conf file brought the dhcp server to life. This, however, was not before I discovered that dhcpmgr would not start, giving this java error:

3 # /usr/sadm/admin/bin/dhcpmgr 
Exception in thread "main" java.lang.UnsupportedClassVersionError: Bad version number in .class file
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
4 #

Which turns out to be due to the version of java that the /usr/java symbolic link points to being wrong. An upgrade bug, it appears. This bug has now been filed:

CR 6492789 Created P2 java/install /usr/java link points to Java 1.5.0 instead of Java 1.6 after upgrade from snv_51 to snv_52.


The work around is simple:


# cd /usr
# rm java
# ln -s jdk/jdk1.6.0 java

The second problem, of the web server not working, was that apache2 had changed and so the configuration files needed rejigging. I am now back running build 52 and it is happily serving planetcycling.org. However, I am now using the more modular configuration files, so if there are issues in the future updating will be simpler.





Saturday Nov 11, 2006

ZFS snapshot massacre.

As the number of snapshots grows I started wondering how much space they are really taking up on the home server. This also pretty much shows how much data gets modified after being initially created. I would guess not much, as the majority of the data on the server is:

  1. Solaris install images. Essentially read only.

  2. Photographs.

  3. Music mostly in the form of iTunes directories.

Running this command line gets the result:

zfs get -rHp used $(zpool list -H -o name ) |\
nawk '/@/ && $2 == "used" { tot++; total_space+=$3 ;\
        if ( $3 == 0 ) { empty++ }} \
END { printf("%d snapshots\n%d empty snapshots\n%2.2f G in %d snapshots\n", tot, \
        empty, total_space/(1024^3), tot - empty ) }'
68239 snapshots
63414 empty snapshots
2.13 G in 4825 snapshots
: pearson TS 15 $; zfs get used $(zpool list -H -o name )
NAME  PROPERTY  VALUE  SOURCE
tank  used      91.2G  -
: pearson TS 16 $;

So I only have 2.13G of data saved in snapshots out of 91.2G of data. Not really a surprising result. The biggest user of space for snapshots is one file system, the one that contains planetcycling.org. As the planet gets updated every 30 minutes and the data is only indirectly controlled by me, I'm not shocked by this. I would expect the amount to stabilize over time as the system settles down, and to that end I will note the current usage:


zfs get -rHp used tank/fs/web |\
nawk '/@/ && $2 == "used" { tot++; total_space+=$3 ;\
        if ( $3 == 0 ) { empty++ }} \
END { printf("%d snapshots\n%d empty snapshots\n%2.2f G in %d snapshots\n", tot,
        empty, total_space/(1024^3), tot - empty ) }'
1436 snapshots
789 empty snapshots
0.98 G in 647 snapshots

All this caused me to look a bit harder at the zfs_snapshot_clean script I have, as it appeared to be keeping some really old snapshots from some of the classes, which I did not expect. Now, while the 68,000 snapshots were having no negative impact on the running of the system, it was not right. There were two issues. First, it was sorting the list of snapshots using the snapshot creation time, which was correct, but it was sorting in reverse order, which was not. Secondly, I was keeping a lot more of the hourly snapshots than I intended.


After fixing this and running the script (you can download it from here) there was a bit of a snapshot massacre, leading to a lot fewer snapshots:


zfs get -rHp used $(zpool list -H -o name ) |\
nawk '/@/ && $2 == "used" { tot++; total_space+=$3 ;\
        if ( $3 == 0 ) { empty++ }} \
END { printf("%d snapshots\n%d empty snapshots\n%2.2f G in %d snapshots\n", tot, \
        empty, total_space/(1024^3), tot - empty ) }'
25512 snapshots
23445 empty snapshots
2.20 G in 2067 snapshots

Only 25,000 snapshots, much better, and most of them remain empty.


Tuesday Nov 07, 2006

Home Server hardware configuration

Somehow I missed posting the configuration of my home server directly here, as I posted a link to the site where I bought it, and I keep being asked for the specifications.


Here are the details of the core of the system.



  • Motherboard: ASUS M2NPV-VM. BIOS upgrades can be done with a USB memory stick, so there is no need to have any OS other than Solaris on the system. I've not tried the audio, the IEEE 1394a port or the graphics card beyond just starting X; everything I have tried to use has worked on Solaris, with the exception of powernow.

  • CPU: AMD Athlon 3500+. I've been having trouble with powernow: when the system runs at 2GHz it crashes with memory errors. Colleagues who have the Athlon 3800 CPU report it works fine. This is irritating as I only chose the single-core CPU to get powernow sooner rather than later.

  • Memory: 2 * 1GB 240-pin 800MHz DDR2 RAM.

  • Disks: 2 * Seagate Barracuda ST3320620AS 7200rpm 320GB SATA disks. You have to remove a jumper on the drives to get these to run at full speed.

  • Case: Antec NSK1300 MicroATX. A good case which will hold three 3.5" disks as well as a DVD drive. However it really shows just how good the mechanics on Sun systems are; you get what you pay for.

Which ends up looking like this as far as prtdiag is concerned:


System Configuration: System manufacturer System Product Name
BIOS Configuration: Phoenix Technologies, LTD ASUS M2NPV-VM ACPI BIOS Revision 0504 10/17/2006

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
AMD Athlon(tm) 64 Processor 3500+ Socket AM2

==== Memory Device Sockets ================================

Type    Status Set Device Locator      Bank Locator
------- ------ --- ------------------- --------------------
unknown empty  0   A0                  Bank0/1
unknown in use 0   A1                  Bank2/3
unknown empty  0   A2                  Bank4/5
unknown in use 0   A3                  Bank6/7

==== On-Board Devices =====================================

==== Upgradeable Slots ====================================

ID  Status    Type             Description
--- --------- ---------------- ----------------------------
1   available PCI              PCI1
2   in use    PCI              PCI2
4   available PCI Express      PCIEX16
5   available PCI Express      PCIEX1_1

Even with the powernow issue I'm very happy with the system.


Friday Oct 27, 2006

Where to put ZFS filesystems in a pool

Before we had ZFS, I was always telling people not to put things in the root of a file system; specifically, don't share the root of a file system. That way, if you wanted to add another share to the file system later with different permissions, you could, and all was good. It was (is) good advice.

With ZFS you end up with lots of file systems and the advice does not hold any more. Where previously you were trying to share the file system's resources, now you would just create a new file system in the pool and have done with it.

Today I realized that for ZFS there is some similar advice worth following, and that is: don't put all your file systems in the root of the pool. Today's example is that I have a number of file systems and one zvol in a pool. It would be nice to be able to use a single recursive snapshot to back up all the file systems but not the zvol, since that zvol is my swap partition. While a snapshot of swap is kind of cool, in that you can do it, it serves no purpose other than to use storage at an alarming rate.

So now I have moved all the file systems under one uber filesystem called “fs”:

# zfs list -t filesystem,volume
NAME                              USED  AVAIL  REFER  MOUNTPOINT
tank                             84.9G   187G  26.5K  /tank
tank/fs                          82.8G   187G  32.5K  /tank/fs
tank/fs/downloads                31.9G   187G  2.20G  /tank/fs/downloads
tank/fs/downloads/nv             26.9G   187G  37.5K  /tank/fs/downloads/nv
tank/fs/downloads/nv/46          2.49G   187G  2.48G  /tank/fs/downloads/nv/46
tank/fs/downloads/nv/47          2.45G   187G  2.45G  /tank/fs/downloads/nv/47
tank/fs/downloads/nv/48          2.55G   187G  2.45G  /tank/fs/downloads/nv/48
tank/fs/downloads/nv/49          2.46G   187G  2.45G  /tank/fs/downloads/nv/49
tank/fs/downloads/nv/50          2.52G   187G  2.45G  /tank/fs/downloads/nv/50
tank/fs/downloads/nv/51          2.50G   187G  2.46G  /tank/fs/downloads/nv/51
tank/fs/downloads/nv/tmp         12.0G   187G  4.78G  /tank/fs/downloads/nv/tmp
tank/fs/local                    66.8M   187G  57.0M  /tank/fs/local
tank/fs/opt                      1.67G  28.3G  25.5K  /tank/fs/opt
tank/fs/opt/SUNWspro              459M  28.3G   453M  /opt/SUNWspro
tank/fs/opt/csw                   340M  28.3G   121M  /opt/csw
tank/fs/opt/sfw                   907M  28.3G   880M  /opt/sfw
tank/fs/opt/spamd                 110K  28.3G  24.5K  /tank/fs/opt/spamd
tank/fs/shared                   12.1G  37.9G  28.5K  /tank/fs/shared
tank/fs/shared/music             5.71G  37.9G  5.70G  /tank/fs/shared/music
tank/fs/shared/pics              6.36G  37.9G  6.32G  /tank/fs/shared/pics
tank/fs/shared/projects           424K  37.9G  25.5K  /tank/fs/shared/projects
tank/fs/shared/projects/kitchen   376K  37.9G  46.5K  /tank/fs/shared/projects/kitchen
tank/fs/users                    25.4G  74.6G  32.5K  /tank/fs/users
tank/fs/users/user1               300K  74.6G  29.5K  /tank/fs/users/user1
tank/fs/users/user2              23.1G  74.6G  22.2G  /tank/fs/users/user2
tank/fs/users/user3              13.5M  74.6G  9.72M  /tank/fs/users/user3
tank/fs/users/user4               652M  74.6G   614M  /tank/fs/users/user4
tank/fs/users/user5              1.12G  74.6G   987M  /tank/fs/users/user5
tank/fs/users/user6               500M  74.6G   341M  /tank/fs/users/user6
tank/fs/var                      10.8G  19.2G  35.5K  /tank/fs/var
tank/fs/var/crash                5.10G  19.2G  5.09G  /var/crash
tank/fs/var/dhcp                  128K  19.2G    30K  /tank/fs/var/dhcp
tank/fs/var/log                  49.5K  19.2G    27K  /tank/fs/var/log
tank/fs/var/mail                  871M  19.2G   179M  /var/mail
tank/fs/var/mqueue                662K  19.2G  24.5K  /var/spool/mqueue
tank/fs/var/named                 442K  19.2G   369K  /tank/fs/var/named
tank/fs/var/openldap-data         130K  19.2G  82.5K  /tank/fs/var/openldap-data
tank/fs/var/opt                    46K  19.2G  24.5K  /tank/fs/var/opt
tank/fs/var/samba                  46K  19.2G  24.5K  /tank/fs/var/samba
tank/fs/var/tmp                  4.90G  19.2G  2.45G  /tank/fs/var/tmp
tank/fs/web                       920M   104M  2.56M  /tank/fs/web
tank/swap                        45.1M   189G  45.1M  -


I could have tweaked the mount point of tank to be “none” and of tank/fs to be “/tank”, but did not, to avoid potential confusion in the future. I should really also ask for “zfs snapshot -r” to have a -t option so you could get it to snapshot based on a type.
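The pay-off is that the entire backup set is now a single recursive snapshot, with the zvol left out; the snapshot name here is just an example:

zfs snapshot -r tank/fs@day_2006_10_27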



About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
