Tuesday Aug 04, 2009

Making a simple script faster

Many databases get backed up by simply stopping the database copying all the data files and then restarting the database. This is fine for things that don't require 24 hour access. However if you are concerned about the time it takes to take the back up then don't do this:

cp /data/file1.db .
gzip file1.db
cp /data/file2.db .
gzip file2.db

Now there are many ways to improve this using ZFS and snapshots being one of the best but if you don't want to go there then at the very least stop doing the “cp”. It is completely pointless. The above should just be:

gzip < /data/file1.db > file1.db
gzip < /data/file2.db > file2.db

You can continue to make it faster by backgrounding those gzips if the system has spare capacity while the back up is running but that is another point. Just stopping those extra copies will make life faster as they are completely unnecessary.

Friday Jul 24, 2009

gethrtime and the real time of day

Seeing Katsumi Inoue blogging about Oracle 10g reporting timestamps using the output from gethrtime() reminded me that I have had on occasion wished I had a log to map hrtime to the current time. As Katsumi points out the output of gethrtime() is not absolutely tied to the current time. So there is no way to take the output from it and tell when in real time the output was generated unless you have some reference point. To make things more complex the output is reset each time the system reboots.

For this reason it is useful to keep a file that contains a history of the hrtime and the real time so that any logs can be retrospectively coerced back into a readable format.

There are lots of ways to do this but since on this blog we seem to be in Dtrace mode here is how using dtrace

pfexec /usr/sbin/dtrace -o /var/log/hrtime.log -qn 'BEGIN,tick-1hour,END {
        timestamp, walltimestamp/1000000000,
        walltimestamp%1000000000, walltimestamp);

Then you get a nice file that contains three columns. The hrtime, the time in seconds since January 1st 1970 and a human readable representation of the time in the current timezone:

: s4u-10-gmp03.eu TS 39 $; cat /var/log/hrtime.log    
5638545510919736:1248443226.350000625:2009 Jul 24 14:47:06
5642145449325180:1248446826.279995332:2009 Jul 24 15:47:06

I have to confess however that using Dtrace for this does not feel right, not least as you need to be root for this to be reliable and also the C code is trivial to write, compile and run from cron and send the output to syslog:

: exdev.eu FSS 39 $; cat  ./gethrtime_base.c
#include <sys/time.h>
#include <stdio.h>

main(int argc, char \*\*argv)
	hrtime_t hrt = gethrtime();
	struct timeval tv;
	gettimeofday(&tv, NULL);

	printf("%lld:%d.%6.6d:%s", hrt, tv.tv_sec, tv.tv_usec,
: exdev.eu FSS 40 $; make ./gethrtime_base
cc    -o gethrtime_base gethrtime_base.c 
: exdev.eu FSS 41 $;  ./gethrtime_base
11013365852133078:1248444379.163215:Fri Jul 24 15:06:19 2009
: exdev.eu FSS 42 $; 
./gethrtime_base | logger -p daemon.notice -t hrtime
: exdev.eu FSS 43 $;  tail -10 /var/adm/messages | grep hrtime
Jul 24 15:32:33 exdev hrtime: [ID 702911 daemon.notice] 11014939896174861:1248445953.109855:Fri Jul 24 15:32:33 2009
Jul 24 16:09:21 exdev hrtime: [ID 702911 daemon.notice] 11017148054584749:1248448161.131675:Fri Jul 24 16:09:21 2009
: exdev.eu FSS 50 $; 

Wednesday Jul 22, 2009

1,784,593 the highest load average ever?

As I cycled home I realised there was one more thing I could do on the exploring the limits of threads and processes on Solaris. That would be the highest load average ever. Modifying the thread creator program to not have each thread sleep once started but instead wait until all the threads were set up and then go into an infinite compute loop that should get me the highest load average possible on a system or so you would think.

With 784001 threads the load stabilised at:

10:16am  up 18:07,  2 users,  load average: 22114.50, 22022.68, 21245.781

Which was somewhat disappointing. However an earlier run with just 780,000 threads managed to peak the load at 1,784,593 while it was exiting:

 7:44am  up 15:35,  2 users,  load average: 1724593.79, 477392.80, 188985.10

I' still pondering how 780000 thread can result in a load average of more than 1 million.

Sunday Jul 19, 2009

784972 threads in a process

After the surprise interest in the maximum number of processes on a system it seems rude not to try and see how many threads I can squeeze into a single process while I have access to a system where physical memory will not be the limiting factor. The expectation is that this will closely match the number of processes as each thread will have an LWP in the kernel which will in turn consume the segkp.

A slight modification to the forker program:

: exdev.eu FSS 62 $; cat thr_creater.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <thread.h>

main(int argc, char \*\*argv)
        pid_t pid;
        int count=0;
        while(count < (argc != 2 ? 100 : atoi(argv[1])) &&
            (pid = thr_create(NULL, 0, (void \* (\*)(void \*))pause,
            NULL, THR_DETACHED, NULL)) != -1) {
                if (pid == 0 ) {
                        /\* Success, ) \*/
                        if (count % 1000 == 0)
                                printf("%d\\n", count);
        if (pid < 0)
        printf("%d\\n", count);

and this time it has to be built as a 64 bit program:

# make "CFLAGS=-m64 -mt" thr_creater

Here is how it went:

$; ./thr_creater 1000000        

Here things have stopped and for some bizarre reason attaching a debugger to see what is going on does not seem to be a good idea. I had prstat running in another window and it reported:

  2336 cg13442  7158M 7157M cpu73    0    0   1:42:59 1.6% thr_creater/784970

Which is just a few more threads than I got processes (784956) when running in multi user. However at this point the system is pretty much a warm brick as if I exit any process thr_creater hoovers up the process so I can create no more. Fortunately I had realized this would happen and had some sleep(1) processes running so I could pause the thr_creater and then kill one of the sleeps to allow me to run a command:

$; ps -o pid,vsz,rss,nlwp,comm -p 2336
  2336 7329704 7329248 784972 ./thr_creater

as you can see it managed to get another two threads created since the prstat exited.

Friday Jul 17, 2009

10 Steps to OpenSolaris Laptop Heaven

If you have recently come into possession of a Laptop onto which to load Solaris then here are my top tips:

  1. Install OpenSolaris. At the time of writing the release is 2009.06, install that, parts of this advice may become obsolete with later releases. Do not install Solaris 10 or even worse Nevada. You should download the live CD and burn it onto a disk boot that and let it install but before you start the install read the next tip.

  2. Before you start the install open a terminal so that you can turn on compression on the root pool once it it created. You have to keep running “zpool list” until you see the pool is created and then run (pfexec zfs set compression=on rpool). You may think that disk is big but after a few months you will be needing every block you can get. Also laptop drives are so slow that compression will probably make things faster.

  3. Before you do anything after installation take a snapshot of the system so you can always go back (pfexec beadm create opensolaris@initialinstall). I really mean this.

  4. Add the extras repository. It contains virtualbox, the flash plugin for firefox, true type fonts and more. All you need is a sun online account. See https://pkg.sun.com/register/ and http://blogs.sun.com/chrisg/entry/installing_support_certificates_in_opensolaris

  5. Decide whether you want to use the development or support repository. If in doubt choose the supported one. Sun employees get access to the support repository. Customers need to get a support contract. (http://www.opensolaris.com/learn/subscriptions/). Then update to the latest bigs (pfexec pkg image-update).

  6. Add any extra packages you need. Since I am now writing this retrospectively there may be things missing. My starting list is:

    • OpenOffice (pfexec pkg install openoffice)

    • SunStudio (pfexec pkg install sunstudioexpress)

    • Netbeans (pfexec pkg install netbeans)

    • Flash (pkfexec pkg install flash)

    • Virtualbox (pfexec pkg install virtualbox)

    • TrueType fonts (pfxec pkg install ttf-fonts-core)

  7. If you are a Sun Employee install the punchin packages so you can access SWAN. I actually rarely use this as I have a Solaris 10 virtualbox image that I use for punchin so I can be both on and off SWAN at the same time but it is good to have the option.

  8. Add you keys to firefox so that you can browse the extras and support repositories from firefox. See http://wikis.sun.com/display/OpenSolarisInfo200906/How+to+Browse+the+Support+and+Extra+Repositories.

  9. Go to Fluendo and get and install the free mp3 decoder. They also sell a complete and legal set of decoders for the major video formats, I have them and have been very happy with them. They allow me to view the videos I have cycling events.

  10. Go to Adobe and get acroread. I live in hope that at some point this will be in a repository either at Sun or one Adobe runs so that it can be installed using the standard pkg commands but until then do it by hand.


Tuesday Jun 30, 2009

Using dtrace to track down memory leaks

I've been working with a customer to try and find a memory “leak” in their application. Many things have been tried, libumem, and the mdb ::findleaks command all with no success.

So I was, as I am sure others before me have, pondering if you could use dtrace to do this. Well I think you can. I have a script that puts probes into malloc et al and counts how often they are called by this thread and when they are freed often free is called.

Then in the entry probe of the target application note away how many calls there have been to the allocators and how many to free and with a bit of care realloc. Then in the return probe compare the number of calls to allocate and free with the saved values and aggregate the results. The principle is that you find the routines that are resulting in allocations that they don't clear up. This should give you a list of functions that are possible leakers which you can then investigate1.

Using the same technique I for getting dtrace to “follow fork” that I described here I ran this up on diskomizer, a program that I understand well and I'm reasonably sure does not have systemic memory leaks. The dtrace script reports three sets of results.

  1. A count of how many times each routine and it's descendents have called a memory allocator.

  2. A count of how many times each routine and it's descendents have called free or realloc with a non NULL pointer as the first argument.

  3. The difference between the two numbers above.

Then with a little bit of nawk to remove all the functions for which the counts are zero gives:

# /usr/sbin/dtrace -Z -wD TARGET_OBJ=diskomizer2 -o /tmp/out-us \\
	-s /tmp/followfork.d \\
	-Cs /tmp/allocated.d -c \\
         "/opt/SUNWstc-diskomizer/bin/sparcv9/diskomizer -f /devs -f background \\
          -o background=0 -o SECONDS_TO_RUN=1800"
dtrace: failed to compile script /tmp/allocated.d: line 20: failed to create entry probe for 'realloc': No such process
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
dtrace: buffer size lowered to 25m
# nawk '$1 != 0 { print  $0 }' < /tmp/out.3081
           1 diskomizer`do_dev_control
           1 diskomizer`set_dev_state
           1 diskomizer`set_state
           3 diskomizer`report_exit_reason
           6 diskomizer`alloc_time_str
           6 diskomizer`alloc_time_str_fmt
           6 diskomizer`update_aio_read_stats
           7 diskomizer`cancel_all_io
           9 diskomizer`update_aio_write_stats
          13 diskomizer`cleanup
          15 diskomizer`update_aio_time_stats
          15 diskomizer`update_time_stats
          80 diskomizer`my_calloc
         240 diskomizer`init_read
         318 diskomizer`do_restart_stopped_devices
         318 diskomizer`start_io
         449 diskomizer`handle_write
         606 diskomizer`do_new_write
        2125 diskomizer`handle_read_then_write
        2561 diskomizer`init_buf
        2561 diskomizer`set_io_len
       58491 diskomizer`handle_read
       66255 diskomizer`handle_write_then_read
      124888 diskomizer`init_read_buf
      124897 diskomizer`do_new_read
      127460 diskomizer`expect_signal
           1 diskomizer`expect_signal
           3 diskomizer`report_exit_reason
           4 diskomizer`close_and_free_paths
           6 diskomizer`update_aio_read_stats
           9 diskomizer`update_aio_write_stats
          11 diskomizer`cancel_all_io
          15 diskomizer`update_aio_time_stats
          15 diskomizer`update_time_stats
          17 diskomizer`cleanup
         160 diskomizer`init_read
         318 diskomizer`do_restart_stopped_devices
         318 diskomizer`start_io
         442 diskomizer`handle_write
         599 diskomizer`do_new_write
        2125 diskomizer`handle_read_then_write
        2560 diskomizer`init_buf
        2560 diskomizer`set_io_len
       58491 diskomizer`handle_read
       66246 diskomizer`handle_write_then_read
      124888 diskomizer`do_new_read
      124888 diskomizer`init_read_buf
      127448 diskomizer`cancel_expected_signal
     -127448 diskomizer`cancel_expected_signal
          -4 diskomizer`cancel_all_io
          -4 diskomizer`cleanup
          -4 diskomizer`close_and_free_paths
           1 diskomizer`do_dev_control
           1 diskomizer`init_buf
           1 diskomizer`set_dev_state
           1 diskomizer`set_io_len
           1 diskomizer`set_state
           6 diskomizer`alloc_time_str
           6 diskomizer`alloc_time_str_fmt
           7 diskomizer`do_new_write
           7 diskomizer`handle_write
           9 diskomizer`do_new_read
           9 diskomizer`handle_write_then_read
          80 diskomizer`init_read
          80 diskomizer`my_calloc
      127459 diskomizer`expect_signal


From the above you can see that there are two functions that create and free the majority of the allocations and the allocations almost match each other, which is expected as they are effectively constructor and destructor for each other. The small mismatch is not unexpected in this context.

However it is the vast number of functions that are not listed at all as they and their children make no calls to the memory allocator or have exactly matching allocation and free that are important here. Those are the functions that we have just ruled out.

From here it is easy now to drill down on the functions that are interesting you, ie the ones where there are unbalanced allocations.

I've uploaded the files allocated.d and followfork.d so you can see the details. If you find it useful then let me know.

1Unfortunately the list is longer than you want as on SPARC it includes any functions that don't have their own stack frame due to the way dtrace calculates ustackdepth, which the script makes use of.

2The script only probes particular objects, in this case the main diskomizer binary, but you can limit it to a particular library or even a particular set of entry points based on name if you edit the script.

Saturday Jun 27, 2009

Follow fork for dtrace pid provider?

There is a ongoing request to have follow fork functionality for the dtrace pid provider but so far no one has stood upto the plate for that RFE. In the mean time my best workaround for this is this:

cjg@brompton:~/lang/d$ cat followfork.d
/ppid == $target/
	printf("fork %d\\n", pid);
	system("dtrace -qs child.d -p %d", pid);
cjg@brompton:~/lang/d$ cat child.d
	printf("%d %s:%s %d\\n", pid, probefunc, probename, ustackdepth)
cjg@brompton:~/lang/d$ pfexec /usr/sbin/dtrace -qws followfork.d -s child.d -p 26758
26758 malloc:entry 22
26758 malloc:entry 15
26758 malloc:entry 18
26758 malloc:entry 18
26758 malloc:entry 18
fork 27548
27548 malloc:entry 7
27548 malloc:entry 7
27548 malloc:entry 18
27548 malloc:entry 16
27548 malloc:entry 18

Clearly you can have the child script do what ever you wish.

Better solutions are welcome!

Thursday Jun 18, 2009

Diskomizer Open Sourced

I'm pleased to announce the Diskomizer test suite has been open sourced. Diskomizer started life in the dark days before ZFS when we lived in a world full1 of bit flips, phantom writes, phantom reads, misplaced writes and misplaced reads.

With a storage architecture that does not use end to end data verification the best that you can hope for was that your application will spot errors quickly and allow you to diagnose the broken part or bug quickly. Diskomizer was written to be a “simple” application that could verify all the data paths worked correctly and worked correctly under extreme load. It has been and is used by support, development and test groups for system verification.

For more details of what Diskomizer is and how to build and install read these pages:


You can download the source and precompiled binaries from:


and can browse the source here:


Using Diskomizer

First remember in most cases Diskomizer will destroy all the data on any target you point it at. So extreme care is advised.

I will say that again.

Diskomizer will destroy all the data on any target that you point it at.

For the purposes of this explanation I am going to use ZFS volumes so that I can create and destroy them with confidence that I will not be destroying someone's data.

First lets create some volumes.

# i=0
# while (( i < 10 ))
zfs create -V 10G storage/chris/testvol$i
let i=i+1

Now write the names of the devices you wish to test into a file after the key “DEVICE=”:

# echo DEVICE= /dev/zvol/rdsk/storage/chris/testvol\* > test_opts

Now start the test. When you installed diskomizer it put the standard option files on the system and has a search path so that it can find them. I'm using the options file “background” which will make the test go into the back ground redirecting the output into a file called “stdout” and any errors into a file called “stderr”:

# /opt/SUNWstc-diskomizer/bin/diskomizer -f test_opts -f background

If Diskomizer has any problems with the configuration it will report them and exit. This is to minimize the risk to your data from a typo. Also the default is to open devices and files exclusively to again reduce the danger to your data (and to reduce false positives where it detects data corruption).

Once up and running it will report it's progress for each process in the output file:

# tail -5 stdout
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol7 (zvol0:a)2 write times (0.000,0.049,6.068) 100%
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol1 (zvol0:a) write times (0.000,0.027,6.240) 100%
PID 1152: INFO /dev/zvol/rdsk/storage/chris/testvol7 (zvol0:a) read times (0.000,1.593,6.918) 100%
PID 1154: INFO /dev/zvol/rdsk/storage/chris/testvol9 (zvol0:a) write times (0.000,0.070,6.158)  79%
PID 1151: INFO /dev/zvol/rdsk/storage/chris/testvol0 (zvol0:a) read times (0.000,0.976,7.523) 100%

meanwhile all the usual tools can be used to view the IO:

# zpool iostat 5 5                                                  
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage      460G  15.9T    832  4.28K  6.49M  31.2M
storage      460G  15.9T  3.22K  9.86K  25.8M  77.2M
storage      460G  15.9T  3.77K  6.04K  30.1M  46.8M
storage      460G  15.9T  2.90K  11.7K  23.2M  91.4M
storage      460G  15.9T  3.63K  5.86K  29.1M  45.7M

1Full may be an exaggeration but we will never know thanks to the fact that the data loss was silent. There were enough cases reported where there was reason to doubt whether the data was good to keep me busy.

2The fact that all the zvols have the same name (zvol0:a) is bug 6851545 found with diskomizer.

Sunday Jun 07, 2009

OpenSolaris 2009.06

After a week of running 2009.06 on my Toshiba Tecra M9 having upgraded from 2008.11 I'm in a position to comment on it. I've been able to remove all the workarounds I had on the system. Nwam appears to work even in the face a suspend and resume. Removalbe media also appears to be robust without the occasional panics that would happen when I removed the SD card with 2008.11.

Feature wise the things I have noticed are the new tracker system for searching files, but it seems to be completely non functional. The big improvements are in the support for the special keys on the Toshiba and the volume control, which unlike the volume on the M2 is a logical control so requires software support. 2009.06 has this support along with support fo the number pad, brightness and mute buttons.

The downside was hitting this bug. This pretty much renders resume useless and I was about to go back to 2008.06 when the bug was updated to say it will be fixed in the first update release and in the mean time there are binary packages. So after creating a new boot enviroment so that I have an unpatched one to switch to when the fix gets into the support repository I have applied the patch. Seems to work which is very pleasing as it has not taken me long to get used to the brightness buttons working.

Tuesday May 26, 2009

Why everyone should be using ZFS

It is at times like these that I'm glad I use ZFS at home.

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested

        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c20t0d0s7  ONLINE       6     0     4
            c21t0d0s7  ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c21t1d0    ONLINE       0     0     0
            c20t1d0    ONLINE       0     0     0

errors: No known data errors
: pearson FSS 14 $; 

The drive with the errors was also throwing up errors that iostat could report and from it's performance was trying heroicially to give me back data. However it had failed. It's performance was terrible and then it failed to give the right data on 4 occasions. Anyother file system would, if that was user data, just had deliviered it to the user without warning. That bad data could then propergate from there on, probably into my backups. There is certainly no good that could come from that. However ZFS detected and corrected the errors.

Now I have offlined the disk the performance of the system is better but I have no redundancy until the new disk I have just ordered arriaves. Now time to check out Seagate's warranty return system.

Sunday May 10, 2009

Another update to Sun Ray access hours script

I have made a change to up access hours script for my Sun Rays. Now the access file can also contain a comma separated list of Sun Ray DTUs so that the control is only applied to those DTUs:

: pearson FSS 3 $; cat /etc/opt/local/access_hours 
: pearson FSS 4 $; 

The practical reason for this is that it allows control of DTUs that are in bedrooms but if the computer is really needed another DTU can be used for homework.

Now that bug 6791062 is fixed the script is safe to use in nevada.

The script is where it always was.

Friday May 08, 2009

Stopping find searching remote directories.

Grizzled UNIX users look away now.

The find command is a wonderful thing but there are some uses of it that seem to cause confusion enough that it seems worth documenting them for google. Today's is:

How can I stop find(1) searching remote file systems?

On reading the documentation the “-local” option should be just what you want and it is, but not on it's own. If you just do:

$ find . -local -print

It will indeed only report on files that are on local file systems below the current directory. However it will search the entire directory tree for those local files even if the directory tree is on NFS

To get find to stop searching when it finds a remote file system you need:

$ find . \\( ! -local -prune \\) -o -print


Friday May 01, 2009

Installing support certificates in OpenSolaris

For some reason you only get the instructions on how to install a certificate to get access to supported or extras updates on your OpenSolaris system after you have downloaded the certificate. Not a big issue as that is generally when you want the instructions. However if you already have your certificates and now want to install them on another system (that you have support for) you can't get the instructions without getting another certificate.

So here are the instructions cut'n'pasted from the support page, as much for me as for you:

How to Install this OpenSolaris 2008.11 standard support Certificate

  1. Download the provided key and certificate files, called OpenSolaris_2008.11_standard_support.key.pem andOpenSolaris_2008.11_standard_support.certificate.pem using the buttons above. Don't worry if you get logged out, or lose the files. You can come back to this site later and re-download them. We'll assume that you downloaded these files into your Desktop folder,~/Desktop/.

  2. Use the following comands to make a directory inside of /var/pkg to store the key and certificate, and copy the key and certificate into this directory. The key files are kept by reference, so if the files become inaccessible to the packaging system, you will encounter errors. Here is how to do it:

            $ pfexec mkdir -m 0755 -p /var/pkg/ssl
            $ pfexec cp -i ~/Desktop/OpenSolaris_2008.11_standard_support.key.pem /var/pkg/ssl
            $ pfexec cp -i ~/Desktop/OpenSolaris_2008.11_standard_support.certificate.pem /var/pkg/ssl 

  3. Add the publisher:

            $ pfexec pkg set-authority \\
               -k /var/pkg/ssl/OpenSolaris_2008.11_standard_support.key.pem \\
               -c /var/pkg/ssl/OpenSolaris_2008.11_standard_support.certificate.pem \\
               -O https://pkg.sun.com/opensolaris/support/ opensolaris.org 
  4. To see the packages supplied by this authority, try:

            $ pkg list -a 'pkg://opensolaris.org/\*' 

If you use the Package Manager graphical application, you will be able to locate the newly discovered packages when you restart Package Manager.

How to Install this OpenSolaris extras Certificate

  1. Download the provided key and certificate files, called OpenSolaris_extras.key.pem and OpenSolaris_extras.certificate.pem using the buttons above. Don't worry if you get logged out, or lose the files. You can come back to this site later and re-download them. We'll assume that you downloaded these files into your Desktop folder, ~/Desktop/.

  2. Use the following comands to make a directory inside of /var/pkg to store the key and certificate, and copy the key and certificate into this directory. The key files are kept by reference, so if the files become inaccessible to the packaging system, you will encounter errors. Here is how to do it:

            $ pfexec mkdir -m 0755 -p /var/pkg/ssl
            $ pfexec cp -i ~/Desktop/OpenSolaris_extras.key.pem /var/pkg/ssl
            $ pfexec cp -i ~/Desktop/OpenSolaris_extras.certificate.pem /var/pkg/ssl
  3. Add the publisher:

            $ pfexec pkg set-authority \\
                -k /var/pkg/ssl/OpenSolaris_extras.key.pem \\
                -c /var/pkg/ssl/OpenSolaris_extras.certificate.pem \\
                -O https://pkg.sun.com/opensolaris/extra/ extra
  4. To see the packages supplied by this authority, try:

            $ pkg list -a 'pkg://extra/\*'

    If you use the Package Manager graphical application, you will be able to locate the newly discovered packages when you restart Package Manager.

Monday Apr 20, 2009

Off to Newcastle for the mash up

This is worse than being on a mobile, but I'm on the train o.k.?

Tomorrow I will be a the System Admin Mash up event at Newcastle. If you are going to be there I suggest you don't bother asking us about Sun/Oracle and instead go straight to www.oracle.com/sun then you will know as much as us.

Sunday Apr 19, 2009

User and group quotas for ZFS!

This push will be very popular amoung those who are managing servers with thousands of users:

Repository: /export/onnv-gate
Total changesets: 1

Changeset: f41cf682d0d3

PSARC/2009/204 ZFS user/group quotas & space accounting
6501037 want user/group quotas on ZFS
6830813 zfs list -t all fails assertion
6827260 assertion failed in arc_read(): hdr == pbuf->b_hdr
6815592 panic: No such hold X on refcount Y from zfs_znode_move
6759986 zfs list shows temporary %clone when doing online zfs recv

User quotas for zfs has been the feature I have been asked about most when talking to customers. This probably relfects that most customers are simply blown away by the other features of ZFS and the only missing feature was user quotas if you have a large user base.

Tuesday Apr 14, 2009

zfs list -d

I've just pushed the changes for zfs list that give it a -d option to limit the depth to which recursive listings will go. This is of most use when you wish to list the snapshots of a given data set and only the snapshots of that data set.

PSARC 2009/171 zfs list -d and zfs get -d
6762432 zfs list --depth

Before this you could achieve this using a short pipe line which while it produced the correct results was horribly inefficient and very slow for datasets that had lots of descendents.

: v4u-1000c-gmp03.eu TS 6 $; zfs list -t snapshot rpool | grep '\^rpool@'
rpool@spam                         0      -    64K  -
rpool@two                          0      -    64K  -
: v4u-1000c-gmp03.eu TS 7 $; zfs list -d 1 -t snapshot              
rpool@spam      0      -    64K  -
rpool@two       0      -    64K  -
: v4u-1000c-gmp03.eu TS 8 $; 

It will allow the zfs-snapshot service to be much more efficient when it needs to list snapshots. The change will be in build 113.

Friday Apr 10, 2009

Native solaris printing working

The solution to my HP printer not working on Solaris is to use CUPS. Since I have a full install and solaris now has the packages installed all I had to do was switch the service over:

$ pfexec /usr/sbin/print-service -s cups

then configure the printer using http://localhosts:631 in the same way as I did with ubunto. Now I don't need the virtual machine running which is a bonus. I think cups may be a better fit for me at home since I don't have a nameservice running so the benefits of the lp system are very limited.

The bug is 6826086: printer drivers from package SUNWhpijs are not update with HPLIP

Saturday Mar 28, 2009

snapshot on unlink?

This thread on OpenSolaris made me wonder how hard it would be to take a snapshot before any file is deleted. It turns out that using dtrace it is not hard at all. Using dtrace to monitor unlink and unlinkat calls and a short script to take the snapshots:


function snapshot
	eval $(print x=$2)

	until [[ "$x" == "/" || -d "$x/.zfs/snapshot" ]]
	if [[ "$x" == "/" || "$x" == "/tmp" ]]
	if [[ -d "$x/.zfs/snapshot" ]]
		print mkdir "$x/.zfs/snapshot/unlink_$1"
		pfexec mkdir "$x/.zfs/snapshot/unlink_$1"
function parse
	eval $(print x=$4)
	if [[ "${x%%/\*}" == "" ]]
		snapshot $1 "$2$4"
		snapshot $1 "$2$3/$4"
pfexec dtrace -wqn 'syscall::fsat:entry /pid != '$$' && uid > 100 && arg0 == 5/ {
	printf("%d %d \\"%s\\" \\"%s\\" \\"%s\\"\\n",
	pid, walltimestamp, root, cwd, copyinstr(arg2)); stop()
syscall::unlink:entry /pid != '$$' && uid > 100 / {
	printf("%d %d \\"%s\\" \\"%s\\" \\"%s\\"\\n",
	pid, walltimestamp, root, cwd, copyinstr(arg0)); stop()
}' | while read pid timestamp root cwd file
	print prun $pid
	parse $timestamp $root $cwd $file
	pfexec prun $pid

Now this is just a Saturday night proof of concept and it should be noted it has a significant performance impact and single threads all calls to unlink.

Also you end up with lots of snapshots:

cjg@brompton:~$ zfs list -t snapshot -o name,used | grep unlink

rpool/export/home/cjg@unlink_1238270760978466613                           11.9M

rpool/export/home/cjg@unlink_1238275070771981963                             59K

rpool/export/home/cjg@unlink_1238275074501904526                             59K

rpool/export/home/cjg@unlink_1238275145860458143                             34K

rpool/export/home/cjg@unlink_1238275168440000379                            197K

rpool/export/home/cjg@unlink_1238275233978665556                            197K

rpool/export/home/cjg@unlink_1238275295387410635                            197K

rpool/export/home/cjg@unlink_1238275362536035217                            197K

rpool/export/home/cjg@unlink_1238275429554657197                            136K

rpool/export/home/cjg@unlink_1238275446884300017                            350K

rpool/export/home/cjg@unlink_1238275491543380576                            197K

rpool/export/home/cjg@unlink_1238275553842097361                            197K

rpool/export/home/cjg@unlink_1238275643490236001                             63K

rpool/export/home/cjg@unlink_1238275644670212158                             63K

rpool/export/home/cjg@unlink_1238275646030183268                               0

rpool/export/home/cjg@unlink_1238275647010165407                               0

rpool/export/home/cjg@unlink_1238275648040143427                             54K

rpool/export/home/cjg@unlink_1238275649030124929                             54K

rpool/export/home/cjg@unlink_1238275675679613928                            197K

rpool/export/home/cjg@unlink_1238275738608457151                            198K

rpool/export/home/cjg@unlink_1238275800827304353                           57.5K

rpool/export/home/cjg@unlink_1238275853116324001                           32.5K

rpool/export/home/cjg@unlink_1238275854186304490                           53.5K

rpool/export/home/cjg@unlink_1238275862146153573                            196K

rpool/export/home/cjg@unlink_1238275923255007891                           55.5K

rpool/export/home/cjg@unlink_1238275962114286151                           35.5K

rpool/export/home/cjg@unlink_1238275962994267852                           56.5K

rpool/export/home/cjg@unlink_1238275984723865944                           55.5K

rpool/export/home/cjg@unlink_1238275986483834569                             29K

rpool/export/home/cjg@unlink_1238276004103500867                             49K

rpool/export/home/cjg@unlink_1238276005213479906                             49K

rpool/export/home/cjg@unlink_1238276024853115037                           50.5K

rpool/export/home/cjg@unlink_1238276026423085669                           52.5K

rpool/export/home/cjg@unlink_1238276041792798946                           50.5K

rpool/export/home/cjg@unlink_1238276046332707732                           55.5K

rpool/export/home/cjg@unlink_1238276098621721894                             66K

rpool/export/home/cjg@unlink_1238276108811528303                           69.5K

rpool/export/home/cjg@unlink_1238276132861080236                             56K

rpool/export/home/cjg@unlink_1238276166070438484                             49K

rpool/export/home/cjg@unlink_1238276167190417567                             49K

rpool/export/home/cjg@unlink_1238276170930350786                             57K

rpool/export/home/cjg@unlink_1238276206569700134                           30.5K

rpool/export/home/cjg@unlink_1238276208519665843                           58.5K

rpool/export/home/cjg@unlink_1238276476484690821                             54K

rpool/export/home/cjg@unlink_1238276477974663478                             54K

rpool/export/home/cjg@unlink_1238276511584038137                           60.5K

rpool/export/home/cjg@unlink_1238276519053902818                             71K

rpool/export/home/cjg@unlink_1238276528213727766                             62K

rpool/export/home/cjg@unlink_1238276529883699491                             47K

rpool/export/home/cjg@unlink_1238276531683666535                           3.33M

rpool/export/home/cjg@unlink_1238276558063169299                           35.5K

rpool/export/home/cjg@unlink_1238276559223149116                           62.5K

rpool/export/home/cjg@unlink_1238276573552877191                           35.5K

rpool/export/home/cjg@unlink_1238276584602668975                           35.5K

rpool/export/home/cjg@unlink_1238276586002642752                             53K

rpool/export/home/cjg@unlink_1238276586522633206                             51K

rpool/export/home/cjg@unlink_1238276808718681998                            216K

rpool/export/home/cjg@unlink_1238276820958471430                           77.5K

rpool/export/home/cjg@unlink_1238276826718371992                             51K

rpool/export/home/cjg@unlink_1238276827908352138                             51K

rpool/export/home/cjg@unlink_1238276883227391747                            198K

rpool/export/home/cjg@unlink_1238276945366305295                           58.5K

rpool/export/home/cjg@unlink_1238276954766149887                           32.5K

rpool/export/home/cjg@unlink_1238276955946126421                           54.5K

rpool/export/home/cjg@unlink_1238276968985903108                           52.5K

rpool/export/home/cjg@unlink_1238276988865560952                             31K

rpool/export/home/cjg@unlink_1238277006915250722                           57.5K

rpool/export/home/cjg@unlink_1238277029624856958                             51K

rpool/export/home/cjg@unlink_1238277030754835625                             51K

rpool/export/home/cjg@unlink_1238277042004634457                           51.5K

rpool/export/home/cjg@unlink_1238277043934600972                             52K

rpool/export/home/cjg@unlink_1238277045124580763                             51K

rpool/export/home/cjg@unlink_1238277056554381122                             51K

rpool/export/home/cjg@unlink_1238277058274350998                             51K

rpool/export/home/cjg@unlink_1238277068944163541                             59K

rpool/export/home/cjg@unlink_1238277121423241127                           32.5K

rpool/export/home/cjg@unlink_1238277123353210283                           53.5K

rpool/export/home/cjg@unlink_1238277136532970668                           52.5K

rpool/export/home/cjg@unlink_1238277152942678490                               0

rpool/export/home/cjg@unlink_1238277173482320586                               0

rpool/export/home/cjg@unlink_1238277187222067194                             49K

rpool/export/home/cjg@unlink_1238277188902043005                             49K

rpool/export/home/cjg@unlink_1238277190362010483                             56K

rpool/export/home/cjg@unlink_1238277228691306147                           30.5K

rpool/export/home/cjg@unlink_1238277230021281988                           51.5K

rpool/export/home/cjg@unlink_1238277251960874811                             57K

rpool/export/home/cjg@unlink_1238277300159980679                           30.5K

rpool/export/home/cjg@unlink_1238277301769961639                             50K

rpool/export/home/cjg@unlink_1238277302279948212                             49K

rpool/export/home/cjg@unlink_1238277310639840621                             28K

rpool/export/home/cjg@unlink_1238277314109790784                           55.5K

rpool/export/home/cjg@unlink_1238277324429653135                             49K

rpool/export/home/cjg@unlink_1238277325639636996                             49K

rpool/export/home/cjg@unlink_1238277360029166691                            356K

rpool/export/home/cjg@unlink_1238277375738948709                           55.5K

rpool/export/home/cjg@unlink_1238277376798933629                             29K

rpool/export/home/cjg@unlink_1238277378458911557                             50K

rpool/export/home/cjg@unlink_1238277380098888676                             49K

rpool/export/home/cjg@unlink_1238277397738633771                             48K

rpool/export/home/cjg@unlink_1238277415098386055                             49K

rpool/export/home/cjg@unlink_1238277416258362893                             49K

rpool/export/home/cjg@unlink_1238277438388037804                             57K

rpool/export/home/cjg@unlink_1238277443337969269                           30.5K

rpool/export/home/cjg@unlink_1238277445587936426                           51.5K

rpool/export/home/cjg@unlink_1238277454527801430                           50.5K

rpool/export/home/cjg@unlink_1238277500967098623                            196K

rpool/export/home/cjg@unlink_1238277562866135282                           55.5K

rpool/export/home/cjg@unlink_1238277607205456578                             49K

rpool/export/home/cjg@unlink_1238277608135443640                             49K

rpool/export/home/cjg@unlink_1238277624875209357                             57K

rpool/export/home/cjg@unlink_1238277682774484369                           30.5K

rpool/export/home/cjg@unlink_1238277684324464523                             50K

rpool/export/home/cjg@unlink_1238277685634444004                             49K

rpool/export/home/cjg@unlink_1238277686834429223                           75.5K

rpool/export/home/cjg@unlink_1238277700074256500                             48K

rpool/export/home/cjg@unlink_1238277701924235244                             48K

rpool/export/home/cjg@unlink_1238277736473759068                           49.5K

rpool/export/home/cjg@unlink_1238277748313594650                           55.5K

rpool/export/home/cjg@unlink_1238277748413593612                             28K

rpool/export/home/cjg@unlink_1238277750343571890                             48K

rpool/export/home/cjg@unlink_1238277767513347930                           49.5K

rpool/export/home/cjg@unlink_1238277769183322087                             50K

rpool/export/home/cjg@unlink_1238277770343306935                             48K

rpool/export/home/cjg@unlink_1238277786193093885                             48K

rpool/export/home/cjg@unlink_1238277787293079433                             48K

rpool/export/home/cjg@unlink_1238277805362825259                           49.5K

rpool/export/home/cjg@unlink_1238277810602750426                            195K

rpool/export/home/cjg@unlink_1238277872911814531                            195K

rpool/export/home/cjg@unlink_1238277934680920214                            195K

rpool/export/home/cjg@unlink_1238277997220016825                            195K

rpool/export/home/cjg@unlink_1238278063868871589                           54.5K

rpool/export/home/cjg@unlink_1238278094728323253                             61K

rpool/export/home/cjg@unlink_1238278096268295499                             63K

rpool/export/home/cjg@unlink_1238278098518260168                             52K

rpool/export/home/cjg@unlink_1238278099658242516                             56K

rpool/export/home/cjg@unlink_1238278103948159937                             57K

rpool/export/home/cjg@unlink_1238278107688091854                             54K

rpool/export/home/cjg@unlink_1238278113907980286                             62K

rpool/export/home/cjg@unlink_1238278116267937390                             64K

rpool/export/home/cjg@unlink_1238278125757769238                            196K

rpool/export/home/cjg@unlink_1238278155387248061                            136K

rpool/export/home/cjg@unlink_1238278160547156524                            229K

rpool/export/home/cjg@unlink_1238278165047079863                            351K

rpool/export/home/cjg@unlink_1238278166797050407                            197K

rpool/export/home/cjg@unlink_1238278168907009714                             55K

rpool/export/home/cjg@unlink_1238278170666980686                            341K

rpool/export/home/cjg@unlink_1238278171616960684                           54.5K

rpool/export/home/cjg@unlink_1238278190336630319                            777K

rpool/export/home/cjg@unlink_1238278253245490904                            329K

rpool/export/home/cjg@unlink_1238278262235340449                            362K

rpool/export/home/cjg@unlink_1238278262915331213                            362K

rpool/export/home/cjg@unlink_1238278264915299508                            285K

rpool/export/home/cjg@unlink_1238278310694590970                             87K

rpool/export/home/cjg@unlink_1238278313294552482                             66K

rpool/export/home/cjg@unlink_1238278315014520386                             31K

rpool/export/home/cjg@unlink_1238278371773568934                            258K

rpool/export/home/cjg@unlink_1238278375673503109                            198K

rpool/export/home/cjg@unlink_1238278440802320314                            138K

rpool/export/home/cjg@unlink_1238278442492291542                           55.5K

rpool/export/home/cjg@unlink_1238278445312240229                           2.38M

rpool/export/home/cjg@unlink_1238278453582077088                            198K

rpool/export/home/cjg@unlink_1238278502461070222                            256K

rpool/export/home/cjg@unlink_1238278564359805760                            256K

rpool/export/home/cjg@unlink_1238278625738732194                           63.5K

rpool/export/home/cjg@unlink_1238278633428599541                           61.5K

rpool/export/home/cjg@unlink_1238278634568579678                            137K

rpool/export/home/cjg@unlink_1238278657838186760                            288K

rpool/export/home/cjg@unlink_1238278659768151784                            223K

rpool/export/home/cjg@unlink_1238278661518121640                            159K

rpool/export/home/cjg@unlink_1238278664378073421                            136K

rpool/export/home/cjg@unlink_1238278665908048641                            138K

rpool/export/home/cjg@unlink_1238278666968033048                            136K

rpool/export/home/cjg@unlink_1238278668887996115                            281K

rpool/export/home/cjg@unlink_1238278670307970765                            227K

rpool/export/home/cjg@unlink_1238278671897943665                            162K

rpool/export/home/cjg@unlink_1238278673197921775                            164K

rpool/export/home/cjg@unlink_1238278674027906895                            164K

rpool/export/home/cjg@unlink_1238278674657900961                            165K

rpool/export/home/cjg@unlink_1238278675657885128                            165K

rpool/export/home/cjg@unlink_1238278676647871187                            241K

rpool/export/home/cjg@unlink_1238278678347837775                            136K

rpool/export/home/cjg@unlink_1238278679597811093                            199K

rpool/export/home/cjg@unlink_1238278687297679327                            197K

rpool/export/home/cjg@unlink_1238278749616679679                            197K

rpool/export/home/cjg@unlink_1238278811875554411                           56.5K


Good job that snapshots are cheap. I'm not going to be doing this all the time but it makes you think what could be done.

Friday Mar 27, 2009

zfs list webrev

I've just posted the webrev for review for an RFE to “zfs list”:

PSARC 2009/171 zfs list -d and zfs get -d
6762432 zfs list --depth

This will allow you to limit the depth to which a recursive listing of zfs file systems will go. This is particularly useful if you only want to list the snapshots of the current file system.

The webrev is here:


Comments welcome.

Sunday Mar 15, 2009

Converting flac encoded audio to mp3.

Solaris has the flac command which will happily decode flac encoded file and metaflac to read the flac meta data but you need to download lame either precompiled or from sourceforge and build it. Then it is a simple matter to convert your flac encoded files:

$ flac -c -d file.flac | lame - file.mp3

you can add various flags to change the data rates and put tags into the mp3. I feel sure someone should have written a script that would convert an entire library but I could not find one so here is the one I wrote to do this:

umask 22

export PATH=/opt/lame/bin:/usr/sbin:/usr/sfw/bin:/usr/bin

function flactags2lametags
        typeset IFS="$IFS ="
        typeset key value
        metaflac --export-tags /dev/fd/1 "$1" | while read key value
                print typeset "${key}=\\"$(print $value| sed 's/["'\\'']/\\\\&/g')\\";"

function cleanup
        if [[ "$FILE" != "" ]]
                rm "$FILE"

function convert_file
        typeset -i ret=0
        typeset f="${1%.flac}"
        if [[ "$f" != "${1}" ]]
                typeset out="${DST}/${f}.mp3"
                if ! [[ -f "${out}" ]] || [[ "$1" -nt "$out" ]]
                        print $out

                        eval $(flactags2lametags "$1") 
                        flac --silent -c -d "$1" | \\
                                lame --quiet -b ${BITRATE} -h \\
                                ${ARTIST:+--ta }"${ARTIST}" \\
                                ${ALBUM:+--tl }"${ALBUM}" \\
                                ${TRACKNUMBER:+--tn }"${TRACKNUMBER}" \\
                                ${DATE:+--ty }"${DATE%%-\*}" \\
                                - "${out}"
        elif [[ $1 != "${1%.jpg}" ]]
                [[ -f "${DST}/${1}" ]] || cp "$1" "${DST}"
        return ${ret}

function do_dir
        typeset i
        for i in "$1"/\*
                if [[ -f "$i" ]]
                        convert_file "$i" || return 1
                elif [[ -d "$i" ]]
                        if test -d "${DST}/$i" || mkdir "${DST}/$i"
                                do_dir "$i" || return 1
        return 0

function usage
        print USAGE: $1 src_dir dest_dist >&2
        exit 1

trap cleanup EXIT

(( $# == 2 )) || usage $0
if test -d "${DST}/$1" || mkdir "${DST}/$1"
        do_dir "$1"
exit $?

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com


« April 2014