Thursday Aug 06, 2009

Monitoring mounts

Sometimes in the course of being a system administrator it is useful to know which file systems are being mounted, when, and which mounts fail and why. While you can turn on the automounter's verbose mode, that only answers the question for the automounter.
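
The automounter-only route, for reference, is the one covered in a later entry on this page: as root, access the magic “=v” file in the root of an indirect automount point, for example:

# /usr/bin/ls /home/=v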

DTrace makes answering the general question a snip:

: exdev.eu FSS 24 $; cat mount_monitor.d                         
#!/usr/sbin/dtrace -qs

fbt::domount:entry
/ args[1]->dir /
{
        /* MS_SYSSPACE (0x8) means the path is already a kernel address */
        self->dir = args[1]->flags & 0x8 ? args[1]->dir :
              copyinstr((intptr_t)args[1]->dir);
}
fbt::domount:return
/ self->dir != 0 /
{
        
        printf("%Y domount ppid %d, %s %s pid %d -> %s", walltimestamp, 
              ppid, execname, self->dir, pid, arg1 == 0 ? "OK" : "failed");
}
fbt::domount:return
/ self->dir != 0 && arg1 == 0/
{
        printf("\\n");
        self->dir = 0;
}
fbt::domount:return
/ self->dir != 0 && arg1 != 0/
{
        printf("errno %d\\n", arg1);
        self->dir = 0;
}
: exdev.eu FSS 25 $; pfexec /usr/sbin/dtrace -qs  mount_monitor.d
2009 Aug  6 12:57:57 domount ppid 0, sched /share/consoles pid 0 -> OK
2009 Aug  6 12:57:59 domount ppid 0, sched /share/chroot pid 0 -> OK
2009 Aug  6 12:58:00 domount ppid 0, sched /share/newsrc pid 0 -> OK
2009 Aug  6 12:58:00 domount ppid 0, sched /share/build2 pid 0 -> OK
2009 Aug  6 12:58:00 domount ppid 0, sched /share/chris_at_play pid 0 -> OK
2009 Aug  6 12:58:00 domount ppid 0, sched /share/ws_eng pid 0 -> OK
2009 Aug  6 12:58:00 domount ppid 0, sched /share/ws pid 0 -> OK
2009 Aug  6 12:58:03 domount ppid 0, sched /home/tx pid 0 -> OK
2009 Aug  6 12:58:04 domount ppid 0, sched /home/fl pid 0 -> OK
2009 Aug  6 12:58:05 domount ppid 0, sched /home/socal pid 0 -> OK
2009 Aug  6 12:58:07 domount ppid 0, sched /home/bur pid 0 -> OK
2009 Aug  6 12:58:23 domount ppid 0, sched /net/e2big.uk/export/install/docs pid 0 -> OK
2009 Aug  6 12:58:23 domount ppid 0, sched /net/e2big.uk/export/install/browser pid 0 -> OK
2009 Aug  6 12:58:23 domount ppid 0, sched /net/e2big.uk/export/install/cdroms pid 0 -> OK
2009 Aug  6 12:59:45 domount ppid 8929, Xnewt /tmp/.X11-pipe/X6 pid 8935 -> OK

In particular that last line, if repeated often, can give you a clue to things not being right.

Tuesday Dec 09, 2008

Automounter verbose toggle and GNU ls

As I have mentioned before, you can turn the automounter's verbose mode on and off by accessing the file "=v" as root in the root of an automount point. Typically this means "/home/=v", so I would usually do:

# ls /home/=v

and the logging would burst into life. However you need to be sure you don't use the GNU ls (which is the default on OpenSolaris); if you do, you will see this in the log file:

t4 Automountd: verbose on 
t4 Automountd: verbose off 

The reason is clear when you truss the ls:

cjg@brompton:/var/crash/brompton$ pfexec truss -o /tmp/tr ls /home/=v
ls: cannot access /home/=v: No such file or directory
cjg@brompton:/var/crash/brompton$ egrep =v /tmp/tr
stat64("/home/=v", 0x0807A20C)                  Err#2 ENOENT
lstat64("/home/=v", 0x0807A20C)                 Err#2 ENOENT
cjg@brompton:/var/crash/brompton$ 

It accesses the file twice, so toggles verbose mode on and then off. I think this is a bug in the GNU ls, since if it did the lstat64 first and it returned ENOENT it would not need to do the stat64 at all. Anyway, the Solaris ls does the right thing:

cjg@brompton:/var/crash/brompton$ pfexec /usr/bin/ls /home/=v
/home/=v: No such file or directory
cjg@brompton:/var/crash/brompton$ tail -1 /var/svc/log/system-filesystem-autofs:default.log
t4      Automountd: verbose on
cjg@brompton:/var/crash/brompton$ 
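
If you want a toggle you cannot get wrong, here is a minimal sketch of a wrapper, assuming /usr/bin/ls is the Solaris ls and /home is an indirect autofs mount point as above:

#!/bin/ksh -p
# Toggle autofs verbose logging with exactly one access of the magic file.
/usr/bin/ls /home/=v > /dev/null 2>&1
exit 0

Run it with pfexec or as root, as before.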

Wednesday May 21, 2008

Using mirror mounts to get a better /net

One problem with the automounter is that when you use the /net mount points to mount a server, if the admin on that server adds a share then your client won't see that share until the automounter times out the mount. That obviously requires that the mounts are unused, which for a large NFS server could never happen.

So given an NFS server host called sa64-zfs-gmp03.eu which is sharing a directory /newpool/cjg, on a client you can do:

#  ls /net/sa64-zfs-gmp03.eu/newpool
cjg
#  ls /net/sa64-zfs-gmp03.eu/newpool/cjg
SPImage         ipmiLog         ppcenv          sel.bin         tmp
SPValueAdd      mcCpu0Core0Log  processLog      summaryLog
evLog           mcCpu1Core0Log  prsLog          swLog
hwLog           mcCpu2Core0Log  pstore          tdulog.tar
# cd  /net/sa64-zfs-gmp03.eu/newpool/cjg
# ls
SPImage         ipmiLog         ppcenv          sel.bin         tmp
SPValueAdd      mcCpu0Core0Log  processLog      summaryLog
evLog           mcCpu1Core0Log  prsLog          swLog
hwLog           mcCpu2Core0Log  pstore          tdulog.tar

However if at this point on the server you create and share a new file system:

# zfs create -o sharenfs=rw newpool/cjg2
# share
-@newpool/cjg   /newpool/cjg   rw   ""  
-@newpool/cjg2  /newpool/cjg2   rw   ""  
# echo foo > /newpool/cjg2/file
# 

You can't now directly access it on the client:

# ls /net/sa64-zfs-gmp03.eu/newpool/cjg2
/net/sa64-zfs-gmp03.eu/newpool/cjg2: No such file or directory
#

Now we all know you can work around this by using aliases for the server or even different capitalization:

# ls /net/SA64-zfs-gmp03.eu/newpool/cjg2
file
# 

however lots of users just won't buy that and I don't blame them.

With the advent of mirror mounts in NFSv4 you can do a lot better, and there is an RFE (4107375) for the automounter to do this for you, which looks like it would be simple on a client that can do mirror mounts. Until that is done, here is a work-around. Create a file “/etc/auto_mirror” that contains this line:

\* &:/

Then add this line to auto_master:

/mirror auto_mirror  -nosuid,nobrowse,vers=4

or add a new key to an existing automount table:

: s4u-nv-gmp03.eu TS 50 $; nismatch mirror auto_share
mirror / -fstype=autofs,nosuid,nobrowse auto_mirror.org_dir.cte.sun.com.
: s4u-nv-gmp03.eu TS 51 $; 

Now if we do the same test, this time replacing the “/net” path with the “/mirror” path, you get:

# ls /mirror/sa64-zfs-gmp03.eu/newpool/
cjg
# ls /mirror/sa64-zfs-gmp03.eu/newpool/cjg
SPImage         ipmiLog         ppcenv          sel.bin         tmp
SPValueAdd      mcCpu0Core0Log  processLog      summaryLog
evLog           mcCpu1Core0Log  prsLog          swLog
hwLog           mcCpu2Core0Log  pstore          tdulog.tar
# (cd /mirror/sa64-zfs-gmp03.eu/newpool/cjg ; sleep 1000000) &
[1]     10455
# ls /mirror/sa64-zfs-gmp03.eu/newpool/cjg2
/mirror/sa64-zfs-gmp03.eu/newpool/cjg2: No such file or directory

At this point I created the new file system on the server and put the file in it; now the client sees it straight away:

# ls /mirror/sa64-zfs-gmp03.eu/newpool/cjg2
file
# 

If you are an entirely NFSv4 shop then you could change the “/net” mount point to use this.
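
For what it is worth, a hedged sketch of what that change might look like: replace the “/net” line in auto_master so it uses the auto_mirror map from above rather than the built-in -hosts map (only sensible if every server you reach this way supports NFSv4 mirror mounts):

/net auto_mirror  -nosuid,nobrowse,vers=4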

Saturday Jul 14, 2007

/net can be evil

I've written before about what a fan I am of the automounter. However the curse of the automounter is laziness. I have covered direct mounts; the next topic is the “/net” mount point.

Like direct automount points, “/net” has its uses; however when it is used without thought it is evil. What has to be thought about is that “/net” quickly leads you into some of the eight fallacies of distributed computing, which I reproduce here from Geoff Arnold's blog:

Essentially everyone, when they first build a distributed application, makes the following eight assumptions. All prove to be false in the long run and all cause big trouble and painful learning experiences.
  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

Now, since when using “/net” you are just a user not a developer, that cuts you some slack with me. However if you are an engineer looking at crash dumps that are many gigabytes in size via “/net”, or uploading from a local tape to and from an NFS server on the other side of the world, or even close by but over a WAN, you need to be aware of fallacies 1, 2, 3 and 7. Then wonder if there is a better way; invariably there is a faster, less resource-hungry way to do this if you can log in to a system closer to the NFS server.

If that is the case then you should get yourself acquainted with some of the options to ssh(1), specifically compression, X11 forwarding and, for the more adventurous, agent forwarding.
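
For example, and these are just generic illustrations with a made-up host name:

$ ssh -C cjg@buildhost.uk        # compression for the slow link
$ ssh -C -X cjg@buildhost.uk     # compression plus X11 forwarding
$ ssh -A cjg@buildhost.uk        # agent forwarding for onward hops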



Tuesday Jun 19, 2007

Where are all the log files?

Today's question was:

Is there a list of all the default log files that are used in Solaris?

Not that I know of. Since most software can be configured to log anywhere you wish, it would be an impossible task to come up with a complete list that was of any practical benefit.

However there are some places to go looking for log files:

  1. The file /etc/syslog.conf will contain the names of logfiles written to via syslog.

  2. The directory /var/svc/log is the default location for log files from SMF. These files are connected to each daemon's standard out and standard error so they can grow.

  3. Then the files in /etc/default define log files for services that are not using syslog, for example /var/adm/sulog. (A quick way to survey all three places is sketched below.)
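
A quick, hedged way to survey those three places on a stock system:

# grep -v '^#' /etc/syslog.conf | awk '{print $NF}' | sort -u
# ls /var/svc/log
# grep -i -l log /etc/default/*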

So having ticked off those log files you need to decide upon a strategy for maintaining them. Mine is to keep 100k of log for each of the logs in /var/svc/log and let logadm(1M) look after them; I keep sulog forever and clean it by hand as I'm paranoid. Configuring logadm to look after the SMF logs is easy:

# -w adds an entry to /etc/logadm.conf, -C1 keeps one old copy, -c copies
# and truncates the live log in place, -s100k rotates once it passes 100k
for i in /var/svc/log/*.log
do
        logadm -w $i -C1 -c -s100k
done

So how can I be sure that there are no more log files out there? You could use find to list all the files modified in the last 24 hours, however that will get you a lot of false positives. Since what is really interesting is the set of active log files in the “/” and “/var” file systems, I can use DTrace to find them by running this script for a few hours:

syscall::write:entry
/ ! (execname == "ksh" && arg0 == 63 ) &&
    fds[arg0].fi_oflags & O_APPEND &&
    (fds[arg0].fi_mount == "/" || fds[arg0].fi_mount == "/var" )/
{
        @logs[fds[arg0].fi_pathname] = count();
        logfiles[ fds[arg0].fi_pathname]++
}
syscall::write:entry
/ logfiles[ fds[arg0].fi_pathname] == 1 &&
    ! (execname == "ksh" && arg0 == 63 ) &&
    fds[arg0].fi_oflags & O_APPEND &&
    (fds[arg0].fi_mount == "/" || fds[arg0].fi_mount == "/var" )/
{
        printf("%s %s", fds[arg0].fi_fs, fds[arg0].fi_pathname);
}

which in half an hour gives me:

# dtrace -s /home/cjg/lang/d/log.d
dtrace: script '/home/cjg/lang/d/log.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0   4575                      write:entry ufs /var/cron/log
  0   4575                      write:entry ufs /var/adm/wtmpx
  0   4575                      write:entry ufs /var/adm/sulog
  0   4575                      write:entry ufs /var/adm/messages
  0   4575                      write:entry ufs /var/apache2/logs/access_log
  0   4575                      write:entry ufs /var/svc/log/system-filesystem-autofs:default.log
  0   4575                      write:entry ufs /var/log/syslog
  0   4575                      write:entry ufs /var/log/exim/mainlog
^C

  /var/adm/messages                                                 1
  /var/adm/sulog                                                    2
  /var/adm/wtmpx                                                    2
  /var/svc/log/system-filesystem-autofs:default.log                 4¹
  /var/apache2/logs/access_log                                      7
  /var/log/exim/mainlog                                            28
  /var/log/syslog                                                  42
  /var/cron/log                                                  16772
# 

Clearly there is still scope for false positives, files in /var/tmp that are opened O_APPEND for example, or if you use a different shell, but it gives a very good starting point.



¹The autofs log file has been written to thanks to me using the well hidden feature of being able to turn automounter verbose mode on and off by accessing, as root, the file “=v” in the root of an indirect mount point. Typically this is “/net/=v”. Equally you can set the trace level by accessing “/net/=N” where N is any integer.

²Cron is so busy because I am still running the test jobs for the timezone-aware cron.

Friday Jun 15, 2007

Direct automount points are evil

As I have said before, I love the automounter. I love that via executable maps you can get it to do things that you really should not be able to do, like restoring files to be a poor man's HSM or archive retrieval solution. I love that automount maps can contain more automount points to form a hierarchy (e.g. from our own automounter set up):

: eagain.eu FSS 10 $; nismatch chroot auto_share
chroot -fstype=autofs auto_chroot.org_dir.eu.cte.sun.com.
: eagain.eu FSS 11 $; 

which allows this.

There is however one feature of the automounter that, while well intentioned and sometimes required, is evil.

I speak of direct automount points.

They are evil for many reasons.

  1. They pollute the name space.

  2. They can't be added dynamically.

  3. They can't be deleted dynamically.

  4. They encourage sloppy administration.

I think it is point 4 that really wins it for me, and I have an example I found today. I found it because my lab system had mounted a load of NFS mount points that it should not have. Now if the server goes away my session hangs, when I had no need to mount the server in the first place. The reason it had mounted them was name space pollution: “find . -xdev ....” triggers the mount of direct mount points but not indirect mount points¹.

What do I mean by sloppy administration? I'll take the example from today:

: eagain.eu FSS 8 $; niscat auto_direct.org_dir | grep cdrom
/cdroms/jumpstart foohost:/export/jumpstart
/cdroms/prefcs -rw foohost:/export/prefcs
/cdroms/fcs -rw foohost:/export/fcs
: eagain.eu FSS 9 $; 


Instead of just one indirect mount point with the entries added to an indirect automount table, we have an extra directory that only contains mount points, which is really the definition of an indirect mount point. That in our case the mount points could live under an existing mount point just adds to the irritation. So now to fix this the automount table has to be updated and then every system in the domain needs rebooting, or at the very least needs the automount command run.
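
To be concrete, here is a hedged sketch of the indirect alternative; the map name auto_cdroms is made up and the entries are the ones from the table above. One entry in auto_master:

/cdroms   auto_cdroms  -nosuid

and then the indirect map auto_cdroms, keyed on the directory name:

jumpstart          foohost:/export/jumpstart
prefcs       -rw   foohost:/export/prefcs
fcs          -rw   foohost:/export/fcs

Adding a fourth share then needs no reboot and no automount run on the clients, since indirect map entries are looked up on demand.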


It should be the goal of every administrator to have an empty auto_direct table.


¹Indirect mounts don't get triggered as they sit below another mount point, so the find stops before reading the directory. So even if the mount point is browseable the mounts don't get triggered.

Thursday Dec 01, 2005

ZFS snapshots meet automounter and have a happy time together

There is a thread going over on zfs-interest where the following question was posed by biker (I live in hope that this is really “cyclist”):

How can I make a snapshot of home in zfs containing the data including the stuff within the user homes (home/ann, home/bob) - like a recursive snapshot.

The only way so far I could think of was
 - copy the directory structure (home/ann, home/bob) to /snap
 - initiate a snapshot of every dataset (home/ann, home/bob)
 - mount each snapshot to the counterpart under /snap
 - run the backup
 - remove the mounts
 - release the snapshots
 - clear /snap

If there is something like a recursive snapshot or user and group quota in the classical sense, the efford needed could be minimized, ...

It got me thinking that, in the absence of a real solution, this should be doable with a script. For the recursive snapshot script we have:

#!/bin/ksh -p

for filesystem in $(zfs list -H -o name -t filesystem)
do
        zfs snapshot $filesystem@$1
done

No prizes there, but what biker wanted was a copy of the file system structure. The problem is that those snapshots each live under the individual file system's .zfs/snapshot directory, so they are spread about.


If only we could mount all of them under one tree. Well, we can, by adding this line to /etc/auto_master:


/backup /root/auto_snapshot

and then this script as /root/auto_snapshot:

#!/bin/ksh -p

PATH=/usr/sbin:/sbin:$PATH

for filesystem in $(zfs list -H -o name -t filesystem)
do
        if [[ -d /$filesystem/.zfs/snapshot/$1 ]]
        then
                fs=${filesystem#*/}
                if [[ ${fs} = ${filesystem} ]]
                then
                        fs=""
                fi
                ANS="${ANS:-}${ANS:+ }/$fs localhost:/$filesystem/.zfs/snapshot/$1"
        fi
done
echo $ANS
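
One hedged reminder: as far as I recall the automounter only treats a local file map as executable when the execute bit is set, so:

# chmod +x /root/auto_snapshot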

Suddenly I can do this:


1071 # ./backup fullbackup
1072 # (cd /backup/fullbackup ; tar cf /mypool/root/backup.tar . )
1073 # df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c0d0s0        7.3G   6.7G   574M    93%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   829M   704K   829M     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/usr/lib/libc/libc_hwcap1.so.1
                       7.3G   6.7G   574M    93%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                   829M   136K   829M     1%    /tmp
swap                   829M    44K   829M     1%    /var/run
mypool                 9.5G   5.8G   2.9G    67%    /mypool
mypool/jfg             9.5G     8K   2.9G     1%    /mypool/jfg
mypool/keg             9.5G    16M   2.9G     1%    /mypool/keg
mypool/rlg             9.5G     8K   2.9G     1%    /mypool/rlg
mypool/stc             9.5G    14M   2.9G     1%    /mypool/stc
/mypool/cg13442        8.8G   5.8G   2.9G    67%    /home/cg13442
/mypool/.zfs/snapshot/fullbackup
                       8.8G   3.9G   4.8G    45%    /backup/fullbackup
/mypool/rlg/.zfs/snapshot/fullbackup
                       4.8G     8K   4.8G     1%    /backup/fullbackup/rlg
/mypool/jfg/.zfs/snapshot/fullbackup
                       4.8G     8K   4.8G     1%    /backup/fullbackup/jfg
/mypool/stc/.zfs/snapshot/fullbackup
                       4.8G    14M   4.8G     1%    /backup/fullbackup/stc
1074 #

The tar backup file now contains the whole of the “fullbackup” snapshot and, apart from the snapshot not really being atomic since each file system is snapped in sequence, this pretty much does what is wanted.


If you were really brave/foolish you could have the automounter executable maps generate the snapshots for you, but that would be a recipe for filling the pool with snapshots. Deleting the snapshots is also a snip:

#!/bin/ksh -p

for filesystem in $(zfs list -H -o name -t filesystem)
do
        zfs destroy $filesystem@$1
done



Tuesday Nov 02, 2004

core files and Friday madness

Last Friday I had one of those escalations from another time zone where all I had was a number of core files from an application and the question was why did it fail.


Now the problem with application core files, prior to Solaris 10, is that they don't contain everything. You have to look at them on a system with the same binary and shared libraries as the one that created them. This is fine in a development environment, as you can just log in to the system where the application was running and you will have the right files. However once you are not in a development environment this quickly ceases to be true. If the files don't match then, depending on which debugger you use, you either get bogus information or errors or both:


dbx /usr/bin/sparcv9/ptree /var/tmp/core.ptree.9847.ep>
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.3' in your .dbxrc
Reading ptree
core file header read successfully
Reading ld.so.1
Reading libproc.so.1
Reading libc.so.1
Reading librtld_db.so.1
Reading libelf.so.1
Reading libdl.so.1
Reading libc_psr.so.1
WARNING!!
A loadobject was found with an unexpected checksum value.
See `help core mismatch' for details, and run `proc -map'
to see what checksum values were expected and found.
dbx: warning: Some symbolic information might be incorrect.
program terminated by signal SEGV (no mapping at the fault address)
0x0000000100001810: main+0x06cc:        cmp      %o0, 0
(dbx) >

Even the warning here is misleading, as it is not just the symbolic information that is incorrect. SEGV from “cmp %o0, 0”, I don't think so.

Luckily for me the customer had sent in an explorer from the system which contained the output from showrev -p. Using this it was a simple matter to install a lab system that matched the customer's and away I went. The reason it was simple is that we have some software to build a jumpstart profile from the output of showrev -p for exactly this kind of scenario.

However the down side of this is the 1 hour 10 minute wait while the system installs and patches itself. In this time I got thinking there must be a better (i.e. faster) way.

Asking the customer for the libraries was one possibility, but the customer was from a different time zone so that would induce a delay.

Since we have all the patches, I wondered if I could automount all the files from all the patches over a loop back mount of the root file system, similar to the chroot we use to create our build systems. It was the first attempt at this that led to last Friday's failure.

However the principle of loop back mounting just the files that were in the patches might not be insane. The script that mounts them only has to run faster than the 1 hour 10 minutes of an install and this could be a winner.
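
The building block is that lofs will quite happily mount one file over another. Here is a hedged sketch of the sort of loop involved, not the actual script; the chroot path is an assumption and /var/tmp/mountlist is assumed to hold pairs of "patch-file-copy path-it-covers", one pair per line, generated from the patches' package maps:

#!/bin/ksh -p
# For each "source destination" pair, lofs-mount the patch's copy of the file
# over the matching path in the chroot area.
CHROOT=/export/chroots/escalation

while read src dst
do
        [[ -f "$src" && -f "${CHROOT}${dst}" ]] || continue
        mount -F lofs "$src" "${CHROOT}${dst}"
done < /var/tmp/mountlist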

So a short script and 1335 loop back mounts later, I have a chroot environment where all is well:

dbx /usr/bin/sparcv9/ptree /var/tmp/core.ptree.9847.eps>
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.3' in your .dbxrc
Reading ptree
core file header read successfully
Reading ld.so.1
Reading libproc.so.1
Reading libc.so.1
Reading librtld_db.so.1
Reading libelf.so.1
Reading libdl.so.1
Reading libc_psr.so.1
program terminated by signal SEGV (no mapping at the fault address)
0x0000000100001810: main+0x06cc:        ld       [%o0], %g2

The script to do all the mounts took 3m41.07s to run so that is a bit quicker than the full install and patch!

Now I will have to tidy this up for general consumption and make it so that it will run via RBAC (chroot will always need some sort of privilege). I've still not given up on getting the automounter to do the majority of the work, mainly as it will save me having to work out how to unmount all the loop backs.

So if you are about to send a core file from an application to Sun to debug, remember to include the output from showrev -p from the system that created the core file, taken at the time the core was produced.

Monday Oct 11, 2004

Automounting ISO images....

One thing that I find irritating on Solaris is not being able to look into ISO images that I have on disk. Obviously I can do this, but it involves being root, knowing about lofiadm and then typing a mount command. It really should not be like that. Well, at least for me, now it is not. I just put the ISO file in ~/isos (or a symbolic link) and then cd /isos/${LOGNAME}/image where image is the name of the ISO, e.g.:


$ ls ~/isos
Trusted_S8_0401_CD1_of_2.img
$ ls /isos/cjg/Trusted_S8_0401_CD1_of_2.img
Copyright          Trusted_Solaris_8
$ df -k  /isos/cjg/Trusted_S8_0401_CD1_of_2.img
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/lofi/1           539590  539590       0   100%    /isos/cjg/Trusted_S8_0401_CD1_of_2.img
$

All this is achieved with three short shell scripts. First, one that will mount an ISO image directly:


#!/bin/ksh -p

function getsubopts
{
	typeset x
	typeset y

	# Split the comma-separated -o sub-options from the end, keeping
	# lofi_backfs for ourselves and collecting the rest for mount -o.
	y=$1
	x=${y##*,}

	while [[ "${x}" != "" ]]
	do
		case $x in
			lofi_backfs=*) BACK_FS=${x#lofi_backfs=} ;;
			*) generic_opts="${generic_opts}${generic_opts:+,}$x" ;;
		esac
		y=${y%${x}}
		y=${y%,}
		x=${y##*,}
	done
}

while getopts Ormqo: c
do
	case $c in
		o) getsubopts "$OPTARG" ;;
		O|m|q|r) opt="${opt}${opt:+ }-$c" ;;
	esac
done

shift $(($OPTIND - 1))

image=$1
dir=$2

lofidev=$(lofiadm ${image} 2> /dev/null)
if [[ "${lofidev}" = "" ]]
then
	lofidev=$(lofiadm -a ${image})
	cleanup="lofiadm -d ${image}"
else
	cleanup=:
fi
fstype=${BACK_FS:-$(fstyp ${lofidev})}

if ! mount -F ${fstype} ${generic_opts:+-o }${generic_opts} ${opt}  $lofidev $dir 
then
	${cleanup}
	exit 1
fi
exit 0

That goes in /usr/lib/fs/isofs/mount.

Now this command:

mount -F isofs /iso_store/Trusted_Solaris/Trusted_S8_0401_CD2_of_2.img /mnt



will mount the ISO image /iso_store/Trusted_Solaris/Trusted_S8_0401_CD2_of_2.img automatically without any need to mess with creating a lofi device by hand. Then I just need a couple of executable automount maps so that I can let the automounter do its stuff.

First /etc/auto_iso:

#!/bin/ksh -p
if ! getent passwd $1 > /dev/null
then
        exit 0
fi
root_dir=/var/run/auto_iso
user_dir=${root_dir}/users
master_script=${root_dir}/master_script
if mkdir -p ${user_dir}
then
        cp /etc/auto_iso_users $master_script
fi
if ! [[ -f  ${user_dir}/$1 ]]
then
        ln $master_script ${user_dir}/$1
fi
echo "/ -fstype=autofs ${user_dir}/$1"
exit 0

This just allows me to have a separate executable map per user, so that that executable map can have some context (the user's login name, which is passed in as the name of the script). The script /etc/auto_iso_users then does the work for the users:

#!/bin/ksh -p

user=${0##*/}

if [[ -f ~${user}/isos/$1 ]]
then
        echo / -fstype=isofs ~${user}/isos/$1
fi
exit 0

Nice simple one to finish with, then I just needed the /etc/auto_master entry:

/isos           auto_iso -nosuid

Or this can be added to another automount point that we already have, like /share, by adding this entry:

isos      / -fstype=autofs /usr/local/share/sh/auto_iso

The only problem with this is that there is no way for the system to automatically destroy the lofi devices when they are unmounted.
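
In the meantime the cleanup is manual. Once the automounter has unmounted the image, deleting the device by hand does the trick; the device name is the one df showed above:

# lofiadm -d /dev/lofi/1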

Update: The lofi devices can be removed using dtrace.

Monday Sep 27, 2004

Solaris automounting to the Nth....

My first introduction to the automounter was when it arrived as part of NSE in the days of SunOS 3.5. It was always kind of neat, but with the advent of autofs it was ready for some real abuse.

From contacts with the development team and comments in bug reports it is now clear that it was never intended that you could do some of the things that you can do with the automounter, and parts of this certainly fall into that category.

Sustaining Solaris

One of the problems with sustaining Solaris is that you have to have a build system for each architecture and release you are sustaining. Needless to say this either leads to you having to install a build system each time or have some form of multi boot or have lots of systems.

We used to opt for lots of systems, but that meant that they were relatively small and so builds would take a long time. Plus they each sat idle for long periods of time.

Reinventing the wheel

To solve this we built two systems, one SPARC and one x86, onto which we restored all the root images of our build servers, each in a sub directory, such that we could chroot into the directories and build any release. Neat, and for the most part it worked very well. Only recently did I discover from a colleague that this was the same as one of the original uses for chroot.

Dr. Marshall Kirk McKusick, private communication: ``According to the SCCS logs, the chroot call was added by Bill Joy on March 18, 1982 approximately 1.5 years before 4.2BSD was released. That was well before we had ftp servers of any sort (ftp did not show up in the source tree until January 1983). My best guess as to its purpose was to allow Bill to chroot into the /4.2BSD build directory and build a system using only the files, include files, etc contained in that tree. That was the only use of chroot that I remember from the early days.''

This all relies on the interface between the kernel and libc not changing for the system calls used in the build, which fortunately for SunOS 5.6 to 5.9 is the case. However for 5.10 we have had to build new systems, but hey we still only have 4 build systems instead of 10. This allows those systems to be really powerful and hence our builds take less time.

Automounting everything

Once all this was done we realised that it would be nice to have our home directories and other mount points available under the chroots, so we loopback mounted the /home autofs mount point from the real root. Then there were other mount points, so we started, but did not get far with, building a /etc/vfstab file that would do this. The revelation was when we used the automounter to mount the chroot areas: we use it to mount all the files and directories from the real root file system to get things working.

Let's look at the automounter map entry for 5.9:

on81
    / -suid,ro cpr-bld.uk:/export/d10/roots/&/$CPU
    /devices            -fstype=lofs,rw /devices
    /dev                -fstype=lofs,rw /dev
    /etc/passwd         -fstype=lofs,ro /etc/passwd
    /etc/shadow         -fstype=lofs,ro /etc/shadow
    /share              -fstype=lofs    /share
    /home               -fstype=lofs    /home
    /vol                -fstype=lofs    /vol
    /usr/dist           -fstype=lofs    /usr/dist
    /var/tmp            -fstype=lofs    /var/tmp
    /var/nis            -fstype=lofs,ro /var/nis
    /var/run            -fstype=lofs    /var/run
    /var/adm/utmpx      -fstype=lofs,ro /var/adm/utmpx
    /var/spool/mqueue   -fstype=lofs    /var/spool/mqueue
    /tmp                -fstype=lofs    /tmp
    /export             -fstype=lofs    /export
    /local              -fstype=lofs,ro /local
    /local/root         -fstype=lofs,ro /
    /opt/cprbld         -ro             cpr-bld.uk:/export/d10/roots/cprhome/$CPU
    /ws                 -fstype=lofs    /ws
    /net                -fstype=lofs    /net
    /usr/local          -fstype=lofs    /usr/local
    /proc               -fstype=lofs    /proc
    /opt/SUNWspro       -fstype=lofs,ro /share/on81-patch-tools/SUNWspro/SC6.1
    /opt/teamware       -fstype=lofs,ro /share/on81-patch-tools/teamware
    /opt/onbld          -fstype=lofs,ro /share/on81-patch-tools/onbld
    /etc/mnttab         -fstype=lofs,ro /etc/mnttab
    /var/spool/clientmqueue -fstype=lofs,rw /var/spool/clientmqueue
    /share/SUNWspro_latest -fstype=lofs /share/eu/lang/solaris/$CPU/links_perOS/latest_5.9/SUNWspro
    /share/SUNWspro_prefcs -fstype=lofs /share/eu/lang/solaris/$CPU/links_perOS/prefcs_5.9/SUNWspro

Now you can see we are automounting all sorts of things that you may not expect, in particular /etc/passwd and /etc/shadow so that we get the same password entries as the host system. In our world /home and /share are automount points, but since the automounter runs in the real root, automount maps that contain $OSROOT to select a particular OS-specific mount point get the wrong entry when in the chroot. Hence the last two entries, /share/SUNWspro_latest and /share/SUNWspro_prefcs, which pin the 5.9 versions explicitly.
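
As a hedged illustration of how it then gets used (the path is an assumption, based on the auto_chroot map under /share/chroot mentioned elsewhere on this page):

# chroot /share/chroot/on81 /bin/ksh

Inside that shell a build sees the 5.9 root image plus the real /home, /share and the rest of the entries from the map above.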

The one thing that does not work is /etc/mnttab, since unlike the mnttab used in zones it has no knowledge of the chroot so gives bogus information.

Does it work? Yes, well enough for our old build systems to be consigned back to the lab for general use and us to be allowed to have some fast ones with lots of CPUs for our real systems.

For those Sun Employees who wish to know more see http://pod6.uk

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
