Wednesday Jun 03, 2015

Quick and Dirty iSCSI between Solaris 11.1 targets and a Solaris 10 Initiator

I recently received a support request that needed some research into what happens when vdevs are removed from a pool, in a recoverable way, while operations are in progress on the pool.

My initial thought was to make the disk devices available to a guest ldom from a control ldom, but I found that Solaris and LDOMS coupled things too tightly for me to do something which had the potential to cause damage.

After a bit of thought, I realised that I also had two Solaris machines already configured in our UK-based dynamic lab setup that I could use to create some iSCSI targets and make them available to the guest domain that I'd already built. I needed two hosts to provide the targets because, for reasons that I really don't need to go into, I wanted an easy way to make the targets progressively unavailable and then available again. Using two hosts meant that I could do this with shutdown/boot.

The tricky part is that the ldom I wanted to test on was running Solaris 10 and the two target machines were running Solaris 11.1.

I needed to reference the following documents

The boxes

Name       Address  Location   Solaris Release
target1             UK         Solaris 11.1
target2             UK         Solaris 11.1
initiator           Australia  Solaris 10

Setting up target1

Install the iSCSI packages
target1# pkg install group/feature/storage-server
target1# svcadm enable stmf
  Create a small pool and then make a small volume in it. The pool is backed by a file as we don't have any extra disk attached to the machine and we really don't need much space.
target1# mkfile 4g /var/tmp/iscsi
target1# zpool create iscsi /var/tmp/iscsi
target1# zfs create -V 1g iscsi/vol0
  Make it available as an iSCSI target. Take note of the target name, we'll need that later.
target1# stmfadm create-lu /dev/zvol/rdsk/iscsi/vol0 
Logical unit created: 600144F000144FF8C1F0556D55660001
target1# stmfadm list-lu
LU Name: 600144F000144FF8C1F0556D55660001
target1# stmfadm add-view 600144F000144FF8C1F0556D55660001
target1# stmfadm list-view -l 600144F000144FF8C1F0556D55660001
target1# svcadm enable -r svc:/network/iscsi/target:default
target1# svcs -l iscsi/target
fmri         svc:/network/iscsi/target:default
name         iscsi target
enabled      true
state        online
next_state   none
state_time   Tue Jun 02 08:06:29 2015
logfile      /var/svc/log/network-iscsi-target:default.log
restarter    svc:/system/svc/restarter:default
manifest     /lib/svc/manifest/network/iscsi/iscsi-target.xml
dependency   require_any/error svc:/milestone/network (online)
dependency   require_all/none svc:/system/stmf:default (online)
target1# itadm create-target
Target successfully created
target1# itadm list-target -v
TARGET NAME                                                  STATE    SESSIONS
                                                             online   0
        alias:                  -
        auth:                   none (defaults)
        targetchapuser:         -
        targetchapsecret:       unset
        tpg-tags:               default

Setting up target2

Pretty much the same as what we just did on target1. Install the iSCSI packages
target2# pkg install group/feature/storage-server
target2# svcadm enable stmf
  Create a small pool and then make a small volume in it. The pool is backed by a file as we don't have any extra disk attached to the machine and we really don't need much space.
target2# mkfile 4g /var/tmp/iscsi
target2# zpool create iscsi /var/tmp/iscsi
target2# zfs create -V 1g iscsi/vol0
  Make it available as an iSCSI target. Take note of the target name, we'll need that later.
target2# stmfadm create-lu /dev/zvol/rdsk/iscsi/vol0
Logical unit created: 600144F000144FFB7899556D5B750001
target2# stmfadm add-view 600144F000144FFB7899556D5B750001
target2# stmfadm list-view -l 600144F000144FFB7899556D5B750001
View Entry: 0
    Host group   : All
    Target Group : All
    LUN          : Auto
target2# svcadm enable -r svc:/network/iscsi/target:default
target2# svcs -l iscsi/target
fmri         svc:/network/iscsi/target:default
name         iscsi target
enabled      true
state        online
next_state   none
state_time   Tue Jun 02 08:31:01 2015
logfile      /var/svc/log/network-iscsi-target:default.log
restarter    svc:/system/svc/restarter:default
manifest     /lib/svc/manifest/network/iscsi/iscsi-target.xml
dependency   require_any/error svc:/milestone/network (online)
dependency   require_all/none svc:/system/stmf:default (online)
target2# itadm create-target
Target successfully created
target2# itadm list-target -v
TARGET NAME                                                  STATE    SESSIONS
                                                             online   0
        alias:                  -
        auth:                   none (defaults)
        targetchapuser:         -
        targetchapsecret:       unset
        tpg-tags:               default

Setting up initiator

Now make them statically available on the initiator. Note that we use the target names reported in the last step of each of the earlier setups. We also need to provide the IP address of the machine hosting each target, as we are attaching them statically for simplicity.
initiator# iscsiadm add static-config,
initiator# iscsiadm add static-config,
initiator# iscsiadm modify discovery --static enable
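
The argument to "iscsiadm add static-config" is the target name and the address of the host serving it, joined by a comma, with an optional port after the address. As a rough sketch, a small shell helper can build that argument. The helper name static_config_arg, the IQN and the 192.0.2.10 address below are all made up for illustration; they are not the values used in this setup.

```shell
# Build the "target-name,address:port" argument for iscsiadm add static-config.
# $1 = target name (IQN), $2 = target IP address, $3 = port (defaults to 3260,
# the standard iSCSI port).
static_config_arg() {
    printf '%s,%s:%s\n' "$1" "$2" "${3:-3260}"
}

# Hypothetical values, for illustration only.
static_config_arg iqn.1986-03.com.sun:02:example-target1 192.0.2.10

# On the initiator this could then be used as:
#   iscsiadm add static-config "$(static_config_arg "$iqn" "$addr")"
```

Keeping the name and address as separate arguments makes it harder to lose the comma or mix up which target lives on which host.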
  Now we need to get the device nodes created.
initiator# devfsadm -i iscsi
initiator# format < /dev/null
Searching for disks...done

c1t600144F000144FF8C1F0556D55660001d0: configured with capacity of 1023.75MB
c1t600144F000144FFB7899556D5B750001d0: configured with capacity of 1023.75MB

0. c0d0
1. c0d1
2. c0d2
3. c1t600144F000144FF8C1F0556D55660001d0
4. c1t600144F000144FFB7899556D5B750001d0
Specify disk (enter its number):
  Great, we've found them. Let's make a mirrored pool.
initiator# zpool create tpool mirror c1t600144F000144FF8C1F0556D55660001d0 c1t600144F000144FFB7899556D5B750001d0
initiator# zpool status -v tpool
  pool: tpool
 state: ONLINE
 scan: none requested
        tpool                                      ONLINE       0     0     0
          mirror-0                                 ONLINE       0     0     0
            c1t600144F000144FF8C1F0556D55660001d0  ONLINE       0     0     0
            c1t600144F000144FFB7899556D5B750001d0  ONLINE       0     0     0

errors: No known data errors

I was then in a position to go and do the testing that I needed to do.

Tuesday May 19, 2015

Solaris CAT 5.5 available

Last Friday we made the latest version of Solaris CAT (Crashdump Analysis Tool) available on MOS.

This version will now look at Solaris 11.2 (and future) crashdump files.

There is also new functionality, which we will talk about on the Solaris CAT blog in the near future.

To download it:

  1. Go to MOS and log in
  2. Click on the tab entitled "Patches and Updates" at the top and enter 21099218, 21099215 as the patch numbers
  3. Click on Search

Update 1

There was a comment (that has since been removed) suggesting that it should be part of the Solaris 11 support repository.

I absolutely agree with this. The question that needs to be asked, though, is: given the number of customers that we've had asking for it, should we

  • Get it out any way that we can (which actually also took a lot of work)
  • Not release it in any form until we can make it available in the Solaris 11 support repository

We opted for the former.

We also have a more complete release announcement on the Solaris CAT blog.

Monday Sep 29, 2014

Bashed and Shellshocked

What a last four days this has been! Certainly from the perspective of a support engineer dealing with this who on days one and two ended up going 43 hours without sleep.

Unless you've been off the grid for the last week, you would know about Shellshock (CVE-2014-6271, CVE-2014-7169, CVE-2014-7186 and CVE-2014-7187).

To get the obligatories out of the way first, ...

Oracle has released a formal alert about this vulnerability which you can read at

This document points to a MOS Note containing links to the fixes for this issue. Our current recommendation is to download all patches and IDR patches listed for your particular OS.

This document will be updated as patches get formally released. Given how quickly the initial IDRs were released as formal patches, I would not expect this to take very long.

I also need to thank those of you who have been patient while dealing with us in support on this. As you can imagine, we've had a huge number of calls on it which, given the nature of the issue and the fact that we hit the weekend, needed to be handled by a small number of support engineers. It's taken me most of the weekend to make sure that each of the calls that I own has a current status in it, working through several rounds of updates since the initial ones (starting around 6am Friday morning Australia/Sydney). Any update that needs to go into all of my calls now will likely take at least four straight hours.

So I hope you can see that I am indeed grateful for your patience.

There are a few things that have cropped up quite a bit that I will detail here in the hope of avoiding some further calls on the issue.

The patch failed to install

The big one is that people get errors installing the Solaris 8 - Solaris 10 patches, generally accompanied by a message from checkinstall about being unable to open something.

Checkinstall runs as user nobody, group root (group 0 is really not special on Solaris; I would say that we get it because we only setuid(nobody) and don't touch the group), so if "nobody" does not have permission to read the patch, it will fail. Check the permissions on each element of the directory path into which you extracted the patch. I've found that "chmod go=u-w path", where path is the directory the patch was extracted into, fixes the issue. Of course extracting somewhere like /tmp after a "umask 02" would also help.
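
One way to find the offending directory is to walk the path and look at the permission bits for "other" on each element. This is a minimal sketch and not part of the patch tooling: check_patch_path is a hypothetical helper name, and it only looks at the "other" search (execute) bit, so it assumes that nobody gets no help from the owner or group permissions.

```shell
# Walk from a directory up to / and report every element of the path that
# users in the "other" class (such as nobody) cannot search.
# Returns 0 if every element is searchable, 1 if any is not, 2 on a bad path.
check_patch_path() {
    dir=$(cd "$1" && pwd) || return 2
    bad=0
    while :; do
        # ls -ld prints the mode string first, e.g. drwxr-x--x
        mode=$(ls -ld "$dir" | awk '{print $1}')
        # Character 10 of the mode string is the "other" execute bit
        # (t means execute plus the sticky bit, as on /tmp).
        case $(printf '%s' "$mode" | cut -c10) in
            x|t) ;;
            *) echo "not searchable by others: $dir"; bad=1 ;;
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
    return $bad
}

# Usage, with a made-up extraction directory:
#   check_patch_path /var/tmp/patches/some-patch-dir
```

Anything it reports is a candidate for the chmod described above.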

Installing the patch didn't fix it

We've had a few folks tell us that they've installed the patch but bash still fails the tests. It's generally turned out that they've had another bash binary installed (eg in /usr/local/bin) that comes first in $PATH. Check "which bash".
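
"which bash" only shows the first match. To see every bash that could be picked up, you can walk $PATH yourself. A minimal sketch follows; find_all_in_path is a hypothetical helper name, not a standard command.

```shell
# List every executable file named $1 found on $PATH, in search order.
# The first line printed is the one the shell would actually run.
find_all_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for d in $PATH; do
        # An empty PATH element means the current directory.
        [ -n "$d" ] || d=.
        [ -f "$d/$name" ] && [ -x "$d/$name" ] && echo "$d/$name"
    done
    IFS=$old_ifs
}

find_all_in_path bash
```

If more than one line comes back, check that the patched binary is the one listed first, or that the others have been dealt with as well.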

Questions that we cannot answer

One last thing: Oracle has a policy of not announcing the time frame of releasing security fixes. If you ask when the IDR patches will become formal patches, we are not going to be able to answer you. What is worth noting though is that the -01 IDRs became formal patches right quickly. Indeed all fixes are now available in Solaris 11.2 SRU 2.8.

Friday Jun 13, 2014

Why you should Patch NTP

This story about massive DDoS attacks using monlist as a threat vector gives an excellent reason as to why you should apply the patches listed on the Sun Security Blog for NTP.

Monday Feb 24, 2014

That iOS bug

Alan Coopersmith has done an interesting writeup on compilers and the iOS bug that is worth a read.

Monday Dec 23, 2013

Who is renicing these processes?

I was helping out a colleague on such a call this morning. While the DTrace script I produced was not helpful in this actual case, I think it bears sharing anyway.

What we wanted was a way to find out why various processes were running with nice set to -20. There are two ways in which a process can have its nice changed.

  • nice(2) - where it changes itself
  • priocntl(2) - where something else changes it

I ended up with the following script after a bit of poking around.

# dtrace -n '
syscall::nice:entry {
        printf("[%d] %s calling nice(%d)", pid, execname, arg0);}
syscall::priocntlsys:entry /arg2 == 6/ {
        this->n = (pcnice_t *)copyin(arg3, sizeof(pcnice_t));
        this->id = (procset_t *)copyin(arg1, sizeof(struct procset));
        printf("[%d] %s renicing %d by %d",
            pid, execname, this->id->p_lid, this->n->pc_val); }'

There is an assumption in there about p_lid being the PID that I want, but in this particular case it turns out to be ok. Matching arg2 against 6 is so that we only get priocntl() calls with the command PC_DONICE. I could have also had it check the pcnice_t->pc_op but I can put up with the extra output.

So what happens when we have this running and then try something like

# renice -20 4147
dtrace: description 'syscall::nice:entry ' matched 2 probes
 0 508 priocntlsys:entry [4179] renice renicing 4147 by 0
 0 508 priocntlsys:entry [4179] renice renicing 4147 by -20

Which is exactly what we wanted. We see the renice command (pid 4179) modifying pid 4147.

Oh, why didn't this help, I hear you ask?

Turns out that in this instance the process in question was being started by init from /etc/inittab, and as such started with nice set to whatever init was running at. In this case that was -20.

Wednesday Sep 25, 2013

Counting how many threads a cv_broadcast wakes up

I had occasion during a call this week to want to observe what was causing a lot of threads to suddenly be made runnable, and thought I should share the DTrace that I wrote to do it. It's using the fbt provider so don't even think about considering the interfaces stable :)

# dtrace -q -x dynvarsize=4m -n '
BEGIN {trace("Monitoring ..\n"); }
fbt::cv_broadcast:entry {self->cv = (condvar_impl_t *)arg0; }
fbt::cv_broadcast:entry /self->cv && self->cv->cv_waiters>500/ {
       printf("%Y [%d] %s %d woken\n", walltimestamp, pid, execname, self->cv->cv_waiters); }
fbt::cv_broadcast:entry /self->cv/ {self->cv = 0;}' 

I needed to make the dynvarsize 4m as I was running this on a pretty large machine so we were getting a lot of thread local variables created and destroyed.

I was rewarded with output like

Monitoring ..                                                                                                                       
2013 Sep 23 15:20:49 [0] sched 1024 woken

2013 Sep 23 15:21:28 [0] sched 1024 woken

2013 Sep 23 15:26:47 [0] sched 1024 woken

Posting in case anyone else has found themselves wanting to find out this kind of thing. Happy DTracing all.

Monday Mar 11, 2013

Oracle Support Service Request Surveys

This morning I had my manager tell me about a survey that was taken against one of my closed calls, where we had a very unhappy and dissatisfied customer.

On having a look at the survey comments, it looks like the dissatisfaction was with a completely different call, as the comments don't bear any resemblance to anything in that particular call.

I can understand that if you've had a poor experience and you get a survey call, you will want to use that opportunity to express your dissatisfaction, but, ...

It's really important that, if you want your dissatisfaction to reach the group that needs to hear about it, the information is put against the correct call.

You don't need to wait to see if your call is going to be randomly selected for a survey. By sending email to and mentioning the SR in question after the call has been closed, you can request a survey on that call. In fact, I would encourage folks to do exactly this for any call about which they want to say negative (or positive) things.

The surveys are important feedback, but expressing your dissatisfaction against the wrong call number does not get the comments to the people who need to see them.

Thursday Feb 28, 2013

A Solaris tmpfs uses real memory

That title may sound a little self-explanatory and obvious, but over the last two weeks I have had two customers tell me flat out that /tmp uses swap and that I should still continue to investigate where their memory is being used.

This is likely because when you define /tmp in /etc/vfstab, you list the device being used as swap.

In the context of a tmpfs, swap means physical memory + physical swap. A tmpfs uses pageable kernel memory. This means that it will use kernel memory, but if required these pages can be paged to the swap device. Indeed if you put more data onto a tmpfs than you have physical memory, this is pretty much guaranteed.

If you are still not convinced try the following.

  1. In one window start up the command
    $ vmstat 2
  2. In another window make a 1gb file in /tmp.
    $ mkfile 1g /tmp/testfile
  3. Watch what happens in the free memory column in the vmstat.

There seems to be a misconception amongst some that a tmpfs is a way of stealing some of the disk we have allocated as swap to use as a filesystem without impacting memory. I'm sorry, this is not the case.

Tuesday Jan 22, 2013

Using /etc/system on Solaris

I had cause to be reminded of this article I wrote for on#sun almost ten years ago and just noticed that I had not transferred it to my blog.

/etc/system is a file that is read just before the root filesystem is mounted. It contains directives to the kernel about configuring the system. Going into depth on this topic could span multiple books so I'm just going to give some pointers and suggestions here.

Warning, Danger Will Robinson

Settings can affect initial array and structure allocation, and indeed such things as the module load path and where the root directory actually resides.

It is possible to render your system unbootable if you are not careful. If this happens you might try booting with the '-a' option where you get the choice to tell the system to not load /etc/system.

Just because you find a set of values works well on one system does not necessarily mean that they will work properly on another. This is especially true if we are looking at different releases of the operating system, or different hardware.

You will need to reboot your system before these new values will take effect.

The basic actions that can be taken are outlined in the comments of the file itself so I won't go into them here.

The most common action is to set a value. Any number of products make suggestions for settings in here (eg Oracle, Veritas Volume Manager and Filesystem to name a few). Setting a value overrides the system default.

A practice that I make when working on this file is to place a comment explaining why and when I make a particular setting (remember that a comment in this file is prefixed by a '*', not a '#'). This is useful later down the track when I may have to upgrade a system. It could be that the setting may actually not have the desired effect and it would be good to know why we originally did it.
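
As a rough sketch of that habit, a small shell helper can write the comment and the setting together so that neither gets forgotten. add_system_setting is a hypothetical name, and the example deliberately points at a scratch copy rather than the live /etc/system.

```shell
# Append a setting to an /etc/system style file, preceded by a '*' comment
# recording the date and the reason it was added.
# $1 = file to append to, $2 = the setting line, $3 = the reason.
add_system_setting() {
    file=$1
    setting=$2
    reason=$3
    {
        printf '* %s %s\n' "$(date +%Y-%m-%d)" "$reason"
        printf '%s\n' "$setting"
    } >> "$file"
}

# Usage against a scratch copy (the file name and reason are made up):
#   add_system_setting /var/tmp/system.new 'set lotsfree=1024' 'example only'
```

Working on a copy and reviewing it before putting it in place also gives you something to diff against when a setting needs revisiting.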

I harp on this point but it is important.

Just because settings work on one machine does not make them directly transferable to another.

For example

set lotsfree=1024

This tells the kernel not to start running the page scanner (to start paging out memory to disc) until free memory drops below 8mb (1024 x 8k pages). While this setting may be fine on a machine with around 512mb of memory, it does not make sense for a machine with 10gb. Indeed if the machine is under memory pressure, by the time we get down to 8mb of free memory, we have very little breathing space to try to recover before requiring memory. The end result is a system that grinds to a halt until it can free up some resources.
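
The arithmetic is worth doing before copying a value between machines, since lotsfree is counted in pages and the page size differs between platforms. A minimal sketch, where lotsfree_mb is a hypothetical helper name and the 8k default is an assumption matching the example above:

```shell
# Convert a lotsfree value (in pages) to megabytes.
# $1 = number of pages, $2 = page size in bytes (assumed 8192 if omitted).
lotsfree_mb() {
    pages=$1
    pagesize=${2:-8192}
    echo $(( pages * pagesize / 1024 / 1024 ))
}

lotsfree_mb 1024         # 8k pages: prints 8
lotsfree_mb 1024 4096    # 4k pages: prints 4
```

On Solaris the pagesize command reports the page size of the running system, so its output can be passed as the second argument.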

Oracle makes available the Solaris Tunable Parameters guide as a part of the documentation for each release of Solaris. It gives information about the default values and the uses of a lot of system parameters.

Monday Jul 30, 2012

Using a libc from a previous kernel patch (Just Don't)

I was just assisting a colleague with an issue where, after patching, a customer found that there was higher lock spinning in malloc() in libc.

He just told me that the customer copied the old libc into a directory in /tmp, changed LD_LIBRARY_PATH to point there first and ran their application observing that the issue went away.

OK, where do we start here, ...


Two things immediately spring to mind as to why this is a bad idea.

  1. libc is tightly linked to the kernel system call interfaces. These interfaces are private to libc. As such they can be changed as long as the same change is made in the libc code. If you mismatch libc and the kernel you risk incorrectly calling system calls, with potentially fatal consequences.
  2. Placing a library into /tmp (or a directory under /tmp). Picture the following scenario. Someone builds their own library (doesn't have to be libc, just has to be something that your application uses) and places it into the directory you added to your search path (eg renaming your directory and creating their own). Now we have the potential of having your application run trojan code with any kind of side effect. Similar issues arise if you leave the path in a startup script and reboot: if the directory doesn't exist, anyone can create it and do the same thing.

In short, please don't.
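
If you have inherited a startup script that sets LD_LIBRARY_PATH, a quick audit of the directories it names can at least show whether you are exposed to the second problem. This is only a sketch: check_lib_path is a hypothetical helper name, it checks just the directory itself and not its parents, and passing the check does nothing about the first problem.

```shell
# Report elements of a colon-separated library path that are missing or
# world-writable. Returns 1 if anything was reported, 0 otherwise.
check_lib_path() {
    old_ifs=$IFS
    IFS=:
    rc=0
    for d in $1; do
        [ -n "$d" ] || continue
        if [ ! -d "$d" ]; then
            # A missing directory can be created by whoever gets there first.
            echo "missing: $d"
            rc=1
        elif [ -n "$(find "$d" -prune -perm -0002 -print)" ]; then
            # The "other" write bit is set on the directory itself.
            echo "world-writable: $d"
            rc=1
        fi
    done
    IFS=$old_ifs
    return $rc
}

# Usage:
#   check_lib_path "$LD_LIBRARY_PATH"
```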

Sunday Jun 03, 2012

The Importance of Fully Specifying a Problem

I had a customer call this week where we were provided a forced crashdump and asked to determine why the system was hung.

Normally when you are looking at a hung system, you will find a lot of threads blocked on various locks, and most likely very little actually running on the system (unless it's threads spinning on busy wait type locks).

This vmcore showed none of that. In fact we were seeing hundreds of threads actively on cpu in the second before the dump was forced.

This prompted the question back to the customer:

What exactly were you seeing that made you believe that the system was hung?

It took a few days to get a response, but the response that I got back was that they were not able to ssh into the system and when they tried to login to the console, they got the login prompt, but after typing "root" and hitting return, the console was no longer responsive.

This description puts a whole new light on the "hang". You immediately start thinking "name services".

Looking at the crashdump, yes the sshds are all in door calls to nscd, and nscd is idle waiting on responses from the network.

Looking at the connections I see a lot of connections to the secure ldap port in CLOSE_WAIT, but more interestingly I am seeing a few connections over the non-secure ldap port to a different LDAP server just sitting open.

My feeling at this point is that we have either a non-responding LDAP server, or one that is responding slowly, the resolution being to investigate that server.


When you log a service ticket for a "system hang", it's great to get the forced crashdump first up, but it's even better to get a description of what you observed that made you believe that the system was hung.

Tuesday Jan 24, 2012

Using lightning from homedir on SPARC and x86 Solaris

I make great use of lightning in my thunderbird installation.

At the moment I am in the process of migrating from my Sun Blade 2000 Sun Ray server to an x86 based one.

The problem is that I am running the lightning plugin from my automounted home directory and the lightning plugin has one shared library in it.

Now the thunderbird as installed in Solaris 11 actually comes with a compatible lightning installed so you can use that. Unfortunately (or fortunately) I try to run current thunderbird (at the time of writing 9.0.1).

For reference, you can get the lightning plugin for Solaris from

The obvious answer would have been to install it where I keep my thunderbird executables, but I couldn't quickly work out how to do that.

I already had the SPARC version installed. Apart from the Identifier number being different the only differences in lightning.xpi (after unzipping it) appear to be a platform line in install.rdf and the shared library.

What I did was to make a directory in my thunderbird install directory to house the architecture specific library on both the SPARC and x86 machines.

$ mkdir /rpool/thunderbird/arch

On each machine I got hold of the shared library and put a copy of it into this directory.

$ unzip lightning.xpi
$ cp components/ /rpool/thunderbird/arch

Then we head into the currently installed plugin in my home directory. Note the quotes. Shells have special meanings for braces.

$ cd '.thunderbird/profilename/extensions/{e2fda1a4-762b-4020-b5ad-a41df1933103}/components'
$ rm
$ ln -s /rpool/thunderbird/arch/ .

Almost there.

Now in the directory one up from the components directory there is a file called install.rdf. In this file there is the following line:

<em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform>

This needs to be commented out:

<!-- <em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform> -->

I now can run my thunderbird from either machine and continue to use lightning. I just need to follow this process whenever I upgrade thunderbird/lightning (Part of the reason for doing this blog).

As an aside, my /rpool/thunderbird and /rpool/firefox are each a zfs filesystem under rpool. Before I upgrade anything I make a zfs snapshot. That way if anything breaks, rolling back to a working version is trivial.

Friday Nov 25, 2011

Interim Patches for CVE-2011-4313 released through MOS

As reported in the article on the Sun Security Blog, interim patches are available for Solaris 8, 9 and 10 directly from MOS without the need to log a Service Request. There is also Interim Relief available for Solaris 11, but at this point in time that will still require a Service Request.

As seen from running "named -V", these patches implement the same fix as ISC by taking BIND to the version:
BIND 9.6-ESV-R5-P1.

Thursday Nov 10, 2011

Upgrading Solaris 11 Express b151a with support to Solaris 11

The most common problem that I am seeing on the aliases this morning is folks who are running 151a with a support update finding that their upgrade is failing.

The reason for this is that the version of pkg that you need to do the upgrade is in SRU#13. You need to update to this before switching to the release repository and upgrading.

This is an absolutely required step.

If you have an SRU older than #13 and have already switched to the release repository, you will need to switch back to the support repository, update and then go back to the release repository.


* - Solaris and Network Domain, Technical Support Centre

Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him

