Saturday Feb 20, 2016

Solaris 11.3 SRU 5.6: updates in ps(1) and /proc/<pid>/{cmdline,environ,execname}

Almost as soon as Solaris 2.0 was released, people started to complain about the limit of the ps(1) command line output; it was limited to 80 characters. The standard ps(1) command was also not able to print the environment variables.

The /usr/ucb/ps command could, but it needed to trawl through the address space of the target process.  In order to do so it needed to have at least the same privileges and uids/gids as the target, to prevent privilege escalation.  Simply having the {proc_owner} privilege was not sufficient.

When we added pkill(1)/pgrep(1), they too were limited in the same way: they could only search the first 80 bytes of the command line (PRARGSZ) and the first 16 bytes of the command name (PRFNSZ).

These were serious limitations; for one, it became difficult to find a specific java process, as the typical java command line is generally much larger than 80 bytes and often the important jar file is beyond the 80 byte limit.

 Of course, our customers did not like this limit either.

 We fixed this problem in Solaris 12 and now also in Solaris 11.3 SRU 5.6 by adding three new files under /proc/<pid>:

  • cmdline - all original arguments separated by NUL bytes
  • environ -  all original environment values separated by NUL bytes
  • execname - the original program name given to exec.

The cmdline and execname files are publicly readable; the environ file is restricted to the owner of the process or those processes which have the {proc_owner} privilege. The cmdline and environ files are very similar to those found under Linux; however, they do not reflect the actual argument vectors in the process' address space, so they do not show changes made by the programs themselves.
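
As an illustration of the NUL-separated format, here is a minimal C sketch that reads /proc/<pid>/cmdline and turns it into a single printable line. The helper names read_cmdline() and args_to_line() are made up for this example and are not part of any Solaris library; the sketch assumes a single read(2) into a caller-supplied buffer is good enough, which a real tool would not rely on.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to 'size' bytes of /proc/<pid>/cmdline.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
read_cmdline(pid_t pid, char *buf, size_t size)
{
	char path[64];
	ssize_t n;
	int fd;

	(void) snprintf(path, sizeof (path), "/proc/%ld/cmdline", (long)pid);
	if ((fd = open(path, O_RDONLY)) < 0)
		return (-1);
	n = read(fd, buf, size);
	(void) close(fd);
	return (n);
}

/*
 * Hypothetical helper: replace the NUL bytes between arguments with
 * spaces, in place, so the buffer becomes one C string.  Returns the
 * number of arguments found.  A buffer that does not end in NUL (a
 * truncated read) loses its last byte to the terminator.
 */
size_t
args_to_line(char *buf, size_t len)
{
	size_t nargs = 0;
	size_t i;

	if (len == 0)
		return (0);
	for (i = 0; i < len; i++) {
		if (buf[i] == '\0') {
			nargs++;
			/* keep the final NUL as the string terminator */
			if (i + 1 < len)
				buf[i] = ' ';
		}
	}
	if (buf[len - 1] != '\0') {
		nargs++;		/* count the partial last argument */
		buf[len - 1] = '\0';
	}
	return (nargs);
}
```

A caller would pass the byte count returned by read_cmdline() as the length for args_to_line() and then print the buffer. The same splitting applies to the environ file, subject to the access restrictions described above.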

A new -o format option "env" was added to ps(1); the new files are used and ps(1) will now display the full command line.

As neither ps(1) nor ps(1b) needs to open /proc/<pid>/as, fewer privileges are now needed and read access to the executable is no longer required: this is a big performance win for ps(1b), especially when NFS binaries are in the mix.

As I basically backported the changes to ps and /proc from Solaris 12, the whole list of bugs and enhancements is as follows:

        PSARC/2015/207 /proc/<pid>/{cmdline,environ,execname} extensions to /proc.
        15742822 SUNBT7092685 Extend /proc interfaces to allow ps(1) to show more of the command
        15420404 SUNBT6599384 pgrep/pkill don't find processes with 16 char filenames or match ...
        19669195 memory-leak in ucb_procinfo of ucbps.c:569
        15227016 SUNBT5100626 ps(1) sometimes shows an empty string for the ttyname
        15282779 SUNBT6313436 /usr/ucb/ps malloc() failure results in unexpected argument parsing
        14966583 SUNBT4157509 /usr/ucb/ps not bsd or sunos 4.x compatible on command line
        15488063 SUNBT6715628 ps -d makes -z have no effect
        21447952 /usr/ucb/ps gxw hangs, but w/out the w does not; never open /proc/<pid>/as
        21297345 procfs limits the size of the control messages
        15582848 SUNBT6872216 ps command needs to keep trackof prior name/uid information
        15584899 SUNBT6875625 ps command should chdir to /proc to remove lock contention

Tuesday Jul 07, 2015

Solaris 11.3: rtc(1m) no longer warps the time by default

On x86 hardware Solaris derives the time-of-day from the "real-time-clock (RTC)"; traditionally, this clock is defined as ticking in "local time".

That has always been problematic when you have multiple OSes installed on the same hardware. All OSes will want to change the RTC when they believe that we have just crossed one of the daylight saving time boundaries. With Solaris 11's new boot environments, it is even a problem when you only have one OS installed but with multiple boot environments.  That is why Solaris 11 prefers to run the RTC in UTC so it never needs to be changed and all the boot environments are perfectly happy.

If you want to change the timezone for the RTC as recorded in /etc/rtc_config, you would use "rtc -z <timezone>"; unfortunately, the existing behavior was to warp the time. That was rather surprising as most of the time the system's time is properly set. This behavior has always annoyed me but it wasn't until some customer complained about this behavior in comp.unix.solaris that I realized that I wasn't the only person annoyed and so I decided to fix it.

In Solaris 11.3 rtc(1m) will no longer warp the time.  If you really want to warp the time you will now need to use the new "-w" option.

Solaris 11.3: New Immutable Global Zone file-mac-profile: dynamic-zones

In Solaris 11.2 we introduced the Immutable Global Zone.  Just like the Immutable Zones introduced in Solaris 11/11, it supports three different file-mac-profiles: strict, fixed-configuration and flexible-configuration.

To refresh your memory, these three file-mac-profiles as well as the default value, "none",  are described in zonecfg(1m) as follows:

           There are currently four supported values for this property:  none,
           strict, fixed-configuration, and flexible-configuration.

           none  makes the zone exactly the same as a normal, r/w zone. strict
           allows no exceptions to the read-only  policy.  fixed-configuration
           allows  the zone to write to files in and below /var, except direc-
           tories containing configuration files:


           flexible-configuration is equal to fixed-configuration, but  allows
           writing to files in /etc in addition.

In Solaris 11.3 we are adding a fourth file-mac-profile: dynamic-zones.  It should be seen as sitting between fixed-configuration and flexible-configuration.

This particular profile is only valid for the global zone; it allows the administrator to create and destroy non-global zones, kernel zones, etc.

While this is already possible with the flexible-configuration profile, that file-mac-profile also allows much of the system configuration to be changed; with the other profiles, creating or destroying a zone requires using the Trusted Path.  The dynamic-zones profile is a compromise: it restricts the configuration of the system, yet it does allow a user with the proper authorizations to create and destroy zones.

The dynamic-zones profile was targeted specifically at using an immutable global zone on the OpenStack Nova compute nodes.

Solaris 11.3: New per-share, per-instance reserved port property for NFS

It seems like a lifetime ago that I added the following question to the Solaris FAQ:

7.8) How can I make the NFS server ignore unprivileged clients?

    In a restricted environment, i.e., an environment where the
    administrator controls root access, you can enhance NFS security
    by setting the "NFS_PORTMON" variable.  This variable is set in
    /etc/system, like this:

    * Prior to Solaris 2.5
    set nfs:nfs_portmon = 1

    * Solaris 2.5 and later
    set nfssrv:nfs_portmon = 1

You could wonder why this was never the default. The answer is that reserved ports are a BSD Unix invention from the time that computers were large and centrally administered; the invention was later copied to all Unix-like operating systems, but outside of that world it makes little sense. As a result, many NFS clients can use any port and might not be able to restrict the ports they use.

The "nfs_portmon" variable was global; Solaris has evolved and now has multiple different NFS server instances (one for each zone); customers also have requested to have a per-share setting.

In Solaris 11.3 we introduce a new sharectl property:

 # sharectl get -p resvport nfs

as well as a new resvport share option:

# zfs get share.nfs.sec.sys.resvport build/casper
NAME          PROPERTY                    VALUE  SOURCE
build/casper  share.nfs.sec.sys.resvport  off    default

The sharectl property is global for the NFS server instance; if it is set to true, this overrides per-share properties.  If a system is upgraded, it will take the value from /etc/system and it will log a message that in future, sharectl(1m) should be used instead.

When the sharectl property is set to false, you can set resvport for each share individually.  As you can see, this is restricted to the "sys" security mode; when proper security such as Kerberos V is used, we do not verify that the NFS client uses privileged ports.
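
The way the two settings combine can be sketched in a few lines of C. This is only an illustration of the rules described above, not the NFS server code: the function names and the RESERVED_PORT_LIMIT constant are invented for this example, and it assumes that the instance-wide setting, like the per-share one, only matters for the "sys" security mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ports below 1024 are the traditional BSD "reserved" ports. */
#define	RESERVED_PORT_LIMIT	1024

/*
 * Hypothetical helper: does a request need to come from a reserved port?
 * instance_resvport models the sharectl property, share_resvport the
 * per-share option.
 */
bool
resvport_required(bool instance_resvport, bool share_resvport,
    bool sec_is_sys)
{
	/* Assumption: stronger security modes never check the port. */
	if (!sec_is_sys)
		return (false);
	/* A true instance-wide setting overrides the per-share value. */
	return (instance_resvport || share_resvport);
}

/* Hypothetical helper: would a request from client_port be accepted? */
bool
request_allowed(uint16_t client_port, bool instance_resvport,
    bool share_resvport, bool sec_is_sys)
{
	if (!resvport_required(instance_resvport, share_resvport, sec_is_sys))
		return (true);
	return (client_port < RESERVED_PORT_LIMIT);
}
```

With both settings off, any client port is accepted; turning on either one limits "sys" clients to ports below 1024.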

It goes without saying that actual NFS security can only be had when using a security mode other than "sys".

Wednesday Sep 24, 2014

getcwd(NULL, 0) revisited

Earlier I claimed that the POSIX standard didn't allow an extension in which the following statement returned anything other than NULL while setting errno to EINVAL:

char *cwd = getcwd(NULL, 0);

However, the standard also says:

"If buf is a null pointer, the behavior of getcwd() is unspecified."

so the GNU/Linux extension is perfectly legal.  I was clearly wrong.

Many applications check for this behavior when configuring and, when the Solaris getcwd() is found wanting, replace it with a concoction which runs much slower and fails more often. We're therefore changing getcwd() to allocate sufficient memory when it is called with a buffer pointer of NULL and a size of zero.

Of course, you could already have that in Solaris 11, if you replaced getcwd(NULL, 0) with realpath(".", NULL).

Addendum: this fix is included in Solaris 11.2 SRU 3

Monday May 26, 2014

Solaris 11.2: presentations in Holland

I'm giving a presentation on Solaris 11.2 new features, among others, on June 12th and 19th, 2014; register here:

I'll be highlighting some new features but will also show how new and old features are combined and how Solaris 11 is our first operating system with a holistic design philosophy.

Friday May 16, 2014

Solaris 11.2: unlink(2)/link(2) for directories: your time is up.

Some thirty years ago, the 4.2BSD Unix release included two new system calls: mkdir(2) and rmdir(2).  Before that time, in order to make a directory, you first needed to call mknod(2) and then create the "." and ".." links.  When you removed a directory, you would remove those two links and finally unlink the directory itself. As you couldn't call mknod(2) as an ordinary user nor could you call unlink(2) on a directory, the mkdir(1) and rmdir(1) commands were set-uid root.  A cursory inspection of UNIX V7 showed that both commands likely had security bugs.

Did 4.2BSD remove the ability to link or unlink directories?  It didn't.  It was probably kept temporarily for backward compatibility.  But many years and many Unix releases later, it is still there; neither Sun in SunOS or Solaris, nor Oracle in Solaris 11/11 or 11.1, removed it.

If you ask fsck(1m), the final arbiter about what is a valid UFS file system, it will complain loudly and it generally required system admin intervention when you made an additional hardlink to a directory; this was later hidden by logging UFS; fsck was hardly ever run since the introduction of UFS logging especially once it became the default.  In tmpfs it was a good way to lose swap, hide data or confuse the kernel. Special code was needed in find(1) and du(1) to not lose their way when the file system isn't a tree but rather a cyclic graph.

It is one of the reasons why, when Solaris Zones were developed, we decided that non-global zones can only be run without the {SYS_LINKDIR} privilege and that when we introduced ZFS it came without the ability to use link(2) or unlink(2) on directories.  VxFS also doesn't allow additional hardlinks to directories. And no-one complained!

This discrepancy between the global zone and non-global zones, and between ZFS and the rest, gave us problems when developing code; code run in a tmpfs file system in the global zone suddenly stopped working when moved to a non-global zone; code that worked before in UFS stopped working when moved to ZFS or to a non-global zone.  As Linux never allowed unlink(2) on directories, code developed there might suddenly have disastrous effects when it was run with (not-so) appropriate privileges under Solaris.  There were at least two cases during the development of Solaris 11.2 when we were bitten by this problem for code we developed ourselves.

The time has arrived to disable link(2) and unlink(2) on directories; and that is what we have done in Solaris 11.2.  The {SYS_LINKDIR} privilege still exists in Solaris 11.2 but it is obsolete and has no effect.  We will likely remove it in a future minor release.

Is this a sudden incompatible change?  Perhaps, but it is well within the limits of the specification and using this feature only leads to downtime and support calls. Sorry for removing this rope from your toolbox.

Monday May 12, 2014

Solaris 11: Evolution of v_path.

In Solaris 10, Eric Schrock (now at Delphix) added vnode-to-pathname functionality in the kernel; it stored the pathname used to find a file in the vnode but it did not handle renames nor did it elide ".." from the stored pathnames; the pathname stored was generally a full pathname from the root of the global zone.  It was used for getcwd(3) and for the path subdirectory in /proc/<pid>/.

The v_path was implemented as a hint and whenever it was retrieved, e.g., for getcwd(3) or for the /proc file system, the actual path was computed and the current zone's root directory was removed.

When I started to work on the Extended Policy and later on the Immutable Global zone, it was clear that the v_path was very useful but it wasn't ready for those projects.

The Immutable Non-Global Zone (Solaris 11/11)

In the IMNGZ we need to compute the pathname and then check the pathname against the black-list and the white-list; however, where we are doing that the kernel is deep inside the file system code and we can't verify and recompute the pathname as we might be holding locks that we need further down; but since we are protecting a particular set of files and those files cannot be changed or renamed, it is safe to use the v_path as if it is more than a hint.  We did need to elide ".." and simplify pathnames; this is done directly when we are setting the v_path for a newly created pathname and if the code tries to add a ".." it instead removes the last component of the pathname. We did need to prevent linking protected files into the non-protected file space as that would circumvent the MWAC(5) protection offered in an IMNGZ.
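
The ".." handling can be illustrated with a small userland sketch in C. This is not the kernel code: path_append() is a name made up for this example, and it works on a plain absolute path string in a fixed-size buffer rather than on vnodes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append one component to the absolute pathname in
 * 'path' (a buffer of 'size' bytes that must start with '/').
 * "." is ignored; ".." removes the last component instead of being
 * stored.  Returns 0 on success, -1 if the path is not absolute or the
 * result would not fit.
 */
int
path_append(char *path, size_t size, const char *comp)
{
	size_t len = strlen(path);
	size_t clen = strlen(comp);
	size_t slash;

	if (path[0] != '/')
		return (-1);
	if (clen == 0 || strcmp(comp, ".") == 0)
		return (0);

	if (strcmp(comp, "..") == 0) {
		char *last = strrchr(path, '/');

		if (last == path)
			path[1] = '\0';	/* the parent of "/x" and of "/" is "/" */
		else
			*last = '\0';
		return (0);
	}

	/* No extra separator is needed directly after the root "/". */
	slash = (path[len - 1] != '/');
	if (len + slash + clen + 1 > size)
		return (-1);
	if (slash)
		path[len++] = '/';
	(void) memcpy(path + len, comp, clen + 1);
	return (0);
}
```

Starting from "/", appending "export", "home" and ".." in that order leaves "/export" in the buffer, so a stored path never contains a ".." component.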

The Extended Policy (Solaris 11.1)

The Extended Policy applies to all filenames in the filesystem, including those that can be renamed.  This is why we put some effort in handling renames better.  We now update the v_path name on rename(2) in all file systems; in the case of a link(2) we also handle this as a rename(2) as the observation is that the new name outlives the first name.  This new behavior works well with leaf nodes but there is no efficient algorithm that can handle the rename of a directory and all its children, yet we have no option other than using v_path for the same reasons we have for the IMNGZ. When we recalculate the pathname, e.g., for /proc or for getcwd() and we find it wanting, we update the v_path to the newly computed path, including all directories making up the full pathname.

One possible security risk is that a vnode has an incorrect v_path and the Extended Policy gives more privileges on that v_path than it gives for the actual pathname.  As this can only happen if the file once lived in that location this is not actually a risk at all; the process was able in the past to use those privileges on that file. We do make sure that linking is not allowed when the Extended Policy gives more privileges for the new pathname.

An update was needed for the secpolicy_*() routines to allow the Extended Policy to make a decision about files or directories that do not exist yet; as an extra benefit privilege debugging now gives even more information as we have more information deep down in the policy routines:

solaris11.0$ ppriv -De mkdir /casper
mkdir[11162]: missing privilege "ALL" (euid = 12345, syscall = 102) for "/" needed at zfs_zaccess+0x2c8
mkdir: Failed to make directory "/casper"; Permission denied

In Solaris 11.1 we know the full filename to be created and also show that with privilege debugging:

solaris11.1$ ppriv -De mkdir /casper
mkdir[13924]: missing privilege "ALL" (euid = 12345, syscall = 102) for "/casper" needed at zfs_zaccess+0x245
mkdir: Failed to make directory "/casper"; Permission denied

In Solaris 11.2 we also show the syscall name:

solaris11.2$ ppriv -De mkdir /casper
mkdir[17488]: missing privilege "ALL" (euid = 12345, syscall = "mkdirat") for "/casper" at zfs_zaccess+0x245
mkdir: Failed to make directory "/casper"; Permission denied

Getcwd(3), realpath(3) fixes.

As part of the Extended Policy project, fixes to getcwd() and realpath() were made during the development of Solaris 11.1.  We've also put some of these fixes in 11.0 SRUs and in Solaris 10 patches. These fixes are the following:

  • Improved getcwd()/realpath() performance in zones.
  • Improved getcwd()/realpath() performance in the case of renaming (in some cases 1000x faster)
  • Fix getcwd() for chrooted process when the current working directory is not under the root directory. (This was a regression of the in-kernel getcwd())
  • Don't fail with EACCES so quickly
  • No limit on the size of the returned path from getcwd() and realpath()
  • realpath() moved into the kernel and the frealpath() system call (Solaris 11.1 and later only)

Several operating systems have "extended" getcwd(3) to return an unrestricted pathname when called as follows:

   char *cwd = getcwd(NULL, 0);

unfortunately, this is strictly forbidden by the standard:

     The getcwd() function shall fail if:


     EINVAL    The size argument is 0.

So in Solaris you have to loop with a longer and longer buffer until getcwd() no longer returns NULL with errno set to ERANGE or you could use realpath(".", NULL) in which case we can return a long pathname.

Both are actually a lot faster than running your own userland getcwd() implementation and such implementations are more likely to fail.
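
The loop with a growing buffer could look like the following minimal sketch. The function name getcwd_alloc() and the starting size of 64 bytes are choices made for this example only; the caller owns the returned buffer and must free(3) it.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the current working directory in a
 * malloc'ed buffer, doubling the buffer for as long as getcwd() fails
 * with ERANGE.  Returns NULL with errno set on any other failure.
 */
char *
getcwd_alloc(void)
{
	size_t size = 64;
	char *buf = NULL;

	for (;;) {
		char *nbuf = realloc(buf, size);

		if (nbuf == NULL) {
			free(buf);
			errno = ENOMEM;
			return (NULL);
		}
		buf = nbuf;

		if (getcwd(buf, size) != NULL)
			return (buf);

		if (errno != ERANGE) {
			int saved = errno;	/* free() may change errno */

			free(buf);
			errno = saved;
			return (NULL);
		}
		size *= 2;	/* buffer too small: try again with more room */
	}
}
```

Where realpath(".", NULL) is available, it does the same in a single call and also returns memory that the caller must free.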

Friday May 02, 2014

Solaris 11.2: No Limits

In the past, I have increased a number of limitations in Solaris:

  • In Solaris 11.0, I increased NGROUPS_MAX to 1024 (from 32); also available since Solaris 10u8.
  • In Solaris 11.1, I added support for more than 16 groups for NFS AUTH_SYS authentication
  • In Solaris 11.1, I changed the system calls getcwd() and realpath() to support returning pathnames longer than MAXPATHLEN (and introduced frealpath() while I was in that code)

So what did I change in Solaris 11.2?   It was about time to look at the restrictions of user names and group names.

In a micro release, such as a Solaris 11 update, we cannot modify constants such as LOGNAME_MAX because of binary compatibility; we can only do that in a future minor release.  However, we can modify the code that limits usernames.  These are the bugs we have fixed and this shows how much work it actually was:

    14933330 SUNBT4033673 getlogin causes passwd to fail if login name is longer than 8 chars
    14954449 SUNBT4109819 programs inconsistently limit the size of user names
    15059729 SUNBT4435330 logname(1) prints out only part of long login name
    15178384 SUNBT4927530 *w* w(1) truncates usernames to 8 chars
    15393621 SUNBT6551524 su truncates LOGNAME for long usernames.
    15436992 SUNBT6627292 *cron* confused about username lengths
    15550167 SUNBT6819489 *su* sulog source username truncated to 8 chars but not destination
    15574163 SUNBT6857992 ps -u does not support usernames longer than 10 chars
    15579148 SUNBT6866548 last command does not support usernames longer than 8 characters
    17528753 group name handling in Solaris is a standards violation
    17528788 useradd(1m) user name handling problems
    17600453 bug 15226690, find with long usernames, not completely fixed
    17600724 The fix for 14954449 misses some programs (in.rlogind, in.rshd. zone*, dump)
    17625438 group file updates very inefficient.
    17625458 pwck lives in the past
    18068180 SunSSH truncates usernames/home directories with %.100s
    18068355 A few programs still limit the size of user names.
    18068215 passmgmt invents its own limits for the sizes of entries in /etc/passwd

In general, the code was changed to lift limits, but we are still limited by the format of the utmpx file.  The maximum length of a username that can be stored there is 32 bytes.  This is now a safe limit and we support user names of up to 32 characters in length, despite protests from useradd(1m).  getlogin() and getlogin_r() can return a string of at most 33 characters, including the final NUL character.  Of course, getlogin_r() will not store past the end of the buffer given to it but it will now accept a buffer of any size.   Programs changed are, among others:

  • logname(1)
  • w(1)
  • who(1)
  • last(1)
  • ls(1)  - now a 64 bit executable
  • find(1) - now a 64 bit executable
  • passmgmt(1)
  • useradd/usermod/roleadd/rolemod(1m)
  • sshd(1mr)
  • repquota(1m)
  • zfs(1)
  • yppasswd(1)
  • tar(1)
  • lastcomm(1)
  • cron(1) etc
  • newtask(1)
  • ps(1)
  • wall(1)
  • rwall(1)
  • zlogin(1)
  • grpck(1)
  • pwck(1)
  • login(1)
  • in.rexecd(1m), in.rshd(1m), in.rlogind(1m)

And libraries such as libsocket (remote shell/remote login/rexec protocol)

I could only wonder why so many applications cache the return value of getpwuid() and getgrgid() while doing that in a fixed sized character array.

For reasons only known in New Jersey, we didn't allow group names over 8 characters while limiting the characters to lower case and digits; as there is no manifest constant defining the size of a group name, there is no problem increasing it, so we currently support up to 32 characters and we now accept all portable file name characters in a group name (lower and upper case, digits, dot, hyphen and underscore) as long as the name doesn't start with a hyphen. Other than programs caching the result of getpwuid(), I found no other limits on the length of a group name in our code.
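
The group name rules above are simple enough to express as a check in C. The sketch below is only an illustration of those rules, not the code used by groupadd(1m) or grpck(1); valid_groupname() and GROUPNAME_MAX are names invented for this example, precisely because there is no manifest constant for the length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the 32 character limit from the text, not a system constant. */
#define	GROUPNAME_MAX	32

/*
 * Hypothetical helper: accept 1 to 32 characters from the portable file
 * name character set, as long as the name does not start with a hyphen.
 */
bool
valid_groupname(const char *name)
{
	static const char portable[] =
	    "abcdefghijklmnopqrstuvwxyz"
	    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	    "0123456789._-";
	size_t len = strlen(name);

	if (len == 0 || len > GROUPNAME_MAX)
		return (false);
	if (name[0] == '-')
		return (false);
	/* strspn() counts the leading characters that are in the set */
	return (strspn(name, portable) == len);
}
```

A name such as "Build-Team_2.0" passes this check, while "-staff" or a name containing a space does not.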

Thursday May 01, 2014

Solaris 11.2: Immutable Global Zone

This blog is a bit more substantial; it requires some knowledge about Solaris Zones, Immutable Zones and Solaris administration in general. It is high-level; in future I'm hoping to get down to the nuts and bolts.

Immutable Zones

In Solaris 11 we added the Read-Only Root Non-Global Zones, marketed as Immutable Zones; this is a feature that makes a zone tamper-proof.

An Immutable Zone is configured simply by setting the "file-mac-profile" to one of "strict" (not much writeable), "fixed-configuration" and "flexible-configuration" (configuration is writeable but binaries and such are not). This is all implemented in the kernel based on pathnames and depending on the context; the super-user in the global zone can still update the zone or even modify protected files as long as that is not done from within the zone.

We have made some changes to Immutable Non-Global Zones (IMZ, for short) that came out of developing the Immutable Global Zones (IMGZ); we have added a new feature, the "Trusted Path (TP)"; when logged in through the Trusted Path using the "-T" option to zlogin(1m), you can now modify protected files from within the zone. This is much safer as you no longer need to give root access in the global zone nor do you need to boot the IMZ in writeable mode. In the following example, we log in to the zone "fixed" which has been configured with the fixed-configuration file-mac-profile. A normal root login doesn't allow us to modify "/etc/passwd"; I'm using touch(1) under privilege debugging to illustrate the error and what caused the error. When we log in with the "-T" option we suddenly can modify "/etc/passwd" because we're now in the Trusted Path. Notice also that the output from privilege debugging has been clarified; it points to the MWAC(5) manual page and it now also lists the system call name and not the number as it did before.

# zlogin fixed
[Connected to zone 'fixed' pts/3]
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@fixed:~# ppriv -De touch /etc/passwd
touch[117063]: MWAC(5) policy violation (euid = 0, syscall = "utimensat") for "/etc/passwd" at fop_setattr+0x10b
touch: cannot change times on /etc/passwd: Read-only file system
root@fixed:~# logout
[Connection to zone 'fixed' pts/3 closed]

# zlogin -T fixed
[Connected to zone 'fixed' pts/3]
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@fixed:~# ppriv -De touch /etc/passwd

Additionally, we have restricted the use of mount(1m) in an IMZ: while we allowed random loopback mounts before, we now only allow loopback mounts on empty directories unless the file or directory isn't protected by MWAC(5).

Immutable Global Zone

In order to prevent tampering of the file system, we have extended Immutable Zones in Solaris 11.2 to the global zone; using the same mechanism you can now configure the global zone as an IMGZ. As there is no "super-global" zone, a different mechanism has been designed to enter the Trusted Path. A kernel-zone still has a bare-metal zone controlling it, so this doesn't apply to kernel zones. Some additional steps need to be taken and they are listed here.

Preparing the global zone for immutable global zone.

Maintenance of the global zone is only possible using Trusted Path access, and the Trusted Path is only available on the console, so make sure the console is accessible through the ILOM, a serial connection or through the graphical console.

Once a system is configured as an immutable global zone, the break sequence, F1-A on a graphical console, <break> or the alternate break sequence (CR-tilde-<ctl-b>) on a serial console, will instead start the Trusted Path login. (An immediate second break sequence will work as a standard break sequence: start the kernel debugger (if it is loaded), drop to the OBP, etc.)

Configuring the Global Immutable Zone

The configuration of the global zone is done through zonecfg(1m) by picking the appropriate file-mac-profile for your situation; the allowed values are the same as for non-global immutable zones: "strict", "fixed-configuration", "flexible-configuration". See zonecfg(1m).

Note that if the system uses DHCP to set network interfaces, the "flexible-configuration" must be selected.

        # zonecfg -z global
        zonecfg:global> set file-mac-profile=flexible-configuration

The "rpool" dataset will be restricted, but sub-datasets can be made unrestricted using "add dataset":

        zonecfg:global> add dataset
        zonecfg:global:dataset> set name=rpool/export
        zonecfg:global:dataset> end

        zonecfg:global> add dataset
        zonecfg:global:dataset> set name=rpool/zones
        zonecfg:global:dataset> end

In this example we add "rpool/export" and "rpool/zones": writable datasets for users and for zones. An immutable global zone can only run zones in unrestricted datasets. All the children of an unrestricted dataset are also unrestricted.

Note that all datasets on other zpools are unrestricted and there is no need to add them with "add dataset".

After the commit in zonecfg, boot information is written and the boot archive is updated:

        zonecfg:global> commit
        updating /platform/sun4u/boot_archive

Once the system is configured, it should be rebooted; it will then boot with an immutable global zone.

Maintenance of the immutable global zone

An immutable zone cannot be updated other than through the Trusted Path login or when the system is booted in writeable mode by using the "-w" flag when booting. Note that if you try to reboot the immutable zone with "reboot -- -w", the argument is ignored when not performed through the Trusted Path login.

After using the break-sequence on the console, you should be greeted with:

        trusted path console login:

Log in and assume the root role; at that point ordinary commands used to update the system are available; this includes "pkg update", "beadm activate" or also "zonecfg" if the need arises to change the global zone's configuration.

A separate pam stack can be configured for tpdlogin(1).

When "pkg update" is performed, the first boot of the immutable global zone is read write; this is needed by the system to perform the needed self-assembly steps. When the self-assembly steps have been performed, the system will reboot and in this second boot the system will be immutable again.

Wednesday Apr 30, 2014

Solaris 11.2: User, Pid and Commands in netstat(1m)

As it has been years since I've blogged, let me start with one of the smallest features I added to Solaris 11.2: an option to netstat(1m), allowing administrators to figure out who is using which port and which process or command is using a particular network connection.

As there is little or no similarity between other netstat implementations, we picked our own option letter "-u". At the same time we realigned the columns as the standard width didn't fit modern TCP window sizes, the length of Unix sockets, etc. We've also removed, for unprivileged users, unusable information such as the "kernel addresses", leaving a bit more room, though an 80 width terminal isn't really enough room for all of the information. Alignment is only guaranteed with -n, of course.

Our implementation doesn't use /proc as the Linux implementation does, nor does it look through /dev/kmem like lsof(1m) does; instead we get the information directly in the kernel. While some of the information might be out of date, we can give information about sockets in TIME_WAIT or CLOSE_WAIT, even when the latter sockets haven't been accepted yet! Additionally, those sockets owned by the kernel are also listed. This works in the global zone, non-global zones, kernel zones *and* even in Solaris 10 branded zones; the latter uses the "native" Solaris 11.2 netstat command.

Here is some sample output, partially hidden by how we format blogs (so, install Solaris 11.2 and all will be revealed)

% netstat -aun

   Local Address        Remote Address      User    Pid      Command       State
-------------------- -------------------- -------- ------ -------------- ----------
      *.50258                             root       1038 syslogd        Idle
      *.*                                 root        133 in.mpathd      Unbound
      *.*                                 root        133 in.mpathd      Unbound
      *.*                                 netadm      721 nwamd          Unbound
      *.*                                 netadm      721 nwamd          Unbound
      *.123                               root        961 ntpd           Idle
      *.123                               root        961 ntpd           Idle
                                          root        961 ntpd           Idle
10.311.249.18.123                         root        961 ntpd           Idle
      *.111                               daemon      980 rpcbind        Idle
      *.*                                 daemon      980 rpcbind        Unbound
      *.41327                             daemon      980 rpcbind        Idle
      *.111                               daemon      980 rpcbind        Idle
      *.*                                 daemon      980 rpcbind        Unbound
      *.37058                             daemon      980 rpcbind        Idle
      *.*                                 root        988 in.ndpd        Unbound
      *.*                                 root        999 statd          Unbound
      *.*                                 root        999 statd          Unbound
      *.39150                             root        999 statd          Idle
      *.43382                             root        999 statd          Idle
      *.4045                              daemon     1008 lockd          Idle
      *.4045                              daemon     1008 lockd          Idle
      *.56874                             root       1004 inetd          Idle
      *.37069                             root       1004 inetd          Idle
      *.42765                             root       1148 mountd         Idle
      *.64957                             root       1148 mountd         Idle
      *.2049                              root       1150 nfsd           Idle
      *.2049                              root       1150 nfsd           Idle

   Local Address                     Remote Address                   User    Pid      Command       State      If
--------------------------------- --------------------------------- -------- ------ -------------- ---------- -----
      *.*                                                           root        133 in.mpathd      Unbound    
      *.*                                                           netadm      721 nwamd          Unbound    
      *.123                                                         root        961 ntpd           Idle       
::1.123                                                             root        961 ntpd           Idle       
      *.111                                                         daemon      980 rpcbind        Idle       
      *.*                                                           daemon      980 rpcbind        Unbound    
      *.41327                                                       daemon      980 rpcbind        Idle       
      *.*                                                           root        988 in.ndpd        Unbound    
      *.39150                                                       root        999 statd          Idle       
      *.4045                                                        daemon     1008 lockd          Idle       
      *.37069                                                       root       1004 inetd          Idle       
      *.42765                                                       root       1148 mountd         Idle       
      *.2049                                                        root       1150 nfsd           Idle       

   Local Address        Remote Address      User     Pid     Command     Swind  Send-Q  Rwind  Recv-Q    State
-------------------- -------------------- -------- ------ ------------- ------- ------ ------- ------ -----------
                           *.*            root        133 in.mpathd           0      0  128000      0 LISTEN
      *.111                *.*            daemon      980 rpcbind             0      0  128000      0 LISTEN
      *.*                  *.*            daemon      980 rpcbind             0      0  128000      0 IDLE
      *.111                *.*            daemon      980 rpcbind             0      0  128000      0 LISTEN
      *.*                  *.*            daemon      980 rpcbind             0      0  128000      0 IDLE
      *.36887              *.*            root        999 statd               0      0  128000      0 LISTEN
      *.65159              *.*            root        999 statd               0      0  128000      0 LISTEN
10.311.249.18.58810  10.312.132.13.636    root        851 nscd            49232      0  128872      0 ESTABLISHED
      *.4045               *.*            daemon     1008 lockd               0      0 1049200      0 LISTEN
      *.4045               *.*            daemon     1008 lockd               0      0 1048952      0 LISTEN
      *.22                 *.*            root       1030 sshd                0      0  128000      0 LISTEN
                           *.*            root       1068 sendmail            0      0  128000      0 LISTEN
                           *.*            root       1068 sendmail            0      0  128000      0 LISTEN
      *.47629              *.*            root       1148 mountd              0      0  128000      0 LISTEN
      *.35906              *.*            root       1148 mountd              0      0  128000      0 LISTEN
      *.2049               *.*            root       1150 nfsd                0      0 1049200      0 LISTEN
      *.2049               *.*            root       1150 nfsd                0      0 1048952      0 LISTEN
                           *.*            pkg5srv    1600                     0      0  128000      0 LISTEN
10.311.249.18.857    10.311.246.25.2049   casper        0 <kernel>        49232      0 1049800    116 ESTABLISHED
10.311.249.18.22     10.311.249.34.64127  root       1030 sshd           263536     63  128872      0 ESTABLISHED
                           *.*            casper     1969 sshd                0      0  128000      0 LISTEN

   Local Address                     Remote Address                   User    Pid      Command      Swind  Send-Q  Rwind  Recv-Q   State      If
--------------------------------- --------------------------------- -------- ------ -------------- ------- ------ ------- ------ ----------- -----
::1.5999                                *.*                         root        133 in.mpathd            0      0  128000      0 LISTEN      
      *.111                             *.*                         daemon      980 rpcbind              0      0  128000      0 LISTEN      
      *.*                               *.*                         daemon      980 rpcbind              0      0  128000      0 IDLE        
      *.36887                           *.*                         root        999 statd                0      0  128000      0 LISTEN      
      *.4045                            *.*                         daemon     1008 lockd                0      0 1049200      0 LISTEN      
      *.22                              *.*                         root       1030 sshd                 0      0  128000      0 LISTEN      
::1.25                                  *.*                         root       1068 sendmail             0      0  128000      0 LISTEN      
      *.47629                           *.*                         root       1148 mountd               0      0  128000      0 LISTEN      
      *.2049                            *.*                         root       1150 nfsd                 0      0 1049200      0 LISTEN      
::1.6010                                *.*                         casper     1969 sshd                 0      0  128000      0 LISTEN      
::1.51794                         ::1.6010                          casper     1970 xterm           130880      0  139264      0 ESTABLISHED 
::1.6010                          ::1.51794                         casper     1969 sshd            139060      0  130880      0 ESTABLISHED 

Active UNIX domain sockets
Type       User        Pid Command        Local Address                           Remote Address
stream-ord casper     1969 sshd            (socketpair)                            (socketpair)
stream-ord casper     1969 sshd            (socketpair)                            (socketpair)
stream-ord casper     1969 sshd            (socketpair)                            (socketpair)
stream-ord casper     1969 sshd            (socketpair)                            (socketpair)
stream-ord casper     1969 sshd            (socketpair)                            (socketpair)
stream-ord root        372 dbus-daemon    /var/run/dbus/system_bus_socket
stream-ord root       1028 rmvolmgr                                               /var/run/dbus/system_bus_socket
stream-ord root        372 dbus-daemon    /var/run/dbus/system_bus_socket
stream-ord root        943 hald                                                   /var/run/dbus/system_bus_socket
stream-ord root       1004 inetd          /system/volatile/inetd.uds
stream-ord root        943 hald           /system/volatile/hald/dbus-TM2nMhzrpM
stream-ord root        993 hald-addon-sto                                         /system/volatile/hald/dbus-TM2nMhzrpM
stream-ord pkg5srv    1601 httpd.worker   /system/volatile/pkg/sysrepo/wsgi.1601.0.1.sock
stream-ord root        943 hald           /system/volatile/hald/dbus-TM2nMhzrpM
dgram      root        988 in.ndpd        /system/volatile/in.ndpd_mib
stream-ord root        988 in.ndpd        /system/volatile/in.ndpd_ipadm
stream-ord root        970 hald-addon-cpu                                         /system/volatile/hald/dbus-TM2nMhzrpM
stream-ord root        943 hald           /system/volatile/hald/dbus-MIhDasTVfy
stream-ord root        944 hald-runner                                            /system/volatile/hald/dbus-MIhDasTVfy
stream-ord root        943 hald           /system/volatile/hald/dbus-MIhDasTVfy
stream-ord root        943 hald           /system/volatile/hald/dbus-TM2nMhzrpM
stream-ord root        372 dbus-daemon    /var/run/dbus/system_bus_socket
stream-ord root        922 console-kit-da                                         /var/run/dbus/system_bus_socket
stream-ord root        196 rad            /system/volatile/rad/radsocket-unauth
stream-ord root        372 dbus-daemon     (socketpair)                            (socketpair)
stream-ord root        372 dbus-daemon     (socketpair)                            (socketpair)
stream-ord root        196 rad            /system/volatile/rad/radsocket
stream-ord root        372 dbus-daemon    /var/run/dbus/system_bus_socket

Adding the -v option also shows the full command line:

   Local Address        Remote Address      User    Pid     State       Command
-------------------- -------------------- -------- ------ ---------- ----------------
      *.50258                             root       1038 Idle       /usr/sbin/syslogd
      *.*                                 root        133 Unbound    /lib/inet/in.mpathd
      *.*                                 root        133 Unbound    /lib/inet/in.mpathd
      *.*                                 netadm      721 Unbound    /lib/inet/nwamd
      *.*                                 netadm      721 Unbound    /lib/inet/nwamd
      *.123                               root        961 Idle       /usr/lib/inet/ntpd -p /var/run/ -g

And for half-closed connections, you also get the information you want:

   Local Address        Remote Address      User     Pid     Command     Swind  Send-Q  Rwind  Recv-Q    State
-------------------- -------------------- -------- ------ ------------- ------- ------ ------- ------ -----------
                                          casper     1033 closewait      130880      0  139264      0 FIN_WAIT_2
                                          casper     1031 closewait      139264      0  130880      0 CLOSE_WAIT
                                          casper     1033 closewait      130880      0  139264      0 FIN_WAIT_2
                                          casper     1031 closewait      139264      0  130880      0 CLOSE_WAIT
                                          casper     1033 closewait      130880      0  139264      0 FIN_WAIT_2
                                          casper     1031 closewait      139264      0  130880      0 CLOSE_WAIT
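
If you want to post-process this output in a script, for example to find out which process owns a given UDP port, something like the following rough Python sketch might do. It is not part of netstat. It assumes the column layout of the UDP table shown above (no -v, so the command is a single word), and the names parse_udp_rows and owners_of_port are made up for this illustration. Rows whose local address is missing are simply skipped.

```python
from collections import namedtuple

# One data row of the UDP table; field names are illustrative only.
UdpSocket = namedtuple("UdpSocket", "local remote user pid command state")


def parse_udp_rows(text):
    """Parse the data rows of a UDP table laid out like the sample above."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, the dashed separator and any row
        # that lacks a column (the third field from the right must be a pid).
        if len(fields) not in (5, 6) or not fields[-3].isdigit():
            continue
        if len(fields) == 5:
            # Unconnected socket: the Remote Address column is empty.
            local, user, pid, command, state = fields
            remote = None
        else:
            local, remote, user, pid, command, state = fields
        rows.append(UdpSocket(local, remote, user, int(pid), command, state))
    return rows


def owners_of_port(rows, port):
    """Return sorted, de-duplicated (pid, command) pairs bound to a local port."""
    suffix = "." + str(port)
    return sorted({(r.pid, r.command) for r in rows if r.local.endswith(suffix)})


# A few rows copied from the sample output above.
SAMPLE = """\
   Local Address        Remote Address      User    Pid      Command       State
-------------------- -------------------- -------- ------ -------------- ----------
      *.50258                             root       1038 syslogd        Idle
      *.*                                 root        133 in.mpathd      Unbound
      *.123                               root        961 ntpd           Idle
"""

print(owners_of_port(parse_udp_rows(SAMPLE), 123))  # prints [(961, 'ntpd')]
```

Splitting on whitespace keeps the sketch short, but it would break on the -v output, where the command column contains spaces.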

PS: I used the Hollywood IP extension to masquerade the IP addresses.

Monday Mar 12, 2007

OGB election

About a week ago I accepted my nomination for the OGB after being nominated by Garrett D'Amore in mid-February.

Why do I nominate myself?

I've always felt a strong sense of community with all folks involved with Unix, SunOS and later Solaris. Having earned the dubious distinction of running one of the few Solaris 2.1 sites in production and sharing my experiences of that time with the world, I can truly say that I have been part of the Solaris community pretty much from the day it was born.

I joined Sun some years later, in 1995, and continued to be outward facing and involved with the community, regardless of whatever folly reigned at Sun at the time, such as the edict that all outside communication needed to be approved by a PR person. Surely they wouldn't have found the time to approve my 1000s of posts, even if they had found the will.

As the most prolific Sun employee/poster in OpenSolaris I believe I have firmly established my role as a community player; leading the laptop community and sharing some of the stuff I made through the OpenSolaris website.

I also think the OGB needs a person who is well-versed in Solaris "ON" development; someone who knows an ARC from a C-team and who has more than passing knowledge of our development process.

As an OGB member, I want to focus first and foremost on getting the open development process moving ahead more smoothly, and this does mean direct commit access. The current system is too much of a bottleneck for external development. As a Sun employee, I can look at both sides of the fence, which can help resolve issues between Sun and the community.

But I also believe in quality all the time; it is what makes Solaris fairly stable to use, even using the more or less experimental releases.

I've written a bunch of code and have literally built 100s of kernels; some of them blew up spectacularly, but code from others found its way back into (Open)Solaris, such as Solaris privileges and getpeerucred(). I've distributed experimental code (acpidrv, powernow) and even experimented with the X server. As a security person, I have needed to touch larger parts of the system than many of my peers; security bugs know no boundaries.

I make a point of always running the latest release of Solaris Nevada on most of my systems, that is, unless there's fatal breakage. So my little home server, my laptop and my desktops all run Snv_59 today.

Once we have established all our procedures, I see the OGB pretty much as a hands-off body. We are there to quell conflicts but I don't think we should be pussyfooting around the mailing lists; a spade is a spade so let's not call it by another name. Arguments are healthy and should not be suppressed unless they become destructive. I like to think that developers like it that way: just enjoy doing their thing with as little outside interference as possible.

Yes, I'm late in posting this; let me just say that life was pretty full the last few weeks. Both workwise (moving office and some urgent matters which required me to skip some vacation) and personal (buying a new house). Lame excuses, I know.

Let your vote be counted!

Thursday Oct 19, 2006

NLOSUG: 26/10/2006 Dutch OpenSolaris User Group First Meeting

The Dutch OpenSolaris User Group will have a meeting at the Sun offices in Amersfoort on October 26, 2006. For the program and registration, see the website.

Monday Jan 16, 2006

Updated drivers: but only at

I've updated the powernow driver because of a serious incompatibility with the upcoming Solaris build 32.
I've also updated acpidrv and moved it to the OpenSolaris laptop community.

Start with downloading the frkit script and running it.

Tuesday Oct 25, 2005

Small acpidrv update

I've created a small update to acpidrv which lets you specify automatic shutdown parameters when your battery runs low.

As usual, it can be found here.



