Friday Oct 31, 2014

Collected advice on Unix CLI Design & Implementation

Last week, a blog post by made the rounds. It presented nine suggestions on what makes a command a good citizen of the Unix command-line ecosystem, especially for fitting into pipelines and filters.

This reminded me of a longer list of guidelines I recently gathered as part of our efforts to train new hires in Solaris engineering. I polled long time engineers, trawled the Best Practices documents of our Architecture Review Committee, cross referenced to the WCAG2ICT accessibility guidelines for non-web applications recommended by Oracle’s accessibility group, and linked to our online documentation, to come up with our suggestions on writing new CLI tools for Solaris.

Since these may be useful to others writing commands, I figured I’d share some of them. I’ve left out the bits specific to complying with our internal policies or using private interfaces that aren’t documented for external use, but many of these are generally applicable. Do note that these are based in part on lessons learned from 40+ years of Unix history, and that history means that many existing commands do not follow these suggestions, and in some cases, can’t follow them without breaking backwards compatibility, so please don’t start calling tech support to complain about every case our old code isn’t doing one of these things.

One of the key points of our best practices is that many commands belong to part of a larger family, and it’s best to fit in with that existing family. If you’re writing a Solaris-specific command, it should follow the Solaris Command Line Interface Paradigm guidelines (as listed in the Solaris Intro(1) man page), but GNU commands should instead follow the GNU Coding Standards, classic X11 commands should use the options described in the OPTIONS section of the X(7) man page, and so on.

Command names & paths

  • Most new commands should have names 3-9 letters long. Command names longer than 9 letters should be commands users rarely have to type in.
  • Follow common naming patterns, such as:
    Pattern Usage
    *adm Command to change current state & administer a subsystem
    *cfg Command to make permanent configuration changes to a subsystem
    *info Command to print information about objects managed by a subsystem
    *prop Command to print properties of objects managed by a subsystem
    *stat Command to print/monitor statistics on a subsystem
  • Commands run by normal users should be delivered in /usr/bin/. Commands normally only run by sysadmins should be delivered in /usr/sbin/. Commands only run by other programs, not humans, should be in an appropriate subdirectory under /usr/lib/. (Commands not delivered with the OS should instead use the appropriate subdirectory under /opt instead of /usr in the above paths.)


  • Never provide an option to take a password or other sensitive data on the command line or environment variables, as ps and the proc tools can show those to other users. (see Passing secrets to subprocesses).
  • All commands should have a --help and -? option to print recognized options/arguments/subcommands.
  • Option parsing should use one of the standard getopt() routines if at all possible. If you don’t use one, your custom parser will need to replicate a lot of things the standard routines provide for error checking & handling.
    • When reporting errors, be specific about which argument/option failed, don’t just dump usage output and make the user guess which part of the command line was wrong. (See WCAG2ICT #3.3.1. Examples of fixing this in X11 programs: bitmap, fslsfonts, mkfontscale, xgamma, xpr, xsetroot.)
    • If possible, provide suggestions to correct - if option is invalid, list options that would be valid. Same for subcommands, arguments, etc. (See WCAG2ICT #3.3.3.)
  • Option flags should be similar across commands when possible.


If you are writing a command that uses subcommands, then being careful in your work can make your command much easier to use.

Good examples to follow: hg, zfs, dladm

  • The help subcommand should list the other subcommands, but not overwhelm the user with pages of details on all of them. (Remember, the Solaris kernel text console has no scrollback and users with text-to-speech don’t want 10 minutes of output from it.)
    Good examples: hg, svccfg
  • The help foo or foo --help subcommands should list the options specific to that subcommand.
    Good examples: hg
  • Look at existing commands with similar subcommands and use similar names for your subcommands

Text output

  • All functionality should be available when TERM=dumb. Use of color output, bold text, terminal positioning codes, etc. can be used to enhance output on more capable terminals, but users need to be able to use the system without it. Users may need to run different commands to get plain text interface instead of curses/terminal mode, such as ed instead of vi, or mailx vs. mutt, as long as it’s clearly documented what they need to run instead, but they must be able to get their work done in some way. (See WCAG2ICT #1.3.2, WCAG2ICT #1.4.1, & WCAG2ICT #1.4.3)
  • Text output is generally composed of messages and data. Messages are the text included in the program, such as status descriptions, error messages, and output headers; while data comes from the subsystem the command interacts with, and depends on the system in question.
    • Messages displayed to users should use gettext(3C) to allow translation & localization.
    • Errors should be printed to stderr, other output to stdout, unless specific output redirection options (such as logging errors to a file) are given.
  • Users should be able to disable any use of ASCII art, line drawing characters, figlet-style text and any other output other than plain text which a text-to-speech screen reader cannot figure out how to read, while not losing information, only formatting of it. (See WCAG2ICT #1.1.1 & WCAG2ICT #1.4.5)
  • Error messages should include the program name to help track down which program produced an error in a shell script, SMF method, etc. This is automatically done if you use the standard libc functions err, verr, errx, verrx, warn, vwarn, warnx, vwarnx (3c).
  • Parsable output should follow the design outlined in Creating Shell-Friendly Parsable Output:
    • Parsable output should require the user to specify the fields to output, via a -o or similar flag, so that new fields can be added to the command without breaking existing parsers.
    • Headers should be omitted in parsable output mode or when a flag such as -H is specified to omit them.
    • Parsable output should use a non-whitespace delimiter, such as “:” between fields.

Privileges on Solaris

User Interaction

  • If you offer an interactive command prompt mode, such as svccfg does for executing subcommands, consider using libtecla or similar support for command line editing in this mode.
  • Any operation that may permanently alter or destroy data should either have an “undo” option (such as rollback to prior snapshot) or have a mode offering the user a chance to confirm (such as the -i option to rm). (See WCAG2ICT #3.3.4)
  • Users should be able to configure timeout lengths for any operation that expects user interaction before a timeout expires. (See WCAG2ICT #2.2.1)



Wednesday Oct 01, 2014

Moving Oracle Solaris to LP64 bit by bit

A few people have noticed a trend in the Oracle Solaris 11 update releases of delivering more and more Solaris commands as 64-bit binaries, so I figured it was time to write a detailed explanation to answer some of the questions and help prepare users & developers for further change, as it now becomes more critical to deliver shared objects in both 32-bit (ILP32) and 64-bit (LP64) versions.

I’d like to thank Ali, Gary, Margot, Rod, and Jeff for their feedback on this post, and most especially to Sharon for helping rework it to get to the most important bits first.

For the developers and administrators who are familiar with Solaris history, I’ll start with info about what you should be doing to make sure you’re ready for increased use of 64-bit software in Solaris. You can refer to the section that describes issues that arise when converting binaries — I cover examples from the X Window System packages for Oracle Solaris that I’ve done the conversion work for, and discuss LP64 conversion work by other Solaris engineers.

For those who need more background, see the sections about LP64 history in Solaris and the Application Binary Interface (ABI) differences between 32-bit and 64-bit binaries.

What do you need to do?

You need to do what you have always done — ensure you have both 32-bit and 64-bit versions of any shared objects. This requirement is being highlighted because the consequences of not providing both binary versions is becoming more disruptive in each subsequent Solaris release.

Development requirements

If you develop software for Solaris, the requirement is to provide both 32-bit and 64-bit versions of any shared objects you provide for other software to use, whether as libraries to link their programs against or as loadable objects for frameworks such as localized input methods, PAM service modules, custom crypt(3C) password hashes, or dozens of other shared object uses in Solaris.

Additionally, you should make sure all of your software is 64-bit clean, even if you still support it on 32-bit systems. While Solaris 10 still has more than 6 years left of active support, eventually you’ll move on to only supporting Solaris 11 and later and be able to go 64-bit only as well. The Solaris 64-bit Developer’s Guide and Solaris Studio Guide to Converting Applications for a 64-Bit Environment can help here. Oracle’s ISV support team is also looking to provide more assistance in future versions of the Oracle Solaris Preflight Applications Checker tool to help find possible issues for you.

User requirements

Administrators and users should verify that the software that they install and use provides both 32-bit & 64-bit shared objects. If they are not provided, contact the developers to provide both binary versions.

You should also keep an eye on the End of Features (EOF) Planned for Future Releases of Oracle Solaris page to see if something you may still need is being removed because it was determined not to be useful any more when it was reviewed for LP64 conversion. The page is updated regularly as new items make their way through the internal obsolescence review processes. If something appears there that would cause you major grief, let Solaris development know through your sales or support channels, so we can supply a better transition or replacement plan where possible.

And of course, if you find bugs in the Solaris 64-bit converted software, or find that you need a 64-bit version of a particular Solaris library or shared object that’s not already available, file bugs via Oracle Support or the Oracle Partner Network for Solaris.

Delivery of X commands as LP64 binaries

Because Solaris 11 no longer requires programs run on a 32-bit kernel, and the minimum supported system has 64 times as much RAM as the first Ultra 1, Oracle Solaris can now ship programs directly as 64-bit binaries, which better equips them to run on modern sized data sets, while utilizing the full capabilities of today’s hardware, and have started doing so.

For instance, before Solaris 11 shipped, I switched the default build flags for the X Window System programs in Solaris to 64-bit (with a few exceptions). Solaris has long shipped all the non-obsolete public libraries for X as both 32-bit and 64-bit, and the upstream open source versions of this software were made 64-bit clean long ago, originally by DEC for their Alpha workstations, and maintained since then by the BSD & Linux platforms that delivered 64-bit only distributions instead of multilib models. For the most part, this was just an implementation detail for the X programs - their functionality didn’t change, nor did their interfaces.

One case with a visible impact from becoming LP64-only was the Xorg server, which uses dlopen() to load shared object drivers for the specific hardware in use. Solaris 10 8/07 moved from Xorg 6.9 to 7.2 and started delivering Xorg as LP64 binaries for SPARC & x64, since video cards would soon have more VRAM than a 32-bit X server could access. This was also the first delivery of Xorg for SPARC, and did not need to support 32-bit SPARC platforms, so on SPARC Xorg was LP64 from the beginning. On x86, since Solaris was still supporting 32-bit platforms, the 64-bit Xorg was added alongside the existing 32-bit version. As of the 64-bit only release of Solaris 11, we could drop the 32-bit version of Xorg in Solaris, but because Solaris wasn’t the only source of the loadable driver modules (for instance, nvidia & VirtualBox both provided drivers for their graphics), we ensured that 64-bit versions were available, before announcing the end of support for 32-bit driver modules.

One less visible impact was in xdm, the old style login GUI for X, which Solaris still ships, though most Solaris systems use the more modern GNOME display manager (gdm) instead. In order to authenticate users, xdm uses the PAM framework, which allows administrators to configure a variety of login methods, such as Kerberos or SmartCards. Administrators can also install additional PAM methods to work with authentication systems Solaris doesn’t have built-in support for, or additional pluggable crypt modules to handle other password hashing methods. While PAM has supported 64-bit modules since Solaris 7, and the crypt framework has supported 64-bit modules since it was introduced in Solaris 9, most programs calling PAM and the crypt framework have been 32-bit. Installing only the 32-bit version of a crypt or PAM module thus worked most of the time. However, now that xdm is 64-bit, a 32-bit-only module will generate failure messages for users trying to login via xdm, because the system won’t find a 64-bit module to load. While xdm and PAM show that a non-existent binary can prevent system login, other less prominent 64-bit shared objects are going to be required over time, so providers of shared objects need to ensure they are installing both 32-bit & 64-bit versions of their software going forward.

Some users of custom input methods for different languages may also notice that their input methods are not available in 64-bit programs, since input methods are similarly provided as shared objects that are loaded via dlopen() and thus also have to exist in the same ABI variants as the calling programs.

Conversion of Solaris commands to LP64 binaries

This effort isn’t limited to X11 software — engineers in all areas of Solaris are evaluating the various programs and determining what needs to be done. They have two tasks - to ensure the program itself is 64-bit clean, and to ensure that the surrounding ecosystem (such as the PAM modules example above) is 64-bit ready. In some cases, they’ve found software Solaris doesn’t need to ship any longer, like the gettable tool for maintaining Internet host tables in the days before DNS, and are publishing End of Support notices for them. But for most cases, they’re working to deliver 64-bit versions into Solaris as time allows.

The results of this work are already visible — the number of LP64 programs in /usr/bin and /usr/sbin in the full Oracle Solaris package repositories has climbed each release:

Solaris 11.01707 (92%)144 (8%)1851
Solaris 11.11723 (92%)150 (8%)1873
Solaris 11.21652 (86%)271 (14%)1923
(These numbers only count ELF binaries, not python programs, shell scripts, etc.)

In Solaris 11.0, X11 programs provided the bulk of the LP64 programs, but there were some from other subsystems, such as gdb, emacs, and the NSS crypto commands. Solaris 11.1 added LP64 crypto commands, including digest, decrypt, elfsign, pktool, and tpmadm; as well as other commands like top & gzip. Solaris 11.2 added a number of LP64 GNU programs, including the GNU coreutils and groff. New tools added in 11.2 were 64-bit in their first delivery, such as mlocate, the Intel GPU tools, and the jsl JavaScript Lint package. The bzip2 compression program was made LP64 in 11.2 at the request of a customer who uses it on large files and wanted the extra performance on x64.

Solaris 11.2 also adds Java 8 packages, alongside Java 7, and the now deprecated Java 6 packages. Java 8 for Solaris is 64-bit-only, dropping the 32-bit binary option found in previous Java releases. As noted under the Removal of 32-bit Solaris item in the Features Removed From JDK8 list, the Java plugin for web browsers was also removed from Java 8 for Solaris, because a 64-bit Java plugin cannot be run in the 32-bit Firefox browser.

Solaris Data Model History

Solaris 11.2 represents the latest step in a 20 year journey.

The original Solaris 2.x ABI used a data model referred to as ILP32, in which the size of the C language types “int”, “long”, and pointers are all 32-bit numbers. This data model matched the SPARC & x86 CPUs available at the time.

In 1995, Sun introduced its first SPARC v9 CPU’s, the UltraSPARC I, which offered 64-bit integers and addresses. Solaris 7 followed, bringing a second ABI to Solaris, using the LP64 model, in which “int” remained 32-bits wide, but “long” and pointer sizes doubled to 64-bits.

This affected both the kernel and user space code, and Solaris 7 delivered both 32-bit (ILP32) and 64-bit (LP64) kernels for UltraSPARC systems. The Solaris kernel implementation only allowed running 64-bit user space processes if a 64-bit kernel was loaded, but 32-bit user space software could be used with either a 32-bit or 64-bit kernel.

For user-space programs, the class (32 or 64-bit) of libraries must match the program that links to them. In order to support both 32 and 64-bit programs, it was necessary to provide both 32 and 64-bit versions of libraries. To preserve binary compatibility with existing 32-bit software, the libraries in directories such as /usr/lib were left as the 32-bit versions, and the 64-bit versions were added in a new sparcv9 subdirectory. On modern Linux systems, this approach is now called “multilib.”

Because the UltraSPARC CPUs did not impose any significant performance penalty when running existing 32-bit code on a 64-bit CPU, Sun continued to ship 32-bit user-space programs in the ILP32 ABI so that the same binary could be used on both 32-bit-only and 64-bit-capable CPUs, reducing development, testing, and support costs. It also reduced by half the memory requirements for pointers and long ints (thus more easily fitting them in the CPU caches) in programs which wouldn’t benefit from the larger sizes, an important consideration in systems with only 32MB of RAM. For the small number of programs that had to run 64-bit to be able to read 64-bit kernel structures or debug other 64-bit binaries, Solaris shipped both 32-bit & 64-bit versions, with isaexec used to execute the version matching the kernel in use.

On the x86 side, nearly a decade later AMD’s first AMD64 CPUs provided similar hardware support, and Solaris 10 introduced a matching LP64 ABI for x86 platforms in 2005, with the 64-bit libraries delivered in amd64 subdirectories. Even though 32-bit binaries did not run as fast as fully 64-bit binaries, Solaris followed the same model of providing mostly ILP32 programs to get the release to market faster; to save on development, test, and support costs; and to be consistent with Solaris on SPARC platforms.

In these transitional phases, 32-bit & 64-bit software and hardware coexisted. For SPARC, support for 32-bit-only CPUs was phased out in Solaris 10, when support for the last pre-sun4u platforms was dropped. Support for the 32-bit kernel was dropped at the same time, because all remaining supported hardware could run the 64-bit kernel. For x86, support for 32-bit-only CPUs was dropped in Solaris 11.

Therefore, as of the shipping of Oracle Solaris 11 in November 2011, the supported set of platforms have 64-bit kernels that can run 32-bit or 64-bit user space binaries.

Other Differences Between 32-bit & 64-bit ABIs

While the size of the types is the defining difference between the two ABI models, the opportunity to introduce a fresh new ABI after learning of the mistakes and limitations of the old ABI was hard to resist, and other changes were made as well. Significant differences include:

Additionally, since not all 64 bits in pointers are needed to address current memory sizes, unused ones can be used for additional tasks, such as some of the new features coming in the SPARC M7 CPU.

Some other platforms followed a different strategy. For example, Linux introduced “x32”, a new 32-bit ABI, and is considering a proposal for year-2038-safe ABIs. Engineers at Sun long ago debated a “large time” extension to the 32-bit ABI like the large file interfaces, but decided to concentrate efforts on LP64 instead. Because Solaris is not trying to maintain 32-bit kernel support for embedded devices, that is not a problem we have to solve as we move forward. The result should be a simpler system, which is always a benefit for developers and ISVs.

We don’t know yet when we’ll finish this journey, but hopefully we’ll get there before the industry starts converting software to run on CPUs with 128-bit addressing.

Disclaimer: The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Monday Aug 04, 2014

Solaris 11.2: Changes since beta to bundled software packages

In April, when Solaris 11.2 Beta was released, I posted a list of changes to bundled software packages between Solaris 11.1 & 11.2. Now that the final release of Solaris 11.2 reached General Availability last week, I've gone back to compare the beta release via the GA release.

As you would expect, there are many fewer changes in the three months between beta & GA than in the 18 months before that. Most of the change came from upgrading the OpenStack packages from the Grizzly (2013.1) release to the Havana (2013.2) release, and adding the Swift OpenStack Object Storage components and other packages like Django which the new OpenStack components needed. There are also some general bug fix or security fix updates, such as upgrading OpenSSL from 1.0.1g to 1.0.1h.

One other change that showed up when gathering data for this list was that the Oracle Database 12c prerequisites package was renamed between beta & GA to better match the database naming style - previously it was called group/prerequisite/oracle/oracle-rdbms-server-12cR1-preinstall but is now group/prerequisite/oracle/oracle-rdbms-server-12-1-preinstall. Fortunately, you don't have to type in the whole FMRI to install it, pkg install oracle-rdbms-server-12-1-preinstall is enough.

Detailed list of changes

This table shows most of the changes to the bundled packages between the 11.2 beta released in April, and the 11.2 GA release in July.

As before, some were excluded for clarity, or to reduce noise and duplication. All of the bundled packages which didn’t change the version number in their packaging info are not included, even if they had updates to fix bugs, security holes, or add support for new hardware or new features of Solaris.

PackageUpstream11.2 Beta11.2 GA
cloud/openstack/cinder OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/glance OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/horizon OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/keystone OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/neutron OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/nova OpenStack 0.2013.1.4 0.2013.2.3
cloud/openstack/swift OpenStack not included 1.10.0
developer/java/jdk-7 Java
(Java SE 7u55)
(Java SE 7u65)
developer/java/jdk-8 Java
(Java SE 8u5)
(Java SE 8u11)
diagnostic/wireshark Wireshark 1.10.6 1.10.7
library/java/javadb Java
library/nspr Mozilla NSPR 4.8.9 4.9.5
library/python/ceilometerclient OpenStack not included 1.0.10
library/python/cffi Python CFFI not included 0.8.2
library/python/cinderclient OpenStack 1.0.7 1.0.9
library/python/django Django not included 1.4.11
library/python/dnspython dnspython not included 1.11.1
library/python/dogpile.cache dogpile.cache not included 0.5.3
library/python/dogpile.core dogpile.core not included 0.4.1
library/python/heatclient OpenStack not included 0.2.9
library/python/iso8601 pyiso8601 not included 0.1.10
library/python/jinja2 Jinja not included 2.7.2
library/python/keystoneclient OpenStack 0.4.1 0.8.0
library/python/neutronclient OpenStack 2.3.1 2.3.4
library/python/novaclient OpenStack 2.15.0 2.17.0
library/python/oslo.config OpenStack not included 1.3.0
library/python/pbr OpenStack not included 0.8.1
library/python/pycparser pycparser not included 2.10
library/python/python-memcached python-memcached not included 1.53
library/python/six pypi six not included 1.6.1
library/python/swiftclient OpenStack 2.0.2 2.1.0
library/python/troveclient OpenStack not included 0.1.4
library/python/websockify websockify not included 0.5.1
library/python/xattr xattr not included 0.7.4
library/security/nss Mozilla NSS 4.13.1 4.14.3
library/security/openssl OpenSSL
mail/thunderbird Mozilla Thunderbird 17.0.6 17.0.11
network/dns/bind ISC BIND
network/rsync rsync 3.0.9 3.1.0
runtime/java/jre-7 Java
(Java SE 7u55)
(Java SE 7u65)
runtime/java/jre-8 Java
(Java SE 8u5)
(Java SE 8u11)
security/nss-utilities Mozilla NSS 4.13.1 4.14.3
service/network/dns/bind ISC BIND
shell/bash GNU Bash 4.1.9 4.1.11
system/test/sunvts Oracle VTS 7.18.0 7.18.1
web/browser/firefox Mozilla Firefox 17.0.6 17.0.11
web/java-servlet/tomcat Apache Tomcat 6.0.39 6.0.41
web/server/ejabberd ejabberd 2.1.8 2.1.13

Monday Jun 09, 2014

Solaris 11.2: Functional Deprecation

In Solaris 11.1, I updated the system headers to enable use of several attributes on functions, including noreturn and printf format, to give compilers and static analyzers more information about how they are used to give better warnings when building code.

In Solaris 11.2, I've gone back in and added one more attribute to a number of functions in the system headers: __attribute__((__deprecated__)). This is used to warn people building software that they’re using function calls we recommend no longer be used. While in many cases the Solaris Binary Compatibility Guarantee means we won't ever remove these functions from the system libraries, we still want to discourage their use.

I made passes through both the POSIX and C standards, and some of the Solaris architecture review cases to come up with an initial list which the Solaris architecture review committee accepted to start with. This set is by no means a complete list of Obsolete function interfaces, but should be a reasonable start at functions that are well documented as deprecated and seem useful to warn developers away from. More functions may be flagged in the future as they get deprecated, or if further passes are made through our existing deprecated functions to flag more of them.

Header Interface Deprecated by Alternative Documented in
<door.h> door_cred(3C) PSARC/2002/188 door_ucred(3C) door_cred(3C)
<kvm.h> kvm_read(3KVM), kvm_write(3KVM) PSARC/1995/186 Functions on kvm_kread(3KVM) man page kvm_read(3KVM)
<stdio.h> gets(3C) ISO C99 TC3 (Removed in ISO C11), POSIX:2008/XPG7/Unix08 fgets(3C) gets(3C) man page, and just about every gets(3C) reference online from the past 25 years, since the Morris worm proved bad things happen when it’s used.
<unistd.h> vfork(2) PSARC/2004/760, POSIX:2001/XPG6/Unix03 (Removed in POSIX:2008/XPG7/Unix08) posix_spawn(3C) vfork(2) man page.
<utmp.h> All functions from getutent(3C) man page PSARC/1999/103 utmpx functions from getutentx(3C) man page getutent(3C) man page
<varargs.h> varargs.h version of va_list typedef ANSI/ISO C89 standard <stdarg.h> varargs(3EXT)
<volmgt.h> All functions PSARC/2005/672 hal(5) API volmgt_check(3VOLMGT), etc.
<sys/nvpair.h> nvlist_add_boolean(3NVPAIR), nvlist_lookup_boolean(3NVPAIR) PSARC/2003/587 nvlist_add_boolean_value, nvlist_lookup_boolean_value nvlist_add_boolean(3NVPAIR) & (9F), nvlist_lookup_boolean(3NVPAIR) & (9F).
<sys/processor.h> gethomelgroup(3C) PSARC/2003/034 lgrp_home(3LGRP) gethomelgroup(3C)
<sys/stat_impl.h> _fxstat, _xstat, _lxstat, _xmknod PSARC/2009/657 stat(2) old functions are undocumented remains of SVR3/COFF compatibility support

If the above table is cut off when viewing in the blog, try viewing this standalone copy of the table.

To See or Not To See

To see these warnings, you will need to be building with either gcc (versions 3.4, 4.5, 4.7, & 4.8 are available in the 11.2 package repo), or with Oracle Solaris Studio 12.4 or later (which like Solaris 11.2, is currently in beta testing). For instance, take this oversimplified (and obviously buggy) implementation of the cat command:

#include <stdio.h>

int main(int argc, char **argv) {
    char buf[80];

    while (gets(buf) != NULL)
    return 0;
Compiling it with the Studio 12.4 beta compiler will produce warnings such as:
% cc -V
cc: Sun C 5.13 SunOS_i386 Beta 2014/03/11
% cc gets_test.c
"gets_test.c", line 6: warning:  "gets" is deprecated, declared in : "/usr/include/iso/stdio_iso.h", line 221

The exact warning given varies by compilers, and the compilers also have a variety of flags to either raise the warnings to errors, or silence them. Of couse, the exact form of the output is Not An Interface that can be relied on for automated parsing, just shown for example.

gets(3C) is actually a special case — as noted above, it is no longer part of the C Standard Library in the C11 standard, so when compiling in C11 mode (i.e. when __STDC_VERSION__ >= 201112L), the <stdio.h> header will not provide a prototype for it, causing the compiler to complain it is unknown:

% gcc -std=c11 gets_test.c
gets_test.c: In function ‘main’:
gets_test.c:6:5: warning: implicit declaration of function ‘gets’ [-Wimplicit-function-declaration]
     while (gets(buf) != NULL)
The gets(3C) function of course is still in libc, so if you ignore the error or provide your own prototype, you can still build code that calls it, you just have to acknowledge you’re taking on the risk of doing so yourself.

Solaris Studio 12.4 Beta

% cc gets_test.c
"gets_test.c", line 6: warning:  "gets" is deprecated, declared in : "/usr/include/iso/stdio_iso.h", line 221

% cc -errwarn=E_DEPRECATED_ATT gets_test.c
"gets_test.c", line 6:  "gets" is deprecated, declared in : "/usr/include/iso/stdio_iso.h", line 221
cc: acomp failed for gets_test.c
This warning is silenced in the 12.4 beta by cc -erroff=E_DEPRECATED_ATT
No warning is currently issued by Studio 12.3 & earler releases.

gcc 3.4.3

% /usr/sfw/bin/gcc gets_test.c
gets_test.c: In function `main':
gets_test.c:6: warning: `gets' is deprecated (declared at /usr/include/iso/stdio_iso.h:221)

Warning is completely silenced with gcc -Wno-deprecated-declarations

gcc 4.7.3

% /usr/gcc/4.7/bin/gcc gets_test.c
gets_test.c: In function ‘main’:
gets_test.c:6:5: warning: ‘gets’ is deprecated (declared at /usr/include/iso/stdio_iso.h:221) [-Wdeprecated-declarations]

% /usr/gcc/4.7/bin/gcc -Werror=deprecated-declarations gets_test.c
gets_test.c: In function ‘main’:
gets_test.c:6:5: error: ‘gets’ is deprecated (declared at /usr/include/iso/stdio_iso.h:221) [-Werror=deprecated-declarations]
cc1: some warnings being treated as errors

Warning is completely silenced with gcc -Wno-deprecated-declarations

gcc 4.8.2

% /usr/bin/gcc gets_test.c
gets_test.c: In function ‘main’:
gets_test.c:6:5: warning: ‘gets’ is deprecated (declared at /usr/include/iso/stdio_iso.h:221) [-Wdeprecated-declarations]
     while (gets(buf) != NULL)

% /usr/bin/gcc -Werror=deprecated-declarations gets_test.c
gets_test.c: In function ‘main’:
gets_test.c:6:5: error: ‘gets’ is deprecated (declared at /usr/include/iso/stdio_iso.h:221) [-Werror=deprecated-declarations]
     while (gets(buf) != NULL)
cc1: some warnings being treated as errors

Warning is completely silenced with gcc -Wno-deprecated-declarations

Tuesday Apr 29, 2014

Solaris 11.2: Changes to bundled software packages

When Solaris 11.1 came out in October 2012, I posted about the changes to the included FOSS packages. With the publication today of Solaris 11.2 beta, I thought it would be nice to revisit this and see what’s changed in the past year and a half. This time around, I’m including some bundled packages that aren’t necessarily covered by a free software or open source license, but are of interest to Solaris users.

Removing software in updates

Last time I discussed how IPS allowed us to make a variety of changes in update releases much more easily than in the Solaris 10 package system. One of these changes is obsoleting packages, and we’ve done that in a couple rare cases in both Solaris 11.1 and 11.2 where the software is abandoned by the upstream, and we’ve decided it would be worse to keep it around, potentially broken, than to remove it on upgrade.

When we do this, notices will be posted to the End of Features for Solaris 11 web page, alongside the list of features that have been declared deprecated and may be removed in future releases. As you can see there, in Solaris 11.1 the Adobe Flash Player and tavor HCA driver packages were removed.

In Solaris 11.2, three more packages have been removed. slocate was a “secure” version of the locate utility, which wouldn’t show a user any files that they didn’t have permission to access. Unfortunately, this utility was broken by changes in the AST library, and since there is no longer an upstream for it, we decided to follow the lead of several Linux distros and moved to mlocate instead, which is added in this release.

The other two removed packages are both Xorg video drivers - the nv driver for NVIDIA graphics, and the trident driver for old Trident graphics chipsets. Most users will not notice these removals, but if you had manually created an xorg.conf file specifying one of these drivers, you may need to edit it to use the vesa driver instead.

NVIDIA had previously supported the nv open source driver and contributed updates to X.Org to support new chipsets in it, but in 2010, they announced they would no longer do so, and considered nv deprecated, recommending the use of the VESA driver for those who had no better driver to use. While we had continued to ship the nv driver in Solaris, it led to an increasing number of crashes, hangs, and other bugs for which the resolution was to remove the nv driver and use vesa instead, so we are removing it to end those issues. For systems with graphics devices new enough to be supported by the bundled nvidia closed-source driver, this will have no effect. For those with older devices, this will cause Xorg autoconfiguration to load the vesa driver instead, until and unless the user downloads & installs an appropriate NVIDIA legacy driver.

The trident driver was still in Solaris even after we dropped 32-bit support on x86, and years after Trident Microsystems exited the graphics business and sold its graphics product line to XGI, as the Sun Fire V20z server included a Trident chipset for the console video device. Unfortunately, the upstream driver has been basically unmaintained since then, and Oracle has had to apply patches to port to new Xorg releases. Meanwhile, in order to resolve bugs that caused system hangs, the trident driver was modified to not load on V20z systems, which left us shipping an unmaintained driver solely for a system that could not use it, but uses the vesa driver instead, so we decided to remove it as well.

If you had either of these Xorg driver packages installed, then when you update to 11.2, then pkg update will inform you there are release notes for these drivers, to warn you of the possibility you may need to edit your xorg.conf.

System Management Stack

The popular Puppet system for automating configuration changes across machines has been included in Solaris, and updated to support several Solaris features in both the framework and in individual configuration provideers. For instance, configuration changes made via Puppet will be recorded in the Solaris audit logs as part of a puppet session, and Puppet’s configuration file is generated from SMF properties using the new SMF stencil facilities. Providers are included that can configure IPS publishers, SMF properties, ZFS datasets, Solaris boot environments, and a variety of Solaris NIC, VNIC, and VLAN settings.

Another addition is the Oracle Hardware Management Pack (HMP), a set of tools that work with the ILOM, firmware, and other components in Sun/Oracle servers to configure low-level system options. Previously these needed to be downloaded and installed separately, now they are a simple pkg install away, and kept up to date with the rest of the OS.

A collaboration with Intel led to the integration of a Solaris port of Intel’s numatop tool for observing memory access locality across CPUs.

From the open source world, we’ve integrated several tools to allow admins and users to do multiple things at once, including the tmux terminal multiplexer, cssh tool for cluster administration via ssh, and GNU Parallel for running commands in parallel.

Developer Stack

For developers, GNU Compiler Collection (gcc) versions 4.7 & 4.8 are added alongside the previous 3.4 & 4.5 packages, and the gcc packages have been refactored to better allow installing different subsets of compilers. Other updated developer tools include Mercurial 2.8.2, GNU emacs 24.3, pylint 0.25.2, and version 7.6 of the GNU debugger, gdb. Newly added tools for developers include GNU indent, JavaScript Lint, and Python’s pep8.

The Java 8 development kit & runtime environment are both available as well. The default installation clusters will only install Java 7, but you can install the Java 8 runtime with “pkg install jre-8” or get both the runtime & development kits with “pkg install jdk-8”. The /usr/java mediated link, through which all the links in /usr/bin for the java, jar, javac, etc. commands flow will be set by default to the most recent version installed, so installing Java 8 will make that version default. You can see this via “ls -l /usr/java” reporting:

lrwxrwxrwx   1 root   root     15 Apr 23 14:01 /usr/java -> jdk/jdk1.8.0_05
or via “pkg mediator java” reporting:
java         system    1.8     system
If you want to choose a different version to be default, you can manually set the mediator to that version with “pkg set-mediator -V 1.7 java”. Of course, for many operations, you can directly access any installed java version via the full path, such as /usr/jdk/instances/jdk1.8.0/bin/java instead of relying on the /usr/bin symlinks.

One caveat to be aware of is that Java 8 for Solaris is only provided as 64-bit binaries, as all Solaris 11 and later machines are running 64-bit now. This means that any JNI modules you rely on will need to be compiled as 64-bit and any programs that try to load Java must be 64-bit. There is also no 64-bit version provided of either the Java plugin for web browsers, or the Java Webstart program for starting Java client applications from web pages.

Desktop Stack

Most of the changes in the desktop stack in this release were updates needed to fix security issues, and are mostly covered on the Oracle Third Party Vulnerability Resolution Blog.

There were some feature updates in the X Window System layers of the desktop stack though – most notably the Xorg server was upgraded from 1.12 to 1.14, and the accompanying Mesa library was upgraded to version 9.0.3, which includes support for OpenGL 3.1 and GLSL 1.40 on Intel graphics. The bundled version of NVIDIA’s graphics driver was also updated, to NVIDIA’s latest “long lived branch” - 331. For users with older graphics cards which are no longer supported in this branch, legacy branches are available from NVIDIA’s Unix driver download site.


And last, but certainly not least, especially in the number of packages added to the repository, is the addition of OpenStack support in Solaris. The Cinder Block Storage Service, Glance Image Service, Horizon Dashboard, Keystone Identity Service, Neutron Networking Service, and Nova Compute Service from the OpenStack Grizzly (2013.1) release are all provided, in versions tested and integrated with Solaris features. Between the Open Stack packages themselves and all the python modules required for them, there’s over 100 new FOSS packages in this release.

Detailed list of changes

This table shows most of the changes to the bundled packages between the original Solaris 11.1 release, the latest Solaris 11.1 support repository update (SRU18, released April 14, 2014), and the Solaris 11.2 beta released today.

As with last time, some were excluded for clarity, or to reduce noise and duplication. All of the bundled packages which didn’t change the version number in their packaging info are not included, even if they had updates to fix bugs, security holes, or add support for new hardware or new features of Solaris.

PackageUpstream11.111.1 SRU1811.2 Beta
archiver/gnu-tarGNU tar1.
cloud/openstack/cinderOpenStacknot includednot included0.2013.1.4
cloud/openstack/glanceOpenStacknot includednot included0.2013.1.4
cloud/openstack/horizonOpenStacknot includednot included0.2013.1.4
cloud/openstack/keystoneOpenStacknot includednot included0.2013.1.4
cloud/openstack/neutronOpenStacknot includednot included0.2013.1.4
cloud/openstack/novaOpenStacknot includednot included0.2013.1.4
compress/gzipGNU gzip1.41.51.5
compress/pbzip2Parallel bzip2not includednot included1.1.6
compress/pixzpixznot includednot included1.0
database/berkeleydb-5Oracle Berkeley DB5.
database/mysql-55MySQLnot includednot included5.5.31
developer/build/antApache Ant1.
developer/debug/gdbGNU GDB6.86.87.6
developer/gcc-47GNU Compiler Collectionnot includednot included4.7.3
developer/gcc-48GNU Compiler Collectionnot includednot included4.8.2
developer/gnu-indentGNU indentnot includednot included2.2.9
developer/java/jdk-8Javanot includednot included1.
developer/javascript/jslJavaScript Lintnot includednot included0.3.0
developer/versioning/mercurialMercurial SCM2.
diagnostic/numatopnumatopnot includednot included1.0
editor/gnu-emacsGNU Emacs23.423.424.3
file/gnu-coreutilsGNU Coreutils8.58.58.16
file/mcGNU Midnight Commander4.
file/mlocatemlocatenot includednot included0.25
file/slocate3.13.1not included
library/cacaoCommon Agent Container2.
library/libarchivelibarchivenot includednot included3.0.4
library/libxml2XML C parser2.
library/perl-5/perl-x11-protocolCPAN: X11-Protocolnot includednot included0.56
library/perl-5/xml-libxmlCPAN: XML::LibXMLnot included2.142.14
library/perl-5/xml-namespacesupportCPAN: XML::NamespaceSupportnot included1.111.11
library/perl-5/xml-parser-threaded-512CPAN: XML::Parsernot included2.362.36
library/perl-5/xml-saxCPAN: XML::SAXnot included0.990.99
library/perl-5/xml-sax-baseCPAN: XML::SAX::Basenot included1.081.08
library/perl5/perl-tkCPAN: Tknot includednot included804.31
library/python-2/alembicalembicnot includednot included0.6.0
library/python-2/amqpamqpnot includednot included1.0.12
library/python-2/anyjsonanyjsonnot includednot included0.3.3
library/python-2/argparseargparsenot included1.
library/python-2/babelbabelnot includednot included1.3
library/python-2/beautifulsoup4beautifulsoup4not includednot included4.2.1
library/python-2/botobotonot includednot included2.9.9
library/python-2/cheetahcheetahnot includednot included2.4.4
library/python-2/cliffcliffnot includednot included1.4.5
library/python-2/cmd2cmd2not includednot included0.6.7
library/python-2/cov-corecov-corenot includednot included1.7
library/python-2/cssutilscssutilsnot includednot included0.9.6
library/python-2/d2to1d2to1not includednot included0.2.10
library/python-2/decoratordecoratornot includednot included3.4.0
library/python-2/djangodjangonot includednot included1.4.10
library/python-2/django-appconfdjango-appconfnot includednot included0.6
library/python-2/django_compressordjango_compressornot includednot included1.3
library/python-2/django_openstack_authOpenStacknot includednot included1.1.3
library/python-2/eventleteventletnot includednot included0.13.0
library/python-2/filechunkiofilechunkionot includednot included1.5
library/python-2/formencodeformencodenot includednot included1.2.6
library/python-2/greenletgreenletnot includednot included0.4.1
library/python-2/httplib2httplib2not includednot included0.8
library/python-2/importlibimportlibnot includednot included1.0.2
library/python-2/ipythonipythonnot includednot included0.10
library/python-2/iso8601iso8601not includednot included0.1.4
library/python-2/jsonpatchjsonpatchnot includednot included1.1
library/python-2/jsonpointerjsonpointernot includednot included1.0
library/python-2/jsonschemajsonschemanot includednot included2.0.0
library/python-2/kombukombunot includednot included2.5.12
library/python-2/lesscpylesscpynot includednot included0.9.10
library/python-2/librabbitmqlibrabbitmqnot includednot included1.0.1
library/python-2/lockfilelockfilenot includednot included0.9.1
library/python-2/markdownmarkdownnot includednot included2.3.1
library/python-2/markupsafemarkupsafenot includednot included0.18
library/python-2/mockmocknot includednot included1.0.1
library/python-2/netaddrnetaddrnot includednot included0.7.10
library/python-2/netifacesnetifacesnot includednot included0.8
library/python-2/nose-cover3nose-cover3not includednot included0.0.4
library/python-2/ordereddictordereddictnot includednot included1.1
library/python-2/oslo.configoslo.confignot includednot included1.2.1
library/python-2/passlibpasslibnot includednot included1.6.1
library/python-2/pastepastenot includednot included1.7.5.1
library/python-2/paste.deploypaste.deploynot includednot included1.5.0
library/python-2/pbrpbrnot includednot included0.5.21
library/python-2/pep8pep8not includednot included1.4.4
library/python-2/pippipnot includednot included1.4.1
library/python-2/prettytableprettytablenot includednot included0.7.2
library/python-2/pypynot includednot included1.4.15
library/python-2/pyasn1pyasn1not includednot included0.1.7
library/python-2/pyasn1-modulespyasn1-modulesnot includednot included0.0.5
library/python-2/pycountrypycountrynot includednot included0.17
library/python-2/pydnspydnsnot includednot included2.3.6
library/python-2/pyflakespyflakesnot includednot included0.7.2
library/python-2/pygmentspygmentsnot includednot included1.6
library/python-2/pyparsingpyparsingnot includednot included2.0.1
library/python-2/pyrabbitpyrabbitnot includednot included1.0.1
library/python-2/pytestpytestnot includednot included2.3.5
library/python-2/pytest-capturelogpytest-capturelognot includednot included0.7
library/python-2/pytest-codecheckerspytest-codecheckersnot includednot included0.2
library/python-2/pytest-covpytest-covnot includednot included1.6
library/python-2/python-imagingpython-imagingnot includednot included1.1.7
library/python-2/python-ldappython-ldapnot includednot included2.4.10
library/python-2/python-mysqlpython-mysqlnot includednot included1.2.2
library/python-2/python-zope-interfaceZopenot includednot included3.3.0
library/python-2/pytzpytznot includednot included2013.4
library/python-2/repoze.lrurepoze.lrunot includednot included0.6
library/python-2/requestsrequestsnot includednot included1.2.3
library/python-2/routesroutesnot includednot included1.13
library/python-2/setuptools-gitsetuptools-gitnot includednot included1.0
library/python-2/simplejsonsimplejsonnot includednot included2.1.2
library/python-2/sixsixnot includednot included1.4.1
library/python-2/sqlalchemysqlalchemynot includednot included0.7.9
library/python-2/sqlalchemy-migratesqlalchemy-migratenot includednot included0.7.2
library/python-2/stevedorestevedorenot includednot included0.10
library/python-2/sudssudsnot includednot included0.4
library/python-2/tempitatempitanot includednot included0.5.1
library/python-2/toxtoxnot includednot included1.4.3
library/python-2/unittest2unittest2not includednot included0.5.1
library/python-2/virtualenvvirtualenvnot includednot included1.9.1
library/python-2/waitresswaitressnot includednot included0.8.5
library/python-2/warlockwarlocknot includednot included1.0.1
library/python-2/webobwebobnot includednot included1.2.3
library/python-2/websockifywebsockifynot includednot included0.3.0
library/python-2/webtestWebTestnot includednot included2.0.6
library/python/cinderclientOpenStacknot includednot included1.0.7
library/python/glanceclientOpenStacknot includednot included0.12.0
library/python/keystoneclientOpenStacknot includednot included0.4.1
library/python/neutronclientOpenStacknot includednot included2.3.1
library/python/novaclientOpenStacknot includednot included2.15.0
library/python/quantumclientOpenStacknot includednot included2.2.4.3
library/python/swiftclientOpenStacknot includednot included2.0.2
library/security/opensslOpenSSL1.0.0.10 (1.0.0j) (1.0.0k) (1.0.1g)
mail/thunderbirdMozilla Thunderbird10.0.61717.0.6
mail/thunderbird/plugin/thunderbird-lightningMozilla Lightning10.0.61717.0.6
network/amqp/rabbitmqRabbitMQnot includednot included3.1.3
network/dns/bindISC BIND9.
runtime/clispGNU CLISP2.472.472.49
runtime/java/jre-8Javanot includednot included1.
runtime/perl-threaded-512Perlnot included5.
runtime/ruby-19Rubynot includednot included1.9.3.484
runtime/ruby-19/ruby-tkRubynot includednot included1.9.3.484
service/network/dhcp/isc-dhcpISC DHCP4.
service/network/dns/bindISC BIND9. (9.6-ESV-R7-P2) (9.6-ESV-R10-P2) (9.6-ESV-R10-P2)
service/network/dnsmasqDnsmasqnot includednot included2.68
service/network/ftpProFTPD1. (1.3.3g) (1.3.4c) (1.3.4c)
service/network/ntpNTP4.2.5.200 (4.2.5p200) (4.2.7p381) (4.2.7p381)
service/network/ptpPTPdnot includednot included2.2.0
shell/gnu-getoptGNU getoptnot includednot included1.1.5
shell/parallelGNU parallelnot includednot included0.2012.11.22
system/library/hmp-libsHMPnot includednot included2.2.8
system/library/security/libgcryptGNU libgcrypt1.
system/management/biosconfigHMPnot includednot included2.2.8
system/management/facterPuppetnot includednot included1.6.18
system/management/fwupdateHMPnot includednot included2.2.8
system/management/fwupdate/emulexHMPnot includednot included6.3.12.2
system/management/fwupdate/qlogicHMPnot includednot included1.7.3
system/management/hmp-snmpHMPnot includednot included2.2.8
system/management/hwmgmtcliHMPnot includednot included2.2.8
system/management/hwmgmtdHMPnot includednot included2.2.8
system/management/puppetPuppetnot includednot included3.4.1
system/management/raidconfigHMPnot includednot included2.2.8
system/management/ubiosconfigHMPnot includednot included2.2.8
terminal/csshCluster SSHnot includednot included4.2.1
terminal/tmuxtmuxnot includednot included1.8
text/gnu-grepGNU grep2.102.142.14
text/texinfoGNU texinfo4.74.134.13
web/browser/firefoxMozilla Firefox10.0.61717.0.6
web/java-servlet/tomcatApache Tomcat6.0.356.0.376.0.39
web/php-53/extension/php-zendopcacheZend OPcachenot includednot included7.0.2
web/server/apache-22Apache HTTPD2.
web/server/apache-22/module/apache-fcgidApache FastCGI2.
web/server/apache-22/module/apache-sedApache HTTPD2.
web/wgetGNU wget1.121.121.14
x11/demo/mesa-demosMesa 3-D8.
x11/diagnostic/intel-gpu-toolsX.Orgnot includednot included1.3
x11/library/mesaMesa 3-D7.
x11/server/xorg/driver/xorg-video-nvX.Org2. included
x11/server/xorg/driver/xorg-video-tridentX.Org1. included

Saturday Mar 29, 2014

15 years

Fifteen years ago today, March 29, 1999, I showed up at Sun’s MPK29 building for my first day of work as a student intern. I was off-cycle from the normal summer interns, since I wasn’t enrolled Spring semester but was going to finish my last two credits (English & Math) during the Summer Session at Berkeley. A friend from school convinced his manager at Sun to give me a chance, and after interviewing me, they offered to bring me on board for 3 months full-time, then dropping down to half-time when classes started.

I joined the Release Engineering team as a tools developer and backup RE in what was then the Power Client Desktop Software organization (to differentiate it from the Thin Client group) in the Workstation Products Group of Sun’s hardware arm. The organization delivered the X Window System, OpenWindows, and CDE into Solaris, and was in the midst of two big projects for the Solaris 7 8/99 & 11/99 releases: a project called “SolarEZ” to make the CDE desktop more usable and to add support for syncing CDE calendar & mail apps with Palm Pilot PDAs; and the project to merge the features from X11R6.1 through X11R6.4 (including Xinerama, Display Power Management, Xprint, and LBX) into Sun’s proprietary X11 fork, which was still based on the X11R6.0 release.

I started out with some simple bug fixes to learn the various build systems, before starting to try to write the scripts they wanted to simplify the builds. My first bug was to find out why every time they did a full build of Sun’s X gate they printed a copy of the specs. (To save trees, they’d implemented a temporary workaround of setting the default printer queue on the build machines to one not connected to a printer, but then they had to go delete all the queued jobs every few days to free disk space.) After a couple days of digging, and learning far too much about Imake for my own good, I found the cause was a merge error in the Imake configuration specs. The TroffCmd setting for how to convert troff documents to PostScript had been resynced to the upstream setting of psroff, but the flags were still the ones for Sun’s troff. These flags to Sun’s troff generated a PostScript file on disk, while they made psroff send the PostScript to the printer - switching TroffCmd back to Solaris troff solved the issue, so I filed Sun bug 4227466 and made my first commit to the Solaris X11 code base.

Six months later, at the end of my internship, diploma in hand, Sun offered to convert my position to a regular full-time employee, and I signed on. (This is why I always get my anniversary recognition in September, since they don’t count the internship.) Six months after that, the X11R6.4 project was finishing up and several of the X11 engineers decided to move on, making an opening for me to switch to the X11 engineering team as a developer. Like many junior developers joining new teams, I started out doing a lot of bug fixes, and some RFE’s, such as integrating Xvfb & Xnest into Solaris 9.

A couple years in, Sun’s attempts to reach agreement amongst all of the CDE co-owners to open source CDE had failed, resulting only in the not-really-open-source release of OpenMotif, so Sun chose to move on once again, to the GNOME Desktop. This required us to provide a lot of infrastructure in the X server and libraries that GNOME clients needed, and I got to take on some more challenging projects such as trying to port the Render extension from XFree86 to Xsun, assisting on the STSF font rendering project, adding wheel mouse support to Xsun, and working to ensure Xsun had the underlying support needed by GNOME’s accessibility projects. Unfortunately, the results of some of those projects were not that good, but we learned a lot about how hard it was to retrofit Xsun with the newly reinvigorated X11 development work and how maintaining our own fork of all the shared code was simply slowing us down.

Then one fateful day in 2004, our manager was on vacation, so I got a call from the director of the Solaris x86 platform team, who had been meeting with video card vendors to decide what graphics to include in Sun’s upcoming AMD64 workstations. The one consistent answer they’d gotten was that the time and cost to produce Solaris drivers went up and the feature set went down if they had to port to Xsun as well, instead of being able to reuse the XFree86 driver code they were already shipping for Linux. He asked what it would take to ship XFree86 on Solaris instead, and we discussed the options, then I talked with the other X engineers, and soon I was the lead of a project to replace the Solaris X server.

This was right at the time XFree86 was starting to come apart while the old X.Org industry consortium was trying to move X11 to an open development model, resulting in the formation of the X.Org Foundation. We chose to just go straight to Xorg, not XFree86, and not to fork as we’d done with Xsun, but instead to just apply any necessary changes as patches, and try to limit those to not changing core areas, so it was easier to keep up with new releases.

Thus, right at the end of the Solaris 10 development cycle we integrated the Xorg 6.8 X server for x86 platforms, and even made it the default for that release. We had no SPARC drivers ported yet, just all the upstream x86 drivers, so only provided Xsun in the SPARC release. The SPARC port, along with a 64-bit build for x64 systems, came in later Solaris 10 update releases. This worked out much better than our attempts to retrofit Xsun ever did.

Along with the Solaris 10 release, Sun announced that Solaris was going to be open sourced, and eventually as much of the OS source as we could release would be. But for the Solaris 10 release we only had time to add the Xorg server and the few libraries we needed for the new extensions. Most of the libraries and clients were still using the code from Sun’s old X11R6 fork, and we needed to figure out which changes we could publish and/or contribute back upstream. We weren’t ready to do this yet on the opening day of, but I put together a plan for us to do so going forward, combing the tasks of migrating from our fork to a patched upstream code base, and the work of excising proprietary bits so it could be released as fully open source.

Due to my work with the X.Org open source community, my participation in the existing Solaris communities on Usenet and Yahoo Groups, and interest in X from the initial OpenSolaris participants, I was pulled into the larger OpenSolaris effort, culminating in serving 2 years on the OpenSolaris Governing Board. When the OpenSolaris distro effort (“Project Indiana”) kicked off, I got involved in it to figure out how to integrate the X11 packages to it. Between driving the source tree changes and the distro integration work for X, I became the Tech Lead for the Solaris X Window System for the Solaris 11 release.

Of course, the best laid plans are always beset by changes in the surrounding world, and these were no different. While we finished the migration to a pure open-source X code base in November 2009, it was shortly followed by the closing of Oracle’s acquisition of Sun, which eventually led to the shutdown of the OpenSolaris project. Since the X11 code is still open source and mostly external sources, it is still published on, alongside other open source code included in Solaris. So when Solaris 11 was released in November 2011, while the core OS code was closed again, it included the open source X11 code with the culmination of our efforts.

Solaris 11 also included a few bug fixes I’d found via experimenting with the Parfait static analysis tool created by Sun Labs. Since shortly after starting in the X11 group, I’d been handling the security bugs in X. Having tools to help find those seemed like a good idea, and that was reinforced when we used Coverity to find a privilege escalation bug in Xorg. When the Sun Labs team came to demo parfait to us, I decided to try it out, and found and fixed a number of bugs in X with it (though not in security sensitive areas, just places that could cause memory leaks or crashes in code not running with raised privileges).

After the Oracle acquisition, part of adopting Oracle’s security assurance policies was using static analysis on our code base. With my experience in using static analyzers on X, I helped create the overall Solaris plan for using static analysis tools on our entire code base. This plan was accepted and I was asked to take on the role of leading our security assurance efforts across Solaris.

Another part of integrating Sun into Oracle was migrating all the bugs from Sun’s bug database into Oracle’s, so we could have a unified system, allowing our Engineered Systems teams to work together more productively, tracking bugs from the hardware through the OS up to the database & application level in a single system. Valerie Fenwick & Scott Rotondo had managed the Solaris bug tracking for years, and I’d worked with them, representing both the X11 and larger Desktop software teams. When Scott decided to move to the Oracle Database Security team, Valerie asked me to replace him, just as the effort to migrate the bugs was beginning. That turned into 2 years of planning, mapping, updating tools and processes, coordinating changes across the organization, traveling to multiple sites to train our engineers on the Oracle tools, and lots of communication to prepare our teams for the changes.

As we looked to the finish line of that work, planning was beginning for the Solaris 12 release. All of the above experiences led to me being named as the Technical Lead for the Solaris 12 release as a whole, covering the entire OS, coordinating and working with the tech leads for the individual consolidations. We’re two years into that now, and while I can’t really say much more yet than the official roadmap, I think we’ve got some solid plans in place for the OS.

So here it is now 15 years after starting at Sun, 14 after joining the X11 team. I still do a little work on X in my spare time (especially around the area of X security bugs), and officially am still in the X11 group, but most of my day is spent on the rest of Solaris now, between planning Solaris 12, managing our bug tracking, and ensuring our software is secure.

In those 15 years, I’ve learned a lot, accomplished a bit, and made a lot of friends. I’ve also:

  • filed 2572 bugs & enhancement requests in the Sun/Oracle bug database, and fixed or closed 2155, assuming I got the bug queries correct.
  • worked under 3 CEO’s (Scott, Jonathan, Larry), and 4 managers, but more VP’s and Directors than I can count thanks to Sun’s love of regular reorgs.
  • been part of the Workstation Products Group, the Webtop Applications group, the Software Globalization group, the User Experience Engineering group, the x64 Platform group, the Solaris Open Source group, and a few I’ve forgotten, again thanks to Sun’s love of regular reorgs.
  • been seen by my co-workers in a suit exactly 4 times (my initial job interview, and funerals for 3 colleagues we’ve lost over the years).

Where will I be in 15 more years? Hard to guess, but I bet it involves making sure everything is ready for Y2038, so I can retire in peace when that arrives.

For now, I offer my heartfelt thanks to everyone whose help made the last 15 years possible - none of it was done alone, and I couldn't have got here without a lot of you.

Sunday Feb 23, 2014

The goto fail heard ‘round the world

There has been much discussion online this weekend of what happens when you introduce an extra “goto fail” line into your code, such as making a mistake using a merge tool to combine your changes into those made in parallel to a code repository under active development by other engineers. (That is the most common cause I’ve seen of such duplication in source code, and thus the one I see most likely — others guess similarly and discuss that side more in depth — but that’s not what this post is about.)

Adam Langley posted a good explanation of this bug, including a brief snippet of code showing the general class of problem:

 1     extern int f();
 3     int g() {
 4             int ret = 1;
 6             goto out;
 7             ret = f();
 9     out:
10             return ret;
11     }
which he pointed out “If I compile with -Wall (enable all warnings), neither GCC 4.8.2 or Clang 3.3 from Xcode make a peep about the dead code.” Others pointed out that the -Wunreachable-code option is one of many that neither compiler includes in -Wall by default, since both compilers treat -Wall not as “all possible warnings” but rather more of “the subset of warnings that the compiler authors think are suitable for use on most software”, leaving out those that are experimental, unstable, controversial, or otherwise not quite ready for prime time.

Since this has a simple test case, and I’ve got a few compilers handy on my Solaris x86 machine for doing various builds with, I did some testing myself:

gcc 3.4.3

% /usr/gcc/3.4/bin/gcc -Wall -Wextra -c fail.c
[no warnings or other output]

% /usr/gcc/3.4/bin/gcc -Wall -Wextra -Wunreachable-code -c fail.c
fail.c: In function `g':
fail.c:7: warning: will never be executed

As expected, gcc warned when the specific flag was given.

gcc 4.5.2 & 4.7.3

% /usr/gcc/4.5/bin/gcc -Wall -Wextra -Wunreachable-code -c fail.c
[no warnings or other output]

% /usr/gcc/4.7/bin/gcc -Wall -Wextra -Wunreachable-code -c fail.c
[no warnings or other output]

It turns out this is due to the gcc maintainers removing the support for -Wunreachable-code in later versions.

clang 3.1

% clang -Wall -Wextra -c fail.c
[no warnings or other output]

% clang -Wall -Wunreachable-code -c fail.c
fail.c:7:8: warning: will never be executed [-Wunreachable-code]
        ret = f();
1 warning generated.

% clang -Wall -Weverything -c fail.c
fail.c:3:5: warning: no previous prototype for function 'g'
int g() {
fail.c:7:8: warning: will never be executed [-Wunreachable-code]
        ret = f();
2 warnings generated.

So clang also finds it with -Wunreachable-code, which is also enabled by -Weverything, but not -Wall or -Wextra. (I didn’t see a similar -Weverything flag for gcc.)

Solaris Studio 12.3

% cc -c fail.c
"fail.c", line 7: warning: statement not reached

% lint -c fail.c

statement not reached

Both Studio’s default compiler flags and old-school lint static analyzer caught this by default. Though, as I noted last year, warnings about some unreachable code chunks will be seen in some releases but not others, as the information available to the compiler about the control flow evolves.

As anyone who has compiled much software with more than one of the above (including just different versions of the same compiler) can attest, the warnings generated on real code bases by various compilers will almost always be different. There’s many cases where gcc & clang warn about things Studio doesn’t, and vice versa, as every compiler & analyzer has been fed different test cases over the years of bad code that developers wanted them to find, or specific problems they needed to solve. This is why when compiling the Solaris kernel, we run two compilation passes on all the sources - once with Studio cc to both get its code analysis and to make the binaries we ship, and once with gcc just to get its code analysis, plus running it through Oracle’s internal parfait static analyser as well.

Having a growing body of warnings doesn’t help either, if you never do anything about them. This is why Solaris kernel and core utility code have policies on not introducing new warnings and we’ve been making efforts in X.Org to clean up warnings, so that you can see new issues when you introduce them, not lose them in a sea of thousands of existing bits of noise.

Of course, it’s too late to stop the bug that caused this discussion from slipping into the code base, and easy to second guess after the fact how it may have been prevented. We have lots of tools at our disposal and they continue to grow and improve, including human code review (though human review is much less reliable and repeatable than most tools, humans are much more flexible at finding things the automated tools weren't prepared to catch), and we need to use multiple ones (more on the most sensitive code, such as security protocols) to improve our chances of finding and fixing the bugs before we integrate them. Hopefully we can learn from the ones that slip through how to improve our checks for the future, such as adding -Wunreachable-code compiler flags where appropriate.

Thursday Jan 02, 2014

Book Review: “Oracle Solaris 11: First Look”

I saw a tweet this weekend from Packt Publishing (@packtpub) offering their entire catalog of ebooks for $5 apiece until Friday, January 3, 2014, so I decided to check out their selection.

One of the books I found there was Oracle Solaris 11: First Look by long-time Solaris admin Phil Brown, whose website was a staple of the Solaris x86 mailing list a decade ago, and whose advocacy of Solaris x86 helped save it from permanent cancellation. I first met Phil when we were undergrads at U. C. Berkeley, where we served as sysadmin staff on the student-run cluster of Unix systems together, and we’ve kept in touch via the Solaris & OpenSolaris communities we both found ourselves in later.

Since it was only $5, I picked it up to see if it was worth recommending to others. When not on sale, the ebook is only $15, since it’s not a large book - the PDF is 168 pages, the epub on my iPad was 199 pages - and in both forms that includes an index of about 25 pages. That also made it a quick enough read that I could get through in an afternoon, skimming over the examples and reference materials.

This book is intended to get existing Solaris admins up to speed quickly on Solaris 11 - it’s not going to introduce the basics of system administration, but will tell you what commands to run now. If you don’t know what routers, subnets, or tunnels are, you probably want a more introductory book - if you know what they are, and need to know what to run instead of ifconfig or editing /etc/hostname.e1000g0 to configure them on Solaris 11, this book will help.

Phil’s biases as a long time server admin are obvious in some sections, such as the introduction to NWAM, the Network AutoMagic feature, which he suggests “from a server sysadmin perspective, it might perhaps be better named "Never Wake A Monster"” though he admits on a laptop it can be useful to adjust to networks in different locations. There’s also a few areas where you can tell the book was written before Solaris 11.1 was out and didn’t get updated for the latest changes.

As the author of the classic pkg-get tool for installing Solaris SVR4 packages from network repositories, he has plenty to say about the new IPS packaging system in Solaris 11 as well, providing some useful tips on finding packages and setting up local repositories, though he does discourage use of many of the more advanced pkg subcommands that can help admins take more control over exactly what gets installed and updated on their systems.

Overall it’s a decent aide, and something I may refer to in the future, as I don’t do system administration that often these days, and often need a refresher, especially when old habits no longer work. It’s more detailed in many sections that the official Transitioning From Oracle Solaris 10 to Oracle Solaris 11.1 manual, which mainly points off to the other Solaris 11 manuals for details; but concentrated only on the areas a typical system administrator will be configuring, not developer or end-user visible changes. I’d recommend it to experienced admins looking for a hands on guide for dealing with Solaris 11 systems, but it’s not the right level for those trying to plan a migration or learn Solaris administration from scratch.

As always, the above is solely my personal opinion, not an official Oracle corporate position or endorsement.

Friday Nov 01, 2013

Finding nuggets in ARC discussions

A bit over twenty years ago, Sun formed an Architecture Review Committee (ARC) that evaluates proposals to change interfaces between components in Sun software products. During the OpenSolaris days, we opened many of these discussions to the community. While they’re back behind closed doors, and at a different company now, we still continue to hold these reviews for the software from what’s now the Sun Systems Group division of Oracle.

Recently one of these reviews was held (via e-mail discussion) to review a proposal to update our GNU findutils package to the latest upstream release. One of the upstream changes discussed was the addition of an “oldfind” program. In findutils 4.3, find was modified to use the fts() function to walk the directory tree, and oldfind was created to provide the old mechanism in case there were bugs in the new implementation that users needed to workaround.

In Solaris 11 though, we still ship the find descended from SVR4 as /usr/bin/find and the GNU find is available as either /usr/bin/gfind or /usr/gnu/bin/find. This raised the discussion of if we should add oldfind, and if so what should we call it. Normally our policy is to only add the g* names for GNU commands that conflict with an existing Solaris command – for instance, we ship /usr/bin/emacs, not /usr/bin/gemacs. In this case however, that seemed like it would be more confusing to have /usr/bin/oldfind be the older version of /usr/bin/gfind not of /usr/bin/find. Thus if we shipped it, it would make more sense to call it /usr/bin/goldfind, which several ARC members noted read more naturally as “gold find” than as “g old find”.

One of the concerns we often discuss in ARC is if a change is likely to be understood by users or if it will result in more calls to support. As we hit this part of the discussion on a Friday at the end of a long week, I couldn’t resist putting forth a hypothetical support call for this command:

“Hello, Oracle Solaris Support, how may I help you?”

“My admin is out sick, but he sent an email that he put the findutils package on our server, and I can run goldfind now. I tried it, but goldfind didn’t find gold.”

“Did he get the binutils package too?”

“No he just said findutils, do we need binutils?”

“Well, gold comes in the binutils package, so goldfind would be able to find gold if you got that package.”

“How much does Oracle charge for that package?”

“It’s free for Solaris users.”

“You mean Oracle ships packages of gold to customers for free?”

“Yes, if you get the binutils package, it includes GNU gold.”

“New gold? Is that some sort of alchemy, turning stuff into gold?”

“Not new gold, gold from the GNU project.”

“Oracle’s taking gold from the GNU project and shipping it to me?”

“Yes, if you get binutils, that package includes gold along with the other tools from the GNU project.”

“And GNU doesn’t mind Oracle taking their gold and giving it to customers?”

“No, GNU is a non-profit whose goal is to share their software.”

“Sharing software sure, but gold? Where does a non-profit like GNU get gold anyway?”

“Oh, Google donated it to them.”

“Ah! So Oracle will give me the gold that GNU got from Google!”

“Yes, if you get the package from us.”

“How do I get the package with the gold?”

“Just run pkg install binutils and it will put it on your disk.”

“We’ve got multiple disks here - which one will it put it on?”

“The one with the system image - do you know which one that is?

“Well the note from the admin says the system is on the first disk and the users are on the second disk.”

“Okay, so it should go on the first disk then.”

“And where will I find the gold?”

“It will be in the /usr/bin directory.”

“In the user’s bin? So thats on the second disk?”

“No, it would be on the system disk, with the other development tools, like make, as, and what.”

“So what’s on the first disk?”

“Well if the system image is there the commands should all be there.”

“All the commands? Not just what?”

“Right, all the commands that come with the OS, like the shell, ps, and who.”

“So who’s on the first disk too?”

Picture of Abbott and Costello

“Yes. Did your admin say when he’d be back?”

“No, just that he had a massive headache and was going home after I tried to get him to explain this stuff to me.”

“I can’t imagine why.”

“Oh, is why a command too?”

“No, _why was a Ruby programmer.

“Ruby? Do you give those away with the gold too?”

“Yes, but it comes in the ruby package, not binutils.”

“Oh, I’ll have to have my admin get that package too! Thanks!”

Needless to say, we decided this might not be the best idea. Since the GNU package hasn’t had to release a serious bug fix in the new find in the past few years, the new GNU find seems pretty stable, and we always have the SVR4 find to use as a fallback in Solaris, so it didn’t seem that adding oldfind was really necessary, so we passed on including it when we update to the new findutils release.

[Apologies to Abbott, Costello, their fans, and everyone who read this far. The Gold (linker) page on Wikipedia may explain some of the above, but can’t explain why goldfind is the old GNU find, but gold is the new GNU ld.]

Sunday Aug 04, 2013

Some times it’s a book sprint, other times it’s a marathon

One of the areas the X.Org Foundation has been working on in recent years is trying to bring new developers on board. Programs like Google Summer of Code and The X.Org Endless Vacation of Code (EVoC) help with this somewhat, but we’ve also been trying to find ways to lower the barrier to entry, both so the students in those programs can learn more and so that other developers have an easier time joining us.

Some of our efforts here have been technological - one of the driving efforts of the conversions from Imake to automake and from CVS to git was to make use of tools developers would already be familiar and productive with from other projects. The Modularization project, which broke up X.Org from one giant tree into over 200 small ones, had the goal of making it possible to fix a bug in a single library or driver without having to download and build many megabytes of software & fonts that were not being changed.

Another area of effort has been around the X.Org documentation, as I’ve written about before. Several years ago, Bart Massey suggested we try organizing a Book Sprint to write a guide for new X.Org developers, similar to the sprint he’d participated in to write a Google Summer of Code Mentoring Guide and a matching GSoC Student Guide. The Board agreed to let him try organizing one, but finding a time to bring everyone together was challenging.

We finally ended up holding a first session as a virtual gathering over the internet one weekend in March 2012. Bart, Matt Dew, Keith Packard, Stéphane Marchesin, and I used Gobby and Mumble to agree on an outline and write several chapters, along with creating some accompanying diagrams using graphviz dot. Unfortunately, the combination of the work from home setting, in which people kept dropping in and out as their home life intervened, the small number of people, and lack of an experienced organizer made this not as productive as other book sprints have been for other projects.

After the first weekend we had drafts of about 7 chapters and a preface, and outlines of several more. We also had a large chunk of prewritten material on graphics drivers that Stéphane had been working on already to link with. Over the next few months, Bart edited and formatted the chapters we had, while we got a little more written, including recruiting additional authors such as Peter Hutterer.

We then gathered again, this time in person, hosted by Matthias Hopf at the Georg-Simon-Ohm-University in Nürnberg, Germany, over the two days prior to the 2012 X.Org Developer’s Conference. We had some additional people join us this time, including Martin Peres, Lucas Stach, Ben Skeggs, and probably more I’ve forgotten. At this session, Bart, Matt & I worked on polsihing the rough edges on the guide we’d been creating, while the others worked more on the driver guide that Stéphane had started. Bart figured out the tools needed to generate ebook formats of the guide, such as epub, and made a cover and pulled together an overall structure for the guide, and by the end of that sprint, we had something we could read on the Android ebook reader he'd brought with him.

And then, we waited. Bart tried to figure out how to make the setup maintainable and reproducible, and how to make it available to our users, but his day job as a University professor kept taking time away from that. Bart also gave a lightning talk on our book sprint experience at the Write the Docs conference in Portland earlier this year covering what worked and what we learned didn’t work in our virtual sprint, and asking for ideas or help on how to go forward.

Meanwhile, the burden of fighting the spammers on the X.Org & FreeDesktop.Org wikis had gotten overwhelming on the current wiki software, so the freedesktop sitewranglers evaluated different solutions, and after looking at how well ikiwiki had been working for the past few years on the XCB wiki decided to move the rest of the FreeDesktop hosted wikis to ikiwiki as well. One major advantage is that it let us use the existing freedesktop authentication infrastructure for wiki accounts instead of having to find different ways to let our users in while keeping the spammers out. Another benefit is that it uses Markdown and git to update the wiki, so we can easily push files in Markdown format to the wiki to publish them now.

In a shocking coincidence, the geeks who wrote the X.Org Developer's Guide also used Markdown & git to author it, so with just a very little conversion to make links work between sections, it was possible to publish the guide to the wiki, where you can find it now:

It’s not perfect, it’s not even fully finished, but it’s better than nothing, and since it’s now on a wiki, it’s even easier for others to fill in the missing pieces or improve the existing bits.

Stephane is still working on the companion guide to graphics drivers, covering the stack from DRM and KMS in the kernel up to Xorg & Mesa. He posted the PDF of the draft as it was at the end of the March book sprint, but not yet a version with the contributions added from the September followup.

When working on applications higher up the stack, not hacking on the graphics stack itself, we refer developers to the documentation for their toolkits, or to general guides to OpenGL programming, as those are beyond what we can document ourselves in X.Org.

And for other open source projects, if you’d like a chance to have a professionally organized, in person doc sprint for your project, applications are being taken until August 7 for 2-4 projects to hold sprints as part of the Google Summer of Code Doc Camp in October, including travel expenses for up to 5 participants from the sponsors.

Sunday Nov 11, 2012

Solaris 11.1 changes building of code past the point of __NORETURN

While Solaris 11.1 was under development, we started seeing some errors in the builds of the upstream X.Org git master sources, such as:

"Display.c", line 65: Function has no return statement : x_io_error_handler
"hostx.c", line 341: Function has no return statement : x_io_error_handler
from functions that were defined to match a specific callback definition that declared them as returning an int if they did return, but these were calling exit() instead of returning so hadn't listed a return value.

These had been generating warnings for years which we'd been ignoring, but X.Org has made enough progress in cleaning up code for compiler warnings and static analysis issues lately, that the community turned up the default error levels, including the gcc flag -Werror=return-type and the equivalent Solaris Studio cc flags -v -errwarn=E_FUNC_HAS_NO_RETURN_STMT, so now these became errors that stopped the build. Yet on Solaris, gcc built this code fine, while Studio errored out. Investigation showed this was due to the Solaris headers, which during Solaris 10 development added a number of annotations to the headers when gcc was being used for the amd64 kernel bringup before the Studio amd64 port was ready. Since Studio did not support the inline form of these annotations at the time, but instead used #pragma for them, the definitions were only present for gcc.

To resolve this, I fixed both sides of the problem, so that it would work for building new X.Org sources on older Solaris releases or with older Studio compilers, as well as fixing the general problem before it broke more software building on Solaris.

To the X.Org sources, I added the traditional Studio #pragma does_not_return to recognize that functions like exit() don't ever return, in patches such as this Xserver patch. Adding a dummy return statement was ruled out as that introduced unreachable code errors from compilers and analyzers that correctly realized you couldn't reach that code after a return statement.

And on the Solaris 11.1 side, I updated the annotation definitions in <sys/ccompile.h> to enable for Studio 12.0 and later compilers the annotations already existing in a number of system headers for functions like exit() and abort(). If you look in that file you'll see the annotations we currently use, though the forms there haven't gone through review to become a Committed interface, so may change in the future.

Actually getting this integrated into Solaris though took a bit more work than just editing one header file. Our ELF binary build comparison tool, wsdiff, actually showed a large number of differences in the resulting binaries due to the compiler using this information for branch prediction, code path analysis, and other possible optimizations, so after comparing enough of the disassembly output to be comfortable with the changes, we also made sure to get this in early enough in the release cycle so that it would get plenty of test exposure before the release.

It also required updating quite a bit of code to avoid introducing new lint or compiler warnings or errors, and people building applications on top of Solaris 11.1 and later may need to make similar changes if they want to keep their build logs similarly clean.

Previously, if you had a function that was declared with a non-void return type, lint and cc would warn if you didn't return a value, even if you called a function like exit() or panic() that ended execution. For instance:

#include <stdlib.h>

callback(int status)
    if (status == 0)
        return status;
would previously require a never executed return 0; after the exit() to avoid lint warning "function falls off bottom without returning value".

Now the compiler & lint will both issue "statement not reached" warnings for a return 0; after the final exit(), allowing (or in some cases, requiring) it to be removed. However, if there is no return statement anywhere in the function, lint will warn that you've declared a function returning a value that never does so, suggesting you can declare it as void. Unfortunately, if your function signature is required to match a certain form, such as in a callback, you not be able to do so, and will need to add a /* LINTED */ to the end of the function.

If you need your code to build on both a newer and an older release, then you will either need to #ifdef these unreachable statements, or, to keep your sources common across releases, add to your sources the corresponding #pragma recognized by both current and older compiler versions, such as:

#pragma does_not_return(exit)
#pragma does_not_return(panic) 
Hopefully this little extra work is paid for by the compilers & code analyzers being able to better understand your code paths, giving you better optimizations and more accurate errors & warning messages.

Sunday Oct 28, 2012

Solaris 11.1: Changes to included FOSS packages

Besides the documentation changes I mentioned last time, another place you can see Solaris 11.1 changes before upgrading is in the online package repository, now that the 11.1 packages have been published to, as the “” branch. (Oracle Solaris Package Versioning explains what each field in that version string means.)

When you’re ready to upgrade to the packages from either this repo, or the support repository, you’ll want to first read How to Update to Oracle Solaris 11.1 Using the Image Packaging System by Pete Dennis, as there are a couple issues you will need to be aware of to do that upgrade, several of which are due to changes in the Free and Open Source Software (FOSS) packages included with Solaris, as I’ll explain in a bit.

Solaris 11 can update more readily than Solaris 10

In the Solaris 10 and older update models, the way the updates were built constrained what changes we could make in those releases. To change an existing SVR4 package in those releases, we created a Solaris Patch, which applied to a given version of the SVR4 package and replaced, added or deleted files in it. These patches were released via the support websites (originally SunSolve, now My Oracle Support) for applying to existing Solaris 10 installations, and were also merged into the install images for the next Solaris 10 update release. (This Solaris Patches blog post from Gerry Haskins dives deeper into that subject.)

Some of the restrictions of this model were that package refactoring, changes to package dependencies, and even just changing the package version number, were difficult to do in this hybrid patch/OS update model. For instance, when Solaris 10 first shipped, it had the Xorg server from X11R6.8. Over the first couple years of update releases we were able to keep it up to date by replacing, adding, & removing files as necessary, taking it all the way up to Xorg server release 1.3 (new version numbering begun after the X11R7 split of the X11 tree into separate modules gave each module its own version). But if you run pkginfo on the SUNWxorg-server package, you’ll see it still displayed a version number of 6.8, confusing users as to which version was actually included.

We stopped upgrading the Xorg server releases in Solaris 10 after 1.3, as later versions added new dependencies, such as HAL, D-Bus, and libpciaccess, which were very difficult to manage in this patching model. (We later got libpciaccess to work, but HAL & D-Bus would have been much harder due to the greater dependency tree underneath those.) Similarly, every time the GNOME team looked into upgrading Solaris 10 past GNOME 2.6, they found these constraints made it so difficult it wasn’t worthwhile, and eventually GNOME’s dependencies had changed enough it was completely infeasible. Fortunately, this worked out for both the X11 & GNOME teams, with our management making the business decision to concentrate on the “Nevada” branch for desktop users - first as Solaris Express Desktop Edition, and later as OpenSolaris, so we didn’t have to fight to try to make the package updates fit into these tight constraints.

Meanwhile, the team designing the new packaging system for Solaris 11 was seeing us struggle with these problems, and making this much easier to manage for both the development teams and our users was one of their big goals for the IPS design they were working on. Now that we’ve reached the first update release to Solaris 11, we can start to see the fruits of their labors, with more FOSS updates in 11.1 than we had in many Solaris 10 update releases, keeping software more up to date with the upstream communities.

Of course, just because we can more easily update now, doesn’t always mean we should or will do so, it just removes the package system limitations from forcing the decision for us. So while we’ve upgraded the X Window System in the 11.1 release from X11R7.6 to 7.7, the Solaris GNOME team decided it was not the right time to try to make the jump from GNOME 2 to GNOME 3, though they did update some individual components of the desktop, especially those with security fixes like Firefox. In other parts of the system, decisions as to what to update were prioritized based on how they affected other projects, or what customer requests we’d gotten for them.

So with all that background in place, what packages did we actually update or add between Solaris 11.0 and 11.1?

Core OS Functionality

One of the FOSS changes with the biggest impact in this release is the upgrade from Grub Legacy (0.97) to Grub 2 (1.99) for the x64 platform boot loader. This is the cause of one of the upgrade quirks, since to go from Solaris 11.0 to 11.1 on x64 systems, you first need to update the Boot Environment tools (such as beadm) to a new version that can handle boot environments that use the Grub2 boot loader. System administrators can find the details they need to know about the new Grub in the Administering the GRand Unified Bootloader chapter of the Booting and Shutting Down Oracle Solaris 11.1 Systems guide. This change was necessary to be able to support new hardware coming into the x64 marketplace, including systems using UEFI firmware or booting off disk drives larger than 2 terabytes.

For both platforms, Solaris 11.1 adds rsyslog as an optional alternative to the traditional syslogd, and OpenSCAP for checking security configuration settings are compliant with site policies.

Note that the support repo actually has newer versions of BIND & fetchmail than the 11.1 release, as some late breaking critical fixes came through from the community upstream releases after the Solaris 11.1 release was frozen, and made their way to the support repository. These are responsible for the other big upgrade quirk in this release, in which to upgrade a system which already installed those versions from the support repo, you need to either wait for those packages to make their way to the 11.1 branch of the support repo, or follow the steps in the aforementioned upgrade walkthrough to let the package system know it's okay to temporarily downgrade those.

Developer Stack

While Solaris 11.0 included Python 2.7, many of the bundled python modules weren’t packaged for it yet, limiting its usability. For 11.1, many more of the python modules include 2.7 versions (enough that I filtered them out of the below table, but you can always search on the package repository server for them.

For other language runtimes and development tools, 11.1 expands the use of IPS mediated links to choose which version of a package is the default when the packages are designed to allow multiple versions to install side by side.

For instance, in Solaris 11.0, GNU automake 1.9 and 1.10 were provided, and developers had to run them as either automake-1.9 or automake-1.10. In Solaris 11.1, when automake 1.11 was added, also added was a /usr/bin/automake mediated link, which points to the automake-1.11 program by default, but can be changed to another version by running the pkg set-mediator command.

Mediated links were also used for the Java runtime & development kits in 11.1, changing the default versions to the Java 7 releases (the 1.7.0.x package versions), while allowing admins to switch links such as /usr/bin/javac back to Java 6 if they need to for their site, to deal with Java 7 compatibility or other issues, without having to update each usage to use the full versioned /usr/jdk/jdk1.6.0_35/bin/javac paths for every invocation.

Desktop Stack

As I mentioned before, we upgraded from X11R7.6 to X11R7.7, since a pleasant coincidence made the X.Org release dates line up nicely with our feature & code freeze dates for this release. (Or perhaps it wasn’t so coincidental, after all, one of the benefits of being the person making the release is being able to decide what schedule is most convenient for you, and this one worked well for me.) For the table below, I’ve skipped listing the packages in which we use the X11 “katamari” version for the Solaris package version (mainly packages combining elements of multiple upstream modules with independent version numbers), since they just all changed from 7.6 to 7.7.

In the graphics drivers, we worked with Intel to update the Intel Integrated Graphics Processor support to support 3D graphics and kernel mode setting on the Ivy Bridge chipsets, and updated Nvidia’s non-FOSS graphics driver from 280.13 to 295.20.

Higher up in the desktop stack, PulseAudio was added for audio support, and liblouis for Braille support, and the GNOME applications were built to use them.

The Mozilla applications, Firefox & Thunderbird moved to the current Extended Support Release (ESR) versions, 10.x for each, to bring up-to-date security fixes without having to be on Mozilla’s agressive 6 week feature cycle release train.

Detailed list of changes

This table shows most of the changes to the FOSS packages between Solaris 11.0 and 11.1. As noted above, some were excluded for clarity, or to reduce noise and duplication. All the FOSS packages which didn't change the version number in their packaging info are not included, even if they had updates to fix bugs, security holes, or add support for new hardware or new features of Solaris.

archiver/unrar 3.8.5 4.1.4
audio/sox 14.3.0 14.3.2
backup/rdiff-backup 1.2.1 1.3.3
communication/im/pidgin 2.10.0 2.10.5
compress/gzip 1.3.5 1.4
compress/xz not included 5.0.1
database/sqlite-3 3.7.11
desktop/remote-desktop/tigervnc 1.0.90 1.1.0
desktop/window-manager/xcompmgr 1.1.5 1.1.6
desktop/xscreensaver 5.12 5.15
developer/build/autoconf 2.63 2.68
developer/build/autoconf/xorg-macros 1.15.0 1.17
developer/build/automake-111 not included 1.11.2
developer/build/cmake 2.6.2 2.8.6
developer/build/gnu-make 3.81 3.82
developer/build/imake 1.0.4 1.0.5
developer/build/libtool 1.5.22 2.4.2
developer/build/makedepend 1.0.3 1.0.4
developer/gnu-binutils 2.19 2.21.1
developer/java/jdepend not included 2.9
developer/java/jpackage-utils not included 1.7.5
developer/java/junit 4.5 4.10
developer/lexer/jflex not included 1.4.1
developer/parser/byaccj not included 1.14
developer/parser/java_cup not included 0.10
developer/quilt 0.47 0.60
developer/versioning/mercurial 1.8.4 2.2.1
developer/versioning/subversion 1.6.16 1.7.5
diagnostic/constype 1.0.3 1.0.4
diagnostic/nmap 5.21 5.51
diagnostic/scanpci 0.12.1 0.13.1
diagnostic/wireshark 1.4.8 1.8.2
diagnostic/xload 1.1.0 1.1.1
editor/gnu-emacs 23.1 23.4
editor/vim 7.3.254 7.3.600
file/lndir 1.0.2 1.0.3
image/editor/bitmap 1.0.5 1.0.6
image/gnuplot 4.4.0 4.6.0
image/library/libexif 0.6.19 0.6.21
image/library/libpng 1.4.8 1.4.11
image/library/librsvg 2.26.3 2.34.1
image/xcursorgen 1.0.4 1.0.5
library/audio/pulseaudio not included 1.1
library/expat 2.0.1 2.1.0
library/gc 7.1 7.2
library/graphics/pixman 0.22.0 0.24.4
library/guile 1.8.4 1.8.6
library/java/subversion 1.6.16 1.7.5
library/json-c not included 0.9
library/libedit not included 3.0
library/libee not included 0.3.2
library/libestr not included 0.1.2
library/libevent 1.3.5
library/liblouis not included 2.1.1
library/liblouisxml not included 2.1.0
library/libtecla 1.6.0 1.6.1
library/libtool/libltdl 1.5.22 2.4.2
library/nspr 4.8.8 4.8.9
library/openldap 2.4.25 2.4.30
library/pcre 7.8 8.21
library/perl-5/subversion 1.6.16 1.7.5
library/python-2/jsonrpclib not included 0.1.3
library/python-2/lxml 2.1.2 2.3.3
library/python-2/nose not included 1.1.2
library/python-2/pyopenssl not included 0.11
library/python-2/subversion 1.6.16 1.7.5
library/python-2/tkinter-26 2.6.4 2.6.8
library/python-2/tkinter-27 2.7.1 2.7.3
library/security/nss 4.12.10 4.13.1
library/security/openssl (1.0.0e) (1.0.0j)
mail/thunderbird 6.0 10.0.6
package/pkgbuild not included 1.3.104
print/filter/enscript not included 1.6.4
print/filter/gutenprint 5.2.4 5.2.7
print/lp/filter/foomatic-rip 3.0.2 4.0.15
runtime/perl-512 5.12.3 5.12.4
runtime/python-26 2.6.4 2.6.8
runtime/python-27 2.7.1 2.7.3
runtime/tcl-8/tcl-sqlite-3 3.7.11
security/compliance/openscap not included 0.8.1
security/nss-utilities 4.12.10 4.13.1
service/network/dhcp/isc-dhcp 4.1
service/network/ftp (ProFTPD)
service/network/samba 3.5.10 3.6.6
shell/conflict 0.2004.9.1 0.2010.6.27
shell/pipe-viewer 1.1.4 1.2.0
shell/zsh 4.3.12 4.3.17
system/boot/grub 0.97 1.99
system/font/truetype/liberation 1.4 1.7.2
system/library/freetype-2 2.4.6 2.4.9
system/library/libnet 1.1.5
system/management/cim/pegasus 2.9.1 2.11.0
system/management/ipmitool 1.8.10 1.8.11
system/management/wbem/wbemcli 1.3.7
system/network/routing/quagga 0.99.8 0.99.19
system/rsyslog not included 6.2.0
terminal/luit 1.1.0 1.1.1
text/convmv 1.14 1.15
text/gawk 3.1.5 3.1.8
text/gnu-grep 2.5.4 2.10
web/browser/firefox 6.0.2 10.0.6
web/browser/links 1.0 1.0.3
web/java-servlet/tomcat 6.0.33 6.0.35
web/php-53 not included 5.3.14
web/php-53/extension/php-apc not included 3.1.9
web/php-53/extension/php-idn not included 0.2.0
web/php-53/extension/php-memcache not included 3.0.6
web/php-53/extension/php-mysql not included 5.3.14
web/php-53/extension/php-pear not included 5.3.14
web/php-53/extension/php-suhosin not included 0.9.33
web/php-53/extension/php-tcpwrap not included 1.1.3
web/php-53/extension/php-xdebug not included 2.2.0
web/php-common not included 11.1
web/proxy/squid 3.1.8 3.1.18
web/server/apache-22 2.2.20 2.2.22
web/server/apache-22/module/apache-sed 2.2.20 2.2.22
web/server/apache-22/module/apache-wsgi not included 3.3
x11/diagnostic/xev 1.1.0 1.2.0
x11/diagnostic/xscope 1.3 1.3.1
x11/documentation/xorg-docs 1.6 1.7
x11/keyboard/xkbcomp 1.2.3 1.2.4
x11/library/libdmx 1.1.1 1.1.2
x11/library/libdrm 2.4.25 2.4.32
x11/library/libfontenc 1.1.0 1.1.1
x11/library/libfs 1.0.3 1.0.4
x11/library/libice 1.0.7 1.0.8
x11/library/libsm 1.2.0 1.2.1
x11/library/libx11 1.4.4 1.5.0
x11/library/libxau 1.0.6 1.0.7
x11/library/libxcb 1.7 1.8.1
x11/library/libxcursor 1.1.12 1.1.13
x11/library/libxdmcp 1.1.0 1.1.1
x11/library/libxext 1.3.0 1.3.1
x11/library/libxfixes 4.0.5 5.0
x11/library/libxfont 1.4.4 1.4.5
x11/library/libxft 2.2.0 2.3.1
x11/library/libxi 1.4.3 1.6.1
x11/library/libxinerama 1.1.1 1.1.2
x11/library/libxkbfile 1.0.7 1.0.8
x11/library/libxmu 1.1.0 1.1.1
x11/library/libxmuu 1.1.0 1.1.1
x11/library/libxpm 3.5.9 3.5.10
x11/library/libxrender 0.9.6 0.9.7
x11/library/libxres 1.0.5 1.0.6
x11/library/libxscrnsaver 1.2.1 1.2.2
x11/library/libxtst 1.2.0 1.2.1
x11/library/libxv 1.0.6 1.0.7
x11/library/libxvmc 1.0.6 1.0.7
x11/library/libxxf86vm 1.1.1 1.1.2
x11/library/mesa 7.10.2 7.11.2
x11/library/toolkit/libxaw7 1.0.9 1.0.11
x11/library/toolkit/libxt 1.0.9 1.1.3
x11/library/xtrans 1.2.6 1.2.7
x11/oclock 1.0.2 1.0.3
x11/server/xdmx 1.10.3 1.12.2
x11/server/xephyr 1.10.3 1.12.2
x11/server/xorg 1.10.3 1.12.2
x11/server/xorg/driver/xorg-input-keyboard 1.6.0 1.6.1
x11/server/xorg/driver/xorg-input-mouse 1.7.1 1.7.2
x11/server/xorg/driver/xorg-input-synaptics 1.4.1 1.6.2
x11/server/xorg/driver/xorg-input-vmmouse 12.7.0 12.8.0
x11/server/xorg/driver/xorg-video-ast 0.91.10 0.93.10
x11/server/xorg/driver/xorg-video-ati 6.14.1 6.14.4
x11/server/xorg/driver/xorg-video-cirrus 1.3.2 1.4.0
x11/server/xorg/driver/xorg-video-dummy 0.3.4 0.3.5
x11/server/xorg/driver/xorg-video-intel 2.10.0 2.18.0
x11/server/xorg/driver/xorg-video-mach64 6.9.0 6.9.1
x11/server/xorg/driver/xorg-video-mga 1.4.13 1.5.0
x11/server/xorg/driver/xorg-video-openchrome 0.2.904 0.2.905
x11/server/xorg/driver/xorg-video-r128 6.8.1 6.8.2
x11/server/xorg/driver/xorg-video-trident 1.3.4 1.3.5
x11/server/xorg/driver/xorg-video-vesa 2.3.0 2.3.1
x11/server/xorg/driver/xorg-video-vmware 11.0.3 12.0.2
x11/server/xserver-common 1.10.3 1.12.2
x11/server/xvfb 1.10.3 1.12.2
x11/server/xvnc 1.0.90 1.1.0
x11/session/sessreg 1.0.6 1.0.7
x11/session/xauth 1.0.6 1.0.7
x11/session/xinit 1.3.1 1.3.2
x11/transset 0.9.1 1.0.0
x11/trusted/trusted-xorg 1.10.3 1.12.2
x11/x11-window-dump 1.0.4 1.0.5
x11/xclipboard 1.1.1 1.1.2
x11/xclock 1.0.5 1.0.6
x11/xfd 1.1.0 1.1.1
x11/xfontsel 1.0.3 1.0.4
x11/xfs 1.1.1 1.1.2

P.S. To get the version numbers for this table, I ran a quick perl script over the output from:

% pkg contents -H -r -t depend -a type=incorporate -o fmri \
  `pkg contents -H -r -t depend -a type=incorporate -o fmri entire@0.5.11,5.11-` \
  | sort >> /tmp/11.1
% pkg contents -H -r -t depend -a type=incorporate -o fmri \
  `pkg contents -H -r -t depend -a type=incorporate -o fmri entire@0.5.11,5.11-` \
  | sort >> /tmp/11.0

Thursday Oct 25, 2012

Documentation Changes in Solaris 11.1

One of the first places you can see Solaris 11.1 changes are in the docs, which have now been posted in the Solaris 11.1 Library on I spent a good deal of time reviewing documentation for this release, and thought some would be interesting to blog about, but didn't review all the changes (not by a long shot), and am not going to cover all the changes here, so there's plenty left for you to discover on your own.

Just comparing the Solaris 11.1 Library list of docs against the Solaris 11 list will show a lot of reorganization and refactoring of the doc set, especially in the system administration guides. Hopefully the new break down will make it easier to get straight to the sections you need when a task is at hand.

Packaging System

Unfortunately, the excellent in-depth guide for how to build packages for the new Image Packaging System (IPS) in Solaris 11 wasn't done in time to make the initial Solaris 11 doc set. An interim version was published shortly after release, in PDF form on the OTN IPS page. For Solaris 11.1 it was included in the doc set, as Packaging and Delivering Software With the Image Packaging System in Oracle Solaris 11.1, so should be easier to find, and easier to share links to specific pages the HTML version.

Beyond just how to build a package, it includes details on how Solaris is packaged, and how package updates work, which may be useful to all system administrators who deal with Solaris 11 upgrades & installations. The Adding and Updating Oracle Solaris 11.1 Software Packages was also extended, including new sections on Relaxing Version Constraints Specified by Incorporations and Locking Packages to a Specified Version that may be of interest to those who want to keep the Solaris 11 versions of certain packages when they upgrade, such as the couple of packages that had functionality removed by an (unusual for an update release) End of Feature process in the 11.1 release.

Also added in this release is a document containing the lists of all the packages in each of the major package groups in Solaris 11.1 (solaris-desktop, solaris-large-server, and solaris-small-server). While you can simply get the contents of those groups from the package repository, either via the web interface or the pkg command line, the documentation puts them in handy tables for easier side-by-side comparison, or viewing the lists before you've installed the system to pick which one you want to initially install.

X Window System

We've not had good X11 coverage in the online Solaris docs in a while, mostly relying on the man pages, and upstream X.Org docs. In this release, we've integrated some X coverage into the Solaris 11.1 Desktop Adminstrator's Guide, including sections on installing fonts for fontconfig or legacy X11 clients, X server configuration, and setting up remote access via X11 or VNC. Of course we continue to work on improving the docs, including a lot of contributions to the upstream docs all OS'es share (more about that another time).


One of the things Oracle likes to do for its products is to publish security guides for administrators & developers to know how to build systems that meet their security needs. For Solaris, we started this with Solaris 11, providing a guide for sysadmins to find where the security relevant configuration options were documented. The Solaris 11.1 Security Guidelines extend this to cover new security features, such as Address Space Layout Randomization (ASLR) and Read-Only Zones, as well as adding additional guidelines for existing features, such as how to limit the size of tmpfs filesystems, to avoid users driving the system into swap thrashing situations.

For developers, the corresponding document is the Developer's Guide to Oracle Solaris 11 Security, which has been the source for years for documentation of security-relevant Solaris API's such as PAM, GSS-API, and the Solaris Cryptographic Framework. For Solaris 11.1, a new appendix was added to start providing Secure Coding Guidelines for Developers, leveraging the CERT Secure Coding Standards and OWASP guidelines to provide the base recommendations for common programming languages and their standard API's. Solaris specific secure programming guidance was added via links to other documentation in the product doc set.

In parallel, we updated the Solaris C Libary Functions security considerations list with details of Solaris 11 enhancements such as FD_CLOEXEC flags, additional *at() functions, and new stdio functions such as asprintf() and getline(). A number of code examples throughout the Solaris 11.1 doc set were updated to follow these recommendations, changing unbounded strcpy() calls to strlcpy(), sprintf() to snprintf(), etc. so that developers following our examples start out with safer code. The Writing Device Drivers guide even had the appendix updated to list which of these utility functions, like snprintf() and strlcpy(), are now available via the Kernel DDI.

Little Things

Of course all the big new features got documented, and some major efforts were put into refactoring and renovation, but there were also a lot of smaller things that got fixed as well in the nearly a year between the Solaris 11 and 11.1 doc releases - again too many to list here, but a random sampling of the ones I know about & found interesting or useful:

Saturday Mar 31, 2012

Solaris: What comes next?

As you probably know by now, a few months ago, we released Solaris 11 after years of development. That of course means we now need to figure out what comes next - if Solaris 11 is “The First Cloud OS”, then what do we need to make future releases of Solaris be, to be modern and competitive when they're released? So we've been having planning and brainstorming meetings, and I've captured some notes here from just one of those we held a couple weeks ago with a number of the Silicon Valley based engineers.

Now before someone sees an idea here and calls their product rep wanting to know what's up, please be warned what follows are rough ideas, and as I'll discuss later, none of them have any committment, schedule, working code, or even plan for integration in any possible future product at this time. (Please don't make me force you to read the full Oracle future product disclaimer here, you should know it by heart already from the front of every Oracle product slide deck.)

To start with, we did some background research, looking at ideas from other Oracle groups, and competitive OS'es. We examined what was hot in the technology arena and where the interesting startups were heading. We then looked at Solaris to see where we could apply those ideas.

Making Network Admins into Socially Networking Admins

Wall art at the new Facebook HQ in the former Sun MPK Campus

We all know an admin who has grumbled about being the only one stuck late at work to fix a problem on the server, or having to work the weekend alone to do scheduled maintenance. But admins are humans (at least most are), and crave companionship and community with their fellow humans. And even when they're alone in the server room, they're never far from a network connection, allowing access to the wide world of wonders on the Internet.

Our solution here is not building a new social network - there's enough of those already, and Oracle even has its own Oracle Mix social network already. What we proposed is integrating Solaris features to help engage our system admins with these social networks, building community and bringing them recognition in the workplace, using achievement recognition systems as found in many popular gaming platforms.

For instance, if you had a Facebook account, and a group of admin friends there, you could register it with our Social Network Utility For Facebook, and then your friends might see:

Alan earned the achievement Critically Patched (April 2012) for patching all his servers.
Matt is only at 50% - encourage him to complete this achievement today!
To avoid any undue risk of advertising who has unpatched servers that are easier targets for hackers to break into, this information would be tightly protected via Facebook's world-renowned privacy settings to avoid it falling into the wrong hands.

A related form of gamification we considered was replacing simple certfications with role-playing-game-style Experience Levels. Instead of just knowing an admin passed a test establishing a given level of competency, these would provide recruiters with a more detailed level of how much real-world experience an admin has. Achievements such as the one above would feed into it, but larger numbers of experience points would be gained by tougher or more critical tasks - such as recovering a down system, or migrating a service to a new platform. (As long as it was an Oracle platform of course - migrating to an HP or IBM platform would cause the admin to lose points with us.)

Unfortunately, we couldn't figure out a good way to prevent (if you will) “gaming” the system. For instance, a disgruntled admin might decide to start ignoring warnings from FMA that a part is beginning to fail or skip preventative maintenance, in the hopes that they'd cause a catastrophic failure to earn more points for bolstering their resume as they look for a job elsewhere, and not worrying about the effect on your business of a mission critical server going down.

More Z's for ZFS

Our suggested new feature for ZFS was inspired by the worlds most successful Z-startup of all time: Zynga.

Using the Social Network Utility For Facebook described above, we'd tie it in with ZFS monitoring to help you out when you find yourself in a jam needing more disk space than you have, and can't wait a month to get a purchase order through channels to buy more. Instead with the click of a button you could post to your group:

Alan can't find any space in his server farm! Can you help?
Friends could loan you some space on their connected servers for a few weeks, knowing that you'd return the favor when needed. ZFS would create a new filesystem for your use on their system, and securely share it with your system using Kerberized NFS.

If none of your friends have space, then you could buy temporary use space in small increments at affordable rates right there in Facebook, using your Facebook credits, and then file an expense report later, after the urgent need has passed.

Universal Single Sign On

One thing all the engineers agreed on was that we still had far too many "Single" sign ons to deal with in our daily work. On the web, every web site used to have its own password database, forcing us to hope we could remember what login name was still available on each site when we signed up, and which unique password we came up with to avoid having to disclose our other passwords to a new site.

In recent years, the web services world has finally been reducing the number of logins we have to manage, with many services allowing you to login using your identity from Google, Twitter or Facebook. So we proposed following their lead, introducing PAM modules for web services - no more would you have to type in whatever login name IT assigned and try to remember the password you chose the last time password aging forced you to change it - you'd simply choose which web service you wanted to authenticate against, and would login to your Solaris account upon reciept of a cookie from their identity service.

Pinning notes to the cloud

We also all noted that we all have our own pile of notes we keep in our daily work - in text files in our home directory, in notebooks we carry around, on white boards in offices and common areas, on sticky notes on our monitors, or on scraps of paper pinned to our bulletin boards. The contents of the notes vary, some are things just for us, some are useful for our groups, some we would share with the world.

For instance, when our group moved to a new building a couple years ago, we had a white board in the hallway listing all the NIS & DNS servers, subnets, and other network configuration information we needed to set up our Solaris machines after the move. Similarly, as Solaris 11 was finishing and we were all learning the new network configuration commands, we shared notes in wikis and e-mails with our fellow engineers.

Users may also remember one of the popular features of Sun's old BigAdmin site was a section for sharing scripts and tips such as these. Meanwhile, the online "pin board" at Pinterest is taking the web by storm. So we thought, why not mash those up to solve this problem?

We proposed a new BigAddPin site where users could “pin” notes, command snippets, configuration information, and so on. For instance, once they had worked out the ideal Automated Installation manifest for their app server, they could pin it up to share with the rest of their group, or choose to make it public as an example for the world. Localized data, such as our group's notes on the servers for our subnet, could be shared only to users connecting from that subnet. And notes that they didn't want others to see at all could be marked private, such as the list of phone numbers to call for late night pizza delivery to the machine room, the birthdays and anniversaries they can never remember but would be sleeping on the couch if they forgot, or the list of automatically generated completely random, impossible to remember root passwords to all their servers.

For greater integration with Solaris, we'd put support right into the command shells — redirect output to a pinned note, set your path to include pinned notes as scripts you can run, or bring up your recent shell history and pin a set of commands to save for the next time you need to remember how to do that operation.

Location service for Solaris servers

A longer term plan would involve convincing the hardware design groups to put GPS locators with wireless transmitters in future server designs. This would help both admins and service personnel trying to find servers in todays massive data centers, and could feed into location presence apps to help show potential customers that while they may not see many Solaris machines on the desktop any more, they are all around. For instance, while walking down Wall Street it might show “There are over 2000 Solaris computers in this block.”

[Note: this proposal was made before the recent media coverage of a location service aggregrator app with less noble intentions, and in hindsight, we failed to consider what happens when such data similarly falls into the wrong hands. We certainly wouldn't want our app to be misinterpreted as “There are over $20 million dollars of SPARC servers in this building, waiting for you to steal them.” so it's probably best it was rejected.]

Harnessing the power of the GPU for Security

Most modern OS'es make use of the widespread availability of high powered GPU hardware in today's computers, with desktop environments requiring 3-D graphics acceleration, whether in Ubuntu Unity, GNOME Shell on Fedora, or Aero Glass on Windows, but we haven't yet made Solaris fully take advantage of this, beyond our basic offering of Compiz on the desktop.

Meanwhile, more businesses are interested in increasing security by using biometric authentication, but must also comply with laws in many countries preventing discrimination against employees with physical limations such as missing eyes or fingers, not to mention the lost productivity when employees can't login due to tinted contacts throwing off a retina scan or a paper cut changing their fingerprint appearance until it heals.

Fortunately, the two groups considering these problems put their heads together and found a common solution, using 3D technology to enable authentication using the one body part all users are guaranteed to have -, a new PAM module that uses an array USB attached web cams (or just one if the user is willing to spin their chair during login) to take pictures of the users head from all angles, create a 3D model and compare it to the one in the authentication database. While Mythbusters has shown how easy it can be to fool common fingerprint scanners, we have not yet seen any evidence that people can impersonate the shape of another user's cranium, no matter how long they spend beating their head against the wall to reshape it.

This could possibly be extended to group users, using modern versions of some of the older phrenological studies, such as giving all users with long grey beards access to the System Architect role, or automatically placing users with pointy spikes in their hair into an easy use mode.

Unfortunately, there are still some unsolved technical challenges we haven't figured out how to overcome. Currently, a visit to the hair salon causes your existing authentication to expire, and some users have found that shaving their heads is the only way to avoid bad hair days becoming bad login days.

Reaction to these ideas

After gathering all our notes on these ideas from the engineering brainstorming meeting, we took them in to present to our management. Unfortunately, most of their reaction cannot be printed here, and they chose not to accept any of these ideas as they were, but they did have some feedback for us to consider as they sent us back to the drawing board.

They strongly suggested our ideas would be better presented if we weren't trying to decipher ink blotches that had been smeared by the condensation when we put our pint glasses on the napkins we were taking notes on, and to that end let us know they would not be approving any more engineering offsites in Irish themed pubs on the Friday of a Saint Patrick's Day weekend. (Hopefully they mean that situation specifically and aren't going to deny the funding for travel to this year's X.Org Developer's Conference just because it happens to be in Bavaria and ending on the Friday of the weekend Oktoberfest starts.)

They recommended our research techniques could be improved over just sitting around reading blogs and checking our Facebook, Twitter, and Pinterest accounts, such as considering input from alternate viewpoints on topics such as gamification.

They also mentioned that Oracle hadn't fully adopted some of Sun's common practices and we might have to try harder to get those to be accepted now that we are one unified company.

So as I said at the beginning, don't pester your sales rep just yet for any of these, since they didn't get approved, but if you have better ideas, pass them on and maybe they'll get into our next batch of planning.

Monday Dec 19, 2011

Flame on: Graphing hot paths through X server code

Last week, Brendan Gregg, who wrote the book on DTrace, published a new tool for visualizing hot sections of code by sampling the process stack every few milliseconds and then graphing which stacks were seen the most during the run: Flame Graphs. This looked neat, but my experience has been that I get the most understanding out of things like this by trying them out myself, so I did.

Fortunately, it's very easy to setup, as Brendan provided the tools as two standalone perl scripts you can download from the Flame Graph github repo. Then the next step is deciding what you want to run it on and capturing the data from a run of that.

It just so turns out that a few days before the Flame Graph release, another mail had crossed my inbox that gave me something to look at. Several years ago, I added some Userland Statically Defined Tracing probes to the Xserver code base, creating an Xserver provider for DTrace. These got integrated first to the Solaris Xservers (both Xsun & Xorg) and then merged upstream to the X.Org community release for Xserver 1.4, where they're available for all platforms with DTrace support.

Earlier this year, Adam Jackson enabled the dtrace hooks in the Fedora Xorg server package, using the SystemTap static probe support for DTrace probe compatibility. He then noticed a performance drop, which isn't supposed to happen, as DTrace probes are simply noops when not being actively traced, and submitted a fix for it upstream.

This was due to two of the probes I'd defined - when the Xserver processes a request from a client, there's a request-start probe just before the request is processed, and a request-done probe right after the request is processed. If you just want to see what requests a client is making you can trace either one, but if you want to measure the time taken to run a request, or determine if something is happening while a request is being processed, you need both of the probes. When they first got integrated, the code was simple:

                              ((xReq *)client->requestBuffer)->length,
                              client->index, client->requestBuffer);
                // skipping over input checking and auditing, to the main event,
                // the call to the request handling proc for the specified opcode:
                    result = (* client->requestVector[MAJOROP])(client);
                XSERVER_REQUEST_DONE(GetRequestName(MAJOROP), MAJOROP,
                              client->sequence, client->index, result);

The compiler sees XSERVER_REQUEST_START and XSERVER_REQUEST_DONE as simple function calls, so it does whatever work is necessary to set up their arguments and then calls them. Later, during the linking process, the actual call instructions are replaced with noops and the addresses recorded so that when a dtrace user enables the probe the call can be activated at that time. In these cases, that's not so bad, just a bunch of register access and memory loads of things that are going to be needed nearby. The one outlier is GetRequestName(MAJOROP) which looks like a function call, but was really just a macro that used the opcode as the index an array of strings and returned the string name for the opcode so that DTrace probes could see the request names, especially for extensions which don't have static opcode mappings. For that the compiler would just load a register with the address of the base of the array and then add the offset of the entry specified by MAJOROP in that array.

All was well and good for a bit, until a later project came along during the Xorg 1.5 development cycle to unify all the different lists of protocol object names in the Xserver, as there were different ones in use by the DTrace probes, the security extensions, and the resource management system. That replaced the simple array lookup macro with a function call. While the function doesn't do a lot more work, it does enough to be noticed, and thus the performance hit was taken in the hot path of request dispatching. Adam's patch to fix this simply uses is-enabled probes to only make those function calls when the probes are actually enabled. x11perf testing showed the win on a Athlon 64 3200+ test system running Solaris 11:

Before: 250000000 trep @   0.0001 msec ( 8500000.0/sec): X protocol NoOperation
After:  300000000 trep @   0.0001 msec (10400000.0/sec): X protocol NoOperation 

But numbers are boring, and this gave an excuse to try out Flame Graphs. To capture the data, I took advantage of synchronization features built into xinit and dtrace -c:

# xinit /usr/sbin/dtrace -x ustackframes=100 \
  -n 'profile-997 /execname == "Xorg"/ { @[ustack()] = count(); }' \
  -o out.user_stacks.noop -c "x11perf -noop" \
  -- -config xorg.conf.dummy
To explain this command, I'll start at the end. For xinit, everything after the double dash is a set of arguments to the Xserver it starts, in this case, Xorg is told to look in the normal config paths for xorg.conf.dummy, which it would find is this simple config file in /etc/X11/xorg.conf.dummy setting the driver to the Xvfb-like “dummy” driver, which just uses RAM as a frame buffer to take graphics driver considerations out of this test:
Section "Device"
	Identifier  "Card0"
	Driver      "dummy"
Since I'm using a modern Xorg version, that's all the configuration needed, all the unspecified sections are autoconfigured. xinit starts the Xserver, waits for the Xserver to signal that it's finished its start up, and then runs the first half of the command line as a client with the DISPLAY set to the new Xserver. In this case it runs the dtrace command, which sets up the probes based on the examples in the Flame Graphs README, and then runs the command specified as the -c argument, the x11perf benchmark tool. When x11perf exits, dtrace stops the probes, generates its report, and then exits itself, which in turn causes xinit to shut down the X server and exit.

The resulting Flame Graphs are, in their native SVG interactive form:



If your browser can't handle those as inline SVG's, or they're scrunched too small to read, try Before (SVG) or Before (PNG), and After (SVG) or After (PNG).

You can see in the first one a bar showing a little over 10% of the time was in stacks involving LookupMajorName, which is completely gone in the second patch. Those who saw Adam's patch series come across the xorg-devel list last week may also notice the presence of XaceHook calls, which Adam optimized in another patch. Unfortunately, while I did build with that patch as well, we don't get the benefits of it since the XC-Security extension is on by default, and those fill in the hooks, so it can't just bypass them as it does when the hooks are empty.

I also took measurements of what Xorg did as gdm started up and a test user logged in, which produced the much larger flame graph you can see in SVG or PNG. As you can see the recursive calls in the font catalogue scanning functions make for some really tall flame graphs. You can also see that, to no one's surprise, xf86SlowBCopy is slow, and a large portion of the time is spent “blitting” bits from one place to another. Some potential areas for improvement stand out - like the 5.7% of time spent rescanning the font path because the Solaris gnome session startup scripts make xset fp calls to add the fonts for the current locale to the legacy font path for old clients that still use it, and another nearly 5% handling the ListFonts and ListFontsWithInfo calls, which dtracing with the request-start probes turned out to be the Java GUI for the Solaris Visual Panels gnome-panel applet.

Now because of the way the data for these is gathered, from looking at them alone you can't tell if a wide bar is one really long call to a function (as it is for the main() function bar in all these) or millions of individual calls (as it was for the ProcNoOperation calls in the x11perf -noop trace), but it does give a quick and easy way to pick out which functions the program is spending most of its time in, as a first pass for figuring out where to dig deeper for potential performance improvements.

Brendan has made these scripts easy to use to generate these graphs, so I encourage you to try them out as well on some sample runs to get familiar with them, so that when you really need them, you know what cases they're good for and how to capture the data and generate the graphs for yourself. Trying really is the best method of learning.


Engineer working on Oracle Solaris and with the X.Org open source community.


The views expressed on this blog are my own and do not necessarily reflect the views of Oracle, the X.Org Foundation, or anyone else.

See Also
Follow me on twitter


« March 2015