Friday Nov 01, 2013

Finding nuggets in ARC discussions

A bit over twenty years ago, Sun formed an Architecture Review Committee (ARC) that evaluates proposals to change interfaces between components in Sun software products. During the OpenSolaris days, we opened many of these discussions to the community. While they’re back behind closed doors, and at a different company now, we still continue to hold these reviews for the software from what’s now the Sun Systems Group division of Oracle.

Recently one of these reviews was held (via e-mail discussion) to review a proposal to update our GNU findutils package to the latest upstream release. One of the upstream changes discussed was the addition of an “oldfind” program. In findutils 4.3, find was modified to use the fts() function to walk the directory tree, and oldfind was created to provide the old mechanism in case there were bugs in the new implementation that users needed to workaround.

In Solaris 11 though, we still ship the find descended from SVR4 as /usr/bin/find and the GNU find is available as either /usr/bin/gfind or /usr/gnu/bin/find. This raised the discussion of if we should add oldfind, and if so what should we call it. Normally our policy is to only add the g* names for GNU commands that conflict with an existing Solaris command – for instance, we ship /usr/bin/emacs, not /usr/bin/gemacs. In this case however, that seemed like it would be more confusing to have /usr/bin/oldfind be the older version of /usr/bin/gfind not of /usr/bin/find. Thus if we shipped it, it would make more sense to call it /usr/bin/goldfind, which several ARC members noted read more naturally as “gold find” than as “g old find”.

One of the concerns we often discuss in ARC is if a change is likely to be understood by users or if it will result in more calls to support. As we hit this part of the discussion on a Friday at the end of a long week, I couldn’t resist putting forth a hypothetical support call for this command:

“Hello, Oracle Solaris Support, how may I help you?”

“My admin is out sick, but he sent an email that he put the findutils package on our server, and I can run goldfind now. I tried it, but goldfind didn’t find gold.”

“Did he get the binutils package too?”

“No he just said findutils, do we need binutils?”

“Well, gold comes in the binutils package, so goldfind would be able to find gold if you got that package.”

“How much does Oracle charge for that package?”

“It’s free for Solaris users.”

“You mean Oracle ships packages of gold to customers for free?”

“Yes, if you get the binutils package, it includes GNU gold.”

“New gold? Is that some sort of alchemy, turning stuff into gold?”

“Not new gold, gold from the GNU project.”

“Oracle’s taking gold from the GNU project and shipping it to me?”

“Yes, if you get binutils, that package includes gold along with the other tools from the GNU project.”

“And GNU doesn’t mind Oracle taking their gold and giving it to customers?”

“No, GNU is a non-profit whose goal is to share their software.”

“Sharing software sure, but gold? Where does a non-profit like GNU get gold anyway?”

“Oh, Google donated it to them.”

“Ah! So Oracle will give me the gold that GNU got from Google!”

“Yes, if you get the package from us.”

“How do I get the package with the gold?”

“Just run pkg install binutils and it will put it on your disk.”

“We’ve got multiple disks here - which one will it put it on?”

“The one with the system image - do you know which one that is?

“Well the note from the admin says the system is on the first disk and the users are on the second disk.”

“Okay, so it should go on the first disk then.”

“And where will I find the gold?”

“It will be in the /usr/bin directory.”

“In the user’s bin? So thats on the second disk?”

“No, it would be on the system disk, with the other development tools, like make, as, and what.”

“So what’s on the first disk?”

“Well if the system image is there the commands should all be there.”

“All the commands? Not just what?”

“Right, all the commands that come with the OS, like the shell, ps, and who.”

“So who’s on the first disk too?”

Picture of Abbott and Costello

“Yes. Did your admin say when he’d be back?”

“No, just that he had a massive headache and was going home after I tried to get him to explain this stuff to me.”

“I can’t imagine why.”

“Oh, is why a command too?”

“No, _why was a Ruby programmer.

“Ruby? Do you give those away with the gold too?”

“Yes, but it comes in the ruby package, not binutils.”

“Oh, I’ll have to have my admin get that package too! Thanks!”

Needless to say, we decided this might not be the best idea. Since the GNU package hasn’t had to release a serious bug fix in the new find in the past few years, the new GNU find seems pretty stable, and we always have the SVR4 find to use as a fallback in Solaris, so it didn’t seem that adding oldfind was really necessary, so we passed on including it when we update to the new findutils release.

[Apologies to Abbott, Costello, their fans, and everyone who read this far. The Gold (linker) page on Wikipedia may explain some of the above, but can’t explain why goldfind is the old GNU find, but gold is the new GNU ld.]

Sunday Nov 11, 2012

Solaris 11.1 changes building of code past the point of __NORETURN

While Solaris 11.1 was under development, we started seeing some errors in the builds of the upstream X.Org git master sources, such as:

"Display.c", line 65: Function has no return statement : x_io_error_handler
"hostx.c", line 341: Function has no return statement : x_io_error_handler
from functions that were defined to match a specific callback definition that declared them as returning an int if they did return, but these were calling exit() instead of returning so hadn't listed a return value.

These had been generating warnings for years which we'd been ignoring, but X.Org has made enough progress in cleaning up code for compiler warnings and static analysis issues lately, that the community turned up the default error levels, including the gcc flag -Werror=return-type and the equivalent Solaris Studio cc flags -v -errwarn=E_FUNC_HAS_NO_RETURN_STMT, so now these became errors that stopped the build. Yet on Solaris, gcc built this code fine, while Studio errored out. Investigation showed this was due to the Solaris headers, which during Solaris 10 development added a number of annotations to the headers when gcc was being used for the amd64 kernel bringup before the Studio amd64 port was ready. Since Studio did not support the inline form of these annotations at the time, but instead used #pragma for them, the definitions were only present for gcc.

To resolve this, I fixed both sides of the problem, so that it would work for building new X.Org sources on older Solaris releases or with older Studio compilers, as well as fixing the general problem before it broke more software building on Solaris.

To the X.Org sources, I added the traditional Studio #pragma does_not_return to recognize that functions like exit() don't ever return, in patches such as this Xserver patch. Adding a dummy return statement was ruled out as that introduced unreachable code errors from compilers and analyzers that correctly realized you couldn't reach that code after a return statement.

And on the Solaris 11.1 side, I updated the annotation definitions in <sys/ccompile.h> to enable for Studio 12.0 and later compilers the annotations already existing in a number of system headers for functions like exit() and abort(). If you look in that file you'll see the annotations we currently use, though the forms there haven't gone through review to become a Committed interface, so may change in the future.

Actually getting this integrated into Solaris though took a bit more work than just editing one header file. Our ELF binary build comparison tool, wsdiff, actually showed a large number of differences in the resulting binaries due to the compiler using this information for branch prediction, code path analysis, and other possible optimizations, so after comparing enough of the disassembly output to be comfortable with the changes, we also made sure to get this in early enough in the release cycle so that it would get plenty of test exposure before the release.

It also required updating quite a bit of code to avoid introducing new lint or compiler warnings or errors, and people building applications on top of Solaris 11.1 and later may need to make similar changes if they want to keep their build logs similarly clean.

Previously, if you had a function that was declared with a non-void return type, lint and cc would warn if you didn't return a value, even if you called a function like exit() or panic() that ended execution. For instance:

#include <stdlib.h>

int
callback(int status)
{
    if (status == 0)
        return status;
    exit(status);
}
would previously require a never executed return 0; after the exit() to avoid lint warning "function falls off bottom without returning value".

Now the compiler & lint will both issue "statement not reached" warnings for a return 0; after the final exit(), allowing (or in some cases, requiring) it to be removed. However, if there is no return statement anywhere in the function, lint will warn that you've declared a function returning a value that never does so, suggesting you can declare it as void. Unfortunately, if your function signature is required to match a certain form, such as in a callback, you not be able to do so, and will need to add a /* LINTED */ to the end of the function.

If you need your code to build on both a newer and an older release, then you will either need to #ifdef these unreachable statements, or, to keep your sources common across releases, add to your sources the corresponding #pragma recognized by both current and older compiler versions, such as:

#pragma does_not_return(exit)
#pragma does_not_return(panic) 
Hopefully this little extra work is paid for by the compilers & code analyzers being able to better understand your code paths, giving you better optimizations and more accurate errors & warning messages.

Sunday Oct 28, 2012

Solaris 11.1: Changes to included FOSS packages

Besides the documentation changes I mentioned last time, another place you can see Solaris 11.1 changes before upgrading is in the online package repository, now that the 11.1 packages have been published to http://pkg.oracle.com/solaris/release/, as the “0.175.1.0.0.24.2” branch. (Oracle Solaris Package Versioning explains what each field in that version string means.)

When you’re ready to upgrade to the packages from either this repo, or the support repository, you’ll want to first read How to Update to Oracle Solaris 11.1 Using the Image Packaging System by Pete Dennis, as there are a couple issues you will need to be aware of to do that upgrade, several of which are due to changes in the Free and Open Source Software (FOSS) packages included with Solaris, as I’ll explain in a bit.

Solaris 11 can update more readily than Solaris 10

In the Solaris 10 and older update models, the way the updates were built constrained what changes we could make in those releases. To change an existing SVR4 package in those releases, we created a Solaris Patch, which applied to a given version of the SVR4 package and replaced, added or deleted files in it. These patches were released via the support websites (originally SunSolve, now My Oracle Support) for applying to existing Solaris 10 installations, and were also merged into the install images for the next Solaris 10 update release. (This Solaris Patches blog post from Gerry Haskins dives deeper into that subject.)

Some of the restrictions of this model were that package refactoring, changes to package dependencies, and even just changing the package version number, were difficult to do in this hybrid patch/OS update model. For instance, when Solaris 10 first shipped, it had the Xorg server from X11R6.8. Over the first couple years of update releases we were able to keep it up to date by replacing, adding, & removing files as necessary, taking it all the way up to Xorg server release 1.3 (new version numbering begun after the X11R7 split of the X11 tree into separate modules gave each module its own version). But if you run pkginfo on the SUNWxorg-server package, you’ll see it still displayed a version number of 6.8, confusing users as to which version was actually included.

We stopped upgrading the Xorg server releases in Solaris 10 after 1.3, as later versions added new dependencies, such as HAL, D-Bus, and libpciaccess, which were very difficult to manage in this patching model. (We later got libpciaccess to work, but HAL & D-Bus would have been much harder due to the greater dependency tree underneath those.) Similarly, every time the GNOME team looked into upgrading Solaris 10 past GNOME 2.6, they found these constraints made it so difficult it wasn’t worthwhile, and eventually GNOME’s dependencies had changed enough it was completely infeasible. Fortunately, this worked out for both the X11 & GNOME teams, with our management making the business decision to concentrate on the “Nevada” branch for desktop users - first as Solaris Express Desktop Edition, and later as OpenSolaris, so we didn’t have to fight to try to make the package updates fit into these tight constraints.

Meanwhile, the team designing the new packaging system for Solaris 11 was seeing us struggle with these problems, and making this much easier to manage for both the development teams and our users was one of their big goals for the IPS design they were working on. Now that we’ve reached the first update release to Solaris 11, we can start to see the fruits of their labors, with more FOSS updates in 11.1 than we had in many Solaris 10 update releases, keeping software more up to date with the upstream communities.

Of course, just because we can more easily update now, doesn’t always mean we should or will do so, it just removes the package system limitations from forcing the decision for us. So while we’ve upgraded the X Window System in the 11.1 release from X11R7.6 to 7.7, the Solaris GNOME team decided it was not the right time to try to make the jump from GNOME 2 to GNOME 3, though they did update some individual components of the desktop, especially those with security fixes like Firefox. In other parts of the system, decisions as to what to update were prioritized based on how they affected other projects, or what customer requests we’d gotten for them.

So with all that background in place, what packages did we actually update or add between Solaris 11.0 and 11.1?

Core OS Functionality

One of the FOSS changes with the biggest impact in this release is the upgrade from Grub Legacy (0.97) to Grub 2 (1.99) for the x64 platform boot loader. This is the cause of one of the upgrade quirks, since to go from Solaris 11.0 to 11.1 on x64 systems, you first need to update the Boot Environment tools (such as beadm) to a new version that can handle boot environments that use the Grub2 boot loader. System administrators can find the details they need to know about the new Grub in the Administering the GRand Unified Bootloader chapter of the Booting and Shutting Down Oracle Solaris 11.1 Systems guide. This change was necessary to be able to support new hardware coming into the x64 marketplace, including systems using UEFI firmware or booting off disk drives larger than 2 terabytes.

For both platforms, Solaris 11.1 adds rsyslog as an optional alternative to the traditional syslogd, and OpenSCAP for checking security configuration settings are compliant with site policies.

Note that the support repo actually has newer versions of BIND & fetchmail than the 11.1 release, as some late breaking critical fixes came through from the community upstream releases after the Solaris 11.1 release was frozen, and made their way to the support repository. These are responsible for the other big upgrade quirk in this release, in which to upgrade a system which already installed those versions from the support repo, you need to either wait for those packages to make their way to the 11.1 branch of the support repo, or follow the steps in the aforementioned upgrade walkthrough to let the package system know it's okay to temporarily downgrade those.

Developer Stack

While Solaris 11.0 included Python 2.7, many of the bundled python modules weren’t packaged for it yet, limiting its usability. For 11.1, many more of the python modules include 2.7 versions (enough that I filtered them out of the below table, but you can always search on the package repository server for them.

For other language runtimes and development tools, 11.1 expands the use of IPS mediated links to choose which version of a package is the default when the packages are designed to allow multiple versions to install side by side.

For instance, in Solaris 11.0, GNU automake 1.9 and 1.10 were provided, and developers had to run them as either automake-1.9 or automake-1.10. In Solaris 11.1, when automake 1.11 was added, also added was a /usr/bin/automake mediated link, which points to the automake-1.11 program by default, but can be changed to another version by running the pkg set-mediator command.

Mediated links were also used for the Java runtime & development kits in 11.1, changing the default versions to the Java 7 releases (the 1.7.0.x package versions), while allowing admins to switch links such as /usr/bin/javac back to Java 6 if they need to for their site, to deal with Java 7 compatibility or other issues, without having to update each usage to use the full versioned /usr/jdk/jdk1.6.0_35/bin/javac paths for every invocation.

Desktop Stack

As I mentioned before, we upgraded from X11R7.6 to X11R7.7, since a pleasant coincidence made the X.Org release dates line up nicely with our feature & code freeze dates for this release. (Or perhaps it wasn’t so coincidental, after all, one of the benefits of being the person making the release is being able to decide what schedule is most convenient for you, and this one worked well for me.) For the table below, I’ve skipped listing the packages in which we use the X11 “katamari” version for the Solaris package version (mainly packages combining elements of multiple upstream modules with independent version numbers), since they just all changed from 7.6 to 7.7.

In the graphics drivers, we worked with Intel to update the Intel Integrated Graphics Processor support to support 3D graphics and kernel mode setting on the Ivy Bridge chipsets, and updated Nvidia’s non-FOSS graphics driver from 280.13 to 295.20.

Higher up in the desktop stack, PulseAudio was added for audio support, and liblouis for Braille support, and the GNOME applications were built to use them.

The Mozilla applications, Firefox & Thunderbird moved to the current Extended Support Release (ESR) versions, 10.x for each, to bring up-to-date security fixes without having to be on Mozilla’s agressive 6 week feature cycle release train.

Detailed list of changes

This table shows most of the changes to the FOSS packages between Solaris 11.0 and 11.1. As noted above, some were excluded for clarity, or to reduce noise and duplication. All the FOSS packages which didn't change the version number in their packaging info are not included, even if they had updates to fix bugs, security holes, or add support for new hardware or new features of Solaris.

Package11.011.1
archiver/unrar 3.8.5 4.1.4
audio/sox 14.3.0 14.3.2
backup/rdiff-backup 1.2.1 1.3.3
communication/im/pidgin 2.10.0 2.10.5
compress/gzip 1.3.5 1.4
compress/xz not included 5.0.1
database/sqlite-3 3.7.6.3 3.7.11
desktop/remote-desktop/tigervnc 1.0.90 1.1.0
desktop/window-manager/xcompmgr 1.1.5 1.1.6
desktop/xscreensaver 5.12 5.15
developer/build/autoconf 2.63 2.68
developer/build/autoconf/xorg-macros 1.15.0 1.17
developer/build/automake-111 not included 1.11.2
developer/build/cmake 2.6.2 2.8.6
developer/build/gnu-make 3.81 3.82
developer/build/imake 1.0.4 1.0.5
developer/build/libtool 1.5.22 2.4.2
developer/build/makedepend 1.0.3 1.0.4
developer/documentation-tool/doxygen 1.5.7.1 1.7.6.1
developer/gnu-binutils 2.19 2.21.1
developer/java/jdepend not included 2.9
developer/java/jdk-6 1.6.0.26 1.6.0.35
developer/java/jdk-7 1.7.0.0 1.7.0.7
developer/java/jpackage-utils not included 1.7.5
developer/java/junit 4.5 4.10
developer/lexer/jflex not included 1.4.1
developer/parser/byaccj not included 1.14
developer/parser/java_cup not included 0.10
developer/quilt 0.47 0.60
developer/versioning/git 1.7.3.2 1.7.9.2
developer/versioning/mercurial 1.8.4 2.2.1
developer/versioning/subversion 1.6.16 1.7.5
diagnostic/constype 1.0.3 1.0.4
diagnostic/nmap 5.21 5.51
diagnostic/scanpci 0.12.1 0.13.1
diagnostic/wireshark 1.4.8 1.8.2
diagnostic/xload 1.1.0 1.1.1
editor/gnu-emacs 23.1 23.4
editor/vim 7.3.254 7.3.600
file/lndir 1.0.2 1.0.3
image/editor/bitmap 1.0.5 1.0.6
image/gnuplot 4.4.0 4.6.0
image/library/libexif 0.6.19 0.6.21
image/library/libpng 1.4.8 1.4.11
image/library/librsvg 2.26.3 2.34.1
image/xcursorgen 1.0.4 1.0.5
library/audio/pulseaudio not included 1.1
library/cacao 2.3.0.0 2.3.1.0
library/expat 2.0.1 2.1.0
library/gc 7.1 7.2
library/graphics/pixman 0.22.0 0.24.4
library/guile 1.8.4 1.8.6
library/java/javadb 10.5.3.0 10.6.2.1
library/java/subversion 1.6.16 1.7.5
library/json-c not included 0.9
library/libedit not included 3.0
library/libee not included 0.3.2
library/libestr not included 0.1.2
library/libevent 1.3.5 1.4.14.2
library/liblouis not included 2.1.1
library/liblouisxml not included 2.1.0
library/libtecla 1.6.0 1.6.1
library/libtool/libltdl 1.5.22 2.4.2
library/nspr 4.8.8 4.8.9
library/openldap 2.4.25 2.4.30
library/pcre 7.8 8.21
library/perl-5/subversion 1.6.16 1.7.5
library/python-2/jsonrpclib not included 0.1.3
library/python-2/lxml 2.1.2 2.3.3
library/python-2/nose not included 1.1.2
library/python-2/pyopenssl not included 0.11
library/python-2/subversion 1.6.16 1.7.5
library/python-2/tkinter-26 2.6.4 2.6.8
library/python-2/tkinter-27 2.7.1 2.7.3
library/security/nss 4.12.10 4.13.1
library/security/openssl 1.0.0.5 (1.0.0e) 1.0.0.10 (1.0.0j)
mail/thunderbird 6.0 10.0.6
network/dns/bind 9.6.3.4.3 9.6.3.7.2
package/pkgbuild not included 1.3.104
print/filter/enscript not included 1.6.4
print/filter/gutenprint 5.2.4 5.2.7
print/lp/filter/foomatic-rip 3.0.2 4.0.15
runtime/java/jre-6 1.6.0.26 1.6.0.35
runtime/java/jre-7 1.7.0.0 1.7.0.7
runtime/perl-512 5.12.3 5.12.4
runtime/python-26 2.6.4 2.6.8
runtime/python-27 2.7.1 2.7.3
runtime/ruby-18 1.8.7.334 1.8.7.357
runtime/tcl-8/tcl-sqlite-3 3.7.6.3 3.7.11
security/compliance/openscap not included 0.8.1
security/nss-utilities 4.12.10 4.13.1
security/sudo 1.8.1.2 1.8.4.5
service/network/dhcp/isc-dhcp 4.1 4.1.0.6
service/network/dns/bind 9.6.3.4.3 9.6.3.7.2
service/network/ftp (ProFTPD) 1.3.3.0.5 1.3.3.0.7
service/network/samba 3.5.10 3.6.6
shell/conflict 0.2004.9.1 0.2010.6.27
shell/pipe-viewer 1.1.4 1.2.0
shell/zsh 4.3.12 4.3.17
system/boot/grub 0.97 1.99
system/font/truetype/liberation 1.4 1.7.2
system/library/freetype-2 2.4.6 2.4.9
system/library/libnet 1.1.2.1 1.1.5
system/management/cim/pegasus 2.9.1 2.11.0
system/management/ipmitool 1.8.10 1.8.11
system/management/wbem/wbemcli 1.3.7 1.3.9.1
system/network/routing/quagga 0.99.8 0.99.19
system/rsyslog not included 6.2.0
terminal/luit 1.1.0 1.1.1
text/convmv 1.14 1.15
text/gawk 3.1.5 3.1.8
text/gnu-grep 2.5.4 2.10
web/browser/firefox 6.0.2 10.0.6
web/browser/links 1.0 1.0.3
web/java-servlet/tomcat 6.0.33 6.0.35
web/php-53 not included 5.3.14
web/php-53/extension/php-apc not included 3.1.9
web/php-53/extension/php-idn not included 0.2.0
web/php-53/extension/php-memcache not included 3.0.6
web/php-53/extension/php-mysql not included 5.3.14
web/php-53/extension/php-pear not included 5.3.14
web/php-53/extension/php-suhosin not included 0.9.33
web/php-53/extension/php-tcpwrap not included 1.1.3
web/php-53/extension/php-xdebug not included 2.2.0
web/php-common not included 11.1
web/proxy/squid 3.1.8 3.1.18
web/server/apache-22 2.2.20 2.2.22
web/server/apache-22/module/apache-sed 2.2.20 2.2.22
web/server/apache-22/module/apache-wsgi not included 3.3
x11/diagnostic/xev 1.1.0 1.2.0
x11/diagnostic/xscope 1.3 1.3.1
x11/documentation/xorg-docs 1.6 1.7
x11/keyboard/xkbcomp 1.2.3 1.2.4
x11/library/libdmx 1.1.1 1.1.2
x11/library/libdrm 2.4.25 2.4.32
x11/library/libfontenc 1.1.0 1.1.1
x11/library/libfs 1.0.3 1.0.4
x11/library/libice 1.0.7 1.0.8
x11/library/libsm 1.2.0 1.2.1
x11/library/libx11 1.4.4 1.5.0
x11/library/libxau 1.0.6 1.0.7
x11/library/libxcb 1.7 1.8.1
x11/library/libxcursor 1.1.12 1.1.13
x11/library/libxdmcp 1.1.0 1.1.1
x11/library/libxext 1.3.0 1.3.1
x11/library/libxfixes 4.0.5 5.0
x11/library/libxfont 1.4.4 1.4.5
x11/library/libxft 2.2.0 2.3.1
x11/library/libxi 1.4.3 1.6.1
x11/library/libxinerama 1.1.1 1.1.2
x11/library/libxkbfile 1.0.7 1.0.8
x11/library/libxmu 1.1.0 1.1.1
x11/library/libxmuu 1.1.0 1.1.1
x11/library/libxpm 3.5.9 3.5.10
x11/library/libxrender 0.9.6 0.9.7
x11/library/libxres 1.0.5 1.0.6
x11/library/libxscrnsaver 1.2.1 1.2.2
x11/library/libxtst 1.2.0 1.2.1
x11/library/libxv 1.0.6 1.0.7
x11/library/libxvmc 1.0.6 1.0.7
x11/library/libxxf86vm 1.1.1 1.1.2
x11/library/mesa 7.10.2 7.11.2
x11/library/toolkit/libxaw7 1.0.9 1.0.11
x11/library/toolkit/libxt 1.0.9 1.1.3
x11/library/xtrans 1.2.6 1.2.7
x11/oclock 1.0.2 1.0.3
x11/server/xdmx 1.10.3 1.12.2
x11/server/xephyr 1.10.3 1.12.2
x11/server/xorg 1.10.3 1.12.2
x11/server/xorg/driver/xorg-input-keyboard 1.6.0 1.6.1
x11/server/xorg/driver/xorg-input-mouse 1.7.1 1.7.2
x11/server/xorg/driver/xorg-input-synaptics 1.4.1 1.6.2
x11/server/xorg/driver/xorg-input-vmmouse 12.7.0 12.8.0
x11/server/xorg/driver/xorg-video-ast 0.91.10 0.93.10
x11/server/xorg/driver/xorg-video-ati 6.14.1 6.14.4
x11/server/xorg/driver/xorg-video-cirrus 1.3.2 1.4.0
x11/server/xorg/driver/xorg-video-dummy 0.3.4 0.3.5
x11/server/xorg/driver/xorg-video-intel 2.10.0 2.18.0
x11/server/xorg/driver/xorg-video-mach64 6.9.0 6.9.1
x11/server/xorg/driver/xorg-video-mga 1.4.13 1.5.0
x11/server/xorg/driver/xorg-video-openchrome 0.2.904 0.2.905
x11/server/xorg/driver/xorg-video-r128 6.8.1 6.8.2
x11/server/xorg/driver/xorg-video-trident 1.3.4 1.3.5
x11/server/xorg/driver/xorg-video-vesa 2.3.0 2.3.1
x11/server/xorg/driver/xorg-video-vmware 11.0.3 12.0.2
x11/server/xserver-common 1.10.3 1.12.2
x11/server/xvfb 1.10.3 1.12.2
x11/server/xvnc 1.0.90 1.1.0
x11/session/sessreg 1.0.6 1.0.7
x11/session/xauth 1.0.6 1.0.7
x11/session/xinit 1.3.1 1.3.2
x11/transset 0.9.1 1.0.0
x11/trusted/trusted-xorg 1.10.3 1.12.2
x11/x11-window-dump 1.0.4 1.0.5
x11/xclipboard 1.1.1 1.1.2
x11/xclock 1.0.5 1.0.6
x11/xfd 1.1.0 1.1.1
x11/xfontsel 1.0.3 1.0.4
x11/xfs 1.1.1 1.1.2

P.S. To get the version numbers for this table, I ran a quick perl script over the output from:

% pkg contents -H -r -t depend -a type=incorporate -o fmri \
  `pkg contents -H -r -t depend -a type=incorporate -o fmri entire@0.5.11,5.11-0.175.1.0.0.24` \
  | sort >> /tmp/11.1
% pkg contents -H -r -t depend -a type=incorporate -o fmri \
  `pkg contents -H -r -t depend -a type=incorporate -o fmri entire@0.5.11,5.11-0.175.0.0.0.2` \
  | sort >> /tmp/11.0

Thursday Oct 25, 2012

Documentation Changes in Solaris 11.1

One of the first places you can see Solaris 11.1 changes are in the docs, which have now been posted in the Solaris 11.1 Library on docs.oracle.com. I spent a good deal of time reviewing documentation for this release, and thought some would be interesting to blog about, but didn't review all the changes (not by a long shot), and am not going to cover all the changes here, so there's plenty left for you to discover on your own.

Just comparing the Solaris 11.1 Library list of docs against the Solaris 11 list will show a lot of reorganization and refactoring of the doc set, especially in the system administration guides. Hopefully the new break down will make it easier to get straight to the sections you need when a task is at hand.

Packaging System

Unfortunately, the excellent in-depth guide for how to build packages for the new Image Packaging System (IPS) in Solaris 11 wasn't done in time to make the initial Solaris 11 doc set. An interim version was published shortly after release, in PDF form on the OTN IPS page. For Solaris 11.1 it was included in the doc set, as Packaging and Delivering Software With the Image Packaging System in Oracle Solaris 11.1, so should be easier to find, and easier to share links to specific pages the HTML version.

Beyond just how to build a package, it includes details on how Solaris is packaged, and how package updates work, which may be useful to all system administrators who deal with Solaris 11 upgrades & installations. The Adding and Updating Oracle Solaris 11.1 Software Packages was also extended, including new sections on Relaxing Version Constraints Specified by Incorporations and Locking Packages to a Specified Version that may be of interest to those who want to keep the Solaris 11 versions of certain packages when they upgrade, such as the couple of packages that had functionality removed by an (unusual for an update release) End of Feature process in the 11.1 release.

Also added in this release is a document containing the lists of all the packages in each of the major package groups in Solaris 11.1 (solaris-desktop, solaris-large-server, and solaris-small-server). While you can simply get the contents of those groups from the package repository, either via the web interface or the pkg command line, the documentation puts them in handy tables for easier side-by-side comparison, or viewing the lists before you've installed the system to pick which one you want to initially install.

X Window System

We've not had good X11 coverage in the online Solaris docs in a while, mostly relying on the man pages, and upstream X.Org docs. In this release, we've integrated some X coverage into the Solaris 11.1 Desktop Adminstrator's Guide, including sections on installing fonts for fontconfig or legacy X11 clients, X server configuration, and setting up remote access via X11 or VNC. Of course we continue to work on improving the docs, including a lot of contributions to the upstream docs all OS'es share (more about that another time).

Security

One of the things Oracle likes to do for its products is to publish security guides for administrators & developers to know how to build systems that meet their security needs. For Solaris, we started this with Solaris 11, providing a guide for sysadmins to find where the security relevant configuration options were documented. The Solaris 11.1 Security Guidelines extend this to cover new security features, such as Address Space Layout Randomization (ASLR) and Read-Only Zones, as well as adding additional guidelines for existing features, such as how to limit the size of tmpfs filesystems, to avoid users driving the system into swap thrashing situations.

For developers, the corresponding document is the Developer's Guide to Oracle Solaris 11 Security, which has been the source for years for documentation of security-relevant Solaris API's such as PAM, GSS-API, and the Solaris Cryptographic Framework. For Solaris 11.1, a new appendix was added to start providing Secure Coding Guidelines for Developers, leveraging the CERT Secure Coding Standards and OWASP guidelines to provide the base recommendations for common programming languages and their standard API's. Solaris specific secure programming guidance was added via links to other documentation in the product doc set.

In parallel, we updated the Solaris C Libary Functions security considerations list with details of Solaris 11 enhancements such as FD_CLOEXEC flags, additional *at() functions, and new stdio functions such as asprintf() and getline(). A number of code examples throughout the Solaris 11.1 doc set were updated to follow these recommendations, changing unbounded strcpy() calls to strlcpy(), sprintf() to snprintf(), etc. so that developers following our examples start out with safer code. The Writing Device Drivers guide even had the appendix updated to list which of these utility functions, like snprintf() and strlcpy(), are now available via the Kernel DDI.

Little Things

Of course all the big new features got documented, and some major efforts were put into refactoring and renovation, but there were also a lot of smaller things that got fixed as well in the nearly a year between the Solaris 11 and 11.1 doc releases - again too many to list here, but a random sampling of the ones I know about & found interesting or useful:

Saturday Mar 31, 2012

Solaris: What comes next?

As you probably know by now, a few months ago, we released Solaris 11 after years of development. That of course means we now need to figure out what comes next - if Solaris 11 is “The First Cloud OS”, then what do we need to make future releases of Solaris be, to be modern and competitive when they're released? So we've been having planning and brainstorming meetings, and I've captured some notes here from just one of those we held a couple weeks ago with a number of the Silicon Valley based engineers.

Now before someone sees an idea here and calls their product rep wanting to know what's up, please be warned what follows are rough ideas, and as I'll discuss later, none of them have any committment, schedule, working code, or even plan for integration in any possible future product at this time. (Please don't make me force you to read the full Oracle future product disclaimer here, you should know it by heart already from the front of every Oracle product slide deck.)

To start with, we did some background research, looking at ideas from other Oracle groups, and competitive OS'es. We examined what was hot in the technology arena and where the interesting startups were heading. We then looked at Solaris to see where we could apply those ideas.


Making Network Admins into Socially Networking Admins

Wall art at the new Facebook HQ in the former Sun MPK Campus

We all know an admin who has grumbled about being the only one stuck late at work to fix a problem on the server, or having to work the weekend alone to do scheduled maintenance. But admins are humans (at least most are), and crave companionship and community with their fellow humans. And even when they're alone in the server room, they're never far from a network connection, allowing access to the wide world of wonders on the Internet.

Our solution here is not building a new social network - there's enough of those already, and Oracle even has its own Oracle Mix social network already. What we proposed is integrating Solaris features to help engage our system admins with these social networks, building community and bringing them recognition in the workplace, using achievement recognition systems as found in many popular gaming platforms.

For instance, if you had a Facebook account, and a group of admin friends there, you could register it with our Social Network Utility For Facebook, and then your friends might see:

Alan earned the achievement Critically Patched (April 2012) for patching all his servers.
Matt is only at 50% - encourage him to complete this achievement today!
To avoid any undue risk of advertising who has unpatched servers that are easier targets for hackers to break into, this information would be tightly protected via Facebook's world-renowned privacy settings to avoid it falling into the wrong hands.

A related form of gamification we considered was replacing simple certfications with role-playing-game-style Experience Levels. Instead of just knowing an admin passed a test establishing a given level of competency, these would provide recruiters with a more detailed level of how much real-world experience an admin has. Achievements such as the one above would feed into it, but larger numbers of experience points would be gained by tougher or more critical tasks - such as recovering a down system, or migrating a service to a new platform. (As long as it was an Oracle platform of course - migrating to an HP or IBM platform would cause the admin to lose points with us.)

Unfortunately, we couldn't figure out a good way to prevent (if you will) “gaming” the system. For instance, a disgruntled admin might decide to start ignoring warnings from FMA that a part is beginning to fail or skip preventative maintenance, in the hopes that they'd cause a catastrophic failure to earn more points for bolstering their resume as they look for a job elsewhere, and not worrying about the effect on your business of a mission critical server going down.

More Z's for ZFS

Our suggested new feature for ZFS was inspired by the worlds most successful Z-startup of all time: Zynga.

Using the Social Network Utility For Facebook described above, we'd tie it in with ZFS monitoring to help you out when you find yourself in a jam needing more disk space than you have, and can't wait a month to get a purchase order through channels to buy more. Instead with the click of a button you could post to your group:

Alan can't find any space in his server farm! Can you help?
Friends could loan you some space on their connected servers for a few weeks, knowing that you'd return the favor when needed. ZFS would create a new filesystem for your use on their system, and securely share it with your system using Kerberized NFS.

If none of your friends have space, then you could buy temporary use space in small increments at affordable rates right there in Facebook, using your Facebook credits, and then file an expense report later, after the urgent need has passed.

Universal Single Sign On

One thing all the engineers agreed on was that we still had far too many "Single" sign ons to deal with in our daily work. On the web, every web site used to have its own password database, forcing us to hope we could remember what login name was still available on each site when we signed up, and which unique password we came up with to avoid having to disclose our other passwords to a new site.

In recent years, the web services world has finally been reducing the number of logins we have to manage, with many services allowing you to login using your identity from Google, Twitter or Facebook. So we proposed following their lead, introducing PAM modules for web services - no more would you have to type in whatever login name IT assigned and try to remember the password you chose the last time password aging forced you to change it - you'd simply choose which web service you wanted to authenticate against, and would login to your Solaris account upon reciept of a cookie from their identity service.

Pinning notes to the cloud

We also all noted that we all have our own pile of notes we keep in our daily work - in text files in our home directory, in notebooks we carry around, on white boards in offices and common areas, on sticky notes on our monitors, or on scraps of paper pinned to our bulletin boards. The contents of the notes vary, some are things just for us, some are useful for our groups, some we would share with the world.

For instance, when our group moved to a new building a couple years ago, we had a white board in the hallway listing all the NIS & DNS servers, subnets, and other network configuration information we needed to set up our Solaris machines after the move. Similarly, as Solaris 11 was finishing and we were all learning the new network configuration commands, we shared notes in wikis and e-mails with our fellow engineers.

Users may also remember one of the popular features of Sun's old BigAdmin site was a section for sharing scripts and tips such as these. Meanwhile, the online "pin board" at Pinterest is taking the web by storm. So we thought, why not mash those up to solve this problem?

We proposed a new BigAddPin site where users could “pin” notes, command snippets, configuration information, and so on. For instance, once they had worked out the ideal Automated Installation manifest for their app server, they could pin it up to share with the rest of their group, or choose to make it public as an example for the world. Localized data, such as our group's notes on the servers for our subnet, could be shared only to users connecting from that subnet. And notes that they didn't want others to see at all could be marked private, such as the list of phone numbers to call for late night pizza delivery to the machine room, the birthdays and anniversaries they can never remember but would be sleeping on the couch if they forgot, or the list of automatically generated completely random, impossible to remember root passwords to all their servers.

For greater integration with Solaris, we'd put support right into the command shells — redirect output to a pinned note, set your path to include pinned notes as scripts you can run, or bring up your recent shell history and pin a set of commands to save for the next time you need to remember how to do that operation.

Location service for Solaris servers

A longer term plan would involve convincing the hardware design groups to put GPS locators with wireless transmitters in future server designs. This would help both admins and service personnel trying to find servers in todays massive data centers, and could feed into location presence apps to help show potential customers that while they may not see many Solaris machines on the desktop any more, they are all around. For instance, while walking down Wall Street it might show “There are over 2000 Solaris computers in this block.”

[Note: this proposal was made before the recent media coverage of a location service aggregrator app with less noble intentions, and in hindsight, we failed to consider what happens when such data similarly falls into the wrong hands. We certainly wouldn't want our app to be misinterpreted as “There are over $20 million dollars of SPARC servers in this building, waiting for you to steal them.” so it's probably best it was rejected.]

Harnessing the power of the GPU for Security

Most modern OS'es make use of the widespread availability of high powered GPU hardware in today's computers, with desktop environments requiring 3-D graphics acceleration, whether in Ubuntu Unity, GNOME Shell on Fedora, or Aero Glass on Windows, but we haven't yet made Solaris fully take advantage of this, beyond our basic offering of Compiz on the desktop.

Meanwhile, more businesses are interested in increasing security by using biometric authentication, but must also comply with laws in many countries preventing discrimination against employees with physical limations such as missing eyes or fingers, not to mention the lost productivity when employees can't login due to tinted contacts throwing off a retina scan or a paper cut changing their fingerprint appearance until it heals.

Fortunately, the two groups considering these problems put their heads together and found a common solution, using 3D technology to enable authentication using the one body part all users are guaranteed to have - pam_phrenology.so, a new PAM module that uses an array USB attached web cams (or just one if the user is willing to spin their chair during login) to take pictures of the users head from all angles, create a 3D model and compare it to the one in the authentication database. While Mythbusters has shown how easy it can be to fool common fingerprint scanners, we have not yet seen any evidence that people can impersonate the shape of another user's cranium, no matter how long they spend beating their head against the wall to reshape it.

This could possibly be extended to group users, using modern versions of some of the older phrenological studies, such as giving all users with long grey beards access to the System Architect role, or automatically placing users with pointy spikes in their hair into an easy use mode.

Unfortunately, there are still some unsolved technical challenges we haven't figured out how to overcome. Currently, a visit to the hair salon causes your existing authentication to expire, and some users have found that shaving their heads is the only way to avoid bad hair days becoming bad login days.


Reaction to these ideas

After gathering all our notes on these ideas from the engineering brainstorming meeting, we took them in to present to our management. Unfortunately, most of their reaction cannot be printed here, and they chose not to accept any of these ideas as they were, but they did have some feedback for us to consider as they sent us back to the drawing board.

They strongly suggested our ideas would be better presented if we weren't trying to decipher ink blotches that had been smeared by the condensation when we put our pint glasses on the napkins we were taking notes on, and to that end let us know they would not be approving any more engineering offsites in Irish themed pubs on the Friday of a Saint Patrick's Day weekend. (Hopefully they mean that situation specifically and aren't going to deny the funding for travel to this year's X.Org Developer's Conference just because it happens to be in Bavaria and ending on the Friday of the weekend Oktoberfest starts.)

They recommended our research techniques could be improved over just sitting around reading blogs and checking our Facebook, Twitter, and Pinterest accounts, such as considering input from alternate viewpoints on topics such as gamification.

They also mentioned that Oracle hadn't fully adopted some of Sun's common practices and we might have to try harder to get those to be accepted now that we are one unified company.

So as I said at the beginning, don't pester your sales rep just yet for any of these, since they didn't get approved, but if you have better ideas, pass them on and maybe they'll get into our next batch of planning.

Wednesday Nov 09, 2011

S11 X11: ye olde window system in today's new operating system

Today's the big release for Oracle Solaris 11, after 7 years of development. For me, the Solaris 11 release comes a little more than 11 years after I joined the X11 engineering team at what was then Sun, and finishes off some projects that were started all the way back then.

For instance, when I joined the X team, Sun was finishing off the removal of the old OpenWindows desktop, and we kept getting questions asking about the rest of the stuff being shipped in /usr/openwin, the directory that held both the OpenLook applications and the X Window System software. I wrote up an ARC case at the time to move the X software to /usr/X11, but there were various issues and higher priority work, so we didn't end up starting that move until near the end of the Solaris 10 development cycle several years later. Solaris 10 thus had a mix of the recently added Xorg server and related code delivered in /usr/X11, while most of the existing bits from Sun's proprietary fork of X11R6 were still in /usr/openwin.

During Solaris 11 development, we finished that move, and then jumped again, moving the programs directly into /usr/bin, following the general Solaris 11 strategy of using /usr/bin for most of the programs shipped with the OS, and using other directories, such as /usr/gnu/bin, /usr/xpg4/bin, /usr/sunos/bin, and /usr/ucb for conflicting alternate implementations of the programs shipped in /usr/bin, no longer as a way to segregate out various subsystems to allow the OS to better fit onto the 105Mb hard disks that shipped with Sun workstations back when /usr/openwin was created. However, if for some reason you wanted to build your own set of X binaries, you could put them in /usr/X11R7 (as I do for testing builds of the upstream git master repos), and then put that first in your $PATH, so nothing is really lost here.

The other major project that was started during Solaris 10 development and finished for Solaris 11 was replacing that old proprietary fork of X11R6, including the Xsun server, with the modernized, modularized, open source X11R7.* code base from the new X.Org, including the Xorg server. The final result, included in this Solaris 11 release, is based mostly on the X11R7.6 release, including recent additions such as the XCB API I blogged about last year, though we did include newer versions of modules that had upstream releases since the X11R7.6 katamari, such as Xorg server version 1.10.3.

That said, we do still apply some local patches, configuration options, and other changes, for things from just fitting into the Solaris man page style or adding support for Trusted Extensions labeled desktops. You can see all of those changes in our source repository, which is searchable and browsable via OpenGrok on src.opensolaris.org (or via hgweb on community mirrors such as openindiana.org) and available for anonymous hg cloning as well. That xnv-clone tree is now frozen, a permanent snapshot of the Solaris 11 sources, while we've created a new x-s11-update-clone tree for the Solaris 11 update releases now being developed to follow on from here.

Naturally, when your OS has 7 years between major release cycles, the hardware environment you run on greatly changes in the meantime as well, and as the layer that handles the graphics hardware, there have been changes due to that. Most of the SPARC graphics devices that were supported in Solaris 10 aren't any more, because the platforms they ran in are no longer supported - we still ship a couple SPARC drivers that are supported, the efb driver for the Sun XVR-50, XVR-100, and XVR-300 cards based on the ATI Radeon chipsets, and the astfb driver for the AST2100 remote Keyboard/Video/Mouse/Storage (rKVMS) chipset in the server ILOM devices. On the x86 side, the EOL of 32-bit platforms let us clear out a lot of the older x86 video device drivers for chipsets and cards you wouldn't find in x64 systems - of course, there's still many supported there, due to the wider variety of graphics hardware found in the x64 world, and even some recent updates, such as the addition of Kernel Mode Setting (KMS) support for Intel graphics up through the Sandy Bridge generation.

For those who followed the development as it happened, either via watching our open source code releases or using one of the many development builds and interim releases such as the various Solaris Express trains, much of this is old news to you. For those who didn't, or who want a refresher on the details, you can see last year's summary in my X11 changes in the 2010.11 release blog post. Once again, the detailed change logs for the X11 packages are available, though unfortunately, all the links in them to the bug reports are now broken, so browsing the hg history log is probably more informative.

Since that update, which covered up to the build 151 released as 2010.11, we've continued development and polishing to get this Solaris 11 release finished up. We added a couple more components, including the previously mentioned xcb libraries, the FreeGLUT library, and the Xdmx Distributed Multihead X server. We cleaned up documentation, including the addition of some docs for the Xserver DTrace provider in /usr/share/doc/Xserver/. The packaging was improved, clearing up errors and optimizing the builds to reduce unnecessary updates. A few old and rarely used components were dropped, including the rstart program for starting up X clients remotely (ssh X forwarding replaces this in a more secure fashion) and the xrx plugin for embedding X applications in a web browser page (which hasn't been kept up to date with the rapidly evolving browser environment). Because Solaris 11 only supports 64-bit systems, and most of the upstream X code was already 64-bit clean, the X servers and most of the X applications are now shipped as 64-bit builds, though the libraries of course are delivered in both 32-bit and 64-bit versions for binary compatibility with applications of each flavor. The Solaris auditing system can now record each attempt by a client to connect to the Xorg server and whether or not it succeeded, for sites which need that level of detail.

In total, we recorded 1512 change request id's during Solaris 11 development, from the time we forked the “Nevada” gate from the Solaris 10 release until the final code freeze for todays release - some were one line bug fixes, some were man page updates, some were minor RFE's and some were major projects, but in the end, the result is both very different (and hopefully much better) than what we started with, and yet, still contains the core X11 code base with 24 years of backwards compatibility in the core protocols and APIs.

Friday Mar 25, 2011

R_AMD64_PC32 error? There, I Fixed It!

I try to fairly regularly build recent git checkouts of all the upstream modules from X.Org (at least all those listed in the current build.sh) on Solaris. Normally I do this in 32-bit mode on x86 machines using the Sun compilers on the latest Solaris 11 internal development build, but I also occasionally do it in 64-bit mode, or with gcc compilers, or on a SPARC machine. This helps me catch issues that would break our builds when we integrate the new releases before those releases happen. (Ideally I'd set up a Solaris client of the X.Org tinderbox, but I've not gotten around to that.)

Anyways, recently I finally decided to track down an error that only shows up in the 64-bit builds of the xscope protocol monitor/decoder for X11 on Solaris. The builds run fine up until the final link stage, which fails with:

ld: fatal: relocation error: R_AMD64_PC32: file audio.o: symbol littleEndian: value 0x8086c355 does not fit
ld: fatal: relocation error: R_AMD64_PC32: file audio.o: symbol ServerHostName: value 0x8086b4fe does not fit
ld: fatal: relocation error: R_AMD64_PC32: file decode11.o: symbol LBXEvent: value 0x808664c3 does not fit
(and over 150 more symbols that didn't fit)

A google search turned up some forum posts, a blog post, and an article on the AMD64 ABI support in the Sun Studio compilers. And indeed, the solutions they offered did work - building with -Kpic did allow the program to link.

But is that really the best answer? xscope is a simple program, and shouldn't be overflowing the normal memory model. Once it linked, looking at the resulting binary was a bit shocking:

% /usr/gnu/bin/size  xscope
   text	   data	    bss	    dec	    hex	filename
 416753	   5256	2155921980	2156343989	808732b5	xscope

% /usr/bin/size -f xscope

23(.interp) + 32(.SUNW_cap) + 5860(.eh_frame_hdr) + 27200(.eh_frame)
 + 2964(.SUNW_syminfo) + 5944(.hash) + 4224(.SUNW_ldynsym)
 + 17784(.dynsym) + 14703(.dynstr) + 192(.SUNW_version)
 + 1482(.SUNW_versym) + 3168(.SUNW_dynsymsort) + 96(.SUNW_reloc)
 + 1944(.rela.plt) + 1312(.plt) + 291018(.text) + 33(.init) + 33(.fini)
 + 280(.rodata) + 38461(.rodata1) + 1376(.got) + 784(.dynamic)
 + 1952(.data) + 0(.bssf) + 1144(.picdata) + 0(.tdata) + 0(.tbss)
 + 2155921980(.bss) = 2156343989

% pmap -x `pgrep xscope`
26151:	./xscope
         Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
0000000000400000        408        408          -          - r-x--  xscope
0000000000476000          8          8          8          - rw---  xscope
0000000000478000    2105388       1064       1064          - rw---  xscope
0000000080C83000         52         52         52          - rw---    [ heap ]
[....]
FFFFFD7FFFDF8000         32         32         32          - rw---    [ stack ]
---------------- ---------- ---------- ---------- ----------
        total Kb    2108668       3204       1300          -

Two gigabytes of .bss space allocated!?!?! That can't be right. Looking through the output of the elfdump and nm programs a single symbol stood out:

Symbol Table Section:  .SUNW_ldynsym
     index    value              size              type bind oth ver shndx          name
[...]
      [89]  0x00000000009ff280 0x0000000080280000  OBJT GLOB  D    1 .bss           FDinfo

[Index]   Value                Size                Type  Bind  Other Shndx   Name
[...]
[528]   |            10482304|          2150105088|OBJT |GLOB |0    |28     |FDinfo

Unfortunately, that wasn't one of the ones listed in the linker errors, since it's starting address fit inside the normal memory model, but everything that came after it was out of range.

So what is this giant static allocation for? It's defined in scope.h:

#define BUFFER_SIZE (1024 \* 32)

struct fdinfo
{
  Boolean Server;
  long    ClientNumber;
  FD      pair;
  unsigned char   buffer[BUFFER_SIZE];
  int     bufcount;
  int     bufstart;
  int     buflimit;     /\* limited writes \*/
  int     bufdelivered; /\* total bytes delivered \*/
  Boolean writeblocked;
};

extern struct fdinfo   FDinfo[StaticMaxFD];

So it allocates a 32k buffer for up to StaticMaxFD file descriptors. How many is that? For that we need to look in xscope's fd.h:

/\* need to change the MaxFD to allow larger number of fd's \*/
#define StaticMaxFD FD_SETSIZE

and from there to the Solaris system headers, which define FD_SETSIZE in <sys/select.h>:

/\*
 \* Select uses bit masks of file descriptors in longs.
 \* These macros manipulate such bit fields.
 \* FD_SETSIZE may be defined by the user, but the default here
 \* should be >= NOFILE (param.h).
 \*/
#ifndef FD_SETSIZE
#ifdef _LP64
#define FD_SETSIZE      65536
#else
#define FD_SETSIZE      1024
#endif  /\* _LP64 \*/

So this makes the buffer fields alone in FDinfo become 65536 \* 32 \* 1024 bytes, aka 2 gigabytes.

Thus in this case, while compiler flags like -Kpic allow the code to link, using -DFD_SETSIZE=256 instead, builds code that's a little bit saner, fits in the normal memory model, and is less likely to fail with out of memory errors when you need it most:

% /usr/gnu/bin/size -f xscope
   text	   data	    bss	    dec	    hex	filename
 409388	   3352	8449804	8862544	 873b50	xscope

% pmap -x `pgrep xscope`
         Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
0000000000400000        404        404          -          - r-x--  xscope
0000000000475000          4          4          4          - rw---  xscope
0000000000476000       8248         20         20          - rw---  xscope
0000000000C84000         52         52         52          - rw---    [ heap ]
[...]
FFFFFD7FFFDFD000         12         12         12          - rw---    [ stack ]
---------------- ---------- ---------- ---------- ----------
        total Kb      11500       2136        232          -

Of course that assumes that xscope is not going to be monitoring more than about 120 clients at a time (since it opens two file descriptors for each client, one connected to the client and one to the real X server), and still wastes many page mappings if you're only monitoring one client. The real fix being worked on for the next upstream release is to make the buffer allocation be dynamic, and allocate just enough for the number of clients we actually are monitoring.

The moral of this story? Just because you can make it build doesn't mean you've fixed it well, and sometimes it's useful to understand why the linker is giving you a hard time.

Monday Nov 15, 2010

X11 changes in the 2010.11 release

Another OS release came out today, 2010.11, and as usual, it has a number of X11 changes. The biggest change in X is probably... Hmm, I can see by the look on your face, you're not buying the casual use of “as usual” there. Okay, you caught me, this OS release isn't quite following our previous pattern, so I guess we better get that out of the way first. Please remember I am not an Oracle spokesman, and can't speak on behalf of Oracle, so don't even think of quoting this as “Oracle says...”

In many ways, this release is simply the continuation of the OpenSolaris distro releases of the last few years. It's built the same way, using the IPS packaging system and repositories, and Caiman installers, as the OpenSolaris 2009.06 and prior releases were. Where OpenSolaris 2009.06 (the last full release) was the biweeekly build numbered 111b, and the release we'd planned to put out as OpenSolaris 2010.03 earlier this year (and which made it to the package repository, but was not put up as downloadable ISO's) would have been biweekly build 134b, this release is 151a. You should be able to upgrade to it from OpenSolaris 2009.06 or OpenSolaris /dev builds via the package repository following the instructions in the 2010.11 release notes.

So what's different about this OS release? Well, it's not named OpenSolaris anymore for starters - it's Oracle Solaris 11 Express. We'd always said that OpenSolaris releases were leading up to Solaris 11 eventually, and this name emphasizes we're getting closer to that (though still not there yet). It also recognizes that this release is built by Oracle, not Sun nor the OpenSolaris community. While it's built on the work done by the OpenSolaris community, and many portions of it are still developed as open projects on opensolaris.org, the kernel and core utilities are once again being developed behind closed doors, and the final assembly and testing are similarly done in house. The license terms for the free downloads have changed as well (though it's still offered under support contract for commercial production use as well), and the OS images include some of the encumbered packages we'd had to keep out of OpenSolaris in order to allow OpenSolaris to be freely redistributable. (Not all of them, since some were simply EOL'ed as they were for hardware well past the end of its supported lifetime, like many of the old SPARC frame buffers.)

So with that out of the way, back to the topic at hand - what's new in the X Window System in this release? Well that depends on how far back you're coming from. You can browse the complete changelogs for X going back to the point we branched the Nevada branches from the Solaris 10 release, so I'll try to stick to the highlights.

Changes since the last OpenSolaris X11 source release

None, since the X sources on opensolaris.org are still updated automatically from our internal master gate on each commit. (In fact, since the source gates currently reflect a point between biweekly builds 153 & 154, they have changes newer than this release, such as the integrations of libxcb and FreeGLUT.)

Changes since the last OpenSolaris developer build release (b134)

There were 17 biweekly builds between the last one published to pkg.opensolaris.org/dev in March and this release. The biggest change in the X packages in this period was their packaging. Previously we built our packages using the old SVR4 package format that was used since Solaris 2.0, and in many cases following the breakdown used in the old Solaris 2 releases (SUNWxwinc for most headers, SUNWxwplt for most libraries, SUNWxwman for most man pages), and then the release team converted those to the IPS format used in the OpenSolaris releases. Like several of the other consolidations, X has now converted to building IPS packages directly, and in the process refactored the X packages to better follow the way the upstream X.Org sources were split into modules at X11R7, which also happens to be more similar to the way most Linux distros break them up. This should allow easier creation of minimized environments with the subset of X packages you need.

As for headers and man pages, they are now included in the packages they are used with - for instance all the libX11 headers and API man pages are directly in the x11/library/libx11 package. System admins can still decide to include or exclude them in their installs though, since they are tagged with the devel and docfacets”, which are the IPS mechanism for controlling optional package components. To read more about how to use these with X or the other changes in the refactoring, see the heads up messages I posted when this work integrated.

Of course, there were also the usual updates to new upstream releases - Xorg 1.7.7, freetype 2.4.2, fontconfig 2.8.0, among many others. The X server packages now also include the mdb modules and scripts for getting client and grab information from the server that I blogged about back in April.

Changes since the last OpenSolaris stable release (2009.06 / b111b)

This period saw the completion of our multiyear project to completely replace the old Solaris X code base with the X11R7 open source code base from X.Org. Solaris 10 and earlier shipped with Sun's proprietary fork of X11R6, with bits of X11R5, X11R6.4, X11R6.6, & X11R6.8 mixed in. We're now set up to much more easily track upstream and are deviated from upstream in much fewer places than before (partially due to pushing a number of our previous fixes back upstream, in other cases, we determined the upstream code was better and went with it).

We also had a very large user-visible change in build 130: all the files moved from /usr/X11 directly into /usr/bin & /usr/lib, following the work done in other parts of Solaris to move files from locations like /usr/ccs/bin and /usr/sfw to the common /usr directories. We still have symlinks in /usr/openwin and /usr/X11 for backwards compatibility, so we shouldn't break your .xinitrc calls to /usr/openwin/bin/xrdb or /usr/X11/bin/xmodmap.

Since 2009.06, we moved from Xorg 1.5 to 1.7.4. Of course, with this upgrade, we got the HAL support for input device configuration working just as X.Org started moving off HAL upstream, something we still need to deal with for Solaris - for this release, input devices are still configured in HAL .fdi files. The xorgcfg and xorgconfig programs did go away as part of this move though - fortunately more and more systems are working without any xorg.conf at all, and when one is needed, only the sections being changed have to be included, lessening the utility of programs to generate full configuration files. The new Xorg also includes support for virtual consoles on systems with the necessary kernel driver support (all x86 systems and SPARCs with frame buffers supporting “coherent console”).

We also added the synaptics touchpad driver, synergy software for sharing input devices with multiple systems, the simple xcompmgr composite manager, the xinput client for input device configuration, and finally provided IPS packaged versions of the classic xdm display manager and xfs legacy font server. The Xprint server and several related commands did go away, but the libXp library was kept for binary compatibility.

Our VNC implementation was converted from RealVNC 4.1.3 to TigerVNC 1.0.1, which is being kept up-to-date with new Xorg releases, unlike RealVNC, which hasn't really been updating it's open source release in the last few years. xscreensaver was finally updated from 5.01 to 5.11, and was actually moved out of the X gate in OpenSolaris to building as a RPM-style pkgbuild spec file with the other higher-level desktop software - hopefully in the process we fixed some long-standing bugs in our forked code.

Graphics updates included Nvidia's driver support for various new devices and OpenGL 4.0, and Intel's DRI updates, including GEM support in their DRM module. Mesa was added on SPARC to provide a matching OpenGL implementation, but with only the software renderer, no hardware acceleration.

What else has changed?

Besides the official Solaris 11 Express release information, you can find more details on changes in this release on a bunch of other blogs, such as:

But here's some changes in other parts of the OS you may not see listed on those:

Of course, that's just a small sample, the full changelogs are a few thousand items long (and unfortunately, some of the consolidations haven't published theirs outside the firewall).

Wednesday Feb 18, 2009

Xorg 1.5.3 known issues in OpenSolaris 2009.06 & Solaris Express build 107

As Phoronix recently noticed, we upgraded the Xorg server from 1.3 to 1.5.3 in Nevada build 107 (currently available via Solaris Express Community Edition (SXCE) Install ISO’s and OpenSolaris IPS package updates in the /dev repo).

When we integrated, I sent out a heads up message with information about the driver compatibility changes and issues we knew about at that time.

So far it’s gone mostly smooth, thanks to the help of the people who tested the pre-integration builds I posted both internally and on the X community on opensolaris.org, but there’s a few issues that have hit some people so far, most with workarounds you can use to get past them:

6801598 Xorg 1.5 ignores kernel keyboard layout setting, always uses "us"

Fixed in build 109. Xorg on Solaris has always shipped with a patch to query the kernel for the console keyboard layout (which in turn either gets set automatically by the keyboard if, like Sun keyboards, it provides the layout in the optional USB HID layout identifier, or gets set by querying the user on initial installation), but the code we apply this patch to moved from the Xorg binary to the kbd_drv.so binary in Xorg 1.5 and our patch didn’t get migrated correctly. We didn’t notice in testing since our testing was done with input devices provided by HAL, which was handling this for us, but when that didn’t integrate as expected, this bug was exposed.

Workaround: Create an /etc/X11/xorg.conf file with a keyboard section specifying your desired layout, such as this one for a British layout:

Section "InputDevice"
	Identifier		"Keyboard0"
	Driver			"kbd"
	Option "XkbRules"	"xorg"
	Option "XkbModel"	"pc105"
	Option "XkbLayout"	"gb"
EndSection

6791361 SUNWxorg-xkb should deliver "base" rules

If you do need to set a localized keyboard layout, you may notice the default XKB rules file upstream changed from "xorg" (which is currently shipped in our packages) to "base" (which isn’t yet, though we’ve asked the localization teams who build our XKB packages to add it).

Workaround: Include an XkbRules option line in your keyboard settings as shown in the example above, or make a symlink from /usr/X11/lib/X11/xkb/rules/base to the xorg file in that directory. The fix for the above bug in build 109 should also set the default used by our Xorg builds back to "xorg" for now.

6801386 Xorg core dumps on startup if hald not running in snv_107

Fixed in build 109. Yet another instance of our old friend 6724478 libc printf should not SEGV when passed NULL for %s format. This was a design decision made in SVR4 and Solaris 2.0 many years ago, to force programmers to catch NULL pointer misuse - but since Linux & the BSD’s allow it, today it’s mostly become a source of pain in porting code that while technically incorrect, works on those platforms, so the decision was made last year to change Solaris libc to match. Unfortunately, that change is still undergoing standards conformance testing, so hasn’t integrated yet, and we still have to check for NULL first.

Workaround: Make sure hald starts before Xorg does. This shouldn’t be a problem for gdm users (default in OpenSolaris), since the gdm SMF service lists the hal service as a dependency that needs to start first, but may cause problems for dtlogin, xinit or startx users.

6806763 Xorg 1.5 doesn’t start if xorg.conf contains RgbPath entry

Fixed in 109, and in upstream head. This entry in the Files section of xorg.conf was obsoleted upstream, but became a parse error instead of just being ignored, which breaks people who upgraded a system with a working xorg.conf that happened to have this line.

Workaround: Remove RgbPath lines from /etc/X11/xorg.conf.

6799573 Metacity not starting on TX Xorg 1.5

Fixed in 108. The XACE framework used by extensions such as Xtsol & Xselinux added new permission types to check in the upstream release, but Xtsol doesn’t handle all of them yet. In this case, Metacity’s constant checking of round-trip time by appending to an existing property failed because Xtsol was only handling the WriteAccess check and not the BlendAccess check.

I don’t know of a workaround for this other than not running the desktop in multi-level mode, so users of the labeled/multi-level security desktop in Trusted Extensions may want to wait to upgrade until then.

6798452 X Server will not start on x86 systems with multiple graphics devices

Not yet fixed, either in our builds or upstream, but has a simple workaround of creating an xorg.conf file listing the devices you want to use. This may affect people with a single card if your motherboard has on-board graphics as well.

6797940 Can’t do gui installs on x86 systems with Nevada 107/Nevada 108

We haven’t yet tracked down this issue, which only affects the SXCE installer on x86/x64 platforms. Workarounds include using the OpenSolaris LiveCD installer or the text only mode of the SXCE installer instead. After installation, the installed Xorg works fine (modulo the above issues).

6686 SUNWefb[w] should be part of slim_install

Fixed in 108. The SPARC drivers added in OpenSolaris build 107 for Xorg on XVR-50, XVR-100, and XVR-300 graphics cards aren’t included in the default install. You can add them after the install by doing pfexec pkg install SUNWefb SUNWefbw and then rebooting. (These drivers are only available in OpenSolaris installs, not SXCE ones, since they conflict with the Xsun drivers for these cards included in SXCE.)

Saturday Jun 14, 2008

June 11, 2008 X Server security advisories

On June 11, iDefense & the X.Org Foundation released security advisories for a set of issues in extension protocol parsing code in the open source X server common code base that iDefense discovered and X.Org fixed.

Their advisories/reports are at:

Sun has released a Security Sun Alert for the X server versions in Solaris 8, 9, 10 and OpenSolaris 2008.05 at:

Preliminary T-patches are available for Solaris 8, 9, and 10 from the locations shown in the Sun Alert - these are not fully tested yet (hence the "T" in T-patch).

The fix for these issues has integrated into the X gate for Nevada in Nevada build 92, so users of SXCE or SXDE will get the fixes by upgrading to SXCE build 92 when it becomes available (probably in 3-4 weeks, though the first week of July is traditionally a holiday week in Sun's US offices, so may affect availability).

Fixes for OpenSolaris 2008.05 users following the development build trains will be available when the Nevada 92 packages are pushed to the pkg.opensolaris.org repo (also probably in about 3 weeks from now).

Fixes are planned for OpenSolaris 2008.05 users staying on the stable branch (i.e. nv_86 equivalent), but I do not have information yet on how or when those will be available.

Fixes for users building X from the OpenSolaris sources are currently available in the Mercurial repository of the FOX project in open-src/xserver/xorg/6683567.patch.

For users of all OS versions, the best defenses against this class of attacks is to never, ever, ever run “xhost +”, and if possible, to run X with incoming TCP connections disabled, since if the attacker can't connect to your X server in the first place, they can't cause the X server to parse the protocol stream incorrectly. This is not a complete defense, as anyone who can connect to the Xserver can still exploit it, so if you're in a situation where the X users don't have root access it won't protect you from them, but it is a strong first line of defense against attacks from other machines on the network.

Releases based on the Solaris Nevada train (including OpenSolaris & Solaris Express), default to “Secure by Default” mode, which disables incoming TCP connections to the X server. Current Solaris 10 releases offer to set the Secure by Default mode at install time. On both Solaris 10 & Nevada, the netservices command may be used to change the Secure by Default settings for all services, or the svccfg command may be used to disable listening for TCP connections for just X by running:

svccfg -s svc:/application/x11/x11-server setprop options/tcp_listen=false

and then restarting the X server (logout of your desktop session and log back in).

On older releases, the “-nolisten tcp” flag may be appended to the X server command line in /etc/dt/config/Xservers (copied from /usr/dt/config/Xservers if it doesn't exist) or in whatever other method is being used to start the X server.

See the Sun Alert for other prevention methods, such as disabling the vulnerable extensions if your applications can run effectively without them.

Thursday Oct 11, 2007

Stupid nv_74 cursor tricks

For those who have already grabbed Nevada build 74 from the Solaris Express Community Edition downloads, you can try out the new cursor code available with the integration of libXcursor (which most XFree86 and Xorg platforms have had for a while).

If your graphics card supports the Render extension & 32-bit alpha cursors, (i.e. most x86 graphics, but not Sun Rays nor SPARC graphics yet - I've mainly used it on nvidia cards and my laptop with a ATI Radeon chipset), you can see them in action by doing:

% su -
# mkdir -p /etc/dt/config/C/
# cat > /etc/dt/config/C/styleModern
#include "/usr/dt/config/C/styleModern"
Xcursor.theme: whiteglass
\^D
# exit

% echo 'Xcursor.theme: whiteglass' >> ~/.Xresources

and then logout & log back in.

Change 'whiteglass' to 'redglass' in the above if you want something that stands out more.

Monday Oct 08, 2007

X Font Server (xfs) Security Hole in Solaris

As noted in the ZDNet posting X Font Server flaw hits Sun Solaris hard, the recently announced X font server vulnerabilities not only affect Solaris, but are exposed to the network by default in some Solaris installs.

What the article fails to mention is that it's only older installs that are vulnerable by default - Solaris versions up through Solaris 10 6/06 run xfs by default from inetd listening to the network. Solaris 10 11/06 and later Solaris 10 releases ask you at install time if you want your network services to default to being open or closed. Solaris Nevada/Express just closes them all by default and requires you to turn back on the ones you want. (These changes came from the Solaris Secure by Default project, which has more information on its project pages.)

Our sustaining teams are producing patches and a Sun Alert covering this issue, but until then, if you don't need the X font server (on Solaris it's really only used for remote desktop sessions from computers without the standard Solaris fonts already installed - unlike some Linux'es, local sessions don't use it), you can easily turn it off in several ways:

  • On all Solaris releases: “/usr/openwin/bin/fsadmin -d”, which will either break the link that inetd uses (Solaris 2.6-Solaris 9) or use inetadm to disable the svc:/application/x11/xfs service (Solaris 10 & later).
  • On Solaris 10 and later, you can do the same thing explicitly with “/usr/sbin/inetadm -d svc:/application/x11/xfs:default”.
  • On Solaris 2.6 through 9, you can do the traditional editing of /etc/inetd.conf to disable it, then “pkill -HUP inetd”.
  • If you'll never need it, and want to be sure it's gone, remove the xfs package with “pkgrm SUNWxwfs”.

Update: Oops, had a typo in one of the instructions above - should have been “pkill -HUP inetd”, not kill. Also, as Paul noted in the comments the Sun Alert is now published, with interim fixes soon to follow, at http://sunsolve.sun.com/search/document.do?assetkey=1-26-103114-1.

Wednesday Jun 13, 2007

Xorg 7.2 Solaris status update

It's been a while since I've given an update on our Xorg 7.2 state in Solaris and OpenSolaris, and I was reminded by a new patch release today. For a summary of the new features included, and some bugs in the initial Nevada builds, see the Heads Up: Xorg 7.2 in Nevada build 58 message I sent to the OpenSolaris xwin-discuss list in February.

Besides the original release in Solaris Express: Community Edition and the OpenSolaris sources, Xorg 7.2 is now also available in:

For those on Solaris 10 with an older patch release, or the Solaris 10 7/07 Beta, patch 125720-08 was released today, which fixes several issues reported by early users, including:

  • A number of problems with auto-configuration of screen resolution
  • Xephyr's Caps Lock not being unlockable (bug 6539225)
  • A memory leak every time an X client connection was closed (6533650)
  • Crashes or failures using TrueType fonts in 64-bit Xorg servers (6535518)
  • xorgconfig listing the wrong keyboard driver in generated xorg.conf files (6559079)
  • Xorg -configure not including the bitstream TrueType font module in generated xorg.conf files (6546692)

Also included in the patch are:

  • A new version 2.0.2 of the “nv” open source driver for nVidia cards which adds support for the first set of GeForce 8000 series cards, though you can also get the updated nvidia closed-source/accelerated driver to support them as well. (A slightly older version of the nVidia closed driver is included in Solaris 10 7/07 already.)
  • Security fixes for two reported vulnerabilities: X Render Trapezoid divide-by-zero and XC-MISC memory corruption.
  • An update to the Radeon driver to check if you've plugged in an external monitor on your laptop (at least on Ferrari 3400's) when you run xrandr, and if so, activate the external display port without having to restart X. (A small step towards the more general and complete solution coming with X Resize-and-Rotate 1.2.)

There's a few more fixes in progress still (the -09 rev of the Solaris 10 patch is in QA now, and there will of course be more revisions in the future), including:

  • Fixing glxgears and glxinfo linking so they run (6560568)
  • Allowing Xorg to start with an old xorg.conf listing the "Keyboard" driver (capitalized) (6560332 - if you have this problem now, just change the driver name to "kbd" in your xorg.conf)
  • Adding the VMWare mouse driver, vmmouse_drv.so (6559114)
  • Fixing Xorg -configure in the 64-bit Xorg (6556115)
  • Making Xv playback work again in the ATI Radeon driver (6564910)
  • Making xorgcfg work again ( 6563653)

but the current state should be fairly usable by most people as it is today.

For those using Solaris Express/Nevada releases, or OpenSolaris sources, you can see which build the above changes went in at the X ChangeLogs page. (Solaris Express Developer Edition 5/07 corresponds to build 64 in that list.)

Monday Jun 11, 2007

The Irregular X11 DTrace Companion: Issue 2

In response to a Solaris 10 7/07 beta tester question last week, I updated the Xserver provider for DTrace web page to point out that besides the original method of patching the Xorg sources and rebuilding, pre-built binaries are now included directly in Solaris...

Getting an X server with DTrace probes built in

The Xserver dtrace providers are now included in the regular X server binaries (including Xorg, Xsun, Xvfb, Xephyr, Xnest, & Xprt) in recent Solaris releases, including:

For those who still want to build from source, the provider is integrated into the X.Org git master repository for the Xserver 1.4 release, planned to be released in August 2007 with X11R7.3, and into the OpenSolaris X code base for Nevada builds 53 and later.

Thursday May 17, 2007

The Irregular X11 DTrace Companion: Issue 1

All the cool kids are doing unscheduled update newsletters now, and I had a collection of X11 Dtrace things to post about piling up, so I figured I'd give it a whirl. (BTW, Nouveau guys - I'd read yours more if you had an RSS feed - having to go check the wiki on an irregular basis doesn't work so well.)

Doing newsletters on a regular basis is hard, as Glynn has learned, so don't expect updates too often...

Sample script for tracing events

Jeremy asked recently for an example DTrace script using the event probes in the Xserver DTrace provider. I had lost the ones I wrote when developing/testing that probe, but dug one out of my e-mail that was written by DTrace Master Guru Bryan Cantrill and posted it as ButtonPress-events.d.

He wrote this script when he'd upgraded to Nevada build 58 and found shift-click was no longer working. Using the script he was able to show that the Xserver was sending ButtonPress events without the ShiftMask set. I wrote a corresponding script to decode the VUID events the X server was getting from the kernel mouse and keyboard drivers to determine the keyboard wasn't giving us the shift key pressed events until another key was pressed, which wasn't happening for shift-clicks with the mouse. That script can be seen in the bug report I filed with the kernel keyboard team as Solaris bug 6526932.

Using DTrace to win a race

A bug was filed with X.Org recently stating “iceauth can dump a core in auth_initialize() if a signal is caught before iceauth_filename has been malloced.” As I was fixing bugs in iceauth anyway, I figured I'd look at it to see if I could roll in a fix to the new iceauth release I'd be making for those. Fortunately, iceauth is a small program, and I could see where I thought the race condition lied between the main code and the signal handler. Testing any race condition has always been a pain in the past, since if you don't get the timing just right you don't know if you really fixed it, or just changed the timing enough to not hit it.

So I pulled up the DTrace manual and DTrace wiki and created a one-liner to force the race condition by sending the signal when I knew it was in the right spot:

 # dtrace -w -c ./iceauth -n 'pid$target::IceLockAuthFile:return { raise(1); }'

For those who haven't memorized the DTrace manual, this can be broken down as:

  • -w - allows “destructive” actions - i.e. those that change the running of the program, and don't just monitor it - in this case, allows sending a signal to the process
  • -c ./iceauth - specifies the command to run with this script, and puts the process id of the command in $target in the script
  • pid$target::IceLockAuthFile:return - sets the probe we're looking for to be the return from any call to the function IceLockAuthFile in the process matching the pid $target
  • { raise(1); } - sets the action to be run when that probe is hit to send signal 1 (aka SIGHUP) to the target process

Ran that and a few seconds later had a nice core file to verify where the race was. Turned out I was right about where the signal had to come in for the race to cause a crash, but was slightly off about the point in the signal handler where it was crashing. The fix was easy, and thanks to DTrace, easy to verify, and is now in X.Org's git repository.

DTrace Wiki: Topics: X

After last month's Silicon Valley OpenSolaris User Group meeting, Brendan asked if I'd work on the “X” topic on the DTrace Wiki. I agreed to, and have put up a small placeholder while I figure out what to flesh it out with - if you have any suggestions or requests for what you'ld like to see in a short introductory article to using DTrace to probe X, especially using, but not limited to, the Xserver DTrace provider, send me e-mail or leave a comment here.

About

Engineer working on Oracle Solaris and with the X.Org open source community.

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle, the X.Org Foundation, or anyone else.

See Also
Follow me on twitter

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today