Thursday Feb 11, 2010

SourceJuicer in Beijing

While I was in Beijing I was invited to present OpenSolaris SourceJuicer at a combined meeting of the Beijing OpenSolaris User's Group and the Beijing GNOME User's Group. (Thanks Emily and everyone!)

SourceJuicer at Beijing OpenSolaris User's Group and Beijing GNOME User's Group

Sunday Oct 25, 2009

Presenting SourceJuicer at Dresden OSDevcon

I'll be presenting OpenSolaris SourceJuicer at the OSDevcon conference in Dresden this Thursday, October 29th at 3:00 p.m. I hope to see you there!

Wednesday Jul 22, 2009

OpenSolaris Source Juicer BOF at OSCON 2009

I'll be moderating a BOF on SourceJuicer at OSCON 2009 in San Jose, where we'll also discuss other cloud-based collaborative open source development environments. For further information, see my post on the SourceJuicer blog.

Tuesday Mar 24, 2009

SourceJuicer 1.0 is released!


The SourceJuicer team has announced that SourceJuicer 1.0 is now available! SourceJuicer is a web service which facilitates the building and review of OpenSolaris packages. Its goal is to pave the path between building an OpenSolaris package (which may only work on your laptop) and publishing high quality packages into the OpenSolaris /contrib repository. Here are some of my favorite features:

  • SourceJuicer provides a community package review system.
  • It preinstalls Laca's Common Build Environment (CBE) and developer libraries in a build zone so you don't have to track down all of the build tools and dependencies required for common FOSS projects.
  • SourceJuicer examples and automatic validation walk me through the required fields in a spec file.
  • Once a package passes validation criteria and builds, it appears in SourceJuicer's internal /pending package repository and can be immediately installed on a clean QA machine for testing:

    pfexec pkg set-authority -O jucr-pending
    pfexec pkg refresh
    pfexec pkg install {foo}

  • If the package is clean and passes community review, it can be promoted to the /contrib repository.

This has been quite an interesting project and I've really enjoyed working with this team on this technology. Because I was focused on authentication, authorization and the (not yet released) bug management components, I didn't get to use the full system until very recently and I found that it really did remove some barriers to building and testing the 'dillo' browser and Alvaro's 'cherokee' web server.

We're looking forward to hearing feedback as more OpenSolaris contributors make use of SourceJuicer. You can find out more about the progress of SourceJuicer at the SourceJuicer Blog.

Wednesday Mar 18, 2009

NumptyPhysics on OpenSolaris!

NumptyPhysics on Solaris

NumptyPhysics is a fun open source "Physics puzzle" game which uses the same Box2D physics engine as the award-winning CrayonPhysics. As soon as I saw it, I really wanted to port it to OpenSolaris. One of my first computer programs (for the Commodore 64) simulated gravitational attraction between sprites and allowed the user to send sprites into orbit around each other. Cumulative floating point errors in my old BASIC program eventually caused orbiting objects to careen off the screen. My Solaris port of NumptyPhysics still needs work and cleanup, but it's a step in the right direction. Right now the top priority is getting SourceJuicer out the door so there is a more efficient path from "NumptyPhysics runs from my build environment on my laptop" to "NumptyPhysics is available as a stable package for the OpenSolaris community."

Tuesday Mar 17, 2009

Coming soon: SourceJuicer, 'cloud based' OpenSolaris package development

Circular cloud

In recent weeks I've been busy working on the bug management, user authorization and authentication components of the SourceJuicer project. SourceJuicer is a Django-based web service which will allow developers to build OpenSolaris packages in a standard build environment and put them on a clearly paved path to review and publication in the OpenSolaris contrib repository.

Christian has a more detailed explanation on the SourceJuicer Blog. The technology behind it is really interesting: it makes good use of ZFS as well as Solaris Containers (a.k.a. zones). Watch the SourceJuicer blog for more detail as the project unfolds.

Monday Mar 16, 2009

Is the recession accelerating OpenSolaris adoption?

This Slashdot article points to an IT Manager survey indicating that Linux adoption is growing during these difficult economic times. It does make sense that companies and governments which normally spent freely on proprietary software might begin to consider unorthodox, but much more cost effective alternatives now. What does this mean for other open source operating systems such as OpenSolaris? I think the Google trends graph says it better than I could. Anyone looking for the root cause of this economic mess needn't bother about property bubbles, dodgy investment shenanigans or massive increases in debt. Just look at the trend line of the third parameter in this Google graph ;-)

P.S.: I compared 'opensolaris' with 'economic downturn' instead of 'recession' because the magnitude of recession searches is so much larger that it pushes opensolaris towards the bottom of this graph. A similar scale issue makes it difficult to see that opensolaris seems to be gaining market share against Windows, Solaris, Linux and some of the most popular proprietary Linux distributions.

Google trends is an amazing tool, but it can't answer all psychohistory questions. Some topics such as 'great depression' and 'great gatsby' are common in the standardized U.S. school curriculum, and therefore searches for them closely follow the school calendar. You'd think with so many students learning about Gatsby's 1920s hedonism and its unravelling during the 'Great Depression', it should be impossible to repeat this history.

Friday Dec 05, 2008

asc("") and printf("%s",NULL)

A few days ago I was reminded of one of the differences between Solaris and GNU/Linux which caused a few headaches for Sun's desktop team back in the days of GNOME 1.2. The problem is that while you can printf("%s",NULL); in most Linux distributions, doing the same in Solaris caused the executable to exit and generate a core. There were some debates about the correctness of each approach. Should the program crash to tell the developer that he shouldn't be trying to print a NULL string (the Solaris behavior) or should the program continue happily along printing a NULL pointer? I can see some advantages in both approaches. I suspect this and other "Linuxisms" will end up in OpenSolaris simply because they make it more convenient for average coders to throw together a quick and dirty program or for someone to compile and run the thousands of source packages out there which (perhaps unknowingly) take advantage of this Linuxism.

But thinking about this reminded me of a similar "bug/feature" in my very first computer, the Commodore 64, back in 1982 when 64K seemed an unbelievably excessive amount of memory for a computer which cost only $595. The built-in hardware sprites and 3-voice SID synthesizer made it a really fun computer for me to write simulations and sound generator programs for my father's physical science class. If you look closely at many of the programs which were published in Compute! and other magazines of the time, you might notice something strange. When a character was read from the user (e.g. via get a$), the asc(a$) function would convert the character to its numeric ASCII value. But in the code you would usually see something like this:

n = asc(a$+chr$(0))

J64 java emulator

What is going on here? There was a bug/feature in Commodore 64 BASIC V2 which raised an "Illegal quantity error." whenever a null string was passed to the asc() function. The Commodore 64's 6510 processor had the unusual ability of being able to peek the ROM and write to shadow RAM which shared the same address space and then disable ROM so that BASIC was running from RAM. This allowed modifications to the BASIC interpreter. Jim Butterfield, a Commodore expert and author, once demonstrated a one byte poke which fixed this asc("") bug. Ever since I learned of this simple fix, I wondered why so many BASICs had this same one byte bug. The Commodore 64, Vic-20, Atari, Amiga, and at least some versions of the Apple and IBM PC BASICs shared this same bug! What was going on? Well, as it happens, a little company known as Microsoft wrote versions of BASIC for nearly all 8 bit computers of the 1980s and 1990s. Was this one byte bug overlooked by Microsoft and propagated to all Microsoft inspired codebases, or were Microsoft's developers following the same purist philosophy as Solaris developers who assert that "good coders shouldn't pass NULL into string functions"? Either way, when such a company grows to what it has now become, it can decide that this one byte bug is actually a feature.

Friday Jun 20, 2008

My OpenSolaris for Developers talk at the Irish Opensource Technology Conference

I should thank the sponsors and organizers of the Irish OpenSource Technology Conference (IOTC) for giving me the opportunity to present OpenSolaris as an Open Source Developer Tool to some of Ireland's brightest and most energetic open source developers. There were quite a few university attendees and Barry was able to bring in people from small and midsized Irish companies such as openApp and hosting365 as well as multinationals such as Microsoft, IBM, RedHat, Sun and AIB (more about this later!)

My talk seemed to be well understood by the audience and I managed to empty out a heavy backpack full of "Free as in Free" OpenSolaris 2008.05 CDs afterwards. I didn't have enough time to talk about SMF or PKG(5) in detail, but I did spend some time on ZFS and Dtrace, both of which I'm certain would be useful to any Open Source developer. Even if your pointy-haired boss demands that you must code your application in VisualBasic and deploy on Redhat 3.5 via Wine, you can sneak OpenSolaris onto one of your QA department's test boxes and run your software in a zone where you can dtrace it. Or you could set up an OpenSolaris file server with ZFS snapshots as frequently as necessary (perhaps every keystroke for some UIDs?) I won't tell anyone... honest ;-)

[Read More]

Saturday Jun 14, 2008

How to keep gam_server from doing too much

The Gamin file monitoring subsystem was introduced to OpenSolaris a few months ago. Since it monitors file changes, there are cases where it can become very busy and consume significant system resources. Most of the resource consumption issues will probably be fixed by build 92, but for those of us running OpenSolaris 2008.05 or Nevada builds before build 92, or those of us with special requirements such as remote NFS mounted home directories, AlekZ's Scratchpad has a very nice workaround to put gam_server back in its place. I'd recommend the following slightly modified workaround:

   1. Create /etc/gamin directory:
      # mkdir /etc/gamin

   2. Create file /etc/gamin/gaminrc. It may contain the following lines (this is just an example, you can set your own polling intervals):
      fsset nfs poll 15
      fsset ufs poll 15
      fsset lofs poll 15
      fsset zfs poll 15

   3. Restart gam_server (let me know if there is a better way):
      # pkill gam_server ; rm -rf /tmp/gam_*
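
Steps 1 and 2 above can be scripted. This is only a sketch: write_gaminrc and its arguments are hypothetical names of mine, not part of Gamin, and the fsset lines it writes are the same form as in the example above. Writing to /etc/gamin requires root.

```shell
#!/bin/sh
# Sketch: generate a gaminrc that polls the listed filesystem types.
# write_gaminrc is a hypothetical helper, not something shipped with Gamin.
write_gaminrc() {
    dir=$1        # target directory, e.g. /etc/gamin
    interval=$2   # polling interval, same meaning as the "15" above
    shift 2       # remaining arguments are filesystem types
    mkdir -p "$dir" || return 1
    for fs in "$@"; do
        printf 'fsset %s poll %s\n' "$fs" "$interval"
    done > "$dir/gaminrc"
}
```

For the example in step 2 you would run write_gaminrc /etc/gamin 15 nfs ufs lofs zfs and then restart gam_server as in step 3.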

Thursday Mar 20, 2008

dtrace getenv and GNOME

While tracking down the root cause of intermittent hangs (often > 1 second) when traversing the panel menus on GNOME 2.20.1, I found some interesting things about how GNOME works. On one of my installs, the gtk-update-icon-cache hadn't been created correctly and doing anything which read an icon crashed the application (e.g. clicking on a panel menu). This led me to search for the cause of this failure. I wanted to make sure gtk-update-icon-cache wasn't unnecessarily failing on assert. The code looked O.K. but I thought I'd look at the system with dtrace. This led me to a nice putenv.d dtrace script. gtk-update-icon-cache doesn't seem to set G_DEBUG so that question is answered. But I decided to make a small change to the script to look at getenv during a simple launch of eog. This gave me some interesting results:

bash-3.2$  dtrace -s ./getenv.d -c "eog" 

CPU     ID                    FUNCTION:NAME
  0  66661                     getenv:entry env[LANGUAGE]= LANGUAGE

  0  66661                     getenv:entry env[NLSPATH]= NLSPATH

  0  66661                     getenv:entry env[LANGUAGE]= LANGUAGE

  0  66661                     getenv:entry env[NLSPATH]= NLSPATH

  0  66661                     getenv:entry env[G_MESSAGES_PREFIXED]= G_MESSAGES

  0  66661                     getenv:entry env[G_DEBUG]= G_DEBUG

  0  66661                     getenv:entry env[CHARSET]= CHARSET

  0  66661                     getenv:entry env[LIBCHARSET_ALIAS_DIR]= LIBCHARSE

  0  66661                     getenv:entry env[G_RANDOM_VERSION]= G_RANDOM_VERS

  0  66661                     getenv:entry env[LANGUAGE]= LANGUAGE
Then lots of redundant getenvs.
getenv LANGUAGE 1832 times 
getenv NLSPATH 1679 times
getenv TZ 432 times
Aggregating on ustack shows that a good share of these come from these backtraces:

Now, here is the weird part. The application seems to continue polling getenv when you're on an active menu. I tried this with gnome-panel. The good news is that when I'm not touching the panel, it doesn't seem to be polling environment variables. But when I held the mouse over or traversed the launch menu, it continually tried to grab the ESPEAKER and DISPLAY environment variables. These were all coming from this section of code:

Whoa, I had system sounds enabled for tooltips? I checked the preferences and sure enough system sounds were enabled. I never heard any so my audio hardware must not be supported. I still don't know for certain if this was the cause of the hang, since the hang was intermittent. But if you haven't tried it already, install GNOME on a machine with dtrace capability and look around a bit. If nothing else, you may learn something new.
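
If you save trace output like the eog run earlier in this post to a file, per-variable counts such as "getenv LANGUAGE 1832 times" can be tallied with standard tools. This is a sketch under one assumption: every line of interest has the "getenv:entry env[NAME]= NAME" form shown above. count_getenv is a hypothetical name of mine, not part of the original getenv.d script.

```shell
#!/bin/sh
# Sketch: tally getenv lookups per variable from saved dtrace output.
# Assumes lines of the form "... getenv:entry env[NAME]= NAME".
count_getenv() {
    # $1: file containing the saved trace output
    sed -n 's/.*getenv:entry env\[\([^]]*\)].*/\1/p' "$1" |
        sort | uniq -c | sort -rn |
        awk '{ printf "getenv %s %d times\n", $2, $1 }'
}
```

The most frequently requested variables come out first, which makes the redundant lookups easy to spot.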

Friday Mar 14, 2008

Workaround for nautilus/panel crashes in Nevada 84, 85.

A number of people have encountered a bug which causes nautilus and panel to crash on login, preventing the desktop from being used.

The symptoms are:
  • Nautilus crashes immediately on login, bringing up bug-buddy dialog.
  • Panel crashes immediately after the user clicks the Launch menu.
  • Any application which makes use of gtk icons crashes.

The bug is only experienced on some hardware and may not occur on every install even on hardware where it fails. It is caused by a file access race condition during the post install phase. The file access conflict causes an ASSERT in gtk-update-icon-cache which forces this application to core during install. The user is left with an incomplete and corrupt icon cache which causes failures of all applications depending on the gtk icon cache.

The bug is described in more detail here: 6631419 - gtk-update-icon-cache dies on first boot after install/upgrade

Fortunately, there is a very simple workaround:
  • Login as root
  • Run the following:
    for d in /usr/share/icons/*; do
           [ -d "$d" ] &&
                   gtk-update-icon-cache --force "$d";
    done

This bug is intermittent and is known to exist in Nevada build 84 and build 85. It may exist in earlier builds on some hardware. The GNOME community's decision to enable application cores on ASSERT may have made some of these subtle underlying problems suddenly become much more frequent and obvious. A fix for bug 6631419 is committed for Nevada build 86.

CORRECTION: There was a discussion over enabling fail on assert within the GNOME community, but no change was recently made. The behavior is that on unstable (odd numbered) versions of gnome-session, fail on assert is enabled, but on even numbered builds it isn't.

Monday Jan 28, 2008

xlincity, another opensource S*mC*ty fork, now runs on OpenSolaris

Now there are two OpenSolaris options for those who spent many long hours playing SimCity in the 1980s and 90s, or those who would like to experiment with city simulations and contribute to the project. The first was micropolis, a port designed with the XO One Laptop Per Child project in mind and promising extensibility via Python modules.

And now I've checked in a spec file and patch which builds xlincity (a.k.a. lincity), which looks to be a rather old but ready-to-play X11 port of the classic game.

Xlincity on OpenSolaris

For some reason, parts of the lincity source code seemed to have been written with notepad, edlin or some other Windows/DOS text editor, as they had to be run through dos2unix in order to get them to build properly on OpenSolaris. But the SFExlincity spec file sorts that out for you and it applies a small xlincity-01-solaris.diff patch.

The easiest way to build it on OpenSolaris (and possibly Solaris 10?) is:

  1. Install the OpenSolaris Common Build Environment (CBE) from
  2. Checkout a copy of Spec-files-extra: svn co SFE
  3. Set your proxy (if needed): export http_proxy=http://your.proxy.server:your_proxy_port_number
  4. pkgtool build-install --download SFExlincity.spec
Have fun!

Thursday Jan 24, 2008

Micropolis (a.k.a. S*mC*ty) runs on OpenSolaris!

A long time ago in a galaxy not so far away, Don Hopkins ported Maxis's C64 game "SimCity" to Unix, using the NeWS desktop user interface. Unfortunately Sun canceled NeWS (even though it was decades ahead of other *nix desktop GUIs). Many years later Don Hopkins revisited this code, now open sourced and contributed to the OLPC project. With a few tweaks and kludges, I built this codebase on OpenSolaris. This is what it looks like:

Micropolis on OpenSolaris

I checked a new spec file, SFEmicropolis.spec, and a patch into spec-files-extra. These, along with the help of the CBE, build micropolis-activity.

Known bugs:

  • MicropolisCore Python modules aren't built so any part of the game dependent on this won't work.
  • Micropolis hostname and DISPLAY handling is a little borked so it might not work via ssh -X or other X forwarding.
  • Micropolis depends on shm extensions so it doesn't seem to work on Sun Ray.

To be done:
  • Build a spec file for the Python modules (micropolis, cellengine and tileengine) in MicropolisCore.tgz. This should be easy for anyone who has a little time and an OpenSolaris build with CBE.
  • Clean up the spec file. There are a few kludges in the way I build it for Solaris, but whenever a proper config build is incorporated into the Micropolis codebase, some of this work won't be necessary.
  • Find someone who has played S*mC*ty who can test the Solaris build (I haven't)
  • Try to get OpenSolaris patches accepted into Micropolis codebase.
  • Work with the micropolis community. (We might be able to use dtrace and other OpenSolaris tools to solve problems.)

    Feel free to help if you can, the code is out there!

Monday Jan 14, 2008

OpenProj 1.0 project manager runs on OpenSolaris Nevada!

OpenProj on Solaris Nevada build 80

I must confess that I've used Microsoft Windows at work a few times in the past 6.5 years. Three MS Windows only applications have occasionally been part of my job. The first was a proprietary browser test suite which is now very outdated, the second is a CD/DVD burning server which would definitely be more usable on a stable OS such as Solaris or NetBSD, and the third was Microsoft Project, an application favored by project managers. Now thanks to the people at OpenProj, I hope the Sun can set on Microsoft Windows at least within my daily work life.

I first learned about OpenProj from my friend Ron at inventors garage. OpenProj is "A desktop replacement for Microsoft Project. It is capable of sharing files with Microsoft Project and has very similar functionality (Gantt, PERT diagram, histogram, charts, reports, detailed usage), as well as tree views which aren't in MS Project." OpenProj is a Java/Swing application but a minor incompatibility between the Solaris and Linux tail command prevented the OpenProj launcher from working on Solaris. When I mentioned this, the helpful people at Projity fixed it a few days before OpenProj 1.0 was released. Thank you!

[Read More]

Thursday Nov 01, 2007

Wrong Keyboard US<->UK Solaris install

Wrong Keyboard

I appreciate the new look and improved common sense defaults behind the new installer that appeared in Solaris Nevada build 70. But it isn't yet foolproof. Because I often switch between a laptop with a U.K. keyboard, a Sun Ray with a U.S. keyboard and my home P.C. with a U.K. keyboard, I often go too fast through the "Default layout?" choice in the installer and choose U.K. when I should choose U.S. or vice versa. I'm left with a system which almost works perfectly except that the " and @ are swapped, sometimes #, $ and / aren't where they belong and | is nowhere to be found.

[Read More]

Friday Jul 20, 2007

OpenSolaris support for USB webcams improving!

Many thanks to Colin Zao for letting us know that the USB video class driver usbvc(7D) was integrated into Solaris build 56 and the Ekiga plugin was integrated into build 63. I'm sure there is much work to do, but those with an Ekiga account and connectivity can start testing and using webcams which comply with the USB video class specification and report any problems to:

driver-discuss-AT-opensolaris-DOT-org or driver-usb-AT-opensolaris-DOT-org

Thursday Apr 19, 2007

An economic analyst vs Microsoft Windows Vista

Let's face it, most operating system installers aren't fun. If Microsoft Windows weren't pre-installed on more than 90% of desktop PCs, I suspect more than a few PCs would be sitting at BIOS boot prompts. But if the experience of my favorite blogging economist is any indication, even when you have a monopoly and your OS is pre-installed on 90% of new PCs, you can still screw up the initial user experience. Michael Shedlock is an economic analyst with an interesting view on everything from global property bubbles to currency carry trades to the goofy and dangerous actions of the Federal Reserve board. But give him a new laptop with Windows Vista pre-installed and pretty soon there's blood on the floor (literally!)

[Read More]

Monday Mar 19, 2007

About those SunLive07 Tech Days London public access terminals

If you attended the SunLive07 Tech Days conference in London and saw the talks and demos of Looking Glass, Wonderland and other cool new technology, you might have been disappointed in the look and feel and performance of the public access terminals upstairs. I can only say that the kiosk mode CDE running on what appeared to be an old version of Solaris with an old version of Sun Ray server does not look or perform nearly as well as anything beyond Solaris 10 with SRSS 3.0+. Here in Ireland I get about 1 Megabit broadband only when there is a tailwind. Yet the GNOME based JDS desktop in Solaris 10 or any recent Nevada build looks and works fine on a Sun Ray at my home. Some long time Solaris advocates prefer CDE on Solaris 8, but I'd put it in the same category as orange T-shirts. It would be really cool to have trusted JDS running on the public terminals. It's one of those technologies (like dtrace, ZFS...) that doesn't make for a flashy passive demo but once you've used it, you get it!

Friday Mar 16, 2007

London Tech Days

London Eye and Plane

Sun Tech Days London was held in Methodist Central Hall, Westminster. The venue was only a stone's throw from Westminster Abbey and a short walk from Big Ben, Buckingham Palace and other famous London sights. Tech Days was reasonably well attended; you might be able to see from the photo below that much of the lower section and a good part of the balcony of this enormous hall was full for James Gosling's keynote. James Gosling's somewhat understated traditional techie talk reminded me of a college friend, Electron Ron. It was interesting to see some of the spinoff projects from Java3D and project looking glass.

It was encouraging to meet people from within and outside of Sun who were on the same wavelength regarding many issues. There was much agreement on the mistakes Sun made a few years back but it really feels like we're getting back on the right track. I already knew about dtrace, ZFS, containers and other major recent improvements in OpenSolaris but I was also impressed with the enhancements in Java and NetBeans. The changes are very welcome for those of us who weren't happy with previous versions of NetBeans or J2EE. I was especially interested in the NetBeans profiler, C/C++ packs and mobility packs. It's no wonder there are so many cool applications becoming available for mobile phones. Embedded development is a far cry from what it was only 7 years ago when I was working with Electron Ron, using assembly language to try to pack a CRC algorithm and associated ADC data acquisition software into a transistor sized 64k embedded microcontroller. These were projects which I came to describe as "just this side of impossible" and which Ron admitted were occasionally just the other side of possible.

I finally met long time Solaris and OpenSolaris enthusiast Peter Tribble, but it was too late to talk him into running for the OpenSolaris board during this election cycle.

The top photo of the London Eye was a lucky coincidence. When my (broken) Casio QV-4000 went through its long boot sequence, the camera, clouds and London Eye were in a good position in relation to one another. The missing front lens element in my camera gave it a surreal, almost fisheye effect.

Gosling Java keynote OpenSolaris tech days 2007

The distorted images from my broken camera make it difficult to align panoramic frames from my interior shots of James Gosling's keynote. I followed these instructions for installing the pandora GIMP plugin. The plug-in instructions are for Linux but, as is the case for many well-written *nix applications, it works just as well on Solaris Nevada and other OpenSolaris distributions. This GIMP panorama plugin made rough alignment of the image panels relatively easy. Now I should learn how to use Java3D to make interactive stitched panoramas. But I can't yet simulate one of these conferences in your browser so if you'd like to find out more about Sun's technical solutions to real world business problems, I'd suggest attending one of Sun's Tech Days conferences.

Friday Jul 07, 2006

First step towards a microsolaris

I doubt I'm the first to have complained about the size of default Solaris package clusters and weird package dependencies. Doug Scott has some good news for fans of Solaris minimalism. Doug posted this excellent mini-howto explaining the very first step in creating a stripped down OpenSolaris distribution. It should be possible to make an OpenSolaris distribution that is at least as small as DSL. When booted into the miniroot I see that zpool, pkgadd and mount are available. What more could you need? ;-)

Wednesday Jun 28, 2006

Slow food - Slow boot at GUADEC

GUADEC was my first visit to Vilanova i la Geltrú, Catalonia, Spain. I've enjoyed the weather, the people, the prices and the quality of food. (Black Rice Paella :-) The relaxed style of eating is the polar opposite of the U.S. fast food that I grew up with.

Yesterday afternoon a few of us sat down at a cafe/tapas place on Las Ramblas and ordered salads and sandwiches. I was preparing for my dtrace talk and I'd hoped to give out live OpenSolaris CDs to the audience so that they could play with dtrace during the talk. Dtrace really is one of those tools you have to play with to appreciate.

Sun's Solaris Express opensolaris distribution would have been the obvious choice. It was put together by Sun, with good i18n, documents and accessibility support. GNOME 2.14 was integrated only a few builds ago. Solaris express contains zfs, dtrace and other cool solaris tools. It has passed some Sun/QA and review for architecture stability. Unfortunately it also contains a whole bunch of stuff which has nothing to do with my desktop and won't fit on a CD. Even some of the minimal package clusters contain weird things that are hardly appropriate for a laptop. Do I really need this fiber-channel stuff on a wifi laptop?

BeleniX might have been another good OpenSolaris choice. It fits easily on a CD, but it contains an XFCE desktop instead of GNOME. SchilliX also seems to lack a recent GNOME and the last time I used it, it didn't have dtrace.

Nexenta has an interesting OpenSolaris demo distribution which runs from a live CD. This distribution doesn't have documentation (no man pages?) or some of the Solaris tools. It packages some GNU utilities and user space apps with a Solaris kernel and it contains both dtrace and GNOME 2.14. So I burned this distribution onto a few CDs. I intended to pass the CDs out before the dtrace talk so I booted it while eating a tomato and cheese salad at the Vilanova cafe. It was still looking for a dhcp server on my primary network interface when I finished my salad. It searched for a dhcp server on my wifi interface while I finished my sandwich. When I returned from ordering my coffee, it was at a login prompt. I logged in and sipped my coffee while we discussed why Nexenta didn't change the default dhcp timeout to something more reasonable for a wifi connected laptop. About 10 minutes later (total time at least 30 minutes) I still wasn't at the GNOME desktop. It seems that Nexenta didn't take advantage of an alternate filesystem to buffer reads from the CD. (How about something like a ZFS compressed filesystem living in RAM?) My battery was quickly running down while the CD was furiously trying to supply the livecd content. I gave up.

So that's why I didn't have a good distribution to share for the dtrace demo. With Sun's new boot architecture there is no reason why boot should take so long and there is no reason why a desktop user needs so much server related cruft packaged with their desktop. Someone told me about a keychain OpenSolaris distribution but I haven't had time to look at it. Sun has a great kernel, some awesome tools and a 21st century boot architecture. I know it's possible to create a distribution which is neither too big nor too small. Up until a year ago when Solaris was opensourced, it wasn't possible to make a task-customized Solaris distribution. Now it is possible. I think it's only a matter of time before someone puts together a keychain OpenSolaris, DVD OpenSolaris, CD demo OpenSolaris and (maybe I'll do this one) an RSYNC/ZFS OpenSolaris backup server distribution.

GUADEC dtrace

Shortly before Glynn and I gave our GUADEC talk on dtrace, we sat in on Federico's "How fast" talk where he raised a question about profiling evolution memory allocation. I put together a quick dtrace oneliner which gives a distribution for evolution startup:
bash-3.00$ dtrace -c /usr/lib/evolution-2.6 -n 'pid$target::malloc:entry {@howmuch=quantize(arg0);}'
dtrace: description 'pid$target::malloc:entry ' matched 2 probes
CalDAV Eplugin starting up ...

(evolution-2.6:104389): camel-WARNING **: camel_exception_get_id called with NULL parameter.
^Cdtrace: pid 104389 terminated by SIGINT

           value  ------------- Distribution ------------- count
               0 |                                         0
               1 |                                         3414
               2 |@                                        12502
               4 |@@@@@                                    45218
               8 |@@@@@@@                                  71301
              16 |@@@@@@@@@@@@                             117177
              32 |@@@@@@@@@                                84371
              64 |@                                        14154
             128 |@@                                       15120
             256 |@                                        8680
             512 |                                         4517
            1024 |                                         4119
            2048 |                                         1193
            4096 |                                         784
            8192 |                                         1263
           16384 |                                         25
           32768 |                                         112
           65536 |                                         10
          131072 |                                         4
          262144 |                                         1
          524288 |                                         0

Joerg also pointed out that my "where evolution mallocs" oneliner was printing (N x M) stack traces and that one of the pair of output numbers was meaningless because it represented the count of that unique combination:
dtrace -c /usr/lib/evolution-2.6 -n 'pid$target::malloc:entry {@howmuch=quantize(arg0); @where[arg0,ustack()]=quantize(arg0)}'
Where N was the number of unique allocation sizes and M was the number of unique stack traces. It's not as easy to do this correctly in a oneliner, so as soon as possible I'll put together a little script for profiling where the allocations are taking place.
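To make Joerg's point concrete, here is a small Python sketch (not dtrace, and not the script promised above) of the two ways of keying the aggregation. The event list and frame names are invented for illustration. Keying on the (size, stack) pair gives one entry per unique combination, while keying on the stack alone gives one entry per stack with the sizes collected underneath it.

```python
from collections import Counter, defaultdict

# Invented malloc events: (requested size, user stack as a tuple of frames).
STACK_A = ("malloc", "caller_a")
STACK_B = ("malloc", "caller_b")
events = [(16, STACK_A), (16, STACK_A), (32, STACK_A), (16, STACK_B)]


def by_size_and_stack(events):
    """One key per (size, stack) pair: up to N x M entries, each only a count."""
    return Counter(events)


def by_stack(events):
    """One key per stack: M entries, each holding that stack's size counts."""
    sizes = defaultdict(Counter)
    for size, stack in events:
        sizes[stack][size] += 1
    return dict(sizes)


if __name__ == "__main__":
    print(len(by_size_and_stack(events)), "entries keyed by (size, stack)")
    print(len(by_stack(events)), "entries keyed by stack alone")
```

In dtrace terms the second form corresponds to dropping arg0 from the aggregation key, which I assume is the direction a corrected script would take.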

One thing that is obvious from the above histogram is that evolution allocations are weighted towards very small amounts: most requests are for less than 64 bytes, and more than twelve thousand fall in the 2 byte bucket (?!!!) Solaris's slab allocator should help in these circumstances. The gtk 2.10 talk was interesting; apparently GSlice uses an allocation scheme similar to that used by Solaris's slab allocator. You can't keep a good idea secret forever, can you?
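A note on reading the histogram: each row label produced by quantize() is the lower bound of a power-of-two bucket, so the row labelled 2 counts requests of 2 or 3 bytes and the row labelled 16 counts requests of 16 to 31 bytes. The following Python sketch mimics that bucketing for non-negative sizes. The function names and sample sizes are made up, and this is only an illustration of the idea, not how dtrace implements it.

```python
from collections import Counter


def quantize_bucket(value: int) -> int:
    """Return the power-of-two row label that a non-negative value falls under."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative sizes")
    if value == 0:
        return 0
    # Largest power of two that is <= value, e.g. 3 -> 2, 16 -> 16, 31 -> 16.
    return 1 << (value.bit_length() - 1)


def quantize(values):
    """Count values per power-of-two bucket."""
    return Counter(quantize_bucket(v) for v in values)


if __name__ == "__main__":
    sample_sizes = [2, 3, 4, 7, 16, 24, 31, 32]  # made-up request sizes
    for bucket, count in sorted(quantize(sample_sizes).items()):
        print(f"{bucket:>8} | {count}")
```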

I'll try to blog more about this if I have time and reliable connectivity.

BTW, why are the GUADEC wifi dhcp-servers handing out the same IP address to several MAC addresses?
Jun 28 15:24:46 sligo ip: [ID 903730 kern.warning] WARNING: IP: Hardware address '00:0e:35:07:69:1b' trying to be our address!
I guess the opportunity for IP address collision at an event like GUADEC is the reason why dhclient uses a very time consuming and paranoid arp broadcasting scheme to make sure that I'm not using someone else's IP.

Thanks to everyone who attended our dtrace talk. I wish we had more time so I didn't have to fly through the demos, but it really is a technology you have to play with to fully appreciate.

Friday Feb 03, 2006

Groundhog day: Debugging BrandZ linux, repeatability

I know that the groundhog's official winter forecasting day was yesterday, but I was just thinking about the movie, Groundhog Day, where Bill Murray plays a weatherman who must relive the same day over and over again until he gets it right. The reason I was thinking about this is that I was debugging some problems while trying to get Postgres and a Kylix based application to behave together within a BrandZ zone. Other than a few weird things about the way Kylix creates ELF binaries, most of my problems have been configuration and RPM version issues. One cool thing about BrandZ is that it allowed me to tar up my working linux environment, send it off to a BrandZ expert who could recreate my problem, create a patch and send the patch back to me. Another cool thing is that if I somehow manage to misconfigure my linux environment or an install overwrites something, I can go back to the tarball and have the linux environment from yesterday (Groundhog day) up and running in just a few minutes.

What I haven't done, but should be possible, is use ZFS's filesystem snapshot and rollback feature to more efficiently capture snapshots of my work-in-progress so that I can always go back an hour or two to the latest working environment. BrandZ isn't yet complete, but is anyone else using it for debugging linux applications?

Sunday Jan 01, 2006

BrandZ just works

Shortly before Christmas I installed BrandZ on a Dell box with OpenSolaris Nevada build 27. So on the same box I had debian nexenta, OpenSolaris and within that a zone running Linux. The instructions here were pretty straightforward. Even though I'd never messed with BFUs before, it didn't take much time to get up and running. I was even able to install packages from a Java Desktop System 2.0 (linux) into the BrandZ zone. In fact the only problem I had was that I couldn't think of many useful Linux applications which didn't already have a native Solaris port. Think of typical linux applications. Does Apache exist for Solaris? Of course! MySQL? Mozilla, Gnome, Star/OpenOffice, gcc? Yes, they're built in! If you can't find it here, look here or here. I'll admit that there are occasional applications which don't yet have a native port, but not enough to really make an impressive demo. Porting between BSD, Linux, OSX and Solaris isn't nearly as difficult as porting between Microsoft Windows and practically anything else.

If you're interested, Tim Foster has some BrandZ screenshots. As Tim says, BrandZ performs just as advertised. It's one of those useful technologies which doesn't make for a spectacular demo. I wonder how many linux applications out there are running on separate machines simply because one needed V 2.99 of some library and the other one needed 3.00? I wonder how many redundant linux boxes are wasting energy just because someone couldn't configure web applications to share resources or apache/mysql configurations well?

If I were or a similar provider, I would at least consider the economics of running multiple linux zones on one of Sun's AMD 64 servers, versus running multiple machines.

BrandZ hasn't even been released as a beta product, but it seems to run the example CentOS distribution O.K. My only question is, should I pronounce it "Brand-Zed" or "Brand-Zee"?

Tuesday Dec 13, 2005

Where are gnome page faults generated?

I dialed into some sessions of jmr's desktop summit which weren't too late for those of us on Irish time. It would have been nice to meet the engineers in person; it was really a worthwhile exchange of ideas.

Gman mentioned that someone in the Gnome community was interested in whether it would be possible to use dtrace to get a stack trace of all gnome code being executed when a page fault occurs. So I learned about the vminfo provider from the dtrace manual and from Richard McDougall's blog. Richard has some vminfo scripts which measure the time spent in pagefaults. Here is my attempt at a "Where is GNOME paging?" script:

#!/usr/sbin/dtrace -s
vminfo:::pgpgin
/uid == $1/
{
        @[execname, arg0, ustack()] = count();
}
It's too simple, isn't it? But does it work? I log out of gnome, do a lockfs -f / and run the script before logging in to gnome.
...{stuff deleted}
 gconfd-2                                                          1
  bonobo-activatio                                                  2
... {stuff deleted}

gnome-session                                                     1
...{stuff deleted}

There are no comments in the script and I didn't put headings on the printout so here is what it does: Whenever there is a pagefault AND uid = {command line argument 1}, the pgpgin probe puts the number of pages requested into arg0. I aggregate this along with the user stack and the name of the process which generated the page fault. So in the above example, gnome-session with this ustack page faults 146 times during login while requesting 1 page each time. Here is the entire pagefault output during login and here are the page faults which occur when gnome-calculator is launched.
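Because each aggregation key includes the number of pages requested, the total pages paged in by a process is the sum of pages times count over all of its rows. As a rough sketch of that post-processing step in Python: the 146 figure comes from the gnome-session example above, while the other rows and the stack labels are made up, and the dictionary layout is my own assumption rather than anything dtrace emits directly.

```python
from collections import defaultdict

# (execname, pages per page-in, stack label) -> count.
# Only the gnome-session / 1 page / 146 row reflects the text; the rest is invented.
rows = {
    ("gnome-session", 1, "stack-1"): 146,
    ("gnome-session", 2, "stack-2"): 3,
    ("gconfd-2", 1, "stack-3"): 1,
}


def pages_per_process(rows):
    """Sum pages * count for every process name."""
    totals = defaultdict(int)
    for (execname, pages, _stack), count in rows.items():
        totals[execname] += pages * count
    return dict(totals)


if __name__ == "__main__":
    ranked = sorted(pages_per_process(rows).items(), key=lambda kv: -kv[1])
    for name, total in ranked:
        print(f"{name:<16} {total}")
```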

It seems almost too easy, doesn't it? If this is wrong, I hope a dtrace expert will correct me. Padraig and the desktop summit attendees are already looking at linker options which could reduce paging but it would take someone who knows gnome libraries better than I to know whether this output highlights any problems.

Note: I suppose there is some irony in the fact that while investigating this, my cursed linux laptop kept going catatonic on me. Top, vmstat... showed that everything was just dandy. Does anyone have a simple suggestion for diagnosing severe linux performance issues not seen by top? I really don't have time to mess with this. Aaaargh! The biggest problem with dtrace is that it makes me wish I had it for linux, and OSX... and Microsoft Windows.

Update: As if those dtrace guys were reading my mind with the user::whatIwish4:now probe, it's now possible to dtrace a linux application with BrandZ. Spooky.

Friday Nov 11, 2005

More dtrace desktop fun

In case you missed it, Alan Hargreaves posted a followup to Gman's first dtrace blog entry. Wouldn't it be easier if the dtrace guys just published a concise document which explains the one or two things that dtrace can't do?

DTRACE HowNotTo --
I. How you should not use dtrace to butter your toast...

Tuesday Nov 08, 2005

First impressions Nexenta "Elatte" debian opensolaris live

A few miles from here, in Dublin's St. Patrick's Cathedral, there is a door with a hole in it. The story goes that in 1492, while Columbus was out on his spice junket, there was a feud between the Butlers of Ormonde and the Fitzgeralds of Kildare. The Earl of Kildare sought refuge behind this door. The Earl of Ormonde axed a hole in the door... to shake hands with the Earl of Kildare. I don't think the feud between GNU is Not Unix and Unix was ever as severe as the Fitzgerald-Butler feud, but seeing them 'shake hands' on my Nexenta desktop is pretty cool!

Installation notes: This is a live CD alpha so I didn't expect things would work perfectly but I hoped install would work on this Dell GX-170 which has a mysteriously finicky OEM network card. I saw two errors before accessing the desktop. The first was that gdm was trying to start when it was already started. No problem, I'd rather have it try to start 2 gdms than none! The second was an Internal error popup indicating "failed to initialize HAL!" (I'm sorry Dave....) Installation was slow but the Nexenta website warned me of this. As Peter Tribble noted, the package system is one of the parts of Solaris which may still earn the (increasingly inaccurate) Slowaris title. Since a live CD may have to install many packages on every boot, it doesn't surprise me that this is slow. The install recognized my Intel NIC but not the weird internal Dell OEM NIC. This isn't surprising; this driver isn't yet in my opensolaris distribution either.

The desktop is based on GNOME 2.12, which has some new features such as 'throbbing' windowlist indicators for applications requiring your attention. The themes and splash screens are pretty and the layout makes sense. I'm accustomed to the single bottom panel of JDS but the launch menu, calendar, show desktop and trash occupy the four easily accessed corners. The menu isn't cluttered, but it doesn't yet contain an office suite. Sun's java, package manager and Solaris Containers (a.k.a zones) aren't available but Evolution 2.3.7 and firefox 1.0.6 are. They start up O.K. and once I set up my proxy, they are able to access the network. Some administration functions don't yet work. The desktop feels somewhat slower than the JDS GNOME 2.12 on OpenSolaris build I was previously using on this box, but it isn't bad considering that this is running from CD.

Treats: Shell connoisseurs will be happy to hear that /bin/bash is the default shell! Colour highlighting is visible in gnome-terminal and /usr/local/bin is in your default PATH. Dtrace is ready to go for those who want to peer into a debian application. The package database is on the CDROM and is read-only, so I wasn't able to experiment with apt-get or the synaptic package manager. For that I'll need to set aside a partition and do a real install, but from what I've seen so far, it will be well worth it!
