Thursday Jan 27, 2011

OpenSolaris CUPS LPD Printing Followup

In my earlier post I complained about CUPS in OpenSolaris (actually, all OSes with CUPS 1.4) not auto-detecting the LPD queue name correctly. I filed bug #3755 in the CUPS bug database, and apparently this is now fixed in CUPS 1.5-current. CUPS will now prompt you to fill out the URI manually if it detects that the LPD queue name is empty. I feel good knowing I found a bug that caused me a bit of grief and that someone fixed it.

Wednesday Dec 15, 2010

Upgrading from OpenSolaris to Specific Builds of Solaris 11 Express: Pkg Gotchas

There has been a major change in the packaging repositories between OpenSolaris and Solaris 11 Express. OpenSolaris packages were published under the old publisher name and base URL; Solaris 11 Express changes the publisher name to solaris and uses a new base URL. You will need to append either dev/ or release/ to the end of the base URL to reach the appropriate repository.

In the course of testing, I wanted to upgrade to a specific build of Solaris. I followed the instructions from a blog entry from The Observatory on how to upgrade to a specific build. Since the publisher changed between the bits I was currently running (which referenced the old publisher) and the repo that held the build I wanted (which referenced the publisher solaris), I had to do some pkg magic to change the publishers.

One thing to keep in mind is that before I did anything, I created a new boot environment to install to:

beadm create snv_151a
beadm mount snv_151a /mnt-solaris11express

The two commands I ran to switch publishers were:

pkg set-publisher -P -O  solaris


pkg set-publisher --non-sticky

The command I was using to try and upgrade was:

pkg -R /mnt-solaris11express install entire@0.5.11,5.11-

One caveat I found with this process: if you are going to do a pkg -R <newpath> install of entire, you need to make sure that the two pkg set-publisher commands that change the publisher to solaris are *also* prefixed with -R <newpath>. I was basically unable to install entire@somebuild to my -R <newroot> even when I gave it the full FMRI of the package. The weird thing was that pkg list -avf entire showed the entire package that I wanted, with the same FMRI. I could not figure it out, so I went asking for help from the pkg folks down the hall from me. I was lucky enough to have Brock Pytlik come help me out. He quickly figured out the problem, which came down to an incorrect understanding on my part of how the pkg process works with the -R <path> flag.

As I understand it, pkg -R <path> basically chroots to <path> before executing. Disclaimer: I am not privy to how the -R flag is actually implemented, but chroot describes it in terms I can understand. One thing I did not realize is that the chroot occurs and THEN the publisher information is read in from <path>. I had created the new boot environment before changing my publishers, so my snapshot boot environment did not have the updated publishers. I had assumed that the pkg install command would use the currently configured publisher information and simply apply the install to -R <path>. Instead, it reads the configuration from <path> and THEN tries to install.
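A quick way to see this behavior is to compare the publisher configuration of the live image with the one under the alternate root (a hedged sketch; the mount point is the one from this post):

```shell
# pkg reads publisher configuration from under -R <path>, not from the live image,
# so these two commands can disagree if the BE was created before set-publisher ran.
pkg publisher
pkg -R /mnt-solaris11express publisher
```

If the second command still shows the old publisher, the set-publisher commands need to be rerun with -R /mnt-solaris11express before installing anything there.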

The correct sequence of commands would have been:

beadm create snv_151a


beadm mount snv_151a /mnt-solaris11express
pkg -R /mnt-solaris11express set-publisher -P -O  solaris


pkg -R /mnt-solaris11express set-publisher --non-sticky


pkg -R /mnt-solaris11express install entire@0.5.11,5.11-

Of course, this all could have been avoided if I had changed the publishers and THEN run beadm create to create my new boot environment, but alas, that is not what I did. Had I done it that way, beadm create would have taken a snapshot with the new publisher information configured and none of this would have happened. Sometimes command ordering matters; this is one of those times.

I hope this blog post helps out someone who is in the same predicament as me but does not have the convenience of a Solaris engineer down the hall. The only alternatives I could see were a pkg image-update to the latest (which would have worked, BTW, since that does not use the -R <path> flag) and a fresh install of a build that had the publishers correct (which is not what I wanted to do).

Tuesday Dec 14, 2010

Using Crossbow and Solaris 11 Express Zones for a single machine proof of concept environment with Puppet

My last blog entry was about my debugging experience with Puppet and a promise to share the setup that I used. I now follow up that entry with this one, which describes my Crossbow + NAT + S11 Zones proof of concept.

One of the very nice features in Solaris 11 Express is the inclusion of the Crossbow virtual networking infrastructure into the mainline S11 Express code. I was familiar with Crossbow from some of the older presentations from CommunityOne when it was still just an OpenSolaris project. Now that it has been included in the mainline S11 Express codebase, I decided it was time to check things out and see how I could leverage it to do some proof of concept testing with my continuing evaluation of Puppet.

What is Puppet?

Puppet is a data center automation tool that we are evaluating to harmonize our configurations across multiple systems. As most people know, hand building and tweaking systems only works when you have a small number of systems, and even then it is error prone and not optimal. Puppet is a tool (one of many such tools) that seeks to automate these tasks with configuration servers and configuration agents.

Puppetmasterd is the configuration server that holds the configuration for an entire site. Inside the site definition are host mappings that map hosts to the particular configurations that each agent is expected to conform to.

Puppetd is the agent on the client side that contacts a Puppet server, retrieves the expected host configuration definition, and then is the agent of change that implements the changes on the client. It will periodically poll the puppetmasterd service for any changes to the host configuration and will then ensure compliance to the new changes.
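As a sketch of the minimal round trip between the two (the --server hostname here is hypothetical; puppetmasterd and puppetd are the Puppet 2.6 command names used in this post):

```shell
# On the server zone: start the master in the foreground so any errors are visible.
puppetmasterd --no-daemonize --verbose

# On the client zone: run the agent once against the master and apply the catalog.
# --test enables one-shot, verbose, foreground operation.
puppetd --test --server puppetmaster.example.com
```

Once that single run succeeds, puppetd can be left running as a daemon to poll the master periodically, as described above.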

One thing you have to realize is that Puppet is, by nature, a multi-system architecture. Even the smallest configuration needs a puppetmaster server and a puppetd client, for a minimum of two systems. In the olden days, that would have meant two independent computers/servers to simulate the test environment, not to mention the networking infrastructure just to connect the two machines together (yes, I know a network switch is cheap, but it is not central to testing Puppet, which is my main goal). All these pieces cost money and are, in my opinion, a complete waste of money and resources just to test and evaluate Puppet, which is very lightweight by nature.

So, I decided to test and evaluate Puppet on my workstation, a Sun Ultra 24 running Solaris 11 Express. I knew that I could create zones on it, but I did not want to give each zone a NIC with a real IP address, which is how networking is normally plumbed in Solaris. Since this was my own testing and evaluation, I did not want it on the public work network, just in case it disrupted someone else, and I did not want to hog IPs just for testing inside my private network.

Enter Crossbow Virtualized Networking:

Having read the S11 Express release notes, I remembered that Crossbow was integrated into the S11 Express build I was using, so I decided to use it to accomplish my testing. As usual, blogs were the resource to consult, and I found Nicolas Droux's blog entry for setting up an etherstub network with NAT (see his original entry).

The Crossbow Etherstub + NAT Howto:

You can think of an etherstub network conceptually as a virtual ethernet switch or VLAN. To this etherstub you connect virtual NICs (or VNICs) tagged with the etherstub identifier. VNICs that share the same etherstub identifier can communicate amongst each other (analogous to having multiple system NICs on the same VLAN). There can, of course, be multiple etherstubs on a single host, using Crossbow to simulate multiple VLANs.

The original instructions had the following:

# dladm create-etherstub etherstub0
# dladm create-vnic -d etherstub0 vnic0
# dladm create-vnic -d etherstub0 vnic1
# dladm create-vnic -d etherstub0 vnic2

However, in the time that Nicolas wrote his blog and now, the command flags in dladm had changed to specify the etherstub identifier for the create-vnic subcommand from -d to -l (that is a lowercase L as in Llama).

After consulting the updated man pages, the new instructions are:

# dladm create-etherstub etherstub0
# dladm create-vnic -l etherstub0 vnic0
# dladm create-vnic -l etherstub0 vnic1
# dladm create-vnic -l etherstub0 vnic2

Basically, after the VNICs are created, I assigned vnic0 to the global zone, vnic1 to the puppetmaster zone, and vnic2 to the puppetclient zone. Follow Nicolas' blog entry to see how to do that in the zonecfg zone definition.
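For reference, handing a VNIC to a zone looks roughly like this in zonecfg (a sketch assuming an exclusive-IP zone named puppetmaster; your zone name and the rest of the zone definition will differ):

```shell
# Give the puppetmaster zone exclusive use of vnic1 on the etherstub.
zonecfg -z puppetmaster <<'EOF'
set ip-type=exclusive
add net
set physical=vnic1
end
commit
EOF
```

With exclusive-IP, the zone owns vnic1 outright and configures its own IP address on it from inside the zone.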

After that, the steps are:

  1. Plumb vnic0 (ifconfig vnic0 plumb; ifconfig vnic0 up)
  2. Enable routing in the global zone (routeadm -u -e ipv4-forwarding)
  3. Create a NAT rule in /etc/ipf/ipnat.conf (replace e1000g0 with the interface that holds your default route)
    map e1000g0 -> 0/32 portmap tcp/udp auto
    map e1000g0 -> 0/32
  4. Enable ipfilter if not already enabled (svcadm enable network/ipfilter)
  5. Check the NAT mappings were taken in and accepted (ipnat -l)
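Put together, the steps above look something like this (the 192.168.0.0/24 subnet and the vnic0 address are assumptions for illustration; e1000g0 is the example interface from the post):

```shell
# 1. Plumb and address the global zone's VNIC on the etherstub subnet.
ifconfig vnic0 plumb
ifconfig vnic0 192.168.0.1 netmask 255.255.255.0 up

# 2. Turn on IPv4 forwarding so the global zone routes for the etherstub.
routeadm -u -e ipv4-forwarding

# 4. Enable ipfilter to activate the NAT rules written to /etc/ipf/ipnat.conf.
svcadm enable network/ipfilter

# 5. Verify the NAT mappings were loaded.
ipnat -l
```

The zones then get addresses on the same subnet (e.g. 192.168.0.x) with vnic0's address as their default route.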

At this point you can boot your zones and configure them with IPs in the NAT'ed subnet. They should be able to reach, through NAT, any machines outside that the global zone can access.

The Results:

After completing the zone installation and the Puppet installation, I had a puppetmaster and a puppetclient that could talk to each other without being exposed to the public. I did not have to purchase any new equipment, and the configuration was about as simple as can be. The best part is that I can keep creating puppet clients as zones, give them different puppet configurations that inherit different configuration manifests, and test it all without buying anything extra. Since zones are very lightweight compared to fully virtualized VMs, I can run many zones on my workstation without a slowdown, and they won't chew up all my processor time.

I can imagine many other great uses of Zones + Crossbow technology. It is immensely useful in situations where you are not testing performance, but rather features and proofs of concept to justify further investment. It can simulate a small deployment network on a single host without further infrastructure investment (compute resources or network).

Tuesday Dec 07, 2010

Debugging Puppetmasterd from Puppet in Solaris 11 Express (and others)

I am evaluating Puppet for possible use in our labs. For those of you who do not know of Puppet, it is a Ruby based automation tool in the same vein as CFEngine and other system automation tools. It comes in very handy in medium to large deployments to guarantee that site-wide configurations are identical. Hand building and tweaking systems becomes tedious and error prone as the number of systems grows, making it hard to keep configuration details consistent.

As I was going through the Puppet installation on my Solaris 11 Express workstation I noticed that puppetmasterd was not starting an instance of puppet. I had followed the instructions in the Configuration Guide from PuppetLabs for Puppet 2.6.4. I had installed from the Ruby source instead of using one of the many prebuilt packages available on the interwebs and package repositories.

One major thing I found non-intuitive with puppet was the set of options you can pass to puppetmasterd to debug problems. This is the daemon that starts the puppetmaster service on the host, listening on port 8140. When I ran it, the puppetmasterd output was clean and it returned me to the shell. Unfortunately, after testing with my puppet client, it appeared that the service was not running. I confirmed this in the puppetmaster zone: netstat showed nothing binding to port 8140, and ps -ef | grep puppet showed no puppet process.

One critical thing that I did was follow the instructions to use puppetmasterd --mkusers which automates a bunch of the tasks like creating the puppet users and groups, making certificates, and much more.

My first reaction was to run puppetmasterd --help and look for options that might give me debug output. Well, there is a debug flag (--debug) and also a verbose flag (-v). I tried running puppetmasterd with both and still didn't get anything. Then I redirected the output to a file, since I wasn't logged in to the zone console on the puppetmaster and thought the output was being sent to the console. No such luck either. After searching around on the interwebs, I finally found the debug output I was looking for. I was missing the trace option, which prints the stack trace.

puppetmasterd --no-daemonize --verbose --debug --trace

Once I did that, I saw the following:

err: /File[/var/lib/puppet/rrd]/ensure: change from absent to directory failed: Could not set 'directory on ensure: Permission denied - /var/lib/puppet/rrd

Sure enough, when I checked the permissions of /var/lib/puppet, it was all owned by root:root.

Then, and only then, did it show me what I needed to fix the problem. Once I did a chown -R puppet:puppet /var/lib/puppet everything was happy! My main gripe is that the debug and verbose options should have been enough to surface that error. I don't see why I have to print the stack trace to get the debug information I was looking for. The error did not manifest itself when I used the debug and verbose flags; it only came out when I added the trace flag. I expected the trace flag to show me stack traces, not to reveal additional debug messages that actually told me what the problem was!

I will have a future followup blog about how I used new features in Solaris 11 from OpenSolaris such as Zones and Crossbow networking to virtualize my network connection and run it through NAT on my workstation for testing. I was able to have a puppet master and multiple puppet clients virtualized on a single machine connected through a virtual etherstub network courtesy of Crossbow.

Friday Oct 01, 2010

OpenSolaris CUPS Printing with auto-discovered LPD printers

There is a major bug in the way the CUPS web administration portal (the interface on http://localhost:631) handles autodetected CUPS printers when the printer only allows LPD printing. I remember spending a few hours trying to get my autodetected Xerox Document Centre printer working at the Sun Santa Clara office!

Normally, LPD printers have a URI of lpd://<printer IP>:<port>/queuename. More specifically, /usr/lib/cups/backend/lpd will NOT let you print without a queue name attached.

It fails with some obscure message of:
D [01/Oct/2010:15:21:43 +0800] [Job 3] lpd_command returning -1
D [01/Oct/2010:15:21:43 +0800] [Job 3] Backend returned status 1 (failed)
D [01/Oct/2010:15:21:43 +0800] [Job 3] Printer stopped due to backend errors; please consult the error_log file for details.
D [01/Oct/2010:15:21:43 +0800] [Job 3] End of messages
D [01/Oct/2010:15:21:43 +0800] [Job 3] printer-state=5(stopped)
D [01/Oct/2010:15:21:43 +0800] [Job 3] printer-state-message="/usr/lib/cups/backend/lpd failed"

The problem that OpenSolaris CUPS (or maybe CUPS in general) has when it autodetects network printers is that it is impossible to edit the printer URI once the printer has been "detected" by the web interface; the printer administration page offers no way to modify it either. At no point does it ask you what the queue name is after it autodetects the printer as an LPD printer, nor does it warn you that you need to fill in a queue name. Instead, it silently accepts the auto-detected configuration and tries to print, at which point it bombs out.

The solution I have found is to add the printer manually as an LPD/LPR printer, at which point you can fill in the LPD URI.

The only reason I solved this problem is that at that point, the manual add page of the CUPS interface shows you an example of an LPD URI in the lpd://<printer IP>:<port>/queuename form.

After that, I could print quite happily and all was fine. CUPS should warn you that even after it autodetects the LPD IP address, it still requires a queue name to complete the add process. I had assumed that CUPS had some sort of default queue name, but clearly this is not the case. And on a second note, the CUPS printer URI should be editable in the modify printer administration option.

All this to print out a network map! YIKES!

OpenSolaris is my new dayjob!

I have recently joined Bonnie's team as her latest OpenSolaris commando for DevOps support. I am really happy to land a role where I hope I can make a difference.

I am no stranger to OpenSolaris... After I joined Sun in October 2007, I worked on the Sun Streaming System (an IPTV video server) and was tasked with creating various infrastructure proofs of concept using OpenSolaris. One of these was a Xen virtualized test dispatch consolidation to reduce the number of physical machines required to dispatch TCL jobs to various test harnesses. Using OpenSolaris 2008.05 was a real fun experience which I still enjoy to this day. The ipkg system really made a huge difference compared to the old System V packages and 3rd party tools like pkg-get from the CSW guys.

After that project, I made local OpenSolaris ipkg repo mirrors and kept going to see if we could repackage our software in ipkg format. Most recently, I tested ZFS deduplication and ZFS compression for a proof of concept, and the results were pretty exciting!

It has been many years since 2008.05 came out, but I am really excited with what is to come. Lots of things have changed since then... most significant is probably having Sun being acquired by Oracle. It should be interesting times ahead, and I look forward to what comes my way!

