Observations on packaging
By sch on Jul 19, 2007
Over the past few months, a bunch of us have been exploring various options for packaging. (Actually, I suppose I've been pursuing this on a part-time basis for a year now, but it's only recently that it's made it to the top of the stack.) I've looked at a bunch of packaging systems, ported a few to OpenSolaris, run a bunch on their native platforms, and read a slew of manual pages, FAQs, blogs, and the like. In parallel with those efforts, Bart, Sanjay, and others have been analyzing the complexity around patching in the Solaris 10 patch stream, and improving the toolset to address the forms of risk that the larger features in the update releases presented.
In the course of those investigations, we've come up with a number of different approaches to understanding requirements and possibilities; I'll probably write those out more fully in a proper design document, but I thought it would be helpful to outline some of those constraints here. For instance, one way to look at how we might improve packaging is to separate the list of "design inputs" for a packaging system into "old" and "new".
When I make a list of this kind, I know it's bound to offend. (I already know it's more scattershot than the argument we'll present in a design document.) Feel free to send me the important inputs I've omitted, as I have a few follow-up posts on requirements—lower level, more specific intentions—and architectural thoughts where I can cover any issues not mentioned here.
The "old" inputs are those that are derived from current parts of the feature set around software packaging and installation. Many of these inputs will really be satisfied by the new installer work ("Caiman") Dave's leading, but the capabilities of the packaging system will change some from difficult and fragile to straightforward and robust. With effort, some might even achieve some elegance.
- Hands off. For a long time, via the JumpStart facility, it's been easy to provision systems via scripted installed. This technology is still sound, particularly when we consider that the Flash archives allow a mix of image-based provisioning with the customizations JumpStart offered.
- Virtualized systems. As long or longer, we've supported the notion of diskless systems, where the installed image is shared out, in portions, between multiple running environments. Zones was a direct evolutionary successor of the diskless environments, and shows that this approach can lead to some very economical deployment models. Lower-level virtualization methods can also benefit from the administrative precision that comes out of sharing known portions of the installation image.
- Availability and liveness. With Live Upgrade, it's been possible for a while now to update a second image on a running system while the first image is active—a command to switch images and a reboot is all that's required to be running the new software. This approach requires too much feature-specific knowledge at present, but provides a very safe approach to installing an upgraded kernel or larger software stack, as reverting to the previous version is just a matter of switching back to the previous image. So, a package installation option that doesn't ruin a working system is a must.
- Change control. In principle, with the current set of package and patch operations, it is possible to create a very specific configuration—that may never have been tested at any other site, mind you—from the issued stream of updates. From a service perspective, the variety of possible configuration combinations is too broad, but the ability to precisely know your configuration and make it a fixed point remains important.
- Migration and compatibility. There's a very large library of software packaged in the System V format that will run on most OpenSolaris-based systems. Providing well-defined compatibility and migration paths is an obvious constraint, given the goals of the development process.
- Networked behaviour. Believe it or not, the current installer and packaging support a large set of network-based operations—you can install specific packages and systems over the network, if you know where everything is. That is, the current components need to be assembled into some locally relevant architecture by each site to be useful—any replacement needs to make this assembly unnecessary, potentially via a number of mechanisms, but definitely via a list of known (trusted?) public package repositories.
- Safety. One of the real complications from virtualized systems, at least in current packaging, is that a developer has to understand each of the image types his or her package might reach, and make sure that the set of pre- and post-install scripts, class actions scripts, and the like are correct for each of these cases. When that doesn't happen, the package is at a minimum incorrectly installed; in certain real cases, this class of failures compromises other data on the system. Restrictions in this space, particularly during package operations on an inert image, seem like a promising trade-off to achieve greater safety.
- Developer burden. Current packaging requires the developer provide a description of the package, its dependencies, and contents across a minimum of three files, in addition to knowing a collection of rules and guidelines around files, directories, and package boundaries. Most of these conventions should be encoded and enforced by the package creation and installation tools themselves, and it should be possible to construct a package with only a single file—or from an existing package or archive.
smf(5) aware. OpenSolaris-based systems have an understanding of services and service interdependencies, derived from the Service Management Facility's service descriptions.
smf(5) service management during packaging operations is awkward under current packaging and very much needs to be improved but, more importantly, the service graph provides a rich set of relationships among larger components that should lead to a better understanding of the rules around consistent system assembly.
Minimization. For a while, the current system has had
some very coarse package "blocks", from which one could construct a
system—the various metaclusters. These, with the exception of the
minimally required metacluster and the entire metacluster, split the
system along boundaries of interest only to workstation installs. Any
suitable packaging system must provide query and image management tools
to make image construction more flexible and less error-prone (and
eliminate the need for things like a "developer" metacluster, for that
It's also pretty clear that the package boundaries aren't optimized in any fashion, as evidenced by the differing rates of change of the binaries they currently enclose—in the form of the issued patches against Solaris 10.
Multiple streams of change, of a single kind. Although
we noted the continued need to control change above, it's also important
to be able to subscribe to a stream of change consistent with one's
expectations around system stability. The package system needs to allow
one to bind to one or more streams of change, and limit how the
interfaces the aggregate binaries from those streams evolve. That is,
it should be possible to subscribe to a stream of only the most
important updates, like security and data corruption fixes, or to the
development stream, so that one's system changes in a way consistent
with one's expectations.
Conversely, the tradeoff between complexity and space optimization in current patches—which introduce a separate and difficult-to-calculate dependency graph and distinct namespace entries for each platform, among other issues—has slid much too far, given the increase in system richness and the increases in disk space and bandwidth. There seems to be little long-term benefit in preserving the current patch mechanism, particularly since Sun never offered it in a form useful outside of Sun's own software.
ZFS aligned. ZFS offers the administrator access to so
many sophisticated options around deployment and data management that it
would be foolish for a packaging system to not explore how to take
advantage of them—
zfs cloneare the most obvious capabilities that allow the packaging system to limit the risk of its modifications to an image, without excessive consumption of storage space or system time.
Prevent bad stuff early. Another classic
OpenSolaris-style requirement is that the set of available packages
be known to be self-consistent and correct at all times, to spare the
propagation of incomplete configurations. In a packaging system, this
input expands on our intent to reduce the developer burden to assist the
developer in writing as correct a package as possible, and to enable the
repository operator to block incomplete packages from being made
available to clients. There's a rich set of data here, as we noted for
smf(5) service descriptions above.
Friendly user deployment. Direct from Indiana,
but sensible and apparent to all is that packaging systems have advanced
to a point where the usability of the client is expected, and not an
innovation. I haven't got the complete timeline of packaging
developments—the literature survey continues—but it's clear
aptsystem marks the current expectations about the client-side capability and ease-of-use.
In the course of the Indiana discussions, Bryan raised one point, which I'll paraphrase as "it's not an idea until there's code". That's a sound position, which I also happen to hold—Danek and I (and Bart and Daniel, I hope) and have been quietly prototyping for a little while now, to see if we had a handle on this collection of inputs. I'd like to give a bit more background, in the form of requirements and assertions, in an upcoming post or two. Then we're hoping to start a project on opensolaris.org to take the idea all the way from notional fancy to functioning code.