How is an Enterprise Data Center Like a Hardened, Minimized, Special-purpose Appliance?
By ford on Jun 20, 2006
Almost two years ago I was part of a team in Solaris Software that investigated the application of Solaris to horizontally scaled environments, looking for any technology gaps or opportunities that we should try to fill with new Solaris features. The typical customer environment that we envisioned when thinking about horizontal scaling was a large data center running a variety of software, including many "small" applications running on dedicated single boxes of various types, and a few "huge" applications, each running in some kind of distributed fashion across multiple boxes. In a sense, this is right in Sun's traditional customer base, the large data center environment. But it also requires adapting to several growing trends in the industry: server consolidation, low-cost commodity computers, grid computing, and virtualization.
At the same time, I was advising another team in the same organization that was experimenting with Solaris-based appliances - small, special-purpose boxes with all of the "general purpose Unix" system administration eliminated, hidden, or automated, and only the one (or few) required functions having any direct interaction with the administrator, perhaps through a simplified browser-based interface. The emphasis here was not on creating new appliance products for Sun to sell, but rather on creating tools and a business model for Sun to sell a Solaris-based appliance construction set to hardware OEMs. The idea is that a hardware vendor who doesn't know much about Solaris itself can buy the construction set and point and click through a list of features (firewall, router, mail server, web server, database server, IDS, file server, etc.), and the tools would output, say, a hard disk image ready for mass duplication or a turn-key install CD that will turn a raw PC rolling off the assembly line into a web-managed appliance with all the selected features. This is a whole new area for Solaris.
These two projects were thinking about radically different deployment styles for Solaris, but I was intrigued by the number of issues and ideas that popped up in common between the two styles. I started collecting some interesting ideas that could improve Solaris' applicability for both styles of deployment.
For example, the two styles have different security issues, but may have some common solutions. If your data center or grid has hundreds or thousands of separate Unix systems, comprising literally terabytes of copies of the OS and application software (not counting the actual user data), it's an enormous amount of work for the administrator(s) to defend and audit all of them against various forms of malicious (or accidental) modification. With an appliance, you don't have the scale problem; instead you have the simple fact that there probably is no administrator who ever actually logs in to the system and checks how Solaris is running and whether the system has been tampered with. Both problems could be solved by allowing the system and application software to be stored in an immutable form, either on the network (in the data center) or on physically write-protected media (in the appliance). By making all of the software physically immutable or easily verifiable (by keeping it on the network separate from any variable configuration data) we eliminate many of the risks. There are several components of Solaris, though, that currently modify data on the disk even though they don't really intend to make persistent changes to the system; these components need to be fixed (that's what /tmp and /var/run are for) or tricked (perhaps by transparently redirecting writes to a temporary RAM copy).
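To make "easily verifiable" a little more concrete, here is a minimal sketch in Python of auditing a software tree against a trusted manifest of checksums. It is only an illustration of the idea, not an existing Solaris tool: the function names (file_digest, build_manifest, audit) are hypothetical, and it assumes the manifest itself is kept somewhere the audited system cannot modify.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map every file under root (by path relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def audit(root, manifest):
    """Compare the tree at root with a trusted manifest.

    Returns (modified, missing, unexpected), each a sorted list of
    relative paths. Three empty lists mean the image is unchanged.
    """
    current = build_manifest(root)
    modified = sorted(p for p in manifest
                      if p in current and current[p] != manifest[p])
    missing = sorted(p for p in manifest if p not in current)
    unexpected = sorted(p for p in current if p not in manifest)
    return modified, missing, unexpected
```

Because the image contains no per-host configuration, one manifest could in principle serve every box that runs the same image, which is what makes the audit cheap at data center scale.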
That leaves the question of where you store configuration data -- the stuff that actually has to be stored persistently and per-host. This is what defines the "personality" for a system that is forced to run a generic immutable software image identical to many other boxes in the data center or to every other appliance of the same model sold by the hardware vendor. Solaris already has a good start at a configuration database stored separately from the OS implementation - the /etc directory. Ideally, on Solaris, all configuration data is stored somewhere under /etc and no file that forms part of the implementation of Solaris is ever modified by the administrator. There is enough variation and deviation from these rules, though, that it's not possible today to solve the problem simply by mounting / read-only and mounting /etc read-write. If we could either modify all software to conform to the rules, or create some technical mechanism for causing (only) the configuration files to be fetched from a special configuration database instead of from the immutable OS image, we could give each host a very small, writable file system that only contains the configuration data that has actually been modified for that host. In the data center this might be in the form of a bunch of tiny /etc file systems on an NFS server, or a small writable configuration partition on each system's disk next to the huge read-only boot partition. In the appliance, this might be just a few kilobytes or megabytes of NVRAM or flash memory.
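The lookup rule described above (use the per-host copy of a configuration file if one exists, otherwise fall back to the copy in the immutable image) can be sketched in a few lines of Python. This is a user-level illustration under my own assumptions, not how Solaris would implement it: the class name OverlayConfig and its methods are hypothetical, and a real mechanism would presumably live in the file system layer rather than in application code.

```python
import os


class OverlayConfig:
    """Resolve configuration files from a small per-host store,
    falling back to a shared read-only image."""

    def __init__(self, image_root, host_root):
        self.image_root = image_root  # immutable, identical on every box
        self.host_root = host_root    # tiny, writable, one per host

    def resolve(self, rel_path):
        """Return the path that should actually be opened for rel_path."""
        rel_path = rel_path.lstrip("/")
        host_path = os.path.join(self.host_root, rel_path)
        if os.path.exists(host_path):
            return host_path
        return os.path.join(self.image_root, rel_path)

    def read(self, rel_path):
        with open(self.resolve(rel_path), "r") as f:
            return f.read()

    def write(self, rel_path, text):
        """Writes always land in the per-host store, never in the image."""
        host_path = os.path.join(self.host_root, rel_path.lstrip("/"))
        os.makedirs(os.path.dirname(host_path), exist_ok=True)
        with open(host_path, "w") as f:
            f.write(text)
```

With this rule the per-host store holds only the files that have actually been changed on that host, so it stays small enough for a tiny NFS file system, a small disk partition, or a bit of flash.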
There are several other aspects of system management that have interesting overlaps between the large data center and appliance deployment styles and might have some common solutions based around the idea of an immutable system image and separate configuration database. These include:
- hardening and minimization (installing small-footprint subsets of the system software),
- patching and upgrade (replacing the software while keeping the configuration),
- dynamic provisioning and configuration (bringing a system online with configuration data generated on the fly by a central server),
- application migration and load balancing (moving an application and its data to a different physical or virtual machine based on varying conditions),
- polyinstantiation (creating a re-usable cookie-cutter prototype of an application and booting it on as many machines as needed, without "installing" anything), and
- diskless and "live CD" applications.
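As a rough illustration of the dynamic provisioning and polyinstantiation items in the list above, the following Python sketch shows a central server stamping out per-host configuration from a single prototype. Everything in it is an assumption made for the example: the PROTOTYPE table, the provision and provision_pool functions, and the choice of files are hypothetical and are not part of any existing tool.

```python
from string import Template

# A cookie-cutter prototype: each configuration file is a template whose
# placeholders are filled in per host. The file list is illustrative only.
PROTOTYPE = {
    "etc/nodename": Template("$hostname\n"),
    "etc/hosts": Template("127.0.0.1 localhost\n$address $hostname\n"),
}


def provision(prototype, hostname, address):
    """Generate the configuration files for one host from the prototype."""
    return {
        path: tmpl.substitute(hostname=hostname, address=address)
        for path, tmpl in prototype.items()
    }


def provision_pool(prototype, base_name, addresses):
    """Instantiate the prototype once per address: base_name1, base_name2, ..."""
    pool = {}
    for i, address in enumerate(addresses, start=1):
        hostname = f"{base_name}{i}"
        pool[hostname] = provision(prototype, hostname, address)
    return pool
```

For example, provision_pool(PROTOTYPE, "web", ["10.0.0.1", "10.0.0.2"]) yields configuration for two hosts named web1 and web2. Nothing is "installed" on either one; each boots the same immutable image and differs only in this small generated configuration.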
There are many technical problems to be solved to make these concepts work well, and I started to collect some ideas under the umbrella of a project called (for no particular reason) "Fanta". This project is not currently funded as a whole, but many of the problems have incidentally already been taken on by other projects and solutions are being implemented. For example, the Zones team has already taken several steps toward the goal of "transportable zones", and projects such as Schillix have already achieved the "live CD" capability.
I am attaching here a set of slides that I used a little over a year ago to describe some of the ideas in the "Fanta" project. Some of these may be of particular interest to the OpenSolaris appliances community. I'm not proposing creating a new OpenSolaris project at this time because I don't know how much time I'll be able to commit to work on any of these features, but if others are interested in working on anything here, I could allocate some of my time as well.