Wednesday Sep 28, 2005

Arrogance and approachability

Although I have been writing a lot lately—slides, draft policies and processes, and slews of email—none of it has managed to reach my blog.

I've been a Wall Street Journal subscriber since graduate school—skimming the paper in the morning is a habit I've maintained despite the hazard of having milk-sodden cereal hurled in my face by an unconsciously ignored child. Last week, Walter Mossberg, in his Personal Technology column entitled "Yahoo Email Delivers That Desktop Feel Most Users Expect" [$$], reviewed Yahoo's forthcoming web mail implementation, which is currently in beta test. The review is positive, but at the end Mr Mossberg shares what, for him, are some key constraints on software improvements. These constraints—which I think are overly severe—are very relevant to our work in Approachability, in which we're reexamining some of the steps required to configure and operate a Solaris system.

The major change in the Yahoo Mail beta appears to be its increased use of "AJAX" techniques, which push more logic into an advanced client (like Mozilla Firefox or any of the "big" browsers) and thus make the application feel more responsive. However, the final paragraphs of the column instead concern how Google Mail differs from Mossberg's assessment of typical mail interactions:

.... On several key issues, Google's engineers have decreed that familiar email practices are no longer useful, and have substituted approaches they prefer, arrogantly denying users any choice.

Gmail doesn't allow folders, only color-coded labels, as an organizing technique. It forces you to view all of your email in groups of related messages called "conversations," instead of viewing them individually as they arrive. Other email programs also allow such grouped views, but they permit users to choose. Not Gmail, where "option" is a term too rarely employed, except in reference to employee compensation.

This language is strong (and the closing sentence of the passage veers off into an ad hominem cornfield, since there was surely an internal user interface discussion behind this particular choice). The underlying request I believe is being expressed here is "justify your differences from convention". Now, technically speaking, the labelling mechanism (or "tagging") is a legitimate categorization scheme that some users find superior to the implicit single category provided by the folder mechanism. If you've ever wanted to file a message in two folders, you know what I mean. (It's also pretty reasonable to say we're living in the Golden Age of Tagging and perhaps should try to enjoy it...) And, if you use one label per mail, you've reproduced the folder model via convention. So, while different, there's no doubt that this interface offers a capability that some users will prefer.
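The relationship between the two models can be sketched on any filesystem: keep messages in a single store, and treat a "label" as an extra link to a message, so that one message can appear under several labels at once, while a folder model permits only one location. (The layout and names below are invented purely for illustration.)

```shell
# Hypothetical maildir-style layout: one message store, labels as links.
mkdir -p mail/messages mail/labels/work mail/labels/travel

printf 'itinerary...\n' > mail/messages/msg1

# Labels: the same message linked under two categories at once.
ln -s ../../messages/msg1 mail/labels/work/msg1
ln -s ../../messages/msg1 mail/labels/travel/msg1

# Reading through either label reaches the one underlying message;
# restricting yourself to one link per message reproduces folders.
cat mail/labels/work/msg1
```

The folder model, in other words, is the degenerate case of the label model, which is why the tagging interface is strictly more capable, if less conventional.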

The second objection, to the absence of an individual message view, I can't get excited about, but I suspect that the UI discussion about this aspect concerned the deliberate elimination of various views. That discussion might have involved consistency of presentation (frame and window counts constant) or increased responsiveness (document model-based inbox with dynamic population). Software engineering rationale, like the maintenance costs of additional features, may also have entered the discussion.

Mr Mossberg concludes by emphasizing his concern that Google will fail to meet certain users' needs for cultural reasons:

I'm sure Gmail will get better and better, and will eventually adopt the new programming techniques that allow desktop-like ease of use. But I'm not sure Google's arrogance will ever make room for user preferences on things like folders or ads, or how emails are grouped.

Yahoo's new email program would blow Gmail away if it were widely released today. That's partly due to its features, but also to its respect for user choice.

Again, strong words.

In Solaris Approachability, we're wrestling with similar issues: should we offer all envisionable configuration choices for each feature? Where's the sweet spot between too few choices and too many? We've been operating with a philosophy that the system should adjust itself dynamically, eliminating former tunables (or refusing to introduce new ones) in the interest of reduced management costs. The tradeoff is potentially a tax on performance or a loss in some notion of absolute flexibility.

The concern I have when I read a review like Mr Mossberg's is that he presents a critique of a software artifact without even a cursory mention of his applicable criteria, and attributes every failure to meet those criteria to Google's engineering approach. What are those criteria? Under what circumstances would those criteria be invalid?

We're planning to make changes to Solaris: typically, we will solicit feedback on the proposal itself, as well as through the development cycle. (Feedback is much easier to solicit with the growing OpenSolaris community.) If those changes risk violating your implicit criteria, let us know; if we make questioned changes even after that notice, it's due to our drawing different conclusions from the available data, and not the result of doctrinal arrogance.


Wednesday Apr 06, 2005

Cameras on Solaris

Dan noted a while ago that the libusb/libgphoto2 stack works well on Solaris 10 (and newer Express releases, of course). I've been using the gphoto2 CLI to manipulate my beaten-up Canon over USB, and then gthumb as my image browser/cataloguer. There are a few bugs here and there and some obvious enhancements needed, but the trajectory in this space is impressive. Since I've now got a "for-work-only-toy-camera", I'll keep experimenting with related applications on Solaris, and make any observations here.

You can get all of the mentioned software prebuilt for Solaris at Blastwave. A working libusb is available in /usr/sfw/lib.

Tie knot: Knot 7 (Half-Windsor) [reprise].

Wednesday Mar 16, 2005

String theory split

A few people mentioned to me an article in the San Francisco Chronicle describing the current support for, and skepticism of, string theory in the physics community. It's worth a read. I would have liked to see some comments about aspects of the theory—signs of the Higgs particle or supersymmetric structure—that should become testable at CERN's Large Hadron Collider in 2007. It's worth noting, since some of its opponents are quoted in the article, that we would already be exploring this energy regime at the Superconducting Super Collider, had it been ready by its revised completion date of 2003. (You didn't think this was a new argument, did you?)

Tie knot: Knot 12 (St Andrew).

Wednesday Jan 26, 2005

smf(5) design admissions

In a Usenet thread on comp.unix.solaris, and in an earlier series of comments on the "Comparing Linux to System VR4" story on Slashdot last week, some valid smf(5) criticisms have been raised. I thought I would reply to them here, and acknowledge their impact on some practices.

The strongest recent critic has been Tim Hogard, and I hope he doesn't mind if I collect his concerns to structure my response. Mr Hogard is a member of the class of exceptional administrators who have designed their systems' behaviours (by removing unneeded software and writing precise configurations for what remains, among other good practices). I've always known that this group of experts would be most directly impacted by the changes smf(5) brings, and have tried to listen carefully to their concerns and modify the implementation to accommodate them. (Such people exist within Sun and in the Beta and Express communities.)

"On my system the binary file contains the last time when the service started. That means a file in /etc gets written with every boot." There are actually two binary files that contain the set of information the facility acts on:

  • /etc/svc/repository.db, which is persistent and contains the service definitions, dependencies, and so on, and
  • /etc/svc/volatile/svc_nonpersist.db, which is non-persistent and contains service execution information, such as process IDs, contract IDs, and state transition times.
(/etc/svc/volatile is a tmpfs filesystem mounted by the kernel prior to startup being turned over to the progeny of init(1M). Its contents do not persist across OS instantiations.)

I believe the concern here is that /etc/svc/repository.db change detection cannot be managed using fingerprints. This is in fact a bug:

6221934 \*svc.configd\* must respect repository filesystem metadata
as the current implementation does an idempotent test transaction to verify writability (in addition to other integrity checks). In the meantime, one can compute a hash- or checksum-based fingerprint indirectly, using the output of the archive subcommand to svccfg(1M), which dumps the repository as an XML document. An example using cksum(1):
13 $ svccfg archive > /tmp/a
14 $ svcadm disable manifest-import
15 $ svccfg archive > /tmp/b
16 $ svcadm enable manifest-import
17 $ svccfg archive > /tmp/c
18 $ cksum /tmp/a /tmp/b /tmp/c
1649351107      205437  /tmp/a
2659093841      205438  /tmp/b
1649351107      205437  /tmp/c
19 $ # use diff to examine repository differences globally...
But we'll get that bug fixed shortly.

"I've found some very interesting things to do to the new system that can mess up a box in a way that the current tool set won't even let you see what is wrong. It appears to be a script kiddies back door dream." I am less certain I understood this point. There are no hidden services in the repository, although you can certainly corrupt the repository deliberately. (Separate backup repositories are made automatically at stages of boot; a recovery utility is provided in /lib/svc/bin.) Because each process is a member of a process contract owned ultimately by svc.startd(1M), you can use svcs -p if you trust the repository and ptree -c if you don't. (If you don't trust the kernel any longer, then we're onto intrusion containment and system reinstallation from trusted media.)

"And yes I'm saying complexity is bad and we should stick with what works (and it isn't DOS)." Unfortunately, what worked once no longer works given today's requirements. The Predictive Self-Healing initiative at Sun and equivalents elsewhere are responses to a change in availability requirements: both hardware and software components have the potential for failure, and the operating system needs to acknowledge this outcome and provide abstractions and capabilities to manage its implications. It's my belief that smf(5) introduces a minimal amount of complexity in exchange for new and meaningful descriptive objects and, in fact, the simplification of some previously difficult operations.

This reply hasn't explained why there's a transactional database (which, as an implementation detail, is currently SQLite), or why hierarchical restart is a different problem than parallel startup, or why we converted the current system to use the facility rather than providing only the framework. I'm happy to expand on any of these, or the points above.

Tuesday Sep 21, 2004

Had no effect, given my coloured lenses

The Register is displaying with a Sun Blue theme today.

Tuesday Jun 29, 2004

Assembling a collective viewpoint

Bryan fielded some questions at USENIX today, and in doing so augmented some of Andy's first comments on why we are pursuing an open source development model for Solaris.

Friday Jun 25, 2004

Why predictable?

I titled my blog "Predictable" more out of genuine interest in making systems more so than out of any particular cynicism about how things (any things) turn out. We have a pretty established language about desirable system attributes: reliability, availability, serviceability, and performance. Predictability is a courtier of all of these, but really doesn't dominate any of them--it's almost a separate quality.

We've been working on resource management for a while now, with an eye to allowing the construction of more predictable servers. Most of this work has been adding various mechanisms to the OS to prevent resource denial-of-service opportunities, or mechanisms that reserve or fairly schedule resources among competing services, such that the system can respond fairly smoothly until we reach some level of overcommitment. Now that we've got the basic mechanisms (and more are coming), we can start to examine how we can layer automation on top of them, and push out the boundary where true overcommitment occurs, without administrator intervention.

The current edition of Solaris Express contains the first of these features, dynamic resource pools, which you can use in combination with zones to have your consolidated systems smoothly react and reassign processors based on system load and relative importance. (You can also use the fair share scheduler with zones, if you don't need to reserve or cap some absolute amount of processing capability for each zone).
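For the curious, setting up such a configuration looks roughly like the following sketch. It is Solaris-specific, and the pool, processor set, and zone names ("web-pool", "web-pset", "webzone") are invented for illustration:

```shell
# Enable the resource pools facility.
pooladm -e

# Define a processor set the system may resize between 1 and 4 CPUs,
# and a pool using the fair share scheduler that draws from it.
poolcfg -c 'create pset web-pset (uint pset.min = 1; uint pset.max = 4)'
poolcfg -c 'create pool web-pool (string pool.scheduler = "FSS")'
poolcfg -c 'associate pool web-pool (pset web-pset)'

# Activate the edited configuration.
pooladm -c

# Bind a zone to the pool, so its workload competes only within
# that pool's processors.
zonecfg -z webzone set pool=web-pool
```

With pset.min and pset.max set apart like this, the system has room to shift processors between pools as load and importance dictate, which is the dynamic behaviour described above.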

We now have a system that can respond elastically across a wider range of load scenarios, without compromising some minimal expectations regarding quality of service with respect to one resource. Our solution, however, can handle multiple resources, and we'll see how that "predictability product" allows you to run your systems closer to maximal utilization in the future.

There are lots of problems in this space, connected with specific operating system resources (and the notion of a resource itself), how resources map from one layer in a stack to the next, how end-to-end scenarios play out within a participating host, ... But it's clear that predictability engineering is a piece of operating systems development--and, if you can get some time to think about it, it's a lot of fun.
