Tuesday Dec 15, 2009

Cloud Computing and Barriers to Entry

Part two of my early December trip took me to Shefayim, Israel, for the IGT 2009 World Cloud Computing Summit. I'll admit to being biased when visiting Israel for technology reasons, because so much in the country convolves the best of hard-charging technologists, the "pioneer spirit" (what Americans would call entrepreneurial efforts, but applied from agriculture to housing to immigration), and some seriously spicy food.

I'll start with the food, where I managed to eat what I could say in Hebrew: ice coffee, eggs, blintz, cheese, grapefruit, and chareef (spicy pepper sauce). This trip I upgraded to Breakfast 2.0 and distinguished chareef adom (red pepper harissa) from chareef yarok (green Scotch Bonnet pepper condiments, assured to damage your taste buds). I also know the Hebrew word for "horse" but fortunately it wasn't a breakfast option.

Best part of the show around the show was spending time with MBA students from Tel Aviv University who wanted to understand the implications of cloud computing for Israeli companies. They seeded me with questions: Do Israeli companies have any advantages in the market and would cloud computing make it harder for new companies to enter the infrastructure markets?

I based my answer on my first visit to Israel in the 1980s, when finding a working pay phone to call back to the States was an adventure in locating special-purpose tokens for the phones, finding a phone that felt like working, and then hoping that the time zones aligned when I had enough coins for the call. Almost two decades later, I visited Israel again: everyone had at least one cell phone and I had my choice of cellular carriers -- at the top of Masada, on the edge of the desert. I believe the lack of a built-out landline infrastructure stimulated the mobile uptake, and as a result the Israeli consumer is much more used to the cell phone as a data access endpoint. Creating software as a service or applications delivered over wireless networks is much easier when it's ingrained in the social fabric of the developers and the more seasoned managers.

While my interviewers were looking for me to present challenges for new companies entering the market, I described why I think cloud computing may reduce barriers to entry. Abstraction (through virtualization) hides implementation details, making it easier for cloud computing providers to change, upgrade or extend implementations without disrupting the services running on top of them. Have a better idea for a router, storage switch, VPC manager, or other device that would sit in the consumer-to-disk data path? Provided that its provisioning, operation and installation are hidden through the virtualization engines exposed to the user, you're only dealing with the provider's installed base, not the installed base of its users.

Finally, I got the required question about military experience and any benefits it might provide in cloud computing. My view of defense-related applications (whether using or building them) is that they have three strong requirements: security, reliability, and correctness (auditability, consistency, and known failure modes, for starters). Those tend to be the same issues raised as concerns around cloud computing, so I see something of a natural fit between in-country expertise and in-cloud demands.

Converging to "Good" Content

I've been trying to intersect a variety of conversations lately -- with a Wikipedia board member on the editing and document model for his work; with our SunSpace engineers on the value of community equity; and with two co-authors editing a book about WordPress (an interesting task given the existence of the WordPress Codex).

In each case, the key question is "How do we know we're converging or improving the quality of what we have?" This problem shows up in various ways, depending upon the editing context. SunSpace is our internal wiki for collecting technical expertise about our products, services and industry applications of them. It has all of the benefits of a wiki (ease of editing, multi-author editing, revision history) but also the downsides (frequent editing, no clear indication of whether the new version is better than the old, and the occasional keyboard error that displaces valuable content).

One of the biggest problems with a large wiki, with an even larger volume of rapidly changing content, is that it's hard to ascribe value to what's in it. We've tied a notion of community equity to SunSpace, giving equity kickers for creation, re-use, and participation. The last point is the creative one, because it encourages ranking, voting and manipulation of what's in the wiki. There's very little value in being a write-only memory; there's tremendous value in knowing what stimulated conversation, contention, and competition. What I like about community equity is that it captures the value of expertise, and rewards interactions, not just outputs.

A team meeting two weeks ago with the SunSpace engineers got me thinking about a long-standing discussion I'd had on the editorial and content management model for Wikipedia. In short: how do you know that the quality of an entry is improving? It's possible to tie a ranking engine like community equity to a Wikipedia entry, and use references (people who land on the page and read it) as well as voting (like/dislike buttons) to measure the surface area quality of the entry.

But now add a time element to it, and look at the overall equity of the entry as it undergoes revision and extension. Is it trending upward, in which case the crowdsourcing of the content is a valuable effort? Or is it gyrating, perhaps with a large dynamic range, indicating that successive edits are trending toward opinion and interpretation, and less toward facts or measurable, objective evidence? If enough people like or appreciate the net changes to an entry, then it's "good enough" even if it rubs the original entry author or subsequent editors the wrong way. I had this exact experience adding my own thoughts to the Wikipedia entry on Princeton's Colonial Club, where I felt capturing a bit of the 70s and 80s would flesh out a much more recent history. Seems like the page authors didn't agree with me, and my edits soon vanished. I'd prefer that the decision were made by the readers and consumers of the page, rather than an arbitrary editorial board. Of course, for content that triggers the whipsawing of public opinion, it's time to bring in the professional encyclopedic editors.
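The trending-versus-gyrating distinction can be sketched with a toy scoring function. Nothing here is the actual community equity algorithm -- `equity_trend`, its threshold, and the sample score histories are all invented for illustration:

```python
from statistics import mean, pstdev

def equity_trend(scores):
    """Classify a revision-by-revision equity history.

    scores: equity score after each successive revision of an entry.
    Returns ("improving" | "declining" | "contested", slope, volatility).
    """
    n = len(scores)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(scores)
    # Least-squares slope: is the entry trending up or down over time?
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores)) / \
            sum((x - x_bar) ** 2 for x in xs)
    volatility = pstdev(scores)
    if volatility > abs(slope) * n:     # big swings dominate any trend
        return "contested", slope, volatility
    return ("improving" if slope > 0 else "declining"), slope, volatility

print(equity_trend([10, 12, 15, 17, 21]))   # steady crowdsourced improvement
print(equity_trend([10, 30, 5, 28, 8]))     # edit-war gyration
```

A "contested" result is the signal that edits have drifted from facts toward opinion, and that it may be time for those professional editors.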

Wikis don't obviate the need for good content publishing and production processes, as we've learned with our SunSpace work, but they do give us a platform in which to build and measure equity in a broad sense.

Sunday Jun 28, 2009

Enemy Of State

Last Wednesday's opening session at the SIFMA technology management show covered three aspects of data center evolution in increasing order of abstraction: AMQP as a primary data management tool, the future of the NYSE data center as a virtual trading floor, and cloud computing (given by yours truly) as an incentive for building more reliable and scalable applications.

Carl Trieloff from Red Hat started literally down in the wires, talking about AMQP and how it might change the way we think about state persistence. Rather than worrying about the end points for state management, Trieloff argued that we should think about the messaging vehicles themselves as methods for ensuring that we don't create interoperability and persistence problems. I was reminded of Reuven Cohen's blog proclaiming XMPP as the new glue of the internet, supplanting HTTP, citing the use of XMPP in Google's Wave protocol as evidence. While I never believe a protocol spec serves as physical proof of a phase change in the matter of any system (SOAP-based web services, anyone? Bueller?), it is one more indicator that the way in which systems carry their state is becoming as critical as where the state is preserved, particularly if the state is short-lived (whether edits to a Google document or stock exchange order book information).
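The "state rides in the messages" idea can be illustrated without any real AMQP or XMPP plumbing. In this toy sketch (all names invented), the durable artifact is the ordered message log itself, and any consumer can rebuild current state -- an order book, say -- by replaying it:

```python
import json

class MessageLog:
    """Toy broker: the durable artifact is the ordered message stream."""
    def __init__(self):
        self.log = []                        # every message published, in order

    def publish(self, msg):
        self.log.append(json.dumps(msg))     # self-describing, serialized

    def replay(self):
        for raw in self.log:
            yield json.loads(raw)

def rebuild_order_book(log):
    """Any consumer reconstructs current state by replaying the stream."""
    book = {}
    for msg in log.replay():
        if msg["type"] == "order":
            book[msg["id"]] = msg["qty"]
        elif msg["type"] == "cancel":
            book.pop(msg["id"], None)
    return book

broker = MessageLog()
broker.publish({"type": "order", "id": "A1", "qty": 100})
broker.publish({"type": "order", "id": "A2", "qty": 250})
broker.publish({"type": "cancel", "id": "A1"})
print(rebuild_order_book(broker))            # only A2 remains
```

Nobody ever persisted "the order book" as such; the message stream *is* the state, which is exactly why how it's carried matters as much as where it lands.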

Stanley Young, CIO of NYSE Euronext, discussed the exchange's core messaging platform, built on the Wombat engine, since acquired by the NYSE. It's another example of messaging trumping structured data management, and it served as a foundation for Young's discussion of how future exchanges - emphasis on plural - will be built. He declared that the "data center is the new trading floor" and that nearly 80% of the NYSE Euronext future data center will be available for co-location and what is effectively a private hosted data center. He closed by stating that the NYSE's goal is to be able to spin up a new market in 24 hours: the listed instruments, settlement functions, and order management defined, deployed and connected to a critical mass of players that truly defines "capitalism". If you can't value it, trade it, and make it fungible, it's not capital. The NYSE has its eyes set on expanding, rather than contracting, the capital exchanges. It's an equally strong statement about the growing importance of application agility.

I got to speak after them but before the coffee break, which is a slightly better slot than the "after lunch nap hour". While going through an update on cloud computing use cases for test and development and space/time adjunct scaling, as well as thoughts on building private clouds, I emphasized how cloud computing is making us rethink reliability. You can't build a cluster out of what you can't physically configure - unless you do it in software.

Application reliability has historically been about recovering state after a failure. With a virtualization layer intermediating the application and underlying hardware, tried and true clustering methods no longer make sense. Rather than keeping in-memory state we should be encapsulating it (hence the emphasis on REST); similarly we should be putting applications in more explicit control of their replication of data and memory instances. This doesn't mean that persisted state goes away -- databases, table oriented stores (BigTable, SimpleDB), and replicated file object systems (Mogile, HDFS) are going to increase in use, not decrease. But each of those components has explicit control of replication and failure recovery, rather than relying on clustering at the hardware level to do it implicitly.
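A minimal sketch of what "explicit control of replication" means at the application layer. The replica count and quorum sizes here are illustrative parameters, not any particular store's API:

```python
import random

class ReplicatedStore:
    """Application-level replication: the app, not a hardware cluster,
    decides how many copies exist and how many must answer a read."""
    def __init__(self, replicas=3, write_quorum=2, read_quorum=2):
        self.nodes = [{} for _ in range(replicas)]   # stand-in storage nodes
        self.w, self.r = write_quorum, read_quorum

    def put(self, key, value, version):
        acks = 0
        for node in self.nodes:
            node[key] = (version, value)   # a real store could fail here
            acks += 1
        if acks < self.w:
            raise IOError("write quorum not met")

    def get(self, key):
        # Ask only read_quorum replicas; resolve conflicts by version.
        answers = [n[key] for n in random.sample(self.nodes, self.r)
                   if key in n]
        if not answers:
            raise KeyError(key)
        return max(answers)[1]             # newest version wins

store = ReplicatedStore()
store.put("balance", 42, version=1)
store.put("balance", 99, version=2)
print(store.get("balance"))                # 99
```

Failure recovery becomes a read-repair or re-replication step the application performs deliberately, rather than an implicit property of a hardware cluster.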

Thursday May 28, 2009

Unsolved Developer Mysteries

I love when customers play "stump the geek" and ask really insightful, serious questions. It's partly what makes being a systems engineer at Sun challenging and fun (and yes, I consider myself an SE within my own group, but I'll pass on the is-a has-a polymorphism jokes, thank you). Yesterday's question scored an 8 for style and 9 for terseness (usually a difficult combination to execute):

"What are the top developer problems we haven't run into yet?" I gave an answer in three parts.

1. Unstructured data management and non-POSIX semantics. Increasingly, data reliability is taking the shape of replication handled by a data management layer, using RESTful syntax to store, update, and delete items with explicit redundancy control. If you're thinking of moving an application into a storage cloud, you're going to run into this. Applications thriving on read()/write() syntax are wonderful when you have a highly reliable POSIX environment against which to run them. And no, don't quote me as saying POSIX filesystem clusters are dead - the Sun Storage 7310C is an existence proof to the contrary. Filesystems we loved as kids are going to be around as adults, and probably with the longevity of the mainframe and COBOL: they'll either engineer or survive the heat death of the universe. There is an increasing trend, however, toward WebDAV, Mogile, SimpleDB, HDFS and other data management systems that sit between the application and the block level. New platforms, not at the expense of old ones.
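To make the contrast with read()/write() concrete, here's a toy object store with whole-object verbs and a per-object redundancy knob. The interface is invented for illustration, loosely in the spirit of S3 or WebDAV, not a real API:

```python
class ObjectStore:
    """Whole-object verbs (PUT/GET/DELETE) instead of POSIX byte offsets.
    'replicas' is an explicit, per-object knob rather than a property
    of the underlying filesystem."""
    def __init__(self):
        self.objects = {}

    def put(self, bucket, key, data, replicas=2):
        self.objects[(bucket, key)] = {"data": data, "replicas": replicas}

    def get(self, bucket, key):
        return self.objects[(bucket, key)]["data"]

    def delete(self, bucket, key):
        del self.objects[(bucket, key)]

s3ish = ObjectStore()
s3ish.put("photos", "masada.jpg", b"\x89...", replicas=3)
print(s3ish.get("photos", "masada.jpg"))
# Note what's missing: no seek(), no write-at-offset. Updating means
# PUTting a whole new object -- the semantic gap POSIX apps trip over.
```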

2. Software reliability trumps hardware replacement. An application analog to the first point. Historically, we've used high availability clusters, RAID disk configurations and redundant networks to remove single points of failure, and relied on an active/active or active/passive application cluster to fail users from one node over to a healthier one. But what if the applications are highly distributed, recognize failure, and simply restart a task or request as needed, routing around failure? IP networks work (quite well) in that sense. It requires writing applications that package up their state, so that the recovery phase doesn't involve recreating, copying or otherwise precipitating state information on the new application target system. There's a reason REST is popular - the ST stands for "state transfer". And yes, this worked really well for NFS for a long time. Can I get an "idempotent" from the crowd?
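The idempotent-retry pattern can be sketched in a few lines. The payment example and request-ID scheme are hypothetical, but they show why resending a request is safe rather than a duplicate side effect:

```python
class PaymentService:
    """Server side: idempotent handling keyed by a client-chosen request ID,
    so a retried request returns the original answer, not a second charge."""
    def __init__(self):
        self.seen = {}          # request_id -> result of first execution
        self.charges = []

    def charge(self, request_id, amount):
        if request_id in self.seen:        # replayed request: same answer
            return self.seen[request_id]
        self.charges.append(amount)        # the side effect happens once
        self.seen[request_id] = f"charged {amount}"
        return self.seen[request_id]

def charge_with_retry(service, request_id, amount, attempts=3):
    """Client side: on a timeout or failure, just resend the same ID."""
    for _ in range(attempts):
        try:
            return service.charge(request_id, amount)
        except IOError:                    # a real client would catch timeouts
            continue
    raise IOError("gave up")

svc = PaymentService()
charge_with_retry(svc, "req-7", 25)
charge_with_retry(svc, "req-7", 25)        # a retry, not a second charge
print(len(svc.charges))                    # 1
```

Routing around failure stops being scary once every request can be safely replayed -- the NFS lesson, restated for distributed applications.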

3. Parallelism. If you weren't bound by a single thread, what would you waste, pre-compute, or do in parallel? This isn't about parallelizing loops or using multi-threaded libraries; it's about analyzing large-scale compute tasks to determine what tasks could be partitioned and done in parallel. I call this "lemma computing" -- in more pure mathematics, a lemma is a partial result that you assume true; someone spent a lot of time figuring out the lemma so that you can leverage the intermediate proof point. When you have a surfeit of threads in a single processor, you need to consider what sidebar computation can be done with those threads that will speed up the eventual result bound by single-thread performance. This isn't the way we "think" computer science; we either think single threaded or multiple copies of the same single thread.
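A sketch of "lemma computing" with spare threads: precompute every branch you *might* need while the sequential logic decides which one it actually wants. The branch function and decision rule here are stand-ins for a real workload:

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(branch, x):
    # Stand-in for a costly sub-computation -- the "lemma"
    return sum(i * x for i in range(branch * 1000))

def speculative(x):
    """With surplus threads, start every candidate lemma in parallel,
    then keep only the one the single-threaded decision picks."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {b: pool.submit(expensive, b, x) for b in (1, 2, 3)}
        chosen = 2 if x % 2 == 0 else 3    # the sequential decision
        return futures[chosen].result()    # its lemma is (likely) already done

print(speculative(4) == expensive(2, 4))   # True: same answer, overlapped work
```

Two of the three computations are deliberately "wasted" -- which is exactly the mental shift: with threads in surplus, waste becomes a latency optimization.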

That was my somewhat top of mind list, based partly on the talk I gave at CloudSlam 09 which will be updated for SIFMA in New York later this month.

Thursday May 07, 2009

Cloud Computing Round Table Video

Sys-Con.TV has posted the video of our "Cloud Computing Round Table" held on March 29 in Times Square. It was a fun exchange with a lot of sharp dialogue and discussion about reliability, application fit and function, and whether or not amazon.com is going to eat your data center. Of course, having amazon.com CTO Werner Vogels on the panel made that discussion topical and lively.

Saturday Nov 01, 2008

Pecha Kucha: My Internet Life

At Sun's recent internal nerd fest, I participated in my first pecha kucha night. Having once failed miserably at play-by-play basketball broadcasting, I didn't think any other form of public speaking could be as challenging. As Yoda would say, wrong I was. The 20 seconds per slide cadence eschews anecdotes, extemporaneous thoughts, or anything less than a highly collimated focus. There are play stoppages in basketball but the slide timer has no wait states.

But that doesn't mean I can't fix it in post-production. Here is the tabular version of "My Internet Life" pecha kucha in less than a dozen frames.

In 1995 I started looking at how the Internet, then defined by AOL dial up and email, might change the types of relationships I built. I scribbled this Gartner-esque 2x2 grid for mapping relationship types, spanning those based on facts to those surrounding or stimulating emotion (think work versus religion although those lines are blurring); those that are purely personal versus those that are community driven.

I'll upset the magic quadrant aficionados by starting there. It's what Carol Cone called the "ribbonization" of America. If you're not about something, representing, excited, incensed, pitching, shilling or voting, you aren't engaged. According to NASA's Gen Y rocket riders, everything you know about causes is wrong. Not to quote bumper stickers, but if you're not outraged, you're not paying attention.

The Talking Heads' David Byrne sang the attributes of facts, including among them being lazy, late, simple, straight, and not doing what he wants them to (It's in the bridge of Crosseyed and Painless). Technical facts have a time value problem: they become worthless quickly. If your relationships are built on hoarding information, then they decay in value over time. Knowing something in isolation is useless.

Facts and emotions meet on Facebook, and dance in the form of short status updates. Create a group for just about any cause you can invent, and within three weeks your friends will forget what it's about. Probably because it's not about anything -- there's no output.

Output provides a segue to work: fact based and somewhat community oriented (unless you're the resident "doesn't play well with others" type). The classical view of work product defined by a company is rapidly being replaced by work product represented by an open source community. It has value, community, facts, economics, and usually a technical cause attached to it.

When you attach emotions to communities, they become long lasting institutions. Add in special clothes, chants, and traditions, and you have either higher education or religion, and some would argue there's little difference. Institutions forge remarkably strong, life-long connections; someone once told me that "Princeton" is like a second surname.

Challenges to these long-lived institutions arise from the same revisiting of intellectual property rights that fuel the growth of open source communities. We can use intellectual property to define the boundaries of today's institutions, or use it as a foundation for developing new ideas, broadening participation, and encouraging the ideas of others. That's true whether it's Disney, Princeton, Sun Microsystems or the religion of your choice.

Proof that you can remix media and mediums: web comics are a growth industry. No matter how miserable the economy, everyone wants a good laugh on a regular basis. And if the comic remixes pop internet culture then it's rich on multiple levels.

Networked relationships have given rise to network memes. Our relationships that span the personal and community spectrum give us a sense of identity and worth: we know who we are and how we provide value to a group; those that run from facts to emotions provide a sense of belonging. And they intersect in an oblique pop culture reference.

Being in sales, I have to end with an ask: Do you grok the snowclone? If not, or worse, you stopped parsing at "grok," it's time to upgrade your science fiction canon from Heinlein to Doctorow. It may not make you or your causes more attractive on Facebook, but it's likely to help us understand how our companies and their work products matter in the networked market.

Thursday Sep 18, 2008

Intractability and Incomprehensibility

I attended the (sometimes semi) annual Princeton University Computer Science department affiliates seminar this week, and got to hear a variety of short talks on topics ranging from data management in computational biology to how students infer trust in search results. Professor Andrew Appel opened the day with some statistics about the department, including a graph showing that the enrollment in CS degree programs is on the rise again, after a huge wave that lagged the .com boom and bust cycles by about a semester. My caffeine-aided interpolation of his chart was that computer science is rebounding off of a decade-long lull in attractiveness. While the spike in 1999-2000 was an effect of the market, this could well be a leading indicator that computer science is once again interesting. Appel put a nice twist on the data and his overview of the research programs, adding that "having intractable problems is not a bug, it's a feature, and computer science actually needs them - otherwise things like cryptography don't work." I'd never really considered the benefit of finding things that you know can't be solved through normally scalable methods, although I'll admit to typing as many things as bin-packing problems as I can to put a point on their complexity.

The short talk I enjoyed the most, however, wasn't a research result but instead a summary of results from a freshman seminar led by one of my former (and favorite) professors, Andrea LaPaugh. Her summary of the incoming students' views and vectors of information consumption was startling: most students trusted in institutions (clearly none of them have been served with RIAA suits); they all believed Google "intervened" in search results (for which counterproof exists), demonstrating a conceptual commingling of sponsored links, ad words, search ranking and keyword search; they seemed somewhat flippant about their privacy (some even believing that the "government sees everything they type"); and overall, they brought to bear little knowledge of how collections of information are presented.

I was a bit surprised by the results, but I also try to understand how each generation of users sees the social context of technology. Those of us in the late boomer era were shaped by television; we learned to be skeptical of the news, Madison Avenue, and the government; today's Gen Y users are perhaps not skeptical enough of the exogenous forces shaping their information flow. LaPaugh's food for thought sent me off to the lunch break with two wildly different ideas: first, Marshall McLuhan was right and the medium is the message, especially when we convey copious trust to the medium. The second, significantly less politically correct thought, was that maybe cane spree isn't such a bad mechanism for pwning freshmen before they experience the equivalent online.

Monday Aug 11, 2008

Innovation Framework

This essay/rant has evolved over nearly nine months, but was finally brought to a close after I re-used the material in an interview last month. Genesis was one of our sales executives asking for help in preparing for a customer meeting to discuss Sun's "innovation framework." I found the question fascinating, because I never think of having a process for invention or disruption; it just kind of happens. However, from a corporate strategy point of view, there are arms lengths of books published on creating new business models in the name of spurring innovation. If you can convert innovation into dead trees, there's a framework in there somewhere.

Our discussion turned into a cataloging of the mechanisms for identifying disruption and leveraging the corresponding change in market size (or market existence -- like that for pre-fab salad), operating costs, distribution costs, or costs of goods sold.

First and best-known entrant in the art of projecting disruption is scenario planning, popularized by Peter Schwartz and now institutionalized through the Global Business Network . Scenario planning combines market and behavior analysis; you start with a big pile of possible disruptions or exogenous events, and boil them down into two ideally orthogonal forces that will shape the business under evaluation. You then look at the four combinations of extreme situations of those forces, and build a narrative describing what the world will look like. Scenarios aren't about right or wrong, or optimizing for one ideal outcome; the future is typically a melange of more than one scenario as the identified contributing factors shape the market with differing degrees of force. What you're planning for isn't a singular future but the disruptions that shape that future.

Schwartz's claim to fame is that his scenarios prepared Royal Dutch Shell for disruptions in the oil market in the mid-70s. What's bad for some parts of the market (consumers) can be good for others (producers). I participated in one full-blown scenario planning exercise around the future of silver halide versus digital photographs. The market driving forces shaping our narrative were picture taking versus storytelling and standalone devices versus networked devices. Consider the mid-90s timeframe and these were fair questions, and our scenarios painted some interesting potential investment areas and disruptions to the market. The narratives we dictated back to management contained some "stay the course" advice; if digital photography never attained the quality of analog film, or if networking remained the domain of low reliability modems and AOL, existing franchises around film processing, rapid print making, and analog image science were safe.

What we missed, however, was that "storytelling" didn't mean using photographs for a storyboard or digital scrapbook. One of our scenarios was called "Personal Spielberg," describing a desktop digital editing and composition application. We had the ideas right, but the players wrong; it's not the digital photography players who built iMovie; it was Apple. And the demand for digital home studio output exploded with YouTube. We didn't know what a Long Tail was, and therefore we weren't looking for one. On a more practical and personal scale, though, storytelling circa mid-2008 is abundantly clear if you join Facebook; your photo albums show another face to your life, with or without captions. Add some book lists and favorite music, share with your friends, and perhaps the Moody Blues were right. Our scenarios were nicely constructed but completely missed the dominance of social networking as a continually evolving story, with pictures as the color commentary.

Scenario planning is much more of a strategy tool than an innovation tool, because it builds on known and projected constraints and asks "What would you do if?" type questions. It doesn't push the boundaries of using technology to change the strength or force of those constraints.

More recently, Kim and Mauborgne at the Harvard Business School have promoted the Blue Ocean model for innovation. The basic premise is that "red ocean" markets are those already in existence, with each player growing as a function of overall market growth and through taking market share from each other. The blue ocean strategy focuses on finding new, non-consumptive markets based on the relative value of product or service features demanded by consumers in those markets. The disruptions come from combining the most valued features in non-intuitive ways to create a new market. Bagged lettuce combines the produce aisle and the convenience of the prepared foods aisle in the supermarket; nobody knew it would be a billion dollar business. One of my favorites (indicated by waistline) is Pret-A-Manger, the UK based sandwich shop that combines the speed of ordering lunch in a fast food outlet with the fresh ingredients and healthier eating choices of a local deli. The name itself is a play on pret-a-porter, the notion of high-end clothing (or food) ready to consume without the time intervention of a tailor or the guy slicing turkey one sandwich at a time.

Blue ocean strategies overlap in two ways with the digital world. First, open source software is a key entry point to non-consumptive markets. The best way to get someone who has never used your software to try it, evaluate it, or take an interest is to remove all barriers to entry. The analysts (and occasional customers) who ask me "How will Sun make money by giving things away?" miss the fact that "giving things away" is a blue ocean strategy that expands markets, while "making money" is a red ocean tactic to compete and take share in those newly entered fields of play. The second dip of the network endpoint in the blue ocean is the use of blue ocean strategic thinking to define new, small markets and identify the attributes that drive consumers to value them. It's Chris Anderson's Long Tail as seen by an MBA, not a web site developer. And I have to give Anderson credit for the most recent, and possibly most powerful, Long Tail model for describing innovation as the confluence of more products, better, lower-cost distribution, and a transition from mass-produced hits to niche-consumed special interests.

Disclaimer: the closest I've ever been to an MBA was going to a professional wrestling event at the old Boston Garden with two friends from Harvard Business School and a guy who used to call himself The Divine Bruce Yam (it involves Eliot Spitzer, so we'll stop there). That won't stop me from formulating a theory and giving examples, though.

If I had to pick one thing that's been at the heart of Sun's culture of innovation for 25 years, it's been the insistence that everything be networked, and assuming that the density of connectedness is monotonically increasing. If you take our vision of "everyone and everything connected to the network" (or, I could argue, "a network" where there may be multiple, sometimes disjoint meshes), then getting in front of the disruption wagon means looking at the set of constraints facing your business, and relaxing them to the point where you'd do things differently. That spurs innovation in strategy, products, services, and market mechanisms. Best example I can think of: When Jeff Bezos realized that ordering a book from an online catalog was independent from the source of the catalog, and therefore you could relax the constraint of equating "catalog" and "retail book store inventory." As soon as you could order any book in print, amazon.com had disrupted the scale of online retailing.

So what are the constraints you can relax, spurring the need to think about markets or products in innovative ways?

Time. Time can be bent in non-relativistic ways by focusing on real-time as a customer service or data access attribute. How long does it take to get to the piece of data that you need to make a decision, refute a claim, or answer a customer question? The answer isn't always about writing neat SQL scripts or having an in-house search engine, because they are bound by the meta data (or lack thereof) that enables those result sorting mechanics. One part of the standard time-space trade-off is to optimize for available space (for example, making a large data set memory resident to avoid paging); however, space constraints benefit from Moore's Law while time constraints do not. Adding tags to data, building indices based on context, and aggregating data based on user input and feedback drives down the time constraint. "Real time" also refers to the latency limited world; if you aren't thinking about solving these problems within the attention span of the average click-driven user, someone else will.
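The space-for-time trade can be shown in miniature: a linear scan whose cost grows with the corpus, versus an inverted tag index paid for up front. The record layout here is invented for illustration:

```python
from collections import defaultdict

# Linear scan: time-to-answer grows with the size of the corpus.
def find_slow(records, tag):
    return [r["id"] for r in records if tag in r["tags"]]

# Spend space up front (an inverted index) so lookups are one dict hit.
def build_index(records):
    index = defaultdict(list)
    for r in records:
        for t in r["tags"]:
            index[t].append(r["id"])
    return index

records = [
    {"id": 1, "tags": ["claim", "urgent"]},
    {"id": 2, "tags": ["invoice"]},
    {"id": 3, "tags": ["claim"]},
]
index = build_index(records)
print(index["claim"])                  # [1, 3]
print(find_slow(records, "claim"))     # same answer, O(n) every time
```

The index costs memory that Moore's Law keeps making cheaper, to buy back time that it doesn't.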

Space. Not only the classic counterpart to time optimization, relaxing a space constraint also means "removing assumptions about the solution space." Example: Amazon's Mechanical Turk, a model for "crowd sourcing" work across a much larger pool of talent. Just as amazon.com flattened the book selling space by making the entire books in print catalog available, any innovation that broadens the input space (what can be worked on) or the transform space (who can do the work) is going to drive a space disruption in the market. Almost all of these space disruptions rely on networking technologies to match the flattened input and transform spaces with each other, be it crowd sourcing or the Hadoop/MapReduce model of moving computation to storage instead of the conventional reverse approach.
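A toy rendering of the move-computation-to-storage idea: each shard's map step runs "locally" next to its data, and only small partial counts cross the (imaginary) network to the reduce step. This is a sketch of the MapReduce shape, not Hadoop's API:

```python
from collections import Counter
from functools import reduce

# Each "storage node" holds a shard of the data set.
shards = [
    ["cloud", "state", "cloud"],
    ["state", "rest"],
    ["cloud"],
]

def map_phase(shard):
    # Runs where the shard lives; only the small Counter travels.
    return Counter(shard)

def reduce_phase(partials):
    # Merge the partial results into global totals.
    return reduce(lambda a, b: a + b, partials)

totals = reduce_phase(map_phase(s) for s in shards)
print(totals["cloud"])   # 3
```

Shipping a few counters instead of the raw shards is the space disruption: the transform space moves to wherever the input space already sits.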

Developers or Contributors. Who are the developers for your applications? Your own IT department, Facebook developers who may engage with affiliated user communities, open source developers whose work products you consume, or commercial software companies' employees? Or some combination of all of them? The "new" definition of developer includes content as well as application developer; user generated content in training, virtual worlds, and support has become de rigueur. Taking advantage of a larger pool of developers and contributors is only possible if you relax some of the classic constraints enforced around rights to use. Recently I heard Philip Rosedale (founder of Linden Lab, creators of Second Life) talk about building customer premises Second Life worlds; Linden Lab gives away an edge of the network (and the rights they'd normally assert to be the ones building that world) knowing that users will want to populate those new boundary territories with walls, furniture, props, and other items purchased in the 2L economy. Relaxing the constraint about intellectual property distribution creates a new market player, and by extension, adds developers to the 2L network of economies.

Relaxing a constraint often leads to a surfeit of a resource formerly considered a rate-limiting factor. My introduction to this surplus-economy thinking happened in 1985, in the days when a "network connection" meant you were on CSnet and could use a soldering iron, when I was on the staff of the Massive Memory Machine project. At the time, what counted as "massive" was less than you get in a single DIMM today, but challenging "conventional wisdom" about time-space trade-offs continues to drive innovation in computing.

Tuesday Jul 01, 2008

Happy New Year

Happy New Year to my fellow Sun employees. While most of the world aligns to the Gregorian calendar and maps major events into the January to December timeframe, Sun operates on a July 1 fiscal year, putting it in an equivalence class with two of my other favorite things: the NHL and Princeton University.

For all things financial -- sales attainment, spending, goal progress, or annual giving contributions -- those of us who roll on July 1 reset the counters to zero today. It's a bit auspicious, but it's also exciting because the new year also brings new strategies, new tactics, and new challenges. The NHL free agency season is always a time (for me) of thinking strategically: who do my beloved NJ Devils need, in what role? What missing ingredient will make them hungry, hard-hitting, and perhaps even more prolific goal scorers? It's a clean slate for general managers, coaches and marketing organizations. That sense of building a team and refocused energies on the next season's goals is precisely what permeates the next few weeks at Sun.

The Princeton University July 1 fiscal year never really mattered to me until this year: as of midnight last night, I'm a member of the 25th Reunion Class, the semi-official "parent class" of this year's annual giving campaign, punctuated with what is typically the largest post-graduation gathering of classmates in June. It's another sign that I'm officially an adult, but it's also refreshing. I began thinking about a variety of 25th anniversaries: the first NJ Devils game that I attended was in 1983 (I sat with my cousins under the scoreboard and we heard the non-stop click-click of the relays turning the scoreboard bulbs on and off for three hours) and my son won his first NJ state ice hockey tournament medal on the 25th anniversary of the Miracle on Ice. Both involve hockey, but both also involve putting some element of perspective on events -- I now go to games with my son, and rather than cheering for Chico Resch in the Devils' net, we cheer for him in the Devils' broadcast booth to the left of our seats. And by seeing the parallels, we have another bit of history to share when another parent class comes around 25 years hence.

I'm looking forward to renewing old friendships, to meeting my classmates' kids, and to participating in my class' capital campaign. I'm not the guy who calls and asks for large sums of money (too close to the day job); I will be leading a "participation team" whose goal is to get classmates to give at any level, just to show support and connect back to the university. I explain my motivation for this work by re-telling a story I've rarely dusted off. 27 years ago, while attempting to complete the freshman physical education requirement, I decided to sign up for "athletic conditioning" not realizing it was a euphemism for "spring football camp." The first eight weeks weren't too bad, but the first day of actual "conditioning" involved running, up-downs, more running, rolls, sprints, more running, more up-downs, and somewhere along the way I think my left lung decided to go on strike. There was no actual blocking involved, or my insides would have liquefied. What I remember vividly was Billy M, a guy I vaguely knew from our dorm and a class, telling me "point your head up, breathe in hard through your nose, blow out through your mouth." I cannot vouch for the medical authority of this aerobic guidance, but it worked. I've used that breathing trick when I'm exercising (rare), stressed out (less rare) or need to focus (frequently). Each time I do, I think of Billy M putting an arm around me so that I wasn't trampled by 300 pound offensive tackles, and I'm thankful that even though I was never on his team, he considered me enough of a teammate in some context -- classmate, fellow wheezer, survivor of multiple papers on modern European authors -- to offer advice.

Good teams and good teammates can overcome even the obstacles posed by an asthmatic nerd, without anybody getting hurt. Happy New Year, Billy M.

Sunday Mar 02, 2008

Flirtin' With Disaster

I spoke on a panel at a Marcus-Evans conference on business continuity and disaster recovery and found myself in the position of converging three themes: business continuity, security, and virtualization. Of course, I had my eyes (mostly) open while speaking, although it appears I was out of focus for much of the session so this shot of me escaping the bull by the horns has to suffice, or at least distract from my weak Molly Hatchet references (speakers on if you click on the link).

Historically, data center management saw disaster recovery and business continuity as reactions to physical events: force majeure of nature, building inaccessibility, threats or acts of terror, or major infrastructure failures like network outages or power blackouts. Increased stress on the power grid and near-continuous construction in major cities increase the risks of the last two, but they are still somewhat contained, physical events that prompt physical reactions: spin up the redundant data center, fail over the services, and get up and running again, ideally without having missed shipments, deliveries or customer interactions.

Business continuity today has those physical events as table stakes only. The larger, more difficult problems are network service failure (due to denial of service attacks or failure of a dependent service), geographic restriction (due to pandemic fears, public transportation failures, or risk management), data disclosure and privacy, and the overall effect on brand and customer retention. What if you can't get people into an office, or have to react to an application failure that results in customer, partner or supposedly anonymous visitor information being disclosed? Welcome to this decade's disasters.

Where does virtualization fit in? Quite well, actually. Virtualization is a form of abstraction; it "hides" what's under the layer addressed by the operating system (in the case of a hypervisor) or the language virtual machine (in the case of an interpreted language). But it's critical that virtualization be used as a tool to truly drive location and network transparency, not just spin up more operating system copies. I never worry about the actual data center containing my mail server, because I only see it through an IMAP abstraction. It could move, failover, or even switch servers, operating systems and IMAP implementations, and I'd never know. Virtualization gains in importance for business continuity because it drives the discussion of abstraction: what services are seen by what users, where and how on which networks?
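The IMAP point above is really about coding to an abstraction rather than to a location. Here is a minimal Python sketch of that idea; the class names and the "primary"/"failover" sites are hypothetical stand-ins, and a real deployment would of course speak actual IMAP rather than return canned strings.

```python
# Location transparency via an abstraction layer: the caller binds to the
# protocol, never to the server behind it. All names here are illustrative.
class MailStore:
    def fetch(self, msg_id):
        raise NotImplementedError

class PrimaryIMAPStore(MailStore):
    def fetch(self, msg_id):
        return f"message {msg_id} from primary data center"

class FailoverIMAPStore(MailStore):
    def fetch(self, msg_id):
        return f"message {msg_id} from failover data center"

def read_mail(store: MailStore, msg_id: int) -> str:
    # the business continuity hinge: any MailStore satisfies this caller,
    # so a failover is a substitution, not a rewrite
    return store.fetch(msg_id)

print(read_mail(PrimaryIMAPStore(), 42))
print(read_mail(FailoverIMAPStore(), 42))  # same caller, different site
```

Because `read_mail` never names a data center, the service can move, fail over, or change implementations without the user-facing code noticing, which is the transparency the paragraph argues virtualization should deliver.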

Bottom line: Business continuity planning shares several common themes with systemic security design. There's self-preservation, the notion that a system should survive after one or more failures, even if they are coincident or nested. The least privilege design philosophy ensures that each process, administrator or user is given the minimum span of control to perform a task; in security this limits root privileges while in BC planning it ensures that you don't give conflicting directions regarding alternate data center locations. Compartmentalization drives isolation of systems that may fail in dependent ways and helps prevent cascading failures, while proportionality guides investment into areas where there is perceived risk. The short form of proportionality is to not spend money on rapid recovery from risks that would have other, far-reaching effects on your business anyway. My co-author Evan Marcus used to joke that it was silly to build a data center recovery plan for a potential Godzilla attack, because if that happens we have other, larger issues to deal with. On the other hand, if you saw Cloverfield, there's a lot of infrastructure that people depend upon even when monsters are eating Manhattan.

The best planning is to write out a narrative of what would happen should your business continuity plan go into effect: script out the disaster or event that causes your company to act, write up press releases, decision making scenarios, and some plausible risk-adjusted actions, and follow the actions out to their conclusion. If you don't have a prescribed meeting place for a building evacuation, and there's no system for employees to check in and validate their safety, then your business continuity plan may suffer when you have to scramble to find a critical employee. When disasters happen, the entire electronic and physical infrastructures are unusually stressed, and normal chains of communication break down. Without a narrative to put issues into perspective, your disaster planning document becomes a write-only memory: it holds little interest for, and never gets enough inspection from, key stakeholders and contributors. Start naming names, and putting brand, product and individual risks into black and white, and you'll see how your carbon-based networks hold up when the fiber and copper ones are under duress.

Monday Jan 28, 2008

Princeterns Part II

I was looking forward to spending the day with a pair of one-day interns because I find that I always learn something from students. It's one of the reasons I love doing alumni interviews, and why I aim to attend Sun collaborative research events or meetings. Yesterday's goals: expose my interns to a full range of engineering roles and customer types, give them a sense of various career paths, and respect the fact that they'd finished exams in the last 48 hours. Here's a core sampling of the topics covered in and around the Big Apple:

Common thread in customer meetings: Importance of diagnostics, inspection and analysis. I didn't script the customer discussions but each one touched on DTrace, truss, packet sniffers and the need to analyze networked systems end to end.

What's different in the real world?: My answer to this one has been consistent for more than 20 years, namely the time scale and operational complexity of real-world applications. College projects run for no more than 10-12 weeks, typically completed within 10-12 days. Large-scale software projects run from months to years, in multiple phases, and rather than engineering the whole enchilada you typically have visibility into a small window of code and functionality. One of our customer visits touched on the need to continue to "feel like a startup" and move quickly, even as the company matures and has code aged in years, not weeks. Operational complexity isn't just about putting projects into deployment; it's about making sure the code works for all inputs, not just the ones needed to demonstrate required output. Emphasis on security, reliability, performance and the measurement points to assert those features increases after graduation once code has a life beyond the final exam.

Participation in open source projects could change this. Many of the software lifecycle engineering issues listed above show up if you have even tangential involvement with an open source software project. We (at Sun) have tended to aim our open source efforts at attracting campus developers, and we should be putting equal emphasis on the collaborative development model that inculcates long-term software perspectives.

When and how do you move into management? One thread of this discussion focused on the non-engineering disciplines that develop leadership, including dealing with different styles, cultures and approaches, managing conflict and setting and sharing a vision. The other thread was much more Sun-culture-centric, and was counter to some of the student perceptions of career ladders converging in management layers. Sun's engineering path goes from individual contributor through a "staff" level, typically recognizing the key technical contributors to a product, up to "principal" and "distinguished" levels, for director-sized influence and leadership or public appellation of outstanding engineering output, and finally the "fellow" title given to engineers who have VP-sized roles without VP-sized organizations reporting to them. Bottom line: there are many facets to leadership other than budgets and people management, and it's important to ensure that all of those facets reflect light along different career paths. What I learned from this discussion was that Sun should be thinking about recent college graduates for non-engineering roles as well, in business development, strategy, and operations, because their perspectives are both valuable and different from our own.

What are your [student] impressions of Sun? I wasn't ready for this answer -- that we're seen as a software company, particularly in light of the MySQL deal. The bad news is that hardware visibility is linked to visible logos (Princeton has a population of Sun systems, but they're not on desktops); the good news is that increasing our software footprint and distribution mass means that there's greater awareness of what we do as a company. As a sidebar, both students had incredible knowledge of copyright and DRM issues, due to concerns about file sharing and content downloads. Perhaps Cory Doctorow is right and copyright criminal awareness is pervasive.

Did I feel old? Twice. One of our customers mentioned Feynman in terms of making problems accessible to non-experts, but the name didn't register. Feynman died before either student was born, which scared me: our cultural references are being compressed into ever-shorter timescales as a by-product of rapid cultural dissemination. We have too many things in context, and therefore context switch more quickly. The other oldness came from one student asking me if I was familiar with Guitar Hero. Even without teenaged kids and game consoles, the gaming culture and its influence on social networks can't be ignored in technical circles. Students don't see gaming as a "separate thing". I mentioned that the historical view of computing was of a place you went to, not a thing you did, and certainly not something seamless in your everyday use of devices; what I extracted from this simple question and its follow-on was that gaming is as seamless to students as cell phone use is to their parents.

What would I do differently? You're tempted to say "nothing" but that's a bogus answer when you've had twenty years to think about the question. I would have focused more on writing and trying to find my voice as a writer; much of the writing I did was in response to questions posed by instructors and professors, and not something I wanted to inscribe. While I probably wouldn't have been accepted into a writing seminar with John McPhee or E.L. Doctorow (no relation to Cory) at the time, I should have tried, which would have required that I had the interest. It was Tim O'Reilly who helped me discover that I liked writing, and he did it by marrying my love of explaining things to the craft of putting ideas into words.

By the literal end of the day, I think I learned something from every exchange and question, and I'm looking forward to this program becoming more formal in future years.

Sunday Jan 27, 2008

Princeterns Part I

Part I: Dunkin' Donuts on the Jersey side, 25 degrees and o'dark hundred outside. Today I'm hosting two Princeton University undergrads in a very short-term internship program designed to give students an overall feel for the dynamic range of real-world applications of whatever they're learning, whether Comparative Literature or Computer Science. I wanted to offer to support a Comparative Literature major, and talk about blogging, news reporting, and how the literary canon reflects the social norms of its times, but I think I'm going to get just as much info from the interns as they will from meeting with fellow Sun employees and customers and getting the nickel tour of midtown NYC.

Time to beat some of the Lincoln Tunnel traffic.

Wednesday Jan 23, 2008

Information Leakage Protection

I've decided to start capturing various snippets of talks, interviews and other random thoughts as much as possible, both to provide greater insight into the kinds of problems our customers are asking about as well as to stimulate some discussion about those problems.

Yesterday morning Sun co-sponsored a seminar on Information Leakage Protection along with our partners BatBlue and Reconnex and speakers from a major investment bank and a media company. It was my job to set the tone for the morning, somewhere between projecting an imminent crisis and treating this as a theoretical exercise.

The "classical" view of information leakage protection (ILP), or data leakage protection (DLP), is that you want to keep your data safe in your databases, prevent emails from trickling out to the wrong sources or from being intercepted if they contain sensitive data, and avoid the theft of laptops, PDAs and desktops filled with confidential information. The "networked" view is that we have dozens of transmission vectors that provide partial information, and with enough compute power or time to join this data to other publicly accessible sources, we run the risk of second and third order information leakage. I started by paraphrasing a study done at the University of Washington that looked at the Nike+iPod RFID transmitter as a personal data leak. On the surface, it's not a big deal if your sneakers broadcast their serial number such that a sub-$100 sensor can track physical location. But marry that to secondary sources of data -- students in a class, security camera video, DHCP logs (that reveal MAC addresses which may be familiar to you) and you can construct a crude mapping of people to those IDs in the literal sneaker net.

My guidance for thinking about ILP was to think in four layers: (1) the persistence mechanisms used, including filesystem crypto, encrypted tapes, tape handling, and backup security; (2) applications, both purchased and developed, and their persistence of data, logging, transfer of data and identity management; (3) services consumed, where the application may reside on the other side of a network and users convey a variety of identification and data to the services and (4) the devices we use to access all of the above. Determining how to best seal the leaks requires a combination of detection and prevention tools (mechanism) with clearly communicated rules for data and information handling (policy). Several of the speakers highlighted personal webmail (Gmail, Yahoo, HotMail, cable providers) accounts as a major source of information leakage; while the companies in question had protected their in-house mail servers, users could still send attachments using mail tunneled through https.
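The detection half of that mechanism-plus-policy combination can be sketched as a toy outbound-content scanner. Everything here is an illustrative assumption: the patterns, the names, and the two-pattern "policy." Real DLP products layer fingerprinting, classification, and policy engines on top of anything this simple, and naive regexes generate plenty of false positives.

```python
import re

# Toy outbound-content filter: flag text that looks like it carries
# sensitive data before it leaves the perimeter. Patterns are illustrative.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}

def scan(text):
    # return the sorted names of every pattern category found in the text
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))

print(scan("quarterly results attached"))       # []
print(scan("employee SSN is 078-05-1120"))      # ['ssn']
print(scan("card 4111-1111-1111-1111 on file")) # ['credit_card']
```

A filter like this would sit at the layer-2/layer-3 boundary described above, inspecting application output and webmail uploads alike; the hard part is the policy (what to block, quarantine, or merely log), not the pattern match.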

Bottom line is that the adage I learned in college radio still holds true: be careful what you broadcast, because someone may be listening.

Tuesday Oct 16, 2007

The Eiffel Tower, Digital Divide and CEC 2007

We've often used buildings as examples of "good" and "bad" architecture. They always have to fit within a set of constraints -- street boundaries, zoning laws, public infrastructure bandwidth and until very recently in Philadelphia, the top of Ben Franklin's hat. At the same time, the buildings have to be functional, aesthetically pleasing, and part of the overall urban plan. There are lots of parallels between stacking floors and building software stacks.

At CEC 2007 last week, I couldn't help but pick up on this theme again. With some amazing camera work and editing by Seeley Roebuck, we've produced a CEC video about the Eiffel Tower and the Digital Divide. The Eiffel Tower was, and is, a great piece of engineering, not just in its design but in how it was constructed. It continues to sit not only in the center of Paris but in the center of controversy as well, most recently over the assertion of copyright (it's in the video, trust me).

But is the Eiffel Tower in Vegas real? It's half-size, it's fairly accurate (if you ignore the slot machines around the footings), and there is a francophone air if you listen above the street noise. In the opening sequence of CEC, the narrator said that things can be "real, or virtually real" as actors gave the illusion of moving in and out of a Second Life animation on-screen. If you're using the virtual to build awareness of the real, and to drive common context, it doesn't matter. What does it have to do with the digital divide? Watch the video and the virtual me in front of the virtualized Eiffel Tower will attempt to close the loop.

Monday Oct 08, 2007

CEC Opening

Now I'm going to attempt to take a page from the hockey bloggers at 2 Man Advantage and do a live play-by-play commentary from CEC.

9:10 am. Just got off the stage from our geekly intro. Glad that our little "audition video" for the opening sequence got a few laughs. Disappointed that my Second Life avatar has a bigger chin and is overweight and under-tall. I'm going to be the first person to try to Photoshop himself in Second Life. We're now officially 6 minutes late, with my rambling contributing 2 minutes of the lag.

9:30 am. Jane, our ace show flow coordinator (nerd herder) leaned over and told me that our second speaker for the morning session has just arrived. We're not supposed to have crapshoots getting to Vegas, but this one worked out.

9:45 am. Marc Tremblay really was disappointed he wasn't asked to be in the opening sequence (in addition to being a patenting monster, he was a world-class gymnast and can still do things in person that some of us can't even animate). He just gave a killer talk about CMT, and for the first time at a big Sun event, talked about transactional memory as a way of making parallelism significantly easier and faster.


Hal Stern's thoughts on software, services, cloud computing, security, privacy, and data management

