Friday Feb 20, 2009

The Intercloud

There is a well-founded skeptical question as to whether "cloud computing" is just the 2008 re-labeling of "grid",
"utility" and "network computing". Google's Urs Hoelzle pokes at that a bit with a search trends chart he shows during talks (copied here).

Urs Hoelzle Google Search Trends

I have a whole string of answers as to what is different now, starting from the predictability and growth of broadband networks, to the arrival of usable and broadly available compute virtualization. But it seems to me that the most fundamental shift is that X-as-a-service is now approachable by most developers. This is especially true for Infrastructure-as-a-Service (IaaS), the ability to provision raw (virtual) machines. The credit here is squarely with Amazon Web Services; they more than anyone made getting a network service up and running (relatively) easy by going after the grokable abstraction of rentable VMs with BYOS (Bring Your Own Stack).

Certainly the lively debate over the most productive level of abstraction will continue. There are lots of compelling arm-chair reasons why BYOS/IaaS is too primitive an environment to expect most developers to write to. So there are a bunch of Platform-as-a-Service efforts (Azure, Joyent Accelerators, Google App Engine, Sun Project Caroline, to name but a few!) that are all after that Nirvana of tools, scripting languages, core services, and composite service assembly that will win over the mass of developers. The expectation of all PaaS efforts is that new Software-as-a-Service applications will be written on top of them (and thus the abstraction layer cake of SaaS on PaaS on IaaS).

No matter how the hearts and minds of developers are won over by various PaaS efforts, there is little doubt in my mind that BYOS/IaaS will be a basic and enduring abstraction. It's the "narrow waist" of agreement (the binary contract of a stripped-down server, essentially) that we see succeed in other domains (TCP/IP in networking, (i)SCSI in storage). Of course, higher level abstractions do get layered on top of these, but diversity blooms here, just like it does below the waist in physical implementations.

Productive and in-production are different concepts, however. And as much as AWS seems to have found the lowest common denominator on the former with IaaS, how at-scale production will actually unfold will be a watershed for the computing industry.

Getting deployed and in production raises an incredible array of concerns that the developer doesn't see. The best analogy here is to operating systems; basic sub-systems like scheduling, virtual memory and network and storage stacks are secondary concerns to most developers, but are primary to the operator/deployer whose job it is to keep the system running well at a predictable level of service.

Now layer on top of this information security, user/service identity, accounting and audit, and then do this for hundreds or thousands of applications simultaneously and you begin to see why it isn't so easy. You also begin to see why people get twitchy about the who, where, and how of their computing plant. You also see why we are so incredibly excited to have the Q-Layer team as part of the Sun family.

Make no mistake, I have no doubt that cloud (nee network, grid) computing will become the organizing principle for public and private infrastructure. The production question is what the balance will be.  Which cloud approach will ultimately win?  Will it be big public utility-like things, or more purpose-built private enterprise ones?

The answer: yes. There will be no more of a consolidation to a single cloud than there is to a single network.

[And, yes, I know I've said that there will be only about five gigantic public clouds. I still think that is correct, but, as also suggested by that post, it will look a lot like the energy business, with dozens, or hundreds, of national and regional companies.]

The Internet, after all, is a "network of networks": a commons of internetworking protocols that dominate precisely because they get the benefit of Metcalfe-esque network effects across a federation of both public and private (intranet) investments. The congruent concept is the "Intercloud", a term that Cisco has been popularizing recently (see this nice post by James Urquhart). The "Intercloud" is similarly a "cloud of clouds". Both public and private versions (intraclouds) not only co-exist, but interrelate. Intraclouds (private clouds) will exist for the same reasons that intranets do: for security and predictability (read: SLAs, QoS).

[Back to the energy industry analogy; there are regional and national oil & gas companies in addition to the majors for similar reasons].

How internet and intranet protocols relate is instructive. Foundationally, they are identical. But things like local namespaces and firewalls let an owner make strong assertions (we hope!) about whose traffic gets carried and how.

Of course, clouds will interrelate through networking protocols themselves, and voila, you get the Intercloud (I think this is the Cisco sense of the term). But we have the collective opportunity to create something much more interesting and vital. We should tear out a page from the internet playbook and work towards an open set of interoperable standards and all contribute to a software commons of open implementations. Particularly important are standards for virtual machine representation (e.g. OVF), data in/exgest (e.g. WebDAV), code ingest and provisioning (e.g. Eucalyptus), distributed/parallel data access (pNFS, MogileFS, HDFS), orchestration and messaging (OpenESB, ActiveMQ), accounting, and identity/security (SAML2, OpenID, OpenSSO).

The trick will be to get (collectively) to the right set that lets everyone go off and confidently employ or deploy cloud services without fear that they are ending up on some proprietary island. Really important is that we keep it simple; there will be all kinds of temptation to over-engineer and try to end up with the uber-platform. Back to the internet analogy. It's not just the narrow-waist agreement on TCP/IP, but that agreement together with a reasonably small collection of others alongside and layered on top, such as DNS, BGP, and HTTP, that has led to the layered richness, freedom of choice, and competition in implementation that we all enjoy today. Joyent CEO David Young makes an interesting start at this at the platform layer with his Nine Features of an Ideal PaaS.

We should expect and work towards nothing less in cloud computing. You certainly have our dedication to that.

[Thanks for the kick in the pants, Jason]

Wednesday Jan 16, 2008

The Three Most Important Applications

Around 1995, I gave a series of academic talks that tried to capture what I had learned at Sun during my first couple of years away from teaching at M.I.T. My biggest lesson was that, in the world of enterprise computing, there were three applications that really mattered: Databases, Big Databases, and Really Big Databases. I actually went way out on a limb predicting that, by 2000, we'd likely see terabyte-sized production databases (imagine that!).

The punch line was how databases were shaping key aspects of server and storage systems design at Sun: large memory, lots of I/O and memory bandwidth, RAS, symmetric multiprocessing and, of course, an operating system (Solaris) that could grok it all. We ended up creating systems that were naturally very well-suited for running, well, really big databases from the likes of DB2, Informix, Oracle, and Sybase. We also worked very closely with all of these folks to continually tune performance and bolster availability.

Good for us at the time, a bunch of people found that many of these system design values --- especially around memory bandwidth and I/O --- made great Web 1.0 machines, too.

A decade later, databases matter even more. They are to storage what application containers are to computing. That isn't to minimize the importance of file systems --- those are the foundational storage abstractions, just as threads and processes are to application containers like Apache and Glassfish. Databases have continued to be a primary influence over big swaths of our systems design (and so has high performance computing). The overall system center is now scaled-out network assemblies of web/application and database tiers.

In the contemporary web era, not only have the enterprise databases grown in force (I'd rightfully add SQL Server to the list today) but open source databases (OSDBs) have come into their own: MySQL, PostgreSQL and Derby (to name but a few). These have wonderful affinity with the modern application containers, especially PHP and Java. And, indeed, MySQL has become foundational to the web, the M in LAMP.

And guess what? We've been targeting big swaths of our $2B R&D budget to engineer systems that run these workloads really well, too. The exciting part for thousands of engineers at Sun is that now we  get to rub shoulders with the great engineers at MySQL. We are champing at the bit to optimize and scale systems in a myriad of ways: from microelectronics to memory systems to storage to kernel engineering. In the magic transparency of open source, these optimizations will lift all boats.

And that is the truly exciting part. We now get to openly develop a new wave of very deep innovation in hardware and software systems. Ones that will continue to move customers' capital toward those who keep truly adding value, rather than adding to switching costs.

A big open embrace to everyone at MySQL and welcome to the Sun family. This is going to be fun!


Thursday Sep 13, 2007

Why Microsoft Matters

You might imagine that being the technical executive sponsor for Microsoft at Sun would be one of those "challenging" roles, but it also has been a rewarding one (especially working with the likes of Bill Gates and Craig Mundie). The biggest challenges have been in the areas of bridging cultures and business models and, of course, in building trust between two companies that have been and continue to be  (at times, aggressively) competitive.

But at the core, we are both engineering-centric, products-offered companies where everything flows from a long-term, management-dedicated investment in R&D. Tens of thousands of really good engineers, most working on multi-year event horizons.

Microsoft matters because R&D matters.

And from my vantage point, it's been good to see the return in perception of the importance of R&D and resulting innovation in the marketplace. Just look at the rise of Apple, VMware and Google: at the core of all three are great engineers and designers building market-differentiated products. It's also good to see an ebb in the post-bubble conventional wisdom that the only thing that matters is driving cost into the dirt. As if all of the problems in computing have been solved, and it's all about cost of production --- be it hardware or software. As if...

And that brings me back to our relationship with Microsoft. Our mantra has been "product interop", because at the end of the day, that's what our mutual customers care about. Pragmatically, we will both continue to innovate in our own ways, and continue to strive for differentiated products in the marketplace. And those products, pretty much up and down the stack, are and will be different.

Those differences are precisely the points of value and frustration for our customers. Value from choice, focus and the always heightened pace of innovation that comes from competition. Frustration from what I call "gratuitous incompatibilities": those places where our product stacks touch one another, but don't work well together. Places where we have left problems to be solved as an Exercise for the End-User.

These touch-points have been things such as identity, web services protocols, storage, and systems management. Adding to this list are touch points around the hardware platform itself, especially virtualization.

We've been making a lot of progress on these, and if both Microsoft and Sun matter to you, I'd encourage you to check out our resources and capabilities.




