Tuesday Oct 20, 2009

Perfectly Closed

To my peers in the industry, I ask you to carefully think through your positions on the new set of “open internet” rules the FCC is proposing.  The rules would prevent Internet companies from selectively blocking or slowing certain Web content and would require providers to disclose how they manage their networks in an effort to promote transparency and true "openness." 

The FCC has said the rules will allow ISPs to manage their networks to ensure smooth performance during peak traffic times. Internet service providers are concerned that the rules would apply only to them and not to other Web companies. Manufacturers are worried that the rules would hamper their ability to find new ways to manage Internet traffic.

Certainly, I have a visceral reaction to most attempts to impose policies on markets, especially around the internet, which has seemed to prosper well on its own set of organic rules, and with blessedly little regulation. The capital cycles and innovation rates are enormous; and they are, of course, related to the prospect of making money through market differentiation.

But I don’t see the proposed rules as a set of regulations, any more than a constitution is.  

They are, in my view, a codification of a set of fundamental principles around openness. A central question is, given we have done this well and gone this far, why do we need them now? I have a few answers to this, but mostly it’s because along with our technology advances in delivering networking and network services, we’ve also improved the ability to unilaterally, and thus dangerously, control it.

In many cases, nearly perfect unilateral control. Cryptography, in particular, lets us get really good at controlling content and devices. We can, if we choose, dictate the applications that can and can’t run on a device (a phone, say). We can also, instantly and en masse, revoke the content residing in the network (on an ebook, say). And we will certainly get better and better at extending these controls on a packet-by-packet basis, deciding which packet gets to go where, and perhaps rewriting it in the process. Perfect interlock is possible between devices, networks and services.

These certainly have not been the principles by which so much collective innovation has taken place on, and under, the internet. The vibrancy of where we are today is because we have been free from unilateral control points. With the web, in particular, anyone can challenge an incumbent with their idea of a new service by simply creating and publishing it, without getting any prior permission --- app review, network service connections, or otherwise. Cloud computing lowers the barriers even further. But it is very easy to imagine that any new control point held by a single company or entity could become a land mine to innovation.

Why new rules? We should actively and collectively set our principles around openness. It’s in our mutual self-interest to codify them. I say “mutual self-interest” for a bunch of reasons. At an abstract level, the internet is one big value-add network effect. Wherever you are along the food chain, you benefit from the value of the internet to society always increasing. Certainly for new web services, keeping the barriers as low as possible maximizes their creation. Similarly, getting more bandwidth, reliably and cheaply, to customers will increase the ability to consume these new services. And new devices benefit from more ubiquitous broadband and tons of services. We all have to work to keep this cycle open.

At a purely pragmatic level, the shoe can very quickly go onto another foot. You might see open principles as somehow restrictive to your particular business decisions today, but I assure you that tomorrow a new perfect control point somewhere else will have you wanting to have some rules by which we all play.

And finally, there is enormous mutual self-interest in getting out in front on these principles. The FCC has promised that the analysis will be data-driven and fact-based. Let's have the conversation, figure the principles out collectively, and work to get the right policies in place. Maybe the rules being proposed aren't exactly right, but to me a blanket response of "we don't need any rules" will come back to haunt us all. Unilateral control is getting really good, and quickly.

So to my peers, let’s hang together on this.  Let's be progressive and write down the principles we should continue to shape the network around.

Or we shall most assuredly hang separately.

Friday Feb 20, 2009

The Intercloud


There is a well-founded skeptical question as to whether "cloud computing" is just the 2008 re-labeling of "grid", "utility" and "network computing". Google's Urs Hoelzle pokes at that a bit with a search trends chart he shows during talks (copied here).

[Figure: Urs Hoelzle's Google search trends chart]

I have a whole string of answers as to what is different now, starting from the predictability and growth of broadband networks, to the arrival of usable and broadly available compute virtualization. But it seems to me that the most fundamental shift is that X-as-a-service is now approachable by most developers. This is especially true for Infrastructure-as-a-Service (IaaS), the ability to provision raw (virtual) machines. The credit here is squarely with Amazon Web Services; they more than anyone made getting a network service up and running (relatively) easy by going after the grokable abstraction of rentable VMs with BYOS (Bring Your Own Stack).

Certainly the lively debate over the most productive level of abstraction will continue. There are lots of compelling arm-chair reasons why BYOS/IaaS is too primitive an environment to expect most developers to write to. So there are a bunch of Platform-as-a-Service efforts (Azure, Joyent Accelerators, Google App Engine, Sun Project Caroline, to name just a few!) that are all after that Nirvana of tools, scripting languages, core services, and composite service assembly that will win over the mass of developers. The expectation of all PaaS efforts is that new Software-as-a-Service applications will be written on top of them (and thus the abstraction layer cake of SaaS on PaaS on IaaS).

No matter how the hearts and minds of developers are won over by the various PaaS platform-layer efforts, there is little doubt in my mind that BYOS/IaaS will be a basic and enduring layer. It's the "narrow waist" of agreement (the binary contract of a stripped-down server, essentially) that we have seen be so successful in other domains (TCP/IP in networking, (i)SCSI in storage). Of course, higher level abstractions do get layered on top of these, but diversity blooms here, just like it does below the waist in physical implementations.

Productive and in-production are different concepts, however. And as much as AWS seems to have found the lowest common denominator on the former with IaaS, how at-scale production will actually unfold will be a watershed for the computing industry.

Getting deployed and in production raises an incredible array of concerns that the developer doesn't see. The best analogy here is to operating systems; basic sub-systems like scheduling, virtual memory and network and storage stacks are secondary concerns to most developers, but are primary to the operator/deployer whose job it is to keep the system running well at a predictable level of service.

Now layer on top of this information security, user/service identity, accounting and audit, and then do this for hundreds or thousands of applications simultaneously and you begin to see why it isn't so easy. You also begin to see why people get twitchy about the who, where, and how of their computing plant. You also see why we are so incredibly excited to have the Q-Layer team as part of the Sun family.

Make no mistake, I have no doubt that cloud (nee network, grid) computing will become the organizing principle for public and private infrastructure. The production question is what the balance will be.  Which cloud approach will ultimately win?  Will it be big public utility-like things, or more purpose-built private enterprise ones?

The answer: yes. There will be no more of a consolidation to a single cloud than there is to a single network.

[And, yes, I know I've said that there will be only about five gigantic public clouds. I still think that is correct, but also, as suggested by that post, it will look a lot like the energy business, with dozens, or hundreds, of national and regional companies.]


The Internet, after all, is a "network of networks": a commons of internetworking protocols that dominate precisely because they get the benefit of Metcalfe-esque network effects across a federation of both public and private (intranet) investments. The congruent concept is the "Intercloud", a term that Cisco has been popularizing recently (see this nice post by James Urquhart). The "Intercloud" is similarly a "cloud of clouds". Both public and private versions (intraclouds) not only co-exist, but interrelate. Intraclouds (private clouds) will exist for the same reasons that intranets do: for security and predictability (read: SLAs, QoS).

[Back to the energy industry analogy; there are regional and national oil & gas companies in addition to the majors for similar reasons].

How internet and intranet protocols relate is instructive. Foundationally, they are identical. But things like local namespaces and firewalls let an owner make strong assertions (we hope!) about whose traffic gets carried, and how.

Of course, clouds will interrelate through networking protocols themselves, and voila, you get the Intercloud (I think this is the Cisco sense of the term). But we have the collective opportunity to create something much more interesting and vital. We should tear a page from the internet playbook, work towards an open set of interoperable standards, and all contribute to a software commons of open implementations. Particularly important are standards for virtual machine representation (e.g. OVF), data ingest/egress (e.g. WebDAV), code ingest and provisioning (e.g. Eucalyptus), distributed/parallel data access (pNFS, MogileFS, HDFS), orchestration and messaging (OpenESB, ActiveMQ), accounting, and identity/security (SAML2, OpenID, OpenSSO).
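
To make the idea concrete, here is a toy Python sketch (the names are hypothetical and no real cloud API is used); it only illustrates the point that if every cloud, public or private, implements the same small provisioning contract, higher layers can federate across them, for example by bursting from an intracloud to a public one.

    # A toy sketch (hypothetical names only; no real cloud APIs) of the
    # "narrow waist" idea: a small provisioning contract that public and
    # private clouds could both implement.
    from dataclasses import dataclass, field
    from typing import Protocol
    import itertools

    @dataclass
    class VMImage:
        name: str
        ovf_url: str  # a portable description of the VM, e.g. an OVF package

    class Cloud(Protocol):
        """The narrow waist: the few verbs every cloud agrees on."""
        def provision(self, image: VMImage, count: int) -> list[str]: ...
        def terminate(self, instance_id: str) -> None: ...

    @dataclass
    class ToyCloud:
        """A stand-in for any IaaS provider, public or private."""
        region: str
        _ids: itertools.count = field(default_factory=itertools.count)
        running: dict = field(default_factory=dict)

        def provision(self, image: VMImage, count: int) -> list[str]:
            ids = [f"{self.region}-{next(self._ids)}" for _ in range(count)]
            for i in ids:
                self.running[i] = image
            return ids

        def terminate(self, instance_id: str) -> None:
            self.running.pop(instance_id, None)

    def burst(image: VMImage, count: int, private: Cloud, public: Cloud,
              private_capacity: int) -> list[str]:
        """Keep what fits on the intracloud; overflow to the public cloud."""
        inside = min(count, private_capacity)
        return (private.provision(image, inside) +
                public.provision(image, count - inside))

    img = VMImage("webtier", "http://example.org/webtier.ovf")
    print(burst(img, 12, ToyCloud("intra"), ToyCloud("public"), private_capacity=8))

The design choice mirrors TCP/IP: keep the shared contract tiny and boring, and let the interesting differences live above and below it.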


The trick will be to get (collectively) to the right set that lets everyone go off and confidently employ or deploy cloud services without fear that they are ending up on some proprietary island. Really important is that we keep it simple; there will be all kinds of temptation to over-engineer and try to end up with the uber-platform. Back to the internet analogy: it's not just the narrow-waist agreement on TCP/IP, but a reasonably small collection of other protocols alongside and layered on top of it, such as DNS, BGP, and HTTP, that has led to the layered richness, freedom of choice, and competition in implementation that we all enjoy today. Joyent CEO David Young makes an interesting start at this at the platform layer with his Nine Features of an Ideal PaaS.

We should expect and work towards nothing less in cloud computing. You certainly have our dedication to that.

[Thanks for the kick in the pants, Jason]

Wednesday Jan 16, 2008

The Three Most Important Applications

Around 1995, I gave a series of academic talks that tried to capture what I had learned at Sun during my first couple of years away from teaching at M.I.T. My biggest lesson was that, in the world of enterprise computing, there were three applications that really mattered: Databases, Big Databases, and Really Big Databases. I actually went way out on a limb predicting that, by 2000, we'd likely see terabyte-sized production databases (imagine that!).

The punch line was how databases were shaping key aspects of server and storage systems design at Sun: large memory, lots of I/O and memory bandwidth, RAS, symmetric multiprocessing and, of course, an operating system (Solaris) that could grok it all. We ended up creating systems that were naturally very well-suited for running, well, really big databases from the likes of DB2, Informix, Oracle, and Sybase. We also worked very closely with all of these folks to continually tune performance and bolster availability.

Good for us at the time: a bunch of people found that many of these system design values --- especially around memory bandwidth and I/O --- made great Web 1.0 machines, too.

A decade later, databases matter even more. They are to storage what application containers are to computing. That isn't to minimize the importance of file systems --- those are the foundational storage abstractions, just as threads and processes are to application containers like Apache and Glassfish. Databases have continued to exert a primary influence over big swaths of our systems design (and so has high performance computing). The center of the overall system is now scaled-out network assemblies of web/application and database tiers.

In the contemporary web era, not only have the enterprise databases grown in force (I'd rightfully add SQL Server to the list today) but open source databases (OSDBs) have come into their own: MySQL, PostgreSQL and Derby (to name but a few). These have wonderful affinity with the modern application containers, especially PHP and Java. And, indeed, MySQL has become foundational to the web, the M in LAMP.

And guess what? We've been targeting big swaths of our $2B R&D budget to engineer systems that run these workloads really well, too. The exciting part for thousands of engineers at Sun is that now we  get to rub shoulders with the great engineers at MySQL. We are champing at the bit to optimize and scale systems in a myriad of ways: from microelectronics to memory systems to storage to kernel engineering. In the magic transparency of open source, these optimizations will lift all boats.

And that is the truly exciting part. We now get to openly develop a new wave of very deep innovation in hardware and software systems, innovation that will keep moving customers' capital toward those who sustain truly adding value, rather than toward those who merely add to switching costs.

 A big open embrace to everyone at MySQL and welcome to the Sun family. This is going to be fun!


 


Thursday Sep 13, 2007

Why Microsoft Matters

You might imagine that being the technical executive sponsor for Microsoft at Sun would be one of those "challenging" roles, but it also has been a rewarding one (especially working with the likes of Bill Gates and Craig Mundie). The biggest challenges have been in the areas of bridging cultures and business models and, of course, in building trust between two companies that have been and continue to be  (at times, aggressively) competitive.

But at the core, we are both engineering-centric, products-offered companies where everything flows from a long-term, management-dedicated investment in R&D. Tens of thousands of really good engineers, most working on multi-year event horizons.

Microsoft matters because R&D matters.

And from my vantage point, it's been good to see the return in perception of the importance of R&D and the resulting innovation in the marketplace. Just look at the rise of Apple, VMware and Google: at the core of all three are great engineers and designers building market-differentiated products. It's also good to see an ebb in the post-bubble conventional wisdom that the only thing that matters is driving cost into the dirt. As if all of the problems in computing have been solved, and it's all about cost of production --- be it hardware or software. As if...

And that brings me back to our relationship with Microsoft. Our mantra has been "product interop", because at the end of the day, that's what our mutual customers care about. Pragmatically, we will both continue to innovate in our own ways, and continue to strive for differentiated products in the marketplace. And those products, pretty much up and down the stack, are and will be different.

Those differences are precisely the points of value and frustration for our customers. Value from choice, focus and the always heightened pace of innovation that comes from competition. Frustration from what I call "gratuitous incompatibilities": those places where our product stacks touch one another, but don't work well together. Places where we have left problems to be solved as an Exercise for the End-User.

These touch-points have been things such as identity, web services protocols, storage, and systems management. Adding to this list are touch points around the hardware platform itself, especially virtualization.

We've been making a lot of progress on these, and if both Microsoft and Sun matter to you, I'd encourage you to check out our resources and capabilities.

 

Sunday Sep 09, 2007

A Word (or Two) on Redshift

I've received a very interesting array of comments from the Information Week redshift story, but nothing quite rivals a slashdot spanking. I keep seeing a set of misconceptions --- which I'll take as a failure to communicate :) --- so let me take a shot at re-summarizing the basic points.

Redshift is an observation about the growth of applications' demand for computing. If your application's computing needs are growing faster than Moore's Law, then color it red. If they are growing slower, or at about the same rate, color it blue.

Redshift applications are under-served by Moore's Law. The simple and obvious consequence is that the infrastructure required to support redshift apps needs to scale up. That is, the absolute number of processors, storage and networking units will grow over time. Conversely, infrastructure required by blue-shift apps will shrink as you get to consolidate them onto fewer and fewer systems.
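
Here is a quick back-of-the-envelope sketch of that consequence in Python. The growth rates are made-up illustrations, not measurements, and Moore's Law is approximated as a doubling of per-system capability every 24 months.

    # Illustrative only: how many infrastructure units a workload needs over time
    # if demand grows faster (red) or slower (blue) than per-system improvement.
    MOORES_LAW = 2 ** (1 / 2)   # per-year capability multiplier for a 24-month doubling (~1.41x)

    def systems_needed(demand_growth_per_year: float, years: int, start_systems: int = 100) -> list[int]:
        """Units needed each year if demand compounds at demand_growth_per_year
        while each unit's capability compounds at Moore's Law."""
        return [round(start_systems * (demand_growth_per_year / MOORES_LAW) ** y)
                for y in range(years + 1)]

    print("redshift  (demand 2.0x/yr):", systems_needed(2.0, 5))  # [100, 141, 200, 283, 400, 566] -- grows
    print("blueshift (demand 1.2x/yr):", systems_needed(1.2, 5))  # [100, 85, 72, 61, 52, 44]      -- shrinks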

Refining just a little, redshift apps appear to fall into three basic categories:

  1. Sum-of-BW. Basically, these are all of the consumer-facing apps (think YouTube, Yahoo!, Facebook, Twitter) that are serving the bits behind an Internet whose aggregate BW is growing faster than Moore's Law.

  2. *-Prise. These are software-as-a-service style consolidations of apps that, at most enterprises, are blue. But there is a huge market over which to consolidate, so growth rates can become quite large (think eBay, SuccessFactors, Salesforce.com).

  3. HPC. Technical high performance computing was the pioneer of horizontal scale. For a good reason: halve the price of a unit of computing or storage, and most HPC users will buy twice as much. These apps are expanding gases in the Nathan Myhrvold sense (think financial risk simulation, weather simulation, reservoir management, drug design).

Why it's a big deal now is my assertion (okay, SWAG) that we are nearing an inflection point where the majority, volume-wise, of computing infrastructure is in support of redshift applications. On the other side of this point is a kind of phase change where the great mass of computing is delivered through redshift-purposed infrastructure.

And if you believe this and you are in the business of building computing infrastructure, then you might want to think really hard about what all this means in terms of what is important and where you invest your R&D dollars. Read: it's much more about how hardware and software conspire to become scalable systems than it is about individual boxes.

Oh, and I guess I have to explain my abuse of a very well-understood physical phenomenon. The spectrum emitted from an object moving away from you looks like it has shifted in its entirety to a lower frequency (and thus "towards red" for the visible spectrum). When measuring the spectra of many galaxies, Hubble observed a correlation between the distance and spectrum: the further away a galaxy is from us, the greater the average redshift. A reasonable explanation for this is that space itself is expanding.
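
For the record, the physics being borrowed is standard textbook material, nothing specific to this post: the redshift z of a spectral line, and Hubble's relation between recession velocity and distance,

    z = \frac{\lambda_{\mathrm{observed}} - \lambda_{\mathrm{emitted}}}{\lambda_{\mathrm{emitted}}}, \qquad v \approx H_0 \, d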

And thus (blame me, my lame marketing), the demand for scalable infrastructure is an expanding opportunity. Fact was, I didn't want to change my slides. Apologies to cosmologists everywhere.


Monday May 14, 2007

Are Software Patents Useful?

It is certainly encouraging to see reform of the U.S. patent system gain the attention of Congress (thanks, Rep. Lamar Smith!). Both the normalization with respect to international practices and the move to bring damages more in line with actual harm are welcome. Clearly, the reform is trying to strike a compromise even among R&D-based companies: biotech and software being the poles. The debate for IT companies should be more fundamental: are software patents useful?

We should be judging "utility" objectively.

Patents are neither evil manifestations of corporate interests, nor an inalienable right of the subsisting lone inventor. They are contracts among us with the mutual self-interest of accelerating and sustaining innovation. Abstractly there could be lots of optimal points here, so we'll overlay a bias towards protecting individuals and especially individual freedom. In this sense, I'm not only highly aligned with Stallman, but have been heavily influenced by his thinking.

So, back to the basic question: are software patents useful, viz. do they maximize innovation and the freedom to innovate?

My answer is "(mostly) No". And certainly not under our current view of how and for what they are awarded. And just to be clear, don't construe this as even a hint of criticism of patent offices around the world. These hugely overworked folks are doing what we collectively are asking them to do. It's an extraordinarily demanding job.

I say (mostly) no, because copyright appears to be (mostly) better for maximizing innovation while giving individual copyright holders the ability to modulate compensation and derived works. Larry Lessig (another primary influence in my thinking) and the folks at the Creative Commons have done a spectacular job of creating rather precise set points along this continuum.

What does copyright have to do with patents? With a mild fear of being pedantic, we have five basic tools at our disposal for controlling the distribution of our ideas and their expressions:

1. Public Domain. We can publish our ideas, and expressions of our ideas without restriction. A principal power here is that such publication can blunt or defeat any other person's attempt to obtain a patent, or to enforce copyright.

2. Patent. We can publish our (new and non-obvious) ideas in exchange for a time-limited exclusive right. That right being to *exclude others* from creating any expression of our idea. This is often misunderstood as a right of the inventor to exercise the patent. It's not. It is the right to prevent others from doing so. One catch being that the invention may be dependent upon someone else's patented idea, and that patent's holder could transitively exercise their right to exclude. Note that the "new and non-obvious" test is subjective and one that we ask our patent examiners to perform, backed up by (frequently technically lay) judges and juries.

3. Copyright. We can publish an expression of our ideas (code) with a vernier of exclusive rights, ranging from "you can't make any copies whatsoever" to "copy any and all freely, just propagate my copyright". There are two cool parts about copyright notices and source code: (1) they are both textual representations that can coexist in the same file, and (2) the copyright notice is a cudgel for a license agreement that can also go along for the ride: "follow the license terms or you can't make any copies whatsoever".

4. Trademark. We can freely publish our code but create a name and/or logo for the code (or, more typically, the binary build) that is sourced from us. In other words, others can use our code but they can't use our name for the code.

5. Trade Secret. Keep our ideas and our code a secret, and punish those who violate the terms of our secret agreement (e.g., an NDA). Of course, you always run the risk that someone else independently develops the idea and does 1,2,3 and/or 4 with it!

Copyright puts the licensing control clearly and explicitly in the hands of the developer. It can also capture a license for any related patents the developer might have obtained. Some of the most effective of these are "patent peace" grants, such as what we have done with CDDL: you get a grant to our patents as long as you follow the copyright license, including that you won't prosecute for any patents that you might have. If you do prosecute, then our patent grant is revoked.

So, in a way, the utility of a software patent here is that it can put even more teeth into the potential enforcement of a public license. That's fine when used this way. But any developer is always open to attack from a patent holder who has no interest in her code (e.g., a troll) other than to extract rents on the claim that it reads upon the ideas in the patent.

I've yet to see the case where these attacks are directly or indirectly useful. It's seldom that the patent holder is himself practicing the patent, so the patent peace provision is empty. These all seem to be taxes upon our industry. (At least we (Sun) have some resources to combat this. We paid almost $100M to Kodak to immunize the entire Java community from infringement claims on mechanisms that Kodak themselves don't use --- they acquired the patents from a third party. This was pure insanity, but we felt like we had to pay this in order to indemnify the whole Java community.)

Patents are a far more blunt instrument than copyright, and tend to teach far less than code. I just don't know of any developer who reads patents to understand some new software pattern or idea. Remember, the limited monopoly we grant a patent holder is in exchange for teaching others how to do it so that when the patent expires everyone is better off (the length of time of the grant is another issue. How long is two decades in software generations?)

Obviously, you can make the case that these are the side-effects of an otherwise healthy dynamic balance around innovation. That individuals, start-ups and large companies do indeed need the protection in order to invest in basic software R&D that might, say, reduce the solution of a previously exponential-time problem to a logarithmic one.

Certainly, we (at Sun) feel like we have put some serious coin into developing things like ZFS and dtrace, which we have published under a FOSS (Free and Open-Source Software) license (CDDL for now), and for which we have applied for patents. We will *never* (yes, I said *never*) sue anyone who uses our ZFS codebase and follows the terms of the license: they publish their improvements, propagate the license, and don't sue anyone else who uses the ZFS codebase. And look at the innovation not only with ZFS in OpenSolaris, but its adoption by Mac OS X and BSD.

But under what conditions would we enforce our patents? How would we feel if someone did a cleanroom version of ZFS and kept the resulting code proprietary?

We wouldn't feel good, to be sure. But I'd put the burden back on us (certainly as a large company): if such a thing were to happen, it would be because we were failing to *continue to* innovate around our original code. Being sanguine about patent protection as an exclusive right would result in less innovation, not more.

Our licensing of our Java implementations under GPLv2 is a case in point. The early returns are that we are seeing renewed interest and vitality in the platform and a real acceleration of innovation --- both from us as well as others.

There is a better state for fostering innovation in software. It's one built around the network itself, and one whose principles are grounded in the freedoms granted to developers in the form of copyright. It's proving to be amazingly stimulating of innovation, and we ought to collectively drive our industry to a state of complete FOSS. Either that, or take refuge in trade secrets and trademarks.

In case you haven't noticed, driving to a state of complete FOSS is exactly what we are doing at Sun. With some narrow exceptions, all of our software efforts are or will be under a FOSS license, and we will actively build and participate in communities around these code bases, and work as transparently as we possibly can. Why? Because it maximizes innovation. We have even taken the step of making some of our hardware design free and open. It's still early, but we think this style of open development will also yield a vibrant culture around OpenSparc.

Will we stop pursuing software patents on our software? Can't do that yet. That's simply because our competitors will still go for them, and unless our system changes, we'd have fewer "trading stamps" and end up paying even higher rates to indemnify the users of our software.

Is there anything short of eliminating software patents that could get us off that treadmill and drive us to a maximum innovation state? Well, maybe, given some additional restrictions on their granting. I have a few ideas having to do with teasing apart interfaces from implementations, but that's the subject of a future blog.

Wednesday Jan 31, 2007

101 Things

Thanks for the tag, boss. I thought if I simply ignored this, I could avoid answering. But no such luck.

I'm guessing most people don't know that...

1. I always wanted to be an oceanographer. I idolized Jacques Cousteau, got my YMCA cert (at 16) in the era of horse-collar vests and J-valves, and wanted to design submarines. I started living my dream by working my way through UCSD with a part-time job in the Underway Data Processing Group at Scripps. But every time we went out for sea trials, I puked over the fantail. So much for that dream. (Eventually outgrew the sea sickness, fortunately.)

2. My first career was as a controls engineer with HP (in the San Diego division, designing servos for plotters) and then Honeywell (doing flight controls). My first job at Honeywell was verifying the stability of the re-entry control system on the Shuttle before its first launch. As a junior engineer that was beyond cool.

3. I'm half Greek, half Hungarian. My dad emigrated right after WWII on a post-war reconstruction scholarship, meeting my mom at Miami of Ohio. He was Greek Orthodox, she Jewish. Let's just say there was a lot of "energy" in my house growing up.

4. Jonathan pointed out that I love to cook. My latest passion is artisan sourdough bread (thank you, Laurie, for still smiling at my bursting mason jars of starter in our new fridge...). It's way low tech (just yeast, flour, water and salt), but deeply rewarding to the psyche.

5. What he didn't point out is that I also survived a train derailment. It was September 13, 2001. I had gotten one of the very last seats on the California Zephyr out of Chicago on 9/11. In the middle of the night in the Utah desert we hit a coal train. Here's what's left of the engines.

Lemme see,.. Tag you're it DD!

Godspeed, Jim

Wherever you are, we are all wishing your keel deep and your sails full. We're missing you, Jim.

August, 2006

Friday Nov 10, 2006

THE WORLD NEEDS ONLY FIVE COMPUTERS

And, no, I'm not paraphrasing something that I bet Thomas J. Watson never uttered in 1943 anyway. But he should have because, ultimately, he might turn out to have been right.

Let's see, the Google grid is one. Microsoft's live.com is two. Yahoo!, Amazon.com, eBay, Salesforce.com are three, four, five and six. (Well, that's O(5) ;)) Of course there are many, many more service providers but they will almost all go the way of YouTube; they'll get eaten by one of the majors. And, I'm not placing any wagers that any of these six will be one of the Five Computers (nor that, per the above examples, they are all U.S. West Coast based --- I'll bet at least one, maybe the largest, will be the Great Computer of China).

I'm just saying that there will be, more or less, five hyperscale, pan-global broadband computing services giants. There will be lots of regional players, of course; mostly, they will exist to meet national needs. That is, the network computing services business will look a lot like the energy business: a half-dozen global giants, a few dozen national and/or regional concerns, followed by wildcatters and specialists.

Let me back up and explain what I mean by a Computer, and then why I think this is inevitable. I mean "Computer" as in the "The Network is the ...". These Computers will comprise millions of processing, storage and networking elements, globally distributed into critical-mass clusters (likely somewhere around 5,000 nodes each). My point in labeling them a Computer is that there will be some organization, a corporation or government, that will ultimately control the software run on and, important to my argument below, the capitalization and economics of the global system.

These Computers will be large for a number of reasons. It seems that the successful services are most definitely growing faster than Moore's Law. That is, in addition to upgrading to faster systems they are adding more of them and the compound growth is getting pretty spectacular in several cases. A company like Salesforce.com sees hypergrowth not in the form of some intrinsic demand on CRM (within an average company, definitely not growing close to Moore's Law --- Enterprise CRM is overserved by systems performance improvements), but rather the sum of consolidation of CRM systems across thousands and thousands of companies. Live.com is likely to fall into this camp, too. The growth seen by a Google or Yahoo!, on the other hand, is more directly a function of their pipe-filling roles: the greater the end-user bandwidth, the greater the demand on their infrastructure.

Moreover, there is most definitely an economy of scale in computing. To the extent that there is a scalable architectural pattern (cluster, pod, etc.), the per-unit engineering expense gets amortized over increasing capital volume. So, more and more engineering can be invested in driving higher and higher efficiencies at scale.

Our bet (meaning Sun's) is that, like the energy, transportation, telecommunications and power utility businesses, most of these companies will realize that they can become even more efficient if they rely upon a few, highly competitive and deeply technical infrastructure suppliers (think GE, Siemens, ABB for power systems, Boeing and Airbus for commercial aircraft, Ericsson, Nortel, Lucent/Alcatel, Nokia for telecom, etc.).

All this being said, a large enough enterprise (say, a big financial services firm) still has some pretty compelling reasons to build its own Computers. My only advice here is to approach the problem as one of latent scale. That is, think that you are building one of the world's five, but you just haven't quite grown into it yet! Same advice goes to start-ups: because either you will grow to become one of the big Computers, or you'll be acquired and be Borg-ed into one of them!

Naturally, we aim to be the premier infrastructure supplier to the world's Computers. Blackbox is just the beginning (more on Blackbox in a previous entry). Whatever its form (or color!), the emerging infrastructure will be far more efficient than what we think of as conventional enterprise computing. And, just as a reminder, that doesn't mean it's piles and piles of cheap boxes, any more than you'd design a power plant with piles and piles of cheap portable generators. In the latter case, the little problems of noise, pollution, reliability and conversion efficiency are scaled into some really nasty ones.

Similarly, the cheapest computing is not necessarily obtained by lashing together racks and racks of the cheapest computers you can find. Engineering for scale matters. Really matters.

Monday Oct 16, 2006

THE INDUSTRIAL REVOLUTION, FINALLY

I've commented frequently upon a central paradox of IT: software and hardware components are the products of fierce, high-volume competition, yet their final assembly by IT organizations is one-of-a-kind artisanship. To quote Scott McNealy, I've never toured a datacenter with the reaction "Wow, this looks just like the one I visited yesterday!"

We ought to ask why this is so, because it is supremely inefficient. Practically all IT organizations speak of the commoditization of computers, but seldom of computing. Partly, this is because computers and storage are simple to understand and quantify compared to the enormous complexity of their assembly into systems that deliver some (with hope, predictable) level of business service. This complexity not only is expensive, it's viscous. Business innovation, the central goal of IT, suffers.

There is certainly a school of thought that this complexity is inherent and the proper (read: profitable) thing for a vendor to do is insulate the IT customer from it with "services and solutions". From our vantage point, this is a punt. It's far better to attack the composition of systems to provide useful service as an engineering problem, not as an Exercise Left to the Reader.

And it is precisely in this spirit that Project Blackbox was born. We went back to engineering first principles: how do you transport, physically assemble, power, cool and ultimately recycle computing infrastructure? Take the joules-in, BTUs-out problem as one of engineering co-design. Something that can be quantitative, efficient, and manufacturable in volume.

Many unquestioned assumptions were put on the table. "Why do we build datacenters?" (Because of latency and administrative scale issues.) "Why do we build machine rooms?" (To let people and machines cohabitate, you know, to mount tapes, clean out chad, and punch buttons...) "Why do we have hot-swap fine-grained FRUs?" (To give the cohabitants something to do?)

Where we ended up with Project Blackbox is admittedly not for everyone. It is designed for ferocious scale, complete lights-out, fail-in-place, virtualization, uber-fast provisioning, and brutal efficiency. And I'd like to emphasize that we expect the most efficient way to deliver computing and storage services to be with containers. Full stop.

While we've tried to keep the project as stealth as possible, we have disclosed aspects of it during its development to select sets of potential customers and analysts. Feedback has been categorically positive, from "I need ten of these tomorrow. No really, I'm not kidding." to a giddy "This is classic Sun! Why didn't someone do this before?".

Yeah, this seems obvious, so why don't we build datacenters this way? It's the same kind of reaction one had to luggage with wheels, in-line skates, or parabolic skis. Obvious in hindsight, so why did it take so long? Well, obvious at one level, but most definitely dependent upon basic technological progress (in these cases, advances in bearings, plastics and laminates).

For containerized computing, the underlying enabler is the confluence of power density, lights-out management, horizontal scale and virtualization.

Let's look at power density. Half-a-dozen years ago, we were indeed building mondo datacenters, but at quite approachable power densities: typically under 100 watts/ft2. But as we continued to compress physical dimensions (the 1RU server and, now, blades) while simultaneously running hotter chips with more DRAM and disk, watts-per-rack skyrocketed.

Today, 10 kilowatts/rack is standard fare, and many folks are facing 15, 20 and even 25 kw. A standard rack fits nicely over a 2ft x 2ft floor tile. Thus, a 20 kw rack "projects" 5 kw/ft2. If my datacenter is 100 w/ft2 then I can only put one such 20 kw rack every fifty floor tiles! Even a completely modern, leading edge datacenter at 500 w/ft2 spaces our 20 kw rack one every tenth tile.

(It's the square root, natch': for the 500w/ft2 facility it's a rack, two empty tiles, then a rack, in both x and y. For the 100 w/ft2 case, it's a rack, six empty tiles, ...)
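
For the arithmetically inclined, here is the same tile math as a quick Python sketch, using the assumptions above (2ft x 2ft tiles, one rack per tile footprint):

    import math

    def tiles_per_rack(rack_kw: float, facility_w_per_ft2: float) -> tuple[float, float]:
        """Floor tiles each rack effectively consumes at a given facility power
        density, and the resulting rack pitch in tiles along each axis."""
        floor_ft2 = rack_kw * 1000 / facility_w_per_ft2   # floor area the rack's power "needs"
        tiles = floor_ft2 / (2 * 2)                       # 2ft x 2ft tiles
        return tiles, math.sqrt(tiles)

    for density in (100, 500):
        tiles, pitch = tiles_per_rack(20, density)
        print(f"{density} W/ft2: one 20 kW rack per {tiles:.0f} tiles "
              f"(one rack roughly every {pitch:.1f} tiles in x and y)")
    # 100 W/ft2: one 20 kW rack per 50 tiles (one rack roughly every 7.1 tiles in x and y)
    # 500 W/ft2: one 20 kW rack per 10 tiles (one rack roughly every 3.2 tiles in x and y)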

No wonder that people are out of space, power or cooling (they are all inter-related). And no wonder I get people jumping out of their chairs wanting ten Blackboxes "tomorrow"!

[Aside: don't confuse power density --- watts/unit volume --- with power efficiency --- watts/unit performance. Even super power-efficient designs such as the eight core UltraSPARC T1 can lead to high power densities, for the simple reason that cramming processors closer together allows them to be more cheaply and effectively interconnected. Low power processors do not necessarily imply low power density systems. But because you use fewer of them overall, they most definitely can cut the power costs of delivering a certain throughput or level of service]

Actually, the higher the power density, the more desirable containerized computing becomes. A standard TEU-sized (8ft x 20ft) container can readily handle eight 25 kw racks. That's a power density of about 1,250 watts/ft2. And we really aren't breaking a sweat at these levels; they could easily be doubled owing to the dedicated heat exchanger for each rack position in the cooling loop.

Lights out management (LOM) is another technology enabler. Simply put, we've had a lot of pressure from our customers to make sure that no one is required to interact with a functioning server or storage system. Again, this is a long way from the implicit assumption left over from the mainframe era that there are "operators" for computers.

[Another aside: we are constantly reminded that if you want to build very reliable systems, the best thing you can do is keep people's fingers away from them. There are significantly non-zero probabilities that an operator coming in physical contact with a system, despite all best intents and training, will break something; not infrequently, by disconnecting a wrong cable or wrong disk drive.]

When we mix in virtualization and/or horizontal scale, we finally get to the place where a bit of code doesn't have to run on a particular computer, it only has to run on some computer. Thus, we can use mature techniques such as load balancing, along with emerging ones such as O/S paravirtualization and dynamic relocation, to abstract applications from computers. And that leads to service strategies such as fail-in-place, and a wholesale re-evaluation of things like hot swap and redundant power supplies.

Clearly, this level of physical engineering attacks only a focused part of the complexity-at-scale problem, which is manifold. Given this qualification, Project Blackbox is a real, tangible step towards the purposeful engineering and mass production of modular infrastructure. The cobbler's children no longer have to go barefoot, and the industrial revolution can finally arrive for scalable computing.

However the market develops, I know my wife, Laurie, is relieved that Project Blackbox is finally, well, out of the box. For the past two years, whenever I saw a container anywhere (on the road, on a train, stacked aboard a ship, or sitting motionless at some job site), I'd predictably mumble "that could be one of ours...". And that would lead to my pleading for a commercial driver's license so I could haul them around on an 18-wheeler to different events. "Have you seen the way that you drive?" is the inevitable reply. Of course, I know she's right (and she's a far better driver than I, for the record).

But, Laurie, please, I'll only drive it on the weekends, and just around the block!

Tuesday Sep 12, 2006

The Ecology of Computing

First things first. I've been really, really bad at keeping up this blog. Well, I've been really, really busy. But that excuse has gotten pretty thin, and even my mom has been sending me email with titles like "Time for a new blog!" Got it. So here's what's been on my mind recently.

We are at an inflection point of a massive build-out of computing: things like Google's mondo datacenters, Microsoft's response around live.com, Amazon.com's forays into public grids, and Salesforce.com's AppExchange are only a small sample. These are examples of folks who are seriously under-served by Moore's Law --- their appetites are only served through an absolute growth in the total amount of silicon dedicated to their causes. Indeed, industry ecosystems are dependent upon absolute growth of unit volume, and not just the growth in single system performance. More detail on this in a subsequent blog. Here, I'm going to focus upon a major consequence of this growth.

The turn of the millennium roughly coincided with modern computing's 50th anniversary and I frequently got the question of "what will computing look like in 2050?". That's either really easy or impossible to answer. Extrapolating Moore's Law, for example, is lots of fun: 25 doubling times, conservatively, is thirty-something-million times the "transistors" per "chip" ("spin-states per cc" will be more like it). Here's how to picture it: imagine Google's entire million-CPU grid of today in your mobile phone of 2050...
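
(For the record, the arithmetic behind "thirty-something-million", nothing more than the two-year doubling applied over fifty years:

    \frac{50\ \text{years}}{2\ \text{years/doubling}} = 25\ \text{doublings}, \qquad 2^{25} = 33{,}554{,}432 \approx 3.4 \times 10^{7} )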

Simply extrapolating Moore's Law doesn't account for the absolute growth in computing stuff. So my answer then, and even more emphatic today, is that computing in 2050 will be dominated by two concerns, ecology and ethics: what are the environmental and sociological consequences of computing something. It's not what can we compute, but what should we responsibly compute? You have guessed right if you are imagining a very confused interviewer politely smiling at me. "Will there still be Microsoft and Dell?"

What is the ecology of computing? Will the 2050 Google Grid in your pocket work on the energy of a self-winding watch, or will it, on current trend lines, require a million-amp supply at 0.1V (hey, that's only 100 kW)? And more to the here-and-now, will the main go-forward consideration in siting computing be the access to, and cost of, power? Just about every customer I speak with today has some sort of physical computing issue: they are either maxed out on space, cooling capacity, or power feeds --- and frequently all three. The people who take this very seriously do LP optimizations of building costs, power, labor, bandwidth, latency, taxes, permits and climate.

More than cost, people cite the time it takes to plan, get approval for, and build out a new datacenter. Any one of reasonable size involves some serious negotiation with local utilities, not to mention getting signed permits from your friendly municipalities. Running out of datacenter capacity can be a Bad Thing. And not surprisingly, the product marketing du jour for those of us in the business of supplying infrastructure has space and power as a first order requirement. Low power design is not new, of course. One only needs to look at the attention that goes into a modern mobile phone to get a great appreciation for the extraordinary lengths that engineers go to conserve microwatts.

What is new, or at least newly emphasized, is how to maximize the server equation: maximum throughput per watt. Some of the low power circuit tricks will apply, and we'll all get very clever with powering back under less-than-fully-utilized conditions. There is also the basic blocking and tackling of power supply conversion efficiencies and low-impedance air flow. I'm also expecting a lot more "co-design" of the heat-generating parts (servers, storage and switches) with the heat-removal parts (fans, heat exchangers, chillers). The first signs of this are rack-level enclosures with built-in heat exchange. There are also some simple, and effective, things to do with active damping of airflows within datacenters. This is just the beginning, and my advice is to Watch This Space.

Naturally, we feel really good about the fundamental advance of our throughput computing initiative of the past five years --- especially the UltraSPARC T1 (nee Niagara 1), and its 65nm follow-on, Niagara 2. By focusing on throughput on parallel workloads, in contrast to performance on a single thread, one is led to a design space that is far more energy efficient. We frequently see an order-of-magnitude (base 10!) improvement in both raw performance and performance-per-watt with the T1. Indeed, it's so significant that last month PG&E began offering rebates of up to $1000 to customers who replace less efficient existing systems with them. So, yeah, as engineers it feels really good to innovate like this --- something like building a bullet train that gets 100 miles per gallon. Kicking butt, responsibly. This is just the start. Watch This Space, too.

This is great stuff, but let's take a big step back and ask "how much power must computing take?". What are the fundamental limits of energy efficiency, and are we getting even close to them? Theoretically speaking, computation (and communication) can be accomplished, in the limit, with zero energy. No, this isn't some violation of the Laws of Thermodynamics. It turns out that the only real energy dissipation (literally, entropy increase) required is when a bit is destroyed (logical states decrease). Rolf Landauer, one of the great IBM Fellows, was the first to convincingly prove this, along with a bunch of other fundamental limits to computation and communication.
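
For reference, Landauer's bound (a standard result, nothing new here) says that erasing a single bit at temperature T must dissipate at least

    E_{\min} = k_B T \ln 2 \approx (1.38 \times 10^{-23}\,\mathrm{J/K})(300\,\mathrm{K})(0.693) \approx 3 \times 10^{-21}\,\mathrm{J}

per bit erased at room temperature.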

If you don't destroy bits --- meaning you build reversible or adiabatic logic systems --- then you get a hall pass from the Second Law. You hold the number of logical states constant, so it's not inevitable that you have to increase physical entropy. Richard Feynman was a huge proponent of these kinds of systems, and there has been some steady, but marked, progress in the area. See the University of Florida group, for example. If you are interested, check out some of the pioneering work of Fredkin, Toffoli, Knight, Younis and Frank (apologies to others, these are just some of the works that I've followed).

Of course, even the most aggressive researchers in the area will caution about a free lunch --- we have to draw a system boundary somewhere, and crossing that boundary will cost something. Even so, it's likely that we are off by 100x or 1000x from where we could be in energy efficiency. My guess is that we'll look back at today's "modern" systems as being about as efficient and ecologically responsible as we now view the first coal-fired steam locomotives.

Back to reality. We are making huge strides in the ecology of computing, but by many measures we are still being hugely wasteful. My bet is that the next few years will be ones that fundamentally re-define how we view large-scale computing infrastructure. Did I forget to mention to Watch This Space?

Love you, mom.

Friday Jun 30, 2006

Charting a Course from Recent Grad to “Citizen Engineer”

In light of the recent graduation season, some thoughts on how engineering graduates might want to consider the changing role this career plays in society.

Friday Mar 31, 2006

Everything is Happening at Once

So there must be some failure of physics, or entropy gone wild. But from my perspective at Sun, last week was a watershed moment. The public Sun Grid went live, we delivered the source code for the UltraSPARC T1 processor (nee Niagara) to opensparc.org, and we unveiled Project Darkstar, a breakthrough in massively multiplayer on-line gaming. In the arrow of time, these are all pointing sharply to the future. The debate, it appears, is not whether, but when, computing is a service. If we are receiving any criticism today about our technology strategy, it's that we are being too aggressive about our embrace of the future. That's an enormous change from the barbs of just a few years ago: that we were way too backward looking, holding on to the past computing models. That's the watershed.

Now, anyone remotely connected to high tech R&D knows that you don't just wake up some day, change your view of the future, and then have a bunch of cool things roll out over the next 12 months. It takes years. As I've oft noted, the technology constants change rapidly (faster, smaller, cheaper), but the rate of change for organizing principles and architectures is glacial. The multiprocessing/multithreading concepts behind T1 have been researched for decades, including some excellent workload performance modeling led by John Busch's team in Sun Labs a half-dozen years ago. It was actually that work, circa 2000, that led to a (very) big bet being taken within Sun to go build a microprocessor that pointed to the future. A design that optimized for the workload of network-scale services. Similarly big bets were taken around Solaris (Zones, ZFS, dtrace, Fire Engine, ...), JES, tools, and AMD's Opteron, just to name a few of them.

The momentum, buzz, and excitement that I feel growing among our customers and the broader communities we touch with things like OpenSolaris, OpenSparc and Java are palpable. It feels like these bets are beginning to convert. Take a look at some folks having some fantastic success with T1, seeing 8x the performance of Itanium in a socket-to-socket comparison.

Hey, the last thing we need at Sun is hubris again. That's not the point of this post, at all. It's just that all of those vectors that the 9,000+ engineers at Sun have been hammering away on since the bubble burst are building up to an interesting coherence. More than anything, it's to acknowledge those awesome efforts --- a few of the fruits we saw last week. This is a very special place and a very special time.

Wednesday Dec 21, 2005

Our Identity Crisis

"To verify your identity, may I please have the last four digits of your 'Social'." I always know the question is coming, but it still causes me to cringe. Verify my identity by telling you four numbers that are in thousands of databases by now? You've got to be kidding, right? Oh, I know, how about something better: my mother's maiden name, the city I was born in, or the answer to my Secret Question?.

None of these, it turns out (the Secret Question actually being the best), does much of anything to authenticate my identity. Yet we --- meaning all of us making a living in the network economy --- propagate this state of mutual delusion. And thus we are leaving ourselves wide open to some very bad things happening.

The worst thing, by far, is a loss of trust. If people fear that their privacy is compromised, or worse, that their economic future is going to be destroyed, then at least three unshiny things will happen: growth will slow or reverse, companies will get sued for negligence, and governments will invent some very unpleasant regulations.

We, collectively, are messing this up. And we will get more front page stories like this one from last week's USA Today about how crazed methamphetamine addicts are "stealing identities" to feed their habits.

The tragic part is that we have plenty of technology to combat identity theft. What we don't have is the sense of urgency to come together to do meaningful deployments.

Before getting to what we can and should do, it's useful to get some basic concepts across. First, the problem of "Identity Theft" isn't a problem of stealing something; it's about impersonation. That is, it's an authentication problem. This is an important distinction because it should be possible for someone to know everything about me, but not impersonate me. Disgorging my personal data is a violation of my privacy, but it shouldn't enable someone else to pretend they are me.

To authenticate who I am is to verify one or more of the following factors: something only I should know (a secret like a passphrase, password or PIN), something only I should possess (a hard-to-forge ID such as a smartcard, or the SIM card buried in my GSM phone), and/or something only true of me physically (the pattern of my retina, my thumbprint, the rhythm of my signature). The more factors one uses, the higher (typically) the confidence that the person really is who they claim to be.

The last four digits of my Social Security Number, or the city in which I was born, or my mother's maiden name, or for that matter the digits on my Visa card, fall into none of the above. None of them can be assumed secret. And it is really stupid for us to pretend that they are.

As a first step, we've got to get passwords under control. There are lots of vulnerabilities, from cracking to phishing. Email addresses are reasonable usernames because the DNS system helps support uniqueness (due to the domain name of the email site; presumably the email provider will prevent name collision). Passwords are a mess because each site with which you have a business relationship maintains and records your password. If you are like most people, you reuse passwords at multiple sites with the obvious vulnerability.

There are some real things we can do here. The Liberty Alliance has over 150 companies, non-profits and governments who have been cooperating for years on open standards for federated identities. What that lets us do is have a core set of trusted providers with whom we authenticate, and they perform the electronic introductions with all of the other companies whose sites we use. The essential aspect being that secret information, such as a passphrase, need only be shared with the few trusted providers that we choose (and so much easier to maintain, of course). It is possible to do this now.

But this is only a first step. I think it is essential that we get to routine multifactor authentication, especially for high-value transactions. Smartcards are great, and so are mobile phones for the same reason --- they are reasonably difficult to clone. The key thing is that the smartcard, or the one buried in the phone, can hold an electronic secret that it never has to directly reveal; it only has to prove that it indeed has it.
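
Here is roughly what that proof-of-possession looks like; this sketch uses an HMAC over a fresh random challenge as a stand-in for whatever a real card or SIM implements, so take the key handling as a deliberate simplification.

```python
# Proving possession of a secret without revealing it: the verifier sends a
# fresh random challenge, and the device answers with a MAC over that
# challenge computed from its embedded secret.
import hashlib, hmac, os

class Smartcard:
    def __init__(self, secret: bytes):
        self._secret = secret              # never leaves the card in real hardware

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()

class Verifier:
    def __init__(self, enrolled_secret: bytes):
        self._enrolled = enrolled_secret

    def check(self, card: Smartcard) -> bool:
        challenge = os.urandom(32)         # a fresh nonce defeats replayed answers
        expected = hmac.new(self._enrolled, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(card.respond(challenge), expected)

secret = os.urandom(32)
print(Verifier(secret).check(Smartcard(secret)))          # True: the genuine card
print(Verifier(secret).check(Smartcard(os.urandom(32))))  # False: a clone without the secret
```

In a real deployment the card would hold a private key and the verifier only the public half, so there is nothing on the server side worth stealing; the shape of the exchange is the same.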

Here's a simple proposal. Let's have a registration authority like the "Do Not Call" list called "Check That It's Me". I'll register my mobile phone number and a list of trip points such as opening a new account, extending credit, and changing my mailing address. If any of these trip, then the company providing the service (say, issuing a new credit card or mortgage) HAS to get approval from Check-That-Its-Me. That, in turn, simply involves a call or text message to my mobile from Check-That-Its-Me, to which I respond. The net-net is that anyone trying to impersonate me to accomplish one of these transactions had better be in physical possession of my phone, too. That's a huge barrier.
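
To make the flow concrete, here's a sketch of how such a registry might behave; the registry itself, the trip-point names, and the text-message hooks are all hypothetical.

```python
# Hypothetical "Check-That-Its-Me" flow: a service must get the registered
# person's confirmation before a trip-point transaction is allowed to proceed.

TRIP_POINTS = {"open_account", "extend_credit", "change_mailing_address"}

class CheckThatItsMe:
    def __init__(self):
        self._registry = {}  # person -> (mobile number, registered trip points)

    def register(self, person: str, mobile: str, trip_points: set) -> None:
        self._registry[person] = (mobile, trip_points & TRIP_POINTS)

    def approve(self, person: str, action: str, send_text, read_reply) -> bool:
        if person not in self._registry:
            return True                      # not enrolled: nothing to check
        mobile, points = self._registry[person]
        if action not in points:
            return True                      # not one of this person's trip points
        send_text(mobile, f"Approve '{action}' in your name? Reply YES or NO.")
        return read_reply(mobile).strip().upper() == "YES"

# Example wiring, with stand-ins for the carrier's SMS path:
registry = CheckThatItsMe()
registry.register("greg", "+1-555-0100", {"open_account", "extend_credit"})

approved = registry.approve(
    "greg", "open_account",
    send_text=lambda number, msg: print(f"SMS to {number}: {msg}"),
    read_reply=lambda number: "YES",  # pretend the phone's owner replied
)
print("issue the new card" if approved else "block the transaction")
```

Note that the registry only needs a phone number and a short list of trip points; everything else stays with the companies already doing the transactions.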

Whatever the subsequent steps are, we have to get cracking. Let's all resolve for 2006 to ACT on identity management and federation. Tick Tock.

--

Friday Nov 11, 2005

Don't Become Moore Confused (Or, The Death of the Microprocessor, not Moore's Law)

It was great to see that Gordon Moore got to deliver his “40 years later” talk at the Computer History Museum. I hope, though I know it's in vain, that at last everyone now understands what Moore's Law actually predicts --- and, more importantly, what it doesn't. It is a prediction about the doubling of the number of transistors on an integrated circuit, about every 24 months.

It isn't a prediction about the speed of computers.

It isn't a prediction about their architecture.

It isn't a prediction about their size.

It isn't a prediction about their cost.

It is a prediction about the number of transistors on a chip. Full stop. That's it.

Let's take this one at a time. But, first, a little math for the exponentially challenged. In 40 years there are twenty 24-month periods. 2^20 is about one million. A bit of revisionism calls the doubling time 18 months; in that scenario there are about 26.6 doubling times, or a factor of about 100 million. Let's just split the difference (logarithmically) and say that we've got about 10 million transistors on a chip today for every one we had 40 years ago. In any case, the biggest chips we build today are about 500 million transistors.
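
For anyone who wants to check the arithmetic, a few lines reproduce the numbers above (the 500-million figure is the post's own estimate, not something computed here):

```python
# Reproduce the doubling arithmetic from the paragraph above.
import math

years = 40
doublings_24mo = years * 12 / 24            # 20 doubling periods
doublings_18mo = years * 12 / 18            # ~26.7 doubling periods

factor_24mo = 2 ** doublings_24mo           # ~1e6  (about one million)
factor_18mo = 2 ** doublings_18mo           # ~1e8  (about one hundred million)

# "Split the difference logarithmically" = take the geometric mean.
split = math.sqrt(factor_24mo * factor_18mo)
print(f"{factor_24mo:.2e}, {factor_18mo:.2e}, geometric mean ~ {split:.2e}")
# -> roughly 1e6, 1e8, and 1e7 (ten million) respectively
```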

Okay, what about speed? A fundamental misconception is that Moore's Law predicts that computer speed will double every 18 to 24 months. Worse, since a very large West coast semiconductor company decided to market the equation that clock speed = performance, I can't tell you how many times I have had the question “With all of the power and heat problems with microprocessors, it looks like clock rates have maxed out or (gasp) have actually slowed down. Are we seeing the end of Moore's Law?” I used to scream. Now I just sigh.

No. Gordon said nuttin 'bout clock rates. And a little data shows how ridiculous that would be. The IBM 360 System 91 (vintage 1967) had a clock rate of 10 MHz. Ten million times that would mean that today's microprocessors should clock at 100 THz, or about 10,000 times faster than the fastest clocked chips today.

Size? Well, this is fuzzy enough to say “yes and no.” At the computer level, the answer is firmly “no.” An average industrial refrigerator-sized computer of the late sixties was under 10 cubic meters. Today's 1-2RU rack-mounted server is in the neighborhood of 0.01 cubic meters. That's only a factor of 1,000. Now at the processor level, it depends upon what kind of packaging you consider. Looking at bare dice, you can get close to a factor of 10 million, but this kind of analysis is more about the number of transistors on a chip --- which is Moore's Law.

Cost? Certainly not. A usable server is about $1000 today. Even with generous inflation adjustment, that still translates to a $1B machine (in 1970 dollars), which is ridiculous. Before you fire off a flame to me about $1.50 microcontrollers and five-cent RFID tags, I'll point out that there were plenty of low-cost computers in the late sixties and early seventies. Think of the PDP-8. And remember the first calculators in the mid-seventies? HP and TI had low-cost, programmable, battery-powered (coat)pocket-sized offerings for only a few hundred dollars.

Moore's Law is about transistors. I can print a chip today that has almost a billion transistors on it. Let's look at that more closely. Our first version of SPARC was constructed from an ASIC of roughly 100,000 transistors. So today we could fit TEN THOUSAND of our original SPARC microprocessors on a single chip. That, gentle readers, is interesting.

It's interesting because, today at least, we don't put 10,000 processors on a single chip. What we did do with the gift of transistor doubling was to continuously reinvest them into building bigger and badder (and more power-hungry) single processors. Not a lot of architectural innovation, I might add. We basically took many of the ideas pioneered in the sixties (mostly invented at IBM; see the ACS, Stretch, and 360/91 and compare them with “modern” microprocessors) and re-implemented them on single pieces of silicon.

We took discrete processors and integrated them into microprocessors. The serious displacement of discrete systems (bipolar TTL and ECL) started in 1985 and by 1990 it was all over. The discrete processor-based systems, from minicomputers to mainframes to supercomputers, all died, being replaced by microprocessor-based designs.

Now here we are 20 years later. We have squeezed all of the blood from that stone. We're done. Actually, we over-did it. Continuing to throw transistors at making single processors run faster is a bad idea. It's kinda like building bigger and bigger SUVs in order to solve our transportation problems. As I said, a bad idea.

A direct consequence of pursuing this bad idea is that, like gigantic SUVs, the energy efficiency of our biggest and fastest microprocessors is horrible. Meaning, we get very poor computing return for every watt we invest. Outside of portable applications, this extreme energy waste really only became a concern once the industry realized that it was getting very difficult to remove the waste heat --- to cool the engine, as it were.

(Another consequence is that these complex microprocessors are, well, complex. That means more engineers to create the design, more engineers to test that the design is correct, and whole new layers of managers to try to coordinate the resulting hundreds and hundreds of folks on the project. Bugs increase, schedules are missed, and innovation actually decreases.)

The result: microprocessors are dead.

Just as the discrete processors of the '80s were killed by microprocessors, today's discrete systems --- motherboards full of supporting chip sets and PCI slots with sockets for microprocessors --- will be killed by microsystems: my word for the just-starting revolution of server-on-a-chip. What's that? Pretty much what it sounds like. Almost the entire server (sans DRAM) is reduced to a single chip (or a small number of co-designed ones, just as the first micros often had an outboard MMU and/or FPU). These microsystems connect directly to DRAM and to very high-speed serial I/O links that are converted into either packet-style or coherent-memory-style network connections.

Open up the lid of a microsystem and you'll find a full SMP: multiple processor cores, crossbar switches, multi-level caches, DRAM and I/O controllers. Our Niagara chip, for example, has eight cores (each four-way threaded), a caching crossbar switch, four memory controllers, and a high-speed I/O channel. And its performance is very competitive with our original E10K, the 64-processor behemoth that stormed the world as the first high-volume, enterprise-class, massive multiprocessor.

Moore's Law is VERY much alive. And as Marc Tremblay puts it, with Niagara, it's as if we have leapt forward one, or even two, generations of integration.

The secret was to turn the clock back --- figuratively and literally --- to earlier, more sane processor pipeline designs. Ones that were more conservative of transistors, area, power, and complexity. (A key innovation, however, was to finally fold multithreading into the individual pipes). With these smaller, leaner and far more power-efficient processor cores, we could then use the transistor count advance of Moore's Law to paste down many of them on the same die, and to integrate the rest of the SMP guts at the same time.
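
To see why folding multithreading into simple cores pays off, here's a back-of-the-envelope model --- my own toy numbers, not Niagara's actual parameters: each hardware thread alternates a short burst of compute with a long wait on memory, and extra threads per core simply soak up the wait time.

```python
# Toy throughput model: threads that stall on memory can hide each other's
# latency, up to the point where the core's issue capacity is saturated.

def core_utilization(threads_per_core: int, compute_cycles: int, miss_cycles: int) -> float:
    """Fraction of cycles a core does useful work, with round-robin threads."""
    # One thread keeps the core busy compute/(compute+miss) of the time;
    # n threads can overlap their waits until the pipeline itself is the limit.
    single = compute_cycles / (compute_cycles + miss_cycles)
    return min(1.0, threads_per_core * single)

# Assumed (illustrative) numbers: 20 cycles of work per 200-cycle DRAM miss.
for threads in (1, 2, 4, 8):
    u = core_utilization(threads, compute_cycles=20, miss_cycles=200)
    print(f"{threads} thread(s)/core: ~{u:.0%} utilization")

# With numbers like these, one wide, fast core mostly waits on memory, while
# several narrow, threaded cores keep their pipelines busy --- which is the
# transistor- and watt-efficiency argument made above.
```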

The result is a chip that is incredibly hot performance-wise, and way cool thermally speaking. Truly an awesome accomplishment.

Incidentally, Opteron is a microsystem, too. You can get a couple of cores, an integrated memory controller, and a set of smartly architected serial network ports (HyperTransport) that bridge to I/O systems or directly connect to other Opterons. Our good friends at AMD are actively killing the microprocessor with Opteron. From our vantage, they are still leaving a lot of potential performance (and power efficiency as well) on the table by not reducing core complexity and adding aggressive multithreading. That being said, Opteron is seriously spanking Xeon, thanks to the lower memory latency of its on-chip DRAM controller.

Where does this end up? Well, we are now dying to get to 65nm (Niagara is 90nm) so we can get even more transistors on a chip in order to integrate more and bigger systems. Just as the microprocessor harvested the pipeline inventions of the '60s and '70s, microsystems are going to integrate the system innovations of the '80s and '90s.

By 2010 microprocessors will seem like really old ideas. Motherboards will end up in museum collections. And the whole ecology that we have around so-called industry-standard systems will collapse as it becomes increasingly obvious that the only place computer design actually happens is in the hands of those who design chips. Everything downstream is just sheet metal. The apparent diversity of computer manufacturers is an illusion about to shatter. In 2010, if you can't craft silicon, you can't add value to computer systems. You'd be about as innovative as a company in the '90s that couldn't design a printed circuit board.

Thanks, Gordon.
