Brief History of InfiniBand: Hype to Pragmatism

Remember the year 1999? That was when the Internet dot com boom was in full swing. Everyone was going to retire early with the money they made in the stock market, unless of course civilization ended due to Y2K. InfiniBand (IB) was born in the midst of this exciting time. Just like a lot of other things from that time, there was great hype associated with it.

Before we go further down memory lane, let's talk about what motivated IB. In the 1990s, folks from many server companies (including my own) were wondering what could be done to increase I/O speeds. PCI was the dominant standard solution. But everyone could foresee a day (especially given the wild Internet expansion) when PCI would not be adequate, even at 64-bit/66 MHz. There were three responses to this problem: doing something proprietary, stretching PCI further through PCI-X, or doing something "completely different". Proprietary has its own well-known issues, so I won't go into that.

"Completely different" turned out to be high-speed serial point-to-point links. So instead of trying to drive lots of wires in a shared bus in parallel, you would just drive a single wire at much higher speed. Of course, you could already do this with fiber optics, but the cost was very high, so that was not very interesting. The key thing was making it practical with copper wires using differential signalling (a pair of wires with the signal being sent out of phase on the second wire). Speeds obtainable with this approach were in the GHz range and the cost was around a tenth of the optical. (Lots of other areas of I/O technology also seized on this approach like SATA, PCI-Express, etc.)

This physical layer approach was then married to Remote DMA (RDMA) and the Queue Pair (QP) model of asynchronous operation borrowed from the Virtual Interface Architecture (VIA).
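To make the QP model concrete, here is a minimal toy sketch of the idea in Python. This is not any real verbs API; the class and method names are my own invention, chosen only to illustrate the asynchronous pattern: work requests are posted without blocking, an adapter moves data directly into remote memory, and the application later polls a completion queue.

```python
from collections import deque

class QueuePair:
    """Toy model of the VIA/InfiniBand Queue Pair idea: work is
    posted asynchronously and completions are polled later."""

    def __init__(self):
        self.send_queue = deque()        # posted, not-yet-processed work requests
        self.completion_queue = deque()  # identifiers of finished work

    def post_send(self, wr_id, work):
        # Post a work request (id, (remote offset, data)); the caller
        # does not block waiting for the transfer to happen.
        self.send_queue.append((wr_id, work))

    def process(self, remote_memory):
        # Stand-in for the adapter hardware: perform the RDMA-style
        # write directly into the "remote" buffer, then queue a completion.
        while self.send_queue:
            wr_id, (offset, data) = self.send_queue.popleft()
            remote_memory[offset:offset + len(data)] = data
            self.completion_queue.append(wr_id)

    def poll_cq(self):
        # Poll for one completion; None if nothing has finished yet.
        if self.completion_queue:
            return self.completion_queue.popleft()
        return None
```

The key point the sketch captures is the decoupling: the CPU posts work and moves on, and data lands in the target's memory without a per-byte copy loop in the application, which is what gives RDMA its latency and overhead advantages.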

At this point, politics enters the picture. Two different competing efforts emerged: Next Generation I/O (NGIO) and Future I/O (FIO). There were lots of companies involved. NGIO included Intel, Sun and Dell. FIO included IBM, HP and Compaq. At this point in history, Intel and Sun were not too friendly with each other. However, it seemed Intel needed Sun, since FIO had many of its traditional OEMs. The FIO crowd, I think, feared Intel having too much power in this new realm and so went off on their own. That's not to say that NGIO and FIO did not have technical differences. NGIO was simpler, which perhaps meant to the FIO folks that critical features were being left out.

Everyone knew that two competing standards was potentially a disaster. So in August 1999, after some difficult negotiations, the inevitable merger took place. The new organization was called the InfiniBand Trade Association (IBTA). In addition to the former NGIO and FIO folks, the IBTA included Microsoft. This addition came at the cost of having the charter prohibit direct work on APIs. So to this day, APIs associated with IB have to be worked on elsewhere.

Now back to the hype. What was IB going to do? Well, it was going to connect almost all servers! The IBTA cast a wide, far-reaching vision for servers, especially Internet servers. By that time, many folks had adopted a three-tier web server architecture. The first tier is a bank of web servers connected to the Internet, possibly fronted by load-balancing/firewall gear. The second tier is a cluster of application ("app") servers. The last tier consists of database servers and storage. Each tier in this architecture was connected through its own specialized interconnect (for clustering, storage, and general networking). The IBTA claimed that you could connect all of this together, even the Fibre Channel disks, with one high-capacity IB fabric and a single administration scheme. So IB was positioned simultaneously as a replacement for PCI in I/O, Ethernet in the machine room, cluster interconnects, and Fibre Channel. This, of course, stirred up some resistance, particularly from the general networking people/Ethernet enthusiasts and from Fibre Channel/SAN advocates. This was a big vision, but it also required a large investment in infrastructure and management tools.

Further, the IBTA proposed decomposing and virtualizing servers. It was argued that you could decompose your server into a processor/memory complex with an IB adapter attached to it. After you did that, all the rest of your server (e.g. storage) could be placed onto the IB fabric (there was a module form factor defined to help this idea). General network communication could be overlaid on top of IB without special-purpose adapters, though such adapters might still be necessary at the edge of an IB fabric. Decomposing your server this way put each part in a separate fault domain, so a single problem didn't knock out your entire server. Further, it was possible to share I/O between systems and treat the components as mix-and-match commodities. This vision of a decomposed server connected by a backplane-like network was not new, though it seems this was the first time it was pushed in a mainstream commercial context by a consortium that included so many large server makers.

For a while, InfiniBand was hot. Interest grew. Lots of startups sprang up. Everyone touted their IB roadmap. Then the bubble burst. It wasn't just IB -- it was the implosion of the dot com boom, stock market adjustment, cautious corporate IT spending, the recession, 9/11, etc. The net effect was that no one was in the mood to adopt or invest in such a far-reaching technology jump. So, a lot of renewed skepticism emerged.

Then the other shoe dropped. Intel decided to discontinue IB chip development, though they continued to promote the technology. Why did they do this? I can only speculate. Their initial development was based on 1x links, which seemed to miss the mark (4x became the most popular size). Intel was not immune to R&D fiscal pressures either, and so needed to shift resources to help with PCI-Express development. They may also have been reacting to less favorable market conditions for IB acceptance. In any case, the loss of Intel's product made many feel that the IB market would not have sufficient volume. This led to a raft of vendors delaying or retrenching their IB roadmaps. Microsoft left the IBTA for the RDMAC (more on that later). Startups merged and failed. A stark reality set in. Folks at SIGCOMM 2003 asked me: "Isn't InfiniBand dead?"

Fast forward to today. After much gloom and doom, IB seems to be making a comeback in select markets. In particular, IB is making gains as a clustering interconnect. This is particularly true in latency-sensitive applications (e.g. databases and HPTC). In this market, IB has a latency edge over Ethernet. For a short while, it will also have a cost edge at 10Gbps, at least until 10Gbps CX-4 Ethernet ramps volume. Oddly enough, CX-4 is the InfiniBand 4x physical layer. So the 10Gbps Ethernet cost breakthrough was enabled by InfiniBand! Anyway, Quadrics and Myrinet (and others) play here too. They can do better in latency, but their bandwidth is not as good as IB's. Further, they are proprietary.

IB isn't completely dead in other areas, but its success there will be a much tougher fight. There is a niche market for IB network adapters (e.g. bridge to Ethernet) and storage adapters (bridge to Fibre Channel). A few companies are using it to aggregate and encapsulate PCI traffic within the server. Some server vendors use it to interconnect blade servers (the modern version of decomposed, virtualized servers). Parts of the technical vision that the IBTA once touted are still possible, but the very favorable market conditions needed to fuel it are gone.
