Tuesday Jun 14, 2005

A Brief History of Network Driver Development in Solaris


Today is the day we've all been waiting for: OpenSolaris is finally here!

Now that the source is out in the wild, I figured it would be a good time to discuss an area of Solaris technology that has kept me busy for the past year. Solaris device driver development, particularly network drivers, has had a long and complicated history. I'd like to go over the progression of technologies and APIs and help point future OpenSolaris contributors in the right direction should they decide to write their own network drivers.

For what it's worth, writing device drivers in Solaris is a well-documented procedure. There are even quite a few tutorials on the subject.

The network device driver is the first layer that packets go through when entering an operating system: where the rubber meets the road, so to speak. In the OSI seven-layer model, this would be considered the Data Link Layer. The driver is responsible for grabbing the packets from the hardware and making them presentable to the Network Layer above. In most cases we'd be talking about IP at this point.

For this blog entry I plan on concentrating on the interface between the driver and the kernel, where most new development has occurred. The Device Driver Interface (DDI) has been relatively stable over the past few years.

The Past - Monolithic DLPI Drivers

In the past, network drivers tended to be written as huge monolithic drivers. All communication between the driver and the hardware was done through the DDI, and all communication from the driver to IP was done through the Data Link Provider Interface (DLPI) and STREAMS. DLPI "is the boundary between the network and data link layers in the OSI Reference Model".

DLPI was a stable API, so each driver tended to have the same code to handle each message. As a result, there ended up being a lot of duplication of "communication" code between various network drivers.

There are a couple of drivers in Solaris today that are still monolithic DLPI drivers. Feel free to check out:

  • hme - HME 10/100Mbps NIC
  • eri - ERI 10/100Mbps NIC
Note the amount of code devoted just to DLPI message processing.

The Present - Generic LAN Driver v2 (GLDv2)

Back in 1994, some enterprising engineers decided that writing the DLPI and STREAMS code each time they came up with a new driver wasn't such a good idea. They decided to come up with a much simpler interface that handled all of the mundane, common code, and thus the Generic LAN Driver (GLD) interface was born. Later, in 1997, GLD was enhanced to version 2, which changed the mutex rules and behavior to handle the performance requirements of the "newer" 100Mbps network cards that were in vogue at the time. Other new features were introduced as well, such as kernel statistics and high-level interrupt support.

GLD allowed for the rapid development of network device drivers, which was especially important for Solaris x86. One of the most compelling benefits of using the GLD framework became apparent as the years went on: new features that were added to GLD became available to all GLD-based drivers with little to no work. Over the years, features such as hardware checksum support and zero copy were integrated into the GLD framework and made available to all NICs that supported them.

Additional features such as VLANs, Multidata Transmit (MDT), and Jumbo Frame support were added to the framework, but unfortunately, due to the limitations of the architecture, these features were not available to all drivers without retooling the driver code itself. When new features such as link aggregation (802.3ad) were requested, it became apparent that the GLD architecture was unable to scale to where we wanted it to go, and something new was needed.

Here are a few GLDv2 drivers in Solaris:

  • ixgb - Intel 10Gbps NIC
  • chxge - Chelsio 10Gbps NIC
The GLDv2 interface has been stable and public for some time now, and third party GLDv2 drivers can be found in a number of locations. An excellent repository of drivers can be found on Murayama-san's web site.

The Future - Project Nemo (a.k.a. GLDv3)

In early 2004, Project Nemo (GLDv3) was started to address some of the limitations of GLDv2 and provide a network driver API that could support all of the new technologies NIC manufacturers were developing. Feature-wise, all GLDv3 drivers must support VLANs (802.1q) and Link Aggregation (802.3ad) out of the box. The framework was also retooled to be extensible -- new features were implemented through the use of loadable function pointers that are exposed to the driver. Lastly, we allowed more "trust" between the driver and the kernel by enabling direct function calls between the two layers. This helped improve performance by allowing fast path data traffic to bypass the STREAMS interface, which has been a bottleneck for some time now.
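To make the "function pointers plus direct calls" idea a bit more concrete, here is a minimal sketch in plain C. It is not the GLDv3 API: every name in it (demo_pkt_t, demo_driver_ops_t, demo_register, the toy_* driver, and so on) is made up for illustration. It only shows the general shape of the technique, assuming the driver hands the framework a table of callbacks, the framework calls the driver directly instead of passing messages, and an optional feature is simply an entry that may be left NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet type for this sketch; not the Solaris mblk_t. */
typedef struct demo_pkt {
    struct demo_pkt *next;
    size_t len;
} demo_pkt_t;

/* Hypothetical callback table that a driver hands to the framework. */
typedef struct demo_driver_ops {
    int  (*start)(void *drv);                  /* required */
    void (*stop)(void *drv);                   /* required */
    int  (*tx)(void *drv, demo_pkt_t *pkt);    /* required */
    int  (*set_promisc)(void *drv, int on);    /* optional, may be NULL */
} demo_driver_ops_t;

/* Hypothetical framework-side handle for one registered driver. */
typedef struct demo_framework {
    const demo_driver_ops_t *ops;
    void *drv;
} demo_framework_t;

/* Accept a driver only if it supplies every required entry point. */
int demo_register(demo_framework_t *fw, const demo_driver_ops_t *ops, void *drv)
{
    assert(fw != NULL);
    if (ops == NULL || ops->start == NULL || ops->stop == NULL ||
        ops->tx == NULL)
        return -1;
    fw->ops = ops;
    fw->drv = drv;
    return 0;
}

/* Fast path: a direct function call into the driver, no message queue. */
int demo_send(demo_framework_t *fw, demo_pkt_t *pkt)
{
    return fw->ops->tx(fw->drv, pkt);
}

/* Optional feature: only used when the driver filled in the pointer. */
int demo_set_promisc(demo_framework_t *fw, int on)
{
    if (fw->ops->set_promisc == NULL)
        return -1;                              /* feature not offered */
    return fw->ops->set_promisc(fw->drv, on);
}

/* A toy driver that only counts the bytes it was asked to transmit. */
typedef struct toy_drv {
    int running;
    size_t tx_bytes;
} toy_drv_t;

int toy_start(void *d) { ((toy_drv_t *)d)->running = 1; return 0; }
void toy_stop(void *d) { ((toy_drv_t *)d)->running = 0; }
int toy_tx(void *d, demo_pkt_t *p) { ((toy_drv_t *)d)->tx_bytes += p->len; return 0; }

/* The toy driver leaves the optional set_promisc entry empty. */
const demo_driver_ops_t toy_ops = { toy_start, toy_stop, toy_tx, NULL };
```

In this sketch, a caller would register the toy driver with demo_register(&fw, &toy_ops, &drv) and then push packets through demo_send(). Adding a new optional capability means adding one more pointer to the table, and drivers that don't fill it in keep working unchanged.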

Since this was a network performance project, a number of optimizations and tweaks were done to get sizable performance benefits over GLDv2 drivers. Function pointers and tail call optimization were used to reduce the number of instructions per packet. We introduced packet chaining, which allows for a number of performance benefits by doing similar transactions over a chain of packets (which helped cache hits, among other things). And lastly, a novel method of adaptive interrupt coalescing was developed which reduced interrupt context switches during heavy load by turning off interrupts for short periods of time.
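Packet chaining is easier to see in code than in prose, so here is a rough sketch in plain C. None of this is Nemo code: demo_pkt_t, demo_stats_t, and the demo_* functions are hypothetical names invented for the example. The assumption is simply that packets carry a next pointer, so the driver can link up everything it pulled off the hardware and hand the whole batch to the layer above in a single call rather than one call per packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet type for this sketch; not the Solaris mblk_t. */
typedef struct demo_pkt {
    struct demo_pkt *next;   /* next packet in the chain, or NULL */
    size_t len;
} demo_pkt_t;

/* Counters kept by the (hypothetical) layer above the driver. */
typedef struct demo_stats {
    unsigned upcalls;        /* how many times the driver called up */
    unsigned packets;
    size_t bytes;
} demo_stats_t;

/* Upper-layer receive routine: walks a whole chain in one call. */
void demo_rx_chain(demo_stats_t *st, demo_pkt_t *chain)
{
    assert(st != NULL);
    st->upcalls++;
    for (demo_pkt_t *p = chain; p != NULL; p = p->next) {
        st->packets++;
        st->bytes += p->len;
    }
}

/* Driver side: link an array of received packets into one chain. */
demo_pkt_t *demo_build_chain(demo_pkt_t *pkts, size_t n)
{
    if (n == 0)
        return NULL;
    for (size_t i = 0; i + 1 < n; i++)
        pkts[i].next = &pkts[i + 1];
    pkts[n - 1].next = NULL;
    return &pkts[0];
}

/* Unchained delivery, for comparison: one upcall per packet. */
void demo_rx_each(demo_stats_t *st, demo_pkt_t *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pkts[i].next = NULL;
        demo_rx_chain(st, &pkts[i]);
    }
}
```

With four packets, the chained path makes one call into the upper layer while the unchained path makes four, yet both see the same packets and bytes. The per-call overhead is paid once per batch, and the loop touches similar data back to back, which is the sort of thing that helps cache behavior.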

GLDv3 is the future. All new Solaris network drivers will be written to this API, and there is a lot of work going on right now to convert existing GLDv2 drivers to GLDv3. The two APIs are similar enough that the actual porting process usually only takes a few days of coding. And once you've gone to GLDv3 you automatically get a hefty performance improvement in numerous benchmarks due to the performance enhancements listed above.

Currently only one Nemo-based driver is available through OpenSolaris. We have another which doesn't appear to have been cleared for release just yet.

  • bge - Broadcom Gigabit Ethernet Driver

We don't recommend that driver writers code to the API right now, as it's still evolving and we have a number of improvements in store with the new projects we're working on. Once we've stabilized the API we will provide the necessary documentation to help everyone get on board. Of course, now that Solaris is open source, even evolving interfaces can be played with :). So have fun with it!


Tuesday Apr 05, 2005

Start the blitz

Well, it's been a busy few months putting the polish on our new Generic LAN Driver (GLDv3) framework, codenamed Nemo. The code will be available to the general public through our Solaris Express program soon! As we get closer to release the team will go into more detail about some of the new features, but the big ticket items to look forward to are:

  • Improved network performance
  • VLAN support (802.1q)
  • Trunking/Link Aggregation support (802.3ad)
  • Lower CPU utilization for network traffic

We're also working on a demo for the Sun Labs Open House at the end of the month showcasing the technology behind Nemo. I don't believe it's open to the public, but my understanding is that customers should talk to their Sun sales associate if they're interested. In either case, we should hopefully be able to talk about some of the performance results we achieve. Stay tuned.

Wednesday Sep 08, 2004

The gathering storm

Woo, got a few more Solaris Networking folks registered with blogs! Carol, Tobin, Adi, and Sunay are all set up; they just need to put up their first entries. /tap. /tap. /tap.

I'm a bit skeptical of the "Today's Page Hits" counter that is auto-generated on most Roller blogs. For one, it looks like it's cumulative. I've seen other Sun people's blogs with Today's Page Hits of 60k or so. For another, the numbers seem awfully inflated. Looking at Adi's blog above, we see it has over 3000 hits, but his blog has 0 entries. Are people flocking to an empty blog or is it just noise in the ether?

Doh, I don't seem to be on planetsun yet. Dave the maintainer mentioned that anything that shows up on blogs.sun.com should automatically be subscribed. Hopefully this entry will nudge it a bit.

[EDIT] Never mind, looks like I popped up on planetsun after all. Thanks Dave for such a valuable resource!

Tuesday Sep 07, 2004

testing the waters...

So I hear there's some grumbling about the lack of participation in Sun's blogging endeavor on the part of SNST (Solaris Networking and Security Technologies). Hopefully this blog, along with the others that are likely to come online soon, will help shore up the numbers a bit.

Perhaps a short introduction is in order.

I work for Sun as an engineer on the Network Performance Team. I've been here a little over 2 years, which is still probably considered newbie territory for some of the old-timers here :). My current project is code-named Nemo (the fish, not the captain), and it aims to provide a new version of the Generic LAN Driver technology that will give driver writers a solid framework for delivering next generation networking features. Among other lofty goals, the aim of our group in general, and this project in particular, is to make the Solaris networking stack move packets really, really fast. More on this later once I find out what parts of the cat are safe to let out of the bag, so to speak.

Hmmmm, I wonder if there's an app out there that will let me blog from my Treo 600...



