Fishworks Hardware Topology
By eschrock on Nov 09, 2008
It's hard to believe that this day has finally come. After more than two and a half years, our first Fishworks-based product has been released. You can keep up to date with the latest info at the Fishworks blog.
For my first technical post, I thought I'd give an introduction to the chassis subsystem at the heart of our hardware integration strategy. This subsystem is responsible for gathering, cataloging, and presenting a unified view of the hardware topology. It underwent two major rewrites (one by me and one by Keith) but the fundamental design has remained the same. While it may not be the most glamorous feature (no one's going to purchase a box because they can get model information on their DIMMs), I found it an interesting cross-section of disparate technologies and awash in subtle complexity. You can find a video of me talking about and demonstrating this feature here.
At the heart of the chassis subsystem is the FMA topology as exported by libtopo. This library is already capable of enumerating hardware in a physically meaningful manner, and FMRIs (fault managed resource identifiers) form the basis of FMA fault diagnosis. This alone provides us the following basic capabilities:
- Discover external storage enclosures
- Identify bays and disks
- Identify CPUs
- Identify power supplies and fans
- Manage LEDs
- Identify PCI functions beneath a particular slot
Much of this requires platform-specific XML files, or leverages IPMI behind the scenes, but this minimal integration work is common to Solaris. Any platform supported by Solaris is supported by the Fishworks software stack.
Unfortunately, this falls short of a complete picture:
- No way to identify absent CPUs, DIMMs, or empty PCI slots
- DIMM enumeration not supported on all platforms
- Human-readable labels often wrong or missing
- No way to identify complete PCI cards
- No integration with visual images of the chassis
To address these limitations (most of which lie outside the purview of libtopo), we leverage additional metadata for each supported chassis. This metadata identifies all physical slots (even those that may not be occupied), cleans up various labels, and includes visual information about the chassis and its components. We can also identify physical cards based on devinfo properties extracted from firmware and/or the pattern of PCI functions and their attributes (a process worthy of its own blog entry). Combined with libtopo, this gives us images that we can assemble into a complete view based on the current physical layout, highlight components within the image, and respond to user mouse clicks.
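As a rough illustration of how static metadata can fill in absent components, here is a minimal sketch in Python. It is not the Fishworks implementation: the slot IDs, the dictionary layout, the `merge_slots` function, and the FMRI-like strings are all hypothetical placeholders. The idea is only that the metadata lists every physical slot, and the enumerated topology says which of them are occupied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    label: str                  # cleaned-up label taken from metadata
    kind: str                   # e.g. "cpu", "dimm", "pci-slot"
    present: bool               # True if the enumerator found something there
    fmri: Optional[str] = None  # identifier of the enumerated resource, if any


def merge_slots(metadata_slots, enumerated):
    """Build a complete slot view from static metadata plus enumeration.

    metadata_slots: list of dicts with "id", "kind", and "label" keys,
                    one per physical slot (occupied or not).
    enumerated:     dict mapping slot id -> identifier string for every
                    component the topology enumerator actually found.
    """
    view = []
    for slot in metadata_slots:
        fmri = enumerated.get(slot["id"])
        view.append(Slot(label=slot["label"], kind=slot["kind"],
                         present=fmri is not None, fmri=fmri))
    return view


# Hypothetical chassis metadata: every slot the chassis physically has.
METADATA = [
    {"id": "cpu0", "kind": "cpu", "label": "CPU 0"},
    {"id": "cpu1", "kind": "cpu", "label": "CPU 1"},
    {"id": "pcie0", "kind": "pci-slot", "label": "PCIe 0"},
]

# Hypothetical enumeration result: only CPU 0 is installed.
# The value is an illustrative placeholder, not a real FMRI.
ENUMERATED = {"cpu0": "hc:///motherboard=0/chip=0"}

if __name__ == "__main__":
    for s in merge_slots(METADATA, ENUMERATED):
        state = "present" if s.present else "absent"
        print(f"{s.label}: {state}")
```

Driving the loop from the metadata rather than from the enumeration is what makes empty slots show up at all; an enumerator on its own can only report what it finds.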
However, we are still missing many of the component details. Our goal is to be able to provide complete information for every FRU on the system. With just libtopo, we can get this for disks but not much else. We need to look to alternate sources of information.
For CPUs, there is a rather rich set of information available via traditional kstat interfaces. While we use libtopo to identify CPUs (it lets us correlate them with the physical chips), the bulk of the information comes from kstats. We use kstats to get model, speed, and the number of cores.
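To make the kstat step concrete, here is a small Python sketch that groups per-CPU `cpu_info` kstats by physical chip. It assumes the parseable `module:instance:name:statistic` output of `kstat -p` and the `brand`, `clock_MHz`, `chip_id`, and `core_id` statistics; the sample text and its values are made up for illustration, and the actual subsystem presumably reads kstats through the library rather than by parsing command output.

```python
# Made-up sample in the style of `kstat -p cpu_info` output:
# "module:instance:name:statistic<TAB>value", one statistic per line.
SAMPLE = """\
cpu_info:0:cpu_info0:brand\tExample CPU Model
cpu_info:0:cpu_info0:chip_id\t0
cpu_info:0:cpu_info0:clock_MHz\t2300
cpu_info:0:cpu_info0:core_id\t0
cpu_info:1:cpu_info1:brand\tExample CPU Model
cpu_info:1:cpu_info1:chip_id\t0
cpu_info:1:cpu_info1:clock_MHz\t2300
cpu_info:1:cpu_info1:core_id\t1
cpu_info:2:cpu_info2:brand\tExample CPU Model
cpu_info:2:cpu_info2:chip_id\t1
cpu_info:2:cpu_info2:clock_MHz\t2300
cpu_info:2:cpu_info2:core_id\t2
cpu_info:3:cpu_info3:brand\tExample CPU Model
cpu_info:3:cpu_info3:chip_id\t1
cpu_info:3:cpu_info3:clock_MHz\t2300
cpu_info:3:cpu_info3:core_id\t3
"""


def parse_kstat(text):
    """Return {instance: {statistic: value}} for cpu_info kstats."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest as the value
        if len(parts) != 2:
            continue                 # skip blank or value-less lines
        key, value = parts
        fields = key.split(":", 3)
        if len(fields) != 4 or fields[0] != "cpu_info":
            continue
        _, instance, _, stat = fields
        cpus.setdefault(int(instance), {})[stat] = value.strip()
    return cpus


def summarize_chips(cpus):
    """Collapse logical CPUs into one record per physical chip."""
    chips = {}
    for info in cpus.values():
        chip = chips.setdefault(int(info["chip_id"]), {
            "model": info["brand"],
            "mhz": int(info["clock_MHz"]),
            "core_ids": set(),
        })
        chip["core_ids"].add(int(info["core_id"]))
    return {cid: {"model": c["model"], "mhz": c["mhz"],
                  "cores": len(c["core_ids"])}
            for cid, c in chips.items()}


if __name__ == "__main__":
    for chip_id, chip in sorted(summarize_chips(parse_kstat(SAMPLE)).items()):
        print(chip_id, chip)
```

Counting distinct `core_id` values per `chip_id`, rather than counting logical CPUs, keeps the core count right on chips that expose more than one hardware thread per core.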
The device tree snapshot provides additional information for PCI devices that can only be retrieved through private driver interfaces. Despite the existence of a VPD (Vital Product Data) standard, effectively no vendors implement it. Instead, the product data is read by some firmware-specific mechanism private to the driver. By exporting this data as properties in the devinfo snapshot, we can transparently pull in dynamic FRU information for PCI cards. This is used to get model, part, and revision information for HBAs and 10G NICs.
IPMI (Intelligent Platform Management Interface) is used to communicate with the service processor on most enterprise class systems. It is used within libtopo for power supply and fan enumeration as well as LED management. But IPMI also supports FRU data, which includes a lot of juicy tidbits that only the SP knows. We reference this FRU information directly to get model and part information for power supplies and DIMMs.
Even with IPMI, there are bits of information that exist only in SMBIOS, a standard that is supposed to provide information about the physical resources on the system. Sadly, it does not provide enough information to correlate OS-visible abstractions with their underlying physical counterparts. With metadata, however, we can use SMBIOS to make this correlation. This is used to enumerate DIMMs on platforms not supported by libtopo, and to supplement DIMM information with data available only via SMBIOS.
Last but not least, there is chassis-specific metadata. Some components simply don't have FRUID information, either because they are too simple (fans) or there exists no mechanism to get the information (most PCI cards). In this situation, we use metadata to provide vendor, model, and part information as that is generally static for a particular component within the system. We cannot get information specific to the component (such as a serial number), but at least the user will be able to know what it is and know how to order another one.
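One way to picture how these sources combine is a simple precedence merge, sketched below in Python. The ordering (dynamic sources first, static metadata last) and the `merge_fru` helper are assumptions made for illustration, not a description of the actual code, and all vendor, model, and part values are invented.

```python
def merge_fru(*sources):
    """Merge FRU records, earliest source winning for each field.

    Each source is a dict of field -> value. Empty or missing values
    are ignored so that a later source can fill the gap.
    """
    merged = {}
    for source in sources:
        for field, value in source.items():
            if value and field not in merged:
                merged[field] = value
    return merged


# Hypothetical static metadata for one chassis, keyed by component kind.
STATIC_METADATA = {
    "fan": {"vendor": "ExampleVendor", "model": "Example Fan Module",
            "part": "000-0000"},
    "dimm": {"vendor": "ExampleVendor", "model": "Generic DIMM"},
}

if __name__ == "__main__":
    # A fan reports nothing dynamically, so metadata supplies everything
    # except component-specific fields such as a serial number.
    print(merge_fru({}, STATIC_METADATA["fan"]))

    # A DIMM with dynamic FRU data keeps it; metadata only fills gaps.
    dynamic_dimm = {"model": "Example DIMM 4G", "serial": "S123", "vendor": ""}
    print(merge_fru(dynamic_dimm, STATIC_METADATA["dimm"]))
```

Putting metadata last means it never overrides what the hardware reports about itself, which matches the limitation above: static data can say what a component is, but not which individual unit it is.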
Putting it all together
With all of this information tied together under one subsystem, we can finally present the user with complete information about their hardware, including images showing the physical layout of the system. This also forms the basis for reporting problems and analytics (using labels from metadata), manipulating chassis state (toggling LEDs, setting chassis identifiers), and making programmatic distinctions about the hardware (such as whether external HBAs are present). Over the next few weeks I hope to expound on some of these details in further blog posts.