The x86gentopo Project

You've all seen this before... actually, maybe you haven't. In which case, good! You've got a healthy system. But if your box did have issues, FMA would report something like:

    # fmadm faulty
    --------------- ------------------------------------ -------------- ---------
    TIME            EVENT-ID                             MSG-ID         SEVERITY
    --------------- ------------------------------------ -------------- ---------
    May 13 15:00:02 04837324-f221-e7dc-f6fa-dc7d9420ea76 AMD-8000-AV    Major

    Fault class : fault.cpu.amd.dcachedata
    Affects     : cpu:///cpuid=0
                      degraded but still in service
    FRU         : "CPU 0" (hc://:product-id=Sun-Ultra-20-Workstation:chassis-id=0604FK401F:server-id=hexterra/motherboard=0/chip=0)
    ...

Most people are immediately interested in the FRU label, the "CPU 0" in the above output. It answers the question of what needs to be replaced to fix the system. But where does that label come from? Today, FRU labels are put in via platform-specific code, and not all Sun platforms have FRU label support coded. Running Solaris on HP, Dell, or IBM gear? Unless you've modified OpenSolaris, you don't have FRU labels.

You'll still have the FMRI (the hc://... stuff above). And while the example above seems simple enough, it's not always straightforward determining a FRU from an FMRI (try doing it for IO). Solaris FMA, specifically the topology, is supposed to do the translation for you.
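To make the "hc://... stuff" a bit more concrete, here is a minimal Python sketch that pulls apart the FMRI from the fmadm output above. The function name `parse_hc_fmri` is made up for illustration, and the parsing is deliberately simplified: it assumes an authority section of `:key=value` fields followed by `/name=instance` path components, and ignores anything fancier the hc scheme may allow. It is not how FMA itself does it.

```python
def parse_hc_fmri(fmri):
    """Split an hc-scheme FMRI into (authority dict, [(name, instance), ...]).

    Simplified sketch: assumes ':key=value' authority fields and
    '/name=instance' path components with integer instances.
    """
    prefix = "hc://"
    if not fmri.startswith(prefix):
        raise ValueError("not an hc-scheme FMRI: %r" % fmri)
    rest = fmri[len(prefix):]

    # Everything before the first '/' is the authority; the rest is the path.
    authority_str, _, path_str = rest.partition("/")

    authority = {}
    for field in authority_str.split(":"):
        if field:
            key, _, value = field.partition("=")
            authority[key] = value

    path = []
    for component in path_str.split("/"):
        if component:
            name, _, instance = component.partition("=")
            path.append((name, int(instance)))
    return authority, path


# The FMRI from the fmadm output above.
fmri = ("hc://:product-id=Sun-Ultra-20-Workstation:chassis-id=0604FK401F"
        ":server-id=hexterra/motherboard=0/chip=0")
authority, path = parse_hc_fmri(fmri)
print(authority)
print(path)
```

Even with the FMRI split up like this, all you know is "chip 0 on motherboard 0". Turning that into the label printed on the board is the part that needs platform knowledge.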

A recently added project, the x86gentopo project, is working to revamp topology creation on x86 platforms. The concept is that a platform, Sun or otherwise, can describe its topology and FRU information to Solaris FMA. And the goal is to use existing industry standards to do so. My colleague Tom Pothier is leading the project, and the first go-round is looking to use SMBIOS information to build topology.

SMBIOS already provides for representation of baseboard FRUs, processors, memory and the like. With some extensions, it can also provide the glue to connect the structures together and associate them with the devices as Solaris sees them. A very basic example is depicted at right. Several fields in the structures allow them to be arranged in a logical hierarchy. FMRI names can be derived either directly from the structure type ('chassis' for Type 3 seems obvious), or by a mapping (such as using "Board Type" in Type 2s). The picture at right could produce a topology like:

    hc:///chassis=0
    hc:///chassis=0/cpuboard=0
    hc:///chassis=0/cpuboard=0/chip=0

As for labels, cpuboard=0 is labeled per the Type 2 "Location" field. The processor, if it is a FRU, is labeled per the Type 4 "Socket Designation" field. Something else Sun is including in the FMRIs is details on product serial numbers, part numbers, serial numbers, and chassis serial numbers. If you're using Sun's ASR for Systems offering, including this level of detail allows automatic checking for service entitlement, case creation, and replacement part ordering.
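
The naming and labeling rules above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the record layout, the `parent` link (standing in for whatever the extensions use to tie structures together), the "Processor Module" to cpuboard mapping, the sample label strings, and the per-parent instance numbering. It is not the project's actual code or data format.

```python
# Hypothetical, simplified stand-ins for decoded SMBIOS structures.
# "parent" is an assumed link between structures, not a standard field name.
RECORDS = [
    {"handle": 1, "type": 3, "parent": None},
    {"handle": 2, "type": 2, "parent": 1,
     "board_type": "Processor Module", "location": "CPU Board 0"},
    {"handle": 3, "type": 4, "parent": 2, "socket": "CPU 0"},
]

# Names taken directly from the structure type.
TYPE_NAMES = {3: "chassis", 4: "chip"}
# Names for Type 2 come from a mapping on "Board Type" (illustrative values).
BOARD_TYPE_NAMES = {"Processor Module": "cpuboard", "Motherboard": "motherboard"}


def node_name(rec):
    """Pick the FMRI component name for one structure."""
    if rec["type"] == 2:
        return BOARD_TYPE_NAMES[rec["board_type"]]
    return TYPE_NAMES[rec["type"]]


def node_label(rec):
    """Type 2 is labeled by Location, Type 4 by Socket Designation."""
    if rec["type"] == 2:
        return rec.get("location")
    if rec["type"] == 4:
        return rec.get("socket")
    return None


def build_topology(records):
    """Return [(fmri, label), ...]; assumes parents are listed before children."""
    counters = {}  # (parent handle, name) -> next instance number
    paths = {}     # handle -> FMRI built so far
    result = []
    for rec in records:
        name = node_name(rec)
        key = (rec["parent"], name)
        instance = counters.get(key, 0)
        counters[key] = instance + 1
        parent_path = "hc://" if rec["parent"] is None else paths[rec["parent"]]
        path = "%s/%s=%d" % (parent_path, name, instance)
        paths[rec["handle"]] = path
        result.append((path, node_label(rec)))
    return result


for fmri, label in build_topology(RECORDS):
    print(fmri, label)
```

With the three sample records, this walks chassis, then board, then processor, and produces the same three hc:/// paths shown above, with labels attached to the board and the chip.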

This is merely a simple example, and there are many more complexities in building a generic mechanism for generating FMA topology on Solaris x86. If you're interested in the project, become an observer and/or join the x86gentopo-discuss alias. We'd welcome the help :)

:wq
