A Sensor Abstraction Layer for FMA

Solaris Nevada build 96 is an important milestone build for the Sensor Abstraction Layer project for FMA, as it introduces the software infrastructure (the plumbing) on which the functionality described in the original design doc1 will built.

This was a collaborative effort between the Solaris FMA team and the Fishworks team (specifically Eric Schrock) and involved over 7000 lines of change to over 60 source files.  Below are the two putback notifications that comprised the combined work:

First my putback done on July 31st:

Comment:
PSARC 2008/428 Extending libnvpair for type double
PSARC 2008/463 Extending HC FMRI scheme to represent sensors/indicators
6579615 fmtopo -e has lots of memory leaks
6635159 libtopo: extend hc scheme to allow for representing sensors and indicators in the topology
6692392 fmtopo -x doesn't handle property methods properly
6718703 Need to extend libnvpair to support type double
6718712 libtopo: Need to implement facility provider module for IPMI
6722594 libtopo: the topo_prop_set_\* interfaces need to learn to play well with propmethods
6727190 libtopo: add support for node properties of type double
6727459 libipmi: need interface to convert raw sensor readings to unit-based values
6727470 libipmi: need convenience routine to convert sensor unit defines to string
6729595 libtopo: add <set> case in fan and psu xml maps for SUN-FIRE-X4600-M2
6732318 fmd: small leak in sysevent modelling code

Files:
update: usr/src/cmd/fm/fmd/common/fmd_sysevent.c
update: usr/src/cmd/fm/fmtopo/common/fmtopo.c
update: usr/src/common/nvpair/nvpair.c
update: usr/src/lib/fm/topo/libtopo/Makefile.com
update: usr/src/lib/fm/topo/libtopo/common/hc.c
update: usr/src/lib/fm/topo/libtopo/common/libtopo.h
update: usr/src/lib/fm/topo/libtopo/common/mapfile-vers
update: usr/src/lib/fm/topo/libtopo/common/topo_2xml.c
update: usr/src/lib/fm/topo/libtopo/common/topo_error.h
update: usr/src/lib/fm/topo/libtopo/common/topo_fmri.c
update: usr/src/lib/fm/topo/libtopo/common/topo_method.c
update: usr/src/lib/fm/topo/libtopo/common/topo_method.h
update: usr/src/lib/fm/topo/libtopo/common/topo_mod.h
update: usr/src/lib/fm/topo/libtopo/common/topo_node.c
update: usr/src/lib/fm/topo/libtopo/common/topo_parse.h
update: usr/src/lib/fm/topo/libtopo/common/topo_prop.c
update: usr/src/lib/fm/topo/libtopo/common/topo_subr.c
update: usr/src/lib/fm/topo/libtopo/common/topo_subr.h
update: usr/src/lib/fm/topo/libtopo/common/topo_xml.c
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/common/topology.dtd.1
update: usr/src/lib/fm/topo/maps/i86pc/chip-hc-topology.xml
update: usr/src/lib/fm/topo/maps/i86pc/fan-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/i86pc/i86pc-hc-topology.xml
update: usr/src/lib/fm/topo/maps/i86pc/psu-hc-topology.xml
update: usr/src/lib/fm/topo/modules/common/Makefile
update: usr/src/lib/libipmi/Makefile.com
update: usr/src/lib/libipmi/common/ipmi_impl.h
update: usr/src/lib/libipmi/common/ipmi_sdr.c
update: usr/src/lib/libipmi/common/ipmi_util.c
update: usr/src/lib/libipmi/common/libipmi.h
update: usr/src/lib/libipmi/common/mapfile-vers
update: usr/src/lib/libipmi/common/mktables.sh
update: usr/src/lib/libnvpair/libnvpair.c
update: usr/src/lib/libnvpair/mapfile-vers
update: usr/src/pkgdefs/SUNWfmd/prototype_com
update: usr/src/uts/common/sys/fm/protocol.h
update: usr/src/uts/common/sys/nvpair.h
create: usr/src/lib/fm/topo/modules/common/fac_prov_ipmi/Makefile
create: usr/src/lib/fm/topo/modules/common/fac_prov_ipmi/fac_prov_ipmi.c

Examined files: 41

Contents Summary:
       2   create
      39   update

Names Summary:
       2   update parent's name history
       2   update children's name history

And now Eric Schrock's putback, done on August 1st:

Comment:
PSARC 2008/485 SES Sensors and Enumerator
6720433 SES enumerator should provide controller revision information
6720435 SES enumerator should prefer description over class-description
6720452 SES enumerator should support indicators and sensors
6722807 SES enumerator should work with internal enclosures
6722809 want a way to identify enclosures as internal
6722811 SES enumerator should prefer elements with known status
6723603 x86 xmlgen topo scripts should make use of propmap
6732875 typo in fan-hc-topology.xmlgen
6732879 broken logic in pad_process()

Files:
update: usr/src/lib/fm/topo/libtopo/common/topo_parse.h
update: usr/src/lib/fm/topo/libtopo/common/topo_xml.c
update: usr/src/lib/fm/topo/maps/Makefile.map
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Makefile
update: usr/src/lib/fm/topo/maps/common/topology.dtd.1
update: usr/src/lib/fm/topo/maps/i86pc/fan-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/i86pc/i86pc-hc-topology.xml
update: usr/src/lib/fm/topo/modules/common/ses/Makefile
update: usr/src/lib/fm/topo/modules/common/ses/ses.c
update: usr/src/lib/scsi/plugins/ses/Makefile
update: usr/src/lib/scsi/plugins/ses/libses/common/libses.h
update: usr/src/pkgdefs/SUNWfmd/prototype_i386
update: usr/src/pkgdefs/SUNWscsip/prototype_com
update: usr/src/pkgdefs/SUNWscsip/prototype_i386
update: usr/src/pkgdefs/SUNWscsip/prototype_sparc
update: usr/src/tools/scripts/bfu.sh
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-disk-hc-topology.xmlgen
create: usr/src/lib/fm/topo/maps/common/xmlgen-header.xml
create: usr/src/lib/fm/topo/modules/common/ses/ses.h
create: usr/src/lib/fm/topo/modules/common/ses/ses_facility.c
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/Makefile.com
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/amd64/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/common/lsilogic.c
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/i386/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/sparc/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/sparcv9/Makefile

Examined files: 33

Contents Summary:
      10   create
      23   update

Names Summary:
       4   renamed
      10   update parent's name history
      14   update children's name history

The Sensor Abstraction Layer project page has been updated with links to some new documentation. Below is some more details on three of the key new FMA infrastructure changes: hc FMRI scheme extensions, facility nodes and facility providers.



First some background...

As touched in my previous blog entry, the Solaris Fault Manager maintains a snapshot of the hardware topology in a tree-like structure that includes a node for all hardware resources and FRU's that are managed/monitored by FMA.  The interfaces for generating a topology snapshot, walking the resulting tree and for manipulating the individual nodes in the tree are provided by libtopo and documented in Chapter 9 of the Fault Manager Programmer's Reference Guide.

The Sensor Abstraction Layer for FMA extends libtopo so that sensors and indicators can also be represented in our topology in a fashion that allows for the association of sensors and indicators to hardware resource to be programmatically determined.

Additionally it introduces the concept of a facility provider module which provides an abstraction layer between libtopo and the lower-level interfaces that are used to control a given sensor or indicator.

Together this provides a set of common infrastructure to enable future FMA projects to manipulate sensors and indicators as part of Fault Management activities.


hc FMRI scheme extensions

Th existing hc-scheme allows for a heirarchial representation of hardware resources, according to their physical connection properties. However, this is not a very useful way to represent sensors and indicators in the topology because it does not allow for consumers to programmatically determine the association of sensors/indicators to the hardware resource that they're monitoring.

In Solaris Nevada build 96, we've extended the hc FMRI scheme to allow for this association to be represented using a new type of node in the topology: a facility node.

A facility node is a special leaf node in the hc-scheme topology that represents either a sensor or an indicator. A fault managed resource may have one or more child facility nodes that represent sensors or indicators that are associated with it. The hc-scheme was be extended as shown below to allow for an additional facility node member:

Name Data Type Description
scheme uint32 scheme used for FMRI
version uint32 version of scheme specification
authority nvlist optional authority of FMRI
payload
resource path
facility nvlist facility component of FMRI

The facility nvlist will have two members:

Name Data Type Description
facility-type string type of facility node: "sensor" or "indicator"
facility-name string name of the facility

The string representation of an hc scheme FMRI will also be extended, as shown below:

<scheme>://[authority]/<resource-path>[?<fac-type>=<fac-name>]

where fac-type can be either "sensor" or "indicator" and fac-name is the name of the facility.

for example:
hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/fanmodule=0/fan=0?sensor=speed hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/bay=47?indicator=ok2rm


Anatomy of a Facility Node

Facility nodes are required to have the following properties specified in a "facility" property group:

 Property Name
Threshold Sensors
Discrete Sensors
Indicators
 type  Yes Yes
Yes
 sensor-class  Yes Yes
No
 reading  Yes No
No
 state Yes
Yes
No
 units Yes
No
No
 mode No
No
Yes

These properties allow for the classification of the facility node to be programmatically determined and are used by the new topo_fmri_facility() interface to check for the existence of sensors or indicators of a given type.

'sensor-class" property

All facility nodes of type "sensor" must specify a "sensor-class" property that is set to one of the following values.

Value Data Type Define
"threshold" string TOPO_SENSOR_CLASS_THRESHOLD
"discrete" string TOPO_SENSOR_CLASS_DISCRETE

'units' property

All 'sensor' facility nodes with a "sensor-class" property value of TOPO_SENSOR_CLASS_THRESHOLD are required to specify a "units" property of type uint32. The value should be set to one of the predefined unit types specified in libtopo.h (see the TOPO_SENSOR_UNIT_\* defines)

'type' property

All 'sensor' and 'indicator' facility nodes must provide a "type" property of type uint32. The value should be set to one of the predefined unit types specified in libtopo.h (see the TOPO_SENSOR_TYPE_\* and TOPO_LED_TYPE_\* defines)


Facility Providers

A facility provider is a logical collection of node and property methods that provide an abstraction layer between libtopo and the underlying lower level interfaces that are used to actually manipulate the sensors and indicators. This allows library consumers (namely fmd) to access sensor readings and manipulate indicators via standard libtopo interfaces (e.g. topo_prop_{get|set}_{type}). Nevada build 96 includes the implementation of facility provider modules for IPMI2 and SES3, which will provide broad coverage across Sun's x64 server platforms.  The diagram below shows how the provider modules fit into the overall software structure:

Facility Provider Block Diagram


Facility providers are implemented as simplified libtopo plugin modules (similar to enumerator modules). However, in the implementation of their tmo_enum entry point, a facility provider will simply register its methods on the node that is passed in.

At a minimum, a facility provider must implement the following property methods:

'reading' property method

Sensor nodes with a "sensor-class" property value of TOPO_SENSOR_CLASS_THRESHOLD must provide a property method for the "reading" property of type TOPO_TYPE_DOUBLE that should return the current analog reading from the sensor.

'state' property method

All sensors nodes (threshold and discrete) should provide a property method for a "state" property of type uint32. The property value should be set to one of the predefined sensor-type specific discrete states defined in libipmi.h (see the TOPO_SENSOR_STATE_\* defines)


'mode' property method

For 'indicator' facility nodes, the facility provider must implement a property method to get/set the LED mode.  The mode property can be set to one of the following two values: 0 (OFF) or 1 (ON)

Facility providers can also optionally implement a node method (fac_enum) that can be invoked on a given hardware resource node to automatically discover and enumerate facility nodes that should be bound (associated) with it.

Below are some example excerpts of fmtopo4 output for both a threshold and discrete sensor as well as an indicator node. These examples were taken from a Sun-Fire X4500.

hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/motherboard=0/chip=0?sensor=proc.p0.t_core
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    proc.p0.t_core
    sensor-class      string    threshold
    type              uint32    0x101 (THRESHOLD_STATE)
    state             uint32    0x0 (0x00)
    reading           double    49.000000
    units             uint32    0x1 (DEGREES_C)

hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/psu=0?sensor=ps0.prsnt
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    ps0.prsnt
    sensor-class      string    discrete
    type              uint32    0x108 (GENERIC_PRESENCE)
    state             uint32    0x2 (ASSERTED)

hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/bay=40?indicator=present
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    hdd40.state
    mode              uint32    0x1 (ON)
    type              uint32    0x3 (PRESENT)

Well - that's it for now.  In my next blog entry I'll give some example that demonstrate how easy it is to use the new interfaces in libtopo to get sensor readings or flip LED's on or off.


[1] The hc FMRI scheme extensions and the concept of facility nodes are based on Cynthia McGuire's original design document for the Sensor Abstraction Layer, so readers may find it beneficial to review section 2.3 of that document to gain additional background.

[2] IPMI is an acronym for Intelligent Platform Management Interface which is an Intel specification for doing out-of-band management of computers.  Over the last few years it has become and industry standard and thus most x86 server platforms (including those made by Sun) support IPMI.  Through IPMI we can get access to platforms sensors and indicators.

[3] SES is an acronym for SCSI Enclosure Services which is a protocol for accessing diagnostic services for SCSI storage enclosures including things like temperature and voltage sensors.

[4]  fmtopo is a command-line utility that developers can use to take a snapshot and dump the resulting topology.  It's usage is documented in chapter 12 of the Fault Manager Programmer's Reference Guide.

Comments:

Post a Comment:
Comments are closed for this entry.
About

user12611677

Search

Top Tags
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today