Tuesday Nov 30, 2010

FMA and Email Notifications

In November, Oracle released a sneak peek at the next major release of Solaris in the form of Oracle Solaris 11 Express 2010.11.  There are tons of great features and innovations in this release.  One of the features I worked on was a new service, smtp-notify, that can be configured to send email notifications in response to various Fault Management events, such as when a hardware component has been diagnosed as faulty.  Notifications can be configured for the following FMA event types (the descriptions below have been excerpted from the smf(5) man page):


     problem-diagnosed

         A new problem has been diagnosed by the FMA subsystem.
         The diagnosis includes a list of one or more suspects,
         which (where appropriate) may have been automatically
         isolated to prevent further errors occurring. The problem
         is identified by a UUID in the event payload, and further
         events describing the resolution lifecycle of this
         problem quote a matching UUID.

     problem-updated

         One or more of the suspect resources in a problem
         diagnosis has been repaired, replaced or acquitted (or has
         been faulted again), but there remains at least one
         faulted resource in the list. A repair could be the
         result of an fmadm command line (fmadm repaired, fmadm
         acquit, fmadm replaced) or might have been detected
         automatically such as through detection of a part serial
         number change.

     problem-repaired

         All of the suspect resources in a problem diagnosis have
         been repaired, resolved or acquitted. Some or all of the
         resources might still be isolated at this stage.

     problem-resolved

         All of the suspect resources in a problem diagnosis have
         been repaired, resolved or acquitted and are no longer
         isolated (for example, a cpu that was a suspect and
         offlined is now back online again; this un-isolate action
         is usually automatic).

The smtp-notify service is enabled out-of-the-box.

# svcs smtp-notify
STATE          STIME    FMRI
online         Oct_28   svc:/system/fm/smtp-notify:default

You can list the default notification preferences with svcs(1):

# svcs -n
Notification parameters for FMA Events
    Event: problem-diagnosed
        Notification Type: smtp
            Active: true
            reply-to: root@localhost
            to: root@localhost

        Notification Type: snmp
            Active: true

        Notification Type: syslog
            Active: true

    Event: problem-repaired
        Notification Type: snmp
            Active: true

    Event: problem-resolved
        Notification Type: snmp
            Active: true

What does the above output tell us?  It tells us that problem-diagnosed events will result in an email notification being sent to root@localhost.  It will also result in a message being sent to syslog and an SNMP trap being generated.  Additionally, SNMP traps will be generated for problem-repaired and problem-resolved events.

What does an example email notification look like?  See below:

From noaccess@diffuser.sfbay.sun.com Wed Jul 21 19:58:29 2010
Date: Wed, 21 Jul 2010 19:58:29 -0700 (PDT)
From: No Access User <noaccess@diffuser.sfbay.sun.com>
To: root@localhost
X-FMEV-CLASS: list.suspect
X-FMEV-UUID: e82aa706-ce6a-cbbb-a529-ceef1c9b57b0
Subject: Fault Management Event: diffuser:AMD-8000-AV

SUNW-MSG-ID: AMD-8000-AV, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Jul 21 19:58:29 PDT 2010
PLATFORM: Sun-Fire-X4200-Server, CSN: 0000000000, HOSTNAME: diffuser
SOURCE: eft, REV: 1.16
EVENT-ID: e82aa706-ce6a-cbbb-a529-ceef1c9b57b0
DESC: The number of errors associated with this CPU has exceeded acceptable levels.  Refer to http://sun.com/msg/AMD-8000-AV for more information.
AUTO-RESPONSE: An attempt will be made to remove this CPU from service.
IMPACT: Performance of this system may be affected.
REC-ACTION: Schedule a repair procedure to replace the affected CPU.  Use 'fmadm faulty' to identify the module.

Those who've seen the messages that are logged to the console when FMA diagnoses a fault will see that the format is similar.  One additional thing to note is that each FMA email notification message also includes the following X-headers, which are there to aid admins who write mail filters:

 Header Name      Description
 X-FMEV-HOSTNAME  the name of the host on which the event occurred
 X-FMEV-CLASS     the event class
 X-FMEV-CODE      the Knowledge Article message ID
 X-FMEV-SEVERITY  the severity of the event
 X-FMEV-UUID      the UUID associated with the event
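
To give a feel for how a mail filter might use these headers, here is a minimal Python sketch.  It assumes the notification is available as raw message text and relies only on the two headers visible in the sample message above (X-FMEV-CLASS and X-FMEV-UUID).  The function name classify_fma_mail and the routing labels are made up for this example; they are not part of smtp-notify.

```python
from email import message_from_string


def classify_fma_mail(raw_message):
    """Return (label, uuid) for an FMA notification, or None for other mail."""
    msg = message_from_string(raw_message)
    event_class = msg.get("X-FMEV-CLASS")
    if event_class is None:
        return None  # no FMA header, so treat it as ordinary mail
    uuid = msg.get("X-FMEV-UUID", "")
    # list.suspect is the class shown in the sample message for a new diagnosis.
    # The labels below are hypothetical folder names chosen for illustration.
    label = "fma-diagnosis" if event_class == "list.suspect" else "fma-other"
    return label, uuid


SAMPLE = (
    "To: root@localhost\n"
    "X-FMEV-CLASS: list.suspect\n"
    "X-FMEV-UUID: e82aa706-ce6a-cbbb-a529-ceef1c9b57b0\n"
    "Subject: Fault Management Event: diffuser:AMD-8000-AV\n"
    "\n"
    "SUNW-MSG-ID: AMD-8000-AV, TYPE: Fault, VER: 1, SEVERITY: Major\n"
)

if __name__ == "__main__":
    print(classify_fma_mail(SAMPLE))
```

Because the UUID is carried in a header, a filter like this could also group the follow-up notifications for the same problem without parsing the message body.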


Email notifications for FMA events are highly configurable via svccfg(1m).  For example, you can enable or disable them per event type:

# svccfg setnotify problem-diagnosed mailto:active

# svccfg setnotify problem-diagnosed mailto:inactive

You can configure separate lists of one or more email recipients per event type.  For example:

# svccfg setnotify problem-repaired mailto:joe@somehost.com,admin@central.com

You can even define your own message body template.

# svccfg setnotify problem-diagnosed \\

Of course defining your own message template is nice, but it's only really useful if you have a way of referencing information about the actual FMA event in your message.  To facilitate this, we support the following expansion macros that can be embedded in message templates:

Macro Description
 %%  expands to a literal % character
 %<HOSTNAME>  expands to the hostname on which the event occurred
 %<URL>  expands to the URL of the knowledge article associated with this event
 %<CLASS>  expands to the event class
 %<UUID>  expands to the UUID of the event
 %<CODE>  expands to the knowledge article message ID
 %<SEVERITY>  expands to the severity of the event
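
To make the macro syntax concrete, here is a small Python sketch of how a template containing these macros could be expanded.  This is only an illustration of the substitution rules in the table above, not the code smtp-notify actually uses; the function name expand_template and the choice to leave unknown macros untouched are my assumptions.

```python
import re

# Matches either a literal "%%" or a "%<NAME>" macro.
_MACRO = re.compile(r"%%|%<([A-Z]+)>")


def expand_template(template, event):
    """Expand %% and %<NAME> macros using values from the event dict."""

    def substitute(match):
        if match.group(0) == "%%":
            return "%"
        name = match.group(1)
        # Assumption: a macro with no matching value is left as written.
        return event.get(name, match.group(0))

    return _MACRO.sub(substitute, template)


if __name__ == "__main__":
    event = {"HOSTNAME": "diffuser", "CODE": "AMD-8000-AV", "SEVERITY": "Major"}
    print(expand_template("%<SEVERITY> fault %<CODE> on %<HOSTNAME>", event))
```

Handling "%%" in the same pattern as the named macros means an escaped percent sign is consumed before the text after it can be mistaken for the start of a macro.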


But wait…there's more!

The smtp-notify service can also be configured to generate notifications for SMF service state transitions.  I won't go into the details of that here, but it's all documented in the smf(5), svccfg(1m) and smtp-notify(1m) man pages.

Monday Nov 23, 2009

Recent FMA Work

Well it's been over 15 months since my last blog entry so I thought I'd take the time to provide an update on some of the FMA work I've been doing in the interim - broken down by the Solaris Nevada build that the work was integrated into:

Build 98
6735704 deadlock in topo_node_facility()
6735691 deadlock in topo_node_facbind()
6743070 libtopo: need FRU labels for chip nodes on doradi

The first two changes were bug fixes to address a pair of potential deadlocks in libtopo introduced in the Sensor Abstraction Layer Phase 2 integration.

Doradi was the internal code name for the Sun-Fire 4150 and 4250 systems.  The third change was a small fix that allows the FMA command line tools to reference the FRU label when referring to a faulty CPU module, which is friendlier than using an FMRI to refer to it.

Build 100
6683671 txml_print_prop() leaks memory, segfaults on failure
6710915 typo in topo_method.h
6740819 typo in ses "contains" topo method

Three minor bug fixes to libtopo.

Build 106
PSARC 2008/753 Reflecting Fan/Power Supply Diagnosis in Solaris
6641745 diagnosis of power supply and fan failures via IPMI
6768720 disk-monitor: small leak in dm_process_sysevent() when handling
6769133 libtopo: hc_is_replaced() can leak memory
6765830 libtopo: need to enumerate sensors/indicators on fan/psu nodes
        on X4600
6773926 libipmi: ipmi_sdr_get sometimes bites off more than it can chew
6780080 libtopo: should optimize lookups for propmethod-backed properties
        if propvals are non-volatile
6781654 libtopo: completely bogus, but harmless logic in topo_snap_hold
        could be removed

This integration represents the first consumer for the Sensor Abstraction Layer that we introduced in build 96.  Why is this important to the user?

Modern servers are equipped with a wide variety of hardware sensors, and our common service processor firmware (ILOM) monitors the fans and power supplies, responds to failures and presents them via IPMI or in its browser user interface.  Previously, however, any diagnosis made by ILOM was not visible to the Fault Manager running on Solaris.  This created a sub-optimal usage model for admins and service personnel, who were forced to consult multiple sources to get a complete and accurate picture of a system's health status, because some fault conditions were only visible from ILOM while others were only visible from the host OS.

This integration builds upon the previous Sensor Abstraction Layer work described here to allow fan and power supply faults that are diagnosed by the service processor to be reflected in the Solaris Fault Manager.  In the event of a fan or power supply failure, this integration allows us to provide a consistent Fault Management experience to service personnel, including:

    - producing a localized diagnosis message on the console
    - referring the admin to a relevant knowledge article
    - "fmadm faulty" output now includes fans/psus faulted by the SP

This work was based on code originally developed by Eric Schrock for the FishWorks product line.  I took this code and generalized it so that it could work across Sun's X64 server line and even theoretically on non-Sun systems which support IPMI.

This work also includes an important performance optimization for libtopo dynamic properties.  On AMD64 systems, this greatly speeds up lookups of DIMM serial numbers.  For more on how we use DIMM serial numbers in FMA see my earlier blog entry here.

Finally there were a handful of bug fixes in the integration.

Build 109
6802701 libtopo: need to enable the sensor-transport module on the Sun-Fire-X4140
6793549 libtopo: Need to enumerate sensors on duradi platforms
6793478 libtopo: Need to statically enumerate fans on duradi platforms
6793468 libtopo: Need to enumerate disk bays on the Sun-Fire-X4600 and
6805886 libipmi: handling of EIPMI_INVALID_RESERVATION errors is broken in

This integration enabled fan and power supply fault diagnosis on the Sun Fire X4140/X4240 and disk fault diagnosis on the Sun Fire X4600/X4600M2 platforms.

This integration also fixed a minor bug with the private library libipmi.

Build 115
PSARC 2009/265 fmdump -m
6810965 port fmdump -m to ON
6802474 Port libfmd_msg to ON
6805723 libtopo: port fmtopo -m to ON

The key piece in this integration is the introduction of a new private library, libfmd_msg, to Solaris.  This library is used by the FMA userland components to look up and format localized message content before emitting it to the user (in the form of a console message, SNMP trap, email, etc.).  Encapsulating this functionality in a library allowed us to remove a ton of duplicated code from various userland FMA components.  Additionally, this library allows us to insert expansion macros into the message content contained in the portable object files delivered with Solaris.  These expansion macros can reference elements in the payload of FMA protocol events which, in turn, allows us to emit messages that are more customized to the system.  This library was originally developed by Mike Shapiro for the FishWorks product line.  This integration ports all this goodness to Solaris Nevada/OpenSolaris so that it can be leveraged by other platforms.

Build 116
6841968 syslog-msgs inadvertently uses wrong msg priority, causing msgs to
        be output to all windows

Very small integration to fix an annoying bug in the syslog-msgs fmd plugin.

Build 121
6839705 libtopo needs updates in order to cope with ILOM 3
6840169 libtopo: topo xml schema and parsing code needs to be extended to
        support defining array propvals
6840764 fmtopo can't print TOPO_TYPE_INT32_ARRAY and TOPO_TYPE_UINT64_ARRAY
6844530 dimm/cs serial propmethods in chip enumerator needlessly recompute
        IPMI entity name
6836314 add support for sensor-transport module on ILOM-based X4450 platforms
6844635 libtopo: pull chassis-specific xml out of i86pc-hc-topology.xml into
        seperate map
6844639 libtopo: add DIMM serial to chip-select nodes on X4140/4240/4440
6845699 libipmi: implementation of ipmi_sunoem_led_get/set interfaces needs
        to be updated for ILOM 3
6677012 libtopo: small leaks on snapshot creation
6535637 Add Severity level to payload of list.suspects event
6850083 libtopo: need to add JEDEC id for Hyundai Electronics to jedec_tbl in
        the chip enumerator
6844145 sys/bmc_intf.h should be delivered
6855750 fmadm faulty will fail to expand message tokens that reference event
6862378 libtopo: need to register TOPO_METH_SENSOR_FAILURE on ses nodes

This was a fairly large integration.  The bulk of these changes allow the FMA infrastructure to work correctly on Sun platforms which use the new ILOM 3 service processor firmware, while still maintaining compatibility with ILOM 2-based platforms.  Our FMA infrastructure leverages ILOM to do things like light/unlight chassis LEDs and detect fan and power supply failures.

Build 124
6875268 missing power supplies may be reported as faulted
6874918 sensor-transport produces ereports too aggressively
6877019 topo_node_facility tries to release lock it doesn't own

This integration fixes some minor bugs in the sensor-transport module.

Wednesday Aug 06, 2008

A Sensor Abstraction Layer for FMA

Solaris Nevada build 96 is an important milestone build for the Sensor Abstraction Layer project for FMA, as it introduces the software infrastructure (the plumbing) on which the functionality described in the original design doc[1] will be built.

This was a collaborative effort between the Solaris FMA team and the Fishworks team (specifically Eric Schrock) and involved over 7000 lines of change to over 60 source files.  Below are the two putback notifications that comprised the combined work:

First my putback done on July 31st:

PSARC 2008/428 Extending libnvpair for type double
PSARC 2008/463 Extending HC FMRI scheme to represent sensors/indicators
6579615 fmtopo -e has lots of memory leaks
6635159 libtopo: extend hc scheme to allow for representing sensors and indicators in the topology
6692392 fmtopo -x doesn't handle property methods properly
6718703 Need to extend libnvpair to support type double
6718712 libtopo: Need to implement facility provider module for IPMI
6722594 libtopo: the topo_prop_set_* interfaces need to learn to play well with propmethods
6727190 libtopo: add support for node properties of type double
6727459 libipmi: need interface to convert raw sensor readings to unit-based values
6727470 libipmi: need convenience routine to convert sensor unit defines to string
6729595 libtopo: add <set> case in fan and psu xml maps for SUN-FIRE-X4600-M2
6732318 fmd: small leak in sysevent modelling code

update: usr/src/cmd/fm/fmd/common/fmd_sysevent.c
update: usr/src/cmd/fm/fmtopo/common/fmtopo.c
update: usr/src/common/nvpair/nvpair.c
update: usr/src/lib/fm/topo/libtopo/Makefile.com
update: usr/src/lib/fm/topo/libtopo/common/hc.c
update: usr/src/lib/fm/topo/libtopo/common/libtopo.h
update: usr/src/lib/fm/topo/libtopo/common/mapfile-vers
update: usr/src/lib/fm/topo/libtopo/common/topo_2xml.c
update: usr/src/lib/fm/topo/libtopo/common/topo_error.h
update: usr/src/lib/fm/topo/libtopo/common/topo_fmri.c
update: usr/src/lib/fm/topo/libtopo/common/topo_method.c
update: usr/src/lib/fm/topo/libtopo/common/topo_method.h
update: usr/src/lib/fm/topo/libtopo/common/topo_mod.h
update: usr/src/lib/fm/topo/libtopo/common/topo_node.c
update: usr/src/lib/fm/topo/libtopo/common/topo_parse.h
update: usr/src/lib/fm/topo/libtopo/common/topo_prop.c
update: usr/src/lib/fm/topo/libtopo/common/topo_subr.c
update: usr/src/lib/fm/topo/libtopo/common/topo_subr.h
update: usr/src/lib/fm/topo/libtopo/common/topo_xml.c
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/common/topology.dtd.1
update: usr/src/lib/fm/topo/maps/i86pc/chip-hc-topology.xml
update: usr/src/lib/fm/topo/maps/i86pc/fan-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/i86pc/i86pc-hc-topology.xml
update: usr/src/lib/fm/topo/maps/i86pc/psu-hc-topology.xml
update: usr/src/lib/fm/topo/modules/common/Makefile
update: usr/src/lib/libipmi/Makefile.com
update: usr/src/lib/libipmi/common/ipmi_impl.h
update: usr/src/lib/libipmi/common/ipmi_sdr.c
update: usr/src/lib/libipmi/common/ipmi_util.c
update: usr/src/lib/libipmi/common/libipmi.h
update: usr/src/lib/libipmi/common/mapfile-vers
update: usr/src/lib/libipmi/common/mktables.sh
update: usr/src/lib/libnvpair/libnvpair.c
update: usr/src/lib/libnvpair/mapfile-vers
update: usr/src/pkgdefs/SUNWfmd/prototype_com
update: usr/src/uts/common/sys/fm/protocol.h
update: usr/src/uts/common/sys/nvpair.h
create: usr/src/lib/fm/topo/modules/common/fac_prov_ipmi/Makefile
create: usr/src/lib/fm/topo/modules/common/fac_prov_ipmi/fac_prov_ipmi.c

Examined files: 41

Contents Summary:
       2   create
      39   update

Names Summary:
       2   update parent's name history
       2   update children's name history

And now Eric Schrock's putback, done on August 1st:

PSARC 2008/485 SES Sensors and Enumerator
6720433 SES enumerator should provide controller revision information
6720435 SES enumerator should prefer description over class-description
6720452 SES enumerator should support indicators and sensors
6722807 SES enumerator should work with internal enclosures
6722809 want a way to identify enclosures as internal
6722811 SES enumerator should prefer elements with known status
6723603 x86 xmlgen topo scripts should make use of propmap
6732875 typo in fan-hc-topology.xmlgen
6732879 broken logic in pad_process()

update: usr/src/lib/fm/topo/libtopo/common/topo_parse.h
update: usr/src/lib/fm/topo/libtopo/common/topo_xml.c
update: usr/src/lib/fm/topo/maps/Makefile.map
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Makefile
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Makefile
update: usr/src/lib/fm/topo/maps/common/topology.dtd.1
update: usr/src/lib/fm/topo/maps/i86pc/fan-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/i86pc/i86pc-hc-topology.xml
update: usr/src/lib/fm/topo/modules/common/ses/Makefile
update: usr/src/lib/fm/topo/modules/common/ses/ses.c
update: usr/src/lib/scsi/plugins/ses/Makefile
update: usr/src/lib/scsi/plugins/ses/libses/common/libses.h
update: usr/src/pkgdefs/SUNWfmd/prototype_i386
update: usr/src/pkgdefs/SUNWscsip/prototype_com
update: usr/src/pkgdefs/SUNWscsip/prototype_i386
update: usr/src/pkgdefs/SUNWscsip/prototype_sparc
update: usr/src/tools/scripts/bfu.sh
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-M2/Sun-Fire-X4200-M2-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4200-Server/Sun-Fire-X4200-Server-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4500/Sun-Fire-X4500-disk-hc-topology.xmlgen
update: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-disk-hc-topology.xmlgen
rename from: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-hc-topology.xmlgen
         to: usr/src/lib/fm/topo/maps/SUNW,Sun-Fire-X4540/Sun-Fire-X4540-disk-hc-topology.xmlgen
create: usr/src/lib/fm/topo/maps/common/xmlgen-header.xml
create: usr/src/lib/fm/topo/modules/common/ses/ses.h
create: usr/src/lib/fm/topo/modules/common/ses/ses_facility.c
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/Makefile.com
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/amd64/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/common/lsilogic.c
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/i386/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/sparc/Makefile
create: usr/src/lib/scsi/plugins/ses/LSILOGIC-SASX28-A.0/sparcv9/Makefile

Examined files: 33

Contents Summary:
      10   create
      23   update

Names Summary:
       4   renamed
      10   update parent's name history
      14   update children's name history

The Sensor Abstraction Layer project page has been updated with links to some new documentation. Below are some more details on three of the key new FMA infrastructure changes: hc FMRI scheme extensions, facility nodes and facility providers.

First some background...

As touched on in my previous blog entry, the Solaris Fault Manager maintains a snapshot of the hardware topology in a tree-like structure that includes a node for all hardware resources and FRUs that are managed/monitored by FMA.  The interfaces for generating a topology snapshot, walking the resulting tree and for manipulating the individual nodes in the tree are provided by libtopo and documented in Chapter 9 of the Fault Manager Programmer's Reference Guide.

The Sensor Abstraction Layer for FMA extends libtopo so that sensors and indicators can also be represented in our topology, in a way that lets consumers programmatically determine which hardware resource each sensor or indicator is associated with.

Additionally it introduces the concept of a facility provider module which provides an abstraction layer between libtopo and the lower-level interfaces that are used to control a given sensor or indicator.

Together this provides a set of common infrastructure to enable future FMA projects to manipulate sensors and indicators as part of Fault Management activities.

hc FMRI scheme extensions

The existing hc-scheme allows for a hierarchical representation of hardware resources, according to their physical connection properties. However, this is not a very useful way to represent sensors and indicators in the topology because it does not allow consumers to programmatically determine the association of sensors/indicators with the hardware resource that they're monitoring.

In Solaris Nevada build 96, we've extended the hc FMRI scheme to allow for this association to be represented using a new type of node in the topology: a facility node.

A facility node is a special leaf node in the hc-scheme topology that represents either a sensor or an indicator. A fault managed resource may have one or more child facility nodes that represent sensors or indicators that are associated with it. The hc-scheme was extended as shown below to allow for an additional facility node member:

Name Data Type Description
scheme uint32 scheme used for FMRI
version uint32 version of scheme specification
authority nvlist optional authority of FMRI
resource path
facility nvlist facility component of FMRI

The facility nvlist will have two members:

Name Data Type Description
facility-type string type of facility node: "sensor" or "indicator"
facility-name string name of the facility

The string representation of an hc scheme FMRI will also be extended, as shown below:

hc://<authority>/<path>?<fac-type>=<fac-name>

where fac-type can be either "sensor" or "indicator" and fac-name is the name of the facility.

For example:

hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/fanmodule=0/fan=0?sensor=speed
hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop/chassis=0/bay=47?indicator=ok2rm
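
To illustrate how the string form breaks down, here is a rough Python sketch that splits an FMRI like the examples above into its authority, path and facility components.  It is not how libtopo parses FMRIs; the function name parse_hc_fmri is made up for this example, and it assumes that authority and path values never contain '/', ':' or '?' characters.

```python
def parse_hc_fmri(fmri):
    """Split an hc FMRI string into authority, path and optional facility."""
    prefix = "hc://"
    if not fmri.startswith(prefix):
        raise ValueError("not an hc-scheme FMRI: %r" % fmri)
    rest = fmri[len(prefix):]
    # The facility component, if present, follows the '?'.
    rest, _, fac = rest.partition("?")
    # Everything before the first '/' is the authority; the rest is the path.
    authority_str, _, path_str = rest.partition("/")
    authority = dict(
        item.split("=", 1) for item in authority_str.split(":") if item
    )
    path = [tuple(comp.split("=", 1)) for comp in path_str.split("/") if comp]
    facility = None
    if fac:
        fac_type, _, fac_name = fac.partition("=")
        if fac_type not in ("sensor", "indicator"):
            raise ValueError("unknown facility type: %r" % fac_type)
        facility = {"facility-type": fac_type, "facility-name": fac_name}
    return {"authority": authority, "path": path, "facility": facility}


if __name__ == "__main__":
    example = (
        "hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0"
        ":server-id=lollipop/chassis=0/fanmodule=0/fan=0?sensor=speed"
    )
    print(parse_hc_fmri(example))
```

The facility dictionary uses the same two member names (facility-type and facility-name) as the facility nvlist described above, which makes the mapping between the string and nvlist forms easy to see.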

Anatomy of a Facility Node

Facility nodes are required to have the following properties specified in a "facility" property group:

 Property Name  Threshold Sensors  Discrete Sensors  Indicators
 type           Yes                Yes               Yes
 sensor-class   Yes                Yes               No
 reading        Yes                No                No
 state          Yes                Yes               No
 units          Yes                No                No
 mode           No                 No                Yes

These properties allow for the classification of the facility node to be programmatically determined and are used by the new topo_fmri_facility() interface to check for the existence of sensors or indicators of a given type.

'sensor-class' property

All facility nodes of type "sensor" must specify a "sensor-class" property that is set to one of the following values.

Value      Data Type  Define
threshold  string     TOPO_SENSOR_CLASS_THRESHOLD
discrete   string     TOPO_SENSOR_CLASS_DISCRETE

'units' property

All 'sensor' facility nodes with a "sensor-class" property value of TOPO_SENSOR_CLASS_THRESHOLD are required to specify a "units" property of type uint32. The value should be set to one of the predefined unit types specified in libtopo.h (see the TOPO_SENSOR_UNIT_* defines).

'type' property

All 'sensor' and 'indicator' facility nodes must provide a "type" property of type uint32. The value should be set to one of the predefined sensor or indicator types specified in libtopo.h (see the TOPO_SENSOR_TYPE_* and TOPO_LED_TYPE_* defines).

Facility Providers

A facility provider is a logical collection of node and property methods that provide an abstraction layer between libtopo and the underlying lower-level interfaces that are used to actually manipulate the sensors and indicators. This allows library consumers (namely fmd) to access sensor readings and manipulate indicators via standard libtopo interfaces (e.g. topo_prop_{get|set}_{type}). Nevada build 96 includes the implementation of facility provider modules for IPMI[2] and SES[3], which will provide broad coverage across Sun's x64 server platforms.  The diagram below shows how the provider modules fit into the overall software structure:

Facility Provider Block Diagram

Facility providers are implemented as simplified libtopo plugin modules (similar to enumerator modules). However, in the implementation of their tmo_enum entry point, a facility provider will simply register its methods on the node that is passed in.

At a minimum, a facility provider must implement the following property methods:

'reading' property method

Sensor nodes with a "sensor-class" property value of TOPO_SENSOR_CLASS_THRESHOLD must provide a property method for the "reading" property of type TOPO_TYPE_DOUBLE that should return the current analog reading from the sensor.

'state' property method

All sensor nodes (threshold and discrete) should provide a property method for a "state" property of type uint32. The property value should be set to one of the predefined sensor-type specific discrete states defined in libtopo.h (see the TOPO_SENSOR_STATE_* defines).

'mode' property method

For 'indicator' facility nodes, the facility provider must implement a property method to get/set the LED mode.  The mode property can be set to one of the following two values: 0 (OFF) or 1 (ON)

Facility providers can also optionally implement a node method (fac_enum) that can be invoked on a given hardware resource node to automatically discover and enumerate facility nodes that should be bound (associated) with it.
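
The property rules described in this section can be summarized in a few lines of Python.  This is just a sketch for checking a dictionary of facility properties against those rules; the names missing_properties, THRESHOLD_SENSOR, DISCRETE_SENSOR and INDICATOR are invented for the example, and it treats the "state" property as required for all sensors.

```python
# Required properties per kind of facility node, taken from the text above.
THRESHOLD_SENSOR = {"type", "sensor-class", "reading", "state", "units"}
DISCRETE_SENSOR = {"type", "sensor-class", "state"}
INDICATOR = {"type", "mode"}


def missing_properties(fac_type, props):
    """Return a sorted list of required properties absent from props.

    fac_type is "sensor" or "indicator"; props maps property names to values.
    """
    if fac_type == "indicator":
        required = INDICATOR
    elif props.get("sensor-class") == "threshold":
        required = THRESHOLD_SENSOR
    else:
        # Sensors without a threshold class are checked as discrete sensors.
        required = DISCRETE_SENSOR
    return sorted(required - set(props))


if __name__ == "__main__":
    node = {"sensor-class": "threshold", "type": 0x101, "state": 0, "reading": 49.0}
    print(missing_properties("sensor", node))
```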

Below are some example excerpts of fmtopo[4] output for both a threshold and a discrete sensor as well as an indicator node. These examples were taken from a Sun-Fire X4500.

  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    proc.p0.t_core
    sensor-class      string    threshold
    type              uint32    0x101 (THRESHOLD_STATE)
    state             uint32    0x0 (0x00)
    reading           double    49.000000
    units             uint32    0x1 (DEGREES_C)

  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    ps0.prsnt
    sensor-class      string    discrete
    type              uint32    0x108 (GENERIC_PRESENCE)
    state             uint32    0x2 (ASSERTED)

  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    hdd40.state
    mode              uint32    0x1 (ON)
    type              uint32    0x3 (PRESENT)
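
If you want to post-process output like the excerpts above in a script, something along these lines would do.  This Python sketch only handles the three-column "name, type, value" layout shown here and skips the "group:" header line; the function name parse_facility_group is made up, and nothing about the fmtopo output format beyond these excerpts is assumed.

```python
def parse_facility_group(lines):
    """Map property name -> (type, value) for one property group excerpt."""
    props = {}
    for line in lines:
        fields = line.split(None, 2)
        # Skip blank lines and the "group: ..." header line.
        if len(fields) < 3 or fields[0] == "group:":
            continue
        name, ptype, value = fields
        props[name] = (ptype, value.strip())
    return props


EXCERPT = """\
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string    proc.p0.t_core
    sensor-class      string    threshold
    type              uint32    0x101 (THRESHOLD_STATE)
    state             uint32    0x0 (0x00)
    reading           double    49.000000
    units             uint32    0x1 (DEGREES_C)
"""

if __name__ == "__main__":
    for name, (ptype, value) in parse_facility_group(EXCERPT.splitlines()).items():
        print(name, ptype, value)
```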

Well - that's it for now.  In my next blog entry I'll give some examples that demonstrate how easy it is to use the new interfaces in libtopo to get sensor readings or flip LEDs on or off.

[1] The hc FMRI scheme extensions and the concept of facility nodes are based on Cynthia McGuire's original design document for the Sensor Abstraction Layer, so readers may find it beneficial to review section 2.3 of that document to gain additional background.

[2] IPMI is an acronym for Intelligent Platform Management Interface, which is an Intel specification for doing out-of-band management of computers.  Over the last few years it has become an industry standard and thus most x86 server platforms (including those made by Sun) support IPMI.  Through IPMI we can get access to platform sensors and indicators.

[3] SES is an acronym for SCSI Enclosure Services which is a protocol for accessing diagnostic services for SCSI storage enclosures including things like temperature and voltage sensors.

[4] fmtopo is a command-line utility that developers can use to take a snapshot and dump the resulting topology.  Its usage is documented in chapter 12 of the Fault Manager Programmer's Reference Guide.

Thursday Apr 17, 2008

FMA and DIMM serial numbers

I've pretty much had my head down working on various FMA bug fixes and enhancements for the last few months.  Now that I've finally gotten them putback, I have some time to take a (short) breather and so I thought I'd blog about a few of the things I've been working on here.  Here's the first installment:

The Solaris Fault Manager maintains a snapshot of the hardware topology in a tree-like structure that includes a node for all hardware resources and FRUs that are managed/monitored by FMA.  The interfaces for generating a topology snapshot, walking the resulting tree and for manipulating the individual nodes in the tree are provided by libtopo and documented in Chapter 9 of the Fault Manager Programmer's Reference Guide.  Scott Davenport also has some nice overview material here.  The nodes in the tree are represented by a unique identifier called an FMRI (fault managed resource identifier).  The format of the FMRI for hardware resources is the following:


where hardware-id would be:

Among other things, the optional "hardware-id" fields (in particular the serial) can be used by the fault manager to detect when a FRU has been replaced by service personnel.  In the absence of hardware identity information, administrators must manually inform the fault manager after they've replaced a faulty component via the "repair" subcommand to fmadm(1m).  Otherwise, the fault manager will continue to report the component as faulty and attempt to isolate it.  On our UltraSPARC systems much of this information is provided by the OpenBoot Platform firmware.  On x86 we don't have the benefit of sitting on top of a common firmware layer that we control.  As a result, we historically haven't filled in the hardware-id fields because we haven't found a generalized, reliable mechanism for fetching this data.  However, on our newer AMD-based server platforms[1], some FRU information is maintained in non-volatile storage by the service processor and is accessible using a common protocol: IPMI.
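
The idea behind serial-based replacement detection fits in a few lines.  The Python sketch below is only a way of stating the rule described above, not the fault manager's actual logic: the function name fru_replaced is invented, and it assumes the serial recorded at diagnosis time and the serial currently seen in the topology are both available as strings (or None when unknown).

```python
def fru_replaced(faulted_serial, current_serial):
    """Return True only when both serials are known and they differ."""
    if not faulted_serial or not current_serial:
        # Without identity information we cannot infer a replacement,
        # so the admin would still have to tell the fault manager manually.
        return False
    return faulted_serial != current_serial


if __name__ == "__main__":
    # The serial strings here are placeholders, not real DIMM serial IDs.
    print(fru_replaced("serial-A", "serial-B"))
    print(fru_replaced("serial-A", None))
```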

In Solaris Nevada, build 87 we've added the capability to leverage IPMI to find and attach serial numbers to the dimm nodes in our topology on our AMD-based server platforms and we've extended the fault manager to check for this serial property and use it, if found, to detect when a faulted DIMM has been replaced.  For people who like ugly details :), here's a brief rundown of the code changes:

It all starts with a new topo node property method that is registered to the dimm nodes in our topology on our AMD-based server platforms.  The XML for this looks like the example below and the complete XML changes are in usr/src/lib/fm/topo/maps/i86pc/chip-hc-topology.xml

 <propmethod name='get_dimm_serial' version='0' propname='serial' proptype='string' >
      <argval name='format' type='string' value='p%d.d%d.fru' />
      <argval name='offset' type='uint32' value='0' />
 </propmethod>

This property method uses interfaces from libipmi to communicate with the service processor and look up the FRU locator record for the associated DIMM.  The FRU locator record provides the offset into the FRU inventory on the service processor where we can fetch information such as the manufacturer name and the serial number.  Using the manufacturer name and serial number, we synthesize a Sun serial ID[2] and attach it as a property to the dimm node.  This all happens in usr/src/lib/fm/topo/modules/i86pc/chip/chip_serial.c

Next, we've modified the Fault Manager to look for the existence of the serial property method and, if found, invoke it and attach the serial to the FMRIs that are included in the payload of a fault event.  See fmd_nvl_create_fault() in usr/src/cmd/fm/fmd/common/fmd_api.c
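
To make the effect on the FMRI a bit more concrete, here's a minimal string-level sketch in C.  It is only an illustration: the real code builds FMRIs as nvlists rather than strings, and format_hc_fmri() is a hypothetical helper, not a libtopo or fmd interface.  The authority field names (product-id, chassis-id, server-id, serial) are the ones that appear in the fmadm output in this post.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of libtopo or fmd): format an hc-scheme
 * FMRI string, adding ":serial=" to the authority only when a serial is
 * known.  Returns 0 on success, -1 if the result did not fit in buf.
 */
int
format_hc_fmri(char *buf, size_t len, const char *product,
    const char *chassis, const char *server, const char *serial,
    const char *path)
{
	int n;

	if (serial != NULL && serial[0] != '\0') {
		/* Serial available: include it in the authority. */
		n = snprintf(buf, len,
		    "hc://:product-id=%s:chassis-id=%s:server-id=%s"
		    ":serial=%s/%s",
		    product, chassis, server, serial, path);
	} else {
		/* No serial: leave the optional field out entirely. */
		n = snprintf(buf, len,
		    "hc://:product-id=%s:chassis-id=%s:server-id=%s/%s",
		    product, chassis, server, path);
	}

	/* snprintf reports the length it wanted; detect truncation. */
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Passing NULL for the serial models the platforms where we can't fetch one; the FMRI is still valid, it just carries less identity information.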

The fault manager maintains a persistent cache of resources that have been the subject of a diagnosis (see Chapter 6 of the Fault Manager Programmer's Reference Guide).  The fault manager uses this cache to keep track of what's faulty, which enables it to re-report and re-isolate a faulted component after a system restart.  However, before doing this, the fault manager first attempts to determine if the faulted component is still present in the system.  (No need to report or isolate something that's been removed.)  The code for determining whether a faulted resource is still present is a bit hard to follow and in some cases varies based on the type of component and whether we're on SPARC or x86, but the basic idea is to determine what scheme the FMRI of the faulted resource is in and then call the appropriate is_present method, which should return TRUE if the resource is still present and FALSE otherwise.  For the DIMM case on our AMD-based platforms, the code flow looks like this:

|-> usr/src/cmd/fm/fmd/common/fmd_fmri.c::fmd_fmri_present()
    |-> usr/src/cmd/fm/schemes/mem/mem.c::fmd_fmri_present()
        |-> usr/src/lib/fm/topo/libtopo/common/topo_fmri.c::topo_fmri_present()
            |-> usr/src/lib/fm/topo/libtopo/common/hc.c::hc_is_present()
                |-> usr/src/lib/fm/topo/modules/i86pc/chip/chip_subr.c::rank_is_present()

The rank_is_present method in the chip enumerator module compares the serial numbers and returns FALSE if the serial number of the faulted resource doesn't match the current serial number in the topology snapshot.  If any errors occur along the path above, thus preventing us from determining if the resource is still present, we err on the side of caution and return TRUE.
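
Stripped of all the nvlist and topo plumbing, the decision boils down to something like the sketch below.  This is not the actual rank_is_present() code; dimm_is_present() is a hypothetical stand-in that takes plain strings, and a NULL argument is my way of modeling "we couldn't look the serial up".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the serial comparison described above; the
 * real rank_is_present() operates on nvlists and topo nodes, not strings.
 *
 * faulted_serial: serial recorded in the FMRI of the faulted resource
 * current_serial: serial found in the current topology snapshot
 */
bool
dimm_is_present(const char *faulted_serial, const char *current_serial)
{
	/*
	 * If either serial could not be determined, we can't tell whether
	 * the DIMM was replaced, so err on the side of caution.
	 */
	if (faulted_serial == NULL || current_serial == NULL)
		return (true);

	/* A different serial means the faulted DIMM is no longer there. */
	return (strcmp(faulted_serial, current_serial) == 0);
}
```

Defaulting to "present" on any error means a lookup failure can never make a genuinely faulty DIMM silently drop off the faulty list.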

Ok - so that's some of the gory code details, but what will it look like from the user's perspective?

If a DIMM is diagnosed as faulty on an x64 system, the user will see something like this on the console (no change here):

SUNW-MSG-ID: AMD-8000-48, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Mar 19 14:04:01 PDT 2008
PLATFORM: Sun Fire X4500, CSN: 00:14:4F:20:E4:B0     , HOSTNAME: lollipop
SOURCE: eft, REV: 1.16
EVENT-ID: 44384620-5c7d-4073-edbc-ff0664004de4
DESC: The number of errors associated with this memory module has exceeded acceptable levels.  Refer to http://sun.com/msg/AMD-8000-48 for more information.
AUTO-RESPONSE: Pages of memory associated with this memory module are being removed from service as errors are reported.
IMPACT: Total system memory capacity will be reduced as pages are retired.
REC-ACTION: Schedule a repair procedure to replace the affected memory module.  Use fmdump -v -u <EVENT_ID> to identify the module.

If the user runs "fmadm faulty", they'll see this (note the DIMM serial number is now included in the FRU FMRI)

lollipop# fmadm faulty -a
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 19 14:04:01 44384620-5c7d-4073-edbc-ff0664004de4  AMD-8000-48    Major   

Fault class : fault.memory.dimm_ue
Affects     : mem:///motherboard=0/chip=0/memory-controller=0/dimm=0/rank=0
                  degraded but still in service
FRU         : "CPU 0 DIMM 0" (hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop:serial=002C000000DA062AF3/motherboard=0/chip=0/memory-controller=0/dimm=0)

Description : The number of errors associated with this memory module has
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/AMD-8000-48 for more information.

Response    : Pages of memory associated with this memory module are being
              removed from service as errors are reported.

Impact      : Total system memory capacity will be reduced as pages are
              retired.

Action      : Schedule a repair procedure to replace the affected memory
              module.  Use fmdump -v -u <EVENT_ID> to identify the module.

Now if the user/service guy replaces "CPU 0 DIMM 0" and then reruns "fmadm faulty" after bringing the system back up, they'll see this (note that the states of the ASRU and FRU have changed to "faulted and taken out of service" and "not present", respectively):

lollipop# fmadm faulty -a
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 19 14:04:01 44384620-5c7d-4073-edbc-ff0664004de4  AMD-8000-48    Major   

Fault class : fault.memory.dimm_ue
Affects     : mem:///motherboard=0/chip=0/memory-controller=0/dimm=0/rank=0
                  faulted and taken out of service
FRU         : "CPU 0 DIMM 0" (hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E4-B0:server-id=lollipop:serial=002C000000DA062AF3/motherboard=0/chip=0/memory-controller=0/dimm=0)
                  not present

Description : The number of errors associated with this memory module has
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/AMD-8000-48 for more information.

Response    : Pages of memory associated with this memory module are being
              removed from service as errors are reported.

Impact      : Total system memory capacity will be reduced as pages are
              retired.

Action      : Schedule a repair procedure to replace the affected memory
              module.  Use fmdump -v -u <EVENT_ID> to identify the module.

I also putback a handful of other bug fixes into build 87 - here's the complete putback notification:

Event:            putback-to
Parent workspace: /ws/onnv-gate
Child workspace: /net/hyper/tank/ws/robj/fma-dimm-serial2
User: robj

6593380 topology for Sun x64 platforms should include serial numbers for dimms
6671247 missing DIMM FRU labels on 4600/4600M2 platforms with family 15 modules
6672188 chip FRU labels computed incorrectly on 2-socket AF4+ blades
6675806 libipmi: ipmi_fru_read() can leak memory on failure

update: usr/src/cmd/fm/eversholt/files/i386/i86pc/amd64.esc
update: usr/src/cmd/fm/eversholt/files/i386/i86pc/intel.esc
update: usr/src/cmd/fm/fmd/common/fmd_api.c
update: usr/src/cmd/fm/schemes/mem/mem.c
update: usr/src/lib/fm/topo/libtopo/common/libtopo.h
update: usr/src/lib/fm/topo/libtopo/common/mapfile-vers
update: usr/src/lib/fm/topo/libtopo/common/topo_fmri.c
update: usr/src/lib/fm/topo/maps/i86pc/chip-hc-topology.xml
update: usr/src/lib/fm/topo/modules/i86pc/chip/Makefile
update: usr/src/lib/fm/topo/modules/i86pc/chip/chip.c
update: usr/src/lib/fm/topo/modules/i86pc/chip/chip.h
update: usr/src/lib/fm/topo/modules/i86pc/chip/chip_amd.c
update: usr/src/lib/fm/topo/modules/i86pc/chip/chip_label.c
update: usr/src/lib/fm/topo/modules/i86pc/chip/chip_subr.c
update: usr/src/lib/libipmi/common/ipmi_fru.c
create: usr/src/lib/fm/topo/modules/i86pc/chip/chip_serial.c

Examined files: 16

Contents Summary:
1 create
15 update

[1] I'm qualifying the statement by saying "on our newer AMD-based server platforms" for a few reasons:

  1. Since we're sourcing the serial number from the service processor, we obviously won't be able to support this on our AMD-based desktop platforms which don't have baseboard management controllers.
  2. The third-party service processor firmware on some of our older AMD-based server platforms does not export sufficient FRU information to allow us to get the serial numbers.  This mainly affects the lower-end X2100/2200 line.
  3. Our Intel-based platforms use a completely different mechanism to get the DIMM serial numbers.

[2] You might be wondering why we need to synthesize a Sun serial ID as opposed to simply using the manufacturer serial number.  There are a couple of problems with using the manufacturer serial number as is.  First, different DIMM manufacturers could use the same serial number.  Second, because the serial space is limited (8 characters) and DIMM manufacturers pump out DIMMs at a staggering rate, the same manufacturer could cycle through and then reuse serial numbers as frequently as every week.  Because FMA needs to use the serial number to determine whether a given DIMM has been replaced, we need to know that the serial is as unique as possible.  Newer versions of our service processor firmware (ILOM) will concatenate the following three additional pieces of information to the manufacturer serial to form a globally unique 18 character Sun serial ID:

  1. The JEDEC ID of the manufacturer
  2. The manufacturing location
  3. The manufacturing date

For the cases where we encounter older ILOM software that doesn't synthesize a globally unique Sun serial ID, Solaris will synthesize an 18 character serial ID based on the manufacturer JEDEC ID and the manufacturer serial (filling in zeroes for the location and date).  While this isn't guaranteed to be unique, it is more likely to be unique than just using the manufacturer serial alone.
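
Here's a rough sketch in C of what that fallback could look like.  Treat the layout as an assumption: the field widths (4 hex digits of JEDEC ID, 6 zeroes standing in for location and date, then the 8 character manufacturer serial) are inferred from the example serial 002C000000DA062AF3 in the fmadm output above, not copied from chip_serial.c, and synth_sun_serial() is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

#define	SUN_SERIAL_LEN	18	/* length of the synthesized serial ID */
#define	MFG_SERIAL_LEN	8	/* length of the manufacturer serial */

/*
 * Hypothetical sketch of the fallback synthesis.  Assumed layout:
 *   4 hex digits  manufacturer JEDEC ID
 *   6 zeroes      placeholder for manufacturing location and date
 *   8 characters  manufacturer serial
 * Returns 0 on success, -1 on bad input or a buffer that is too small.
 */
int
synth_sun_serial(char *buf, size_t len, unsigned int jedec_id,
    const char *mfg_serial)
{
	if (buf == NULL || mfg_serial == NULL)
		return (-1);
	if (len < SUN_SERIAL_LEN + 1 || jedec_id > 0xFFFF)
		return (-1);
	if (strlen(mfg_serial) != MFG_SERIAL_LEN)
		return (-1);

	/* 4 + 6 + 8 = 18 characters, plus the terminating NUL. */
	(void) snprintf(buf, len, "%04X000000%s", jedec_id, mfg_serial);
	return (0);
}
```

Under these assumptions, a JEDEC ID of 0x2C and a manufacturer serial of DA062AF3 produce 002C000000DA062AF3, which matches the serial in the FRU FMRI shown earlier.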



