Wednesday Sep 08, 2010

Oracle Solaris 10 Update 9: FMA Fixes

It's been a long time since my last blog on FMA - I've since moved into another technology area. Maybe someday I'll garner enough knowledge there to start blogging. In the meantime, with the release of Oracle Solaris 10 Update 9 today, I thought I'd reprise my "favorite FMA fixes" blog entry for this release. So here's my (thankfully short :) list of my favorites:


6627041 Add a PSN nv-pair to the authority portion of the FMRI scheme

This may be less exciting to end users, but it's my top favorite. This fix adds the product serial number (PSN) to the FMA fault telemetry. It's a key piece of information for improving the Oracle Auto Service Request (ASR) program. With the PSN, the "Auto" part of ASR is sped up. Hmm...maybe this is exciting for customers - you'll get a faster turnaround time on parts servicing.
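
To make the idea of an nv-pair in the authority a bit more concrete, here is a minimal Python sketch that pulls the authority name-value pairs out of an FMRI string. The colon-separated `name=value` layout, the `product-sn` key, and every value in the sample string are assumptions for illustration; none of them come from the CR itself.

```python
def parse_authority(fmri):
    """Split an FMRI string into (scheme, authority dict, path).

    Assumes the layout scheme://:name=value:name=value/path,
    which is an illustrative guess at the string form.
    """
    scheme, sep, rest = fmri.partition("://")
    if not sep:
        raise ValueError("not a scheme://... FMRI: %r" % fmri)

    # Everything up to the first '/' is treated as the authority.
    authority, _, path = rest.partition("/")

    pairs = {}
    for item in authority.split(":"):
        if not item:
            continue  # skip the empty piece before the leading ':'
        name, eq, value = item.partition("=")
        if not eq:
            raise ValueError("malformed authority item: %r" % item)
        pairs[name] = value

    return scheme, pairs, ("/" + path if path else "")


# Hypothetical FMRI; the product name and serial number are made up.
SAMPLE = ("hc://:product-id=EXAMPLE-BOX:product-sn=SN0000001"
          ":server-id=host1/motherboard=0/chip=0")

if __name__ == "__main__":
    scheme, auth, path = parse_authority(SAMPLE)
    print(scheme, auth.get("product-sn"), path)
```

A service tool on the receiving end could look up the serial number directly from the parsed authority instead of asking the customer for it, which is the kind of speed-up described above.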


6502086 DBU errors should be diagnosed as HV defect/fault
6502089 ferg.invalid errors should be diagnosed as a fault

These two CRs provide a level of diagnosis for firmware issues on sun4v systems. The first covers an error that should not happen unless there's a bug in the hypervisor. The second warrants a little explanation. On CMT systems, FMA telemetry is sourced in the SP - the hardware error is collected in the SP, packaged into an ereport, and provided to Solaris. If the error bits collected can't be mapped to a defined ereport, a "ferg.invalid" ereport is produced. It basically says "There's been an error, but I can't determine what kind". This shouldn't happen outside the lab. If it does, it indicates a logic flaw in the firmware on the SP.
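
The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the mapping idea, not the SP firmware's actual logic: the bit patterns and the `ereport.example.*` class names are hypothetical, and only the "ferg.invalid" label comes from the text.

```python
# Hypothetical error-bit patterns mapped to made-up ereport class names.
KNOWN_EREPORTS = {
    0x01: "ereport.example.cpu-error",
    0x02: "ereport.example.mem-error",
}

# Label used when the collected bits match no defined ereport.
FALLBACK_EREPORT = "ferg.invalid"


def classify(error_bits):
    """Return the ereport class for the bits, or the fallback label."""
    return KNOWN_EREPORTS.get(error_bits, FALLBACK_EREPORT)


if __name__ == "__main__":
    for bits in (0x01, 0x02, 0x80):
        print(hex(bits), "->", classify(bits))
```

In this sketch, any pattern missing from the table lands on the fallback, which is why seeing that ereport in the field points at a gap in the mapping logic rather than at a specific piece of hardware.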


6764337 Needs level 2 FMA compliance for chipset 5100 MCH
6889350 fma fails if DIMM sizes are mixed

A couple of fixes for various Intel configurations. The first enhances FMA support to include another memory controller chipset. Oracle's CP3250 blades use the 5100 chipset. The second fixes a bug - the bug description is self-explanatory.


6860401 FMA CPU Topology & Memory Topology needs to support Magny Cours(Multi chip Module)
6812502 Enable Generic-AMD FMA memory topology for Istanbul

These fixes enable CPU and memory diagnosis for the more recent AMD processor offerings. Beyond the typical changes of new model numbers, these processors have a more interesting topology with multiple processing nodes within a single package. Solaris 10 Update 9 understands these chips and creates the correct topology to support FMA diagnosis.

:wq

Friday Mar 26, 2010

Intel CPU/Memory HotPlug for OpenSolaris

Cool integration yesterday: hotplug of CPUs and memory for Intel systems. Last summer, I worked with Intel to ensure that newly added resources are fault managed just as those present at start-of-day are. And happily, that functionality is included in this integration.

When resources are added, the FMA topology is updated to reflect the new CPUs/memory, as are the #MC handlers. There's a gap in FRU identification with newly added resources (component labels and serial numbers are sourced from SMBIOS, which is static) but otherwise hotplugged components are handled in FMA.

:wq

Friday Mar 05, 2010

SATA Disk Diagnosis Hits x86 OpenSolaris

This set of changes hit OpenSolaris today:

Comments:
    PSARC/2010/045 x86gentopo enumeration of direct attached SATA
    6891266 generic x86 enumeration for directly attached SATA disks
    6903122 Export SATA PHY from framework
    6906979 Generic x86 disk enum needs SMBIOS OEM extended structure

Files:
  added:
    usr/src/lib/fm/topo/modules/i86pc/x86pi/x86pi_bay.c
  modified:
    usr/src/cmd/smbios/smbios.c
    usr/src/common/smbios/smb_info.c
    usr/src/lib/fm/topo/libtopo/common/topo_hc.h
    usr/src/lib/fm/topo/modules/common/disk/disk.h
    usr/src/lib/fm/topo/modules/i86pc/x86pi/Makefile
    usr/src/lib/fm/topo/modules/i86pc/x86pi/x86pi.c
    usr/src/lib/fm/topo/modules/i86pc/x86pi/x86pi_hostbridge.c
    usr/src/lib/fm/topo/modules/i86pc/x86pi/x86pi_impl.h
    usr/src/lib/fm/topo/modules/i86pc/x86pi/x86pi_subr.c
    usr/src/lib/libsmbios/common/mapfile-vers
    usr/src/uts/common/io/sata/impl/sata.c
    usr/src/uts/common/sys/smbios.h
    usr/src/uts/common/sys/smbios_impl.h

These are extensions to the x86gentopo project that enable topology for direct-attached SATA devices. With the topology in place, direct-attached SATA devices automatically get diagnosed, leveraging the existing driver hardening, SMART data collection, and disk diagnosis rules already in OpenSolaris.

:wq
