FMA Features & Fixes in Solaris 10 Update 5
By user9148476 on Apr 15, 2008
Solaris 10 Update 5 was released today and is available for download. This update contains some great features and fixes for FMA. Here are just a few of my favorite bugs (an oxymoron?) that are included in S10U5:
- 6545604 Enhance CPU/Mem DE to support T2plus
Of course this one is at the top of my list, since I have spent a big part of the last 12 months on Fault Management for the recently announced T5140/T5240. Refer to my last posting for details on the fault management features for these systems.
- 6536482 diagnose FBR and FBU errors to branch
I've talked about this one before - being able to diagnose issues with an FBDIMM channel. This feature came out in an S10U4 patch, and is now formally part of S10U5.
- 5076574 eft needs to go on a memory usage diet
One of my colleagues did some excellent work, not only reducing the amount of memory the Eversholt diagnosis engine uses but also improving diagnosis speed. I recently ran some experiments with this fix for the sun4v platform independence project I'm working on, in a resource-constrained environment (an SP appliance). Things worked fantastically. With a mocked-up topology of about 830 nodes, the time from the ereport being logged to the fault being logged was 5 seconds or less - and most of that 5 seconds was taken up with Eversholt "within" constraints (a language keyword). And I couldn't get memory consumption of the EFT module higher than 392K. Great stuff.
- 6532588 need to be able to override N/T values for SERD engines
With this change, the SERD values for diagnosis engines can be overridden via eft.conf. This is not intended for normal usage, but if SERD values are found to be incorrect, it provides a quick-fix method in the field - no more waiting for a Solaris patch or binary drop to fix threshold values. This is also handy in a burn-in environment where one wants to tailor the SERD values to match the test duration.
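To make the N/T idea concrete, here is a rough Python sketch of a SERD-style threshold: an engine that fires once N events land within a time window T. This is only a toy model and not the fmd or Eversholt implementation; the `SerdEngine` class, the `count_fires` helper, and the ">= N within T" boundary rule are all assumptions made for illustration, and the real engine's exact semantics may differ.

```python
from collections import deque


class SerdEngine:
    """Toy SERD-style engine (hypothetical, not the fmd implementation).

    Fires when at least n events fall within a sliding window of t seconds.
    """

    def __init__(self, n, t):
        self.n = n              # event-count threshold (N)
        self.t = t              # window length in seconds (T)
        self.events = deque()   # timestamps still inside the window

    def record(self, timestamp):
        """Record one event; return True if the engine fires."""
        self.events.append(timestamp)
        # Discard events that have aged out of the window.
        while self.events and timestamp - self.events[0] > self.t:
            self.events.popleft()
        if len(self.events) >= self.n:
            # Reset after firing so the next fault needs a fresh run of events.
            self.events.clear()
            return True
        return False


def count_fires(timestamps, n, t):
    """Feed timestamps (in seconds, ascending) through a fresh engine."""
    engine = SerdEngine(n, t)
    return sum(1 for ts in timestamps if engine.record(ts))


# Three errors, one hour apart.
hourly_errors = [0, 3600, 7200]
fires_long_window = count_fires(hourly_errors, n=3, t=24 * 3600)
fires_short_window = count_fires(hourly_errors, n=3, t=3600)
```

In this sketch the same three hourly errors trip the engine with a 24-hour window but not with a 1-hour window, which is the kind of sensitivity you might want to tune when the test run is much shorter than the window a production threshold assumes.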
- 6524100 Extend niu.so to enumerate XAUIs and XFPs
- 6505251 NIU FMA needs to diagnose XAUI and XFP faults
These two fixes supply the topology and diagnosis rules necessary to diagnose down to the optical module in Sun's 10 GbE network cards. The changes also apply to the on-chip networking unit (NIU) in the UltraSPARC T2 processor, as well as the onboard 10 GbE ASIC in UltraSPARC T2 Plus systems.
Note: I fixed a bug with labeling XFP nodes in the topology. The fix is in OpenSolaris, but not yet in a Solaris 10 update/patch. It should be out in a month or two. The upside is that the diagnosis is still accurate; just the label looks a little funny.