SPARC-64 VII Announced, FMA support in place

Sun and Fujitsu announced an update to the SPARC Enterprise line: the quad-core SPARC-64 VII processor. While I'm not an authority on the FMA implementation on these systems, I am aware that FMA fully supports the SPARC-64 VII processors. The events registry is up to date with VII information, as are the knowledge articles, which include examples of chip, core and strand fault messages on SPARC-64 VII.

:wq

Comments:

Hello Scott,

We now have a project in Taiwan that needs to demo the capabilities of the M8000 and the new features of Solaris 10. Of course we hope to show our FMA abilities to the customer, but it seems there is no FMA demo kit for our M-series or SPARC-64 VI based processors so far. I need some assistance and input on this. Does anyone have experience with it? Please leave your comments for me!

Thanks,
Jerry

Posted by Jerry Huang on July 14, 2008 at 12:30 PM PDT #

Jerry,

I talked with some of the folks in Sun that worked with Fujitsu on the OPL machines. OPL's FM architecture is very much SP centric. Highly summarized, SCF's RAS-DB takes in the raw info from the HW and produces an ereport. That ereport is consumed and diagnosed in the SP.

Any faults that require Solaris coordination (CPU offlines and/or page retires) are transported to the OS for action. There is a re-issue agent in Solaris to ensure the FMD resource cache is updated, so that replay of offline/retire events happens when Solaris restarts. But my point here is that diagnosis is SP-driven.
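
To make that flow concrete, here is a rough toy model in Python. It is only a sketch of the division of labor described above, not how SCF or FMD are actually implemented: every class, method, and field name is hypothetical, and the mapping from error kind to action is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fault:
    resource: str            # hypothetical resource name, e.g. "cpu0"
    action: Optional[str]    # "offline", "retire", or None if no OS action is needed


class ServiceProcessor:
    """Stands in for the SP side: raw error -> ereport -> diagnosis."""

    # Assumed mapping, for illustration only.
    ACTIONS = {"cpu": "offline", "memory-page": "retire"}

    def diagnose(self, raw_error: dict) -> Fault:
        # The SP turns the raw hardware info into an ereport...
        ereport = {"class": raw_error["kind"], "resource": raw_error["resource"]}
        # ...and consumes and diagnoses that ereport itself.
        return Fault(ereport["resource"], self.ACTIONS.get(ereport["class"]))


class SolarisHost:
    """Stands in for the OS side: acts on faults and replays them on restart."""

    def __init__(self) -> None:
        self.resource_cache = {}   # persists across restarts in this model
        self.applied = []          # actions taken since the last boot

    def receive(self, fault: Fault) -> None:
        # Re-issue agent role: record the fault so it can be replayed later.
        self.resource_cache[fault.resource] = fault.action
        self.applied.append((fault.resource, fault.action))

    def restart(self) -> None:
        # On restart, replay every cached offline/retire action.
        self.applied = list(self.resource_cache.items())


def handle(sp: ServiceProcessor, host: SolarisHost, raw_error: dict) -> Fault:
    fault = sp.diagnose(raw_error)
    # Only faults that need Solaris coordination are sent to the OS.
    if fault.action is not None:
        host.receive(fault)
    return fault
```

In this model a CPU error reaches the host as an offline request and survives a restart, while an error the SP can handle alone never reaches Solaris at all.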

Since the FMA Demo Kit is Solaris-based, we could simulate the feed of faults from SCF to Solaris on OPL, but it would be only a piece of the puzzle...and arguably not the best for a customer demo. I'd expect more customers would want to see an end-to-end demo that shows both the SCF and Solaris reactions to an error condition. During OPL development, low-level error simulation/injection has been driven by Fujitsu.

Sadly, I don't have names or contacts within Fujitsu to ask - any readers that can help Jerry?

Posted by Scott Davenport on July 16, 2008 at 09:33 AM PDT #

Hi Scott,
Thanks for continuing to look for answers for me.
As we know, predictive fault detection and self-healing are the major functions of Solaris 10 FMA, which is one of the key features that customers need and that we hope to demo. I know XSCF can do some fault monitoring, but can it detect failures predictively and then fix itself like FMA does?

Or how can we show FMA-like functions with XSCF? Do you have any fault simulator or demo kit for XSCF that we could use for the demonstration?

Regards,
Jerry

Posted by Jerry on July 16, 2008 at 12:41 PM PDT #

Jerry,

I did a little more asking around within Sun. There are apparently some internal test suites that the Sun test folks have used. I can put you in touch with them. Alas, I don't know if these tools are publicly available.

Posted by Scott Davenport on July 18, 2008 at 08:16 AM PDT #
