New Features in XCP1050

As I mentioned in my 01-Nov-2007 posting, the Sun SPARC Enterprise M-class server service processor firmware versions XCP1050 and beyond have several new features that I wanted to blog about.

Servicetags

Servicetags is part of the Sun Connection infrastructure. The basic idea is to enable customers to better track their Sun assets, and by communicating with Sun, determine what updates are available, what needs patching, etc.

Servicetags was introduced in Solaris 10. It's essentially a piece of software that runs on a server and can communicate the list of software products installed on the server, the product versions, patch levels, and so forth. Customers can then run a Java application on their workstation to discover Sun products throughout their datacenter, and at their discretion, send that list to Sun to register the products and/or check for updates.

In XCP1050, the servicetags software now also runs on the service processor. This allows customers to discover the hardware assets in their datacenter, including the machine type, part number, and serial number.

On new machines, servicetags are enabled by default; if you upgrade from XCP1041 or earlier, you'll need to enable servicetags manually. The commands to manage servicetags on the service processor are setservicetag and showservicetag. The usage is very straightforward, for example:

    XSCF> setservicetag -c disable
    XSCF> showservicetag
    Disabled

You can download the discovery application here.

Browser User Interface

Anyone who used the Browser User Interface (called BUI, or Web User Interface) in XCP1041 or earlier probably found there were many tasks that could not be accomplished through the BUI, but required you to use the command line interface. In XCP1050, that all changed. Just about everything you could do through the command line can now be done through your web browser. A lot of hard work went into these BUI updates, and I think it really shows.

Fault LEDs and clearfault

It might be surprising, but the most difficult aspects of collaborating with another company on a new product were things that seem the most trivial: bezel color, whether buttons in the Browser UI should have square or rounded corners, and when and how LEDs should blink. Of all these, I think LEDs were the most contentious.

Sun adheres to the ANSI/VITA 40-2003 Service Indicator Standard (SIS) for most of its products. Fujitsu, however, adheres to a different standard. The differences between the two standards are minor, but when a customer is managing a large number of systems, any variation in indicator standards can be a source of confusion. In XCP1040, we shipped with a compromise, meaning both Sun and Fujitsu were unhappy with the solution.

For XCP1050, we reworked the fault indicator policies for both companies. In fact, we gave the firmware the ability to tell whether the server is Sun-branded or Fujitsu-branded, and based on the branding, it adheres to the respective company's fault LED standards. Now, on a Sun-branded system for example, the fault LEDs adhere to a simple policy:

  • If a FRU's (field replaceable unit) fault LED is on, then there is a fault in the chassis and it has been isolated to that specific FRU with very high confidence. In other words, if the fault LED is on, then we know for a fact the FRU is broken.
  • If the chassis fault LED (on the front panel) is on, then there is a fault in the chassis somewhere.

Note that it's possible that the chassis fault LED is on, but no FRU LEDs are on; that can happen if the server cannot isolate the fault to a single FRU. Commands such as showstatus and fmadm faulty will identify the list of suspected FRUs.
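
The Sun-branded LED policy above can be illustrated with a small Python sketch. This is only a toy model, not the firmware's actual logic: the Fault class, the function names, and the rule that "isolated" means exactly one suspected FRU are all assumptions made for illustration, and the FRU names are just examples.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    """One diagnosed fault and the FRUs suspected of causing it (hypothetical model)."""
    suspects: tuple  # FRU names, e.g. ("/CMU#0",)


def chassis_led_on(faults):
    """The chassis fault LED is on whenever any fault exists in the chassis."""
    return len(faults) > 0


def fru_leds_on(faults):
    """A FRU fault LED is on only when a fault was isolated to that one FRU.

    Assumption: 'isolated' is modeled as a suspect list of length one.
    """
    return {f.suspects[0] for f in faults if len(f.suspects) == 1}


if __name__ == "__main__":
    # A fault that could not be narrowed down to a single FRU.
    faults = [Fault(("/CMU#0", "/IOU#0"))]
    print(chassis_led_on(faults), fru_leds_on(faults))
```

In this model, a fault with two suspects lights the chassis LED but no FRU LED, which is the situation described in the note above.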

Furthermore, on Sun-branded systems, cycling chassis power can no longer be used to clear the fault LEDs for FRUs or the chassis. FRUs don't magically become "better" just because you cycled power; if the FRU was faulty, it's still faulty, so the fault LED shouldn't magically turn off. For Sun-branded systems, the fault LEDs will remain on until the customer or service engineer actively clears the fault condition, by removing/replacing the faulty FRU, or by running the clearfault command.

In XCP1040, clearfault could be used to mark a FRU as not faulty; however, almost all FRUs still required a chassis power cycle. This was so that the chassis could perform a power-on self test of the FRU before reconfiguring it into a running server. The last thing you want is for someone to manually type clearfault /CMU#0, only to discover that CMU#0 really was faulty and have it bring down the server.

XCP1050 was enhanced so that clearfault could selectively initiate self test on many FRUs, without requiring a chassis power cycle. When you run clearfault now, it will check to see if it is possible to run self test without disturbing the running system. Some FRUs cannot be tested during operation; other FRUs may be in use in such a way that self test cannot be performed. If the FRU is safe to be tested, clearfault will initiate the self test, and if successful, the fault condition will be cleared. If clearfault cannot test the FRU, it will be marked to be cleared at the next chassis power cycle.
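
The decision flow described above can be sketched roughly as follows. This is a simplified Python illustration, not the real clearfault implementation: the Outcome enum, the clearfault function signature, and the behavior on a failed self test (the fault simply remains) are assumptions, since the actual firmware internals are not described here.

```python
from enum import Enum


class Outcome(Enum):
    CLEARED = "self test passed; fault condition cleared"
    STILL_FAULTY = "self test failed; fault remains (assumed behavior)"
    DEFERRED = "marked to be cleared at the next chassis power cycle"


def clearfault(can_test_now, run_self_test):
    """Toy model of the XCP1050 clearfault flow.

    can_test_now:  True if the FRU can be self tested without disturbing
                   the running system (hypothetical input).
    run_self_test: callable returning True if the self test passes
                   (hypothetical stand-in for the real test).
    """
    if not can_test_now:
        # Some FRUs cannot be tested during operation, or are in use.
        return Outcome.DEFERRED
    return Outcome.CLEARED if run_self_test() else Outcome.STILL_FAULTY


if __name__ == "__main__":
    print(clearfault(True, lambda: True).value)
    print(clearfault(False, lambda: True).value)
```

The point of the sketch is that the self test is never skipped: either it runs immediately, or the clear is deferred until the power-on self test at the next chassis power cycle.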

A great deal of effort went into improving the fault detection, isolation, reporting, and service interface, to make it more accurate and more consistent with other Sun products.

Comments:

Hi Bob,

Our XSCF is currently running at XCP1060. Recently, we invoked the 'clearfault' command to clear a suspected IOU fault. However, we noticed that the CHECK LED is still on even though 'clearfault' ran successfully. So, is there any way to turn off the CHECK LED as well?

Thanks & Regards
Paul

Posted by Paul Liong on April 02, 2008 at 08:44 PM EDT #

Paul,

Sorry, it's been a while since I worked on that code, so my memory is a little weak. I believe there are certain cases where a FRU can be marked faulty that isn't cleared with clearfault, even if it passes power-on testing. You would really need to raise this issue with a Service Engineer.

-- Bob.

Posted by Bob Hueston on April 03, 2008 at 02:18 AM EDT #
