Flashupdating the stand-by XSCFU

I got a really good comment on my entry "setupplatform and other new XSCF features" from Paul Liong, asking:
    Our "xscf" is currently running at XCP1050. We hope to get it upgraded to XCP1060 according to the Chapter 8 of XSCF User’s Guide. However, it is noticed that there is no permission to run the 'flashupdate' command on the Redundant XSCF Unit. So, how can we perform the firmware upgrade on the XSCF Unit on the standby side first and then on the active side?
An excellent question indeed! I checked the documentation (User Guide, Administrator Guide, man pages) and don't see it described in any detail there. So let me take a stab at it.

First, some background. The Sun SPARC-Enterprise M8000 and M9000 support two service processors, XSCFU#0 and XSCFU#1. The two work in a dual-redundant fashion. One unit is always the "active" unit, and can fully monitor and control the platform. The second unit, if present, is the "stand-by" unit, and has very limited functionality; mostly, the stand-by XSCFU is a slave to the active unit, receiving database updates so that it's current and ready to take over if the active XSCFU fails, is physically removed, or the user runs switchscf.

Back to Paul's question. It is true that you cannot run flashupdate on the stand-by XSCFU. Instead, you run flashupdate on the active XSCFU. This causes the active XSCFU to check the flash image, install it, and reboot. Upon reboot, if all goes well, the active XSCFU communicates with the stand-by, tells its partner that a new version of firmware has been installed, and copies the firmware image to the stand-by XSCFU. At this point, the stand-by XSCFU installs the firmware image and reboots. If the upgrade is successful, the stand-by XSCFU will request to become the active XSCFU in order to finish the upgrade process. When you're done, both XSCFUs will be running the same version of firmware.

One side effect of this process is that the active XSCFU will switch. In other words, if XSCFU#0 was active and XSCFU#1 was stand-by when you started, then XSCFU#1 will be active and XSCFU#0 will be stand-by when you're done. We had a heated debate about this during development. Someone filed a high-priority bug report arguing that the transition was unexpected and should be treated as a defect. On the other hand, switching the active role back to the original unit would require a second transition, and that second transition would add another couple of minutes to the upgrade process (minimizing firmware upgrade times was an important requirement for the SPARC-Enterprise service processor, so an extra two minutes is a lot of time). In the end we decided that it doesn't matter which unit is active, since they are dual redundant, so we adopted the approach that allowed the firmware upgrade to finish as quickly as possible. Customers who feel strongly that, for example, XSCFU#0 should always be the active unit can run switchscf when the firmware upgrade is complete.
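To make the role swap concrete, here is a toy Python model of the sequence described above. It is only a sketch of the behavior as I've described it, not XSCF firmware code: the Xscfu class and the model_flashupdate and model_switchscf functions are hypothetical names invented for illustration, and the version strings are taken from Paul's question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Xscfu:
    """Hypothetical stand-in for one service processor unit."""
    name: str
    role: str       # "active" or "stand-by"
    firmware: str   # e.g. "XCP1050"


def model_flashupdate(active: Xscfu, standby: Xscfu, version: str) -> List[str]:
    """Replay the upgrade sequence from the post and return a log of the steps."""
    if active.role != "active" or standby.role != "stand-by":
        raise ValueError("flashupdate is run on the active unit only")
    log = []
    # Step 1: the active unit checks the image, installs it, and reboots.
    active.firmware = version
    log.append(f"{active.name} installs {version} and reboots")
    # Step 2: the active unit copies the image to its partner, which installs it.
    standby.firmware = version
    log.append(f"{standby.name} receives {version}, installs it, and reboots")
    # Step 3: the stand-by requests the active role to finish the upgrade.
    active.role, standby.role = "stand-by", "active"
    log.append(f"{standby.name} is now active, {active.name} is now stand-by")
    return log


def model_switchscf(a: Xscfu, b: Xscfu) -> None:
    """Swap roles, as a user might do afterwards to restore the original layout."""
    a.role, b.role = b.role, a.role


unit0 = Xscfu("XSCFU#0", "active", "XCP1050")
unit1 = Xscfu("XSCFU#1", "stand-by", "XCP1050")
for step in model_flashupdate(unit0, unit1, "XCP1060"):
    print(step)
print(unit0)
print(unit1)
```

The model stops after a single role swap, which mirrors the design decision: restoring the original roles is a separate, optional step (model_switchscf here) rather than part of the upgrade itself.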

So I'm sure some people are out there now wondering what happens if the stand-by XSCFU is absent when you upgrade the active XSCFU. Well, the active XSCFU will hold on to the firmware image. When the active XSCFU sees the stand-by inserted, the user can run flashupdate -c sync to update the stand-by XSCFU from the active unit. The same command can be used when you replace the stand-by XSCFU with a new unit.
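The absent stand-by case can be sketched the same way. Again, this is a toy Python model under stated assumptions, not the real implementation: PlatformModel and its methods are hypothetical names, and the error raised when no stand-by is present is my own assumption about a reasonable behavior, not something taken from the documentation.

```python
from typing import Optional


class PlatformModel:
    """Hypothetical model: firmware on the active unit and an optional stand-by."""

    def __init__(self, active_fw: str, standby_fw: Optional[str] = None):
        self.active_fw = active_fw
        self.standby_fw = standby_fw  # None means no stand-by unit is installed

    def flashupdate(self, version: str) -> None:
        """Upgrade the active unit; a present stand-by is upgraded as well."""
        self.active_fw = version
        if self.standby_fw is not None:
            self.standby_fw = version

    def insert_standby(self, firmware: str) -> None:
        """A stand-by unit is inserted, carrying whatever firmware it shipped with."""
        self.standby_fw = firmware

    def flashupdate_sync(self) -> None:
        """Models 'flashupdate -c sync': bring the stand-by to the active's version."""
        if self.standby_fw is None:
            # Assumption for this sketch: syncing with no stand-by is an error.
            raise RuntimeError("no stand-by XSCFU present to sync")
        self.standby_fw = self.active_fw


platform = PlatformModel("XCP1050")   # stand-by absent
platform.flashupdate("XCP1060")       # only the active unit is upgraded
platform.insert_standby("XCP1050")    # stand-by arrives with older firmware
platform.flashupdate_sync()           # stand-by now matches the active unit
print(platform.active_fw, platform.standby_fw)
```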

About

Bob Hueston
