T2000 firmware often ignored

Many customers (and engineers!) ignore firmware as part of their patching strategy, and this can result in hard-to-diagnose issues. Over the last couple of weeks I have come across a couple of customer performance issues (some first hand and some second hand) on T2000s which were resolved by applying the current firmware patch.

We have very limited observability in the firmware layer, so diagnosis can be a challenge, to say the least.

So, in the spirit of avoiding future problems, have a quick look at the output of prtconf -V. If it does not show 4.28.1 or later, consider applying patch 136927-01 (or a later revision, if you are reading this in six months' time). This is a patch where a long, cool read of the README and install instructions is wise.
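If you have a lot of machines to check, the version comparison can be scripted. What follows is only a rough sketch in Python: the helper names are hypothetical, and it assumes the prtconf -V output contains a dotted version number such as "OBP 4.28.1 ..." (the sample strings in the usage note are made up for illustration, so check the real output on your own systems first).

```python
import re

# Minimum firmware (OBP) version named in this post.
MINIMUM_OBP = (4, 28, 1)


def parse_obp_version(prtconf_output):
    """Return the first dotted version number in the text as a tuple of ints.

    Assumption: the version appears as digits separated by dots,
    e.g. "4.28.1". Returns None if nothing of that shape is found.
    """
    match = re.search(r"\d+(?:\.\d+)+", prtconf_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def needs_firmware_patch(prtconf_output, minimum=MINIMUM_OBP):
    """Return True if the reported version is older than the minimum."""
    version = parse_obp_version(prtconf_output)
    if version is None:
        raise ValueError("no version number found in: %r" % prtconf_output)
    # Tuples compare element by element, so (4, 25, 9) < (4, 28, 1).
    return version < minimum
```

For example, needs_firmware_patch("OBP 4.25.9 2007/08/23 14:17") returns True, while needs_firmware_patch("OBP 4.28.1 2008/03/10 10:15") returns False. Comparing tuples of integers rather than strings avoids the trap where "4.9" sorts after "4.28" as text.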

The same principle applies to T1000s and T5220s, but the patches are different.

Comments:

Even though 127568-03 has not been obsoleted yet, a newer T2000 firmware is available with patch 136927-01. http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&queryKey5=136927&toDocument=yes

Posted by steve on May 28, 2008 at 11:23 AM BST #

Thanks, I have updated the main body of the text to reflect this. Note to self: don't write an entry when you are waiting for a ferry to sail. Writing the entry always takes longer than you think, and you lose connectivity before you have checked it through fully.

Posted by guest on May 29, 2008 at 03:18 AM BST #

People are nervous about upgrading something as low-level as firmware (or BIOS). If something goes wrong, you have an expensive, bricked server. Upgrading software is orders of magnitude safer and can easily be reversed when something goes wrong.

Posted by zdz on May 29, 2008 at 04:48 AM BST #

With 136927-01 we bricked two T2000s, even though we followed the instructions. I believe this is why most people (rightfully) fear firmware updates.

Posted by Robert on May 29, 2008 at 12:39 PM BST #

Robert, I would be interested in the case ID of the service call (if any) via email. If there are concerns in this area, they need to be addressed. Given the importance of the Hypervisor on machines like the T2000, firmware is more central to correct functioning and accurate problem diagnosis than on many of our earlier machines. It would be good to understand the root cause of your "brick state".

Posted by Clive King on May 29, 2008 at 12:50 PM BST #

Hi, sorry to post a query about "bufhwm on large systems" on this entry, but you have closed the comments on the 4/01/2008 blog "bufhwm on large systems". I have seen the same kind of system, and the values were reset, which gave some benefit, but in sar -b I can still see that the value of %wcache is low; in fact it varies from 0 to 100, while %readcache is above 85%. We set the value to 256 MB. Do you suggest increasing it further?

Posted by peter on June 03, 2008 at 12:31 AM BST #

UPDATE:
The issue with the firmware patch and our "bricked" servers has been resolved. The issue was complicated by the fact that these servers are at a remote location, so I do not have physical access to the system.
The solution was very simple: it turns out that unplugging the servers for more than 10 seconds is necessary to reset the system after the firmware update completes. "Poweroff" from ALOM does not work.

Posted by Robert on June 03, 2008 at 09:15 AM BST #

Is removing AC power becoming a feature of firmware updates with Sun? benr seems to have a complaint about it too:

http://cuddletech.com/blog/pivot/entry.php?id=937

Steve

Posted by Steve Foster on June 04, 2008 at 06:03 AM BST #
