Thursday Apr 02, 2009

Why I removed Ubuntu... Really!!

 

A couple of months ago, I got a new workstation: a quad-core Xeon box from Dell. It came with Ubuntu preinstalled, and I decided to play with that for a while. After some time I noticed that the system would randomly reset. "Ah, the darn unstable Linux" is what I thought!!

I reinstalled the system with OpenSolaris 2008.11 and it ran for a day or so before the system reset again. Annoyed, I started looking at /var/adm/messages and found that FMA (the Fault Management Architecture) had detected a hardware fault and taken appropriate action. Now it was nice to call Dell Support and tell them definitively the cause, the diagnosis, and that I needed a new motherboard.

The rep asked me what test I was running. I said none: this functionality is built into OpenSolaris... and it's free!! One cannot get this when running Linux.

Here's what fmadm faulty printed:

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 152a7687-c256-40dd-80b1-83c1f4ed74c7  INTEL-8001-43  Critical 

Fault class : fault.cpu.intel.nb.ie
FRU         : "MB" (hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0)
                  faulty

Description : Northbridge has detected an internal error  Refer to
              http://sun.com/msg/INTEL-8001-43 for more information.

Response    : System panic or reset by BIOS

Impact      : System may be unexpectedly reset

Action      : Replace motherboard

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 50ba84aa-3f12-c2c5-9c0b-8fdec9454104  INTEL-8000-LE  Major    

Fault class : fault.cpu.intel.l1dcache
Affects     : hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0/chip=0/core=3/strand=0
                  faulted and taken out of service
FRU         : hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0/chip=0
                  faulty

Description : A level 1 Data Cache on this cpu is faulty.  Refer to
              http://sun.com/msg/INTEL-8000-LE for more information.

Response    : The system will attempt to offline this cpu to remove it from
              service.

Impact      : Performance of this system may be affected.

Action      : Schedule a repair procedure to replace the affected CPU.  Use
              'fmadm faulty' to identify the module.
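
If you want to hand a support rep just the essentials, the report above is regular enough to pick apart with a few lines of script. What follows is only a rough sketch, assuming the output always looks like the listing above (a dated header row, then "Key : value" fields with indented continuation lines). The function name parse_fmadm_faulty and the SAMPLE text are my own for illustration and are not part of FMA; the real format may differ between releases.

```python
import re

# Header row of one fault record: time, event UUID, message ID, severity.
HEADER_RE = re.compile(
    r"^(?P<time>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) +"
    r"(?P<event_id>[0-9a-f-]{36}) +"
    r"(?P<msg_id>\S+) +"
    r"(?P<severity>\S+)\s*$"
)
# "Key : value" lines such as "Fault class : fault.cpu.intel.nb.ie".
FIELD_RE = re.compile(r"^(?P<key>[A-Z][A-Za-z ]*?) *: (?P<value>.*)$")


def parse_fmadm_faulty(text):
    """Split text shaped like the listing above into one dict per fault.

    Hypothetical helper: it only assumes the layout shown in this post.
    """
    faults = []
    current = None
    last_key = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.groupdict()
            faults.append(current)
            last_key = None
            continue
        if current is None:
            continue  # still in the banner before the first record
        field = FIELD_RE.match(line)
        if field:
            last_key = field.group("key").strip().lower().replace(" ", "_")
            current[last_key] = field.group("value").strip()
        elif line.startswith(" ") and line.strip() and last_key:
            # Indented continuation of the previous field.
            current[last_key] += " " + line.strip()
    return faults


# First record from the listing above, pasted in as sample input.
SAMPLE = """\
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 152a7687-c256-40dd-80b1-83c1f4ed74c7  INTEL-8001-43  Critical

Fault class : fault.cpu.intel.nb.ie
Response    : System panic or reset by BIOS
Impact      : System may be unexpectedly reset
Action      : Replace motherboard
"""

for fault in parse_fmadm_faulty(SAMPLE):
    print(fault["msg_id"], fault["severity"], "->", fault["action"])
```

On a real box you would feed it the captured text of fmadm faulty instead of SAMPLE; how you capture it (and with what privileges) is up to you.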


About

user12610379
