Thursday Apr 02, 2009

Why I removed Ubuntu... Really!!

 

A couple of months ago, I got a new workstation: a quad-core Xeon box from Dell. It came with Ubuntu preinstalled, and I decided to play with that for a while. After a while I noticed that the system would randomly reset. "Ah, the darn unstable Linux" is what I thought!!

I reinstalled the system with OpenSolaris 2008.11 and it ran for a day or so. Then the system reset again. Annoyed, I started looking at /var/adm/messages and found that FMA (the Fault Management Architecture) had detected a hardware fault and taken appropriate action. Now it was nice to call Dell Support and tell them definitively the cause, the diagnosis, and that I needed a new motherboard.

The rep asked me what test I was running. I said, "Nothing." This functionality is built into OpenSolaris... and it's free!! One cannot get this running Linux.

Here's what fmadm faulty printed:

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 152a7687-c256-40dd-80b1-83c1f4ed74c7  INTEL-8001-43  Critical 

Fault class : fault.cpu.intel.nb.ie
FRU         : "MB" (hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0)
                  faulty

Description : Northbridge has detected an internal error  Refer to
              http://sun.com/msg/INTEL-8001-43 for more information.

Response    : System panic or reset by BIOS

Impact      : System may be unexpectedly reset

Action      : Replace motherboard

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 50ba84aa-3f12-c2c5-9c0b-8fdec9454104  INTEL-8000-LE  Major    

Fault class : fault.cpu.intel.l1dcache
Affects     : hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0/chip=0/core=3/strand=0
                  faulted and taken out of service
FRU         : hc://:product-id=Precision-WorkStation-T3400:chassis-id=65QLTH1:server-id=opensolaris/motherboard=0/chip=0
                  faulty

Description : A level 1 Data Cache on this cpu is faulty.  Refer to
              http://sun.com/msg/INTEL-8000-LE for more information.

Response    : The system will attempt to offline this cpu to remove it from
              service.

Impact      : Performance of this system may be affected.

Action      : Schedule a repair procedure to replace the affected CPU.  Use
              'fmadm faulty' to identify the module.
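If you want to hand a report like this to a script (say, to mail yourself the "Action" line), the listing above is regular enough to parse. Here is a minimal sketch in Python. It assumes only the layout visible in the output above (a summary line with time, event ID, message ID and severity, followed by "Label : value" fields with indented continuation lines). The function name parse_fmadm_faulty and the shortened SAMPLE text are my own, for illustration, and other fmadm versions may print a different layout.

```python
import re

# Shortened copy of the report above (FRU, Description, etc. omitted for brevity).
SAMPLE = """\
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 152a7687-c256-40dd-80b1-83c1f4ed74c7  INTEL-8001-43  Critical

Fault class : fault.cpu.intel.nb.ie
Action      : Replace motherboard

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 01:29:08 50ba84aa-3f12-c2c5-9c0b-8fdec9454104  INTEL-8000-LE  Major

Fault class : fault.cpu.intel.l1dcache
Action      : Schedule a repair procedure to replace the affected CPU.  Use
              'fmadm faulty' to identify the module.
"""

# Summary line: timestamp, event UUID, message ID, severity.
SUMMARY = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+[\d:]{8})\s+"
    r"(?P<event_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msg_id>\S+)\s+(?P<severity>\S+)\s*$"
)

# Field line: a label starting in column 0, then " : ", then the value.
FIELD = re.compile(r"^(?P<key>[A-Z][A-Za-z ]*?)\s*:\s(?P<value>.*)$")


def parse_fmadm_faulty(text):
    """Return one dict per fault record found in the report text."""
    records = []
    current = None   # record being filled in
    last_key = None  # most recent field label, for continuation lines
    for line in text.splitlines():
        m = SUMMARY.match(line)
        if m:
            current = m.groupdict()
            records.append(current)
            last_key = None
            continue
        if current is None:
            continue  # still in the header before the first record
        m = FIELD.match(line)
        if m:
            last_key = m.group("key")
            current[last_key] = m.group("value").strip()
        elif last_key and line[:1].isspace() and line.strip():
            # Indented, non-blank line: continuation of the previous field.
            current[last_key] += " " + line.strip()
        # Separator rows, column headers and blank lines are ignored.
    return records


for rec in parse_fmadm_faulty(SAMPLE):
    print(rec["msg_id"], rec["severity"], rec["Fault class"], "->", rec["Action"])
```

On a live system you would feed it the real text, for example by capturing the output of fmadm faulty with subprocess, instead of the SAMPLE string.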


Friday Feb 15, 2008

Recovery on OSDP or Indiana

Six easy steps to recover your scrozzed system in OpenSolaris Developer Preview 2.
About

user12610379
