Why is this only a warning?

Welcome to Monday! I'm jumpstarting an ultra-60 in our lab so I can test a bugfix when I see this message:
WARNING: /pci@1f,4000/scsi@3/sd@0,0 (sd2):
        Error for Command: load/start/stop         Error Level: Informational
        Requested Block: 0                         Error Block: 0
        Vendor: SEAGATE                            Serial Number: 9808500387
        Sense Key: Soft Error
        ASC: 0x5d (drive operation marginal, service immediately (failure prediction threshold exceeded)), ASCQ: 0x0, FRU: 0x45
That's a pretty serious-looking message, so why is it only a "WARNING" rather than an "ERROR"?

The answer comes from the routine gda_errmsg(..), which is in usr/src/uts/common/io/dktp/dcdev/gda.c starting at line 247. This routine calls gda_log(..), which is a wrapper around cmn_err(..). One of the parameters we pass to cmn_err(..) is the error level: CE_CONT, CE_NOTE, CE_WARN, CE_PANIC or CE_IGNORE (defined in usr/src/uts/common/sys/cmn_err.h). The gda_errmsg(..) routine passes CE_WARN (that's the first part of the message above) and CE_CONT (the rest of the message).

So what should I do about this message? Replace the disk immediately. There is no other sensible option. The message says that the drive's failure prediction threshold has been exceeded, so the drive's internal electronics are telling you that it's about to die. In my case this is a rather old 4 GB Seagate disk, so I'm more than happy to get a new one in instead.

We don't pass CE_PANIC as an argument to gda_log(..) because we do not want to take out the system due to a (generally) online-resolvable issue. Of course, if this is your boot disk you'd better take action right away, but Solaris isn't going to panic on you from this incident.

Moral of the story: don't ignore "WARNING" messages just because they're only "WARNING"s, and always read the full text of the message. It could really be an error.
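To make the CE_WARN / CE_CONT split concrete, here is a minimal user-space sketch of how a two-part message like the one above gets its "WARNING:" prefix. It is not the code from gda.c: mock_cmn_err(), mock_report_soft_error() and mock_log are hypothetical stand-ins that write into a buffer instead of the console. Only the level names (and their values) follow sys/cmn_err.h, and the prefix and newline handling is a simplified assumption about how cmn_err(..) treats each level.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Severity levels; names and values mirror sys/cmn_err.h. */
enum { CE_CONT = 0, CE_NOTE = 1, CE_WARN = 2, CE_PANIC = 3, CE_IGNORE = 4 };

/* Hypothetical stand-in for the console/syslog: messages are appended here. */
static char mock_log[512];

/*
 * Hypothetical user-space imitation of cmn_err(). Assumption: CE_NOTE,
 * CE_WARN and CE_PANIC get a prefix and a trailing newline, CE_CONT is
 * emitted verbatim, and CE_IGNORE is dropped. A real CE_PANIC would also
 * bring the system down, which this mock deliberately does not do.
 */
static void
mock_cmn_err(int level, const char *fmt, ...)
{
	static const char *prefix[] = { "", "NOTICE: ", "WARNING: ", "panic: " };
	char body[256];
	size_t used;
	va_list ap;

	if (level < CE_CONT || level > CE_PANIC)
		return;		/* CE_IGNORE and unknown levels: no output */

	va_start(ap, fmt);
	(void) vsnprintf(body, sizeof (body), fmt, ap);
	va_end(ap);

	used = strlen(mock_log);
	(void) snprintf(mock_log + used, sizeof (mock_log) - used, "%s%s%s",
	    prefix[level], body, level == CE_CONT ? "" : "\n");
}

/*
 * Hypothetical reporter: the header line goes out at CE_WARN, and the
 * detail lines follow at CE_CONT so they carry no prefix of their own.
 */
static void
mock_report_soft_error(const char *devpath, const char *name, int asc)
{
	mock_cmn_err(CE_WARN, "%s (%s):", devpath, name);
	mock_cmn_err(CE_CONT, "\tSense Key: Soft Error\n");
	mock_cmn_err(CE_CONT, "\tASC: 0x%x\n", asc);
}
```

Calling mock_report_soft_error("/pci@1f,4000/scsi@3/sd@0,0", "sd2", 0x5d) leaves mock_log starting with "WARNING: /pci@1f,4000/scsi@3/sd@0,0 (sd2):" followed by the unprefixed detail lines. The severity shown on the first line is simply whichever level the caller chose for the header, not a judgment on how bad the details are.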
About

I work at Oracle in the Solaris group. The opinions expressed here are entirely my own, and neither Oracle nor any other party necessarily agrees with them.
