Wednesday Nov 08, 2006

Unbreakable Oracle 10g Release 2 : What if you have ORA-600 kcratr1_lastbwr ?

ORA-600_kcratr1 <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"> </script> <script type="text/javascript"> _uacct = "UA-917120-1"; urchinTracker(); </script>
This an interesting story that happened yesterday on one of our customer site. An engineer powered off the wrong rack of equipment containing a Sun Fire X4600 running Oracle 10g Release 2.  Almost no transactions were performed at time so when the system came up the customer expected the database to be up and running very quickly.

In reality this is what happened :

Completed: ALTER DATABASE   MOUNT
Tue Nov  7 11:19:42 2006
ALTER DATABASE OPEN
Tue Nov  7 11:19:42 2006
Beginning crash recovery of 1 threads
 parallel recovery started with 16 processes
Tue Nov  7 11:19:44 2006
Started redo scan
Tue Nov  7 11:19:44 2006
Errors in file /xxx/oracle/oracle/product/10.2.0/db_1/admin/xxx/udump/xxx_ora_947.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
Tue Nov  7 11:19:44 2006
Aborting crash recovery due to error 600
Tue Nov  7 11:19:44 2006
Errors in file /xxx/oracle/oracle/product/10.2.0/db_1/admin/xxxtest/udump/xxxtest_ora_947.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

Not too pretty ! Checking the ASM configuration and the IO subsystem showed nothing wrong. So what to do if you do not have a backup handy ?

Well, here is the idea .... what would we do if we had a backup that was inconsistent ?
The recover database command will start an Oracle process which will roll forward all transactions stored in the restored archived logs necessary to make the database consistent again. The recovery process must run up to a point that corresponds with the time just before the error occurred after which the log sequence must be reset to prevent any further system changes from being applied to the database.

So we tried :

startup mount

ALTER DATABASE   MOUNT
Tue Nov  7 11:54:03 2006
Starting background process ASMB
ASMB started with pid=61, OS id=1070
Starting background process RBAL
RBAL started with pid=67, OS id=1074
Tue Nov  7 11:54:13 2006
SUCCESS: diskgroup xxxTESTDATA was mounted
Tue Nov  7 11:54:17 2006
Setting recovery target incarnation to 2
Tue Nov  7 11:54:17 2006
Successful mount of redo thread 1, with mount id 2364224219
Tue Nov  7 11:54:17 2006
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Tue Nov  7 11:54:32 2006

recover database


ALTER DATABASE RECOVER  database 
Tue Nov  7 11:54:32 2006
Media Recovery Start
 parallel recovery started with 16 processes
Tue Nov  7 11:54:33 2006
Recovery of Online Redo Log: Thread 1 Group 3 Seq 4 Reading mem 0
  Mem# 0 errs 0: +xxxTESTDATA/xxxtest/onlinelog/group_3.263.605819131
Tue Nov  7 11:59:25 2006
Media Recovery Complete (xxxtest)
Tue Nov  7 11:59:27 2006
Completed: ALTER DATABASE RECOVER  database 


alter database open

Tue Nov  7 12:03:01 2006
alter database open
Tue Nov  7 12:03:01 2006
Beginning crash recovery of 1 threads
 parallel recovery started with 16 processes
Tue Nov  7 12:03:01 2006
Started redo scan
Tue Nov  7 12:03:01 2006
Completed redo scan
 273 redo blocks read, 0 data blocks need recovery
Tue Nov  7 12:03:01 2006
Started redo application at
 Thread 1: logseq 4, block 12858574
Tue Nov  7 12:03:01 2006
Recovery of Online Redo Log: Thread 1 Group 3 Seq 4 Reading mem 0
  Mem# 0 errs 0: +xxxTESTDATA/xxxtest/onlinelog/group_3.263.605819131
Tue Nov  7 12:03:01 2006
Completed redo application
Tue Nov  7 12:03:01 2006
Completed crash recovery at
 Thread 1: logseq 4, block 12858847, scn 824040
 0 data blocks read, 0 data blocks written, 273 redo blocks read
Tue Nov  7 12:03:02 2006
Thread 1 advanced to log sequence 5
Thread 1 opened at log sequence 5
  Current log# 1 seq# 5 mem# 0: +xxxTESTDATA/xxxtest/onlinelog/group_1.261.605819081
Successful open of redo thread 1
Tue Nov  7 12:03:02 2006
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Nov  7 12:03:02 2006
SMON: enabling cache recovery
Tue Nov  7 12:03:03 2006
Successfully onlined Undo Tablespace 1.
Tue Nov  7 12:03:03 2006
SMON: enabling tx recovery
Tue Nov  7 12:03:03 2006
Database Characterset is UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=56, OS id=1128
Tue Nov  7 12:03:05 2006
Completed: alter database open



And we are up and running ! The real thing that Oracle should work on is the quality and clarity of their error messages.
At this point this is quite poor ...

 Unbreakable database, maybe. Automatic (and simple) , not yet.

About

mrbenchmark

Search

Categories
Archives
« July 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
  
       
Today
News
Blogroll
deepdive

No bookmarks in folder