Time-saving tactic: How we saved 6 hrs of downtime in a production 10g upgrade

Preface

So you want to upgrade your database to 10g from 9i. Well, welcome to the club.

If you have RAC, then you will definitely need to install the CRS and database binaries, along with some RDBMS patches.

When a customer challenged us to keep the downtime within 24 hrs, it set us thinking about which time-saving tactics could help us get there.

Divide and conquer the downtime

The old strategy of divide and conquer is time tested and works well in most paradigms. In this article, we demonstrate how it's possible to split the 10g upgrade downtime into two logical parts:

Part 1)

Install the 10g technology stack ahead of time, including the database binaries, any patchsets (usually 10.2.0.3) and RDBMS patches. Before doing this, we shut down the 9i RAC/DB. After the 10g CRS/DB technology stack installation, you shut down the 10g CRS, bring up the 9i RAC processes and carry on as if nothing happened.


At this stage, the 9i and 10g technology stacks will co-exist peacefully with each other. The production system can run on 9i RAC for a week or more until you decide to do the actual upgrade.

Part 2)

In the subsequent main upgrade outage, you can shut down the 9i RAC/DB, bring up the DB using the 10g CRS/DB Oracle home and do the actual upgrade. This outage could last anywhere from 10-24 hrs, or even up to 32 hrs, depending on the timings you measured in pre-production practice runs. It is assumed that one would do at least 2-3 practice rounds of the upgrade before gaining the confidence to do the production round.


Adopting this split strategy saved us about 6 hrs in the main upgrade downtime window, and we were able to do the upgrade in a window of 16-20 hrs. The size of the database was ~1 TB on HP-UX PA-RISC 64-bit.

How it was done

When you do a 9i->10g upgrade for RAC, the following happens:

1) OCR device location is automatically picked up from /var/opt/oracle/srvConfig.loc (9i  setup file)

2) The OCR device's contents are upgraded to 10g format

3) A new file called /var/opt/oracle/ocr.loc is created with the OCR device name
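
To see which OCR device each release will pick up, it can help to read the two location files directly. The sketch below assumes the usual key=value layout (srvconfig_loc=<device> in srvConfig.loc and ocrconfig_loc=<device> in ocr.loc). The file contents are not shown in this post, so treat those key names, and the helper name ocr_location, as assumptions rather than as what we ran.

```shell
# Print the OCR device recorded in a 9i srvConfig.loc or a 10g ocr.loc file.
# ASSUMPTION: the file holds a line of the form srvconfig_loc=<device>
# (9i) or ocrconfig_loc=<device> (10g). Other lines are ignored.
# ocr_location is a hypothetical helper name, not an Oracle utility.
ocr_location() {
    # $1: path to the .loc file
    sed -n -e 's/^srvconfig_loc=//p' -e 's/^ocrconfig_loc=//p' "$1"
}

# Example usage (read-only, safe to run as the oracle user):
#   ocr_location /var/opt/oracle/srvConfig.loc
#   ocr_location /var/opt/oracle/ocr.loc
```

Running it before and after the 10g CRS installation is a cheap way to confirm which device each stack is about to touch.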

Since we had to preserve the 9i OCR device for running 9i RAC after the 10g CRS/DB techstack installation, we did the following:

1) Got a new set of OCR and voting devices for 10g. This was a separate set of devices in addition to the 9i OCR and voting disks. Then we brought down the 9i CRS processes.

A caveat here was that HP Serviceguard (the vendor cluster solution) had to be up to perform the 10g CRS installation.

2) Copied the contents of the 9i OCR into the 10g OCR device using the dd command:

$ whoami
oracle

$ dd if=/dev/rawdev/raw_9i_ocr.dbf  of=/dev/rawdev/raw_10g_ocr.dbf   bs=1024


3) Edited the srvConfig.loc file to point to the /dev/rawdev/raw_10g_ocr.dbf file

4) Did the 10g CRS installation and ran root.sh that upgraded the OCR device to 10g format

5) We then installed the 10.2.0.1 DB binaries and applied the 10.2.0.3 patchset, along with RDBMS patches. This was the major activity and took about 4-5 hrs.

Another way to save time here would have been to clone the 10.2.0.3 ORACLE_HOME from the UAT servers, but since this was production, we wanted a clean start and did not want to carry over any mistakes from UAT.

6) Brought down the 10g CRS services with $ORA_CRS_HOME/bin/crsctl stop crs and also disabled the automatic startup of CRS with /etc/init.d/init.crs disable

7) Re-pointed /var/opt/oracle/srvConfig.loc back to the 9i OCR device, /dev/rawdev/raw_9i_ocr.dbf

We then brought the 9i CRS services back up and started the database with the 9i binaries.

At this point, the 9i and 10g binaries co-existed peacefully, and 9i ran as if the 10g techstack was never there.
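
Steps 2, 3 and 7 above come down to copying one device and rewriting one line in srvConfig.loc, so they are easy to wrap in small helpers. The following is a minimal sketch, not the script we ran: the function names same_prefix and repoint_srvconfig are hypothetical, and the srvconfig_loc=<device> line format is assumed since the file contents are not shown in this post.

```shell
# Compare the first N KB of two devices (or files) by checksum.
# Useful after the dd copy in step 2. Returns 0 when the prefixes match.
# $1: source, $2: destination, $3: number of 1 KB blocks to compare.
same_prefix() {
    src=$1
    dst=$2
    kb=$3
    a=$(dd if="$src" bs=1024 count="$kb" 2>/dev/null | cksum)
    b=$(dd if="$dst" bs=1024 count="$kb" 2>/dev/null | cksum)
    [ "$a" = "$b" ]
}

# Point a srvConfig.loc-style file at a different OCR device (steps 3 and 7).
# ASSUMPTION: the file contains a line srvconfig_loc=<device>.
# A copy of the previous file is kept next to it with a .bak suffix.
# The device path must not contain '|' or '&', which sed would misread.
# $1: path to the .loc file, $2: new OCR device path.
repoint_srvconfig() {
    loc_file=$1
    new_dev=$2
    cp -p "$loc_file" "$loc_file.bak" || return 1
    sed "s|^srvconfig_loc=.*|srvconfig_loc=$new_dev|" "$loc_file.bak" > "$loc_file"
}

# Example usage (run only with the cluster stack for that release down;
# <size_in_kb> is the size of the 9i OCR device, left for you to fill in):
#   same_prefix /dev/rawdev/raw_9i_ocr.dbf /dev/rawdev/raw_10g_ocr.dbf <size_in_kb>
#   repoint_srvconfig /var/opt/oracle/srvConfig.loc /dev/rawdev/raw_10g_ocr.dbf
#   ... install 10g CRS/DB, stop and disable 10g CRS ...
#   repoint_srvconfig /var/opt/oracle/srvConfig.loc /dev/rawdev/raw_9i_ocr.dbf
```

Rewriting the file from its backup copy, rather than replacing it, keeps the original ownership and permissions on srvConfig.loc, and the .bak file gives a quick way back if the edit goes wrong.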

Conclusion

There might be better ways of reducing downtime during the upgrade, but this one worked for us. Cutting the main outage was one of the selling points when we pitched the 10g upgrade to the customer, especially since we were under extreme pressure to keep the main downtime window under 24 hrs.




Comments:

you could do it with near zero downtime with Streams

Posted by Marc Musette on June 02, 2008 at 06:46 PM EDT #
