Case Study: OCR Manual Cleanup and Reconfiguration
By AVargas on Mar 28, 2007
This morning I had the opportunity to work with a colleague on a case of OCR corruption. It was an interesting learning experience.
Late in the evening I got a call describing this scenario:
The first check was the alert log, but nothing at all had been written to it. It seemed that CRS was trying to start another database.
For the second check we looked at the CRS configuration of the database:
dbtst1 sbtst11 /dbtst/app01/oracle/product/10gASM
dbtst2 sbtst12 /dbtst/app01/oracle/product/10gASM
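A listing like this can also be checked mechanically. The following is a minimal sketch, assuming each line has the form "node instance oracle_home" as shown above; the function name find_wrong_home is hypothetical, and the expected home (10gDB) is an assumption based on the home that the srvctl modify command in this post sets.

```shell
#!/bin/sh
# Read "node instance oracle_home" lines on stdin and print every
# instance whose home differs from the expected one (first argument).
find_wrong_home() {
    expected_home=$1
    awk -v home="$expected_home" \
        'NF >= 3 && $3 != home { print $2 " on " $1 ": " $3 }'
}

# Sample input copied from the configuration listing above.
printf '%s\n' \
    'dbtst1 sbtst11 /dbtst/app01/oracle/product/10gASM' \
    'dbtst2 sbtst12 /dbtst/app01/oracle/product/10gASM' |
    find_wrong_home /dbtst/app01/oracle/product/10gDB
```

On a live cluster the same function could presumably be fed from srvctl config database -d sbtst1 instead of the printf sample.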
[dbtst1] > srvctl modify database -d sbtst1 -n dbtst2 -o /dbtst/app01/oracle/product/10gDB -p +SBDATADG/parameter/sbtst1_spfile.ora
... But only for the instance on node 2. The instance on node 1 seemed to be in the same state as before: not a single byte was written to the alert.log.
In addition, the nodeapps status on node 1 was not correct, so we checked again with srvctl:
dbtst1 sbtst12 /dbtst/app01/oracle/product/10gDB
[dbtst1] > srvctl config nodeapps -n dbtst1
The next step was to reconfigure the virtual IP with vipca, but this attempt failed with the error: virtual IP in use. The status at this point was:
- Removing nodeapps did not succeed, even with -f
- The VIP could not be configured because the previous configuration was still in place in the OCR
First we unregistered both listeners, because they depend on the VIP, and then the VIP itself; that completely cleaned up nodeapps on node 1:
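The ordering matters here: resources that depend on the VIP have to go before the VIP itself. The following is a dry-run sketch of that ordering, not the commands used in this case. It assumes the 10g crs_unregister utility and the usual ora.<node>.<name>.lsnr and ora.<node>.vip naming pattern; the resource names in the example call and the function name unregister_plan are hypothetical.

```shell
#!/bin/sh
# Print crs_unregister commands in dependency order: listener resources
# first, then VIP resources. Dry run only: nothing is executed against CRS.
unregister_plan() {
    # Pass 1: listeners, which depend on the VIP
    for res in "$@"; do
        case "$res" in
            *.lsnr) echo "crs_unregister $res" ;;
        esac
    done
    # Pass 2: the VIP, once nothing depends on it any more
    for res in "$@"; do
        case "$res" in
            *.vip) echo "crs_unregister $res" ;;
        esac
    done
}

# Hypothetical resource names, given in the "wrong" order on purpose.
unregister_plan \
    ora.dbtst1.vip \
    ora.dbtst1.LISTENER_DBTST1.lsnr \
    ora.dbtst1.LISTENER2_DBTST1.lsnr
```

Printing the plan first gives a chance to review the resource names (for example against crs_stat output) before anything is removed from the OCR.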
Adding nodeapps worked this time, and we got ONS, GSD and the VIP up and running using:
We used the Network Configuration Assistant, netca, to remove and recreate the default cluster listener with both the TCP and IPC protocols. IPC provides the local connections that were missing before.
Netca registered the listener with CRS and we were able to start all RAC components.
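A quick way to confirm that everything really is up is to scan the tabular CRS status for anything that is not ONLINE. The sketch below assumes the column layout of crs_stat -t (Name, Type, Target, State, Host, with a header line and a dashed separator); the function name list_not_online is hypothetical, and the sample rows are made up for illustration, not taken from this cluster.

```shell
#!/bin/sh
# Read `crs_stat -t` style output on stdin and print the name of every
# resource whose State column is not ONLINE. The first two lines
# (column header and dashed separator) are skipped.
list_not_online() {
    awk 'NR > 2 && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Illustrative sample only; names and states are invented.
list_not_online <<'EOF'
Name           Type           Target    State     Host
------------------------------------------------------------
ora.dbtst1.gsd application    ONLINE    ONLINE    dbtst1
ora.dbtst1.ons application    ONLINE    ONLINE    dbtst1
ora.dbtst1.vip application    ONLINE    OFFLINE
EOF
```

An empty result means every listed resource reports ONLINE. On a live system the input would come from crs_stat -t rather than the here-document.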