crsctl start crs does not work in 10gR2

Preface

We are currently doing a 10g upgrade for one of our clients and hit upon the idea of pre-staging the 10gR2 CRS + DB technology stack on their RAC servers, which are already running 9iR2 RAC on HP Serviceguard. This is purely a downtime-reduction technique, and it saved about 5-6 hours. Thankfully, the idea worked, but not before some excitement.

Surprise, surprise..

A week after doing the 10gR2 CRS + DB installation on the pre-production servers, when we were starting the real Database upgrade, we had to bring up the 10gR2 CRS.

I was surprised to see that the 10gR2 CRS services would not come up. We had tried the following four things:

1) Uncommented the crs, css, and evm daemon entries in /etc/inittab
2) Issued /etc/init.d/init.crs enable
3) Issued /etc/init.d/init.crs start
4) Issued $ORA_CRS_HOME/bin/crsctl start crs
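
Step 1 is easy to get wrong by leaving one entry commented out, so it can be worth verifying before attempting a start. The following is a minimal sketch, not a supported Oracle tool: it assumes the inittab entries reference scripts named init.evmd, init.cssd, and init.crsd (an assumption that may not hold on every platform), and the function name active_crs_entries is made up for illustration.

```shell
#!/bin/sh
# Print the CRS daemon scripts that have an active (uncommented)
# entry in the given inittab-style file. The script names below are
# an assumption about how the 10gR2 entries are written.
active_crs_entries() {
    inittab_file=${1:-/etc/inittab}
    for script in init.evmd init.cssd init.crsd; do
        # Drop comment lines first, then look for the script name.
        if grep -v '^[[:space:]]*#' "$inittab_file" | grep -q "/$script"; then
            echo "$script"
        fi
    done
}
```

Calling active_crs_entries with no argument reads /etc/inittab; if fewer than three names are printed, at least one entry is still commented out or missing.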

I was pretty aghast. Besides considering logging a TAR, we searched Metalink and came across a new command that I had not tried in 10gR1.


raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./srvctl status nodeapps -n raclinux1
PRKH-1010 : Unable to communicate with CRS services.
[Communications Error(Native: prsr_initCLSS:[3])]

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./srvctl start nodeapps -n raclinux1
PRKH-1010 : Unable to communicate with CRS services.
[Communications Error(Native: prsr_initCLSS:[3])]

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./srvctl status nodeapps -n raclinux1
PRKH-1010 : Unable to communicate with CRS services.
[Communications Error(Native: prsr_initCLSS:[3])]

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

Redemption

That is when we tried crsctl start resources, and the CRS actually came up:

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl start resources
Starting resources.
Successfully started CRS resources

raclinux1:/opt/oracle/product/10.2.0/CRS/bin # ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
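
Since crsctl start crs only promises that the stack "will be started shortly", we ended up re-running the check by hand. A small polling helper can do the same thing. This is a sketch under stated assumptions: the healthy strings are the ones shown in the output above, while the function names crs_is_healthy and wait_until, the attempt count, and the delay are all hypothetical choices for illustration.

```shell
#!/bin/sh
# Return 0 only if the text on stdin reports all three daemons healthy,
# using the message strings shown in the crsctl check crs output above.
crs_is_healthy() {
    output=$(cat)
    for daemon in CSS CRS EVM; do
        printf '%s\n' "$output" | grep -q "^$daemon appears healthy" || return 1
    done
    return 0
}

# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <sleep_seconds> <command> [args...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Wrapper that ties the two together on a cluster node.
# Assumes ORA_CRS_HOME is set in the environment.
check_crs_once() {
    "$ORA_CRS_HOME"/bin/crsctl check crs 2>&1 | crs_is_healthy
}
```

On a node, something like wait_until 10 30 check_crs_once would poll for up to roughly five minutes. The check parses the output text rather than relying on the exit status of crsctl, since the post does not show what exit codes it returns.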

Conclusion

At this point, I am not sure why the behaviour changed in 10gR2: whether it was intentional, unintentional, or a bug. But I am glad that we have a workaround. Every day brings something new to learn.


Comments:

Hi gaurav.verma: Yesterday I hit the same issue as yours. I have two nodes in my RAC environment, and one of them cannot start CRS. Following your steps, I finally executed the command crsctl start resources, but CRS was still down. Why does CRS work well on one node but not on the other? As far as I can tell, my OCR and voting disk are working. Can you give me some advice?

Posted by Kevin.yuan on July 17, 2008 at 02:36 AM EDT #


Recently I faced a problem starting the Oracle application on one node of my galaxy cluster. In the log I found that the CRS daemon had not started after the node booted, so I tried to start it manually but got an error. Here is the workaround that got the CRS services started.

The error I was getting while starting Oracle was:

PRKC-1056 : Failed to get the hostname for node galclus157
PRKH-1010 : Unable to communicate with CRS services.
[Communications Error(Native: prsr_initCLSS:[3])]

root@galclus157# crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

When I tried to start crsd manually, the service did not start. After debugging this error, I found that the CRS service depends on the ucmmd service. So please check whether it is already running, and start it if it is not:

root@galclus157# ps -aef | grep ucmmd
root 2030 1 0 May 12 ? 13:12 ucmmd -r /usr/cluster/lib/ucmm/ucmm_reconf

Posted by Amit Ranjan Sahu on May 18, 2009 at 06:29 PM EDT #

This actually helps. Today I ran into the same trouble and did not know how to work it out; I searched Yahoo and discovered your blog. Thanks once more.

Just one thing: can I post this entry on my site? I will add the source.

Regards!

Posted by cool amber on December 18, 2009 at 12:13 PM EST #
