如何 Relink 11.2 Grid Infrastructure(GI) 和 11.2 RAC?

     对于Oracle Clusterware(CRS) 10g 和11.1, CRS 的binary 不能被 relink。CRS 中的client shared libraries可以被relink, 但是这个relink只有当OS升级或者给OS打补丁出现问题后才需要,大部分时候不需要relink CRS。

     如果想对CRS中的client shared libraries做relink, 请参考MOS文档中的步骤:
     Will an Operating System Upgrade Affect Oracle Clusterware? [ID 743649.1]

     对于Oracle Grid Infrastructure(GI) 11.2 及之后的版本,在GRID HOME中有一些binary需要在OS升级或者打补丁后被relink。
     对于数据库软件(RDBMS binary),在OS升级或者OS打补丁后推荐做relink, RAC 的binary也是一样的,需要relink。

     下面是在11.2 集群环境中执行relink的过程,包括了对GI和RAC做relink的步骤:

1. 首先停止这个节点上的所有数据库实例,这是因为之后停止CRS时虽然会停止数据库实例,但是是以shutdown abort的方式,我们需要以shutdown immediate或者normal来停止数据库实例:
$su - oracle
$srvctl stop instance -d <db_name> -i <instance_name> -o immediate

2. 如果业务需要高可用性,确保这个实例上的service已经切换到了其它节点的实例上。
$ srvctl status service -d <db_name>

3. 用root用户执行<GRID_HOME>/crs/install/rootcrs.pl -unlock来修改相应目录权限并停止GI:
[root@rac1 ~]# cd /u01/app/11.2.0/grid/crs/install
[root@rac1 install]# perl rootcrs.pl -unlock
Using configuration parameter file: ./crsconfig_params


CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.rac2.vip' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2677: Stop of 'ora.rac2.vip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.CRS.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.racdb.db' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.CRS.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.racdb.db' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.RECO.dg' on 'rac1'
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.RECO.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully unlock /u01/app/11.2.0/grid

注意,如果在$GRID_HOME/rdbms/audit下面的audit文件很多,会导致rootcrs.pl执行很长时间,这样的话可以将$GRID_HOME/rdbms/audit/*.aud 文件备份到GRID_HOME之外,然后删除。

4. 禁止GI在OS重启后自动启动,这是因为升级OS或者打OS补丁后,可能需要重启主机,这样的话,需要在relink之前禁止GI启动。
用root用户:
[root@rac1 install]# crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.

5. 备份GI和RDBMS的ORACLE_HOME。

6. 升级OS或者给OS打补丁,包括重启主机等(如果需要)。

7. 用GI的属主用户来对GI binary进行relink:

[root@rac1 audit]# su - grid
[grid@rac1 ~]$ export ORACLE_HOME=/u01/app/11.2.0/grid

确保GI是停止的,然后再执行relink:
[grid@rac1 ~]$ ps -ef|grep d.bin
grid      3408  3360  0 17:09 pts/0    00:00:00 grep d.bin

[grid@rac1 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

[grid@rac1 ~]$ $ORACLE_HOME/bin/relink
writing relink log to: /u01/app/11.2.0/grid/install/relink.log

[grid@rac1 ~]$ <===relink结束后,并不会有任何信息提示,只是显示命令提示符。

需要检查/u01/app/11.2.0/grid/install/relink.log, 查看是否有错误。

下面截取了末尾的一些行,如下:
...
 - Linking Oracle
rm -f /u01/app/11.2.0/grid/rdbms/lib/oracle
gcc  -o /u01/app/11.2.0/grid/rdbms/lib/oracle -m64 -L/u01/app/11.2.0/grid/rdbms/lib/ -L/u01/app/11.2.0/grid/lib/ -
...
lsnls11 -lnls11 -lcore11 -lnls11 -lasmclnt11 -lcommon11 -lcore11 -laio    `cat /u01/app/11.2.0/grid/lib/sysliblist` -Wl,-
rpath,/u01/app/11.2.0/grid/lib -lm    `cat /u01/app/11.2.0/grid/lib/sysliblist` -ldl -lm   -L/u01/app/11.2.0/grid/lib
test ! -f /u01/app/11.2.0/grid/bin/oracle ||\
           mv -f /u01/app/11.2.0/grid/bin/oracle /u01/app/11.2.0/grid/bin/oracleO
mv /u01/app/11.2.0/grid/rdbms/lib/oracle /u01/app/11.2.0/grid/bin/oracle
chmod 6751 /u01/app/11.2.0/grid/bin/oracle

8. 用RDBMS的属主对数据库binary做relink:
su - oracle
确保$ORACLE_HOME设置为了数据库的ORACLE_HOME,然后执行:

[oracle@rac1 ~]$ $ORACLE_HOME/bin/relink all
writing relink log to: /u01/app/oracle/product/11.2.0/dbhome_1/install/relink.log
<===relink结束后,并不会有任何信息提示,只是显示命令提示符。

需要检查/u01/app/oracle/product/11.2.0/dbhome_1/install/relink.log, 查看是否有错误。

截取relink.log中部分内容:

Starting Oracle Universal Installer... <<<<<<开头
...
le/product/11.2.0/dbhome_1/lib/sysliblist` -ldl -lm   -L/u01/app/oracle/product/11.2.0/dbhome_1/lib
test ! -f /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle ||\
           mv -f /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle /u01/app/oracle/product/11.2.0/dbhome_1/bin/
oracleO
mv /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/oracle /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
chmod 6751 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle <<<<<<结尾

9. 用root用户执行<GRID_HOME>/crs/install/rootcrs.pl -patch来修改相应目录权限并启动GI:

[root@rac1 ~]# cd /u01/app/11.2.0/grid/crs/install
[root@rac1 install]# perl rootcrs.pl -patch
Using configuration parameter file: ./crsconfig_params
CRS-4123: Oracle High Availability Services has been started.

10. Enable CRS来保证主机重启后可以自动启动GI:
[root@rac1 install]# crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.

11. 确认所有的应启动的资源都已启动:
[root@rac1 install]#  crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
ora.DATA.dg
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
ora.RECO.dg
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
ora.asm
               ONLINE  ONLINE       rac1                     Started            
               ONLINE  ONLINE       rac2                     Started            
ora.gsd
               OFFLINE OFFLINE      rac1                                        
               OFFLINE OFFLINE      rac2                                        
ora.net1.network
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
ora.ons
               ONLINE  ONLINE       rac1                                        
               ONLINE  ONLINE       rac2                                        
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2                                        
ora.cvu
      1        ONLINE  ONLINE       rac2                                        
ora.oc4j
      1        ONLINE  ONLINE       rac2                                        
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                        
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                                        
ora.racdb.db
      1        ONLINE  ONLINE       rac2                     Open               
      2        OFFLINE OFFLINE                               Instance Shutdown  
ora.scan1.vip
      1        ONLINE  ONLINE       rac2                                        

如果发现实例没有启动,可以手工启动:
$srvctl start instance -d <db_name> -i <instance_name>

12. 可以用下面的MOS文档中的方法来确认oracle 的binary是RAC的:
How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC [ID 284785.1]

方法1:如果下面的命令能查出kcsm.o ,说明binary是RAC的:
su - oracle
$ar -t $ORACLE_HOME/rdbms/lib/libknlopt.a|grep kcsm.o
kcsm.o

在AIX上命令是不同的:
 ar -X32_64 -t $ORACLE_HOME/rdbms/lib/libknlopt.a|grep kcsm.o

方法2:查看RAC特有的后台进程是否存在,比如:
[grid@rac1 ~]$ ps -ef|grep lmon
grid      7732     1  0 17:59 ?        00:00:17 asm_lmon_+ASM1
oracle   18605     1  0 20:49 ?        00:00:00 ora_lmon_RACDB1 <===========
grid     20992 10160  0 21:10 pts/2    00:00:00 grep lmon

上面的所有步骤需要在集群的各个节点上依次执行。

上述relink GI的过程来源于下面MOS文档中章节 “Do I need to relink the Oracle Clusterware / Grid Infrastructure home after an OS upgrade?”
 RAC: Frequently Asked Questions [ID 220970.1]


参与此主题的后续讨论,可以访问我们的中文社区,跟帖“分享: 如何 Relink 11.2 Grid Infrastructure(GI) 和 11.2 RAC?"


注意上面的过程适用于集群环境,对于单机版的GI (即Restart),请参考MOS下面的文档来进行relink:
How To Relink The Oracle Grid Infrastructure Standalone (Restart) Installation (Non-RAC). [ID 1536057.1]

评论:

发表一条评论:
  • HTML语法: 禁用
About

本博客由Oracle全球技术支持中国区的工程师维护。为中文用户提供数据库相关的技术支持信息,包括常用的诊断工具、诊断方法、产品新特性、案例分析等。此外,MOS也陆续推出各类中文内容:技术通讯统一发布在Note 1529795.1 中,中文文档列表更新在Note 1533057.1 中,网上讲座请查看MOS文档 1456176.1,在"Archived"中可以下载历史的录音和文档。

Search

Archives
« 四月 2014
星期日星期一星期二星期三星期四星期五星期六
  
1
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
今天