X

一线数据库工程师的精彩案例分享、新特性介绍、诊断工具和诊断方法、以及常用的测试案例 -- 欢迎光临Oracle数据库中文技术支持官方微博

记一个12c的chm bug导致的ORA-01017

一、现象:
两节点12c RAC,在两节点上export ORACLE_SID再sqlplus / as sysdba都正常登录,然而Commvault通过service_name方式(sqlplus sys/password@service_name as sysdba)登录则是在节点1正常,而节点2报:
ORA-01017: invalid username/password; logon denied
与客户进一步测试,在sqlplus下重现了这个问题:
(节点1 vip 192.168.56.85, 节点2 vip 192.168.56.86) 
1、
两节点分别使用各自的vip新建tnsnames条目:
ONE_TEST =
 (DESCRIPTION =
 (ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.86)(PORT=1569)))
   (CONNECT_DATA =
     (SERVER = DEDICATED)
     (SERVICE_NAME = ONE)
   )
 )

2、测试连接,节点1 ok,节点2报错:
$tnsping ONE_TEST

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 23-MAR-2016 17:08:33

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.86)(PORT = 1569)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ONE)))
OK (0 msec)
$strace -o /tmp/strace_sysdba.output -cfT sqlplus sys/password@ONE_test as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Mar 23 17:09:28 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied

$tnsping ONE_TEST

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 23-MAR-2016 17:21:01

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.85)(PORT = 1569)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ONE)))
OK (10 msec)

$strace -o /tmp/strace_sysdba.output -cfT sqlplus sys/password@ONE_test as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Mar 23 17:21:13 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
二、分析
1、首先让客户做了下sqlnet client trace,与故意输错密码情况的sqlnet client trace对比并无差别。于是考虑到12c的共享口令文件的新特性,让客户在两个节点上分别建立了本地的口令文件,然而问题依旧。
2、让客户测试用普通用户连接,反馈是一样的情况,故排除了口令文件的问题。
3、因客户的口令中含有大小写��建议客户把sec_case_sensitive_logon改成false,但问题还是依旧。
4、对节点2的service连接做了strace:
 strace -fo /tmp/strace_ONE.output sqlplus ONE/password@ONE_TEST
并在设置了ORA-1017的errorstack( errorstack对于这种口令错误问题没什么用,但通过它可以判断连接在哪个实例上遇到的ORA-01017错误):
alter system set events '1017 trace name ERRORSTACK level 3';

然而重现问题之后,两实例上均无errorstack trace生成,alert log中也没有ORA-01017的记录。
故此怀疑报错的实例并非是这两个。
5、通过strace输出发现走了nameserver。但客户并没用到GNS解析,且只使用vip连接。建议客户将/etc/resolv.conf中的nameserver行注释掉看看。注释掉之后问题依旧,但从strace上已可以完全确定ORA-01017报错还是发生在本地库上。
6、回想起一开始检查listener status输出时曾发现如下可疑情况:
$lsnrctl status
... 
Service "ONE" has 2 instance(s).
 Instance "-MGMTDB", status READY, has 1 handler(s) for this service...<===
 Instance "ONE2", status READY, has 1 handler(s) for this service...
当时询问客户怎么这个service下多了个mgmtdb实例,客户说这个应该是12c自带的吧,所以并没在意。
但是它们在同一个service下,如果连接被传递给mgmtdb实例的话,那么肯定会发生ora-01017。查了下mgmtdb的用途,发现其与12c的新特性有关:原本在11g中由Berkeley DB管理的CHM repository改成了Oracle db管理:

MGMTDB is new database instance which is used for storing Cluster Health Monitor (CHM) data. In 11g this was being stored in berkley database but starting Oracle database 12c it is configured as  Oracle Database Instance.

7、于是建议客户将mgmtdb停掉试试。客户反馈停掉之后果然正常连接了。
8、进一步查看相关文档,发现如下bug:
MGMTDB registers Database Service (Doc ID 2063662.1)
GIMR (Management Database) Registers Into Same Service that the Database Instance also registers On RAC (Doc ID 2024572.1)

该问题在数据库与cluster name同名时发生,会导致mgmtdb把自己注册到这个与cluster name同名数据库的default service下。

经客户确认,其数据库名的确与Cluster name相同。 

文档中给出的Workaround有二:一是办法将GIMR重建到不同的共享存储上;另一个办法是按Doc ID 2024572.1给mgmtdb配置local_listener。当然如果可以使用不同的dbname重建数据库则是从根本上避免此bug。

应该还有个办法是通过srvctl stop mgmtdb和srvctl disable mgmtdb来禁用它,但Oracle不推荐将其禁用,因为使用opatchauto打psu时会去尝试启动mgmtdb,如果发现它被disable,则会报错:
PRCR-1005 : Resource ora.mgmtdb is already stopped

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.