How to change the private interface in a 10g cluster

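To change the actual IP address of the private interface, you must make the change at the operating system level, for example with the ifconfig command and by editing /etc/hosts. To avoid node evictions, the CRS stack must be down while you do this. Once the IP address has been changed, bring the CRS stack back up; only then can you use the oifcfg command to update the information stored in the OCR.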

Before 10gR2, stop and start the CRS stack with init.crs.

# init.crs stop
# init.crs start

From 10gR2 onward, you can use the crsctl command.

# crsctl stop crs
# crsctl start crs
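
The choice between the two command families can be sketched as a small shell helper. This is only an illustration: crs_stack_cmd is a hypothetical name, the release is passed in by hand, and the helper prints the command instead of running it, so nothing touches the cluster.

```shell
# Print (do not run) the command that stops or starts the CRS stack.
# crs_stack_cmd is a hypothetical helper name, not an Oracle tool.
crs_stack_cmd() {
    release=$1    # clusterware release, e.g. 10.1 or 10.2.0.4
    action=$2     # stop or start
    usage="usage: crs_stack_cmd <release> stop|start"

    case $action in
        stop|start) ;;
        *) echo "$usage" >&2; return 1 ;;
    esac

    # Split the release into its major and minor numbers.
    major=${release%%.*}
    case $release in
        *.*) rest=${release#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;
    esac
    case $major in ''|*[!0-9]*) echo "$usage" >&2; return 1 ;; esac
    case $minor in ''|*[!0-9]*) echo "$usage" >&2; return 1 ;; esac

    # Assumption: every release from 10.2 on accepts "crsctl stop|start crs".
    if [ "$major" -gt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -ge 2 ]; }; then
        echo "crsctl $action crs"
    else
        echo "init.crs $action"
    fi
}
```

For example, crs_stack_cmd 10.1 stop prints "init.crs stop", while crs_stack_cmd 10.2.0.4 start prints "crsctl start crs". Either command still has to be run as root on each node.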

1. Check the current network interface configuration

% $ORA_CRS_HOME/bin/oifcfg getif 

2. Delete the existing private interface setting

% $ORA_CRS_HOME/bin/oifcfg delif -global eth1

3. Add the new private interface setting (this example changes only the subnet; putting a different name in place of eth1 would change the interface name)

% $ORA_CRS_HOME/bin/oifcfg setif -global eth1/192.168.1.0:cluster_interconnect

4. Check the network interface configuration again to confirm that the change took effect

% $ORA_CRS_HOME/bin/oifcfg getif 

5. Restart the database instance to confirm that RAC uses the correct private interface for Cache Fusion. Check the alert log for output like the following:

Cluster communication is configured to use the following interface(s) for this instance
192.168.1.1
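
The five steps above can be sketched as a few shell helpers. Treat this as a rough outline under stated assumptions: the function names are hypothetical, getif output is assumed to have the usual four columns (interface, subnet, scope, type), and the oifcfg commands are only printed for review, never executed.

```shell
# List the interfaces that "oifcfg getif" reports as cluster_interconnect.
# Reads getif output on stdin; assumes four columns:
#   <interface> <subnet> <scope> <type>
private_interfaces() {
    awk '$4 == "cluster_interconnect" { print $1 "/" $2 }'
}

# Print (do not run) the oifcfg commands for steps 2 to 4.
# Arguments: interface to remove, new interface, new subnet.
plan_interconnect_change() {
    if [ $# -ne 3 ]; then
        echo "usage: plan_interconnect_change <old_if> <new_if> <new_subnet>" >&2
        return 1
    fi
    old_if=$1
    new_if=$2
    new_subnet=$3
    # Kept literal on purpose so the printed plan can be pasted on any node.
    oifcfg='$ORA_CRS_HOME/bin/oifcfg'
    echo "$oifcfg delif -global $old_if"
    echo "$oifcfg setif -global $new_if/$new_subnet:cluster_interconnect"
    echo "$oifcfg getif"
}

# Step 5: print the address lines that follow the interconnect message
# in an alert log read from stdin.
interconnect_addresses() {
    awk '
        /Cluster communication is configured to use the following interface/ { grab = 1; next }
        grab && $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $1; next }
        { grab = 0 }
    '
}
```

A typical session would pipe the real getif output into private_interfaces, review the lines printed by plan_interconnect_change eth1 eth1 192.168.1.0, run them by hand, and after the instance restart feed the alert log to interconnect_addresses to see which address the instance picked up.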

How to resolve ORA-01034 during RAC failover

Today, while testing RAC failover for Oracle 9.2.0.8 on HP-UX IA64 at a customer site, I ran into ORA-01034.

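Symptom:
After shutting down any one instance in the RAC environment (whether with shutdown abort or shutdown immediate), remote clients connecting to the RAC service through TNS intermittently get ORA-01034.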

$ sqlplus system/oracle@prod 

SQL*Plus: Release 9.2.0.8.0 - Production on Tue Nov 17 20:52:09 2009

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

ERROR:
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
HPUX-ia64 Error: 2: No such file or directory

The client-side TNS configuration is a very conventional client failover setup.

PROD  =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = VIP1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = VIP2)(PORT = 1521))
      (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = prod)
      (FAILOVER_MODE=
       (TYPE=SELECT)
       (METHOD=BASIC))
    )
  )

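After puzzling over this for quite a while, I carefully checked the server-side listener.ora and found that GLOBAL_DBNAME was set. This parameter is only needed for Oracle8 and Oracle7, which cannot yet register service names dynamically with the local listener. From Oracle9i onward, setting it causes failover to fail.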

Change the configuration in listener.ora from:

SID_LIST_LISTENER_PROD2 =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME=prod)
      (ORACLE_HOME = /oracle/product/9.2)
      (SID_NAME = prod2)
    )
  )

to:

SID_LIST_LISTENER_PROD2 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /oracle/product/9.2)
      (SID_NAME = prod2)
    )
  )

I tested failover again and everything worked.
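
To check other nodes for the same setting, a quick scan of listener.ora is enough. The sketch below is only a convenience: find_global_dbname is a hypothetical name, the file path is whatever you pass in, and it simply reports uncommented lines that mention GLOBAL_DBNAME without editing anything.

```shell
# Print line numbers and text of active GLOBAL_DBNAME settings in a listener.ora.
# Lines starting with "#" are treated as comments and skipped.
# Returns non-zero when no active setting is found.
find_global_dbname() {
    listener_ora=$1
    grep -n -i 'GLOBAL_DBNAME' "$listener_ora" | grep -v '^[0-9]*:[[:space:]]*#'
}
```

Called as find_global_dbname "$ORACLE_HOME/network/admin/listener.ora" (assuming the default location), it prints each offending line with its line number, and its exit status can be used in an if statement to decide whether the file needs attention.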

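Conclusions:
1. When the listener is still running but the database instance has been shut down, failover works only for services that are dynamically registered with the listener.
2. GLOBAL_DBNAME interferes with failover.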
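
One way to see which registrations a listener holds is to look at the Services Summary printed by lsnrctl status. The helper below is a sketch with a hypothetical name. It relies on the assumption that an instance shown with status UNKNOWN comes from a static SID_LIST entry, while any other status (such as READY) means the instance registered itself dynamically.

```shell
# Read "lsnrctl status" (or "lsnrctl services") output on stdin and print one
# line per service/instance pair: <service> <instance> <static|dynamic>.
# Assumption: status UNKNOWN marks a static SID_LIST entry; anything else
# (for example READY) is treated as a dynamic registration.
registration_types() {
    awk -F'"' '
        /^Service "/ { service = $2 }
        /Instance ".*", status / {
            kind = ($3 ~ /status UNKNOWN/) ? "static" : "dynamic"
            print service, $2, kind
        }
    '
}
```

Running lsnrctl status LISTENER_PROD2 | registration_types on the server (the listener name is inferred from the SID_LIST entry above) gives a compact list; any line ending in "static" points at an entry worth reviewing.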

How to identify the cluster name

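When configuring Database Control for a RAC environment, you are asked for the cluster name. We know that a default Oracle installation names the cluster crs, but how do you confirm what the cluster is actually called?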

[oracle@dbserver1 oracle10g]$ emca -config dbcontrol db -cluster -EM_NODE dbserver1 -EM_SID_LIST intertol2,intertol3,intertol4

STARTED EMCA at Jul 27, 2009 4:20:25 PM
EM Configuration Assistant, Version 10.2.0.1.0 Production
Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Enter the following information:
Database unique name: intertol
Database Control is already configured for the database intertol
You have chosen to configure Database Control for managing the database intertol
This will remove the existing configuration and the default settings and perform a fresh configuration
Do you wish to continue? [yes(Y)/no(N)]: Y
Listener port number: 1521
Cluster name:

Oracle in fact provides a utility, cemutlo, for retrieving the cluster name.

[oracle@dbserver1 ~]$ $ORA_CRS_HOME/bin/cemutlo -h
Usage: /u01/app/oracle/product/crs/bin/cemutlo.bin [-n] [-w]
        where:
        -n prints the cluster name
        -w prints the clusterware version in the following format:
                 <major_version>:<minor_version>:<vendor_info>

[oracle@dbserver1 ~]$ cemutlo -n
crs
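
In a script, the same call can be wrapped so that the name is captured once and an empty answer is treated as an error. This is a minimal sketch: get_cluster_name and the CEMUTLO override variable are hypothetical names introduced here, and the default path assumes ORA_CRS_HOME is set as in the examples above.

```shell
# Print the cluster name reported by "cemutlo -n".
# CEMUTLO is a hypothetical override for the command to run (handy for testing);
# by default the binary under $ORA_CRS_HOME/bin is used.
get_cluster_name() {
    cemutlo_bin=${CEMUTLO:-$ORA_CRS_HOME/bin/cemutlo}
    name=$("$cemutlo_bin" -n) || return 1
    # Treat empty output as a failure instead of guessing a default such as "crs".
    [ -n "$name" ] || return 1
    printf '%s\n' "$name"
}
```

A caller could then write cluster=$(get_cluster_name) || exit 1 and type the value of $cluster at the "Cluster name:" prompt of emca.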

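The cluster name can also be obtained indirectly with the ocrdump or ocrconfig commands; for details, see the article 《怎么查安装CRS时设置的cluster name》 (How to find the cluster name set when CRS was installed).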