Oracle 10.2.0.3 RAC Reboot due to system time change

While testing Oracle 10.2.0.3 RAC, I found that if the system time on a node is changed by more than 1.5 seconds, that node is automatically rebooted.

What a brutal way to handle it...

For the detailed mechanism, see the Internal Only Metalink Note 308051.1:

The OPROCD executable sets a signal handler for the SIGALRM handler and sets the interval timer based on the to-millisec parameter provided. The alarm handler gets the current time and checks it against the time that the alarm handler was last entered. If the difference exceeds (to-millisec + margin-millisec), it will fail; the production version will cause a node reboot.
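The check described in the note can be sketched roughly as follows. This is only an illustration of the comparison, not OPROCD's actual code: the function name `exceeds_margin` is hypothetical, and the values 1000 ms and 500 ms are assumptions chosen to match the 1.5-second threshold observed in the test.

```python
def exceeds_margin(last_entry_ms, now_ms, to_millisec, margin_millisec):
    """Return True if the time between two alarm-handler entries is too long.

    Mirrors the rule quoted above: fail when the difference exceeds
    (to-millisec + margin-millisec).
    """
    elapsed = now_ms - last_entry_ms
    return elapsed > to_millisec + margin_millisec


# Assumed settings for illustration: 1000 ms interval, 500 ms margin.
TO_MILLISEC = 1000
MARGIN_MILLISEC = 500

# Normal case: the handler is entered again after about one interval.
print(exceeds_margin(0, 1000, TO_MILLISEC, MARGIN_MILLISEC))  # False

# The wall clock was moved forward by 2 seconds between two entries,
# so the handler sees 3000 ms elapsed instead of 1000 ms.
print(exceeds_margin(0, 3000, TO_MILLISEC, MARGIN_MILLISEC))  # True
```

Because the comparison uses the current time as read by the handler, a manual clock change looks the same to it as a process that was not scheduled for too long.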

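I tried changing the OPROCD configuration in /etc/init.cssd, setting DISABLE_OPROCD to TRUE, and then rebooted the system. The oprocd process no longer appeared in the process list, yet after I changed the system time the machine was still rebooted.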

The document also mentions that if OPROCD is started in non fatal mode, it only writes a log entry instead of rebooting the machine. Note:265769.1 describes how to switch to non fatal mode, but I did not try it.

In fatal mode, OPROCD will reboot the node if it detects excessive wait. In Non Fatal mode, it will write an error message out to the file .oprocd.log in one of the following directories.

What finally worked was disabling the cssd process entirely, which prevents the machine from rebooting when the system time is changed.

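Oracle 10g CRS really is rather heavy-handed. In my previous test, unplugging the network cable from the Private IP NIC caused the operating system to reboot, and this time even changing the system time triggers a reboot. Does it think these machines are running Windows? Rebooting a UNIX server is a big deal, yet CRS treats it as casually as having a meal and reboots every now and then.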

The following material describes the conditions under which the three Oracle CRS daemons will reboot the machine.

Oracle clusterware has the following three daemons which may be responsible for panicking the node. It is possible that some other external entity may have rebooted the node. In the context of this discussion, we will assume that the reboot/panic was done by an Oracle clusterware daemon.

* Oprocd – Cluster fencing module
* Cssd – Cluster synchronization module which manages node membership
* Oclsomon – Cssd monitor which will monitor for cssd hangs

OPROCD: This is a daemon that only gets activated when there is no vendor clusterware present on the OS. This daemon is also not activated to run on Windows/Linux. This daemon runs a tight loop and if it is not scheduled for 1.5 seconds, will reboot the node.

CSSD: This daemon pings the other members of the cluster over the private network and Voting disk. If this does not get a response for Misscount seconds and Disktimeout seconds respectively, it will reboot the node.

Oclsomon: This daemon monitors the CSSD to ensure that CSSD is scheduled by the OS; if it detects any problems it will reboot the node.
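To make the CSSD rule concrete, here is a minimal sketch of the decision as I read the description above: the network heartbeat is checked against Misscount and the voting disk against Disktimeout, and either timeout alone is treated as enough to reboot. The function name and the numeric values are hypothetical placeholders, not taken from Oracle documentation.

```python
def cssd_should_reboot(network_silence_s, disk_silence_s,
                       misscount_s, disktimeout_s):
    """Return True if either heartbeat path has been silent for too long.

    network_silence_s: seconds without a response over the private network
    disk_silence_s:    seconds without a response from the voting disk
    """
    network_timed_out = network_silence_s >= misscount_s
    disk_timed_out = disk_silence_s >= disktimeout_s
    return network_timed_out or disk_timed_out


# Placeholder thresholds for illustration only.
MISSCOUNT_S = 60
DISKTIMEOUT_S = 200

print(cssd_should_reboot(10, 5, MISSCOUNT_S, DISKTIMEOUT_S))    # False
print(cssd_should_reboot(61, 5, MISSCOUNT_S, DISKTIMEOUT_S))    # True
print(cssd_should_reboot(10, 250, MISSCOUNT_S, DISKTIMEOUT_S))  # True
```

The second call corresponds to the earlier test of unplugging the Private IP network cable: the disk path is still healthy, but the network path alone exceeds its threshold.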

For more discussion, see itpub note 747833.

As for why a reboot is needed, Wing Hong gave a clear and accessible explanation of fencing in the thread above, excerpted below.

Fencing is a very important concept in clusters. I'm not sure I can explain it clearly in a few words. Anyhow, forget about RAC first and just look at the generic shared-disk cluster issue.

Let's say the cluster only has two nodes. Both share the disk, i.e. they both have the right to write to the same disk.

So this shared access must be coordinated.

However, suppose this coordination is lost for some reason. It can be any reason: a communication process hanging, someone unplugging the network cable between them, and so on.

At this moment both nodes are functioning normally except for the lost coordination between them.

Then the cluster has two problems to solve:
1. Who will be a member of the new incarnation of the cluster? In this case, this is the split-brain issue.

2. Once we decide who will remain in the cluster and who will go, how can we prevent the leaving node from doing something harmful to the cluster? This is the fencing issue. Bear in mind that the leaving node is working perfectly normally except for the coordination part, so it can still write to the shared disk.

There are a couple of approaches to fencing:

1. Server fencing, in cluster terms Shoot The Other Node In The Head (STONITH), i.e. the good node kills the leaving node.

(The way Oracle does it is: the node reboots itself. Once it goes through the reboot process, it simply tries to rejoin the cluster, and the cluster can then decide whether to accept it or not.)

2. I/O fencing. Rather than trying to kill the node, it works on the disk side to block the leaving node's access to the disk. Sun and Veritas have solutions of this kind.

HTH. Also try googling terms like “fencing”, “split brain”, and “amnesia”.
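The two problems in the excerpt (deciding who stays, then fencing whoever leaves) can be sketched in a few lines. This is a toy illustration, not how any particular clusterware is implemented: the function `resolve_split` is hypothetical, and the survivor rule used here (largest group wins, ties go to the group containing the lowest node number) is an assumption chosen to make the example deterministic.

```python
def resolve_split(subclusters):
    """Split-brain resolution followed by a fencing list.

    subclusters: list of non-empty sets of node numbers; nodes in the same
    set can still talk to each other, nodes in different sets cannot.
    Returns (surviving_nodes, nodes_to_fence), both as sorted lists.
    """
    # Assumed rule: prefer the largest group, break ties by lowest node number.
    survivor = max(subclusters, key=lambda group: (len(group), -min(group)))
    # Every node outside the surviving group must be fenced, because it is
    # otherwise healthy and could still write to the shared disk.
    to_fence = sorted(
        node
        for group in subclusters
        if group is not survivor
        for node in group
    )
    return sorted(survivor), to_fence


# Two-node cluster with the interconnect cable unplugged.
print(resolve_split([{1}, {2}]))        # ([1], [2])

# Three-node cluster where node 1 is cut off from nodes 2 and 3.
print(resolve_split([{1}, {2, 3}]))     # ([2, 3], [1])
```

With self-fencing as described for Oracle, each node in the second list would reboot itself rather than being shot by a surviving node.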

7 Comments

  1. 木匠 says:

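    Very valuable. It looks like upgrades still call for caution.
    We are currently running: 4 nodes 10.1.0.4 RAC on Linux RH AS3.0

    I plan to wait until 11* is released and upgrade directly to Oracle 11* early next year.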

  2. gototop says:

    This one is pretty brutal. Who would still dare to upgrade like this?

  3. kamus says:

    木匠 on April 2, 2007 at 12:54 pm said:

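    Very valuable. It looks like upgrades still call for caution.
    We are currently running: 4 nodes 10.1.0.4 RAC on Linux RH AS3.0

    I plan to wait until 11* is released and upgrade directly to Oracle 11* early next year.

    I have not tested the 10.1 release. OPROCD was introduced in 10.2, but if you unplug the network cable from the private IP NIC, CSSD rebooting the node should behave the same way in 10.1.
    Heh, in short, if there is no major problem, don't upgrade the database.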

  4. Orion says:

    Mm-hmm.

    Long time no see. I'm back at work again.
