Irrelevant thoughts of an oracle DBA

11 August 2011

RAC investigations part I

Filed under: rac — Freek D'Hooge @ 0:06
Tags: , ,

Environment description

2 node rac with Oracle 11.2.0.2.2
Oracle Linux 5.6 with the Unbreakable Enterprise Kernel (2.6.32-100.36.1.el5uek)

Conducted tests

test_srv is a service which has both the instance running on node1 and node2 as preferred instances.
On node1 the service was manually stopped.

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
 NAME=ora.mydb.test_srv.svc
 TYPE=ora.service.type
 CARDINALITY_ID=1
 DEGREE_ID=1
 TARGET=OFFLINE
 STATE=OFFLINE
 CARDINALITY_ID=2
 DEGREE_ID=1
 TARGET=ONLINE
 STATE=ONLINE on node2

Issue a “shutdown abort” on the instance running on node2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

start the instance again

[grid@node1 ~]$ srvctl start instance -d mydb -i mydb2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

The service is now running on both instances, although before the crash the service was set offline on node1.

Same test, but this time the service is stopped on all instances

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

[grid@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

This time both services stay offline.
But what happens if we start the instance again:

[grid@node1 ~]$ srvctl start instance -d mydb -i mydb2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

Now the service has started again on the restarted instance.
Explanation for this is that the service was configured to come up automatically with the instance, which explains why the service is started on the restarted node.
For the failover this seems to me as expected behaviour as it is the same as what would happen with a preferred / available configuration.

For the third test, we will reconfigure the service to have a preferred and an available node

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv
[grid@node1 ~]$ srvctl modify service -d mydb -s test_srv -n -i mydb2 -a mydb1

[grid@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: mydb2
Available instances: mydb1

[grid@node1 ~]$ srvctl start service -d mydb -s test_srv -i mydb2
[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

The service is running on its preferred instance, which we will now crash

[grid@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=OFFLINE

eumm, I actually expected a relocation here…
As I have other services which have a preferred / available configuration, I know this service should failover.

[grid@node1 ~]$ srvctl status service -d mydb -s test_srv
Service test_srv is not running.

[grid@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: mydb2
Available instances: mydb1

[grid@node1 ~]$ srvctl status database -d mydb
Instance mydb1 is running on node node1
Instance mydb2 is not running on node node2

I could find no clues in the different cluster log files as of why the relocation did not occur.
More testing will be necessary.
Also note that the output of the crsctl status resource does not contain information about on which node or instance the service is expected to be online.
But by using the -v flag we can see the last_server attribute:

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -v
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
LAST_SERVER=node2
STATE=OFFLINE
TARGET=ONLINE
CARDINALITY_ID=1
CREATION_SEED=137
RESTART_COUNT=0
FAILURE_COUNT=0
FAILURE_HISTORY=
ID=ora.mydb.test_srv.svc 1 1
INCARNATION=5
LAST_RESTART=08/10/2011 16:32:53
LAST_STATE_CHANGE=08/10/2011 16:34:03
STATE_DETAILS=
INTERNAL_STATE=STABLE

After starting the instance again, the service was back available

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

A second run of this test gave the same result.
Manually relocating the service did work though:

[grid@node1 ~]$ srvctl relocate service -d mydb -s test_srv -i mydb1 -t mydb2
[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

What if I removed the service and recreated it directly as preferred / available:

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv

[grid@node1 ~]$ srvctl remove service -d mydb -s test_srv

[grid@node1 ~]$ srvctl add service -d mydb -s test_srv -r mydb2 -a mydb1 -y AUTOMATIC -P BASIC -e SELECT
PRCD-1026 : Failed to create service test_srv for database mydb
PRKH-1014 : Current user grid is not the same as oracle owner orauser of oracle home /opt/oracle/orauser/product/11.2.0.2/dbhome_1.

would it?
Let us test it:

[grid@node1 ~]$ su - orauser
Password:

[orauser@node1 ~]$ srvctl add service -d mydb -s test_srv -r mydb1,mydb2 -y AUTOMATIC -P BASIC -e SELECT

[orauser@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: mydb1,mydb2
Available instances:

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

now modify it:

[orauser@node1 ~]$ srvctl modify service -d mydb -s test_srv -n -i mydb2 -a mydb1

[orauser@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: mydb2
Available instances: mydb1

[orauser@node1 ~]$ srvctl start service -d mydb -s test_srv -i mydb2

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

[orauser@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=OFFLINE

Nope, the user modifying the service has nothing to do with it.
I also tested the scenario where I directly created a preferred / available service, but in this case the failover also did not work.
But after some more testing I found the reason.
During the first test I had shutdown the instance via sqlplus, not via srvctl. And the other services I talked about had failed over during this test (I never did a failback).
After doing the shutdown abort again via sqlplus, the failover worked again.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

[orauser@node2 ~]$ export ORACLE_SID=mydb2
[orauser@node2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Wed Aug 10 18:28:29 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown abort
ORACLE instance shut down.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

SQL> startup
ORACLE instance started.

Total System Global Area 3140026368 bytes
Fixed Size                  2230600 bytes
Variable Size            1526728376 bytes
Database Buffers         1593835520 bytes
Redo Buffers               17231872 bytes
Database mounted.
Database opened.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

as expected, starting the instance again did not trigger a failback of the service.

Question now is, if the failover not happening when issuing the shutdown via srvctl is expected behaviour or not.
For this, one probably would have to open a service case, answer a couple of question not important for this issue, escalate and still have to wait for several months.
Do I sound bitter now?

Conclusion:

  • When restarting an instance, an offline service that has this instance listed as a preferred node will be started (management policy = automatic).
  • When an instance on which a service was running fails, the service is started on at least one other preferred instance.
  • The service will remain running on this instance, even when the original instance is started again (in which case the service will run on both instances).
  • When a service has a preferred / available configuration, the service will failover to the available instance, but not failback afterwards.
  • Failover in a preferred / available configuration does not happen when the instance was stopped via “srvctl shutdown <db_unique_name> – o abort”

Questions remaining:

  • What if there where more then 2 nodes, with a service that has all three or more nodes listed as preferred, but currently only running on one node.
    If the instance on which that service is running fails, would the service then be started on all preferred nodes or on only 1 of them?
  • What if, in the above case, the service was running on 2 nodes.
    Would it still be started on other nodes?
  • And what if one of the nodes was configured as available and not as preferred? Would the service on the preferred node still be started or the one on the available instance or both?
  • And last but not least, is the srcvtl shutdown behaviour a bug or not?

It would be neat if someone has access to a 3 or more node rac on which they can run the above tests and send me the results  :-)

Update 13/08/2011:
Amar Lettat, one of my colleagues at Uptime, has pointed me to MOS note 1324574.1 – “11gR2 RAC Service Not Failing Over To Other Node When Instance Is Shut Down”.
This note clearly points out that the service not failing over when shutting down with srvctl is expected behaviour in 11.2.
It also points to the Oracle documentation, where this behaviour is also documented.
So not a bug, only a well documented change in behaviour.
(more…)

Advertisements

3 November 2009

Two Oracle RAC bugs on the wall, two Oracle bugs. Take one down …

Filed under: infrastructure,linux,rac — Freek D'Hooge @ 2:50

Ok, not as good as beer and they can give you a nasty headache, so you have been warned   ;)

Reason for this post are 2 bugs I discovered with Oracle RAC, both resulting in a single point of failure.
The platform on which I’m working is Oracle 10gR2 (10.2.0.4) on OEL 4.7.

The first one is when you are using NFS volumes to host the ocr and ocrmirror volumes.
Normally, when the ocr volume gets corrupted or unavailable , oracle should failover to the ocrmirror volume. The exact response is documented in the RAC FAQ on metalink (note 220970.1) and is currently discussed in a series of blog posts by Geert de Paep.
With NFS, however, you must use both the nointr and hard mount options (OS is OEL 4.7) and as a result the process that is trying to read or write an unavailable ocr volume will wait undefinitly on a response. This is not only happening when using commands such as crs_stat or srvctl, but also when an instance or service failover is initiated.
Oracle support however, does not exactly see it this way and has first blamed the os, then the storage and finally stated that there is no failover foreseen between the ocr and ocrmirror volumes…
It took some escalating and a change in support engineer to get some progress in that SR (mind you that after more then 4 months, they still have not acknowledged it as a bug).

The second problem is that, when you made the public interface redundant with os bonding, the racgvip script does not detect when all interfaces in the bond are disconnected.
This is caused because the script, unlike older version, is using mii-tool to check the availability of the public interface. Only when mii-tool states that the link is down, a ping test is done to the public gateway. If that test fails as well, then the vip fails over and the rac instances on that node are placed in a blocked state.
The problem however with mii-tool is that it plays not very well with bonds, and always reports the bond status as being up (in fact, regardless of the link state, mii-tool is always reporting a network bond as “bond0: 10 Mbit, half duplex, link ok”). So, the racgvip script always thinks that the public interface is up.
As mii-tool is an os utility, I first opened a case on the Oracle Enterprise Linux support, to check with them if its behavior was normal (I already confirmed that by googeling, but Oracle support does not seem to accept results from google :)   ). And after running multiple tests with different bond options, they finally stated that mii-tool was indeed obsolete and should not be used to verify a bond status (yes, I know. Its own man page already states that mii-tool is obsolete).
So next, I opened a SR on part of the clusterware and oracle development promptly stated that it was not a clusterware bug but an os issue, pointing the finger to mii-tool and asking where it was written that mii-tool is obsolete… . After making them aware of the statement made by their OEL colleagues and the mii-tool man page, they have seemed to have accepted it as a bug.
I have checked the 11gR2 version of the racgvip script, and it seems to suffer the same problem.

ps) Note 365605.1 – “Oracle Bug Status Codes, Descriptions and Usage” is, although it seems incomplete, very usefull to understand the different status codes

8 March 2008

bugs introduced via patches

Filed under: infrastructure,rac — Freek D'Hooge @ 13:58

“Sometimes” you have to apply patches to your oracle system to fix bugs. This is however not without risk, as patches are just code and thus can contain bugs themself. The same is true for the scripts used to apply the patches, as I found out first hand.
Recently I needed to perform a drop node / add node procedure on an oracle 10gR2 rac (on solaris 10).
During this procedure I ran into the following problems:

When using ocrconfig to check the backups of the cluster registry, I got the following error:

ld.so.1: ocrconfig: fatal: libocr10.so: open failed: No such file or directory
Killed

After some searching on metalink I found the following bug: 6342492 – ld.so.1: ocrcheck: fatal: libocr10.so: open failed after patch 6000740.
Patch 6000740 is bundle patch MLR7, and the problem with this patch is that the path to make has been hardcoded to a value which is not correct on solaris 10. The solution is to create a symbolic link from /usr/ccs/bin/make to /usr/bin/make, and perform a relink all.

Yep, a hardcoded path. The same problem which requires you to create symbolic links for scp and ssh into the /usr/local/bin directory on solaris 10 when installing a rac system. You would think they learn from their mistakes… .

The second problem I ran into, appeared during the addnode procedure. During the copy of the cluster home to the new node, oracle complained that not all files could be copied.
When I checked the logfile I found the following messages:

WARNING: Error while copying directory /opt/oracle/crs with exclude file list '/tmp/OraInstall2008-02-09_04-20-28PM/installExcludeFile.lst' to no
des 'eocpc-rc01'. [PRKC-1073 : Failed to transfer directory "/opt/oracle/crs" to any of the given nodes "eocpc-rc01 ".
Error on node eocpc-rc01:tar: ./lib/prod/lib: Permission denied
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory
tar: ./lib/prod/lib: Permission denied
tar: cannot open ./lib/prod/lib/v9 No such file or directory]

All problem files where located in the same directory, and this directory was lacking write privileges for the oracle user.

[oracle@eocpc-rc02:~]$ find /opt/oracle/crs/lib/prod -exec ls -ald {} \;
dr-xrwx---   3 oracle   oinstall     512 May  5  2007 /opt/oracle/crs/lib/prod
drwxrwx---   3 oracle   oinstall     512 May  5  2007 /opt/oracle/crs/lib/prod/lib
drwxrwx---   2 oracle   oinstall     512 May  5  2007 /opt/oracle/crs/lib/prod/lib/v9
-rw-rw----   1 oracle   oinstall    2824 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/CCrti.o
-rw-rw----   1 oracle   oinstall    1232 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/CCrtn.o
-rw-rw----   1 oracle   oinstall    3064 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crt1.o
-rw-rw----   1 oracle   oinstall     776 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crti.o
-rw-rw----   1 oracle   oinstall     712 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crtn.o
-rw-rw----   1 oracle   oinstall   14736 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/libCCexcept.so.1
-rw-rw----   1 oracle   oinstall   19056 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/libldstab_ws.so

A quick check on a new, net yet patched installation confirmed that this was not a normal situation:

-[oracle@eocsp-rc51:~]$ find /opt/oracle/crs/lib/prod -exec ls -ald {} \;
drwxrwx---   3 oracle   oinstall     512 Feb  7 23:47 /opt/oracle/crs/lib/prod
drwxrwx---   3 oracle   oinstall     512 Feb  7 23:47 /opt/oracle/crs/lib/prod/lib
drwxrwx---   2 oracle   oinstall     512 Feb  7 23:48 /opt/oracle/crs/lib/prod/lib/v9
-rw-rw----   1 oracle   oinstall    2824 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/CCrti.o
-rw-rw----   1 oracle   oinstall    1232 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/CCrtn.o
-rw-rw----   1 oracle   oinstall    3064 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crt1.o
-rw-rw----   1 oracle   oinstall     776 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crti.o
-rw-rw----   1 oracle   oinstall     712 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/crtn.o
-rw-rw----   1 oracle   oinstall   14736 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/libCCexcept.so.1
-rw-rw----   1 oracle   oinstall   19056 Mar 13  2003 /opt/oracle/crs/lib/prod/lib/v9/libldstab_ws.so

Making the lib/prod directory in the crs home writable again for the oinstall group did indeed solve the problem.
Sofar I have found 2 different patches which introduces this problem:

  • 6000740 MLR7
  • 5749953 ons sigbus error after install patchset 10.2.0.3 for crs

A quick search on metalink did not found this problem listed as a known bug. I will file one next week. (If I find the time)

Interesting to note is that these 2 problems are also not likely to be discovered by applying the patch on a test system.
I don’t think there are many dba’s who are performing an addnode procedure when testing a new patchset…

Blog at WordPress.com.