Irrelevant thoughts of an oracle DBA

3 November 2009

Two Oracle RAC bugs on the wall, two Oracle bugs. Take one down …

Filed under: infrastructure,linux,rac — dhoogfr @ 2:50

Ok, not as good as beer and they can give you a nasty headache, so you have been warned   ;)

The reason for this post is two bugs I discovered in Oracle RAC, both of which result in a single point of failure.
The platform on which I’m working is Oracle 10gR2 (10.2.0.4) on OEL 4.7.

The first one occurs when you use NFS volumes to host the ocr and ocrmirror volumes.
Normally, when the ocr volume becomes corrupted or unavailable, oracle should fail over to the ocrmirror volume. The exact response is documented in the RAC FAQ on metalink (note 220970.1) and is currently being discussed in a series of blog posts by Geert de Paep.
With NFS, however, you must use both the nointr and hard mount options (OS is OEL 4.7), and as a result the process that is trying to read or write an unavailable ocr volume will wait indefinitely for a response. This happens not only with commands such as crs_stat or srvctl, but also when an instance or service failover is initiated.
Oracle support, however, does not exactly see it this way: they first blamed the os, then the storage, and finally stated that no failover is foreseen between the ocr and ocrmirror volumes…
It took some escalating and a change of support engineer to get some progress in that SR (mind you that after more than 4 months, they still have not acknowledged it as a bug).

The second problem is that, when you have made the public interface redundant with os bonding, the racgvip script does not detect that all interfaces in the bond are disconnected.
The cause is that the script, unlike older versions, uses mii-tool to check the availability of the public interface. Only when mii-tool states that the link is down is a ping test done against the public gateway. If that test fails as well, the vip fails over and the rac instances on that node are placed in a blocked state.
The problem with mii-tool, however, is that it does not play well with bonds and always reports the bond status as being up (in fact, regardless of the link state, mii-tool always reports a network bond as “bond0: 10 Mbit, half duplex, link ok”). So the racgvip script always thinks that the public interface is up.
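To make the failure mode concrete, here is a minimal Python sketch of the decision flow as described above. It is not the racgvip code itself (that is a shell script); the function and parameter names are made up for illustration, and the two checks are passed in as plain callables so the logic can be followed without a cluster.

```python
from typing import Callable, Optional


def public_interface_is_up(
    link_check: Optional[Callable[[], bool]],
    gateway_ping: Callable[[], bool],
) -> bool:
    """Sketch of the check order described in the post (hypothetical names).

    link_check   -- stand-in for the mii-tool test; None means no link tool is available
    gateway_ping -- stand-in for the ping test against the public gateway
    """
    if link_check is not None and link_check():
        # Link reported as ok: the ping test is never reached.
        return True
    # Link reported as down, or no link tool available: fall back to the ping test.
    return gateway_ping()


def bond_link_check() -> bool:
    # On a bond, mii-tool answers "link ok" regardless of the real link state.
    return True


def gateway_unreachable() -> bool:
    # With every cable in the bond pulled, the gateway cannot be pinged.
    return False


# All cables pulled, but the link check still says "ok": interface is considered up.
print(public_interface_is_up(bond_link_check, gateway_unreachable))  # True
# Without a link tool, the ping test decides and the outage is noticed.
print(public_interface_is_up(None, gateway_unreachable))  # False
```

The first call shows why the vip never fails over on a bond: the ping test, which would have caught the outage, is only reached when the link check reports a problem.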
As mii-tool is an os utility, I first opened a case with Oracle Enterprise Linux support to check whether this behavior was normal (I had already confirmed that by googling, but Oracle support does not seem to accept results from google :) ). After running multiple tests with different bond options, they finally stated that mii-tool was indeed obsolete and should not be used to verify a bond status (yes, I know: its own man page already states that mii-tool is obsolete).
So next I opened an SR with the clusterware group, and oracle development promptly stated that it was not a clusterware bug but an os issue, pointing the finger at mii-tool and asking where it was written that mii-tool is obsolete… After I made them aware of the statement from their OEL colleagues and of the mii-tool man page, they seem to have accepted it as a bug.
I have checked the 11gR2 version of the racgvip script, and it seems to suffer from the same problem.
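For checking a bond by hand, one alternative to mii-tool is the status file the Linux bonding driver exposes under /proc/net/bonding/. The Python sketch below parses text in that format and reports the MII status per slave. It is only an illustration and not something racgvip does: the sample text is a made-up excerpt, the function names are hypothetical, and it assumes the bond is configured with link monitoring (for example miimon), since otherwise the driver does not track link failures.

```python
from typing import Dict, Optional


def slave_mii_status(bonding_text: str) -> Dict[str, str]:
    """Map each slave interface name to its reported MII status."""
    statuses: Dict[str, str] = {}
    current: Optional[str] = None
    for line in bonding_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # Only statuses inside a slave section are recorded;
            # the bond-level status at the top of the file is skipped.
            statuses[current] = value
    return statuses


def bond_has_link(bonding_text: str) -> bool:
    """True if at least one slave reports its link as up."""
    return any(status == "up" for status in slave_mii_status(bonding_text).values())


# Made-up excerpt in the format of /proc/net/bonding/bond0, both cables pulled.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: None
MII Status: down
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: down
Link Failure Count: 1

Slave Interface: eth1
MII Status: down
Link Failure Count: 1
"""

print(slave_mii_status(SAMPLE))  # {'eth0': 'down', 'eth1': 'down'}
print(bond_has_link(SAMPLE))     # False
```

On a real system the text would come from reading /proc/net/bonding/bond0 (the bond name is an assumption here) instead of the SAMPLE string.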

ps) Note 365605.1, “Oracle Bug Status Codes, Descriptions and Usage”, although it seems incomplete, is very useful for understanding the different status codes

4 Comments »

  1. Hi,

    The second one really interested me because I’m also using bonding on several installations. Although I have tested it, I never disconnected both cables, so now I realize that this could be a problem.

    Thanks!

    Comment by Svetoslav Gyurov — 3 November 2009 @ 13:07 | Reply

    • Svetoslav,

      You can easily work around the public bond problem by editing the racgvip script and setting the MIITOOL variable to a non-existent location.
      This way, oracle will fall back to pinging the default gateway to check the link status.
      However, I think the crs will have to be restarted before the change takes effect.

      regards,

      Freek

      Comment by dhoogfr — 3 November 2009 @ 13:13 | Reply

      • Hey Freek,

        Thanks for the reply and advice. I’ll fix it and keep this problem in mind. Actually, I found this just as I’m preparing a presentation about how Oracle works on HP-UX and Linux, showing some of the differences.

        Regards,
        Sve

        Comment by Svetoslav Gyurov — 3 November 2009 @ 14:03

  2. I hit that NFS{hard,nointr} issue about a year ago, when I was trying to build Extended RAC in my home lab. I did not have Metalink access at the time, but what appears to be a workable option is DirectNFS: it is implemented by simple send()/recv() operations from Oracle userspace, so in theory, provided that timeout code is implemented in DNFS, it should work. However, I was not able to get DNFS working with a Linux/Solaris/OpenFiler(Linux) NFS server at that time. From what I remember, sniffing & analyzing revealed some kind of incompatibility between the DNFS client and non-NetApp NFS servers (some strange(?) NFS-level operation on a directory not being handled correctly by the Linux NFS server; same with Solaris).

    …What finally worked for me was iSCSI (I was pretty time-constrained) :)

    Comment by Jakub Wartak — 11 November 2009 @ 0:51 | Reply

