Irrelevant thoughts of an oracle DBA

2 November 2011

datapump gives wrong privileges

Filed under: bugs,upgrades / migrations — Freek D'Hooge @ 20:58
Tags: datapump, exp, expdp, imp, impdp

When performing an export / import using datapump or the legacy exp / imp utilities, a bug can cause wrong privileges to be granted to users.
The bug seems to occur when a privilege is granted to one user with grant option and to a second user (or role) without grant option, and the object is a schema procedural object (job, program, schedule, …).

I could reproduce this issue on 10.2.0.4, 11.2.0.2 and 11.2.0.3, so chances are that all versions since 10.2 (or even 10.1) are affected.

SQL> @reproduce.sql
SQL> set feedback on
SQL> set linesize 120
SQL> set pages 9999
SQL> set trimspool on
SQL>
SQL> select
  2    banner
  3  from
  4    v$version
  5  ;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

5 rows selected.

SQL>
SQL> drop user fdh cascade;

User dropped.

SQL> drop user grant_test_usr;
drop user grant_test_usr
          *
ERROR at line 1:
ORA-01918: user 'GRANT_TEST_USR' does not exist

SQL> drop user grant_test_usr2;
drop user grant_test_usr2
          *
ERROR at line 1:
ORA-01918: user 'GRANT_TEST_USR2' does not exist

SQL> drop role grant_test_role;
drop role grant_test_role
          *
ERROR at line 1:
ORA-01919: role 'GRANT_TEST_ROLE' does not exist

SQL>
SQL> select
  2    directory_path
  3  from
  4    dba_directories
  5  where
  6    directory_name = 'DATA_PUMP_DIR';

DIRECTORY_PATH
------------------------------------------------------------------------------------------------------------------------
/u01/oracle/oracle1/admin/gunnar/dpdump/

1 row selected.

SQL>
SQL> create user fdh
  2  identified by blabla
  3  default tablespace users
  4  quota unlimited on users
  5  /

User created.

SQL>
SQL> create user grant_test_usr
  2  identified by blabla
  3  default tablespace users
  4  quota unlimited on users
  5  /

User created.

SQL>
SQL> create role grant_test_role
  2  /

Role created.

SQL>
SQL> create user grant_test_usr2
  2  identified by blabla
  3  default tablespace users
  4  quota unlimited on users
  5  /

User created.

SQL>
SQL> BEGIN
  2    dbms_scheduler.create_job
  3      ( job_name          =>  'FDH.TEST',
  4        job_type          =>  'PLSQL_BLOCK',
  5        job_action        =>  'begin null; end;',
  6        repeat_interval   =>  'FREQ=DAILY; BYHOUR=5',
  7        end_date          =>  NULL,
  8        enabled           =>  FALSE,
  9        auto_drop         =>  FALSE
 10      );
 11
 12    commit;
 13  END;
 14  /

PL/SQL procedure successfully completed.

SQL>
SQL> create procedure
  2    fdh.procedure_test
  3  is
  4  begin
  5    null;
  6  end;
  7  /

Procedure created.

SQL>
SQL> grant alter on fdh.test to grant_test_usr with grant option;

Grant succeeded.

SQL> grant alter on fdh.test to grant_test_role;

Grant succeeded.

SQL> grant alter on fdh.test to grant_test_usr2;

Grant succeeded.

SQL>
SQL> grant debug on fdh.procedure_test to grant_test_usr with grant option;

Grant succeeded.

SQL> grant debug on fdh.procedure_test to grant_test_role;

Grant succeeded.

SQL> grant debug on fdh.procedure_test to grant_test_usr2;

Grant succeeded.

SQL>
SQL> select
  2    table_name, grantee, privilege, grantable
  3  from
  4    dba_tab_privs
  5  where
  6    owner = 'FDH'
  7    and table_name in
  8    ( 'TEST', 'PROCEDURE_TEST'
  9    )
 10  order by
 11    table_name, grantee, privilege, grantable;

TABLE_NAME                     GRANTEE                        PRIVILEGE                                GRA
------------------------------ ------------------------------ ---------------------------------------- ---
PROCEDURE_TEST                 GRANT_TEST_ROLE                DEBUG                                    NO
PROCEDURE_TEST                 GRANT_TEST_USR                 DEBUG                                    YES
PROCEDURE_TEST                 GRANT_TEST_USR2                DEBUG                                    NO
TEST                           GRANT_TEST_ROLE                ALTER                                    NO
TEST                           GRANT_TEST_USR                 ALTER                                    YES
TEST                           GRANT_TEST_USR2                ALTER                                    NO

6 rows selected.

SQL>
SQL> /* execute the following part on the os
SQL>    in the data_pump_dir directory
SQL>
SQL> expdp system schemas='FDH' dumpfile=grant_test.dmp
SQL> impdp system schemas='FDH' dumpfile=grant_test.dmp sqlfile=grant_test_dump.txt
SQL>
SQL> grep GRANT grant_test_dump.txt
SQL> */
SQL>
SQL>
SQL> !
[oracle1@elin ~]$ cd /u01/oracle/oracle1/admin/gunnar/dpdump/
[oracle1@elin dpdump]$ expdp system schemas='FDH' dumpfile=grant_test.dmp

Export: Release 11.2.0.2.0 - Production on Wed Nov 2 19:49:15 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Password:

UDE-28002: operation generated ORACLE error 28002
ORA-28002: the password will expire within 6 days

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_SCHEMA_01":  system/******** schemas=FDH dumpfile=grant_test.dmp
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 0 KB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/POST_SCHEMA/PROCOBJ
Processing object type SCHEMA_EXPORT/POST_SCHEMA/GRANT/PROCOBJ_GRANT
Master table "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_01 is:
  /u01/oracle/oracle1/admin/gunnar/dpdump/grant_test.dmp
Job "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully completed at 19:49:49

[oracle1@elin dpdump]$ impdp system schemas='FDH' dumpfile=grant_test.dmp sqlfile=grant_test_dump.txt

Import: Release 11.2.0.2.0 - Production on Wed Nov 2 19:49:57 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Password:

UDI-28002: operation generated ORACLE error 28002
ORA-28002: the password will expire within 6 days

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_SQL_FILE_SCHEMA_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_SQL_FILE_SCHEMA_01":  system/******** schemas=FDH dumpfile=grant_test.dmp sqlfile=grant_test_dump.txt
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/POST_SCHEMA/PROCOBJ
Processing object type SCHEMA_EXPORT/POST_SCHEMA/GRANT/PROCOBJ_GRANT
Job "SYSTEM"."SYS_SQL_FILE_SCHEMA_01" successfully completed at 19:50:02

[oracle1@elin dpdump]$ grep GRANT grant_test_dump.txt
-- new object type path: SCHEMA_EXPORT/PROCEDURE/GRANT/OWNER_GRANT/OBJECT_GRANT
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_USR" WITH GRANT OPTION;
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_ROLE";
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_USR2";
-- new object type path: SCHEMA_EXPORT/POST_SCHEMA/GRANT/PROCOBJ_GRANT
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_USR" WITH GRANT OPTION');
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_ROLE" WITH GRANT OPTION');
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_USR2" WITH GRANT OPTION');

As you can see, all three grants on fdh.test now include the “with grant option” clause, while only the grant to grant_test_usr should have it.
The privileges on the procedure are correct.
I have logged a case about this with Oracle, but no bug number has been assigned at this moment.
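
To review a dump before importing it for real, the grep above can be narrowed down to the procedural object grants that carry the grant option. What follows is a minimal sketch, assuming the sqlfile marks its sections with "-- new object type path:" comments as in the output above; the function name procobj_grants_with_option is made up for this example. The lines it prints still have to be compared by hand with dba_tab_privs on the source database.

```shell
#!/bin/sh
# List the grants in the PROCOBJ_GRANT section of an impdp sqlfile that
# carry WITH GRANT OPTION, so they can be checked against dba_tab_privs.
# Assumption: sections start with a "-- new object type path:" comment.
procobj_grants_with_option() {
  awk '
    /^-- new object type path:/ { in_procobj = ($0 ~ /PROCOBJ_GRANT/); next }
    in_procobj && /WITH GRANT OPTION/ { print }
  ' "$1"
}

# Example run against a copy of the grep output shown in the post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
-- new object type path: SCHEMA_EXPORT/PROCEDURE/GRANT/OWNER_GRANT/OBJECT_GRANT
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_USR" WITH GRANT OPTION;
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_ROLE";
GRANT DEBUG ON "FDH"."PROCEDURE_TEST" TO "GRANT_TEST_USR2";
-- new object type path: SCHEMA_EXPORT/POST_SCHEMA/GRANT/PROCOBJ_GRANT
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_USR" WITH GRANT OPTION');
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_ROLE" WITH GRANT OPTION');
SYS.DBMS_UTILITY.EXEC_DDL_STATEMENT('GRANT ALTER ON "FDH"."TEST" TO "GRANT_TEST_USR2" WITH GRANT OPTION');
EOF
procobj_grants_with_option "$sample"
rm -f "$sample"
```

For the sample, the three ALTER grants on "FDH"."TEST" are printed and the DEBUG grants on the procedure are skipped, because they belong to a different section.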

22 September 2011

Rise of the appliances?

Filed under: infrastructure,opinion,unbreakable db appliance — Freek D'Hooge @ 9:52

Some quick thoughts.

Yesterday Oracle announced its first database appliance for the SMB market.
Before this, it already had its Exadata and Exalogic appliances for the big environments.
During the presentation Oracle also indicated that it wants to continue delivering new appliance products and apparently is no longer interested in selling “commodity” x86 servers.

Symantec has also been busy with appliances for Netbackup.

For some time now, we have seen that the big players in the IT market are leaving their historical background and are trying to offer the complete stack from software over switches to storage. Is this offering of appliances the next step?
Will we see more and more applications offered as appliances?

If so, what will this mean for the independent system integrators?

Also, as these appliances seem to use their own dedicated storage, what does this mean for the SAN?
(I know of some people who will not mourn its decline).


21 September 2011

Oracle announces the Unbreakable DB Appliance

Filed under: infrastructure,opinion,unbreakable db appliance — Freek D'Hooge @ 19:33
Tags: oracel, unbreakable db appliance

More than 10 years after Oracle’s first appliance attempt with Raw Iron and 3 years after the release of Exadata, Oracle has now announced the Unbreakable DB Appliance.

This “cluster in a box” consists of a 4 RU chassis containing 2 server nodes, 96 GB memory per node, 12 TB raw shared disk storage (24 disks) and 292 GB of flash disks.
The two server nodes have a total of 24 cpu cores, but cores can be disabled.
This allows for sub-capacity licensing of the software (with a minimum of 4 cores).

On the software side, the appliance is running Oracle linux and 11gR2 grid infrastructure and 11gR2 db software. Databases on this appliance can run as single node, RAC or RAC One Node.
Oracle enterprise manager is also part of the software stack.

Oracle claims one-button installation of software and patching.
The appliance also has a “phone home” functionality which automatically creates a service request when a problem is detected.

List price for the hardware is $ 50,000 (regardless of how many cores you activate) and for the software the standard DB licensing applies.
Which means that existing CPU licences can be transferred to this appliance.

Oracle positions this system below the Exadata quarter rack, and it is also worth mentioning that this appliance is not expandable.

So far the product launch information.

Some questions / remarks I have:

According to the presentation the hardware price remains the same, regardless of how many cores you activate (namely $ 50,000).
In my opinion, this means that no one will buy this appliance to just activate 4 cores.
There are much cheaper solutions when you only need a low number of cores (certainly when you consider that most companies already have a SAN which can be used for the Oracle databases).

There are 24 disks in the appliance, which seems low (certainly compared to the 24 cpu cores).
However, keep in mind that this storage is dedicated and probably (I don’t have confirmation on this) capable of asm intelligent data placement and command queuing.
Normally SAN vendors are using an estimate of 180 IOPS per san disk. Oracle however is using an estimation of 300 IOPS per cell disk for Exadata, and tests done by Glenn Fawcett show that they can actually perform even better (around 400 IOPS).
http://glennfawcett.wordpress.com/2011/05/10/exadata-drives-exceed-the-laws-of-physics-asm-with-intelligent-placement-improves-iops/
Using the number of 300 IOPS, this would mean that the 24 disks translate to 40 SAN disks (disks that may not be used by any other application, so in reality even more SAN disks), which already looks very different.
Now, I’m still unsure how it will perform with write intensive databases (oltp or dwh), certainly when several databases are consolidated on this appliance.
As this appliance is not expandable, the number of disks may be a weak point compared to the number of cpu cores.
I’m hoping that someone like Kevin Closson (poke poke) will be able to shed some light on this, as my knowledge in this area is rather limited :-)
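
The disk comparison above is plain arithmetic, sketched here in shell. The 300 and 180 IOPS figures are the estimates quoted above, not measurements of this appliance, and the function name is made up for the example.

```shell
#!/bin/sh
# Rough number of SAN disks needed to match a set of dedicated disks.
# $1 = number of appliance disks
# $2 = assumed IOPS per appliance disk
# $3 = assumed IOPS per SAN disk
san_disk_equivalent() {
  total_iops=$(( $1 * $2 ))
  echo $(( total_iops / $3 ))
}

# 24 disks at 300 IOPS each, against SAN disks estimated at 180 IOPS each.
echo "equivalent SAN disks: $(san_disk_equivalent 24 300 180)"
```

With these inputs the total is 7200 IOPS, which gives the 40 SAN disks mentioned above. Note that the shell division truncates, so other inputs may round down.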

In the presentation it was mentioned that the flash storage is used for the redo logs, but it is unclear whether it could also be used to store datafiles or as cache (as with the Exadata smart flash cache).

As with many things the proof of the pudding is in the eating, so I’m looking forward to some benchmarks and presentations by real world customers.
And if anyone from Oracle is reading this, you may always send me a demo machine so I can do some testing on my own ;-))

update 20:12, fixed wrong memory specification


11 August 2011

RAC investigations part I

Filed under: rac — Freek D'Hooge @ 0:06
Tags: failover, rac, service names

Environment description

2 node rac with Oracle 11.2.0.2.2
Oracle Linux 5.6 with the Unbreakable Enterprise Kernel (2.6.32-100.36.1.el5uek)

Conducted tests

test_srv is a service which has both the instance running on node1 and node2 as preferred instances.
On node1 the service was manually stopped.

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
 NAME=ora.mydb.test_srv.svc
 TYPE=ora.service.type
 CARDINALITY_ID=1
 DEGREE_ID=1
 TARGET=OFFLINE
 STATE=OFFLINE
 CARDINALITY_ID=2
 DEGREE_ID=1
 TARGET=ONLINE
 STATE=ONLINE on node2

Issue a “shutdown abort” on the instance running on node2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

start the instance again

[grid@node1 ~]$ srvctl start instance -d mydb -i mydb2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

The service is now running on both instances, although before the crash the service was set offline on node1.

Same test, but this time the service is stopped on all instances

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

[grid@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

This time both services stay offline.
But what happens if we start the instance again:

[grid@node1 ~]$ srvctl start instance -d mydb -i mydb2

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

Now the service has started again on the restarted instance.
The explanation is that the service is configured to come up automatically with the instance (management policy AUTOMATIC), so it is started on the restarted node.
The failover behaviour seems to me as expected, as it is the same as what would happen with a preferred / available configuration.

For the third test, we will reconfigure the service to have a preferred and an available node

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv
[grid@node1 ~]$ srvctl modify service -d mydb -s test_srv -n -i mydb2 -a mydb1

[grid@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: mydb2
Available instances: mydb1

[grid@node1 ~]$ srvctl start service -d mydb -s test_srv -i mydb2
[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

The service is running on its preferred instance, which we will now crash

[grid@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=OFFLINE

eumm, I actually expected a relocation here…
As I have other services which have a preferred / available configuration, I know this service should failover.
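
When repeating this kind of test, the mismatch in the output above (TARGET=ONLINE with STATE=OFFLINE) can be flagged by a small filter instead of by eye. This is a minimal sketch, assuming the layout of the "crsctl status resource … -l" output shown in this post, where each CARDINALITY_ID block lists TARGET before STATE; the function name is made up.

```shell
#!/bin/sh
# Report resource instances whose TARGET is ONLINE while STATE is OFFLINE.
# Reads "crsctl status resource <name> -l" style output on stdin.
# Assumption: within each CARDINALITY_ID block, TARGET= precedes STATE=.
target_state_mismatch() {
  awk -F= '
    { sub(/^[ \t]+/, "") }                       # tolerate indented output
    $1 == "CARDINALITY_ID" { id = $2; target = "" }
    $1 == "TARGET"         { target = $2 }
    $1 == "STATE" && target == "ONLINE" && $2 ~ /^OFFLINE/ {
      print "cardinality " id ": target ONLINE, state " $2
    }
  '
}

# Example: the output captured right after the srvctl abort.
target_state_mismatch <<'EOF'
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=OFFLINE
EOF
```

On a live cluster the function would be fed from a pipe, for example crsctl status resource ora.mydb.test_srv.svc -l | target_state_mismatch. It does not work on the -v output, which lists the attributes in a different order.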

[grid@node1 ~]$ srvctl status service -d mydb -s test_srv
Service test_srv is not running.

[grid@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: mydb2
Available instances: mydb1

[grid@node1 ~]$ srvctl status database -d mydb
Instance mydb1 is running on node node1
Instance mydb2 is not running on node node2

I could find no clues in the different cluster log files as to why the relocation did not occur.
More testing will be necessary.
Also note that the output of crsctl status resource does not show on which node or instance the service is expected to be online.
But by using the -v flag we can see the last_server attribute:

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -v
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
LAST_SERVER=node2
STATE=OFFLINE
TARGET=ONLINE
CARDINALITY_ID=1
CREATION_SEED=137
RESTART_COUNT=0
FAILURE_COUNT=0
FAILURE_HISTORY=
ID=ora.mydb.test_srv.svc 1 1
INCARNATION=5
LAST_RESTART=08/10/2011 16:32:53
LAST_STATE_CHANGE=08/10/2011 16:34:03
STATE_DETAILS=
INTERNAL_STATE=STABLE

After starting the instance again, the service was available again

[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

A second run of this test gave the same result.
Manually relocating the service did work though:

[grid@node1 ~]$ srvctl relocate service -d mydb -s test_srv -i mydb1 -t mydb2
[grid@node1 ~]$ crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

What if I removed the service and recreated it directly as preferred / available:

[grid@node1 ~]$ srvctl stop service -d mydb -s test_srv

[grid@node1 ~]$ srvctl remove service -d mydb -s test_srv

[grid@node1 ~]$ srvctl add service -d mydb -s test_srv -r mydb2 -a mydb1 -y AUTOMATIC -P BASIC -e SELECT
PRCD-1026 : Failed to create service test_srv for database mydb
PRKH-1014 : Current user grid is not the same as oracle owner orauser of oracle home /opt/oracle/orauser/product/11.2.0.2/dbhome_1.

would it?
Let us test it:

[grid@node1 ~]$ su - orauser
Password:

[orauser@node1 ~]$ srvctl add service -d mydb -s test_srv -r mydb1,mydb2 -y AUTOMATIC -P BASIC -e SELECT

[orauser@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: mydb1,mydb2
Available instances:

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

CARDINALITY_ID=2
DEGREE_ID=1
TARGET=OFFLINE
STATE=OFFLINE

now modify it:

[orauser@node1 ~]$ srvctl modify service -d mydb -s test_srv -n -i mydb2 -a mydb1

[orauser@node1 ~]$ srvctl config service -d mydb -s test_srv
Service name: test_srv
Service is enabled
Server pool: mydb_test_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: mydb2
Available instances: mydb1

[orauser@node1 ~]$ srvctl start service -d mydb -s test_srv -i mydb2

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

[orauser@node1 ~]$ srvctl stop instance -d mydb -i mydb2 -o abort

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=OFFLINE

Nope, the user modifying the service has nothing to do with it.
I also tested the scenario where I directly created a preferred / available service, but in this case the failover also did not work.
But after some more testing I found the reason.
During the first test I had shut down the instance via sqlplus, not via srvctl. And the other services I talked about had failed over during this test (I never did a failback).
After doing the shutdown abort again via sqlplus, the failover worked again.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node2

[orauser@node2 ~]$ export ORACLE_SID=mydb2
[orauser@node2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Wed Aug 10 18:28:29 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown abort
ORACLE instance shut down.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

SQL> startup
ORACLE instance started.

Total System Global Area 3140026368 bytes
Fixed Size                  2230600 bytes
Variable Size            1526728376 bytes
Database Buffers         1593835520 bytes
Redo Buffers               17231872 bytes
Database mounted.
Database opened.

[orauser@node1 ~]$ /opt/grid/11.2.0.2/bin/crsctl status resource ora.mydb.test_srv.svc -l
NAME=ora.mydb.test_srv.svc
TYPE=ora.service.type
CARDINALITY_ID=1
DEGREE_ID=1
TARGET=ONLINE
STATE=ONLINE on node1

As expected, starting the instance again did not trigger a failback of the service.

The question now is whether the failover not happening when the shutdown is issued via srvctl is expected behaviour or not.
For this, one would probably have to open a service case, answer a couple of questions not important for this issue, escalate and still have to wait for several months.
Do I sound bitter now?

Conclusion:

When restarting an instance, an offline service that has this instance listed as a preferred node will be started (management policy = automatic).

When an instance on which a service was running fails, the service is started on at least one other preferred instance.

The service will remain running on this instance, even when the original instance is started again (in which case the service will run on both instances).

When a service has a preferred / available configuration, the service will failover to the available instance, but not failback afterwards.

Failover in a preferred / available configuration does not happen when the instance was stopped via “srvctl stop instance -d <db_unique_name> -i <instance_name> -o abort”.

Questions remaining:

What if there were more than 2 nodes, with a service that has all three or more nodes listed as preferred, but currently only running on one node?
If the instance on which that service is running fails, would the service then be started on all preferred nodes or on only 1 of them?

What if, in the above case, the service was running on 2 nodes.
Would it still be started on other nodes?

And what if one of the nodes was configured as available and not as preferred? Would the service on the preferred node still be started or the one on the available instance or both?

And last but not least, is the srvctl shutdown behaviour a bug or not?

It would be neat if someone has access to a 3 or more node rac on which they can run the above tests and send me the results :-)

Update 13/08/2011:
Amar Lettat, one of my colleagues at Uptime, has pointed me to MOS note 1324574.1 – “11gR2 RAC Service Not Failing Over To Other Node When Instance Is Shut Down”.
This note clearly points out that the service not failing over when shutting down with srvctl is expected behaviour in 11.2.
It also points to the Oracle documentation, where this behaviour is also documented.
So not a bug, only a well documented change in behaviour.
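
One practical consequence: since a srvctl stop of the instance does not fail the service over, and the manual relocation in the tests above did work, a planned stop could relocate the service first. The sketch below only prints the two commands, using the same srvctl syntax as in the tests; the function name is made up and the sequence is a suggestion, not something taken from the MOS note.

```shell
#!/bin/sh
# Print (do not execute) the commands for a planned instance stop that
# first moves a preferred / available service to the other instance.
# $1 = database, $2 = service, $3 = instance to stop, $4 = instance to move to
plan_instance_stop() {
  db=$1 svc=$2 stop_inst=$3 target_inst=$4
  echo "srvctl relocate service -d $db -s $svc -i $stop_inst -t $target_inst"
  echo "srvctl stop instance -d $db -i $stop_inst -o abort"
}

# Names taken from the test environment in this post.
plan_instance_stop mydb test_srv mydb2 mydb1
```

Printing instead of executing keeps the sketch harmless; the abort option is kept only because it is what the tests used.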

30 November 2009

0x1A

Filed under: rant — Freek D'Hooge @ 0:23

0x1A, better known as the end of file character.
Now also known as the cause of me wasting several hours analyzing a 500MB raw trace file, trying to figure out why the tkprof report did not seem to be correct.

Still wondering why the application I was tracing had an EOF character in the value of a varchar2 type bind variable.
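
A quick check for the character in a trace file, before feeding it to tkprof, could look like the sketch below. It only counts the 0x1A bytes (octal 032); the function name is made up and the sample file stands in for a real trace file.

```shell
#!/bin/sh
# Count the 0x1A (octal 032) bytes in a file.
count_sub_bytes() {
  # Delete every byte except 0x1A, then count what is left.
  tr -cd '\032' < "$1" | wc -c | tr -d ' '
}

# Example with a small file that contains exactly one 0x1A byte.
sample=$(mktemp)
printf 'bind value: abc\032def\n' > "$sample"
echo "0x1A bytes found: $(count_sub_bytes "$sample")"
rm -f "$sample"
```

A result other than 0 means that tools which treat 0x1A as end of file may stop reading the file early.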


3 November 2009

Two Oracle RAC bugs on the wall, two Oracle bugs. Take one down …

Filed under: infrastructure,linux,rac — Freek D'Hooge @ 2:50

Ok, not as good as beer and they can give you a nasty headache, so you have been warned ;)

Reason for this post are 2 bugs I discovered with Oracle RAC, both resulting in a single point of failure.
The platform on which I’m working is Oracle 10gR2 (10.2.0.4) on OEL 4.7.

The first one is when you are using NFS volumes to host the ocr and ocrmirror volumes.
Normally, when the ocr volume gets corrupted or becomes unavailable, Oracle should fail over to the ocrmirror volume. The exact response is documented in the RAC FAQ on metalink (note 220970.1) and is currently discussed in a series of blog posts by Geert de Paep.
With NFS, however, you must use both the nointr and hard mount options (OS is OEL 4.7) and as a result the process that is trying to read or write an unavailable ocr volume will wait indefinitely for a response. This happens not only when using commands such as crs_stat or srvctl, but also when an instance or service failover is initiated.
Oracle support, however, does not exactly see it this way and first blamed the os, then the storage and finally stated that there is no failover foreseen between the ocr and ocrmirror volumes…
It took some escalating and a change in support engineer to get some progress in that SR (mind you that after more than 4 months, they still have not acknowledged it as a bug).

The second problem is that, when you have made the public interface redundant with os bonding, the racgvip script does not detect when all interfaces in the bond are disconnected.
This is because the script, unlike older versions, uses mii-tool to check the availability of the public interface. Only when mii-tool states that the link is down is a ping test done to the public gateway. If that test fails as well, then the vip fails over and the rac instances on that node are placed in a blocked state.
The problem with mii-tool, however, is that it does not play well with bonds, and always reports the bond status as being up (in fact, regardless of the link state, mii-tool always reports a network bond as “bond0: 10 Mbit, half duplex, link ok”). So, the racgvip script always thinks that the public interface is up.
As mii-tool is an os utility, I first opened a case with Oracle Enterprise Linux support, to check with them if its behavior was normal (I had already confirmed that by googling, but Oracle support does not seem to accept results from google :) ). And after running multiple tests with different bond options, they finally stated that mii-tool was indeed obsolete and should not be used to verify a bond status (yes, I know. Its own man page already states that mii-tool is obsolete).
So next, I opened an SR for the clusterware, and oracle development promptly stated that it was not a clusterware bug but an os issue, pointing the finger at mii-tool and asking where it was written that mii-tool is obsolete… . After I made them aware of the statement made by their OEL colleagues and the mii-tool man page, they seem to have accepted it as a bug.
I have checked the 11gR2 version of the racgvip script, and it seems to suffer the same problem.
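
For comparison, the Linux bonding driver reports the link state of each slave in /proc/net/bonding/<bond>. The sketch below counts the slaves whose "MII Status" is up. It is not what racgvip does, nor a patch for it; it assumes that link monitoring is configured for the bond, and both the function name and the sample file content are illustrative, not captured from the system described here.

```shell
#!/bin/sh
# Count the slaves of a bond whose MII Status is "up".
# $1 = bonding status file, normally /proc/net/bonding/bond0
count_up_slaves() {
  awk '
    /^Slave Interface:/ { in_slave = 1; next }
    # The bond-level "MII Status" line precedes the first slave and is skipped.
    in_slave && /^MII Status:/ { if ($3 == "up") up++; in_slave = 0 }
    END { print up + 0 }
  ' "$1"
}

# Illustrative sample: a bond with both cables pulled.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Bonding Mode: fault-tolerance (active-backup)
MII Status: down

Slave Interface: eth0
MII Status: down

Slave Interface: eth1
MII Status: down
EOF
echo "slaves with link: $(count_up_slaves "$sample")"
rm -f "$sample"
```

A count of 0 is the situation in which, according to the description above, mii-tool would still report “link ok” for the bond.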

ps) Note 365605.1 – “Oracle Bug Status Codes, Descriptions and Usage” is, although it seems incomplete, very useful to understand the different status codes


25 October 2009

Wintertime (again)

Filed under: infrastructure,Uncategorized — Freek D'Hooge @ 15:28

In my prior post on the effect of daylight saving settings on the Oracle scheduler, I already pointed out that it is best to set your session timezone information to a named timezone and not to an absolute offset. In this post I would like to investigate how the session timezone settings affect the sysdate, current_date, systimestamp and current_timestamp variables during the switchover to or from daylight saving time. Current_date and current_timestamp use the date/time information of the server on which the database runs and modify that time using the timezone settings of the session.
As with the last post, the tests were done in response to the switch from wintertime to summertime, and I’m too lazy to redo them.

In the first test, I do not explicitly set timezone information in my session.
Both the server time and the client time have been set to a couple of minutes before the switchover from wintertime to summertime:

sys@GUNNAR> alter session set nls_date_format = 'DD/MM/YYYY HH24:MI:SS';

Session altered.

sys@GUNNAR> alter session set nls_timestamp_tz_format='DD/MM/YYYY HH24:MI:SS "TZ:" TZR "DS:" TZD ';

Session altered.

sys@GUNNAR> column systimestamp format a35
sys@GUNNAR> column current_timestamp format a35
sys@GUNNAR> select sysdate, current_date, systimestamp, current_timestamp from dual;

SYSDATE             CURRENT_DATE        SYSTIMESTAMP                        CURRENT_TIMESTAMP
------------------- ------------------- ----------------------------------- -----------------------------------
29/03/2009 01:58:32 29/03/2009 01:58:32 29/03/2009 01:58:32 TZ: +01:00 DS:  29/03/2009 01:58:32 TZ: +01:00 DS:

As you can see, the timezone information uses the absolute offset notation and is set to GMT +1 (which corresponds to wintertime in Belgium).
After some minutes (when summertime had come into effect), I executed the same query again:

sys@GUNNAR> select sysdate, current_date, systimestamp, current_timestamp from dual;

SYSDATE             CURRENT_DATE        SYSTIMESTAMP                        CURRENT_TIMESTAMP
------------------- ------------------- ----------------------------------- -----------------------------------
29/03/2009 03:00:18 29/03/2009 02:00:18 29/03/2009 03:00:18 TZ: +02:00 DS:  29/03/2009 02:00:18 TZ: +01:00 DS:

Both sysdate and systimestamp have jumped 1 hour into the future and systimestamp now shows the timezone as “GMT + 2” (summertime in Belgium).
Current_date and current_timestamp both show the time without summertime corrections, but with current_timestamp the timezone information places the time in the right context.

Next, I disconnect and reconnect the session:

sys@GUNNAR> select sysdate, current_date, systimestamp, current_timestamp from dual;

SYSDATE             CURRENT_DATE        SYSTIMESTAMP                        CURRENT_TIMESTAMP
------------------- ------------------- ----------------------------------- -----------------------------------
29/03/2009 03:01:15 29/03/2009 03:01:15 29/03/2009 03:01:15 TZ: +02:00 DS:  29/03/2009 03:01:15 TZ: +02:00 DS:

This time, all 4 show the same time and timezone information (all using summertime).
The explanation for this is that the timezone information for a session is determined when the session is created, and Oracle only applies daylight saving settings when using a named timezone. So as long as the session is connected, it uses the “old” timezone of GMT +1. With sysdate and systimestamp the timezone information comes from the server, not from the client.

In the second test, I have set the ORA_SDTZ variable in the client environment to “Europe/Brussels”

sys@GUNNAR> select sysdate, current_date, systimestamp, current_timestamp from dual;

SYSDATE             CURRENT_DATE        SYSTIMESTAMP                        CURRENT_TIMESTAMP
------------------- ------------------- ----------------------------------- -----------------------------------------------
29/03/2009 01:57:37 29/03/2009 01:57:38 29/03/2009 01:57:37 TZ: +01:00 DS:  29/03/2009 01:57:37 TZ: EUROPE/BRUSSELS DS: CET

### a couple of minutes later

sys@GUNNAR> select sysdate, current_date, systimestamp, current_timestamp from dual;

SYSDATE             CURRENT_DATE        SYSTIMESTAMP                        CURRENT_TIMESTAMP
------------------- ------------------- ----------------------------------- ------------------------------------------------
29/03/2009 03:00:04 29/03/2009 03:00:04 29/03/2009 03:00:04 TZ: +02:00 DS:  29/03/2009 03:00:04 TZ: EUROPE/BRUSSELS DS: CEST

Both current_date and current_timestamp have now also jumped 1 hour into the “future” and the daylight saving setting in current_timestamp has changed from CET (Central European Time) to CEST (Central European Summer Time).
To me this shows that it is important to set the timezone of your clients correctly, even if the database is not used from different timezones.
A long-running session is sufficient to pollute your data, certainly if you are using current_date, as it carries no timezone information.
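
As a minimal sketch of what "setting the timezone correctly" could look like from inside a session (as an alternative to the ORA_SDTZ variable used in the second test), the statements below check the current session timezone and switch it to a named one. The region name is simply the one used in the test above; the format model is shortened for readability.

```sql
-- Show which timezone the session picked up when it was created.
select sessiontimezone from dual;

-- Switch the session to a named timezone (region name taken from the test above).
alter session set time_zone = 'Europe/Brussels';

-- With a named timezone, TZR should show the region name and TZD the
-- daylight saving abbreviation, instead of a bare offset.
select to_char(current_timestamp, 'DD/MM/YYYY HH24:MI:SS TZR TZD') from dual;
```

The alter session only affects the current session, so a client that reconnects without ORA_SDTZ set would have to issue it again.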


22 October 2009

switching to wintertime

Filed under: infrastructure — Freek D'Hooge @ 18:17

In Belgium we are switching to wintertime this Sunday, which is a good opportunity for me to write this post.
I originally intended to write it when we switched to summer time, so everything will be from the point of view of changing from winter time to summer time (confused yet?).

The reason I wanted to write about it was a number of alerts we got back then from our monitoring about scheduler jobs that were no longer running on time.
It quickly became clear that these jobs did not follow the change to summer time, but instead ran an hour later.
The key is to look at the dba_scheduler_jobs view in the correct format. You see, the *_run_date columns are of the datatype “timestamp(6) with time zone”, so to get all the information you need to use the right format model. Using the TZR and TZD format elements you can see the timezone and the daylight saving information, respectively:

sys@WPS50> select job_name, to_char(last_start_date, 'DD/MM/YYYY HH24:MI:SS "TZ:" TZR "DS:" TZD ') last_start_date, to_char(next_run_date, 'DD/MM/YYYY HH24:MI:SS "TS:" TZR "DS:" TZD ') next_run_date from dba_scheduler_jobs;

JOB_NAME                       LAST_START_DATE                                    NEXT_RUN_DATE
------------------------------ -------------------------------------------------- --------------------------------------------------
AUTO_SPACE_ADVISOR_JOB         28/03/2009 06:00:04 TZ: +01:00 DS:
GATHER_STATS_JOB               02/02/2009 22:00:00 TZ: +01:00 DS:
FGR$AUTOPURGE_JOB
PURGE_LOG                      29/03/2009 03:00:00 TZ: MET DS: MEST               30/03/2009 03:00:00 TS: MET DS: MEST
ANALYZETHIS_PURGEHISTORY       29/03/2009 17:00:00 TZ: +01:00 DS:                 30/03/2009 17:00:00 TS: +01:00 DS:
GATHER_WK_TEST_STATS           29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_SESSIONUSR_STATS        29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_RELEASEUSR_STATS        29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_LMDBUSR_STATS           29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_ICMADMIN_STATS          29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_COMMUNITYUSR_STATS      29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
GATHER_CUSTOMIZATIONUSR_STATS  29/03/2009 18:00:00 TZ: +01:00 DS:                 30/03/2009 18:00:00 TS: +01:00 DS:
MGMT_STATS_CONFIG_JOB          01/03/2009 01:01:01 TZ: +01:00 DS:                 01/04/2009 01:01:01 TS: +01:00 DS:
MGMT_CONFIG_JOB                28/03/2009 06:00:04 TZ: +01:00 DS:

14 rows selected.

(note the additional space after the TZD format element: I needed to add it to actually show the information when the "DS:" literal is placed in front of it (this is probably a bug) )

As you can see, each job has its own timezone offset and some have also daylight saving information.
So, what happened with our jobs? Well, when a job gets created, Oracle stores the timezone information of the start_date parameter. If this timezone is specified as an absolute offset, then no daylight saving changes are applied.
When the server switches to summer time (GMT +2 in Belgium), the scheduler job stays in its own little world and remains in the timezone GMT +1.
So, when for the rest of the database the time is 07:00, the job thinks it is still 06:00 and does not start. As the monitoring check did not take the timezone of the job into account, it reported the job as being late.
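
To find the jobs that live in such a "little world", a query along the following lines could be used. It is only a sketch: it assumes that checking the start_date column of dba_scheduler_jobs is sufficient, and it treats any TZR value starting with a plus or minus sign as an absolute offset.

```sql
-- List scheduler jobs whose start_date carries an absolute offset (e.g. +01:00)
-- rather than a region name; such jobs do not follow daylight saving switches.
-- Jobs without a start_date return NULL here and are therefore not listed.
select owner,
       job_name,
       to_char(start_date, 'TZR') as start_tz
from   dba_scheduler_jobs
where  substr(to_char(start_date, 'TZR'), 1, 1) in ('+', '-');
```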

To avoid this situation, you need to use a named timezone, in which case Oracle will automatically apply the correct daylight saving settings.
How do you do this? Well, either you use the to_timestamp_tz function to convert a text string to a timestamp with timezone information, or you let Oracle retrieve the timezone from your session.
The timezone information in your session can be set with alter session, or by using the ORA_SDTZ variable in your client environment.
But there is a catch. In the following example I have set my timezone to Europe/Brussels, and then verified the timezone information in systimestamp:

sys@WPS50> select sessiontimezone from dual;

 SESSIONTIMEZONE
---------------------------------------------------------------------------
Europe/Brussels

sys@WPS50> select to_char(systimestamp, 'DD/MM/YYYY HH24:MI:SS "TZ:" TZR "DS:" TZD ') from dual;

TO_CHAR(SYSTIMESTAMP,'DD/MM/YYYYHH24:MI:SS"TZ:"TZR"DS:"TZD')
--------------------------------------------------------------------
30/03/2009 01:57:15 TZ: +02:00 DS:

As you can see, the timezone part uses the absolute offset notation, not a named timezone.
Systimestamp will never use the named timezone notation, so whenever you use systimestamp as the value for the start_date parameter in dbms_scheduler, you will get an absolute offset and the job will thus not follow daylight saving switches.
The current_timestamp function will, however, use the correct notation:

sys@WPS50> select to_char(current_timestamp, 'DD/MM/YYYY HH24:MI:SS "TZ:" TZR "DS:" TZD ') from dual;

TO_CHAR(CURRENT_TIMESTAMP,'DD/MM/YYYYHH24:MI:SS"TZ:"TZR"DS:"TZD')
--------------------------------------------------------------------
30/03/2009 01:57:18 TZ: EUROPE/BRUSSELS DS: CEST

So, when you want to specify the current date as the value for the start_date parameter, use current_timestamp and not systimestamp.
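
A minimal sketch of creating a job with an explicit named timezone through to_timestamp_tz is shown below. The job name, the dummy job action and the chosen start time are made up for the illustration; only the use of a region name in start_date is the point.

```sql
begin
  dbms_scheduler.create_job(
    job_name        => 'DEMO_DAILY_JOB',          -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin null; end;',        -- placeholder action
    -- Named timezone (TZR) instead of an absolute offset such as +01:00.
    start_date      => to_timestamp_tz('30/03/2009 06:00:00 Europe/Brussels',
                                       'DD/MM/YYYY HH24:MI:SS TZR'),
    repeat_interval => 'FREQ=DAILY',
    enabled         => true);
end;
/

-- Check which timezone notation the scheduler stored for the job.
select job_name,
       to_char(next_run_date, 'DD/MM/YYYY HH24:MI:SS "TZ:" TZR "DS:" TZD ') next_run_date
from   dba_scheduler_jobs
where  job_name = 'DEMO_DAILY_JOB';
```

Passing current_timestamp as start_date from a session with a named timezone should have the same effect, without hard-coding the region name in the call.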

This timezone stuff is only applicable when you have an interval that is at least 1 day. With smaller intervals, Oracle will make sure that the period between 2 runs remains the same.
If a job runs every 3 hours and last ran at midnight and the clock is then moved forward from 02:00 to 03:00, then the next run date of the job becomes 04:00, so that the 3 hour period between two job runs is retained.
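
One way to check this behavior without creating a job is to let the scheduler evaluate the calendar string itself. The block below is a sketch under assumptions: the calendar string 'FREQ=HOURLY;INTERVAL=3' stands in for "every 3 hours", and midnight on the night of the 2009 switchover is used as the start date. The output is deliberately not shown; run it on your own version to see what it returns.

```sql
set serveroutput on

declare
  -- Midnight before the switchover, in a named timezone.
  l_start timestamp with time zone :=
    to_timestamp_tz('29/03/2009 00:00:00 Europe/Brussels',
                    'DD/MM/YYYY HH24:MI:SS TZR');
  l_next  timestamp with time zone;
begin
  -- Ask the scheduler for the first run date after l_start.
  dbms_scheduler.evaluate_calendar_string(
    calendar_string   => 'FREQ=HOURLY;INTERVAL=3',
    start_date        => l_start,
    return_date_after => l_start,
    next_run_date     => l_next);

  dbms_output.put_line(to_char(l_next, 'DD/MM/YYYY HH24:MI:SS TZR TZD'));
end;
/
```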

More information on this, including how Oracle behaves when no start_date parameter is given can be found here:

http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_sched.htm#sthref5475
Metalink id: 467722.1 – DBMS_SCHEDULER And Time Zones ( DST ) Explained
Metalink id: 340512.1 – Timestamps & time zones – Frequently Asked Questions

The database version on which the tests were done is 10.2.0.4


20 October 2009

Multiple standby databases and supplemental logging

Filed under: dataguard,infrastructure — Freek D'Hooge @ 18:10

A quick warning:

When you set up a logical standby database, you need to activate supplemental logging on the primary database.
This is done automatically when you build the data dictionary (by running the dbms_logstdby.build procedure).
Activating supplemental logging is however (I know now) a control file change and is thus not replicated to the other physical standby databases.
As a result, the logical standby will become (logically) corrupt when you perform a role switch between your primary and another physical standby database.

I learned this the hard way :(
Luckily it was during a proof of concept and not in a real production environment … .

Of course, AFTERWARDS, I found the following MAA document, which points out that you have to enable supplemental logging yourself on the other physical standby databases.
It still makes a good read though.
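
As a rough sketch of what this could look like on each physical standby: first check the current supplemental logging settings, then enable primary key and unique index logging. The exact set of logging options needed is an assumption here, so verify it against the MAA document and the Data Guard documentation for your version before a role switch.

```sql
-- Check the supplemental logging settings recorded in the control file
-- of the database you are connected to.
select supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui
from   v$database;

-- Enable primary key and unique index supplemental logging
-- (to be run on each physical standby, not only on the primary).
alter database add supplemental log data (primary key, unique index) columns;
```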


19 January 2009

Silent upgrade troubles

Filed under: bugs,infrastructure,upgrades / migrations — Freek D'Hooge @ 1:08

Last week I was asked to write a little script to automate an upgrade of oracle client 9.2.0.1 to 9.2.0.8.
The reason for this was that we needed to update around 1,300 clients to enable them to connect to a 10g database (we couldn’t install 10g clients because other applications restricted the client version to 9i).

Ok, easy enough. Oracle allows you to automate installations and upgrades via response files and the response file for a client upgrade from 9.2.0.1 to 9.2.0.8 is very simple.
When I started testing the upgrade, I immediately spotted a first problem. The setup.exe (it was on Windows XP) started a new console and then returned directly to the prompt in the original console. This would make it impossible to check the return codes to know whether an upgrade was successful or not.

The upgrade itself finished without a problem, but at the end the following message appeared in the newly started console: “Press enter to exit”.
Huh!? This was supposed to be a “silent” install, meaning no interaction needed. But here it was, asking to press enter to exit.
And the documentation did not say anything about it.
After some searching, I found that you can specify the “-noconsole” flag when starting the setup, which suppresses the new console and avoids the question to press enter.
You still see the question in the logfiles, but the installation presumes you responded to it and finishes the upgrade.

This left me with the first problem: the prompt would still return directly while the upgrade was running in the background.
After some searching in the documentation I found a note stating that you need to modify the oraparam.ini file and change the BOOTSTRAP parameter from TRUE to FALSE.
Unfortunately this did not help. Yelling at it didn’t either.

Then I found that in 10g you had a “-waitforcompletion” flag you could set, which would do exactly what I needed. So I tried whether it would work with the OUI shipped in the 9.2.0.8 patchset.
At first, it didn’t, but then I found Metalink note 293044.1, which said that the setup.exe in Disk1 and the one in Disk1/install were not the same and that the one in Disk1/install should be used for the “-waitforcompletion” flag.
At last it worked.

For those interested, here is the full command I used to start the silent upgrade:

start /wait C:\oracle\patches\9.2.0.8\Disk1\install\setup.exe -silent -noconsole -waitforcompletion -responsefile c:\oracle\patches\9.2.0.8\patchset.rsp -paramfile c:\oracle\patches\9.2.0.8\oraparam.ini

—————————-

Thanks to Geert for the yelling link :)

