Irrelevant thoughts of an oracle DBA

10 January 2009

How to use the plan_table table to sabotage your oracle upgrade

Filed under: bugs,infrastructure,upgrades / migrations — Freek D'Hooge @ 12:55

Let’s say you need to upgrade your 9i database to 10g ( to be exact), but you actually want to sabotage the upgrade (don’t know why, just assume you do).
Granted, there are many ways to do this, but you want to do it subtly. What are your options then?
Well, one option is to create the plan_table table in your sys schema (or a synonym plan_table to a plan table in another schema if you want to make it really subtle) before the upgrade.
If you do this, you will see the following message in your upgrade log:

Warning: Package Body created with compilation errors.

SQL> show errors;

LINE/COL ERROR
-------- -----------------------------------------------------------------
113/5    PL/SQL: SQL Statement ignored
118/44   PL/SQL: ORA-00904: "OTHER_XML": invalid identifier

And the “oracle database server” component in the dba_registry will be marked as invalid.
Mission accomplished, I would say.

What is that?
You regret your actions and you want to fix the problem?


Ok then, to fix it you can use the following steps:

  • drop the sys.plan_table table
  • drop the sys.plan_table$ table
  • drop all sys synonyms and public synonyms to the plan_table or the plan_table$
  • @?/rdbms/admin/catplan.sql — recreate the plan table
  • @?/rdbms/admin/dbmsxpln.sql — reload dbms_xplan spec
  • @?/rdbms/admin/prvtxpln.plb — reload dbms_xplan implementation
  • @?/rdbms/admin/prvtspao.plb — reload dbms_sqlpa
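
Put together as a single SQL*Plus cleanup session, the steps above would look something like this (run as sys; which synonyms you actually need to drop depends on your situation, so the synonym names below are only illustrative):

```sql
-- connect as sys (sysdba)
drop table sys.plan_table;
drop table sys.plan_table$;
-- drop any sys or public synonyms pointing to these tables, e.g.:
-- drop synonym sys.plan_table;
-- drop public synonym plan_table;

@?/rdbms/admin/catplan.sql    -- recreate the plan table
@?/rdbms/admin/dbmsxpln.sql   -- reload dbms_xplan spec
@?/rdbms/admin/prvtxpln.plb   -- reload dbms_xplan implementation
@?/rdbms/admin/prvtspao.plb   -- reload dbms_sqlpa
```

If what you created was a synonym rather than a table, replace the first drops with the matching drop synonym statements.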

For those seeking more information:

Metalink note 565600.1 – ERROR IN CATUPGRD: ORA-00904 IN DBMS_SQLPA
Metalink note 605317.1 – DBMS_SQLPA ORA-00904 OTHER_XML invalid identifier

According to the notes, the problem only exists with upgrades to or to

ps. Don’t ask me why I had a synonym called plan_table in my sys schema. I didn’t do it.
pps. This is why you should test your migration (I’m glad I did).

7 December 2008

About datapump, SMO, autocompletion, P2V and opatch

Filed under: Uncategorized — Freek D'Hooge @ 15:30

Long time since I blogged, but I have been busy tinkering with Ubuntu, learning about Oracle streams, working on client projects and playing with my now 10 month old son (not necessarily in that order).

Still, to let you know that I am still alive: a short overview of things I learned that you might find useful:

Tanel Poder’s advanced troubleshooting course

If you still have the possibility to follow it, then do so. I really learned a lot during this course.
The word “advanced” in the course title is not solely there for marketing purposes.
More information about the course contents can be found here

Datapump and lobs

Until there is a bug in datapump with lobs, causing imports to be slow. This is due to bug 5555463, which is fixed in (there is also a one-off patch available). Another workaround is to enforce the access_method used by datapump, but this is unsupported.
Some articles about this that I found useful:

Note:5555463.8 – Bug 5555463 – Slow performance of datapump IMPORT with small LOBs
Note:552424.1 – Export/Import DataPump Parameter ACCESS_METHOD – How to Enforce a Method of Loading and Unloading Data ?
Note:286496.1 – Export/Import DataPump Parameter TRACE – How to Diagnose Oracle Data Pump
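
Note 552424.1 describes the unsupported workaround mentioned above. As a hedged sketch only (directory, dumpfile and schema are placeholders, and which access method helps depends on the exact bug, so check the notes first):

```
impdp system DIRECTORY=dump_dir DUMPFILE=expdat.dmp SCHEMAS=scott ACCESS_METHOD=EXTERNAL_TABLE
```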

Netapp Snapmanager for Oracle 3.0 has been released.

This release has some important new features, such as time based retention policies.
Don’t have the time to play with it myself, but one of my colleagues has started to test it, and is blogging about it:

Autocompletion for hostnames

Prevent windows from requiring reactivation after a P2V

I created a vmware image of my laptop to run it under Ubuntu. Unfortunately my laptop was installed with an OEM edition of XP, so it kept complaining that I needed to reactivate the license.
Luckily there is a vmware parameter you can use to avoid this:
(Thanks to Filiep Declerck for the link)

opatch, solaris and rac

With, opatch does not seem to update the $ORACLE_HOME/inventory/oneoffs/ and the $ORACLE_HOME/.patch_storage/ directories on the remote nodes when applying one-off patches. We have noticed this issue on several servers with Solaris 10 and oracle (and opatch ); other combinations might have the same problem.
The patch itself seems to be applied on the remote nodes, but because the directories are not copied, a query (such as lsinventory) on the remote inventory will fail on the remote nodes (preventing other patches from being installed).
This issue is now tracked in bug 7611827. The workaround is to use the “-local” flag and install the patch node by node.

That’s all for now.

26 October 2008

Merge statements and sequences

Filed under: performance,Uncategorized — Freek D'Hooge @ 3:00

Here’s something I stumbled upon when I was researching a performance problem involving the oracle merge command.

fdh@GUNNAR> create table t_dest
  2  ( n1   number(10,0),
  3    n2   number(10,0),
  4    v1   varchar2(1)
  5  )
  6  /

Table created.

fdh@GUNNAR> create table t_source
  2  ( n1 number(10,0)
  3  )
  4  /

Table created.

fdh@GUNNAR> create sequence s_seq
  2  start with 1
  3  increment by 1
  4  nomaxvalue
  5  nocache
  6  nocycle
  7  noorder
  8  /

Sequence created.

fdh@GUNNAR> insert into t_dest
  2  select rownum, s_seq.nextval, 'O'
  3  from all_objects
  4  where rownum <= 1000;

1000 rows created.

fdh@GUNNAR> insert into t_source
  2  select rownum
  3  from all_objects
  4  where rownum <= 500;

500 rows created.

fdh@GUNNAR> commit;

Commit complete.

fdh@GUNNAR> select last_number from user_sequences where sequence_name = 'S_SEQ';


fdh@GUNNAR> set timing on
fdh@GUNNAR> exec dbms_monitor.session_trace_enable(waits => TRUE, binds => TRUE);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
fdh@GUNNAR> merge into t_dest
  2  using t_source
  3  on ( t_dest.n1 = t_source.n1)
  4  when matched then update
  5  set v1 = 'U'
  6  when not matched then insert (n1, n2, v1)
  7  values (0, s_seq.nextval, 'N');

500 rows merged.

Elapsed: 00:00:03.68
fdh@GUNNAR> commit;

Commit complete.

Elapsed: 00:00:00.06
fdh@GUNNAR> exec dbms_monitor.session_trace_disable;

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
fdh@GUNNAR> select v1, count(*)
  2  from t_dest
  3  group by v1
  4  order by v1;

V   COUNT(*)
- ----------
O        500
U        500

Elapsed: 00:00:00.09

As you can see, the merge statement has updated 500 records and inserted no new records.
But check out the sequence:

fdh@GUNNAR> select last_number from user_sequences where sequence_name = 'S_SEQ';


Elapsed: 00:00:00.07

Apparently the sequence has increased by 500 (the number of records in t_source), even though the “when not matched” clause never evaluated to true.
The trace output confirms this as well.

merge into t_dest
using t_source
on ( t_dest.n1 = t_source.n1)
when matched then update
set v1 = 'U'
when not matched then insert (n1, n2, v1)
values (0, s_seq.nextval, 'N')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          6          0           0
Execute      1      0.59       0.56          0          9       1014         500
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.60       0.58          0         15       1014         500

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 43

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  MERGE  T_DEST (cr=509 pr=0 pw=0 time=3478125 us)
    500   VIEW  (cr=509 pr=0 pw=0 time=3510161 us)
    500    SEQUENCE  S_SEQ (cr=509 pr=0 pw=0 time=3508659 us)
    500     HASH JOIN OUTER (cr=9 pr=0 pw=0 time=6366 us)
    500      TABLE ACCESS FULL T_SOURCE (cr=3 pr=0 pw=0 time=1042 us)
   1000      TABLE ACCESS FULL T_DEST (cr=6 pr=0 pw=0 time=1049 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  log file sync                                   2        0.01          0.01
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.01          0.01

update seq$ set increment$=:2,minvalue=:3,maxvalue=:4,cycle#=:5,order$=:6,

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      500      0.00       0.00          0          0          0           0
Execute    500      2.96       2.90          0        500       1013         500
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1000      2.96       2.91          0        500       1013         500

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  UPDATE  SEQ$ (cr=1 pr=0 pw=0 time=167 us)
      1   INDEX UNIQUE SCAN I_SEQ1 (cr=1 pr=0 pw=0 time=16 us)(object id 102)


Clearly something to remember when you are using a merge command with a sequence in the “when not matched” clause and a source table containing a couple of million rows, especially when you expect most records to be matched.
In such a case it could be better to move the sequence to a row trigger on the target table.
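
A hedged sketch of that trigger-based alternative, reusing the t_dest table and s_seq sequence from the example (the select-into form keeps it valid on pre-11g versions; treat this as an illustration, I have not traced it):

```sql
-- assign n2 from the sequence only for rows that are actually inserted
create or replace trigger trg_t_dest_bri
before insert on t_dest
for each row
begin
  select s_seq.nextval into :new.n2 from dual;
end;
/

-- the merge no longer references the sequence, so the sequence is only
-- incremented for rows that reach the insert branch
merge into t_dest
using t_source
on ( t_dest.n1 = t_source.n1)
when matched then update
set v1 = 'U'
when not matched then insert (n1, n2, v1)
values (0, null, 'N');
```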

23 September 2008

Overshoot day

Filed under: personal — Freek D'Hooge @ 21:59

Today, 23/09, is Earth Overshoot Day. That is the day we have used up more natural resources than the Earth can regenerate in one year.
Currently we are using 1.4 planets to support our lifestyle, resulting in a decrease of our supply of natural resources.

Maybe it’s not a bad idea to make a similar calculation for the resource usage on our databases and present the results to developers :)

21 September 2008

Tanel Poder: Advanced Oracle Troubleshooting seminars

Filed under: Uncategorized — Freek D'Hooge @ 1:12

Just a quick note to say I will be going to Tanel Poder’s Advanced Oracle Troubleshooting seminar in Düsseldorf on 6 and 7 November. :)

Anyone else going?

ORA-08103 revisited

Filed under: Uncategorized — Freek D'Hooge @ 0:39

In an earlier post I had already written about sessions receiving an ORA-08103 error after selecting from a table that had been truncated. At that time, I wrote the following conclusion:

“At this point I have no real explanation as for why the error occurs at this number of records fetched.
I suspect oracle of storing the locations of the blocks it has to scan during a full table scan in a kind of array with a limited length. After this array has been processed, Oracle would use the dataobj# value (and other keys) to get the next batch of block locations. At that moment the “object no longer exists” error would be thrown as the referenced dataobj# value no longer exists (it has been increased by 1).
But at this moment this is just pure speculation.”

Some weeks ago, I read an answer on the Oracle-L list from Tanel Poder to an ORA-08103 question, in which he stated that oracle would read 10 pointers to extents from the segment header at a time.

A quick test does seem to confirm this:

The TEST tablespace is an LMT with a uniform extent size of 16K.

fdh@GUNNAR> drop table test;

Table dropped.

fdh@GUNNAR> create table test (veld1 char(2000)) tablespace test pctfree 0;

Table created.

fdh@GUNNAR> insert into test select 'x' from all_objects where rownum <= 200;

200 rows created.

fdh@GUNNAR> commit;

Commit complete.

fdh@GUNNAR> select bytes, blocks, extents from user_segments where segment_name = 'TEST';

     BYTES     BLOCKS    EXTENTS
---------- ---------- ----------
    425984         52         26

fdh@GUNNAR> set serveroutput on
fdh@GUNNAR> -- set the arraysize to 1 to prevent oracle from "prefetching" records
fdh@GUNNAR> set arraysize 1
fdh@GUNNAR> var x refcursor
fdh@GUNNAR> BEGIN
  2      open :x for select veld1 from test;
  3  END;
  4  /

PL/SQL procedure successfully completed.

fdh@GUNNAR> DECLARE
  2      l_veld1   char(2000);
  3  BEGIN
  4      fetch :x into l_veld1;
  5      dbms_output.put_line('veld1: ' || trim(l_veld1));
  6  END;
  7  /
veld1: x

PL/SQL procedure successfully completed.


fdh@GUNNAR> truncate table test;

Table truncated.

fdh@GUNNAR> DECLARE
  2      l_veld1   char(2000);
  3      l_cnt     number(3,0)   := 0;
  4  BEGIN
  5      BEGIN
  6          loop
  7              fetch :x into l_veld1;
  8              exit when :x%NOTFOUND;
  9              l_cnt := l_cnt + 1;
 10          end loop;
 11      EXCEPTION
 12          when others then
 13              dbms_output.put_line(SQLERRM);
 14      END;
 15      dbms_output.put_line('fetched rows: ' || l_cnt);
 16  END;
 17  /
ORA-08103: object no longer exists
fetched rows: 75

PL/SQL procedure successfully completed.


My block size is 8K, so I could store 4 records per db block. Before the truncate I read 1 record, and after the truncate I could read 75 more records before getting the error. 76 records / 4 records per block / 2 blocks per extent = 9.5 extents. Remember that the first block is used for the segment header, so pointer 1 only points to an extent with 1 data block (meaning that the 10 pointers point to 19 blocks in total, or 19 * 4 = 76 records).
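
The arithmetic above can be double-checked with a quick Python sketch (the 10-pointer batch size comes from Tanel’s explanation; the block and extent geometry from my test setup):

```python
ROWS_PER_BLOCK = 4       # 8K blocks, char(2000) rows, pctfree 0
BLOCKS_PER_EXTENT = 2    # 16K uniform extents with an 8K block size
POINTER_BATCH = 10       # extent pointers read from the segment header at a time

# the first extent loses one block to the segment header,
# so the first pointer only covers 1 data block
data_blocks = (BLOCKS_PER_EXTENT - 1) + (POINTER_BATCH - 1) * BLOCKS_PER_EXTENT
rows = data_blocks * ROWS_PER_BLOCK

print(data_blocks, rows)  # 19 76: 1 row fetched before the truncate + 75 after
```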

This answers the question of when you receive the ORA-08103 error after a truncate, but it also means that your dbfmrc (db_file_multiblock_read_count) is limited by your extent size.
In my example the 10 pointers would reference 20 blocks (not taking the segment header into account), so a dbfmrc of more than 20 would not be possible for this table.

New question: why is a multiblock read not crossing extent boundaries?

20 August 2008

Just because it’s printed, doesn’t mean it’s true

Filed under: infrastructure,rant — Freek D'Hooge @ 0:55

That statement has often been written by Jonathan Lewis, and today I was reminded of how true it is.
I had a discussion today with two of my colleagues who wanted to increase the number of arch processes on a dataguard system. As justification they pointed to metalink note 468817.1 – “RFS: possible network disconnect while taking rman backup on primary site”, which makes the following statement:

“In a Data Guard Configuration, during Scheduled RMAN Backup no Redo is transported to the Standby Server as the ARCH Process is blocked (as expected ie. RMAN would utilize 1 ARCn Process and the other ARCn for local Archiving ) which means the Standby stays out of sync (assuming max_arch_processes=2) until the ArchiveLog is manually copied across and registered to the Standby Database after which the Standby Database resumes applying the ArchiveLogs.”

The first thing that drew my attention was the part about rman utilizing an ARCn process.
While the ARCn processes are indeed responsible for archiving the online redo log files, it is the session’s own server process that does this when issuing an “alter system archive log current” command. When rman forces a log file to be archived, the same thing happens. This can easily be verified by looking at the “creator” column in the v$archived_log view. A strace of the rman server process would also show that it is this process that reads the archived redo logs and streams them to the rman client to be written into a backup piece.

1 – 0 for me

The second thing I noticed was that, according to the note, the standby would remain out of sync until the archivelog was manually copied across and registered to the standby db.
Even if the rman process were using an ARCn process, leaving no process to copy the archivelog over (with max_arch_processes=2), the standby db would normally be able to pick up synchronization again after rman released the ARCn process.

2 – 0 for me
and end of discussion.

So, “just because it’s printed, doesn’t mean it’s true” also applies to metalink notes.
Luckily there is a feedback link at the end of the note, so I hope it will soon be removed or modified.

And yes, I can get a little bit competitive in discussions.
How did you guess?

31 July 2008


Filed under: personal — Freek D'Hooge @ 0:39

Karen Morton has written a post about a site which takes for example an rss feed and turns it into a word-cloud.
I just had to try it out, so here is the word cloud generated from the rss feed from my blog:

I like the fact that “backup” has such a prominent place…

21 July 2008


Filed under: Uncategorized — Freek D'Hooge @ 12:22

A while ago I had written a post asking whether I should release a package as open source or just provide a wrapped form of it. Since then, it became clear to me that I simply don’t have the time to provide proper support for the package.
That means I have two choices: either I never release the package to the public and only use it for my own purposes, or I publish the unwrapped code.

For me, option 1 is not really an option, so I’m proud to present to you: AnalyzeThis!

uhm, ok, fantastic, euhm… so what does it do?

Well, I have created a page with a little faq (which will have to act as documentation), but in a nutshell: it helps you to gather and manage the cbo statistics for your database by making it possible to store the dbms_stats option values you want to use in a table and to specify default options on several levels. It also automates the backup of statistics before overwriting them and logs information about the generation of new statistics (including timing information).

Currently you have two options with dbms_stats: either you let oracle determine which stats options to use for which object, or you specify separate options for each object yourself. There is nothing in between.
With AnalyzeThis you now have a third option: specify the objects (for which you can also use wildcards) that should use a certain parameter set, and let the other objects use the default parameters.

I don’t know if other people think there is a need for such a package, but I did. So here it is.
Don’t expect too much support for it though; remember, the main reason I’m publishing the code is that I don’t have the time to provide proper support.
I will still continue to develop this package further, so irregular updates and bugfixes will be made available for download in the future.

10 July 2008

Why is rman not removing my obsolete backups?

Filed under: backup — Freek D'Hooge @ 13:06

Last night I received a filesystem free space warning from one of our clients’ servers.
The offending filesystem was the one used to store the rman backup pieces before they are copied to tape via a filesystem backup, and a quick ls showed that there were more rman backups still on disk than expected.

Normally the old backups are removed from disk through the rman “delete obsolete recovery window of 2 days;” command at the end of the backup script, but there was still a backup of 3 days back on disk.
In the logfile the output of the delete obsolete command showed that there were no obsolete backups found, and “list backup” proved that the backup on disk was still known to rman.

After a close inspection of the logfiles, I think I have found the reason. It lies in the completion time of the last backupset containing datafiles compared to the time at which the “delete obsolete” command ran.

The script checked today at 04:21 for obsolete backups (with a recovery window of 2 days), but the backup of the datafiles of 08/07 only finished at 04:28. This means rman could not use that backup to satisfy the earliest necessary restore point of 08/07 04:21, and therefore the backup of 07/07 (and all archivelogs in between) was still needed for the recovery window of 2 days.

Normally this should not be a real problem, but you must keep it in mind when calculating how much disk space you need to store your rman backups on disk.
A possible solution is not to use the “recovery window” clause but rather the “redundancy” or the “completed before” clause (and then use a fixed starting point in the date calculation). Another option would be to remove the deletion from the backup script and put it in a separate maintenance script that runs at a fixed time (well after the backup finishes), so that the deletion is not affected by the backup duration.
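
To make the timing argument concrete, here is a small sketch in Python (the times come from the log excerpt above; the year is an assumption, since the post only gives day/month):

```python
from datetime import datetime, timedelta

# timestamps from the backup log (year assumed for illustration)
delete_obsolete_run = datetime(2008, 7, 10, 4, 21)  # "delete obsolete" ran today at 04:21
backup_0807_done    = datetime(2008, 7, 8, 4, 28)   # datafile backup of 08/07 finished at 04:28

# "recovery window of 2 days" means rman must be able to restore
# to any point within the last 2 days
earliest_restore_point = delete_obsolete_run - timedelta(days=2)

# the 08/07 backup completed *after* the earliest required restore point,
# so it cannot serve a restore to that point in time ...
assert backup_0807_done > earliest_restore_point
# ... which is why the 07/07 backup (and the archivelogs in between)
# was not considered obsolete
print(earliest_restore_point)
```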

For those interested, I have posted below an sql script that can be used to follow up on your backup sizes and durations:

set linesize 200

column bjob format a15
column recid format 99999
column start_time format a20
column handle format a55
column mb format 99G999D99

break on bjob skip 1 on recid on backup_type on start_time

compute sum of mb on bjob

with backup_set_details as
( select set_count, set_stamp, session_recid, session_stamp
from v$backup_set_details
group by set_count, set_stamp, session_recid, session_stamp
)
select rbjd.session_key ||',' || rbjd.session_stamp bjob, bs.recid recid, bs.backup_type,
to_char(bs.start_time, 'DD/MM/YYYY HH24:MI:SS') start_time,
bp.piece#, bp.bytes/1024/1024 mb, bp.compressed, bp.handle
from v$backup_set bs, v$backup_piece bp, backup_set_details bsd, v$rman_backup_job_details rbjd
where bs.set_stamp = bp.set_stamp
and bs.set_count = bp.set_count
and bs.set_stamp = bsd.set_stamp
and bs.set_count = bsd.set_count
and bsd.session_recid = rbjd.session_recid
and bsd.session_stamp = rbjd.session_stamp
and bp.status != 'D'
order by bs.start_time, bs.recid, bp.piece#;