Irrelevant thoughts of an oracle DBA

19 January 2009

Silent upgrade troubles

Filed under: bugs, infrastructure, upgrades / migrations — dhoogfr @ 1:08

Last week I was asked to write a little script to automate an upgrade of oracle client 9.2.0.1 to 9.2.0.8.
Reason for this was that we needed to update arround 1.300 clients to enable them to connect to a 10g database (we couldn’t install 10g clients because of other applications restricted the client version to 9i).

Ok, easy enough. Oracle allows you to automate installations and upgrades via response files and the response file for a client upgrade from 9.2.0.1 to 9.2.0.8 is very simple.
When I started testing the upgrade, I immediately spotted a first problem. The setup.exe (it was on windows xp) started a new console and then returned directly to the prompt in the original console. This would make it impossible to check the return codes to know if a upgrade was successful or not.

The upgrade itself finished without a problem, but at the end the following message appeared in the newly started console: “Press enter to exit”.
Huh!? This was supposed to be a “silent” install, meaning no interraction needed. But here it was, asking to press enter to exit.
And the documentation was not telling anything about it.
After some searching, I found that you can specify the “-noconsole” flag when starting the setup, which would surpress the new console and avoid the question to press enter.
You still would see the question in the logfiles, but the installation presumed you responded to it and finishes the upgrade.

This left me with the first problem: the prompt would still directly return while the upgrade was running in the background.
After some searching in the documentation I found a note stating that you need to modify the oraparam.ini file and change the BOOTSTRAP parameter from TRUE to FALSE.
Unfortunately this did not help. Yelling at it did either.

Then I found that in 10g, you had a “-waitforcompletion” flag you could set, that would do exactly what I needed. So I tried if it would work for the oui shipped in the 9.2.0.8 patchset.
At first, it didn’t, but then I found metalink note 293044.1 that said that the setup.exe in Disk1 and Disk1/install where not the same and that the one in Disk1/install should be used for the “-waitforcompletion” flag.
At last it worked.

For those interested, here is the full command I used to start the silent upgrade:

start /wait C:\oracle\patches\9.2.0.8\Disk1\install\setup.exe -silent -noconsole -waitforcompletion -responsefile c:\oracle\patches\9.2.0.8\patchset.rsp -paramfile c:\oracle\patches\9.2.0.8\oraparam.ini

—————————-

Thanks to Geert for the yelling link :)

10 January 2009

How to use the plan_table table to sabotage your oracle upgrade

Filed under: bugs, infrastructure, upgrades / migrations — dhoogfr @ 12:55

Lets say you need to upgrade your 9i database to 10g (10.2.0.4 to be exact), but you actually want to sabotage the upgrade (don’t know why, just assume you do).
Granted, there are many ways to do this, but you want to do it subtle. What are your options then?
Well, one option is to create the plan_table table in your sys schema (or a synonym plan_table to a plan table in another schema if you want to make it really subtle) before the upgrade.
If you do this, you will see the following message in your upgrade log:

Warning: Package Body created with compilation errors.

SQL> show errors;
Errors for PACKAGE BODY DBMS_SQLPA:

LINE/COL ERROR
-------- -----------------------------------------------------------------
113/5    PL/SQL: SQL Statement ignored
118/44   PL/SQL: ORA-00904: "OTHER_XML": invalid identifier

And the “oracle database server” component in the dba_registry will be marked as invalid.
Mission accomplished I would say.

What is that?
You regret your actions and you want to fix the problem?

*sigh*

Ok then, to fix it you can use the following steps:

  • drop the sys.plan_table table
  • drop the sys.plan_table$ table
  • drop all sys synonyms and public synonyms to the plan_table or the plan_table$
  • @?/rdbms/admin/catplan.sql — recreate the plan table
  • @?/rdbms/admin/dbmsxpln.sql — reload dbms_xplan spec
  • @?/rdbms/admin/prvtxpln.plb — reload dbms_xplan implementation
  • @?/rdbms/admin/prvtspao.plb — reload dbms_sqlpa

For those seeking more information:

Metalink note 565600.1 – ERROR IN CATUPGRD: ORA-00904 IN DBMS_SQLPA
Metalink note 605317.1 – DBMS_SQLPA ORA-00904 OTHER_XML invalid identifier

According to the notes, the problem only exists with upgrades to 10.2.0.4 or to 11.1.0.7.

ps. Don’t ask me why I had a synonym called plan_table in my sys schema. I didn’t do it.
pps. This is why you should test your migration (I’m glad I did)

7 December 2008

About datapump, SMO, autocompletion, P2V and opatch

Filed under: Uncategorized — dhoogfr @ 15:30

Long time since I blogged, but I have been busy with tinkering with Ubuntu, learning about Oracle streams, client projects and playing with my now 10 month old son (not necessarily in this order).

Still, to let you know that I am still alive: a short overview of things I learned that you might find useful:

Tanel Poder’s advanced troubleshooting course

If you still have the possibility to follow it, then do so. I really learned a lot during this course.
The word “advanced” in the course title is not soley there for marketing purposes
More information about the course contents can be found here

Datapump and lobs

Until 10.2.0.4 there is a bug in datampump with lobs, causing them to be slow on importing. This is due to bug 5555463, which is fixed in 10.2.0.4 (there is also an one-off patch available). Another work-arround is to enforce the access_method used by datapump, but this is unsupported.
Some articles about this that I found usefull:

Note:5555463.8 – Bug 5555463 – Slow performance of datapump IMPORT with small LOBs
Note:552424.1 – Export/Import DataPump Parameter ACCESS_METHOD – How to Enforce a Method of Loading and Unloading Data ?
Note:286496.1 – Export/Import DataPump Parameter TRACE – How to Diagnose Oracle Data Pump

Netapp Snapmanager for Oracle 3.0 has been released.

This release has some important new features, such as time based retention policies.
Don’t have the time to play with it myself, but one of my colleagues has started to test it, and is blogging about it: http://bjornnaessens.wordpress.com

Autocompletion for hostnames

http://geekozoid.blogspot.com/2008/06/auto-complete-hostnames-with-using-bash.html

Prevent windows from requiring to reactivate after a P2V

I created a vmware image from my laptop to run it under Ubuntu, unfortunately my laptop was installed with an OEM edition of XP, so it kept comlaining that I needed to reactivate the license again.
Luckally there is a vmware parameter you can use to avoid this: http://communities.vmware.com/thread/99057
(Thanks to Filiep Declerck for the link)

10.2.0.4, opatch, solaris and rac

With 10.2.0.4, opatch does not seems to update the $ORACLE_HOME/inventory/oneoffs/ and the $ORACLE_HOME/.patch_storage/ directories on the remote nodes when applying one-off patches. We have noticed this issue on several servers with Solaris 10 and oracle 10.2.0.4 (and opatch 10.2.0.4.3 and 10.2.0.4.5). Other combination might have the same problem.
The patch itself seems to be applied on the remote nodes, but because the directories are not copied a query (such as lsinventory) on the remote inventory will fail on the remote nodes (preventing installing other patches).
This issue is now tracked in bug 7611827. The workarround is to use the “-local” flag and install the patch node per node.

That’s all for now.

26 October 2008

Merge statements and sequences

Filed under: performance — dhoogfr @ 3:00

Here’s something I stumbled upon when I was researching a performance problem involving the oracle merge command.


fdh@GUNNAR> create table t_dest
  2  ( n1   number(10,0),
  3    n2   number(10,0),
  4    v1   varchar2(1)
  5  )
  6  /

Table created.

fdh@GUNNAR> create table t_source
  2  ( n1 number(10,0)
  3  )
  4  /

Table created.

fdh@GUNNAR> create sequence s_seq
  2  start with 1
  3  increment by 1
  4  nomaxvalue
  5  nocache
  6  nocycle
  7  noorder
  8  /

Sequence created.

fdh@GUNNAR> insert into t_dest
2 select rownum, s_seq.nextval, 'O'
3 from all_objects
4 where rownum <= 1000; 

1000 rows created. 

fdh@GUNNAR> insert into t_source
2 select rownum
3 from all_objects
4 where rownum <= 500; 

500 rows created. 

fdh@GUNNAR> commit; 

Commit complete.

fdh@GUNNAR> select last_number from user_sequences where sequence_name = 'S_SEQ';

LAST_NUMBER
-----------
       1001

fdh@GUNNAR> set timing on
fdh@GUNNAR> exec dbms_monitor.session_trace_enable(waits => TRUE, binds => TRUE);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
fdh@GUNNAR> merge into t_dest
  2  using t_source
  3  on ( t_dest.n1 = t_source.n1)
  4  when matched then update
  5  set v1 = 'U'
  6  when not matched then insert (n1, n2, v1)
  7  values (0, s_seq.nextval, 'N');

500 rows merged.

Elapsed: 00:00:03.68
fdh@GUNNAR> commit;

Commit complete.

Elapsed: 00:00:00.06
fdh@GUNNAR> exec dbms_monitor.session_trace_disable;

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
fdh@GUNNAR> select v1, count(*)
  2  from t_dest
  3  group by v1
  4  order by v1;

V   COUNT(*)
- ----------
O        500
U        500

Elapsed: 00:00:00.09

As you can see, the merge statement has updated 500 records and inserted no new records.
But check out the sequence:


fdh@GUNNAR> select last_number from user_sequences where sequence_name = 'S_SEQ';

LAST_NUMBER
-----------
       1501

Elapsed: 00:00:00.07

Apparently the sequence has increased with 500 (number of records from t_source), even with the “when not matched” clause never evaluating to true.
The trace output confirms this as well.


merge into t_dest
using t_source
on ( t_dest.n1 = t_source.n1)
when matched then update
set v1 = 'U'
when not matched then insert (n1, n2, v1)
values (0, s_seq.nextval, 'N')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          6          0           0
Execute      1      0.59       0.56          0          9       1014         500
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.60       0.58          0         15       1014         500

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 43

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  MERGE  T_DEST (cr=509 pr=0 pw=0 time=3478125 us)
    500   VIEW  (cr=509 pr=0 pw=0 time=3510161 us)
    500    SEQUENCE  S_SEQ (cr=509 pr=0 pw=0 time=3508659 us)
    500     HASH JOIN OUTER (cr=9 pr=0 pw=0 time=6366 us)
    500      TABLE ACCESS FULL T_SOURCE (cr=3 pr=0 pw=0 time=1042 us)
   1000      TABLE ACCESS FULL T_DEST (cr=6 pr=0 pw=0 time=1049 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  log file sync                                   2        0.01          0.01
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.01          0.01
********************************************************************************

update seq$ set increment$=:2,minvalue=:3,maxvalue=:4,cycle#=:5,order$=:6,
  cache=:7,highwater=:8,audit$=:9,flags=:10
where
 obj#=:1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      500      0.00       0.00          0          0          0           0
Execute    500      2.96       2.90          0        500       1013         500
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1000      2.96       2.91          0        500       1013         500

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  UPDATE  SEQ$ (cr=1 pr=0 pw=0 time=167 us)
      1   INDEX UNIQUE SCAN I_SEQ1 (cr=1 pr=0 pw=0 time=16 us)(object id 102)

********************************************************************************

Clearly something to remember when you are using a merge command with a sequence in the “when not matched” clause and a source table containing a couple of million rows. Certainly when you expect most records to be matched.
In such a case it could be better to move the sequence to a row trigger on the target table.

23 September 2008

Overshoot day

Filed under: personal — dhoogfr @ 21:59

Today, 23/09 is Earth Overshoot Day. That is the day we have used up more natural resources then the Earth can regenerate in one year.
Currently we are using 1.4 planets to support our lifestyle, resulting in a decrease of our supply of natural resources.

Maybe it’s not a bad idea to make a simular calculation for the resource usage on our databases and present the results to developers   :)

21 September 2008

Tanel Poder: Advanced Oracle Troubleshooting seminars

Filed under: Uncategorized — dhoogfr @ 1:12

Just a quick note to say I will be going to Tanel Poder’s Advanced Oracle Troubleshooting seminar in Düsseldorf on 6 and 7 November. :)

Anyone else going?

ORA-08103 revisisted

Filed under: Uncategorized — dhoogfr @ 0:39

In an earlier post I had already written about sessions receiving an ORA-08103 error after selecting from a table that had been truncated. At that time, I wrote the following conclusion:

“At this point I have no real explanation as for why the error occurs at this number of records fetched.
I suspect oracle of storing the locations of the blocks it has to scan during a full table in a kind of array with a limited length. After this array has been processed, Oracle would use the dataobj# value (and other keys) to get the next batch of block locations. At that moment the “object no longer exists” error would be thrown as the referenced dataobj# value no longer exists (it has been increased by 1).
But at this moment this is just pure speculation.”

Some weeks ago, I read an answer on the Oracle-L list from Tanel Poder to an ORA-08103 question, in which he stated that oracle would read 10 pointers to extents from the segment header at a time.

A quick test does seem to confirm this:

The TEST tablespace is a LMT with an uniform extent size of 16K

drop table test;

Table dropped.

fdh@GUNNAR&amp;amp;gt; create table test (veld1 char(2000)) tablespace test pctfree 0;

Table created.

fdh@GUNNAR&amp;amp;gt; insert into test select 'x' from all_objects where rownum &amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;= 200;

200 rows created.

fdh@GUNNAR&amp;amp;gt; commit;

Commit complete.

fdh@GUNNAR&amp;amp;gt; select bytes, blocks, extents from user_segments where segment_name = 'TEST';

BYTES     BLOCKS    EXTENTS
---------- ---------- ----------
425984         52         26

fdh@GUNNAR&amp;amp;gt; set serveroutput on
fdh@GUNNAR&amp;amp;gt; -- set the arraysize to 1 to prevent oracle from "prefetching" records
fdh@GUNNAR&amp;amp;gt; set arraysize 1
fdh@GUNNAR&amp;amp;gt; var x refcursor
fdh@GUNNAR&amp;amp;gt;
fdh@GUNNAR&amp;amp;gt; BEGIN
2
3      open :x for select veld1 from test;
4
5  END;
6  /

PL/SQL procedure successfully completed.

fdh@GUNNAR&amp;amp;gt;
fdh@GUNNAR&amp;amp;gt; DECLARE
2
3      l_veld1   char(2000);
4
5  BEGIN
6
7      fetch :x into l_veld1;
8      dbms_output.put_line('veld1: ' || trim(l_veld1));
9
10  END;
11  /
veld1: x

PL/SQL procedure successfully completed.

-- AT THIS MOMENT ANOTHER SESSION TRUNCATES THE TABLE

fdh@GUNNAR&amp;amp;gt;
fdh@GUNNAR&amp;amp;gt; DECLARE
2
3      l_veld1   char(2000);
4      l_cnt     number(3,0)   := 0;
5
6  BEGIN
7
8      BEGIN
9
10          loop
11
12              fetch :x into l_veld1;
13              exit when :x%NOTFOUND;
14              l_cnt := l_cnt + 1;
15
16          end loop;
17
18      EXCEPTION
19          when others then
20              dbms_output.put_line(SQLERRM);
21
22      END;
23
24      dbms_output.put_line('fetched rows: ' || l_cnt);
25
26  END;
27  /
ORA-08103: object no longer exists
fetched rows: 75

PL/SQL procedure successfully completed.

fdh@GUNNAR&amp;amp;gt;

My block size is 8K, so I could store 4 records per db block. Before the truncate I read 1 record and after the truncate I could read 75 records before getting the error. 76 records / 4 records per block / 2 blocks per extent = 9.5 extents. Remember that the first block is used for the segment header, so pointer 1 would only point to an extent with 1 data block ( meaning that my 10 pointers point in total to 19 blocks * 4 = 76 records).

This answers the question on when you would receive the ORA-08103 error after a truncate, but also means that your dbfmrc is limited by your extent size.
In my example the 10 pointers would reference 20 blocks (not taking in account the segment header), so a dbfmrc of more then 20 would not be possible for this table.

New question: Why is a multiblock read not crossing extent bounderies?

20 August 2008

Just because its printed, doesn’t mean its true

Filed under: infrastructure, rant — dhoogfr @ 0:55

That statement is often written by Jonathan Lewis and today I was reminded on how true it is.
I had a discussion today with two of my colleagues who wanted to increase the number of arch processes on a dataguard system. As reason they pointed to metalink note 468817.1 – “RFS: possible network disconnect while taking rman backup on primary site”, which makes the following statement:

“In a Data Guard Configuration, during Scheduled RMAN Backup no Redo is transported to the Standby Server as the ARCH Process is blocked (as expected ie. RMAN would utilize 1 ARCn Process and the other ARCn for local Archiving ) which means the Standby stays out of sync (assuming max_arch_processes=2) until the ArchiveLog is manually copied across and registered to the Standby Database after which the Standby Database resumes applying the ArchiveLogs.”

The first thing that drew my attention was the part about rman utilizing an ARCn process.
While the ARCn processes are indeed responsible for archiving the online redo log files, it is the sessions own server process that does this when issuing an “alter system archive log current” command. When rman forces a log file to be archived, the same thing happens. This can easily be verified by looking to the “creator” column in the v$archived_log view. A strace of the rman server process would also prove that it is this process which reads the archived redo logs and streams them to the rman client to be written in a backup piece.

1 – 0 for me

The second thing I noticed was that, according to the note, the standby would remain out of sync until the archivelog was manually copied across and registered to the standby db.
Even if the rman process was using an ARCn process, leaving no processes to copy the archivelog over (with max_arch_processes=2), the standby db would normally be able to pickup the synchronization again after rman would have released the ARCn process again.

2 – 0 for me
and end of discussion.

So, “just because its printed, doesn’t mean its true”, would also apply on metalink notes.
Luckily there is a feedback link at the end of the note, so I hope it will soon be removed or modified.

And yes, I can get a little bit competitive in discussions.
How did you guess?

31 July 2008

Wordle

Filed under: personal — dhoogfr @ 0:39

Karen Morton has written a post about a site which takes for example an rss feed and turns it into a word-cloud.
I just had to try it out, so here is the word cloud generated from the rss feed from my blog:

I like the fact that “backup” has such a prominent place…

21 July 2008

AnalyzeThis

Filed under: Uncategorized — dhoogfr @ 12:22

A while ago I had written a post with the question if I would release a package as open source or just provide a wrapped form of it. Since then, it became clear to me that I simply don’t have the time to provide proper support for the package.
Meaning I have two choises, either I never release the package to the public and only use it for my own purposes, or I publish the unwrapped code.

For me, option 1 is not realy an option, so I’m proud to present to you: AnalyzeThis !

uhm, ok, fantastic, euhm… so what does it do?

Well, I have created a page with a little faq (which will have to act as documentation), but in a nuttshell: It helps you to gather and manage the cbo statistics for your database by making it possible to store the dbms_stats option values you want to use into a table and to specify default options on several levels. I also automates the backup of statistics before overwriting them and logs information about the generation of new statistics (including timing information).

Currently you have two options with dbms_stats: either you let oracle determine which stats options to use for which object or you specify seperate options for each object yourself. There is nothing in between.
With AnalyzeThis you now have a third option: specify the objects (for which you can also use wildcards) for which you want to use a certain parameter set and let the other objects use the default parameters.

I don’t know if other people think there is a need for such a package, but I did. So here it is.
Don’t expect to much support for it though. Remember the main reason why I’m publising the code is because I don’t have the time to provide proper supprt.
I will still continue to further development this package, so irregular updates and bugfixes will be made available for download in the future.

Next Page »

Blog at WordPress.com.