Recently I was reading in Jonathan Lewis’ book “cost-based oracle fundamentals” ( I’m reading this small peaces at a time, otherwise my brain gets overloaded and my head feels like it’s going to explode) about histograms, when I got to page 168 where Jonathan shows the output of tests with frequenty based histograms. In these tests Jonathan modifys some CBO statistics by using the set_table_stats and the set_column_stats procedures of the dbms_stats package.

In the first test Jonathan decreases the num_rows of the table, resulting in a decrease of the cardinality in the execution plan. I was surprised by this result, as with a frequenty based histogram available, Oracle knows the exact number of records for the given predicate. Why then would changing the number of rows of the table change the cardinality? So, I did what I always do when I don’t understand something: I try to work out an example myself.

First I created a table with 10 different values, 1 of the values appearing in 50% of the records.

fdh@LOKI> drop table t; Table dropped. fdh@LOKI> create table t (veld1 number(10,0)); Table created. fdh@LOKI> insert into t 2 select mod(rownum, 9) + 1 3 from all_objects, all_objects 4 where rownum insert into t 2 select 10 3 from all_objects, all_objects 4 where rownum commit; Commit complete. fdh@LOKI> select veld1, count(*) counted 2 from t 3 group by veld1 4 order by veld1; VELD1 COUNTED ---------- ---------- 1 5555 2 5556 3 5556 4 5556 5 5556 6 5556 7 5555 8 5555 9 5555 10 50000 10 rows selected.

After creating the data I gathered the statistics, asking for a histogram on the field “VELD1” with 10 buckets ( which would generate a frequenty based histogram ).

fdh@LOKI> BEGIN 2 3 dbms_stats.gather_table_stats 4 ( ownname => NULL, 5 tabname => 'T', 6 method_opt => 'FOR COLUMNS VELD1 SIZE 10' 7 ); 8 9 END; 10 / PL/SQL procedure successfully completed. fdh@LOKI> select num_distinct, num_nulls, num_buckets, histogram, density from user_tab_col_statistics where table_name = 'T' and column_name = 'VELD1'; NUM_DISTINCT NUM_NULLS NUM_BUCKETS HISTOGRAM DENSITY ------------ ---------- ----------- --------------- ------------ 10 0 10 FREQUENCY ,000004918 fdh@LOKI> select endpoint_number, endpoint_value from user_tab_histograms where table_name = 'T' and column_name = 'VELD1'; ENDPOINT_NUMBER ENDPOINT_VALUE --------------- -------------- 314 1 635 2 943 3 1269 4 1573 5 1911 6 2200 7 2512 8 2802 9 5592 10 10 rows selected.

Apparently the CBO is thinking that the value 10 appeared 2.790 times (5592 – 2802), which is far less then the 50.000 I would have expected. Also the other values where much lower then the actual counts. The reason for this became clear after querying the user_tables view:

fdh@LOKI> select num_rows, sample_size from user_tables where table_name = 'T'; NUM_ROWS SAMPLE_SIZE ---------- ----------- 101673 5592

Oracle generated the statistics based upon a sample, not the entire table.

A quick check on the dbms_stats.gather_table_stats procedure indeed revealed that I had not included the estimate_percent parameter and that this parameter in 10g did not have the default value of “NULL” (compute) anymore. The values showed where the values occuring in the sample. It took me a while to realise that this was also the reason why lowering the num_rows would change the cardinality, but eventually it seeped into my numbed brain: oracle always extrapolates the values it finds in the histograms. It is just that when you don’t use a sample, the num_rows equals the sample_size, giving a fraction of 1. When you later on then lower the value of num_rows the fraction changes and thus does the cardinality. Therefor the cardinality for a predicate of the form column = constant (with a frequenty based histogram in place) would be:

row_count * (user_tables.num_rows / user_tables.sample_size)

where row_count is the number derived from the histogram.

A second example given by Jonathan however showed that the num_nulls also played a role in the calculation, changing the formula to:

row_count * ((user_tables.num_rows - user_tab_columns.num_nulls) / user_tables.sample_size)

But this striked me as odd. It would only make sense if the sample size did not include the record for which the value was NULL. Something which is not possible for the sample_size column in user_tables as the number of nulls would differ per column. A quick look on the columns of the user_tab_columns view revealed that there was indeed a sample_size column in this view as well ( the cbo can use a different sample size for the histograms then for the table). To test this further, I updated the table – setting veld1=NULL where veld1 = 1 – and gathered the statistics again (using compute this time).

fdh@LOKI> select sample_size, num_distinct, num_nulls, num_buckets, histogram, density from user_tab_columns where table_name = 'T' and column_name ='VELD1'; SAMPLE_SIZE NUM_DISTINCT NUM_NULLS NUM_BUCKETS HISTOGRAM DENSITY ----------- ------------ ---------- ----------- --------------- ------------- 94445 9 5555 9 FREQUENCY ,0000052941 fdh@LOKI> set autotrace traceonly explain fdh@LOKI> select veld1 from t where veld1 = 10; Execution Plan ---------------------------------------------------------- Plan hash value: 1601196873 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 50000 | 97K| 36 (3)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T | 50000 | 97K| 36 (3)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("VELD1"=10)

Which shows that the sample_size was indeed excluding the num_nulls and that the cardinality of the query would still be correct. So the formulla actually becomes:

row_count * ((user_tables.num_rows - user_tab_columns.num_nulls) / user_tab_columns.sample_size)

Although I now understand why changing the number of rows or the number of nulls will change the cardinality I’m left with the question of why oracle is not just storing the extrapolated numbers, instead of making the calculation each time it needs to now the cardinality? Maybe I’m overestimating the cost of this calculation, but it just seems to me as a more efficient thing to do.

Oh well, 1 question answered, 99 to go… .

Hmm, if you add more records into the table after stats were gathered, there is a somewhat imprecise but nevertheless sensible assumption that they would be distributed similarly to the existing records. So if you don’t store the extrapolated numbers but the actual sample results, and extrapolate on the fly, then the CBO can make sensible decisions even if the table has grown 100%.

Of course, regathering stats is always recommended when there’s dramatic change in data, but the method used by Oracle still strikes me as more robust since it can deal with such data change for a while with (hopefully) only slightly degraded accuracy. If it stored the extrapolated results for cardinality, even relatively small changes to the data would make those stored results significantly wrong.

Comment by Volker Kleinschmidt — 17 April 2008 @ 19:08 |

Volker,

But then how would oracle know that the table has grown?

If the dba_tables.num_rows is not changed, the extrapolated result will not change if the table grows (or shrinks)

The only reason i can think of why oracle stores the sample result and not the extrapolated numbers, is that it is now easier to go behinds the cbo’s back to change the expected cardinality (by changing the num_rows).

In theory you could thus gather histograms once and then just change the num_rows to cope with the growth of the table.

But I’m not sure if this is something I would advice.

Comment by dhoogfr — 17 April 2008 @ 22:51 |