Oracle Database 12.2 PowerUp the DBMS_STAT package

Oracle Database 12 Release 2 makes it a lot easier to be able to manage this vast collection of scripts by includes a new DBMS_STATS preference called PREFERENCE_OVERRIDES_PARAMETER. When this preference is set to TRUE, it allows preference settings to override the parameter values specifically set in a DBMS_STATS command.

For example, if the global preference ESTIMATE_PERCENT is set to the default DBMS_STATS.AUTO_SAMPLE_SIZE, but a 10% sample size was specified in the in a DBMS_STATS.GATHER_TABLE_STATS command, setting PREFERENCE_OVERRIDES_PARAMETER to TRUE would automatically replace the 10% sample size with the default preference  DBMS_STATS.AUTO_SAMPLE_SIZE.

Let’s take a look at this in action. we have a SALES table with nearly 1 million rows in it.

15:17:49   2  FROM dual;

15-JUL-2017 15:17:49

Elapsed: 00:00:00.00
15:17:49 SQL>

15:02:18 SQL> SELECT COUNT(*)
  2  FROM sales;


15:02:20 SQL> BEGIN
  2  DBMS_STATS.GATHER_TABLE_STATS(ownname=>’SH’, tabname=>’SALES’, estimate_percent=>10);
  3  END;
  4  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:09.08

If we check the sample size used for gathering statistics, we see that only 10% of the rows were used as requested and that a FREQUENCY histogram has been created on the CUST_ID column of the SALES table.
15:03:38 SQL> DBMS_STATS.GET_PREFS(pname=>’ESTIMATE_PRECENT’ , ownname=>’SH’
SP2-0734: unknown command beginning “DBMS_STATS…” – rest of line ignored.
15:04:02 SQL>
15:04:02 SQL> SELECT TABLE_NAME, sample_size
15:04:19   2  FROM   user_tables
15:04:19   3  WHERE  TABLE_NAME= ‘SALES’;

TABLE_NAME                     SAMPLE_SIZE
—————————— ———–
SALES                                91619

Elapsed: 00:00:00.09
15:04:20 SQL> SELECT column_name, num_distinct, num_nulls, histogram
15:04:51   2  FROM   user_tab_col_statistics
15:04:51   3  WHERE  TABLE_NAME=’SALES’;

————– ————— ———— ———-
PROD_ID                  651       0         NONE
CUST_ID                  630       0         FREQUENCY
TIME_ID                  620       0         NONE
CHANNEL_ID               5         0         NONE
PROMO_ID                 116       0         NONE
QUANTITY_SOLD            44        0         NONE
AMOUNT_SOLD              583       0         NONE

7 rows selected.

Elapsed: 00:00:00.12

Unfortunately, these statistics are not that accurate as there are over 600 distinct PROD_IDs and CUST_IDs in the SALES table but the statistics gather missed this information because it looked at such a small sample.

15:05:31   2  FROM   sales;

———————-  ———————-
651                      630
In order to improve the accuracy of the statistics, we should use the recommended best practice value for the ESTIMATE_PERECENT parameter, AUTO_SAMPLE_SIZE. Let’s check the preference value for the ESTIMATE_PERECENT parameter on the SALES table.

15:05:34 SQL> SELECT
DBMS_STATS.GET_PREFS(pname=>’ESTIMATE_PRECENT’ , ownname=>’SH’, tabname=>’SALES’)
FROM Dual;


We see it’s actually set to DBMS_STATS.AUTO_SAMPLE_SIZE, which is good news for us.

Rather than having to find and edit all of the existing statistics gathering script that refer to the SALES table, we can set the DBMS_STATS preference PREFERENCE_OVERRIDES_PARAMETER to TRUE on the SALES table, which will automatically override all of the none default parameter values used in the DBMS_STATS command with the corresponding preferences values.

15:05:39 SQL> BEGIN
DBMS_STATS.SET_TABLE_PREFS(ownname=>’SH’, tabname=>’SALES’,
                           pname=>’PREFERENCE_OVERRIDES_PARAMETER’, pvalue=>’TRUE’);

PL/SQL PROCEDURE successfully completed.
So, in our case the ESTIMATE_PERCENT of 10% will actually be overridden and DBMS_STATS.AUTO_SAMPLE_SIZE will be used. Let’s see what impact that has on both the statistics gathering performance and the quality of the statistics gathered.

15:05:40 SQL> BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname=>’SH’, tabname=>’SALES’, estimate_percent=>10);

PL/SQL PROCEDURE successfully completed.
Elapsed: 00:00:06.59
FROM   user_tables

TABLE_NAME                      SAMPLE_SIZE
—————————— ———–
SALES                               918843

So, you can see that the time it took to gather statistics was actually less than the time for the 10% but the sample size was the total number of rows in the table. But what about the quality of the statistics?

15:06:04 SQL> SELECT column_name, num_distinct, num_nulls, histogram
FROM  user_tab_col_statistics

————– ————— ———— ———-
PROD_ID                  651       0         NONE
CUST_ID                  630       0         TOP-FREQUENCY
TIME_ID                  620       0         NONE
CHANNEL_ID               5         0         NONE
PROMO_ID                 116       0         NONE
QUANTITY_SOLD            44        0         NONE
AMOUNT_SOLD              583       0         NONE

As you can see, we have a much more accurate set of statistics this time with each of the number of distinct values being correct. You will also notice we got a new type of histogram a TOP-FREQUENCY histogram this time. That’s because this new type of histogram is only gathered if the ESTIMATE_PERCENT is set to the default.

Remember setting PREFERENCE_OVERRIDES_PARAMETER to TRUE will override all none default parameters set in the DBMS_STATS.GATHER_*_STATS command, so you may need to adjust the default preferences for some of the parameters, for example METHOD_OPT if you don’t want to revert to the default.

You can set a table preference using the DBMS_STATS.SET_TABLE_PREFS procedure.

15:06:10 SQL>BEGIN
DBMS_STATS.SET_TABLE_PREFS(ownname=>’SH’, tabname=>’SALES’, pname=>METHOD_OPT,
                           pvalue=>’FOR ALL COLUMNS SIZE 1 FOR COLUMNS CUST_ID SIZE 254′);

  • July 15, 2017 | 20 views
  • Comments