Morten Jensen's Blog: Thoughts on Optimizing the Database Gather Statistics Job Run Duration

I came across a couple of 11.2 database, which never had the GATHER_DATABASE_STATS job (AUTOTASK or DBMS_SCHEDULER) complete within the allotted maintenance windows - even with incremental global statistics enabled on partitioned tables. Yes, one of the databases did not have its database statistics job upgraded from DBMS_SCHEDULER to AUTOTASK as part of the 10.2->11.1->11.2 upgrade the had previously occurred.

After spending some time on finding the root cause and trying out various remedial options I came to the following conclusions:

1) Per default no parallelism is used for the SQL Analyze queries running as part of the statistics gathering job running within the maintenance window. In fact, NO_PARALLEL and NO_PARALLEL_INDEX hints are widely issued by the job - you can check SQL Monitor while the job is running to observe this. However, if you were to run DBMS_STATS.GATHER_DATABASE_STATS(OPTIONS=>'GATHER AUTO') manually in a session, the system parameters for parallelism will be "inherited" by this job, i.e. parallelism may be used. So be very careful with parameters like parallel_threads_per_cpu, parallel_servers_target and parallel_max_servers - as a manual run of the job could end up "killing" database performance for other sessions by starving other sessions for very costly I/O. I strongly recommend that you do a few trial runs starting with low values for these parameters and monitor the database - you can always cancel the job and try with new parameters at any time.

2) The parameter db_file_multiblock_read_count can have a big impact on the efficiency of GATHER_DATABASE_STATS on serialized (non-parallel) maintenance window job execution. The databases in this case had db_file_multiblock_read_count=8 specified (db_block_size=8). This resulted in 2 things: Average I/O size never breached 64KB on table/partition scans/samples (potentially inefficient with RAID stripe sizes of 1MB); and the sample INSERTs into temporary tables from table/partitions samples (INSERT /*+ append */ INTO sys.ora_temp_1_ds_ SELECT ... FROM ... SAMPLE () for some reason never had I/O sizes different from 8KB (block size). When setting db_file_multiblock_read_count=0 (or removing from (s)pfile) the parameter will be auto-tuned but generally defaults to 128 blocks depending on various factors documented in other blogs. As a result the partition/table scans may see I/O sizes up to 128 blocks (1MB at 8KB block sizes) at a time - and the INSERT ... SELECT operations also move towards avg. 1MB I/O sizes. I have noticed a factor of 2-4 improvement in overall performance (time for job to finish) as a result of less I/O requests (but not less overall I/O size in terms of bytes) in the case of these 2 databases.

This improvement is a function of more efficient I/O - likely due to much fewer round trips (and therefore seeks/reads) as a result of better alignment towards RAID array stripe size.

So the conclusion is that there are ways to improve GATHER_DATABASE_STATS job execution, depending on the environment, regardless of whether it's run manually or through the maintenance window.

Morten Jensen's Blog

26 October 2011

Thoughts on Optimizing the Database Gather Statistics Job Run Duration

No comments: