With Oracle Database 18c, we have a new parameter which can change the execution plan of parallel queries which consist of partitioned tables. There is not too much information about this parameter yet but based on my experiments I have observed its behaviour. Actually this parameter flags if our table is "SKEWED" or "NOT SKEWED"
Let's create a monthly partition table and insert some data;
CREATE TABLE part_skew_table
( prod_id NUMBER(6)
, cust_id NUMBER
, time_id DATE
, channel_id CHAR(1)
, promo_id NUMBER(6)
, quantity_sold NUMBER(3)
, amount_sold NUMBER(10,2)
)
PARTITION BY RANGE (time_id)
INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))
(PARTITION sales_q1_2017 VALUES LESS THAN (TO_DATE('01-FEB-2017','dd-MON-yyyy')));
for rec in 1 .. 25 loop
insert into part_skew_table
values (10,110,TO_DATE(TRUNC(DBMS_RANDOM.VALUE(2457755+00, 2457755 + 270)), 'J'),'S',910,140, 1000);
insert into part_skew_table
values (10,110,TO_DATE(TRUNC(DBMS_RANDOM.VALUE(2457755+270, 2457755 + 360)), 'J'),'S',910,140, 1000);
insert into part_skew_table
values (10,110,TO_DATE(TRUNC(DBMS_RANDOM.VALUE(2457755+180, 2457755 + 270)), 'J'),'S',910,140, 1000);
insert into part_skew_table
values (10,110,TO_DATE(TRUNC(DBMS_RANDOM.VALUE(2457755+00, 2457755 + 180)), 'J'),'S',910,140, 1000);
end loop;
end;
analyze table part_skew_table compute statistics;
select table_name, partition_name, num_rows
from user_tab_partitions
where table_name = 'PART_SKEW_TABLE'
order by 3 desc;
TABLE_NAME PARTITION_NAME NUM_ROWS
---------- -------------- --------PART_SKEW_TABLE SYS_P479 16
PART_SKEW_TABLE SYS_P482 14
PART_SKEW_TABLE SYS_P481 12
PART_SKEW_TABLE SYS_P477 11
PART_SKEW_TABLE SYS_P478 9
PART_SKEW_TABLE SALES_Q1_2017 8
PART_SKEW_TABLE SYS_P484 8
PART_SKEW_TABLE SYS_P480 5
PART_SKEW_TABLE SYS_P487 5
PART_SKEW_TABLE SYS_P485 5
PART_SKEW_TABLE SYS_P486 4
PART_SKEW_TABLE SYS_P483 3
How does it work?
The mechanism is only relevant when the number of partitions is at least twice the DOP you choose.
If the DOP (parallelism) is N, then examine the N-1 largest partitions, sum their sizes, and express this as a percentage of the size of the whole table, call it p%
- number of partitions > 2xDOP
- if p% is greater than "_px_partition_skew_threshold" a plan that could be executed as a hash distribution plan
- if p% is less than "_px_partition_skew_threshold" then the optimizer will choose the partition-wise plan.
Let's check the following query:
select /*+ parallel(3)*/time_id,count(*) from part_skew_ group by time_id;
DOP (N) => 3
N-1 largest partitions => SYS_P479, SYS_P482
p% => ((16+14)/100)*100 = %30
1st Case:
p% =30 > _px_partition_skew_threshold =29
alter session set "_px_partition_skew_threshold" = 29;
explain plan for
select /*+ parallel(3)*/time_id,count(*) from part_skew_table group by time_id;
select * from table(dbms_xplan.display('plan_table',null,'basic,predicate'));
Plan hash value: 1775209382
----------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 |
| 3 | HASH GROUP BY | |
| 4 | PX RECEIVE | |
| 5 | PX SEND HASH | :TQ10000 --------------------> PX HASH
| 6 | HASH GROUP BY | |
| 7 | PX BLOCK ITERATOR | |
| 8 | TABLE ACCESS FULL| PART_SKEW_TABLE |
----------------------------------------------------
- Degree of Parallelism is 3 because of hint
2nd Case:
p% =30 < _px_partition_skew_threshold =31
alter session set "_px_partition_skew_threshold" = 31;
explain plan for
select /*+ parallel(3)*/time_id,count(*) from part_skew_table group by time_id;
select * from table(dbms_xplan.display('plan_table',null,'basic,predicate'));
Plan hash value: 1410516856
----------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 |
| 3 | PX PARTITION RANGE ALL|--------------------------------> PX PARTITION
| 4 | HASH GROUP BY | |
| 5 | TABLE ACCESS FULL | PART_SKEW_TABLE |
----------------------------------------------------
- Degree of Parallelism is 3 because of hint