Slow import of data using impdp for index-organized tables in Oracle: is there a way around it?


I have a table of around 50 GB that I want to move to another environment. The method we chose is to:

1. export the data from the source IOT table using expdp,
2. create a temporary IOT table without the secondary index, and
3. load it with impdp using the command below.

The issue I face is how slowly the data loads into this IOT table. I hear that parallelism does not work when importing into an IOT. Is there a way around this?

impdp user/pwd@server tables=DATE_ATTRIBUTE_VALUES0 directory=dp_dir \
  dumpfile=/ora_dump/DOUK5DAS/datapump/date_attrib_values_01.dmp,/ora_dump/DOUK5DAS/datapump/date_attrib_values_02.dmp,/ora_dump/DOUK5DAS/datapump/date_attrib_values_03.dmp,/ora_dump/DOUK5DAS/datapump/date_attrib_values_04.dmp,/ora_dump/DOUK5DAS/datapump/date_attrib_values_05.dmp \
  logfile=date_attrib_values_imp.log \
  REMAP_TABLE=server.DATE_ATTRIBUTE_VALUES0:T_DATE_ATTRIBUTE_VALUES0 \
  TABLE_EXISTS_ACTION=APPEND DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS \
  ACCESS_METHOD=AUTOMATIC METRICS=y LOGTIME=all CLUSTER=N PARALLEL=4 &

CodePudding user response:

You must use partitioning to enable parallel inserts into an index-organized table. Per the manual: "Direct-path INSERT into a single partition of an index-organized table (IOT), into a partitioned IOT with only one partition, or into an IOT that is not partitioned, will be done serially..."

Unfortunately, there are several other reasons why parallelism may not be fully enabled. For example, you might need to alter the table to NOLOGGING, remove triggers, remove referential constraints, etc. I recommend measuring the parallelism with straight SQL first, and only once that works, trying the export and import again. Also, impdp parallelism works best when the number of dump files matches the degree of parallelism, so with five files try PARALLEL=5 instead of 4.
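As a rough starting point, the checks below could be run against the staging table before testing a parallel insert. This is only a sketch: the table, trigger, and constraint names are placeholders, and whether NOLOGGING is acceptable depends on your backup and recovery requirements.

-- Placeholder staging table name; substitute your own objects.
alter table t_date_attribute_values0 nologging;

-- Triggers and foreign key constraints can prevent direct-path / parallel DML.
alter table t_date_attribute_values0 disable all triggers;
alter table t_date_attribute_values0 disable constraint fk_example;  -- hypothetical constraint name

-- Parallel DML must be enabled in the session that runs the test insert.
alter session enable parallel dml;

After the test, the triggers and constraints can be re-enabled with ENABLE ALL TRIGGERS and ENABLE CONSTRAINT, and the table switched back to LOGGING.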

The following is an example of a regular index-organized table using some parallelism for the read, but no parallelism for the write. Notice how the "PX" operations appear only below the LOAD AS SELECT, never above it. If you're using a recent version of Oracle, the "Note" section will tell you exactly why parallel DML didn't happen.

drop table date_attribute_values0;

create table date_attribute_values0
(
    id       number,
    the_date date,
    constraint pk_locations primary key (id)
)
organization index;

alter session enable parallel dml;

explain plan for
insert /*+ append parallel(5) */ into date_attribute_values0
select * from date_attribute_values0;

select * from table(dbms_xplan.display(format => 'basic  note'));


Plan hash value: 132732
 
-----------------------------------------------------------
| Id  | Operation                | Name                   |
-----------------------------------------------------------
|   0 | INSERT STATEMENT         |                        |
|   1 |  LOAD AS SELECT          | DATE_ATTRIBUTE_VALUES0 |
|   2 |   PX COORDINATOR         |                        |
|   3 |    PX SEND QC (RANDOM)   | :TQ10000               |
|   4 |     PX BLOCK ITERATOR    |                        |
|   5 |      INDEX FAST FULL SCAN| PK_LOCATIONS           |
-----------------------------------------------------------
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - Degree of Parallelism is 5 because of hint
   - PDML disabled because non partitioned or single fragment IOT used

The following example shows a partitioned index-organized table using full read and write parallelism. Notice the "PX" operations both above and below the LOAD AS SELECT.

drop table date_attribute_values2;

create table date_attribute_values2
(
    id       number,
    the_date date,
    constraint pk_locations2 primary key (id)
)
organization index 
partition by hash (id) partitions 5;


alter session enable parallel dml;

explain plan for
insert /*+ append parallel(5) */ into date_attribute_values2
select * from date_attribute_values2;

select * from table(dbms_xplan.display(format => 'basic  note'));

Plan hash value: 438294917
 
------------------------------------------------------------------------
| Id  | Operation                             | Name                   |
------------------------------------------------------------------------
|   0 | INSERT STATEMENT                      |                        |
|   1 |  PX COORDINATOR                       |                        |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10002               |
|   3 |    INDEX MAINTENANCE                  | DATE_ATTRIBUTE_VALUES2 |
|   4 |     PX RECEIVE                        |                        |
|   5 |      PX SEND RANGE                    | :TQ10001               |
|   6 |       LOAD AS SELECT (HIGH WATER MARK)| DATE_ATTRIBUTE_VALUES2 |
|   7 |        PX RECEIVE                     |                        |
|   8 |         PX SEND PARTITION (KEY)       | :TQ10000               |
|   9 |          PX PARTITION HASH ALL        |                        |
|  10 |           INDEX FULL SCAN             | PK_LOCATIONS2          |
------------------------------------------------------------------------
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - Degree of Parallelism is 5 because of hint

After you get the parallelism working in plain SQL, you'll still need to monitor the database carefully during the import. A visual tool like Oracle Enterprise Manager can help you confirm that multiple parallel processes are not just allocated but actually used.
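If you'd rather check from SQL than a GUI, the parallel execution views give a rough picture while the load runs. This is only a sketch; the columns available vary slightly by version.

-- Active parallel execution servers, grouped by their query coordinator session.
select qcsid, sid, degree, req_degree
from   v$px_session
order  by qcsid, sid;

-- Parallel query statistics for the current session (useful when testing the
-- straight-SQL insert interactively, not for watching the impdp session).
select statistic, last_query, session_total
from   v$pq_sesstat;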

CodePudding user response:

Performance may also improve if you can mark the table's secondary indexes unusable before the import and rebuild them afterwards:

alter index <<tbl_idx_1>> unusable;
alter index <<tbl_idx_2>> unusable;
...
alter index <<tbl_idx_n>> unusable;

... do the import ...

alter index <<tbl_idx_1>> rebuild;
alter index <<tbl_idx_2>> rebuild;
...
alter index <<tbl_idx_n>> rebuild;
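
If there are many secondary indexes, a query like the sketch below could generate those statements from the data dictionary. The owner and table name are placeholders, and the INDEX_TYPE filter assumes you want to skip the IOT's own primary key index, which cannot be marked unusable.

-- Generate ALTER INDEX ... UNUSABLE statements for a table's secondary indexes.
select 'alter index ' || owner || '.' || index_name || ' unusable;'
from   dba_indexes
where  table_owner = 'MY_SCHEMA'                 -- hypothetical owner
and    table_name  = 'T_DATE_ATTRIBUTE_VALUES0'  -- placeholder table name
and    index_type <> 'IOT - TOP';                -- skip the IOT primary key index

The same pattern with ' rebuild;' instead of ' unusable;' generates the rebuild statements for afterwards.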