Home > Blockchain >  Delete Duplicate in I$ table in ODI
Delete Duplicate in I$ table in ODI

Time:11-20

We have a load plan in ODI. We get a repeating error in some on our scenarios due to duplicate records in the I% table. What we do is manually run the script every time the load plan fails.

DELETE FROM adw12_dw. I$_1558911580_4
      WHERE     (EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID) IN
                    (  SELECT EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                         FROM adw12_dw . I$_1558911580_4
                     GROUP BY EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                       HAVING COUNT (1) > 1)
            AND ROWID NOT IN
                    (  SELECT MIN (ROWID)
                         FROM adw12_dw . I$_1558911580_4
                     GROUP BY EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                       HAVING COUNT (1) > 1)
                
                commit;

Is there a way to automate the deletion of duplicate records in the Integration table?

CodePudding user response:

If you have duplicates in the source, best would be to handle that in the logic of the mapping. What could work is to add an expression component to add a row_rank column using an analytical function to rank the duplicates : row_number() over (partition by EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID order by ROWID). You can then add a filter with the condition row_rank = 1.

If you prefer to do a delete after inserting, you can edit the IKM and add the delete step before loading the target table.

You could also divide the integration in 3 different steps :

  1. a mapping that would load a staging table instead of your final target table, with the duplicate
  2. an ODI procedure that would perform the delete to remove the duplicates in the staging table
  3. a mapping that would load the data from the staging area to the target table

CodePudding user response:

Maybe your duplicate counts over 2 because we must execute the delete query recursively. For example:

CREATE OR REPLACE PROCEDURE delete_dublicates
IS
BEGIN
    DELETE FROM TABLE1 WHERE ID IN 
        (
            SELECT max(ID) FROM TABLE1
            GROUP BY USER_ID, TYPE_ID 
            HAVING count(*) > 1
        );

    IF (SQL%ROWCOUNT > 0) THEN 
        delete_dublicates;
    END IF; 

END delete_dublicates;
  • Related