Versioned Table has duplicate address records and needs to be deleted as below 1: Find duplicate records from a table as below Address
adr_id | ver_id | address
1 1 newYork
1 2 newYork
1 3 newYork
4 1 Washington
4 2 Washington
2: Insert new records as below
adr_id | ver_id | address
11 0 newYork
12 0 Washington
3: Delete existing duplicate records so final table would look like #2.
Note: Table has billions of records and required to be done in minimum time with best database technique.
CodePudding user response:
Use a MERGE
statement correlated on the ROWID
pseudo-column:
MERGE INTO table_name dst
USING (
SELECT ROWID AS rid,
COUNT(*) OVER (PARTITION BY adr_id) AS cnt,
ROW_NUMBER() OVER (PARTITION BY adr_id ORDER BY ver_id) AS rn
FROM table_name
) src
ON (src.cnt > 1 AND dst.ROWID = src.rid)
WHEN MATCHED THEN
UPDATE
SET adr_id = YOUR_ADR_ID_SEQUENCE.NEXTVAL,
ver_id = 0
DELETE WHERE rn > 1;
Which, for the sample data:
CREATE SEQUENCE your_adr_id_sequence START WITH 11;
CREATE TABLE table_name (adr_id, ver_id, address) AS
SELECT 1, 1, 'newYork' FROM DUAL UNION ALL
SELECT 1, 2, 'newYork' FROM DUAL UNION ALL
SELECT 1, 3, 'newYork' FROM DUAL UNION ALL
SELECT 4, 1, 'Washington' FROM DUAL UNION ALL
SELECT 4, 2, 'Washington' FROM DUAL;
Then, after the MERGE
the table contains:
ADR_ID | VER_ID | ADDRESS |
---|---|---|
11 | 0 | newYork |
14 | 0 | Washington |