Home > Software design >  I want to delete duplication with condition from a table in PLSQL
I want to delete duplication with condition from a table in PLSQL

Time:09-28

I want to delete the dup lines using PLSQL. The sample of the table is below

Policy # Price Dealno for Loan # Price of Loan PersonID
123 10 Loan123 1,000 abc
123 10 Loan123 3,000 abc
456 10 Loan456 500 xyz
456 10 Loan456 500 null

As you can see, in the case of Policy #123, I try to get the line with the highest amount of Price of Loan. Which mean the Price of Loan for 3,000. For Policy #456, I want to get the one without null value.

Is there a way for me to achieve that in PLSQL.

Thank you

CodePudding user response:

This query identifies if a row is OK (rn = 1) or if is is a duplicated copy (rn > 1) based on your definition

select POLICY#, PRICE, LOAN#, PRICE_LOAN, PERSON_ID,
row_number() over (partition by POLICY# order by PRICE_LOAN desc, PERSON_ID nulls last) as rn
from tab
;


   POLICY#      PRICE LOAN#    PRICE_LOAN PER         RN
---------- ---------- -------- ---------- --- ----------
       123         10 loan123        3000 abc          1
       123         10 loan123        1000 abc          2
       456         10 loan4563        500 xyz          1
       456         10 loan4563        500              2

Note that you use row_number where you partition by on the unique key and order by so that you get first the row that should be taken.

So to get the duplicates only you use this query

with rn as (
select POLICY#, PRICE, LOAN#, PRICE_LOAN, PERSON_ID,
row_number() over (partition by POLICY# order by PRICE_LOAN desc, PERSON_ID nulls last) as rn
from tab
)
select * from rn where rn > 1;

   POLICY#      PRICE LOAN#    PRICE_LOAN PER         RN
---------- ---------- -------- ---------- --- ----------
       123         10 loan123        1000 abc          2
       456         10 loan4563        500              2

Based on this you write the DELETE statement (enclose in BEGIN ... END if you insist in PL/SQL)

delete from tab where rowid in
(
with rn as (
select POLICY#, PRICE, LOAN#, PRICE_LOAN, PERSON_ID,
row_number() over (partition by POLICY# order by PRICE_LOAN desc, PERSON_ID nulls last) as rn
from tab
)
select rowid from rn where rn > 1
);

You may check if the delete worked fine ....

select * from tab;

   POLICY#      PRICE LOAN#    PRICE_LOAN PER
---------- ---------- -------- ---------- ---
       123         10 loan123        3000 abc
       456         10 loan4563        500 xyz

... and commit

  • Related