Home > OS >  distinct multiple columns and then update
distinct multiple columns and then update

Time:09-30

I have this table CUSTOMER_TEST with id_1, id_2 can be duplicated. But this table must only has one pair id_1 and id_2. After appending new data, this table looks like this:

update_date id_1    id_2    score
13/09/2021  ABC123  ABC456  0.6
13/09/2022  DEF123  DEF456  0.4
13/09/2021  ABC123  ABC789  1.2
14/09/2021  ABC123  ABC456  1
14/09/2022  TEST123 TEST456 0.6 

I want to update CUSTOMER_TEST by keeping the latest update_date with the updated data. The desire result should be like this:

update_date id_1    id_2    score
13/09/2022  DEF123  DEF456  0.4
13/09/2021  ABC123  ABC789  1.2
14/09/2021  ABC123  ABC456  1
14/09/2022  TEST123 TEST456 0.6

I have tried these codes but don't know how to update the results on the current table. Because as what I read on Oracle Documentation, the UPDATE function need to specify SET columns

select max(update_date), id_1, id_2, score
from CUSTOMER_TEST
group by id_1, id_2

CodePudding user response:

Well, it's not that you'd update that table, but remove (i.e. delete) all rows that are "older" than the newest per (id_1, id_2) combination.

Before:

SQL> select * from customer_test order by update_date, id_1, id_2;

UPDATE_D ID_1    ID_2         SCORE
-------- ------- ------- ----------
13.09.21 ABC123  ABC456          ,6
13.09.21 ABC123  ABC789         1,2
14.09.21 ABC123  ABC456           1
13.09.22 DEF123  DEF456          ,4
14.09.22 TEST123 TEST456         ,6

Delete what you don't need:

SQL> delete from customer_test a
  2  where (id_1, id_2, update_date) in (select id_1, id_2, update_date
  3                                      from (select id_1, id_2, update_date,
  4                                              row_number() over (partition by id_1, id_2
  5                                                                 order by update_date desc) rn
  6                                            from customer_test
  7                                           )
  8                                      where rn > 1
  9                                     );

1 row deleted.

After:

SQL> select * from customer_test order by update_date, id_1, id_2;

UPDATE_D ID_1    ID_2         SCORE
-------- ------- ------- ----------
13.09.21 ABC123  ABC789         1,2
14.09.21 ABC123  ABC456           1
13.09.22 DEF123  DEF456          ,4
14.09.22 TEST123 TEST456         ,6

SQL>

CodePudding user response:

It seems you need delete rather than update. Use window functions to choose rows to delete, then delete them. In oracle it's easy using built-in metacolumn rowid.

delete from customer_test x
where x.rowid in ( 
  select i.rid from (
    select t.rowid as rid
         , row_number() over (partition by t.id_1, t.id_2 order by t.update_date desc) as rn
    from customer_test t
  ) i
  where i.rn > 1
)
  • Related