Home > database >  Sql query to remove duplicate entries after sync from data source
Sql query to remove duplicate entries after sync from data source

Time:06-13

I am currently creating a service that fetches data from a public data source every 24 hours. The data is essentially structured as thus:

a b c ImportDate
1 2 3 12.06.22
2 3 3 12.06.22
1 2 3 11.06.22

Where I want to have only unique values (ignoring the importdate), i.e something like this.

a b c ImportDate
1 2 3 12.06.22
2 3 3 12.06.22

Where we remove the old duplicate value.

What would be the best way to approach this to ensure no data is actually lost, only the duplicate values.

Thanks in advance!

CodePudding user response:

I think it's easiest to first import the new data in a separate table with the same Structure, let's say Import, and then merge it into YourTable.

The merge statement lets you query a piece of data, and match it against an existing table. In a single statement you can choose to update (or skip) existing rows, and insert new rows.

merge into YourTable t
using
  (select * from Import) i
on (i.a = t.a and i.b = t.b and i.c = t.c) -- Or just the columns you want to match
when matched then
  update set t.ImportDate = i.ImportDate -- add any other columns you want to update
when not matched then
  insert (a, b, c, ImportDate)
  values (i.a, i.b, i.c, i.ImportDate);

CodePudding user response:

After fetching all the data, in a second step I would do something like select a, b, c, max(ImportDate) as lastDate from source group by a, b, c, that should keep all the values with the last imported date.

CodePudding user response:

You can use the row_number() window function, the function will create partitions in the values that you don't want to repeat, after all you can use the where clause to filter only the first ocurrencies.

select a, b, c, importDate
from (
  select a, b, c, importDate,
  row_number() over(partition by a,b,c order by a desc) rn
  from example
  ) a
where rn =1;

here is the example: https://www.db-fiddle.com/f/3iryppZrysgCPkRVjpCKyM/0

  • Related