Home > Mobile >  Postgresql remove rows by duplicated column value
Postgresql remove rows by duplicated column value

Time:12-18

I'm using python pandas with Postgresql,and I have a table named stock:

        open    high    low close      volume   datetime
        383.97  384.22  383.66  384.08  1298649 2022-12-16 14:25:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.64  384.2099 383.54 383.61  1439271 2022-12-16 14:15:00

How can I remove the rows that have duplicated datetime ,and only keep 1 row of it,only keep the latest row of it.

The output should be:

        open    high    low close      volume   datetime
        383.97  384.22  383.66  384.08  1298649 2022-12-16 14:25:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.64  384.2099 383.54 383.61  1439271 2022-12-16 14:15:00

Something like:

delete from stock where datetime duplicated > 1

CodePudding user response:

Use drop_duplicates() from pandas:

# Remove duplicates and keep only the last row
df = df.drop_duplicates(subset='datetime', keep='last')
  • Related