event_id | alert_id | alert_timestamp | value |
---|---|---|---|
1 | x | 2022-10-10 | -2 |
1 | x | 2022-10-10 | 4 |
1 | x | 2022-10-10 | 5 |
2 | z | 2022-09-02 | 3 |
I have a table with rows that match on event_id, alert_id and alert_timestamp. Within each set of rows that match on those three columns, I want to keep only the single row with the smallest value in the value column.
NOTE: this is just one example of matching rows. The real table contains many other groups of rows that match on event_id, alert_id and alert_timestamp.
Also, if a row is on its own, i.e. no other row matches it on event_id, alert_id and alert_timestamp, it should be kept as is.
CodePudding user response:
That's a typical top-1-per-group problem. In Redshift you can use row_number() to solve it:
select event_id, alert_id, alert_timestamp, value
from (
    select t.*,
        row_number() over(partition by event_id, alert_id, alert_timestamp order by value) rn
    from mytable t
) t
where rn = 1
Note that if you really had only 4 columns, then aggregation would be sufficient:
select event_id, alert_id, alert_timestamp, min(value) as value
from mytable
group by event_id, alert_id, alert_timestamp
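
To check the approach against the sample data from the question, here is a minimal sketch. The temp table name alerts_sample and the column types are assumptions made for illustration; they are not given in the question.

```sql
-- Hypothetical temp table mirroring the sample rows from the question.
-- Column types are assumed, not taken from the original table.
create temp table alerts_sample (
    event_id        int,
    alert_id        varchar(16),
    alert_timestamp date,
    value           int
);

insert into alerts_sample values
    (1, 'x', '2022-10-10', -2),
    (1, 'x', '2022-10-10', 4),
    (1, 'x', '2022-10-10', 5),
    (2, 'z', '2022-09-02', 3);

-- Keep one row per (event_id, alert_id, alert_timestamp): the one with the smallest value.
select event_id, alert_id, alert_timestamp, value
from (
    select t.*,
        row_number() over(partition by event_id, alert_id, alert_timestamp order by value) rn
    from alerts_sample t
) t
where rn = 1
order by event_id;
-- Returns:
-- 1 | x | 2022-10-10 | -2
-- 2 | z | 2022-09-02 | 3
```

The row for event_id 2 has no matching rows, so it gets rn = 1 and is kept as is. If two rows in a group tie on the smallest value, row_number() keeps only one of them, and which one is not deterministic unless more columns are added to the order by.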