Home > Software engineering >  Sql query to remove duplicates depending on two particular columns
Sql query to remove duplicates depending on two particular columns

Time:09-03

I am having a table called BUS_DATA that returns below rows when running below query

query select * from BUS_DATA

enter image description here

I then run below query to get info only for the bus with id 4-3323309834

select * from BUS_DATA where busId = '4-3323309834'

Below are the rows that i get when i run the above query

enter image description here

What i would like to achieve is to only return the row with the maximum currentSpeed for that particular dataDateTime like below so that i don't get two have two records for a particular day.

enter image description here

I have tried below query but its still returning two records for each date

select busId, dataDateTime, max(currentSpeed) as CURRENTSPEED from `BUS_DATA` WHERE busId = '4-3323309834' group by busId, dataDateTime

I would also like the query to return other columns like passengersNo and speedLimit which the above query is not returning

CodePudding user response:

You could probably use a partition to achieve this. Group by the BusID and the Date, but your date is date time, so you would want to cast or convert to just a date so the 27th all group together, etc... Then you could order by the Current speed by desc putting the fastest at the top.

SELECT
X.*
FROM
(
SELECT
BusID
, BusName
, CurrentSpeed
, PassengersNo
, SpeedLimit
, dataDateTime
, ROW_NUMBER() OVER (PARTITION BY CAST(dataDateTime AS DATE), BusId ORDER BY CurrentSpeed DESC) AS RowNumOrder
FROM BUS_DATA 
WHERE busId = '4-3323309834'
) AS X 
WHERE X.RowNumOrder = 1
  • Related