I've data like below:
ID Task Time
1 X started T1
2 X ended T2
3 X started T3 [wrong entry in data]
4 X started T4
5 X ended T5
6 Y started T6 [wrong entry in data]
7 Y started T7
8 Y ended T8
And, I need to get the data from above in started/ended fashion, but in case of wrong entry I need to pickup the latest one [as T4>T3 and T7>T6]. How can I write SQL on above dataset to get below result ?
ID Task Time
1 X started T1
2 X ended T2
4 X started T4
5 X ended T5
7 Y started T7
8 Y ended T8
CodePudding user response:
use a subquery in the where to match the ID, the subquery returns the ID of the row that matches the task (in the primary query), if there is more than one row with the same task value, it will return the row with the greatest id value.
SELECT * FROM `mytable` p1
where id = (select id from `mytable` p2
where p2.task= p1.task
order by id DESC limit 1);
CodePudding user response:
You may use the difference of two row_number()
s to define unique groups for the consecutive duplicate values of Task column, then get the max of Time and ID columns, try the following:
select max(ID), Task, max(Time) Time
from
(
select *,
row_number() over (order by Time) -
row_number() over (partition by Task order by Time) grp
from table_name
) T
group by Task, grp
order by max(Time)
See a demo.
For MySQL 5.7 try the following:
set @gr=0;
set @ts=null;
select max(ID), Task, max(Time)
Time
from
(
select *,
if(@ts<>Task, @gr:=@gr 1,
@gr) grp,
@ts:=Task
from table_name
order by Time
) T
group by Task, grp
order by max(Time)
Demo.