I want to return the columns listed below but would like it grouped by JobNumber and only return the maximum StageNum so long as it has not skipped any stages.
For example, my table currently looks like this:
ID | Stage | StageNum | Date |
---|---|---|---|
104 | Released | 10 | 2022-02-07 |
104 | Slab | 20 | 2022-02-18 |
104 | Frame | 30 | 2022-03-07 |
104 | Mechanicals | 42 | 2022-03-10 |
105 | Released | 10 | 2022-02-07 |
105 | Slab | 20 | 2022-02-18 |
106 | Released | 10 | 2022-02-07 |
106 | Slab | 20 | 2022-02-18 |
106 | Frame | 30 | 2022-03-04 |
106 | Cornice | 40 | 2022-03-08 |
And the result should look like this with ID 104 returning StageNum 30 instead of 42 since StageNum 40 was skipped:
ID | Stage | StageNum | Date |
---|---|---|---|
104 | Frame | 30 | 2022-03-07 |
105 | Slab | 20 | 2022-02-18 |
106 | Cornice | 40 | 2022-03-08 |
I can return the highest value by grouping by ID and selecting MAX(StageNum) but do not know how to account for the skipped stages.
CodePudding user response:
Provided there exists the stages
table
select * into stages
from(
values
('Released', 10),
('Slab', 20),
('Frame', 30),
('Mechanicals', 42),
('Cornice', 40)
) t(Stage, StageNum)
You can compare positions requierd with the operations
select top(1) with ties ID, Stage, StageNum, Date
from (
select t.ID ,t.StageNum, t.Date, stn.Stage, stn.spos, row_number() over(partition by ID order by t.StageNum) tpos
from tbl t
join (
select StageNum, Stage, row_number() over(order by StageNum) spos
from stages
) stn on stn.StageNum = t.StageNum
) t
where tpos=spos
order by row_number() over(partition by ID order by StageNum desc)
CodePudding user response:
Here is one possibility without creating a table to records stages. We use string_agg()
and then check that all the stages are present. We could do this without a CTE but we would have to run string_agg
several times per row.
create table c( ID int, Stage varchar(20), StageNum int, sDate Date); insert into c values (104,'Released',10,'2022-02-07'), (104,'Slab', 20,'2022-02-18'), (104,'Frame', 30,'2022-03-07'), (104,'Mechanicals',42,'2022-03-10'), (105,'Released', 10,'2022-02-07'), (105,'Slab', 20,'2022-02-18'), (106,'Released',10,'2022-02-07'), (106,'Slab', 20,'2022-02-18'), (106,'Frame', 30,'2022-03-04'), (106,'Cornice', 40,'2022-03-08');
select * from c;
ID | Stage | StageNum | sDate --: | :---------- | -------: | :--------- 104 | Released | 10 | 2022-02-07 104 | Slab | 20 | 2022-02-18 104 | Frame | 30 | 2022-03-07 104 | Mechanicals | 42 | 2022-03-10 105 | Released | 10 | 2022-02-07 105 | Slab | 20 | 2022-02-18 106 | Released | 10 | 2022-02-07 106 | Slab | 20 | 2022-02-18 106 | Frame | 30 | 2022-03-04 106 | Cornice | 40 | 2022-03-08
select id, string_agg(stageNUM,'-') from c group by id;
id | (No column name) --: | :--------------- 104 | 10-20-30-42 105 | 10-20 106 | 10-20-30-40
with stages as( select id, string_agg(stageNUM,'') s from c group by id ) select id, case when s = '1020304042' then 'Mechanicals' when left(s,8) = '10203040' then 'Cornice' when left(s,6) = '102030' then 'Frame' when left(s,4) = '1020' then 'Slab' when left(s,2) = '10' then 'Released' else 'unknown' end Stage from stages
id | Stage --: | :------ 104 | Frame 105 | Slab 106 | Cornice
db<>fiddle here