Home > Blockchain >  How to return the highest value in a sequence for each ID unless one has been skipped, with SQL?
How to return the highest value in a sequence for each ID unless one has been skipped, with SQL?

Time:03-12

I want to return the columns listed below but would like it grouped by JobNumber and only return the maximum StageNum so long as it has not skipped any stages.

For example, my table currently looks like this:

ID Stage StageNum Date
104 Released 10 2022-02-07
104 Slab 20 2022-02-18
104 Frame 30 2022-03-07
104 Mechanicals 42 2022-03-10
105 Released 10 2022-02-07
105 Slab 20 2022-02-18
106 Released 10 2022-02-07
106 Slab 20 2022-02-18
106 Frame 30 2022-03-04
106 Cornice 40 2022-03-08

And the result should look like this with ID 104 returning StageNum 30 instead of 42 since StageNum 40 was skipped:

ID Stage StageNum Date
104 Frame 30 2022-03-07
105 Slab 20 2022-02-18
106 Cornice 40 2022-03-08

I can return the highest value by grouping by ID and selecting MAX(StageNum) but do not know how to account for the skipped stages.

CodePudding user response:

Provided there exists the stages table

select * into stages
from(
  values
   ('Released', 10),
   ('Slab',     20),
   ('Frame',    30),
   ('Mechanicals', 42),
   ('Cornice',     40)
) t(Stage,  StageNum)

You can compare positions requierd with the operations

select top(1) with ties ID, Stage, StageNum, Date
from (
  select t.ID ,t.StageNum, t.Date, stn.Stage, stn.spos, row_number() over(partition by ID order by t.StageNum) tpos
  from tbl t
  join (
     select StageNum, Stage, row_number() over(order by StageNum) spos
     from stages
  ) stn on stn.StageNum = t.StageNum
) t
where tpos=spos
order by row_number() over(partition by ID order by StageNum desc)

CodePudding user response:

Here is one possibility without creating a table to records stages. We use string_agg() and then check that all the stages are present. We could do this without a CTE but we would have to run string_agg several times per row.

create table c(
ID int,
Stage varchar(20),
StageNum int,
sDate Date);
insert into c values
(104,'Released',10,'2022-02-07'),
(104,'Slab',  20,'2022-02-18'),
(104,'Frame', 30,'2022-03-07'),
(104,'Mechanicals',42,'2022-03-10'),
(105,'Released',  10,'2022-02-07'),
(105,'Slab',  20,'2022-02-18'),
(106,'Released',10,'2022-02-07'),
(106,'Slab',  20,'2022-02-18'),
(106,'Frame', 30,'2022-03-04'),
(106,'Cornice',   40,'2022-03-08');
select * from c;
 ID | Stage       | StageNum | sDate     
--: | :---------- | -------: | :---------
104 | Released    |       10 | 2022-02-07
104 | Slab        |       20 | 2022-02-18
104 | Frame       |       30 | 2022-03-07
104 | Mechanicals |       42 | 2022-03-10
105 | Released    |       10 | 2022-02-07
105 | Slab        |       20 | 2022-02-18
106 | Released    |       10 | 2022-02-07
106 | Slab        |       20 | 2022-02-18
106 | Frame       |       30 | 2022-03-04
106 | Cornice     |       40 | 2022-03-08
select
  id,
  string_agg(stageNUM,'-')
from c
group by id;
 id | (No column name)
--: | :---------------
104 | 10-20-30-42     
105 | 10-20           
106 | 10-20-30-40     
with stages as(
select
  id,
  string_agg(stageNUM,'') s
from c
group by id
)
select
  id,
  case when s = '1020304042'
    then 'Mechanicals'
    when left(s,8) = '10203040'
    then 'Cornice'
    when left(s,6) = '102030'
    then 'Frame'
    when left(s,4) = '1020'
    then 'Slab'
    when left(s,2) = '10'
    then 'Released'
    else 'unknown' end Stage
from stages
 id | Stage  
--: | :------
104 | Frame  
105 | Slab   
106 | Cornice

db<>fiddle here

  • Related