I have a table like this:
id | section |
---|---|
1 | 6 |
2 | 6 |
3 | 7 |
4 | 7 |
5 | 6 |
and would like to obtain a grouped list that says
section | section_nr | first_id |
---|---|---|
6 | 1 | 1 |
7 | 2 | 3 |
6 | 3 | 5 |
Using ROW_NUMBER twice I am able to obtain something close:
SELECT section, ROW_NUMBER() OVER (ORDER BY id) AS section_nr, id as first_id
FROM (
SELECT id, section, ROW_NUMBER() OVER (PARTITION BY section ORDER BY id) AS nr_within
FROM X
)
WHERE nr_within = 1
section | section_nr | first_id |
---|---|---|
6 | 1 | 1 |
7 | 2 | 3 |
... but of course the second section 6 is missing, since PARTITION BY groups all section=6 together. Is it somehow possible to only group until the next section?
More generally (regarding GROUP BY instead of PARTITION BY), is there a simple solution to group (1,1,2,2,1) to (1,2,1) instead of (1,2)?
CodePudding user response:
This is a typical gaps and islands problem that can be solved like this:
with u as
(select id, section,
case when section = lag(section) over(order by id) then 0 else 1 end as grp
from X),
v as
(select id,
section,
sum(grp) over(order by id) as section_nr
from u)
select section,
section_nr,
min(id) as first_id
from v
group by section, section_nr;
Basically you keep tabs in a column where there is a change in section
by comparing current section
to section
from the row above (ordered by id). Whenever there is a change, set this column to 1, when no change set it to 0. The rolling sum of this column will be the section number. Getting first_id
is a simple matter of using group by
.
CodePudding user response:
That's a classic.
P.S.
If id
is indeed a series of integers without gaps, we can use it instead of rn
select section
,row_number() over (order by min(id)) as section_nr
,min(id) as first_id
from (select id
,section
,row_number() over (order by id) as rn
,row_number() over (partition by section order by id) as rn_section
from X
)
group by section
,rn - rn_section
SECTION | SECTION_NR | FIRST_ID |
---|---|---|
6 | 1 | 1 |
7 | 2 | 3 |
6 | 3 | 5 |