I am stuck with a weird problem of handling data sequences. My source data looks like -
Roll-on, Marker
1,1
2,0
3,0
5,1
8,1
9,0
10,1
the marker column can only have two values, 1 and 0
if the roll no column is in a sequence, the marker value of 1 indicates the start of the sequence and all the remaining roll no will have marker value 0 within that sequence. So for roll no sequence 1-3, marker value is 1 for roll no 1 and 0 for the rest. However, if roll no doesn't fall into a sequence(as in roll no 8), the marker value is 1. From this data I need to create an output as follows -
Roll range
1
2
3
1-3
5
5-5
8
9
10
8-10
Meaning -
- display the roll no in sequence as in the input
- after each sequence ends, display a new record containing the start and end roll no of the proceeding sequence
How is this possible?
Thanks in advance for help.
CodePudding user response:
It seems like an island and gap problem.
If I understand correctly, we can try to use SUM
window function with conditions to make it.
Generator a gap of row number then getting min
and max
group by
SELECT CONCAT(MIN(Roll),'-',MAX(Roll))
FROM (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) grp
FROM T
) t1
GROUP BY grp
as I comment I am not sure about the logic of 8-10
(why isn't 8-9
and 10-10
) from your expect result and columns description, I think we can try to judge Max
of Roll then do some arithmetic.
SELECT CONCAT(MIN(Roll),'-',MAX(Roll))
FROM (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) IIF(MAX(Roll) OVER() = Roll, - Marker,0) grp
FROM T
) t1
GROUP BY grp
so that the final query combines result set we can use UNION ALL
;WITH CTE AS (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) IIF(MAX(Roll) OVER() = Roll, - Marker,0) grp
FROM T
)
SELECT [Roll range]
FROM (
SELECT CONCAT(MIN(Roll),'-',MAX(Roll)) 'Roll range',MAX(Roll) seq
FROM CTE t1
GROUP BY grp
UNION ALL
SELECT CAST(Roll AS VARCHAR(5)),Roll
FROM CTE t1
) t1
ORDER BY seq
CodePudding user response:
SELECT
CASE WHEN a=2 AND CHARINDEX('-',R)=0 THEN CONCAT(R,'-',R) ELSE R END as R,
R2,
a
FROM (
SELECT
1 as a,
CONVERT(VARCHAR(3), Roll) R,
Roll as R2
FROM table1
UNION ALL
SELECT
2,
STRING_AGG(Roll,'-') R,
MAX(Roll) as R2
FROM (
SELECT
Roll,
SUM(Marker) OVER (ORDER BY Roll) S
FROM
table1
) x
GROUP BY S
) x
ORDER BY R2,a
output:
R | R2 | a |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
1-2-3 | 3 | 2 |
5 | 5 | 1 |
5-5 | 5 | 2 |
8 | 8 | 1 |
9 | 9 | 1 |
8-9 | 9 | 2 |
10 | 10 | 1 |
10-10 | 10 | 2 |
- Columns
R2
anda
are added for correct sorting. - I group
8-9
and10-10
, but this question is still open, see comment