I have a set of data example below:
ID | Role | START_DATE | END_DATE |
---|---|---|---|
1 | A | 2022-02-01 | 2022-02-03 |
1 | A | 2022-02-03 | 2022-02-10 |
1 | A | 2022-02-10 | 9000-12-31 |
2 | B | 2022-02-01 | 2022-02-03 |
2 | A | 2022-02-03 | 2022-02-04 |
2 | B | 2022-02-04 | 2022-02-11 |
I want to group them based on the ID, ROLE, START_DATE and END_DATE. So it looks like below: x
ID | Role | START_DATE | END_DATE |
---|---|---|---|
1 | A | 2022-02-01 | 9000-12-31 |
2 | B | 2022-02-01 | 2022-02-03 |
2 | A | 2022-02-03 | 2022-02-04 |
2 | B | 2022-02-04 | 2022-02-11 |
How can I achieve this?
Thanks
CodePudding user response:
Simple task for Teradata's NORMALIZE:
with cte as
( -- works on PERIODs only
select NORMALIZE -- combine overlapping periods
ID, Role, period(START_DATE, END_DATE) as pd
from mytable
)
-- split period back into start/end
select ID, Role, begin(pd) as START_DATE, end(pd) as END_DATE
from cte
CodePudding user response:
It seems like a gap and islands problem, you can try to use ROW_NUMBER
window function to get the gap of your expectation grouping.
SELECT ID,Role,MAX(START_DATE),MAX(END_DATE)
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY START_DATE) -
ROW_NUMBER() OVER(PARTITION BY ID,Role ORDER BY START_DATE) grp
FROM T
) t1
GROUP BY grp,ID,Role