I'm trying to take a column of dates and create a new table that includes the start and end date of each continuous date range. For example (date format being mm/dd/yyyy):
id | date |
---|---|
2 | 01/02/2022 |
5 | 01/03/2022 |
5 | 01/04/2022 |
5 | 01/05/2022 |
6 | 01/02/2022 |
6 | 01/04/2022 |
6 | 01/05/2022 |
would create the following table:
id | start | end |
---|---|---|
2 | 01/02/2022 | 01/02/2022 |
5 | 01/03/2022 | 01/01/2022 |
6 | 01/02/2022 | 01/02/2022 |
6 | 01/04/2022 | 01/05/2022 |
CodePudding user response:
Use below
select id, min(cur_date) start, max(cur_date) as `end`
from (
select *, countif(date_diff(next_date, cur_date, day) != 1) over win as grp
from (
select id, date, cur_date,
ifnull(lead(cur_date) over(partition by id order by cur_date), cur_date) next_date
from your_table, unnest([struct(parse_date('%m/%d/%Y', date) as cur_date)])
)
window win as (partition by id order by cur_date rows between unbounded preceding and 1 preceding)
)
group by id, grp
if applied to sample data in your question - output is