Say I have a table like
store | date | is_open |
---|---|---|
Bay | 1/1/2022 | true |
Bay | 1/2/2022 | true |
Bay | 1/3/2022 | true |
Bay | 1/4/2022 | false |
Bay | 1/5/2022 | false |
Bay | 1/6/2022 | false |
Bay | 1/7/2022 | true |
Bay | 1/8/2022 | true |
Bay | 1/9/2022 | true |
Walmart | 1/7/2022 | true |
Walmart | 1/8/2022 | false |
Walmart | 1/9/2022 | true |
I want them to use partition by and get the rank of the group such as
store | date | is_open | group |
---|---|---|---|
Bay | 1/1/2022 | true | 1 |
Bay | 1/2/2022 | true | 1 |
Bay | 1/3/2022 | true | 1 |
Bay | 1/4/2022 | false | 2 |
Bay | 1/5/2022 | false | 2 |
Bay | 1/6/2022 | false | 2 |
Bay | 1/7/2022 | true | 3 |
Bay | 1/8/2022 | true | 3 |
Bay | 1/9/2022 | true | 3 |
Walmart | 1/7/2022 | true | 1 |
Walmart | 1/8/2022 | false | 2 |
Walmart | 1/9/2022 | true | 3 |
I started trying partition by store
and is_open
but really confuse what to use in order by clause any help would be appreciated.
CodePudding user response:
This is actually a gaps and islands problem. One approach uses the difference in row numbers method:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY store ORDER BY date) rn1,
ROW_NUMBER() OVER (PARTITION BY store, is_open ORDER BY date) rn2
FROM yourTable t
),
cte2 AS (
SELECT t.*, MIN(date) OVER (PARTITION BY store, is_open, rn1 - rn2) AS min_date
FROM cte t
)
SELECT store, date, is_open,
DENSE_RANK() OVER (PARTITION BY store ORDER BY rn1 - rn2, min_date) "group"
FROM cte2
ORDER BY store, date;
Note carefully that we use a second CTE cte2
here to find the minimum date value for each island. This is being done to distinguish two islands from different is_open
values (true/false) which happen to have the same difference in row numbers. It ensures that in the case of tie for difference in row number, the earlier island is reported first.
Demo
CodePudding user response:
You can use LAG() to detect start of a group.
with cte AS (
SELECT t.*, case when lag(is_open) OVER (PARTITION BY store ORDER BY date) = is_open then 0 else 1 end sflag
FROM yourTable t
)
SELECT store, date, is_open, sum(sflag) over(PARTITION BY store ORDER BY date) grp
FROM cte
ORDER BY store, date;