I have a table similar to:
Date | Person | Distance |
---|---|---|
2022/01/01 | John | 15 |
2022/01/02 | John | 0 |
2022/01/03 | John | 0 |
2022/01/04 | John | 0 |
2022/01/05 | John | 19 |
2022/01/01 | Pete | 25 |
2022/01/02 | Pete | 12 |
2022/01/03 | Pete | 0 |
2022/01/04 | Pete | 0 |
2022/01/05 | Pete | 1 |
I want to find all persons who have a distance of 0 for 3 or more consecutive days. So in the above, it must return John and the count of the days with a zero distance. I.e.
Person | Consecutive Days with Zero |
---|---|
John | 3 |
I'm looking at something like this, but I think this might be way off:
Select Person, count(*),
(row_number() over (partition by Person, Date order by Person, Date))
from mytable
CodePudding user response:
Provided I understand your requirement you could, for your sample data, just calculate the difference in days of a windowed min/max date:
select distinct Person, Consecutive from (
select *, DateDiff(day,
Min(date) over(partition by person),
Max(date) over(partition by person)
) 1 Consecutive
from t
where distance = 0
)t
where Consecutive >= 3;
Example Fiddle
If you can have gaps in the dates you could try the following that only considers rows with 1 day between each date (and could probably be simplified):
with c as (
select *, Row_Number() over (partition by person order by date) rn,
DateDiff(day, Lag(date) over(partition by person order by date), date) c
from t
where distance = 0
), g as (
select Person, rn - Row_Number() over(partition by person, c order by date) grp
from c
)
select person, Count(*) 1 consecutive
from g
group by person, grp
having Count(*) >= 2;
CodePudding user response:
One option is to:
- transform your "Distance" values into a boolean, where distance of 0 becomes 1 and any other value becomes zero
- compute a running sum over your transformed "Distance" values in a window of three rows, using a frame specification clause
- filter out any "Person" value which has at least one sum of 3.
WITH cte AS (
SELECT *, SUM(CASE WHEN Distance = 0 THEN 1 ELSE 0 END) OVER(
PARTITION BY Person
ORDER BY Date_
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS window_of_3
FROM tab
)
SELECT DISTINCT Person
FROM cte
WHERE window_of_3 = 3
Check the demo here.
Note: This solution requires your table to have no missing dates. In case missing dates is a possible scenario, then it's necessary to add missing rows corresponding to the dates not found for each "Person" value, for this solution to work.