Home > Blockchain >  Find repeating values of a certain value
Find repeating values of a certain value

Time:08-24

I have a table similar to:

Date Person Distance
2022/01/01 John 15
2022/01/02 John 0
2022/01/03 John 0
2022/01/04 John 0
2022/01/05 John 19
2022/01/01 Pete 25
2022/01/02 Pete 12
2022/01/03 Pete 0
2022/01/04 Pete 0
2022/01/05 Pete 1

I want to find all persons who have a distance of 0 for 3 or more consecutive days. So in the above, it must return John and the count of the days with a zero distance. I.e.

Person Consecutive Days with Zero
John 3

I'm looking at something like this, but I think this might be way off:

Select Person, count(*), 
       (row_number() over (partition by Person, Date order by Person, Date))
from mytable

CodePudding user response:

Provided I understand your requirement you could, for your sample data, just calculate the difference in days of a windowed min/max date:

select distinct Person, Consecutive from (
  select *, DateDiff(day,
    Min(date) over(partition by person), 
    Max(date) over(partition by person)
  )   1 Consecutive
  from t
  where distance = 0
)t
where Consecutive >= 3;

Example Fiddle

If you can have gaps in the dates you could try the following that only considers rows with 1 day between each date (and could probably be simplified):

with c as (
  select *, Row_Number() over (partition by person order by date) rn, 
    DateDiff(day, Lag(date) over(partition by person order by date), date) c
  from t
  where distance = 0
), g as (
  select Person, rn - Row_Number() over(partition by person, c order by date) grp
  from c
)
select person, Count(*)   1 consecutive
from g
group by person, grp
having Count(*) >= 2;

CodePudding user response:

One option is to:

  • transform your "Distance" values into a boolean, where distance of 0 becomes 1 and any other value becomes zero
  • compute a running sum over your transformed "Distance" values in a window of three rows, using a frame specification clause
  • filter out any "Person" value which has at least one sum of 3.
WITH cte AS (
    SELECT *, SUM(CASE WHEN Distance = 0 THEN 1 ELSE 0 END) OVER(
                  PARTITION BY Person 
                  ORDER     BY Date_ 
                  ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
              ) AS window_of_3
    FROM tab
)
SELECT DISTINCT Person 
FROM cte
WHERE window_of_3 = 3

Check the demo here.

Note: This solution requires your table to have no missing dates. In case missing dates is a possible scenario, then it's necessary to add missing rows corresponding to the dates not found for each "Person" value, for this solution to work.

  • Related