How do I find the largest range in a dataset, and filter out the other data?


       Competitor  Laps  
1        1          1       
2        1          2 
3        1          3   
4        1          4         
5        1          1                
6        1          2       
7        1          3 
8        1          4   
9        1          5
10       1          6 
11       1          7   
12       1          8

I need to identify the longest range of laps. Here, that range runs from row 5 to row 12 and has a range of 7 (8 - 1), as opposed to rows 1 to 4, which have a range of 3. After identifying the largest range, I want to keep only the rows that contribute to it. So, my final dataset should look like:

       Competitor  Laps          
5        1          1                
6        1          2       
7        1          3 
8        1          4   
9        1          5
10       1          6 
11       1          7   
12       1          8

How should I go about this?

CodePudding user response:

Potential solution with dplyr:

library(dplyr)
dat <- tibble(
  Competitor = 1,
  Laps = c(seq(1,4), seq(1,8))
)

dat |> 
  mutate(StintId = cumsum(if_else(Laps == 1, 1, 0))) |> 
  group_by(StintId) |> 
  mutate(range = max(Laps) - min(Laps)) |> 
  ungroup() |> 
  filter(range == max(range)) |> 
  select(-StintId, -range)


# A tibble: 8 x 2
  Competitor  Laps
       <dbl> <int>
1          1     1
2          1     2
3          1     3
4          1     4
5          1     5
6          1     6
7          1     7
8          1     8
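
To make the intermediate steps visible, here is a minimal base R sketch of the same idea (no dplyr needed). It assumes, as the pipeline above does, that every stint starts at lap 1; the names stint_id, stint_range and longest are only illustrative.

```r
# Example data from the question: one competitor, a 4-lap stint then an 8-lap stint.
dat <- data.frame(Competitor = 1, Laps = c(1:4, 1:8))

# A new stint begins whenever Laps resets to 1 (assumption: stints always start at 1).
stint_id <- cumsum(dat$Laps == 1)

# Range (max - min) of each stint, repeated on every row of that stint.
stint_range <- ave(dat$Laps, stint_id, FUN = function(x) max(x) - min(x))

# Keep only the rows of the stint with the largest range.
# If several stints tie for the largest range, all of them are kept.
longest <- dat[stint_range == max(stint_range), ]
print(longest)
```

Here stint_id is 1 for rows 1 to 4 and 2 for rows 5 to 12, so the filter keeps rows 5 to 12. Unlike the tibble output above, a data.frame keeps the original row numbers.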

CodePudding user response:

This returns the largest range for each competitor. It assumes that every stint starts at lap 1 and that laps are sequential.

df<-data.frame(Competitor=c(rep(1,12), rep(2,16)),
               Laps=c(1:4, 1:8, 1:9, 1:7))

df %>% 
  group_by(Competitor) %>% 
  mutate(LapGroup=cumsum(if_else(Laps==1,1,0))) %>% 
  group_by(Competitor, LapGroup) %>% 
  mutate(MaxLaps=max(Laps)) %>%
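  group_by(Competitor) %>% 
  # Assumed final step (the posted code stops after the group_by): keep each competitor's longest stint
  filter(MaxLaps == max(MaxLaps))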
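
For comparison, the per-competitor idea can also be sketched in base R. This is only an illustration under the same assumptions (each stint starts at lap 1, laps are sequential); lap_group, key, max_laps, best and result are hypothetical names, not part of the answer above.

```r
# Two competitors: stints of 4 and 8 laps, then stints of 9 and 7 laps.
df <- data.frame(Competitor = c(rep(1, 12), rep(2, 16)),
                 Laps = c(1:4, 1:8, 1:9, 1:7))

# Stint counter that restarts for every competitor.
lap_group <- ave(as.integer(df$Laps == 1), df$Competitor, FUN = cumsum)

# One key per (competitor, stint) pair.
key <- paste(df$Competitor, lap_group)

# Highest lap number reached in each stint, repeated on every row of the stint.
max_laps <- ave(df$Laps, key, FUN = max)

# Best stint length per competitor.
best <- ave(max_laps, df$Competitor, FUN = max)

# Keep the rows of each competitor's longest stint (ties are all kept).
result <- df[max_laps == best, ]
print(result)
```

With this data, competitor 1 keeps the 8-lap stint and competitor 2 keeps the 9-lap stint.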