Home > Net >  Select a group of n elements by position of the max in a data frame in r by group
Select a group of n elements by position of the max in a data frame in r by group

Time:01-26

I have a dataframe with group, value and columns based in rollaply mean of the last n values, just like that:

library(dplyr); library(zoo)
df = data.frame( group = c(rep(1,5), rep(2,5)),
                 value = c(23,14,53,12,56,32,65,76,36,74)) %>%
  group_by(group) %>%
  mutate(
    roll1 = rollapplyr(value, 1, mean, fill = NA, na.rm = T, partial = F),
    roll2 = rollapplyr(value, 2, mean, fill = NA, na.rm = T, partial = F),
    roll3 = rollapplyr(value, 3, mean, fill = NA, na.rm = T, partial = F)
  )

df
   group value roll1 roll2 roll3
 1     1    23    23  NA    NA  
 2     1    14    14  18.5  NA  
 3     1    53    53  33.5  30  
 4     1    12    12  32.5  26.3
 5     1    56    56  34    40.3
 6     2    32    32  NA    NA  
 7     2    65    65  48.5  NA  
 8     2    76    76  70.5  57.7
 9     2    36    36  56    59  
10     2    74    74  55    62

The 'rolln' column represents the average of the last n values.

Then I would like to summarize in a new dataframe which group of values ​​provided the highest average. Remembering that the roll3 column, for example, has a set of 3 values.

enter image description here

I tried to use which.max function, but without success. The position of NA's in the final data.frame isn't important

Thanks in advance

CodePudding user response:

I'd love to see a more concise solution, but this seems to work:

library(tidyverse)
df %>%
  pivot_longer(starts_with("roll"), values_to = "avg") %>%
  filter(!is.na(avg)) %>%
  group_by(group, name) %>%
  filter(slider::slide_dbl(avg, max, .after = 2) == max(avg)) %>% # EDIT #2
  #filter(avg == max(avg) | 
  #         lead(avg, default = 0) == max(avg) | 
  #         lead(avg, 2, default = 0) == max(avg)) %>%
  mutate(items = n()   1 - parse_number(name)) %>% # EDIT
  slice(items:n()) %>%
  mutate(row = row_number()) %>% 
  select(-avg, -items) %>%
  pivot_wider(names_from = name, values_from = value)

Result

  group   row roll1 roll2 roll3
  <dbl> <int> <dbl> <dbl> <dbl>
1     1     1    56    12    53
2     1     2    NA    56    12
3     1     3    NA    NA    56
4     2     1    76    65    76
5     2     2    NA    76    36
6     2     3    NA    NA    74
  •  Tags:  
  • r
  • Related