Home > Blockchain >  How can I use rollapply after group by in a whole data frame in R?
How can I use rollapply after group by in a whole data frame in R?

Time:11-02

I have a data frame that looks like this:

library(tidyverse)
library(zoo)
date = c(rep(as.Date("2022/1/1"),5),rep(as.Date("2022/1/12"),5)) 
a = seq(1,10,1)
b = seq(-1,-10,-1)
c = seq(10,-20,length.out=10)
window=c(rep(2,5),rep(3,5))
A = tibble(date,a,b,c,window);A

   date           a     b       c window
   <date>     <dbl> <dbl>   <dbl>  <dbl>
 1 2022-01-01     1    -1  10          2
 2 2022-01-01     2    -2   6.667      2
 3 2022-01-01     3    -3   3.333      2
 4 2022-01-01     4    -4   0          2
 5 2022-01-01     5    -5  -3.333      2
 6 2022-01-12     6    -6  -6.667      3
 7 2022-01-12     7    -7 -10          3
 8 2022-01-12     8    -8 -13.33       3
 9 2022-01-12     9    -9 -16.67       3
10 2022-01-12    10   -10 -20          3

I want to group by date the columns a,b,c and apply a rolling window of width equal the window value for this date in the all the columns a,b,c and calculate the minimum of them.

For example, for the date 2022-01-01 with window 2 it will search for the minimum in the first row (which will be -1) then will move downwards by 1 so will search the minimum in the first and second row that will be again -2,then into second and third that will be -3, to third and fourth that will be -4,then the fourth and fifth row that will be -5, and last the fifth row that will be -5 again.

The same procedure for the date 2022-01-12 with width 3 and will be, -10,-13,33,-16,67,-20 and -20.

Ideally I want the resulted data frame to be the minimum of all minimums grouped by - summarized by date:

date MINS
2022-01-01 -5
2022-01-12 -20

I think that rollapply function for whole matrices can be done with by.column = FALSE

So I tried :

A%>%group_by(date)%>%
  summarise(Vectors = rollapply(A[,2:4],width = window,min,by=1,by.column=FALSE))

but it doesn't work.

Any help ?

CodePudding user response:

To get the final result you said you want, you might as well skip the rolling windows altogether:

A %>% 
  group_by(date) %>% 
  summarise(min = min(a, b, c))
#> # A tibble: 2 x 2
#>   date         min
#>   <date>     <dbl>
#> 1 2022-01-01    -5
#> 2 2022-01-12   -20

If you do want the rolling minimums: You’re on the right track, but you need to use the grouped data (don't reference A inside the summarise()), and set align and partial:

A %>% 
  group_by(date) %>% 
  mutate(
    min = zoo::rollapply(
      cbind(a, b, c),
      window,
      min,
      align = "left",
      partial = TRUE,
      by.column = FALSE
    )
  )
#> # A tibble: 10 x 6
#> # Groups:   date [2]
#>    date           a     b      c window   min
#>    <date>     <dbl> <dbl>  <dbl>  <dbl> <dbl>
#>  1 2022-01-01     1    -1  10         2  -2  
#>  2 2022-01-01     2    -2   6.67      2  -3  
#>  3 2022-01-01     3    -3   3.33      2  -4  
#>  4 2022-01-01     4    -4   0         2  -5  
#>  5 2022-01-01     5    -5  -3.33      2  -5  
#>  6 2022-01-12     6    -6  -6.67      3 -13.3
#>  7 2022-01-12     7    -7 -10         3 -16.7
#>  8 2022-01-12     8    -8 -13.3       3 -20  
#>  9 2022-01-12     9    -9 -16.7       3 -20  
#> 10 2022-01-12    10   -10 -20         3 -20
  • Related