Home > Enterprise >  applying moving averages/ rollapply over multiple variables
applying moving averages/ rollapply over multiple variables

Time:04-30

I'm working with panel data. I want to add 3 and 5-year moving averages of some of my variables into my dataset. I have the following code which works fine but is very long. Is there a neater and more efficient way of creating moving averaged variables?

dataset <- dataset %>%  
  group_by(iso3c) %>% 
  dplyr::mutate(manu_GDP_3=rollapply(manu_GDP,3,mean,align='right',fill=NA), 
                manu_GDP_5=rollapply(manu_GDP,5,mean,align='right',fill=NA), 
                age_dep_3=rollapply(age_dep,3,mean,align='right',fill=NA), 
                age_dep_5=rollapply(age_dep,5,mean,align='right',fill=NA), 
                agr_GDP_3=rollapply(agr_GDP,3,mean,align='right',fill=NA), 
                agr_GDP_5=rollapply(agr_GDP,5,mean,align='right',fill=NA), 
                services_GDP_3=rollapply(services_GDP,3,mean,align='right',fill=NA), 
                services_GDP_5=rollapply(services_GDP,5,mean,align='right',fill=NA), 
                debtGNI_3=rollapply(debtGNI,3,mean,align='right',fill=NA),
                debtGNI_5=rollapply(debtGNI,5,mean,align='right',fill=NA),
                Foreign_liab_3=rollapply(Foreign_liab,3,mean,align='right',fill=NA),
                Foreign_liab_5=rollapply(Foreign_liab,5,mean,align='right',fill=NA),
                intcapimp_X_3=rollapply(intcapimp_X,3,mean,align='right',fill=NA),
                intcapimp_X_5=rollapply(intcapimp_X,5,mean,align='right',fill=NA),
                regime_3=rollapply(regime,3,mean,align='right',fill=NA),
                regime_5=rollapply(regime,5,mean,align='right',fill=NA),
                CBI2_3=rollapply(CBI2,3,mean,align='right',fill=NA),
                CBI2_5=rollapply(CBI2,5,mean,align='right',fill=NA),
                resource_rent_3=rollapply(resource_rent,3,mean,align='right',fill=NA),
                resource_rent_5=rollapply(resource_rent,5,mean,align='right',fill=NA),
                oil_rents_3=rollapply(oil_rents,3,mean,align='right',fill=NA),
                oil_rents_5=rollapply(oil_rents,5,mean,align='right',fill=NA),
                coal_rents_3=rollapply(coal_rents,3,mean,align='right',fill=NA),
                coal_rents_5=rollapply(coal_rents,5,mean,align='right',fill=NA))  %>% 
  ungroup()

CodePudding user response:

rollapply does work on multiple columns. For example, using the built in mtcars we have the following.

library(zoo)

mtcars %>%
  select(1:5) %>%
  group_by(cyl) %>%
  transform(avg3 = rollmeanr(cbind(hp, disp), 3, fill = NA),
            avg5 = rollmeanr(cbind(hp, disp), 5, fill = NA)) %>%
  ungroup

giving:

    mpg cyl  disp  hp drat   avg3.hp avg3.disp avg5.hp avg5.disp
1  21.0   6 160.0 110 3.90        NA        NA      NA        NA
2  21.0   6 160.0 110 3.90        NA        NA      NA        NA
3  22.8   4 108.0  93 3.85 104.33333 142.66667      NA        NA
4  21.4   6 258.0 110 3.08 104.33333 175.33333      NA        NA
5  18.7   8 360.0 175 3.15 126.00000 242.00000   119.6    209.20
6  18.1   6 225.0 105 2.76 130.00000 281.00000   118.6    222.20
7  14.3   8 360.0 245 3.21 175.00000 315.00000   145.6    262.20
8  24.4   4 146.7  62 3.69 137.33333 243.90000   139.4    269.94
...etc...

CodePudding user response:

A couple things:

  1. zoo::rollmean (and its *r right variant) are more efficient than rollapply(., n, mean), I suggest using it instead.
  2. across allows you to iterate one or more functions over a selection of columns. I'm using across(Sepal.Length:Petal.Width, ..)below, but it could easily beacross(c(Sepal.Length, Petal.Width, Sepal.Width), ..)` just as easily. When given a list of functions, the naming convention seems intuitive, see the results below.
iris %>%
  group_by(Species) %>%
  mutate(
    across(Sepal.Length:Petal.Width,
           list(mean3 = ~ zoo::rollmeanr(., 3, fill = NA),
                mean5 = ~ zoo::rollmeanr(., 5, fill = NA)) )
  ) %>%
  ungroup()
# # A tibble: 150 x 13
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_mean3 Sepal.Length_mean5 Sepal.Width_mean3 Sepal.Width_mean5 Petal.Length_mean3 Petal.Length_mean5 Petal.Width_mean3 Petal.Width_mean5
#           <dbl>       <dbl>        <dbl>       <dbl> <fct>                <dbl>              <dbl>             <dbl>             <dbl>              <dbl>              <dbl>             <dbl>             <dbl>
#  1          5.1         3.5          1.4         0.2 setosa               NA                 NA                NA                NA                 NA                 NA               NA                 NA   
#  2          4.9         3            1.4         0.2 setosa               NA                 NA                NA                NA                 NA                 NA               NA                 NA   
#  3          4.7         3.2          1.3         0.2 setosa                4.9               NA                 3.23             NA                  1.37              NA                0.2               NA   
#  4          4.6         3.1          1.5         0.2 setosa                4.73              NA                 3.1              NA                  1.4               NA                0.2               NA   
#  5          5           3.6          1.4         0.2 setosa                4.77               4.86              3.3               3.28               1.4                1.4              0.2                0.2 
#  6          5.4         3.9          1.7         0.4 setosa                5                  4.92              3.53              3.36               1.53               1.46             0.267              0.24
#  7          4.6         3.4          1.4         0.3 setosa                5                  4.86              3.63              3.44               1.5                1.46             0.3                0.26
#  8          5           3.4          1.5         0.2 setosa                5                  4.92              3.57              3.48               1.53               1.5              0.3                0.26
#  9          4.4         2.9          1.4         0.2 setosa                4.67               4.88              3.23              3.44               1.43               1.48             0.233              0.26
# 10          4.9         3.1          1.5         0.1 setosa                4.77               4.86              3.13              3.34               1.47               1.5              0.167              0.24
# # ... with 140 more rows

This means your code might be reduced to the following (untested):

dataset %>%
  group_by(iso3c) %>%
  mutate(
    across(c(manu_GDP, age_dep, agr_GDP, services_GDP, debtGNI, Foreign_liab,
             intcapimp_X, regime, CBI2, resource_rent, oil_rents, coal_rents),
           list("3" = ~ zoo::rollmeanr(., 3, fill = NA),
                "5" = ~ zoo::rollmeanr(., 5, fill = NA)) )
  ) %>%
  ungroup()
  • Related