Home > Mobile >  How do I apply a function to a dataframe and then to a list of dataframes?
How do I apply a function to a dataframe and then to a list of dataframes?

Time:10-22

I need to calculate the rolling three-month return for multiple investment portfolios. I would like to apply a function to achieve this because I have several portfolios and benchmarks to calculate this for.

For the sake of my understanding

  • I first want to know how to calculate it on one dataframe using an apply method.
  • Then, I would like to know how to apply a similar function to a list of dataframes.

The dataframe

    stefi <- tibble::tribble(    ~date ,  ~return,
                           1996-04-30,   0.0126,
                           1996-05-31,   0.0126,
                           1996-06-30,   0.0119,
                           1996-07-31,   0.0144,
                           1996-08-31,   0.0132,
                           1996-09-30,   0.0136,
                           1996-10-31,   0.0135,
                           1996-11-30,   0.0127,
                           1996-12-31,   0.0143,
                           1997-01-31,   0.0144)

My attempt at a function

Here is my function to calculate the three-month return. The math is fine, but I'm not sure whether it's supposed to return a vector, variable, or something else for the purpose of the apply method I need to use.

    calc_3m_return <- function(x){
      
      y <- (x   1) *
        ((lag(X, 1))   1) *
        ((lag(X, 2))   1) - 1
      
      return(y)
      
    }

I'm not having luck with apply or lapply.

lapply(stefi$return, calc_3m_return, return)

R's output

> lapply(stefi$return, calc_3m_return, return)
Error in FUN(X[[i]], ...) : unused argument (return)

What I can get working

I manage to get the desired result with the following steps:

#Calculate return function
calc_3m_return <- function(return){
  
  y <- (return   1) *
    ((lag(return, 1))   1) *
    ((lag(return, 2))   1) - 1
  
  return(y)
  
}

#Calculate 3-month return and append it to the dataframe
stefi <- stefi %>%
  mutate(return_3m = calc_3m_return(return))

Result

# A tibble: 6 x 3
  date       return return_3m
  <date>      <dbl>     <dbl>
1 1996-04-30 0.0126   NA     
2 1996-05-31 0.0126   NA     
3 1996-06-30 0.0119    0.0376
4 1996-07-31 0.0144    0.0395
5 1996-08-31 0.0132    0.0401
6 1996-09-30 0.0136    0.0418

CodePudding user response:

A possible solution (ldf contains a list of dataframes, as wanted):

library(tidyverse)

stefi <- tibble::tribble(    ~date ,  ~return,
                             1996-04-30,   0.0126,
                             1996-05-31,   0.0126,
                             1996-06-30,   0.0119,
                             1996-07-31,   0.0144,
                             1996-08-31,   0.0132,
                             1996-09-30,   0.0136,
                             1996-10-31,   0.0135,
                             1996-11-30,   0.0127,
                             1996-12-31,   0.0143,
                             1997-01-31,   0.0144)
ldf <- list(stefi,stefi)

calc_3m_return <- function(x){
  
  y <- (x   1) *
    ((lag(x, 1))   1) *
    ((lag(x, 2))   1) - 1
  
  return(y)
}

map(ldf, ~ {.x$return_3m <- calc_3m_return(.x$return); .x})

# Or alternatively
# map(ldf, ~ tibble(.x, return_3m = calc_3m_return(.x$return))) 
  • Related