I have a tibble which looks as follows:
df <- tibble(
  return = c(NA, 0.120436251, -0.019871515, 0.024876142, -0.013786987, -0.012571415),
  value  = c(100, NA, NA, NA, NA, NA)
)
My goal is to compute the value of every row after the first using the following formula: (1 + return) * lag(value). The expected output should look as follows:
df_1 <- tibble(
  return = c(NA, 0.120436251, -0.019871515, 0.024876142, -0.013786987, -0.012571415),
  value  = c(100, 112.0436, 109.8171, 112.5490, 110.9973, 109.6019)
)
In base R, I can use the following for loop:
df_1 <- df
for (i in 2:nrow(df_1)) {
  # each value is the previous value grown by the current return
  df_1$value[i] <- df_1$value[i - 1] * (1 + df_1$return[i])
}
The function to use in the tidyverse is probably accumulate; however, with the following code I get a list for each row:
df <- df %>%
  mutate(Value = accumulate(value, ~ if (is.na(.y)) .x * (1 + df$return) else .y))
Any help would be appreciated.
CodePudding user response:
Here are some approaches. (The first two could easily be converted to base R by replacing mutate with transform and using the base R pipe.)
library(dplyr)
df %>% mutate(value = c(1, cumprod(1 + return[-1])) * value[1])
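The cumprod approach above works because the recurrence value[i] = value[i - 1] * (1 + return[i]) unrolls into the starting value times the running product of the growth factors. A minimal sketch to check this against the explicit loop, using plain vectors (the names ret, start, loop_value and cumprod_value are only for illustration):

```r
# Returns and starting value from the question
ret <- c(NA, 0.120436251, -0.019871515, 0.024876142, -0.013786987, -0.012571415)
start <- 100

# Explicit recurrence: value[i] = value[i - 1] * (1 + ret[i])
loop_value <- numeric(length(ret))
loop_value[1] <- start
for (i in 2:length(ret)) {
  loop_value[i] <- loop_value[i - 1] * (1 + ret[i])
}

# Unrolled form: start times the running product of the growth factors
cumprod_value <- start * c(1, cumprod(1 + ret[-1]))

# Compare the two results up to floating-point tolerance
all.equal(loop_value, cumprod_value)
```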
library(dplyr)
mult <- function(x, y) x * (1 + y)
df %>% mutate(value = Reduce(mult, return[-1], init = value[1], accumulate = TRUE))
library(dplyr)
library(purrr)
# mult defined above
df %>% mutate(value = accumulate(return[-1], mult, .init = value[1]))
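The base R conversion mentioned at the top could look roughly like this. It is a sketch that assumes R 4.1 or later for the native pipe |> and uses a plain data.frame (named df_base here for illustration) instead of a tibble, so no packages are needed:

```r
# Same data as in the question, as a base R data.frame
df_base <- data.frame(
  return = c(NA, 0.120436251, -0.019871515, 0.024876142, -0.013786987, -0.012571415),
  value  = c(100, NA, NA, NA, NA, NA)
)

# One compounding step: previous value x grown by return y
mult <- function(x, y) x * (1 + y)

# First approach: transform() in place of mutate(), with cumprod
out_cumprod <- df_base |>
  transform(value = c(1, cumprod(1 + return[-1])) * value[1])

# Second approach: transform() with Reduce() keeping intermediate results
out_reduce <- df_base |>
  transform(value = Reduce(mult, return[-1], init = value[1], accumulate = TRUE))

out_cumprod
```

Both out_cumprod and out_reduce should hold the same value column as the for loop in the question.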