R - mutate based on previous values in the same column

Data:

df <- data.frame(year = c(2018, 2019, 2020, 2021),
                 growth = c(0.05, 0.1, 0.08, 0.06),
                 size = c(100, NA, NA, NA))

  year growth size
1 2018   0.05  100
2 2019   0.10   NA
3 2020   0.08   NA
4 2021   0.06   NA

I have size for year 2018 and growth rates for subsequent years. My goal is to calculate size for each subsequent year as size[i] = size[i-1] * (1 + growth[i]). I can do it with a for loop:

for (i in 2:nrow(df)) {
  df$size[i] <- df$size[i - 1] * (1 + df$growth[i])
}

  year growth    size
1 2018   0.05 100.000
2 2019   0.10 110.000
3 2020   0.08 118.800
4 2021   0.06 125.928

But I cannot find a dplyr way of doing the same thing, with mutate for example. Hoping to hear your ideas. Thanks!

CodePudding user response:

Since the first value of size is effectively a multiplicative constant for the rest of the column, we can just use the cumprod (cumulative product) of 1 + growth to get the factor by which to multiply size[1] to fill the rest of the size column.

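The slight complication is that your algorithm has to ignore the first value of growth. We can get round this by using a combination of lead and lag: lead drops the first growth value before the cumulative product is taken, and lag shifts the result back down one row, with default = size[1] restoring the known first value.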

The following therefore works without having to use loops.

library(dplyr)

mutate(df, size = lag(size[1] * cumprod(lead(growth + 1)), default = size[1]))

#>   year growth    size
#> 1 2018   0.05 100.000
#> 2 2019   0.10 110.000
#> 3 2020   0.08 118.800
#> 4 2021   0.06 125.928
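
To see why the cumulative product works, the same idea can be spelled out step by step in base R. This is only a sketch of the reasoning, not part of the answer's code; the names factors and multiplier are made up for illustration. Unrolling the recurrence gives size[i] = size[1] * (1 + growth[2]) * ... * (1 + growth[i]), so each row's multiplier is a running product that skips growth[1].

```r
# Same data as in the question
df <- data.frame(year = c(2018, 2019, 2020, 2021),
                 growth = c(0.05, 0.1, 0.08, 0.06),
                 size = c(100, NA, NA, NA))

# Growth factors for rows 2..n; growth[1] is never used,
# so row 1 gets a neutral factor of 1 (illustrative name)
factors <- c(1, 1 + df$growth[-1])

# Running product = total multiplier relative to size[1] (illustrative name)
multiplier <- cumprod(factors)

# Scale the known starting size by each row's multiplier
df$size <- df$size[1] * multiplier
print(df)
```

This should give the same size column as the mutate call, and it makes the role of lead and lag visible: they are just a way of replacing the first growth factor with a neutral starting value inside a single expression.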

CodePudding user response:

A solution with purrr::reduce:

library(tidyverse)

df <- data.frame(year = c(2018, 2019, 2020, 2021),
                 growth = c(0.05, 0.1, 0.08, 0.06),
                 size = c(100, NA, NA, NA))

# x is the data frame carried between steps, y is the current row index
reduce(2:nrow(df), function(x, y) {
  x$size[y] <- x$size[y - 1] * (1 + x$growth[y])
  x
}, .init = df)
#>   year growth    size
#> 1 2018   0.05 100.000
#> 2 2019   0.10 110.000
#> 3 2020   0.08 118.800
#> 4 2021   0.06 125.928
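
The reduce call above carries the whole data frame through every step. As a possible lighter variant (not from the answer, and using base R's Reduce rather than purrr), one could fold over the growth values only and keep each intermediate result with accumulate = TRUE. The helper name step is hypothetical.

```r
# Same data as in the question
df <- data.frame(year = c(2018, 2019, 2020, 2021),
                 growth = c(0.05, 0.1, 0.08, 0.06),
                 size = c(100, NA, NA, NA))

# One step of the recurrence: previous size times (1 + current growth)
# (hypothetical helper name)
step <- function(prev_size, g) prev_size * (1 + g)

# Start from size[1], fold over growth[2..n], and keep every intermediate value;
# with init supplied, the result has one element per row of df
df$size <- Reduce(step, df$growth[-1], init = df$size[1], accumulate = TRUE)
print(df)
```

Unlike the cumprod approach, this pattern does not rely on the recurrence being a pure product, so it may be easier to adapt if the update rule later gains an additive term.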