Home > Back-end >  Conditionally mutate based on previous values in the same column
Conditionally mutate based on previous values in the same column

Time:09-13

Background

I'm implementing an equation of a change detection using cumulative sum.

Problem

I would like to conditionally mutate a column based on previous values within the column. If the condition is not met, I would like to reset the value to 0.

Example

library(dplyr)
library(tibble)

df_sample = as_tibble("x"=c(-15,95,-115,145,-55,15,-45,5,15)) %>% rowid_to_column("index")
df_sample = df_sample %>% 
  mutate(y = rep(0)) %>% 
  add_row(index=0, y=0,.before=1)

df_sample

#> # A tibble: 10 × 3
#>    index value     y
#>    <dbl> <dbl> <dbl>
#>  1     0    NA     0
#>  2     1   -15     0
#>  3     2    95     0
#>  4     3  -115     0
#>  5     4   145     0
#>  6     5   -55     0
#>  7     6    15     0
#>  8     7   -45     0
#>  9     8     5     0
#> 10     9    15     0

I'm trying to add the value to the previous y column and save it in y column only if the result of the summation is bigger than 0, if not it should reset the result in y colum to 0.

df_sample %>% 
  mutate(y = if_else(value lag(y) > 0, value lag(y),0))
#> # A tibble: 10 × 3
#>    index value     y
#>    <dbl> <dbl> <dbl>
#>  1     0    NA    NA
#>  2     1   -15     0
#>  3     2    95    95
#>  4     3  -115     0
#>  5     4   145   145
#>  6     5   -55     0
#>  7     6    15    15
#>  8     7   -45     0
#>  9     8     5     5
#> 10     9    15    15

Expected values in y column

#> # A tibble: 10 × 3
#>    index value     y
#>    <dbl> <dbl> <dbl>
#>  1     0    NA     0
#>  2     1   -15     0
#>  3     2    95     95
#>  4     3  -115     0
#>  5     4   145     145
#>  6     5   -55     90
#>  7     6    15     105
#>  8     7   -45     60
#>  9     8     5     65
#> 10     9    15     80

This is similar to this post but that case is conditional and it's not a cumproduct.

CodePudding user response:

Use accumulate (in the formula, .x is the current value of column value, and .y is the accumulated value) :

library(purrr)
df_sample %>% 
  mutate(y = accumulate(value[-1], .init = 0, ~ if_else(.x   .y > 0, .x   .y, 0)))
# A tibble: 10 × 3
   index value     y
   <dbl> <dbl> <dbl>
 1     0    NA     0
 2     1   -15     0
 3     2    95    95
 4     3  -115     0
 5     4   145   145
 6     5   -55    90
 7     6    15   105
 8     7   -45    60
 9     8     5    65
10     9    15    80
  • Related