Fill in column based on previous value in R-CodePudding

I am looking for a tidyverse solution to the following problem.

I have the following example dataset:

ex_tbl <- tibble(col1 = c(5, NA, NA, NA), col2 = c(1, 1.3, 0.8, 1.2))

I need to fill in the NAs based on col2 and the previous value in the col1. If I start with the first NA (col1, row 2), and col2 is greater than 1, then I'd multiply the previous col1 value (row 1) by col2 row 2. Then if col2 is less than or equal to 1, then col1 is just equal to the previous value.

This should be the final result:

I tried case_when() but am not getting the desired result:

output <- ex_tbl %>% mutate(col1 = case_when(col2 > 1 ~ col1 * col2,
                                              col2 <= 1 ~ col1,
                                              TRUE ~ col1)

Clearly, I'm not capturing the part where I want to use the previous row's value. Any help would be appreciated.

CodePudding user response：

Perhaps you could use

library(dplyr)

ex_tbl %>% 
  mutate(col1 = first(col1) * cumprod(pmax(col2, 1)))

to get

# A tibble: 4 x 2
   col1  col2
  <dbl> <dbl>
1   5     1  
2   6.5   1.3
3   6.5   0.8
4   7.8   1.2

Edit: Follow-up question

If you want to multiply by a value from a third column in case of col2 < 1, you could use:

ex_tbl <- tibble(col1 = c(5, NA, NA, NA), col2 = c(1, 1.3, 0.8, 1.2), col3 = c(0.5, 2, 2, 0.3))

ex_tbl %>% 
  mutate(
    factor = (col2 >= 1) * col2   (col2 < 1) * col3,
    col1 = first(col1) * cumprod((col2 >= 1) * col2   (col2 < 1) * col3))

Note: the column factor here isn't necessary. It's just to show what is happening inside the cumprod function.

CodePudding user response：

We may need accumulate here

library(dplyr)
library(purrr)
ex_tbl %>% 
   mutate(coln = accumulate(col2, ~  if(.y <= 1) .x else .x * .y , 
         .init = first(col1))[-1])

-output

# A tibble: 4 × 3
   col1  col2  coln
  <dbl> <dbl> <dbl>
1     5   1     5  
2    NA   1.3   6.5
3    NA   0.8   6.5
4    NA   1.2   7.8