Recursion with dplyr

Time:07-14

I have this data:

library(dplyr, warn.conflicts = FALSE)
mtcars %>% 
  as_tibble() %>% 
  select(mpg, qsec) %>% 
  head(5) %>% 
  mutate(new_col = c(10, rep(NA, times = nrow(.)-1))) 
#> # A tibble: 5 × 3
#>     mpg  qsec new_col
#>   <dbl> <dbl>   <dbl>
#> 1  21    16.5      10
#> 2  21    17.0      NA
#> 3  22.8  18.6      NA
#> 4  21.4  19.4      NA
#> 5  18.7  17.0      NA

I need new_col to be computed as mpg + qsec - lag(new_col), but recursively, so that each row uses the value just computed for the previous row.
For the second row: 21 + 17.0 - 10 = 28
For the third row: 22.8 + 18.6 - 28 (from the second row) = 13.4

Expected output:

#> # A tibble: 5 × 3
#>     mpg  qsec new_col
#>   <dbl> <dbl>   <dbl>
#> 1  21    16.5    10  
#> 2  21    17.0    28  
#> 3  22.8  18.6    13.4
#> 4  21.4  19.4    27.4
#> 5  18.7  17.0     8.3

CodePudding user response:

You can use purrr::accumulate() (or base::Reduce() if you prefer):

library(dplyr)
library(purrr)

mtcars %>% 
  as_tibble() %>% 
  select(mpg, qsec) %>% 
  head(5) %>% 
  # .x is the previous new_col value, .y is the current row's mpg + qsec
  mutate(new_col = accumulate(tail(mpg + qsec, -1), .f = ~ .y - .x, .init = 10))

# A tibble: 5 × 3
    mpg  qsec new_col
  <dbl> <dbl>   <dbl>
1  21    16.5   10   
2  21    17.0   28.0 
3  22.8  18.6   13.4 
4  21.4  19.4   27.4 
5  18.7  17.0    8.27
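
The base::Reduce() alternative mentioned in this answer could look roughly like the sketch below. It assumes the same five rows of mtcars; the names df and res are only illustrative. Reduce() passes the accumulated value first and the next element second, and accumulate = TRUE keeps every intermediate result, so the output has one value per row.

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative input: the same five rows used in the question
df <- mtcars %>%
  as_tibble() %>%
  select(mpg, qsec) %>%
  head(5)

# prev is the previous new_col value, s is the current row's mpg + qsec.
# init = 10 seeds the first row; tail(..., -1) skips that row's sum.
res <- df %>%
  mutate(new_col = Reduce(function(prev, s) s - prev,
                          tail(mpg + qsec, -1),
                          init = 10,
                          accumulate = TRUE))

res
```

This should behave the same as the accumulate() call, without the purrr dependency.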

CodePudding user response:

Another possible solution, using purrr::reduce:

library(tidyverse)

# df is assumed to be the 5-row tibble from the question,
# with new_col = c(10, NA, NA, NA, NA)
df %>% 
  transmute(reduce(2:n(), ~ {.x$new_col[.y] <- (.x$mpg[.y] + .x$qsec[.y] - 
    .x$new_col[.y-1]); .x}, .init = .))

#> # A tibble: 5 × 3
#>     mpg  qsec new_col
#>   <dbl> <dbl>   <dbl>
#> 1  21    16.5   10   
#> 2  21    17.0   28.0 
#> 3  22.8  18.6   13.4 
#> 4  21.4  19.4   27.4 
#> 5  18.7  17.0    8.27
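
For comparison, the reduce() call above is doing the work of an ordinary loop over rows 2 to n. A minimal base R sketch of that loop, assuming the same five rows (df is an illustrative name, built here as a plain data frame rather than a tibble):

```r
# Illustrative input: first five rows of mtcars, with new_col seeded at 10
df <- head(mtcars[, c("mpg", "qsec")], 5)
df$new_col <- c(10, rep(NA_real_, nrow(df) - 1))

# Fill each row from the previous row's already-computed new_col
for (i in seq_len(nrow(df))[-1]) {
  df$new_col[i] <- df$mpg[i] + df$qsec[i] - df$new_col[i - 1]
}

df
```

The loop is more verbose, but it makes the dependency on the previous row explicit and avoids modifying the whole data frame at every step.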