Home > Software engineering >  Creating a function with lag variables from same column in a list in R
Creating a function with lag variables from same column in a list in R

Time:04-06

I have a set of dataframes in a list and have to create an extra column for each dataframe (which I´ve done) and then create a formula for the first row, and a different one from the second row onwards taking lags from the same column:

Let say the list name is "CCNRRF_list"

Creation of the fourth column (X4)

CNRRF_list<- mapply(cbind, CNRRF_list, "X4"=NA,SIMPLIFY=F)

one of the resulting dataframes

        x1  x2  x3  x4   
    1   1   1   1   NA
    2   2   2   2   NA
    3   3   3   3   NA
    4   4   4   4   NA 
    5   5   5   5   NA
    6   6   6   6   NA
    7   7   7   7   NA
    8   8   8   8   NA

First formula first row

for (i in seq_along(CNRRF_list)) {
CNRRF_list[[i]]$X4[1]<-(1 CNRRF_list[[i]]$X3[1])
}

Resulting data

        x1  x2  x3  x4   
    1   1   1   1   2  ===> "formula (1 X3)=(1 1)=2"
    2   2   2   2   NA
    3   3   3   3   NA
    4   4   4   4   NA 
    5   5   5   5   NA
    6   6   6   6   NA
    7   7   7   7   NA
    8   8   8   8   NA

now it gets tricky, from the second row onwards the formula is:

lag(X4)*(1 X3)

so the resulting data should look like this for each dataframe in the list:

        x1  x2  x3  x4   
    1   1   1   1   2  
    2   2   2   2   6 ===> "formula lag(X4)*(1 x3)=2*(1 2)=6"
    3   3   3   3   24 ===> "formula 6*(1 3)"
    4   4   4   4   120 ===> "formula 24*(1 4)"
    5   5   5   5   720 ===> "formula 120*(1 5)"
    6   6   6   6   5040 ===> "formula 720*(1 6)"
    7   7   7   7   40320 ===> "formula 5040*(1 7)"
    8   8   8   8   362880 ===> "formula 40320*(1 8)"

But I haven´t been able to create a good enough formula.

some of my attempts

for (i in seq_along(CNRRF_list)) {
CNRRF_list[[i]] <- mutate(CNRRF_list[[i]], X4 = (ifelse(is.na(CNRRF_list[[i]]$X4),lag(CNRRF_list[[i]]$X4)*(1 CNRRF_list[[i]]$X3), 1*(1 CNRRF_list[[i]]$X3))))
}

Not working...any help will be appreciate.

Thanks

CodePudding user response:

How about this:

  dat <- tibble::tribble(
  ~x1,  ~x2,  ~x3, ~x4,   
1,   1,   1,   NA,
2,   2,   2,   NA,
3,   3,   3,   NA,
4,   4,   4,   NA, 
5,   5,   5,   NA,
6,   6,   6,   NA,
7,   7,   7,   NA,
8,   8,   8,   NA)

for(i in 1:nrow(dat)){
  dat$x4[i] <- prod(c(NA, lag(dat$x4))[i], (1 dat$x3[i]), na.rm=TRUE)
}

dat
#> # A tibble: 8 × 4
#>      x1    x2    x3     x4
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1     1     1      2
#> 2     2     2     2      6
#> 3     3     3     3     24
#> 4     4     4     4    120
#> 5     5     5     5    720
#> 6     6     6     6   5040
#> 7     7     7     7  40320
#> 8     8     8     8 362880

Created on 2022-04-05 by the reprex package (v2.0.1)


Edit: Apply to a list of data frames

Here's how you could apply this to a list of data frames.

  dat <- tibble::tribble(
  ~x1,  ~x2,  ~x3, ~x4,   
1,   1,   1,   NA,
2,   2,   2,   NA,
3,   3,   3,   NA,
4,   4,   4,   NA, 
5,   5,   5,   NA,
6,   6,   6,   NA,
7,   7,   7,   NA,
8,   8,   8,   NA)

  
dat_list <- list(dat, dat, dat)  
res <- lapply(dat_list, function(x){  
for(i in 1:nrow(x)){
  x$x4[i] <- prod(c(NA, lag(x$x4))[i], (1 x$x3[i]), na.rm=TRUE)
}
x
})

res
#> [[1]]
#> # A tibble: 8 × 4
#>      x1    x2    x3     x4
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1     1     1      2
#> 2     2     2     2      6
#> 3     3     3     3     24
#> 4     4     4     4    120
#> 5     5     5     5    720
#> 6     6     6     6   5040
#> 7     7     7     7  40320
#> 8     8     8     8 362880
#> 
#> [[2]]
#> # A tibble: 8 × 4
#>      x1    x2    x3     x4
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1     1     1      2
#> 2     2     2     2      6
#> 3     3     3     3     24
#> 4     4     4     4    120
#> 5     5     5     5    720
#> 6     6     6     6   5040
#> 7     7     7     7  40320
#> 8     8     8     8 362880
#> 
#> [[3]]
#> # A tibble: 8 × 4
#>      x1    x2    x3     x4
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1     1     1      2
#> 2     2     2     2      6
#> 3     3     3     3     24
#> 4     4     4     4    120
#> 5     5     5     5    720
#> 6     6     6     6   5040
#> 7     7     7     7  40320
#> 8     8     8     8 362880

Created on 2022-04-05 by the reprex package (v2.0.1)

CodePudding user response:

With accumulate:

purrr::accumulate(1:9, .f = ~ .y * .x)
[1]      1      2      6     24    120    720   5040  40320 362880
  • Related