How to iteratively add to a data frame in R using a for-loop?

Time:09-16

I am working on an example for-loop (a reduction of a larger problem I am dealing with), whereby the data frame expands by columns as the for-loop iterates. Here are the results I am trying to get when running the code at the bottom:

> data
   x x_1 x_2 x_3
1 10  11  12  13
2 11  12  13  14
3 12  13  14  15
4 13  14  15  16
5 14  15  16  17
6 15  16  17  18

However, the code below only gets as far as column x_1 before failing with the error message "Error: Problem with mutate() column x_2. i x_2 = x_prior + 1. x non-numeric argument to binary operator". What am I doing wrong?

I know there are other ways to generate this data frame but please don't change the overall structure of what I'm doing. I'm trying to learn how to iteratively add columns to the DF whereby the first added column refers back to a base column outside the loop (column x in this case), and all columns added after that via the loop refer back to the immediately prior column that was also iteratively generated. I am not too concerned with speed because in practice this loop will never execute > 20 times, so no need for the apply() family I think unless there's some magic there. The nice thing about a plodding for-loop is understandability.

library(dplyr)
library(stringr)

data <- data.frame(x = 10:15)

for(i in 1:3) {
  x_curnt <- str_c("x_", i)
  x_prior <- str_c("x_",i-1)
  
  data <- if(i==1){
    data %>% mutate(!! x_curnt:= x + 1)} else {
    data %>% mutate(!! x_curnt:= x_prior + 1)
    }
}

data

Please don't mark this as a duplicate of "Unquote the variable name on the right side of mutate function in dplyr": this example is far simpler, and the solution in that (old) post doesn't work anymore, though it did help resolve this post.

CodePudding user response:

You were very close with your base R approach. You just needed to:

  • remove the data <- assignment in front of the if. In this base R approach each branch just creates a column in place; it does not return the data frame.
  • refer to data[["x"]] (or data$x) rather than just x in the i == 1 case.

Here's a complete working example:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)


data <- data.frame(x = 10:15)

for(i in 1:3) {
  x_curnt <- str_c("x_", i)
  x_prior <- str_c("x_",i-1)

  if(i==1){
    data[[x_curnt]] <- data[["x"]] + 1
  } else {
    data[[x_curnt]] <- data[[x_prior]] + 1
  }
}

data
#>    x x_1 x_2 x_3
#> 1 10  11  12  13
#> 2 11  12  13  14
#> 3 12  13  14  15
#> 4 13  14  15  16
#> 5 14  15  16  17
#> 6 15  16  17  18

Created on 2022-09-16 by the reprex package (v2.0.1)

CodePudding user response:

df <- data.frame(x = 10:15)

library(tidyverse)
bind_cols(df, map_dfc(1:3, ~transmute(df, !!str_c("x_", .x) := x + .x)))
#>    x x_1 x_2 x_3
#> 1 10  11  12  13
#> 2 11  12  13  14
#> 3 12  13  14  15
#> 4 13  14  15  16
#> 5 14  15  16  17
#> 6 15  16  17  18

Created on 2022-09-16 with reprex v2.0.2

CodePudding user response:

Following nwbort's and Limey's suggestions, I studied the post "Unquote the variable name on the right side of mutate function in dplyr" and it offers a solution. To make this work, the right-hand string (the x_prior string in my code x_curnt := x_prior + 1 above) must be converted from a plain string to a symbol, which can be done with sym() from the rlang package, and then unquoted with !!. So here's revised code that works:

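for(i in 1:3) {
  x_curnt <- str_c("x_", i)
  x_prior <- str_c("x_",i-1)
  
  # %>% is enough here because the result is assigned back with data <-
  data <- if(i==1){
    data %>% mutate(!! x_curnt:= x + 1)} else {
      # sym() turns the string into a column symbol; !! unquotes it
      data %>% mutate(!! x_curnt:= !!rlang::sym(x_prior) + 1)
    }
}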
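
For what it's worth, another way to look up a column by its string name inside mutate() is the .data pronoun that dplyr exports. The following is only a sketch of that variant, not part of the accepted fix. It keeps the same loop and if/else structure as above, but uses paste0() instead of str_c() so that dplyr is the only package it needs (that swap is my assumption, not something the original code requires).

```r
library(dplyr)

data <- data.frame(x = 10:15)

for (i in 1:3) {
  x_curnt <- paste0("x_", i)      # name of the column being created
  x_prior <- paste0("x_", i - 1)  # name of the column created one step earlier

  data <- if (i == 1) {
    # first pass: build on the base column x
    data %>% mutate(!!x_curnt := x + 1)
  } else {
    # later passes: .data[[x_prior]] fetches the column named by the string
    data %>% mutate(!!x_curnt := .data[[x_prior]] + 1)
  }
}

data
```

The only difference from the sym() version is the right-hand side of the else branch: .data[[x_prior]] takes the string directly, so no conversion to a symbol is needed.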