Adding columns to data frames in double for loop

Time:12-09

I have the following setup:

library(tibble)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("1" = "hallo")
df3 <- tibble("1" = "hallo")
missing_columns <- c("2", "3")

My goal is to add the columns listed in missing_columns to each data frame.

I tried

for(i in df_names){
  
  for(j in missing_columns){
    
    get(i)[, j] <- ""
    
  }
  
}

Error in get(i) <- `*vtmp*` : could not find function "get<-"

and

for(i in df_names){
  
  for(j in missing_columns){
    
    assign(get(i)[, j], "")
    
  }
  
}

Error: Can't subset columns that don't exist.
x Column `2` doesn't exist.

Of course column 2 does not exist; that's why I want to add it.

CodePudding user response:

You need to assign the modified objects back to the global environment to have access to them after running the code:

library(tidyverse)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("1" = "hallo")
df3 <- tibble("1" = "hallo")
missing_columns <- c("2", "3")

df1
#> # A tibble: 1 x 1
#>   `1`  
#>   <chr>
#> 1 hallo
df2
#> # A tibble: 1 x 1
#>   `1`  
#>   <chr>
#> 1 hallo

expand_grid(
  col = missing_columns,
  df = df_names
) %>%
  mutate(
    new_df = map2(col, df, ~ {
      res <- get(.y)
      res[[.x]] <- "foo"
      assign(.y, res, envir = globalenv())
    })
  )
#> # A tibble: 6 x 3
#>   col   df    new_df          
#>   <chr> <chr> <list>          
#> 1 2     df1   <tibble [1 × 2]>
#> 2 2     df2   <tibble [1 × 2]>
#> 3 2     df3   <tibble [1 × 2]>
#> 4 3     df1   <tibble [1 × 3]>
#> 5 3     df2   <tibble [1 × 3]>
#> 6 3     df3   <tibble [1 × 3]>

df1
#> # A tibble: 1 x 3
#>   `1`   `2`   `3`  
#>   <chr> <chr> <chr>
#> 1 hallo foo   foo
df2
#> # A tibble: 1 x 3
#>   `1`   `2`   `3`  
#>   <chr> <chr> <chr>
#> 1 hallo foo   foo

Created on 2021-12-09 by the reprex package (v2.0.1)
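
For comparison, the same fetch, modify, and write-back pattern also fits the double for loop from the question. This is only a minimal sketch, assuming the data frames live in the environment the loop runs in (the global environment when run at top level); the names `env` and `tmp` are illustrative. `get(i)[, j] <- ""` fails because R rewrites it into a call to a replacement function `get<-`, which does not exist, so the object is fetched, changed, and assigned back explicitly:

```r
library(tibble)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("1" = "hallo")
df3 <- tibble("1" = "hallo")
missing_columns <- c("2", "3")

# environment the data frames live in (global when run at top level)
env <- environment()

for (i in df_names) {
  tmp <- get(i, envir = env)    # fetch the data frame by its name
  for (j in missing_columns) {
    tmp[[j]] <- ""              # add the column; the value is recycled to every row
  }
  assign(i, tmp, envir = env)   # write the modified copy back under the same name
}

names(df1)
```

Fetching once per data frame and assigning once after the inner loop avoids repeated lookups, but the result should be the same as assigning after every column.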

CodePudding user response:

Depending on what your final goal is, this approach might be useful for you.

library(tibble)

df_names <- c("df1", "df2", "df3")
# note the small change in sample data
df1 <- tibble("1" = "hallo")
df2 <- tibble("2" = "hallo")
df3 <- tibble("3" = "hallo")

# I suggest working with the required columns; whatever is not there gets added as NA
required <- c("1", "2", "3")

dfs <- lapply(df_names, function(df) {
  t <- get(df)
  t[setdiff(required, names(t))] <- NA
  t
})

dfs

[[1]]
# A tibble: 1 x 3
  `1`   `2`   `3`  
  <chr> <lgl> <lgl>
1 hallo NA    NA   

[[2]]
# A tibble: 1 x 3
  `2`   `1`   `3`  
  <chr> <lgl> <lgl>
1 hallo NA    NA   

[[3]]
# A tibble: 1 x 3
  `3`   `1`   `2`  
  <chr> <lgl> <lgl>
1 hallo NA    NA   

# if you want to combine the data anyway
do.call("rbind", dfs)

# A tibble: 3 x 3
  `1`   `2`   `3`  
  <chr> <chr> <chr>
1 hallo NA    NA   
2 NA    hallo NA   
3 NA    NA    hallo
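
If combining is the end goal, a sketch along these lines may avoid adding the missing columns by hand. It assumes the dplyr package is installed, which the code above does not require; `dplyr::bind_rows()` matches columns by name and fills columns absent from an input with NA. `mget()` is base R and collects the data frames into a named list. The names `combined` and `source` are illustrative.

```r
library(tibble)
library(dplyr)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("2" = "hallo")
df3 <- tibble("3" = "hallo")

# collect the data frames into a named list instead of looking them up one by one
dfs <- mget(df_names)

# bind_rows() matches columns by name; columns missing from an input are filled with NA
# .id adds a column recording which list element each row came from
combined <- bind_rows(dfs, .id = "source")
combined
```

Keeping the data frames in a named list from the start would make the `get()` and `mget()` lookups unnecessary.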