Dplyr: add multiple columns with mutate/across from character vector

Time:10-06

I want to add several columns (filled with NA) to a data.frame using dplyr. I've defined the names of the columns in a character vector. Usually, with only one new column, you can use the following pattern:

test %>% 
  mutate(!!new_column := NA)

However, I can't get this to work with across:

library(dplyr)

test <- data.frame(a = 1:3)
add_cols <- c("col_1", "col_2")

test %>% 
  mutate(across(!!add_cols, ~ NA))
#> Error: Problem with `mutate()` input `..1`.
#> x Can't subset columns that don't exist.
#> x Columns `col_1` and `col_2` don't exist.
#> ℹ Input `..1` is `across(c("col_1", "col_2"), ~NA)`.

test %>% 
  mutate(!!add_cols := NA)
#> Error: The LHS of `:=` must be a string or a symbol

expected_output <- data.frame(
  a = 1:3,
  col_1 = rep(NA, 3),
  col_2 = rep(NA, 3)
)
expected_output
#>   a col_1 col_2
#> 1 1    NA    NA
#> 2 2    NA    NA
#> 3 3    NA    NA

Created on 2021-10-05 by the reprex package (v1.0.0)

With the first approach, the column names are unquoted correctly, but across then immediately looks them up among the existing columns, where they don't exist yet. In the second approach, the left-hand side of := accepts only a single string or symbol, not a vector of names.

Is there a tidyverse solution or do I need to resort to the good old for loop?

CodePudding user response:

The !! operator works only for a single element, so one option is to loop over the names:

for(nm in add_cols) test <- test %>% 
         mutate(!! nm := NA)

-output

> test
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA
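
The loop can also be avoided by splicing a named list into mutate with `!!!`. The following is a minimal sketch rather than the only way to do it; the helper name `na_cols` is introduced here purely for illustration.

```r
library(dplyr)

test <- data.frame(a = 1:3)
add_cols <- c("col_1", "col_2")

# Build a named list holding one NA per new column:
# list(col_1 = NA, col_2 = NA)
na_cols <- setNames(rep(list(NA), length(add_cols)), add_cols)

# `!!!` splices the list into mutate() as if each element
# had been typed as `col_1 = NA, col_2 = NA`
result <- test %>%
  mutate(!!!na_cols)

result
```

A plain `NA` gives logical columns; if a specific type is needed, a typed missing value such as `NA_character_` or `NA_real_` can be used in the list instead.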

Or another option is

test %>% 
   bind_cols(setNames(rep(list(NA), length(add_cols)), add_cols))
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

In base R, this is easier

test[add_cols] <- NA

Which can be used in a pipe

test %>%
  `[<-`(., add_cols, value = NA)
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

across works only if the columns are already present: it loops over columns that exist in the data and either modifies them in place or, with the .names argument, creates new columns derived from them.
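
To illustrate what across is designed for, here is a small sketch that derives a new column from the existing column `a` via `.names`. The doubling function and the `_doubled` suffix are arbitrary choices made for this example.

```r
library(dplyr)

test <- data.frame(a = 1:3)

# across() selects the existing column `a`, applies the function to it,
# and `.names` controls the name of the resulting new column
with_copy <- test %>%
  mutate(across(a, ~ .x * 2, .names = "{.col}_doubled"))

with_copy
```

Because the selection step must match existing columns, passing names that are not yet in the data fails before the function is ever applied.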


We could also make use of add_column from tibble:

library(tibble)
library(janitor)
add_column(test, !!! add_cols) %>% 
   clean_names %>% 
   mutate(across(all_of(add_cols), ~ NA))
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

CodePudding user response:

Another approach:

library(tidyverse)
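# Return a one-column tibble of NA named after the string in `x`
f <- function(x) tibble(!!x := NA)
# map_dfc() binds the one-column tibbles side by side; mutate() unpacks
# the unnamed data frame into new columns, recycling the single row
mutate(test, map_dfc(add_cols, f))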