R with dplyr rename, avoid error if column doesn't exist AND create new column with NAs

Time:02-10

We want to rename columns in a data frame in R, but some of the columns may be missing, in which case rename() throws an error:

library(dplyr)

my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))
my_df %>% dplyr::rename(aa = a, bb = b, cc = c)

Error: Can't rename columns that don't exist.
x Column `c` doesn't exist.

Our desired output is shown below: if the original column does not exist, a new column filled with NA values is created instead:

> my_df
  aa bb  c
1  1  4 NA
2  2  5 NA
3  3  6 NA

CodePudding user response:

You can pass a named vector to any_of() inside rename(); unlike a plain rename, this does not error on missing variables. I'm not aware of a dplyr way to then create the missing variables, but it's easy enough in base R.

library(dplyr)

cols <- c(aa = "a", bb = "b", cc = "c")

my_df %>%
  rename(any_of(cols)) %>%
  `[<-`(., , setdiff(names(cols), names(.)), NA)

  aa bb cc
1  1  4 NA
2  2  5 NA
3  3  6 NA
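
The same two steps can be wrapped in a small function. This is only a sketch: the name rename_or_create is hypothetical (it is not part of dplyr), and it assumes cols is a named character vector of the form new_name = "old_name". Missing columns are created under their new names, as in the output above.

```r
library(dplyr)

# Hypothetical helper, not a dplyr function.
# cols: named character vector, new_name = "old_name".
rename_or_create <- function(df, cols) {
  # any_of() silently skips old names that are absent from df
  out <- rename(df, any_of(cols))
  # new names that still do not exist after the rename
  missing_cols <- setdiff(names(cols), names(out))
  # create each missing column, filled with NA
  for (nm in missing_cols) out[[nm]] <- NA
  out
}

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
result <- rename_or_create(my_df, c(aa = "a", bb = "b", cc = "c"))
result
```

The loop is a no-op when nothing is missing, so the function can be applied to data frames that already contain every column.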

CodePudding user response:

A possible solution: first add any missing columns (filled with NA) using add_column(), then rename as usual:

library(tidyverse)

my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))

cols <- c(a = NA_real_, b = NA_real_, c = NA_real_)

my_df %>% add_column(!!!cols[!names(cols) %in% names(.)]) %>% 
  rename(aa = a, bb = b, cc = c)

#>   aa bb cc
#> 1  1  4 NA
#> 2  2  5 NA
#> 3  3  6 NA

CodePudding user response:

Here is a solution using the data.table function setnames. I've added a second "missing" column "d" to demonstrate generality.

library(tidyverse)
library(data.table)
my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))
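curr <- names(my_df)
# mapping from old to new names, flagged by whether the old column exists
cols <- data.frame(new = c("aa","bb","cc","dd"), old = c("a","b","c","d")) %>%
  mutate(exist = old %in% curr)
foo <- filter(cols, exist)   # columns to rename
bar <- filter(cols, !exist)  # columns to create
# pass `old` explicitly so the rename does not rely on column order or count
setnames(my_df, old = foo$old, new = foo$new)
my_df[, bar$old] <- NA
my_df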

#> my_df
#  aa bb  c  d
#1  1  4 NA NA
#2  2  5 NA NA
#3  3  6 NA NA
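
A shorter variant of the same idea, as a sketch: setnames() accepts skip_absent = TRUE, which makes it ignore old names that are not present, so the bookkeeping data frame is not needed. The variable names old, new and absent are only illustrative, and the missing columns are created under their old names to match the output above.

```r
library(data.table)

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
old <- c("a", "b", "c", "d")
new <- c("aa", "bb", "cc", "dd")

# old names that are not in my_df; record them before renaming
absent <- setdiff(old, names(my_df))

# skip_absent = TRUE: rename what exists, ignore the rest
setnames(my_df, old = old, new = new, skip_absent = TRUE)

# create the absent columns, filled with NA
for (nm in absent) my_df[[nm]] <- NA
my_df
```

To create the missing columns under their new names instead, loop over setdiff(new, names(my_df)) after the call to setnames().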