We want to rename columns in a data frame in R, but some of the columns may be missing, and that throws an error:
my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))
my_df %>% dplyr::rename(aa = a, bb = b, cc = c)
Error: Can't rename columns that don't exist.
x Column `c` doesn't exist.
Our desired output is the following, where a column that does not exist in the original is created and filled with NA values:
> my_df
aa bb c
1 1 4 NA
2 2 5 NA
3 3 6 NA
CodePudding user response:
You can rename with a named vector wrapped in any_of(), which won't error on missing variables. I'm not aware of a dplyr way to then create the missing variables, but it's easy enough in base R.
library(dplyr)
cols <- c(aa = "a", bb = "b", cc = "c")
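my_df %>%
  # rename only the columns that actually exist
  rename(any_of(cols)) %>%
  # add any new names that are still absent, filled with NA
  `[<-`(., , setdiff(names(cols), names(.)), NA)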
aa bb cc
1 1 4 NA
2 2 5 NA
3 3 6 NA
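If you would rather not depend on dplyr at all, the same idea (rename what exists, then add the rest as NA) can be sketched as a small base R helper. The function name rename_or_add and its mapping argument are made up for this illustration and are not part of any package.

```r
# Hypothetical helper: rename the columns that exist, add the others as NA.
# `mapping` is a named character vector: names are new names, values are old names.
rename_or_add <- function(df, mapping) {
  # Keep only the mappings whose old column is present in df
  present <- mapping[mapping %in% names(df)]
  names(df)[match(present, names(df))] <- names(present)

  # New names that are still absent get an NA column
  missing_new <- setdiff(names(mapping), names(df))
  if (length(missing_new) > 0) {
    df[missing_new] <- NA
  }
  df
}

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
result <- rename_or_add(my_df, c(aa = "a", bb = "b", cc = "c"))
print(result)
```

Note that the added column is a logical NA; use NA_real_ or NA_character_ instead if the column type matters downstream.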
CodePudding user response:
A possible solution:
library(tidyverse)
my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))
cols <- c(a = NA_real_, b = NA_real_, c = NA_real_)
my_df %>%
  # add the missing columns as NA first, so the rename below cannot fail
  add_column(!!!cols[!names(cols) %in% names(.)]) %>%
  rename(aa = a, bb = b, cc = c)
#> aa bb cc
#> 1 1 4 NA
#> 2 2 5 NA
#> 3 3 6 NA
CodePudding user response:
Here is a solution using the data.table function setnames. I've added a second "missing" column "d" to demonstrate generality.
library(tidyverse)
library(data.table)
my_df <- data.frame(a = c(1,2,3), b = c(4,5,6))
curr <- names(my_df)
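cols <- data.frame(new = c("aa", "bb", "cc", "dd"), old = c("a", "b", "c", "d")) %>%
  mutate(exist = old %in% curr)
foo <- filter(cols, exist)   # mappings whose old column is present
bar <- filter(cols, !exist)  # mappings whose old column is missing
# pass `old` explicitly so the rename does not depend on column order
setnames(my_df, old = foo$old, new = foo$new)
my_df[, bar$old] <- NA       # add the missing columns, filled with NA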
my_df
# aa bb c d
#1 1 4 NA NA
#2 2 5 NA NA
#3 3 6 NA NA
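A shorter variant of the same approach, as a sketch: setnames() has a skip_absent argument, so the bookkeeping data frame is not strictly needed. This assumes a data.table version recent enough to have skip_absent; the variable names old, new and absent are only for illustration.

```r
library(data.table)

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
old <- c("a", "b", "c", "d")
new <- c("aa", "bb", "cc", "dd")

# Record which old names are absent before renaming
absent <- setdiff(old, names(my_df))

# Rename by reference; old names not found in my_df are skipped
setnames(my_df, old = old, new = new, skip_absent = TRUE)

# Add the absent columns as NA, keeping their original names as in the answer above
my_df[absent] <- NA
my_df
```

As in the answer above, the missing columns keep their old names ("c", "d"). If you want them under the new names instead, compute absent from new after the rename, for example with setdiff(new, names(my_df)).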