Home > Software engineering >  dplyr Rename if column exists else
dplyr Rename if column exists else

Time:01-28

I am trying to build a pipe that it capable of handling various input types. I left_join to a master table that may have an identical column name. dplyr appends the matching column names with .x and .y in the joined table. I want rename the column that originated from the master table (.y) to its original name. The pipe must work for both input1 and input2 (as independent input). Currently my Rename step throws and error. I tried mutate() as well but gives a similar error.

> names(input1.df)
[1] "A"

> names(input2.df)
[1] "A" "B"

>names(MasterTable.df)
[1] "A" "B" "C" "D" "E"

joined.df <- input2.df %>%
  dplyr::left_join(MasterTable.df, by=("A")) %>%
  dplyr::rename(B = ifelse(B.y %in% names(.)) B.y, B) %>%
  dplyr::select(A, B) %>%
  dplyr::mutate(New_Column = ifelse(is.na(B), A, B))

CodePudding user response:

You can use rename_with(.fn=):

input1.df <- data.frame(A=1)
input2.df <- data.frame(A=1, B=2)
MasterTable.df <- data.frame(A=1, B=2, C=3, D=4, E=5)
library(dplyr)
input2.df %>%
  left_join(MasterTable.df, by=("A")) %>%
  rename_with(.fn = ~ gsub("\\.y$", "", .))
#   A B.x B C D E
# 1 1   2 2 3 4 5

and you can select(-ends_with(".x")) if desired.

The ~-func is a bit naive in that it does not check if the non-.y column already exists, but it looks like that isn't a factor in your case.

Perhaps a different approach, though, is to change the suffixes at join time:

input2.df %>%
  left_join(MasterTable.df, by=("A"), suffix=c(".x", ""))
#   A B.x B C D E
# 1 1   2 2 3 4 5

CodePudding user response:

Without dataset and output example I'm not sure I'm getting it but I'll give it a try :

Why dont you just rename your .y or .x col after the left joining ?

joined.df <- left_join(MasterTable.df,input2.df, by=("A"))
names(joined.df) <- gsub(".y","",names(joined.df))
names(joined.df) <- gsub(".x","",names(joined.df))
  • Related