Home > other >  How to merge one file with another with conditions on match across multiple columns in R
How to merge one file with another with conditions on match across multiple columns in R

Time:01-17

I am stuck on what seems to be a doable task in R. I am merging several files and am required to change the name of columns as I go to maintain the data. There might be a better way to do that buy that is another story. But in very simple terms I have two files dfA and dfB as below. I need to merge the two files by "model" on dfB against EITHER column "PART1" or "PART2" or "PART3" depending on a match.

enter image description here

CodePudding user response:

We could do the following:

  1. Bring dfA into long format (note the use of the argument values_transform see here: pivot_longer: values_ptypes: can't convert <integer> to <character>

  2. then use right_join by the appropriate columns and do some select:

library(dplyr)
library(tidyr)

dfA %>% 
  pivot_longer(
    starts_with("PART"),
    names_to = "key",
    values_to = "val", 
    values_transform = list(val = as.character)
  ) %>% 
  right_join(dfB, by=c("val"="Model")) %>% 
  select(Model=val, Detail)
  Model Detail
  <chr> <chr> 
1 A     Dog   
2 2     Cat   
3 Z     Cow 

data:

dfA <- tibble(PART1 = c("A", "B", "C"),
              PART2 = c("X", "Y", "Z"),
              PART3 = c(1,2,3),
              Detail = c("Dog", "Cat", "Cow"))

dfB <- tibble(Model = c("A", "Z", 2))

CodePudding user response:

Try this simple yet not ideally optimal way:

 library("dplyr")
 dfC <- 
 union(
 union(
 left_join(dfB, dfA, by = c("Model" = "PART1")), 
 left_join(dfB, dfA, by = c("Model" = "PART2")) 
 ),
 left_join(dfB, dfA, by = c("Model" = "PART3"))
 )
  •  Tags:  
  • Related