Home > Back-end >  Find two columns that match in two dataframes and put third column from dataframe 2 into a new colum
Find two columns that match in two dataframes and put third column from dataframe 2 into a new colum

Time:11-14

I have 2 dataframes:

df1:
word1 previousWord
a     na
b     a
c     b

another dataframe looks like this

df2: this contains more pairs than exist in df1. It contains every combo possible

word1 previousWord Score
a     a            1
a     b            .5
a     c            .9
b     a            .5
b     b            1
b     c            .2
c     a            .9
c     b            .2
c     c            1

I want to find when the pairs from df1(i.e. b-a, c-b) in df2 and copy the score from df2 and add it in a new column in df1.

For example, the output would look like this:

word1 previousWord Score
a     na           na
b     a            .5
c     b            .2

This is what I have tried, however it appears like it deleted a lot of my data from df1. Switching the order did not eliminate that issue.

df3<-merge(df2, df1, by = c("word1", "previousWord"))

Any help is much appreciated.

CodePudding user response:

You can use left_join() from dplyr here.

library(dplyr)
df3<- left_join(df1, df2, by = c("word1", "previousWord"))

Output

  word1 previousWord Score
1     a         <NA>    NA
2     b            a   0.5
3     c            b   0.2

Data

df1 <- structure(list(word1 = c("a", "b", "c"), previousWord = c(NA, 
"a", "b")), class = "data.frame", row.names = c(NA, -3L))

df2 <- structure(list(word1 = c("a", "a", "a", "b", "b", "b", "c", "c", 
"c"), previousWord = c("a", "b", "c", "a", "b", "c", "a", "b", 
"c"), Score = c(1, 0.5, 0.9, 0.5, 1, 0.2, 0.9, 0.2, 1)), class = "data.frame", row.names = c(NA, 
-9L))
  • Related