Merge, and its result in R

Time:12-22

I want to merge datasets in R and find out which rows were matched by the merge. In Stata, a _merge column is generated automatically after a merge, and it takes three values: master only (1), using only (2), and matched (3). You can see the output screenshot here.

I think R has an equivalent, but it is hard to search for.

CodePudding user response:

I'd add a marker column to each data frame before merging, so the source of each row can be identified afterwards:

df1 <- data.frame(x=c("a","b","c"), y=c(1,2,3))
df2 <- data.frame(x=c("a","b","d"), z=c(1,2,NA))

# solution:
df1$in1 <- TRUE
df2$in2 <- TRUE
# keep all rows from both sides and store the result for the next step
df3 <- merge(df1, df2, all=TRUE)

To add the labels from your example:

df3$source <- ifelse(df3$in1 & is.na(df3$in2), "master only", 
                     ifelse(df3$in2 & is.na(df3$in1), "using only", "matched"))
df3$in1 <- NULL
df3$in2 <- NULL
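
If you need this more than once, the steps above can be wrapped in a small helper. This is only a sketch: `merge_with_indicator`, the `merge_status` column and the temporary `.in_master` / `.in_using` columns are names I made up for illustration, not something base R provides, and it assumes your data frames do not already use those names. The codes 1, 2 and 3 follow the Stata values from the question.

```r
# Hypothetical helper (not part of base R): merge two data frames and
# record where each row came from, similar to Stata's _merge column.
merge_with_indicator <- function(master, using, by) {
  # Temporary marker columns; assumed not to clash with existing names.
  master$.in_master <- TRUE
  using$.in_using <- TRUE

  # Full outer join: rows missing on one side get NA in that side's marker.
  out <- merge(master, using, by = by, all = TRUE)

  in_master <- !is.na(out$.in_master)
  in_using <- !is.na(out$.in_using)

  # 1 = master only, 2 = using only, 3 = matched
  code <- ifelse(in_master & in_using, 3L, ifelse(in_master, 1L, 2L))
  out$merge_status <- factor(code, levels = 1:3,
                             labels = c("master only", "using only", "matched"))

  # Drop the temporary markers again.
  out$.in_master <- NULL
  out$.in_using <- NULL
  out
}

df1 <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))
df2 <- data.frame(x = c("a", "b", "d"), z = c(1, 2, NA))

res <- merge_with_indicator(df1, df2, by = "x")
print(res)
table(res$merge_status)
```

Using marker columns rather than checking `is.na(res$z)` matters here, because `z` can be NA in the original data (as in the row for "d") without the row being unmatched.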