I have 2 dataframes with different data, but with some similar id. I'd like to extract columns containing id of both dataframes, compare them and see which id from the first dataframe doesn't have a pair in another, vice versa. Here is the example:
>df1
id
1
2
3
>df2
id
2
3
4
The requested output would be something like this:
>df.new
df1_id df2_id
1 NA
2 2
3 3
NA 4
or it also could be like this:
>df.new
id
1
4
better if both variants are there.
CodePudding user response:
Using dplyr
:
First case:
full_join(df1 %>% mutate(temp = id), df2 %>% mutate(temp2 = id), by = "id") %>%
rename(df1_id = temp,
df2_id = temp2) %>%
select(-id)
Second case:
bind_rows(df1 %>%
filter(!(id %in% df2$id)),
df2 %>%
filter(!(id %in% df1$id)))
Outputs:
df1_id df2_id
1 1 NA
2 2 2
3 3 3
4 NA 4
and
id
1 1
2 4