I have two dataframes that look like this:
df_1 <- tibble(id = c(1,1,2,2,3,3), y = c("a", "b", "a", "b","a", "b"))
df_2 <- tibble(id = c(1,1,3,3), z = c(4,5,6,5))
I want to merge the two dfs such that it looks like this:
df_3 <- tibble(id = c(1,1,2,2,3,3), y = c("a", "b", "a", "b","a", "b"), z = c(4,5,NA,NA,6,5))
How may I do this in R? Thank you!
CodePudding user response:
Create a sequence column by 'id' and then join (as there are duplicates for 'id' in both datasets)
library(dplyr)
library(data.table)
df_1 %>%
mutate(rn = rowid(id)) %>%
left_join(df_2 %>%
mutate(rn = rowid(id))) %>%
select(-rn)
-output
# A tibble: 6 × 3
id y z
<dbl> <chr> <dbl>
1 1 a 4
2 1 b 5
3 2 a NA
4 2 b NA
5 3 a 6
6 3 b 5
CodePudding user response:
left_join(df_1, cbind(df_1 %>% filter(id %in% df_2$id), df_2 %>% select(z)))
Output:
# A tibble: 6 x 3
id y z
<dbl> <chr> <dbl>
1 1 a 4
2 1 b 5
3 2 a NA
4 2 b NA
5 3 a 6
6 3 b 5