Home > other >  Merging two dataframes in R with missing values
Merging two dataframes in R with missing values

Time:04-02

I have two dataframes that look like this:

df_1 <- tibble(id = c(1,1,2,2,3,3), y = c("a", "b", "a", "b","a", "b"))
df_2 <- tibble(id = c(1,1,3,3), z = c(4,5,6,5))

I want to merge the two dfs such that it looks like this:

df_3 <- tibble(id = c(1,1,2,2,3,3), y = c("a", "b", "a", "b","a", "b"), z = c(4,5,NA,NA,6,5))

How may I do this in R? Thank you!

CodePudding user response:

Create a sequence column by 'id' and then join (as there are duplicates for 'id' in both datasets)

library(dplyr)
library(data.table)
df_1 %>% 
   mutate(rn = rowid(id)) %>%
   left_join(df_2 %>% 
      mutate(rn = rowid(id))) %>% 
    select(-rn)

-output

# A tibble: 6 × 3
     id y         z
  <dbl> <chr> <dbl>
1     1 a         4
2     1 b         5
3     2 a        NA
4     2 b        NA
5     3 a         6
6     3 b         5

CodePudding user response:

left_join(df_1, cbind(df_1 %>% filter(id %in% df_2$id), df_2 %>% select(z)))

Output:

# A tibble: 6 x 3
     id y         z
  <dbl> <chr> <dbl>
1     1 a         4
2     1 b         5
3     2 a        NA
4     2 b        NA
5     3 a         6
6     3 b         5
  • Related