Home > Back-end >  Merging Unequal dataframes showing all NA values instead of actual values
Merging Unequal dataframes showing all NA values instead of actual values

Time:04-14

I'm having trouble merging two unequal data frames. Trying to do a left join and the right table is resulting in all NA values when joined by 'Id' and 'date'. I tried the below but wasn't successful. Structures are all the same.

Date = m/d/y
Left Table:
id  Date       V1  V2  V3
1   4/13/2022  42  69  99
2   4/10/2022  32  58  44
3   3/15/2022  12  23  95
4   3/10/2022  84  42  63

Right Table:
id  Day        X1  X2 
1   4/13/2022  10  9
2   4/09/2022   8  7
3   3/15/2022   6  5


left_right_merged <- merge(
   x = left_table, 
   y = right_table, 
   by.x = c("id", "date"), 
   by.y = c("id", "day"), 
   all.x= TRUE
)       
                                                                                 

Expected:
Id  Date       V1  V2  V3  X1  X2
1   4/13/2022  42  69  99  10   9
2   4/09/2022  NA  NA  NA   8   7
2   4/10/2022  32  58  44  NA  NA
3   3/15/2022  12  23  95   6   5
4   3/10/2022  84  42  63  NA  NA

Actual:
Id  Date       V1  V2  V3  X1  X2
1   4/13/2022  42  69  99  NA  NA
2   4/09/2022  NA  NA  NA  NA  NA
2   4/10/2022  32  58  44  NA  NA
3   3/15/2022  12  23  95  NA  NA
4   3/10/2022  84  42  63  NA  NA

CodePudding user response:

There's a minor issue with your column names not being the same in the code and input (as mentioned in a comment). But the main issue is that you need all.y as well as all.x to get that output:

left_table  <- read.table(text = "id  Date       V1  V2  V3
1   4/13/2022  42  69  99
2   4/10/2022  32  58  44
3   3/15/2022  12  23  95
4   3/10/2022  84  42  63", header = T)

right_table  <- read.table(text = "id  Day        X1  X2 
1   4/13/2022  10  9
2   4/09/2022   8  7
3   3/15/2022   6  5", header = T)

merge(x = left_table, y = right_table, by.x = c("id", "Date"), by.y = c("id", "Day"), all.x= TRUE, all.y = TRUE)       
                      
# id      Date V1 V2 V3 X1 X2
# 1  1 4/13/2022 42 69 99 10  9
# 2  2 4/09/2022 NA NA NA  8  7
# 3  2 4/10/2022 32 58 44 NA NA
# 4  3 3/15/2022 12 23 95  6  5
# 5  4 3/10/2022 84 42 63 NA NA

CodePudding user response:

A tidyverse solution, as requested.

library(tidyverse)

# Generate test data
leftTable <- read.table(textConnection("id  Date       V1  V2  V3
1   4/13/2022  42  69  99
2   4/10/2022  32  58  44
3   3/15/2022  12  23  95
4   3/10/2022  84  42  63"), header=TRUE) %>% 
mutate(Date=lubridate::mdy(Date))

rightTable <- read.table(textConnection("id  Day        X1  X2 
1   4/13/2022  10  9
2   4/09/2022   8  7
3   3/15/2022   6  5"), header=TRUE) %>% 
mutate(Day=lubridate::mdy(Day))

# Solve the problem
leftTable %>% 
  full_join(rightTable, by=c("id", "Date"="Day")) %>% 
  arrange(id, Date)
  id       Date V1 V2 V3 X1 X2
1  1 2022-04-13 42 69 99 10  9
2  2 2022-04-09 NA NA NA  8  7
3  2 2022-04-10 32 58 44 NA NA
4  3 2022-03-15 12 23 95  6  5
5  4 2022-03-10 84 42 63 NA NA
  • Related