I'm having trouble merging two unequal data frames. Trying to do a left join and the right table is resulting in all NA values when joined by 'Id' and 'date'. I tried the below but wasn't successful. Structures are all the same.
Date = m/d/y
Left Table:
id Date V1 V2 V3
1 4/13/2022 42 69 99
2 4/10/2022 32 58 44
3 3/15/2022 12 23 95
4 3/10/2022 84 42 63
Right Table:
id Day X1 X2
1 4/13/2022 10 9
2 4/09/2022 8 7
3 3/15/2022 6 5
left_right_merged <- merge(
x = left_table,
y = right_table,
by.x = c("id", "date"),
by.y = c("id", "day"),
all.x= TRUE
)
Expected:
Id Date V1 V2 V3 X1 X2
1 4/13/2022 42 69 99 10 9
2 4/09/2022 NA NA NA 8 7
2 4/10/2022 32 58 44 NA NA
3 3/15/2022 12 23 95 6 5
4 3/10/2022 84 42 63 NA NA
Actual:
Id Date V1 V2 V3 X1 X2
1 4/13/2022 42 69 99 NA NA
2 4/09/2022 NA NA NA NA NA
2 4/10/2022 32 58 44 NA NA
3 3/15/2022 12 23 95 NA NA
4 3/10/2022 84 42 63 NA NA
CodePudding user response:
There's a minor issue with your column names not being the same in the code and input (as mentioned in a comment). But the main issue is that you need all.y
as well as all.x
to get that output:
left_table <- read.table(text = "id Date V1 V2 V3
1 4/13/2022 42 69 99
2 4/10/2022 32 58 44
3 3/15/2022 12 23 95
4 3/10/2022 84 42 63", header = T)
right_table <- read.table(text = "id Day X1 X2
1 4/13/2022 10 9
2 4/09/2022 8 7
3 3/15/2022 6 5", header = T)
merge(x = left_table, y = right_table, by.x = c("id", "Date"), by.y = c("id", "Day"), all.x= TRUE, all.y = TRUE)
# id Date V1 V2 V3 X1 X2
# 1 1 4/13/2022 42 69 99 10 9
# 2 2 4/09/2022 NA NA NA 8 7
# 3 2 4/10/2022 32 58 44 NA NA
# 4 3 3/15/2022 12 23 95 6 5
# 5 4 3/10/2022 84 42 63 NA NA
CodePudding user response:
A tidyverse solution, as requested.
library(tidyverse)
# Generate test data
leftTable <- read.table(textConnection("id Date V1 V2 V3
1 4/13/2022 42 69 99
2 4/10/2022 32 58 44
3 3/15/2022 12 23 95
4 3/10/2022 84 42 63"), header=TRUE) %>%
mutate(Date=lubridate::mdy(Date))
rightTable <- read.table(textConnection("id Day X1 X2
1 4/13/2022 10 9
2 4/09/2022 8 7
3 3/15/2022 6 5"), header=TRUE) %>%
mutate(Day=lubridate::mdy(Day))
# Solve the problem
leftTable %>%
full_join(rightTable, by=c("id", "Date"="Day")) %>%
arrange(id, Date)
id Date V1 V2 V3 X1 X2
1 1 2022-04-13 42 69 99 10 9
2 2 2022-04-09 NA NA NA 8 7
3 2 2022-04-10 32 58 44 NA NA
4 3 2022-03-15 12 23 95 6 5
5 4 2022-03-10 84 42 63 NA NA