I have two data frames EMOJ and EYETRACK and I need to merge them by "session", without duplicating rows.
Dataframes:
> EMOJ
session age attitude
1 s001 18 2
2 s002 22 4
> EYETRACK
session stimuli response_time
1 s001 A 1023
2 s001 B 1009
3 s001 C 1832
4 s002 A 1092
5 s002 B 1076
What I want:
session age attitude stimuli response_time
1 s001 18 2 A 1023
2 s001 B 1009
3 s001 C 1832
4 s002 22 4 A 1092
5 s002 B 1076
What I am getting:
df <- merge(EMOJ, EYETRACK, by.x = 'session', by.y = 'session')
session age attitude stimuli response_time
1 s001 18 2 A 1023
2 s001 18 2 B 1009
3 s001 18 2 C 1832
4 s002 22 4 A 1092
5 s002 22 4 B 1076
CodePudding user response:
I came up with this but I had to add 'stimuli' to the EMOJ df
EMOJ$stimuli <- 'A'
df1 <- merge(EMOJ, EYETRACK, by = c('session','stimuli'), all = TRUE)
CodePudding user response:
Using the dplyr
package:
EYETRACK %>%
left_join(distinct(EMOJ, session, .keep_all = TRUE))
CodePudding user response:
One approach is to number each row within each session consecutively. Then merge
on both the session
and this index number. Only the first row within each session
from both data.frames will be merged. If there's only 1 row per session
in EMOJ, can simply use EMOJ$i <- 1
.
library(data.table)
EMOJ$i <- rowid(EMOJ$session)
EYETRACK$i <- rowid(EYETRACK$session)
merge(EYETRACK, EMOJ, by = c("session", "i"), all.x = T)
Output
session i stimuli response_time age attitude
1 s001 1 A 1023 18 2
2 s001 2 B 1009 NA NA
3 s001 3 C 1832 NA NA
4 s002 1 A 1092 22 4
5 s002 2 B 1076 NA NA