Merging data frames in R without duplicating rows in x due to repeated values in y-CodePudding

I have two data frames EMOJ and EYETRACK and I need to merge them by "session", without duplicating rows.

Dataframes:

> EMOJ
  session age attitude
1    s001  18        2
2    s002  22        4


> EYETRACK
  session stimuli response_time
1    s001       A          1023
2    s001       B          1009
3    s001       C          1832
4    s002       A          1092
5    s002       B          1076

What I want:

  session age attitude stimuli response_time
1    s001  18        2       A          1023
2    s001                    B          1009
3    s001                    C          1832
4    s002  22        4       A          1092
5    s002                    B          1076

What I am getting:

df <- merge(EMOJ, EYETRACK, by.x = 'session', by.y = 'session')

  session age attitude stimuli response_time
1    s001  18        2       A          1023
2    s001  18        2       B          1009
3    s001  18        2       C          1832
4    s002  22        4       A          1092
5    s002  22        4       B          1076

CodePudding user response：

I came up with this but I had to add 'stimuli' to the EMOJ df

EMOJ$stimuli <- 'A'

df1 <- merge(EMOJ, EYETRACK, by = c('session','stimuli'), all = TRUE)

CodePudding user response：

Using the dplyr package:

EYETRACK %>%
  left_join(distinct(EMOJ, session, .keep_all = TRUE))

CodePudding user response：

One approach is to number each row within each session consecutively. Then merge on both the session and this index number. Only the first row within each session from both data.frames will be merged. If there's only 1 row per session in EMOJ, can simply use EMOJ$i <- 1.

library(data.table)

EMOJ$i <- rowid(EMOJ$session)
EYETRACK$i <- rowid(EYETRACK$session)

merge(EYETRACK, EMOJ, by = c("session", "i"), all.x = T)

Output

  session i stimuli response_time age attitude
1    s001 1       A          1023  18        2
2    s001 2       B          1009  NA       NA
3    s001 3       C          1832  NA       NA
4    s002 1       A          1092  22        4
5    s002 2       B          1076  NA       NA