I have multiple data sets that I have been combining into one summary data frame in R. I am at the last step, but I am stymied.
Here is a dummy scenario: Imagine the two data frames df1 and df2 being
df1 df2
ID Make ID Max
1 A 1 2
1 B 2 3
2 A 3 5
2 B
2 B
2 C
3 B
I want to show all unique instances of "Make" for each ID and append that to df2 The resulting df2 should look like:
df2
ID Max Make
1 2 A,B
2 3 A,B,C
3 5 B
My attempts have not worked. This is the closest I have gotten:
df2[df1
, ':=' (Make = paste(Make, collapse = ","))
, by = .EACHI
, on = .(ID = ID)]
ID Max Make
1: 1 2 B
2: 2 3 C
3: 3 5 B
Any help is appreciated.
CodePudding user response:
It may be easier with a two step process - i.e summarise the 'df1' data Make
column by paste
ing the unique
elements by 'ID' and do a join with df2
on
the 'ID' column
library(data.table)
df2[df1[, .(Make = toString(unique(Make))), ID], on = .(ID)]
ID Max Make
<num> <num> <char>
1: 1 2 A, B
2: 2 3 A, B, C
3: 3 5 B
Or if we want to use .EACHI
, instead of doing the assignment (:=
), summarise the output within a list
(or .(
)
df1[df2, .(Max, Make = toString(unique(Make))), on = .(ID), by = .EACHI]
ID Max Make
<num> <num> <char>
1: 1 2 A, B
2: 2 3 A, B, C
3: 3 5 B
NOTE: Here we assume both datasets to be data.table
objects
data
df1 <- structure(list(ID = c(1, 1, 2, 2, 2, 2, 3), Make = c("A", "B",
"A", "B", "B", "C", "B")), row.names = c(NA, -7L), class = c("data.table",
"data.frame"))
df2 <- structure(list(ID = c(1, 2, 3), Max = c(2, 3, 5)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"))