Suppose I have these data:
data1 <- read.delim(textConnection(
"id val1
1 blue
1 green
1 red
2 black
2 brown
2 white"
), sep=' ')
data2 <- read.delim(textConnection(
"id val2
1 cat
1 dog
1 fish
2 hat
2 coat
2 car"
), sep=' ')
I would like to calculate all combinations of blue, green, and red with cat, dog, and fish for id=1, and of black, brown, and white with hat, coat, and car for id=2. I could do it in a for loop with expand.grid and then build the output using rbind, but my actual data have several IDs and several vals, so it runs slowly.
CodePudding user response:
It turns out that merge does this by default:
> merge(data1, data2, by='id')
id val1 val2
1 1 blue cat
2 1 blue dog
3 1 blue fish
4 1 green cat
5 1 green dog
6 1 green fish
7 1 red cat
8 1 red dog
9 1 red fish
10 2 black hat
11 2 black coat
12 2 black car
13 2 brown hat
14 2 brown coat
15 2 brown car
16 2 white hat
17 2 white coat
18 2 white car
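Since the question mentions several vals, the same idea can probably be chained over more than two tables with Reduce. This is only a minimal sketch: data3 and its val3 column are hypothetical and were invented for illustration, they are not part of the question.

```r
# Rebuild the question's data so the example is self-contained
data1 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    val1 = c("blue", "green", "red", "black", "brown", "white"))
data2 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    val2 = c("cat", "dog", "fish", "hat", "coat", "car"))
# Hypothetical third table, not from the question
data3 <- data.frame(id = c(1, 1, 2, 2),
                    val3 = c("big", "small", "old", "new"))

# merge() pairs every row of x with every row of y that shares the key,
# so folding it over a list yields the per-id Cartesian product
combos <- Reduce(function(x, y) merge(x, y, by = "id"),
                 list(data1, data2, data3))

nrow(combos)  # 3 * 3 * 2 = 18 rows per id, 36 in total
```

Each intermediate merge grows multiplicatively, so with many vals per id the result can get large quickly.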
CodePudding user response:
In base R, we could use split on both datasets to create a list of values by 'id', then apply expand.grid to the corresponding elements of the two lists and rbind the results (if needed):
Map(expand.grid, split(data1$val1, data1$id), split(data2$val2, data2$id))
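The Map call returns a list with one data frame per id, so the optional rbind step could look roughly like the sketch below. The column names val1 and val2 are set explicitly here (plain expand.grid would call them Var1 and Var2), and the helper names per_id and combos are just illustrative.

```r
# Rebuild the question's data so the example is self-contained
data1 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    val1 = c("blue", "green", "red", "black", "brown", "white"))
data2 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    val2 = c("cat", "dog", "fish", "hat", "coat", "car"))

# One expand.grid() result per id; the list is named by id ("1", "2")
per_id <- Map(function(v1, v2) expand.grid(val1 = v1, val2 = v2,
                                           stringsAsFactors = FALSE),
              split(data1$val1, data1$id),
              split(data2$val2, data2$id))

# Attach the id (taken from the list names, so it is character here)
# to each piece and stack the pieces into one data frame
combos <- do.call(rbind, lapply(names(per_id), function(k) {
  cbind(id = k, per_id[[k]])
}))

nrow(combos)  # 9 combinations per id, 18 in total
```

Note that expand.grid varies its first argument fastest, so the row order differs from the merge output even though the set of combinations is the same.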
Or in data.table:
library(data.table)
setDT(data1)[data2, on = .(id), allow.cartesian = TRUE]
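To make the role of allow.cartesian a bit more concrete, here is a small sketch of the same join on freshly built tables. The names dt1, dt2, and res are illustrative, and it assumes the data.table package is installed.

```r
library(data.table)

# Same data as in the question, built directly as data.tables
dt1 <- data.table(id = c(1, 1, 1, 2, 2, 2),
                  val1 = c("blue", "green", "red", "black", "brown", "white"))
dt2 <- data.table(id = c(1, 1, 1, 2, 2, 2),
                  val2 = c("cat", "dog", "fish", "hat", "coat", "car"))

# dt1[dt2, on = .(id)] looks up each row of dt2 in dt1 by id.
# Every dt2 row matches three dt1 rows, so the result has 18 rows,
# more than nrow(dt1) + nrow(dt2) = 12; allow.cartesian = TRUE
# tells data.table that this blow-up is intended.
res <- dt1[dt2, on = .(id), allow.cartesian = TRUE]

dim(res)
```

Without allow.cartesian = TRUE, data.table stops with an error for a join of this shape, as a guard against accidentally huge results.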