I have a data table init that looks like this:
> init
---
| id|
---
| a|
| b|
| c|
---
I want to obtain all pairs of values in the id column, so I need to cross join the init data table with itself. Additionally, I want to exclude identical pairs (a,a) and symmetric duplicates (in my case a,b == b,a, etc.).
Desired output is:
--- ---
|id1|id2|
--- ---
| a| b|
| a| c|
| b| c|
--- ---
How can this be done with the data.table approach?
A full cross join can be implemented as full_cj <- CJ(init$id, init$id):
--- ---
| V1| V2|
--- ---
| a| a|
| a| b|
| a| c|
| b| a|
| b| b|
| b| c|
| c| a|
| c| b|
| c| c|
--- ---
But how can I remove symmetrical and identical results from the output?
My real data is huge, so I'm looking for an efficient solution.
CodePudding user response:
You can use a non-equi join after converting id to a factor (using the init table from the question):
init[, id2 := as.factor(id)][init, on = .(id2 > id2), nomatch = 0, .(id, id2)]
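A variation on the same idea, as a minimal sketch rather than a definitive solution: join on an integer row index instead of a factor, so the non-equi condition compares plain integers. The helper column idx and the output names id1/id2 are illustrative assumptions, and the sketch assumes the values in id are unique. The last line builds the same pairs from the full cross join in the question, which is only practical for small inputs because it materializes all n^2 rows first.

```r
library(data.table)

# Toy input from the question.
init <- data.table(id = c("a", "b", "c"))

# Assumption: a helper integer column `idx` (illustrative name) that numbers the rows.
init[, idx := .I]

# Non-equi self-join: keep only pairs whose left row number is strictly
# smaller than the right row number. This drops (a,a) and keeps just one
# of (a,b) / (b,a). `nomatch = NULL` needs data.table >= 1.12.0;
# older versions use `nomatch = 0`.
pairs <- init[init, on = .(idx < idx), nomatch = NULL,
              .(id1 = x.id, id2 = i.id)]

# Cross-check for small inputs: full cross join, then filter.
check <- CJ(id1 = init$id, id2 = init$id)[id1 < id2]
```

For n unique ids both approaches should yield n * (n - 1) / 2 rows. The join avoids building the full n^2 cross product, which is why it may suit large inputs better, though the result itself still grows quadratically.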