Home > Blockchain >  R: cross join column on itself excluding symmetric results
R: cross join column on itself excluding symmetric results

Time:06-15

I have a data table init that looks like this:

> init
 --- 
| id|       
 --- 
|  a|
 --- 
|  b|
 --- 
|  c|
 --- 

I want to obtain all pairs for id column, so I need to cross join the init data table on itself. Additionally, I want to exclude equal and symmetric results (in my case a,b == b,a, etc.).

Desired output is:

 --- --- 
|id1|id2|
 --- --- 
|  a|  b|
|  a|  c|
|  b|  c|
 --- --- 

How can this be done with the data.table approach?

Full cross join can be implemented as full_cj <- CJ(init$id, init$id):

 --- --- 
| V1| V2|
 --- --- 
|  a|  a|
|  a|  b|
|  a|  c|
|  b|  a|
|  b|  b|
|  b|  c|
|  c|  a|
|  c|  b|
|  c|  c|
 --- --- 

But how can I remove symmetrical and identical results from the output?
My real data is huge, so I'm looking for an efficient solution.

CodePudding user response:

You can use a non equi join after converting to factor

dt[, id2:=as.factor(id)][dt, on=.(id2>id2),nomatch=0,.(id, id2)]
  • Related