I have a data (dt) as below (it is just a simplified example, so you can understand my request):
col 1 | col 2 | col 3 |
---|---|---|
A | cell 1 | 1,2 |
B | cell 2 | 3,5,6 |
A | cell 3 | 1,2 |
D | cell 4 | 9,7,8 |
What I want to know is in how many cells I can have the same combinations of col1 and col3 :
So the output should be like this :
col 1 | col 3 | col 2 |
---|---|---|
A | 1,2 | 2 cells |
B | 3,5,6 | 1 cell |
D | 9,7,8 | 1 cell |
I was trying this command using "data.table" in r :
data.table(table(dt$col1,dt$col3))
However it gives me this error :
error in table(dt$col1,dt$col3) : attempt to make a table with >= 2^31 elements
I am doing this in R, data.table
CodePudding user response:
You can just aggregate, i.e.
library(data.table)
setDT(d2)[, .(col2 = .N, col3 = toString(unique(col3))), col1][]
# col1 col2 col3
#1: A 2 1_2
#2: B 1 3_5_6
#3: D 1 9_7_8
CodePudding user response:
Try this
library(dplyr)
df %>% group_by(col1 , col3) %>% summarise("coll2" = paste(length(col3) , "cells"))
CodePudding user response:
We could use count
from dplyr
and its name
argument:
df %>%
count(col1, col3, name = "col2")
col1 col3 col2
1 A 1,2 2
2 B 3,5,6 1
3 D 9,7,8 1
In case you need cell
as string, we could add mutate(col2 = paste(col2, "cell"))
to the code.