Home > OS >  Count unique combinations in and summarize other columns in new one
Count unique combinations in and summarize other columns in new one

Time:02-26

I have a data table that has four columns: a,b,c,d that I'd like to count the unique combinations of a, b, and c but also record which values of d are in that combo in a new column.

Example of a data.table:

a b c d
1a 1b 1c n1
1a 1b 1c n2
2a 2b 2c n1
2a 2b 2c n2
2a 2b 2c n3
2a 2b 2c n4

I take unique values and count them:dt[,.N,by=.(a,b,c)] and I get the following result:

a   b   c   N
1a  1b  1c  2
2a  2b  2c  4

Now I would like to add a new column which includes the values in column d. But i'm not sure how to properly .SD to achieve this:

Example output:

a   b   c   N new_col
1a  1b  1c  2 n1,n2
2a  2b  2c  4 n1,n2,n3,n4

Or preferably have new_col a list of items.

CodePudding user response:

We could use return as a list

library(data.table)
dt[, .(N = .N, new_col = .(d)), by = .(a, b, c)]
        a      b      c     N     new_col
   <char> <char> <char> <int>      <list>
1:     1a     1b     1c     2       n1,n2
2:     2a     2b     2c     4 n1,n2,n3,n4
  • Related