Suppose I have a dataframe
related <- structure(list(IID1 = c("035-0028-0151526", "036-1002-0282844",
"950-0001-0081329", "951-1000-0084436", "954-0009-0337439", "954-0011-0379455",
"955-0017-0354237", "6011039", "6011039", "6011039", "6011039",
"6205526", "A1858", "A2159"), IID2 = c("035-0028-0157134", "036-1002-0305628",
"950-0001-0100509", "951-1000-0084537", "954-0009-0337641", "954-0011-0379657",
"955-0017-0363137", "6205526", "6215777", "A1864", "A2267", "A1864",
"A1864", "A2267")), row.names = c(NA, -14L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x38b6d80>)
I would like to find the counts for all elements in IID1 and IID2.
I tried this code below, but it only generates counts for unique IIDs and not all values (including duplicates).
as.vector(aggregate(n ~ IID1, data=transform(related,n=1), length)[,2])
as.vector(aggregate(n ~ IID2, data=transform(related,n=1), length)[,2])
I want to be able to do something like this below. What should be my approach?
related$nIID1 <- as.vector(aggregate(n ~ IID1, data=transform(related,n=1), length)[,2])
related$nIID2 <- as.vector(aggregate(n ~ IID2, data=transform(related,n=1), length)[,2])
CodePudding user response:
If we need to create a column, use ave
instead of aggregate
as aggregate
returns the summarised output
related$nIID1 <- with(transform(related, n = 1), ave(n, IID1, FUN = length))
CodePudding user response:
And another attempt with data.table only:
df_count = rbind(related[, IID:=IID1], related[, IID:=IID2])[, "IID"]
df_count[, .N, by=IID]
Output:
IID N
1: 035-0028-0157134 2
2: 036-1002-0305628 2
3: 950-0001-0100509 2
4: 951-1000-0084537 2
5: 954-0009-0337641 2
6: 954-0011-0379657 2
7: 955-0017-0363137 2
8: 6205526 2
9: 6215777 2
10: A1864 6
11: A2267 4
Or if you want to count the combinations between IID1 and IID2:
related[, .N, by=.(IID1, IID2)]
IID1 IID2 N
1: 035-0028-0151526 035-0028-0157134 1
2: 036-1002-0282844 036-1002-0305628 1
3: 950-0001-0081329 950-0001-0100509 1
4: 951-1000-0084436 951-1000-0084537 1
5: 954-0009-0337439 954-0009-0337641 1
6: 954-0011-0379455 954-0011-0379657 1
7: 955-0017-0354237 955-0017-0363137 1
8: 6011039 6205526 1
9: 6011039 6215777 1
10: 6011039 A1864 1
11: 6011039 A2267 1
12: 6205526 A1864 1
13: A1858 A1864 1
14: A2159 A2267 1