Count and show elements that are shared in at least 3 or 4 lists in R

I have 4 lists with different elements. How can I extract and count the elements that are shared by at least 2 of the lists?

For example:

a <- c(1, 2, 3, 4, 5, 6, 7)
b <- c(1, 4, 5, 7, 8)
c <- c(2, 5, 9, 10)
d <- c(11, 12, 13, 14)

The answer should be: 4 counts for elements 1,2,4 and 7.

CodePudding user response：

I prefer (and recommend) working with a list of vectors:

L <- list(a,b,c,d)

This will get your "2 or more" elements:

uniq <- sort(unique(unlist(L)))
uniq
#  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14
uniq[rowSums(sapply(L, `%in%`, x = uniq)) > 1]
# [1] 1 2 4 5 7

And I think you forgot the 5 :-)
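
To see why the `rowSums` line works, it may help to unpack it. The following is a minimal sketch on the question's data; the names `membership` and `n_lists` are only illustrative and not part of the answer above.

```r
a <- c(1, 2, 3, 4, 5, 6, 7)
b <- c(1, 4, 5, 7, 8)
c <- c(2, 5, 9, 10)
d <- c(11, 12, 13, 14)
L <- list(a, b, c, d)

uniq <- sort(unique(unlist(L)))

# One column per vector; a cell is TRUE when the row's value occurs in that vector.
membership <- sapply(L, function(v) uniq %in% v)

# Number of vectors that contain each value, labelled by the value itself.
n_lists <- setNames(rowSums(membership), uniq)

# Keep the values found in two or more vectors, then count them.
shared <- uniq[n_lists >= 2]
shared
length(shared)
```

Because `%in%` only tests membership, a value repeated inside a single vector is still counted once for that vector.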

CodePudding user response：

First, I keep only the unique values of each vector (in case a value is duplicated within the same vector), then put everything into one vector. Then, I return the values that occur more than once.

x <- unlist(lapply(list(a, b, c, d), unique))

x
# [1]  1  2  3  4  5  6  7  1  4  5  7  8  2  5  9 10 11 12 13 14

sort(unique(x[duplicated(x)]))
# [1] 1 2 4 5 7

However, if you want the number of lists that the value shows up in, then we can use table.

table(x)[table(x) >= 2]

x
1 2 4 5 7 
2 2 2 3 2

Or you can return just the values, taken from the names of the table:

as.numeric(names(table(x))[table(x) >= 2])
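
Since the title asks about "at least 3 or 4 lists", the same `table` idea can be wrapped in a function with an adjustable threshold. This is only a sketch: the function name `shared_in_at_least` and its argument `k` are hypothetical, chosen for illustration.

```r
a <- c(1, 2, 3, 4, 5, 6, 7)
b <- c(1, 4, 5, 7, 8)
c <- c(2, 5, 9, 10)
d <- c(11, 12, 13, 14)

# Hypothetical helper: values that occur in at least k of the given vectors,
# returned as a table of "number of vectors containing the value".
shared_in_at_least <- function(vectors, k = 2) {
  # Deduplicate within each vector so repeats inside one vector count once.
  x <- unlist(lapply(vectors, unique))
  counts <- table(x)
  counts[counts >= k]
}

res2 <- shared_in_at_least(list(a, b, c, d), k = 2)
res3 <- shared_in_at_least(list(a, b, c, d), k = 3)

as.numeric(names(res2))  # values shared by 2 or more vectors
length(res2)             # how many such values there are
as.numeric(names(res3))  # values shared by 3 or more vectors
```

With the example data, only 5 appears in three of the vectors, so `k = 3` returns a single value and `k = 4` returns an empty table.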

CodePudding user response：

Another way is to use a named list, stack them together and obtain the counts:

my_list <- list(a = a, b = b, c = c, d = d)
subset(aggregate(ind~values, unique(stack(my_list)), length), ind>=2, values, drop = TRUE)
[1] 1 2 4 5 7

or even:

unique(subset(unique(stack(my_list)), duplicated(values), values, drop=TRUE))
[1] 1 4 5 7 2
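
If you also want to know which of the lists share each value, the stacked data frame already carries that information in its `ind` column. A minimal sketch, assuming the same named list as above (the name `where` is just illustrative):

```r
my_list <- list(
  a = c(1, 2, 3, 4, 5, 6, 7),
  b = c(1, 4, 5, 7, 8),
  c = c(2, 5, 9, 10),
  d = c(11, 12, 13, 14)
)

# One row per (value, list name) pair, with within-list repeats removed.
s <- unique(stack(my_list))

# For each value, the names of the lists that contain it.
where <- split(as.character(s$ind), s$values)

# Keep only the values found in two or more lists.
shared_where <- where[lengths(where) >= 2]
shared_where
```

Here `lengths(shared_where)` gives the same per-value counts as the `aggregate` call, while the list elements name the vectors involved.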