I have a single list that looks like this:
main.list <- c("dog", "cat", "bird", "snake")
I have a bunch of equal-sized comparator lists that share some, but not all, of the elements in main.list:
comparator.list1 <- c("dog", "cat", "bird", "crescent")
comparator.list2 <- c("dog", "lizard", "cup", "plate")
comparator.list3 <- c("lizard", "bird", "squirrel", "snake")
I want to make a table that gives, for each comparator list, the proportion of elements it shares with the main list. So in this case:
List Number.ofshared.elemts
comparator.list1 0.75
comparator.list2 0.25
comparator.list3 0.5
How can I do that?
CodePudding user response:
Get the 'comparator' objects in a list, use %in% to return a logical vector by comparing the elements with 'main.list', convert it to a proportion with mean, and stack the key/value pairs into a data.frame with two columns:
out <- stack(lapply(mget(ls(pattern = 'comparator')),
function(x) mean(main.list %in% x)))[2:1]
names(out) <- c("List", "Number.of.shared.elements")
-output
> out
List Number.of.shared.elements
1 comparator.list1 0.75
2 comparator.list2 0.25
3 comparator.list3 0.50
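To make the steps behind that one-liner easier to follow, here is a minimal sketch that walks through them one at a time. The named list `comparators` is an assumption introduced for illustration; it is built by hand instead of being collected with mget(ls(...)), so the example does not depend on what else is in the workspace.

```r
main.list <- c("dog", "cat", "bird", "snake")

# Hypothetical named list, built by hand rather than with mget(ls(...))
comparators <- list(
  comparator.list1 = c("dog", "cat", "bird", "crescent"),
  comparator.list2 = c("dog", "lizard", "cup", "plate"),
  comparator.list3 = c("lizard", "bird", "squirrel", "snake")
)

# Step 1: %in% gives one TRUE/FALSE per element of main.list
hits <- main.list %in% comparators$comparator.list1
hits
# TRUE TRUE TRUE FALSE

# Step 2: the mean of a logical vector is the proportion of TRUE values
mean(hits)
# 0.75

# Step 3: repeat the same computation for every comparator
props <- vapply(comparators, function(x) mean(main.list %in% x), numeric(1))

# Step 4: arrange the named vector as a two-column data.frame
out <- data.frame(List = names(props),
                  Number.of.shared.elements = unname(props))
out
```

vapply is used here instead of lapply plus stack only because it returns a plain named numeric vector; the resulting proportions should match the ones shown above.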
We may also use intersect with length and divide by the length of the vector:
out <- stack(lapply(mget(ls(pattern = 'comparator')),
function(x) length(intersect(main.list, x))/length(x)))[2:1]
names(out) <- c("List", "Number.of.shared.elements")
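Both base R versions agree on the data in the question because every vector there has four distinct elements. The sketch below shows where they could diverge; the vector `dup` is made up for illustration (it is not from the question) and contains a repeated value.

```r
main.list <- c("dog", "cat", "bird", "snake")
dup <- c("dog", "dog", "cat", "plate")  # hypothetical comparator with a repeat

# Proportion of main.list elements that occur in dup: 2 of 4
p_main <- mean(main.list %in% dup)
p_main
# 0.5

# Proportion of dup elements that occur in main.list, repeats counted: 3 of 4
p_dup <- mean(dup %in% main.list)
p_dup
# 0.75

# intersect() drops duplicates before counting: 2 shared values over length 4
p_int <- length(intersect(main.list, dup)) / length(dup)
p_int
# 0.5
```

Which of these is the "right" proportion depends on how repeated values should be treated, so it may be worth checking for duplicates before choosing one.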
Or using the tidyverse:
library(dplyr)
library(tibble)
library(tidyr)
mget(ls(pattern = 'comparator')) %>%
enframe(name = 'List') %>%
unnest(value) %>%
group_by(List) %>%
summarise(Number.of.shared.elements = length(intersect(value,
main.list))/n(), .groups = 'drop')
# A tibble: 3 × 2
List Number.of.shared.elements
<chr> <dbl>
1 comparator.list1 0.75
2 comparator.list2 0.25
3 comparator.list3 0.5
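One thing to watch with ls(pattern = 'comparator') is that the pattern is a regular expression matched anywhere in the object name, so unrelated objects could be picked up too. The sketch below illustrates this under assumed names: the environment `env` and the object `comparator.notes` are hypothetical and exist only to show the difference between a loose and an anchored pattern.

```r
main.list <- c("dog", "cat", "bird", "snake")

# Hypothetical workspace, kept in its own environment for the example
env <- new.env()
env$comparator.list1 <- c("dog", "cat", "bird", "crescent")
env$comparator.list2 <- c("dog", "lizard", "cup", "plate")
env$comparator.notes <- "free-text note, not a comparator vector"

# An unanchored pattern also matches comparator.notes
loose <- ls(envir = env, pattern = "comparator")

# An anchored pattern keeps only names of the form comparator.list<number>
strict <- ls(envir = env, pattern = "^comparator\\.list[0-9]+$")

# Compute the proportions only for the objects that matched strictly
props <- vapply(mget(strict, envir = env),
                function(x) mean(main.list %in% x), numeric(1))
props
```

In an interactive session the envir argument can be dropped so that ls and mget look in the current workspace, as in the code above.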
CodePudding user response:
A possible alternative:
library(tidyverse)
# Data --------------------------------------------------------------------
main.list <- c("dog", "cat", "bird", "snake")
comparator.list1 <- c("dog", "cat", "bird", "crescent")
comparator.list2 <- c("dog", "lizard", "cup", "plate")
comparator.list3 <- c("lizard", "bird", "squirrel", "snake")
# Code --------------------------------------------------------------------
nms <- str_subset(ls(), '^comparator\\.')
nms %>%
syms() %>%
map2_dfr(nms, ~ tibble(List = .y, Number.ofshared.elemts = mean(eval(.) %in% main.list)))
#> # A tibble: 3 × 2
#> List Number.ofshared.elemts
#> <chr> <dbl>
#> 1 comparator.list1 0.75
#> 2 comparator.list2 0.25
#> 3 comparator.list3 0.5
Created on 2021-11-21 by the reprex package (v2.0.1)