I have multiple dataframes that look like this:
>df1
NAME
Josh
Sarah
Sammy
Jake
>df2
NAME
Josh
Sarah
Sammy
Mark
>df3
NAME
Josh
Michael
Mike
Adam
>df4
NAME
Josh
Michael
Mike
Adam
I want to create a new dataframe that contains the number of names each pair of dfs has in common (the size of their intersection), like this:
>df.final
df1 df2 df3 df4
df1 4 3 1 1
df2 3 4 1 1
df3 1 1 4 4
df4 1 1 4 4
How can I achieve this? Essentially, I'm looking to automate the intersect() and length() functions without manually typing them out for every pair.
#create the data
df1 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
CodePudding user response:
#create the data
df1 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
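# names of the data frames to compare
l <- c("df1", "df2", "df3", "df4")
names(l) <- l
# outer() passes the expanded lists to its function in a single call,
# so mapply() walks through them pair by pair
result <- outer(mget(l), mget(l), function(x, y)
  mapply(function(x, y) length(intersect(x$NAME, y$NAME)), x, y))
result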
#> df1 df2 df3 df4
#> df1 4 3 1 1
#> df2 3 4 1 1
#> df3 1 1 4 4
#> df4 1 1 4 4
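The inner mapply() is needed because outer() does not call its function once per pair. It calls it a single time with both arguments expanded to length(X) * length(Y). A minimal sketch with plain integers (seen_length and sums are illustrative names, not part of the original answer) makes this visible:

```r
# outer() calls FUN once, with both arguments expanded to the full grid
seen_length <- NULL
sums <- outer(1:2, 1:3, function(x, y) {
  seen_length <<- length(x)  # record how many elements FUN received
  x + y                      # `+` is vectorized, so this works directly
})
seen_length  # 6, i.e. 2 * 3
dim(sums)    # 2 3
```

Since intersect() and length() work on one pair at a time, they have to be wrapped so that they accept these expanded lists element by element.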
EDIT
Vectorize() also works:
result <- outer(mget(l), mget(l), Vectorize(
  function(x, y) length(intersect(x$NAME, y$NAME))))
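If outer() feels too indirect, the same counts can probably be obtained with two nested sapply() calls over a named list. This is only a sketch of an alternative, not the approach from the answer above; the names dfs and result2 are made up for illustration, and it assumes every data frame has a NAME column:

```r
# create the data
df1 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))

# collect the data frames in a named list (dfs is a hypothetical name)
dfs <- list(df1 = df1, df2 = df2, df3 = df3, df4 = df4)

# for every pair (x, y), count the names they share;
# the list names become the row and column names of the matrix
result2 <- sapply(dfs, function(x)
  sapply(dfs, function(y) length(intersect(x$NAME, y$NAME))))
result2
#>     df1 df2 df3 df4
#> df1   4   3   1   1
#> df2   3   4   1   1
#> df3   1   1   4   4
#> df4   1   1   4   4
```

This computes every pair twice, which is fine for a handful of small data frames. If a dataframe rather than a matrix is required, as.data.frame(result2) converts it.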