Consider the following phylogenetic tree and matrix of presence-absence data for several taxa in 3 sites:
set.seed(020123)
tree_ <- phytools::pbtree(b = 1, n = 7)
matrix_ <- structure(
c(1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0,
0, 0, 0, 1, 0, 0, 1, 0, 0, 1),
dim = c(3L, 7L),
dimnames = list(c("site_1", "site_2", "site_3"),
c("t1", "t2", "t3", "t4", "t5", "t6", "t7"))
)
I'd like to compute the pairwise phylogenetic distance for all sites and store the results in an object of class dist
.
I can compute the phylogenetic distance between two sites using the following functions:
get_community_taxa <- function(x) {
list(
colnames(x)[x[1, ] == 1L],
colnames(x)[x[2, ] == 1L]
)
}
phylo_dist <- function(x, tree) {
distance_mtx <- ape::cophenetic.phylo(tree)
communities <- get_community_taxa(x)
phylo_dist <- distance_mtx[
communities[[1]],
communities[[2]]
]
if (length(phylo_dist) == 1L) {
return(phylo_dist)
}
mean(c(
apply(phylo_dist, 1, function(x) min(x, na.rm = TRUE)),
apply(phylo_dist, 2, function(x) min(x, na.rm = TRUE))
))
}
I've tried using the function above in combination with combn
which gives me a vector of distances for each pair of sites:
apply(
combn(nrow(v), 2),
MARGIN = 2,
function(x) phylo_dist(matrix_[x, ], tree_)
)
#> [1] 1.276455 2.191359 3.045831
But I'm not sure how to output these values in a well structured matrix (see below).
Here's the desired output (object of class dist
):
site_1 site_2
site_2 1.276455
site_3 2.191359 3.045831
CodePudding user response:
Maybe we could use outer
here and convert to dist
out <- outer(seq_len(nrow(matrix_)), seq_len(nrow(matrix_)),
Vectorize(function(i, j) phylo_dist(matrix_[c(i, j),], tree_)))
dimnames(out) <- list(row.names(matrix_), row.names(matrix_))
-output
> as.dist(out)
site_1 site_2
site_2 1.276455
site_3 2.191359 3.045831