Home > other >  Co-occurrence calculations from a presence-absence matrix
Co-occurrence calculations from a presence-absence matrix

Time:01-22

I have this great presence-absence dataset in which I need to calculate a C score (CS) for each species pair (BABO, BW, RC, SKS, MANG) to measure species co-occurrences.

C score equation

ki and kj denote the numbers of occurrences of species i and j and K is the number of co-occurrences of both species.

I have looked at articles like Find the pair of most correlated variables and Turning co-occurrence counts into co-occurrence probabilities with cascalog but was unable to determine the most efficient way to go about it in R. I have tried creating a function but was unsuccessful.

My data:

data <- structure(list(group_id = c("2008-2-11.C3_900", "2008-2-11.C3_960", 
"2008-2-11.C3_1200", "2008-2-11.C3_1230", "2008-2-11.C3_1460", 
"2008-2-11.C3_1490", "2008-2-22.Mwani_0", "2008-2-22.Mwani_110", 
"2008-2-22.Mwani_600", "2008-2-22.Mwani_1650", "2008-2-20.Sanje_150", 
"2008-2-20.Sanje_410", "2008-2-20.Sanje_3000", "2008-5-9.C3_900", 
"2008-5-13.Mwani_750", "2008-5-13.Mwani_800", "2008-5-13.Mwani_900", 
"2008-5-13.Mwani_1080", "2008-5-13.Mwani_1800", "2008-5-13.Mwani_2200", 
"2008-5-13.Mwani_2900"), BABO = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), BW = c(1, 1, 1, 1, 0, 0, 
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1), RC = c(0, 0, 1, 
1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0), SKS = c(0, 
1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0), 
    MANG = c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 1)), row.names = c(NA, -21L), class = c("tbl_df", 
"tbl", "data.frame"))
group_id               BABO BW RC SKS MANG
<chr>                  <dbl><dbl><dbl><dbl>
2008-2-11.C3_900        0   1   0   0   0
2008-2-11.C3_960        0   1   0   1   0
2008-2-11.C3_1200       0   1   1   1   0
2008-2-11.C3_1230       0   1   1   0   0
2008-2-11.C3_1460       0   0   1   0   0
2008-2-11.C3_1490       0   0   0   1   0
2008-2-22.Mwani_0       0   0   0   1   1
2008-2-22.Mwani_110     0   0   1   0   1
2008-2-22.Mwani_600     0   1   0   0   0
2008-2-22.Mwani_1650    0   0   0   1   0
2008-2-20.Sanje_150     0   1   0   1   0
2008-2-20.Sanje_410     0   0   1   0   0
2008-2-20.Sanje_3000    0   0   1   0   0
2008-5-9.C3_900         0   0   0   1   0
2008-5-13.Mwani_750     0   0   0   1   0
2008-5-13.Mwani_800     1   0   1   0   0
2008-5-13.Mwani_900     0   0   1   0   0
2008-5-13.Mwani_1080    0   1   1   0   0
2008-5-13.Mwani_1800    0   1   0   0   0
2008-5-13.Mwani_2200    0   1   0   0   0
2008-5-13.Mwani_2900    0   1   0   0   1

CodePudding user response:

First define a function to compute CS for a given pair of species; then use combn() to generate all possible pairs; then pass each pair to the CS() function using purrr::map2_dbl().

library(dplyr)
library(purrr)

CS <- function(i, j, data) {
  K <- sum(data[[i]] == 1 & data[[j]] == 1)
  ki <- sum(data[[i]])
  kj <- sum(data[[j]])
  ((ki - K) * (kj - K)) / (ki * kj)
}

names(sp_data)[-1] %>%
  combn(2) %>%
  t() %>%
  as_tibble() %>%
  set_names(c("i", "j")) %>%
  mutate(CS = map2_dbl(i, j, CS, data = sp_data))
# A tibble: 10 × 3
   i     j        CS
   <chr> <chr> <dbl>
 1 BABO  BW    1    
 2 BABO  RC    0    
 3 BABO  SKS   1    
 4 BABO  MANG  1    
 5 BW    RC    0.467
 6 BW    SKS   0.438
 7 BW    MANG  0.6  
 8 RC    SKS   0.778
 9 RC    MANG  0.593
10 SKS   MANG  0.583

(Side note: using the equation you provided, species with little co-occurrence end up with high CS scores, and vice versa. If this isn’t what you expected, perhaps your equation should be subtracted from 1?)

CodePudding user response:

In base R, we may use crossprod

t1 <- crossprod(as.matrix(data[-1]))
v1 <- diag(t1)
c1 <- v1[col(t1)]
r1 <- v1[row(t1)]
((c1 - t1) * (r1 - t1))/(c1 * r1)

-output

     BABO        BW        RC       SKS      MANG
BABO    0 1.0000000 0.0000000 1.0000000 1.0000000
BW      1 0.0000000 0.4666667 0.4375000 0.6000000
RC      0 0.4666667 0.0000000 0.7777778 0.5925926
SKS     1 0.4375000 0.7777778 0.0000000 0.5833333
MANG    1 0.6000000 0.5925926 0.5833333 0.0000000
  • Related