Compute Cohen's d for several vectors at once

Imagine I have a data frame that looks like this:

Concentration  Value
Low            0.21
Medium         0.85    
Low            0.10
Low            0.36
High           2.21
Medium         0.50
High           1.85

I can easily transform this into a list, where I will have:

$High
[1] -0.04  0.00 -0.10 -0.02 -0.01 -0.01

$Medium
[1]  0.01 -0.01 -0.05  0.03  0.02

$Low
[1]  0.010  0.040 -0.020 -0.007  0.010
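One way to do the data-frame-to-list conversion mentioned above is base R's split(). This is a minimal sketch built from the seven rows of the example data frame, so its values differ from the list printed above; the names df and dat are assumptions chosen for illustration.

```r
# Example data frame, rebuilt from the rows shown in the question
df <- data.frame(
  Concentration = c("Low", "Medium", "Low", "Low", "High", "Medium", "High"),
  Value = c(0.21, 0.85, 0.10, 0.36, 2.21, 0.50, 1.85)
)

# split() groups Value by Concentration and returns a named list of vectors
dat <- split(df$Value, df$Concentration)

# Each concentration is now reachable by name, e.g. dat$Low or dat[["High"]]
print(dat)
```

Each element keeps the values in their original row order, so dat$Low holds the three Low measurements.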

The effsize package provides a function, cohen.d(), that calculates the effect size between two groups. For example, cohen.d(dat$Low, dat$Medium) gives the effect size between these two vectors.

In this case, however, I would like to use a function from the apply family to compute Cohen's d between one factor level (one of the vectors in the list) and the rest of the levels (all the other vectors in the list).

Edit:

> dput(dat2)
list(High = c(-0.04, 0, -0.1, -0.02, -0.00999999999999998, -0.00999999999999998
), Medium = c(0.01, -0.00999999999999998, -0.05, 0.03, 0.02), 
    Low = c(0.01, 0.04, -0.02, -0.00700000000000001, 0.01))

CodePudding user response：

library(effsize)  # provides cohen.d()

# get all combinations of list-item names
combos <- data.frame(t(combn(unique(names(dat2)), m = 2)))

# perform a test on each combination
tests <- Map(\(x, y) cohen.d(dat2[[x]], dat2[[y]]), combos$X1, combos$X2)

# rename the test list to reflect the groups tested
names(tests) <- paste(combos$X1, combos$X2, sep = '_')

tests

$High_Medium

Cohen's d

d estimate: -0.8660254 (large)
95 percent confidence interval:
     lower      upper 
-2.2980934  0.5660426 


$High_Low

Cohen's d

d estimate: -1.168403 (large)
95 percent confidence interval:
    lower     upper 
-2.649588  0.312783 


$Medium_Low

Cohen's d

d estimate: -0.2403738 (small)
95 percent confidence interval:
    lower     upper 
-1.704076  1.223329

CodePudding user response：

To get Cohen's d between each member of the list and every other member:

# for each vector in dat2, compare it against each of the remaining vectors
lapply(seq_along(dat2), \(x) lapply(dat2[-x], \(y) cohen.d(dat2[[x]], y)))
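If only the point estimates are needed, the one-versus-rest comparison can be collected into a matrix. The sketch below does this in base R. The helper pooled_d() is hypothetical: it is a hand-written stand-in for cohen.d(), assuming the pooled-standard-deviation definition of Cohen's d with no Hedges correction, which reproduces the estimates printed in the first answer (for example, about -0.866 for High versus Medium).

```r
# dat2 as given by dput() in the question (values rounded as printed)
dat2 <- list(
  High   = c(-0.04, 0, -0.1, -0.02, -0.01, -0.01),
  Medium = c(0.01, -0.01, -0.05, 0.03, 0.02),
  Low    = c(0.01, 0.04, -0.02, -0.007, 0.01)
)

# Hypothetical helper (not part of effsize): Cohen's d with a pooled SD
pooled_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  sd_pooled <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sd_pooled
}

# Outer sapply() picks the first group (columns),
# inner sapply() picks the second group (rows)
d_matrix <- sapply(dat2, function(x) sapply(dat2, function(y) pooled_d(x, y)))

# A group compared with itself is not informative
diag(d_matrix) <- NA

print(round(d_matrix, 3))
```

Every pair appears twice with opposite signs, once in each direction. Swapping pooled_d() for a function that returns cohen.d(x, y)$estimate should give the same layout, with confidence intervals still available from the full cohen.d() result.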

CodePudding user response：

You could use combn() with lapply() like this:

do.call(
  rbind,
  lapply(combn(unique(df$Concentration), 2, simplify = FALSE), function(x) {
    # Cohen's d between the Value entries of the two concentrations in this pair
    d <- cohen.d(df[df$Concentration == x[1], "Value"], df[df$Concentration == x[2], "Value"])
    ci <- as.numeric(d$conf.int)
    data.frame(g1 = x[1], g2 = x[2], cohen_d = d$estimate, lower = ci[1], upper = ci[2])
  })
)

Output:

      g1     g2   cohen_d      lower     upper
1    Low Medium -2.533928  -6.399536 1.3316800
2    Low   High -9.952077 -20.380461 0.4763081
3 Medium   High -5.397378 -14.667036 3.8722790