Imagine I have a data frame that looks like this:
Concentration Value
Low 0.21
Medium 0.85
Low 0.10
Low 0.36
High 2.21
Medium 0.50
High 1.85
I can easily transform this into a list, where I will have:
$High
[1] -0.04 0.00 -0.10 -0.02 -0.01 -0.01
$Medium
[1] 0.01 -0.01 -0.05 0.03 0.02
$Low
[1] 0.010 0.040 -0.020 -0.007 0.010
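One way to do that transformation, as a minimal sketch: base R's split() groups a vector by a grouping column. The name df and the data below are assumptions copied from the small example table above; the values printed in the list appear to come from other data, so the result here will differ from that printout.

```r
# Example data frame, rebuilt from the table shown above (the name `df` is assumed)
df <- data.frame(
  Concentration = c("Low", "Medium", "Low", "Low", "High", "Medium", "High"),
  Value         = c(0.21, 0.85, 0.10, 0.36, 2.21, 0.50, 1.85)
)

# split() returns a named list with one numeric vector per concentration level
dat <- split(df$Value, df$Concentration)
dat
```

The list elements are ordered by the sorted level names (High, Low, Medium), not by order of appearance in the data frame.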
The effsize package provides a function, cohen.d(), that calculates the effect size between two groups. For example, cohen.d(dat$Low, dat$Medium) gives the effect size between these two vectors.
In this case, however, I would like to use a function from the apply family to compute Cohen's d between one factor (one of the vectors in the list) and the rest of the factors (all the other vectors in the list).
Edit:
> dput(dat2)
list(High = c(-0.04, 0, -0.1, -0.02, -0.00999999999999998, -0.00999999999999998
), Medium = c(0.01, -0.00999999999999998, -0.05, 0.03, 0.02),
Low = c(0.01, 0.04, -0.02, -0.00700000000000001, 0.01))
CodePudding user response:
# cohen.d() comes from the effsize package
library(effsize)
# get all combinations of list-item names
combos <- data.frame(t(combn(unique(names(dat2)), m = 2)))
# perform a test on each combination
tests <- Map(\(x, y) cohen.d(dat2[[x]], dat2[[y]]), combos$X1, combos$X2)
# rename the test list to reflect the groups tested
names(tests) <- paste(combos$X1, combos$X2, sep = '_')
tests
$High_Medium
Cohen's d
d estimate: -0.8660254 (large)
95 percent confidence interval:
lower upper
-2.2980934 0.5660426
$High_Low
Cohen's d
d estimate: -1.168403 (large)
95 percent confidence interval:
lower upper
-2.649588 0.312783
$Medium_Low
Cohen's d
d estimate: -0.2403738 (small)
95 percent confidence interval:
lower upper
-1.704076 1.223329
CodePudding user response:
To get Cohen's d between each member of the list and every other member (using dat2, the list from the question):
lapply(1:3, \(x) lapply(1:2, \(y) cohen.d(dat2[[x]], dat2[-x][[y]])))
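If only the point estimates are needed, the one-versus-each-other comparison can also be sketched in base R as a matrix. This is an illustration under assumptions, not the effsize implementation: pooled_d is a hypothetical helper that uses the pooled-standard-deviation definition of Cohen's d, which reproduces the estimates printed in the first answer for this data, but it returns no confidence intervals or magnitude labels.

```r
# dat2 as given by the dput() output in the question
dat2 <- list(
  High   = c(-0.04, 0, -0.1, -0.02, -0.01, -0.01),
  Medium = c(0.01, -0.01, -0.05, 0.03, 0.02),
  Low    = c(0.01, 0.04, -0.02, -0.007, 0.01)
)

# Hypothetical helper: Cohen's d with a pooled standard deviation
pooled_d <- function(a, b) {
  pooled_var <- ((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
    (length(a) + length(b) - 2)
  (mean(a) - mean(b)) / sqrt(pooled_var)
}

# Rows are the reference group, columns the group it is compared against
d_matrix <- sapply(dat2, function(b) sapply(dat2, function(a) pooled_d(a, b)))
round(d_matrix, 3)
```

Each row then holds one group's effect sizes against all the others; the diagonal is 0, and the matrix is antisymmetric because swapping the two groups flips the sign of d.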
CodePudding user response:
You could use combn() with lapply() like this:
do.call(
  rbind,
  lapply(combn(unique(df$Concentration), 2, simplify = FALSE), function(x) {
    # effect size between the two concentration groups in this pair
    diff <- cohen.d(df[df$Concentration == x[1], "Value"], df[df$Concentration == x[2], "Value"])
    ci <- as.numeric(diff$conf.int)
    # one row per pair: group names, estimate, and confidence bounds
    data.frame("g1" = x[1], "g2" = x[2], "cohen_d" = diff$estimate, "lower" = ci[1], "upper" = ci[2])
  })
)
Output:
      g1     g2   cohen_d      lower     upper
1    Low Medium -2.533928  -6.399536 1.3316800
2    Low   High -9.952077 -20.380461 0.4763081
3 Medium   High -5.397378 -14.667036 3.8722790