I have a dataframe as follows:
# A tibble: 6 x 4
Placebo High Medium Low
<dbl> <dbl> <dbl> <dbl>
1 0.0400 -0.04 0.0100 0.0100
2 0.04 0 -0.0100 0.04
3 0.0200 -0.1 -0.05 -0.0200
4 0.03 -0.0200 0.03 -0.00700
5 -0.00500 -0.0100 0.0200 0.0100
6 0.0300 -0.0100 NA NA
You could get the cohensD for two of the columns using the cohen.d() function from the effsize package:
df <- data.frame(Placebo = c(0.0400, 0.04, 0.0200, 0.03, -0.00500, 0.0300),
Low = c(-0.04, 0, -0.1, -0.0200, -0.0100, -0.0100),
Medium = c(0.0100, -0.0100, -0.05, 0.03, 0.0200, NA ),
High = c(0.0100, 0.04, -0.0200, -0.00700, 0.0100, NA))
library(effsize)
cohen.d(as.vector(na.omit(df$Placebo)), as.vector(na.omit(df$High)))
Interestingly enough, I'm getting the following error with this code:
Error in data[, group] : incorrect number of dimensions
However, I would like to create a function that allows you to obtain all the cohensd between one of the columns and the rest of them.
In order to get the cohensD of all columns against the Placebo we would use something like:
sapply(df, function(i) cohen.d(pull(df, as.vector(na.omit(!!Placebo))), as.vector(na.omit(i))))
But I'm not sure this would work anyway.
Edit: I don't want to erase the full row, as cohens d can be computed for different length vectors. Ideally, I would like to get the stat with the NA removed for each column independetly
CodePudding user response:
It may be better to remove the NA
on each of the columns separately by creating a logical index along with 'Placebo'
library(dplyr)
library(effsize)
df %>%
summarise(across(Low:High, ~ list({
i1 <- complete.cases(Placebo)& complete.cases(.x)
cohen.d(Placebo[i1], .x[i1])})))
Or if we want to use lapply/sapply
, loop over the columns other than Placebo
lapply(df[-1], function(x) {
x1 <- na.omit(cbind(df$Placebo, x))
cohen.d(x1[,1], x1[,2])
})
-output
$Low
Cohen's d
d estimate: 1.947312 (large)
95 percent confidence interval:
lower upper
0.3854929 3.5091319
$Medium
Cohen's d
d estimate: 0.9622504 (large)
95 percent confidence interval:
lower upper
-0.5782851 2.5027860
$High
Cohen's d
d estimate: 0.8884639 (large)
95 percent confidence interval:
lower upper
-0.6402419 2.4171697