I have a dataset that contains numerous items that were measured using a pre- and posttest instrument. Here is an example dataset:
Question Score Test
QA 5 Pre
QA 2 Pre
QA 3 Post
QA 7 Post
QA 3 Post
QB 2 Pre
QB 1 Pre
QB 4 Pre
QC 7 Pre
QC 3 Pre
QC 2 Post
QC 3 Post
QC 6 Post
I want to compute Cohen's d for each question and store the result as an object in my environment, for example:
Effectsize1<-effectsize::cohens_d(df$Score[df$Question== "QA"]~ df$Test[df$Question== "QA"], data = df)
Instead of writing out this code for each item, I tried to do it with a loop:
questions<-as.data.frame(unique(df$Questions))
er<-NULL
i for (1:rnow(questions)){
er$i<-effectsize::cohens_d(df$Score[df$Question== i] ~ df$Test[df$Question== i] data = df)
print(er$i)
}
I am not sure if I am close, or far off. Any help is much appreciated. Thanks so much!
CodePudding user response:
If d is your data:
library(data.table)
setDT(d)[, effectsize::cohens_d(Score~Test), Question]
Output:
Question Cohens_d CI CI_low CI_high
<char> <num> <num> <num> <num>
1: QA 0.3706247 0.95 -1.4690832 2.153623
2: QB 1.9611614 0.95 -0.7822302 4.510194
3: QC -0.5656854 0.95 -2.3643683 1.316452
Input:
d = data.table::fread("Question Score Test
QA 5 Pre
QA 2 Pre
QA 3 Post
QA 7 Post
QA 3 Post
QB 2 Pre
QB 1 Pre
QB 4 Post
QB 9 Post
QC 7 Pre
QC 3 Pre
QC 2 Post
QC 3 Post
QC 6 Post")
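For reference, the Cohens_d values in the output are consistent with the usual pooled-SD definition, with Post minus Pre in the numerator (this ordering is inferred from the numbers and appears to follow from "Post" sorting before "Pre"; the symbols below are my own notation, not the package's):

```latex
d = \frac{\bar{x}_{\text{Post}} - \bar{x}_{\text{Pre}}}{s_p},
\qquad
s_p = \sqrt{\frac{(n_{\text{Post}} - 1)\, s_{\text{Post}}^2 + (n_{\text{Pre}} - 1)\, s_{\text{Pre}}^2}
                 {n_{\text{Post}} + n_{\text{Pre}} - 2}}

% Worked example for QA: Post = {3, 7, 3}, Pre = {5, 2}
\bar{x}_{\text{Post}} = 13/3 \approx 4.333, \quad
\bar{x}_{\text{Pre}} = 3.5, \quad
s_{\text{Post}}^2 = 16/3 \approx 5.333, \quad
s_{\text{Pre}}^2 = 4.5

s_p = \sqrt{\frac{2 \cdot 5.333 + 1 \cdot 4.5}{3}} \approx 2.249,
\qquad
d \approx \frac{0.833}{2.249} \approx 0.371
```

So a positive value here means the Post scores were higher on average than the Pre scores.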
CodePudding user response:
You don't need a loop; you can do it all with tidyverse functions:
library(dplyr)
library(tidyr)
dat <- tibble::tribble(
~Question, ~Score, ~Test,
"QA", 5, "Pre",
"QA", 2, "Pre",
"QA", 3, "Post",
"QA", 7, "Post",
"QA", 3, "Post",
"QB", 2, "Pre",
"QB", 1, "Pre",
"QB", 4, "Pre",
"QB", 3, "Post",
"QB", 5, "Post",
"QC", 7, "Pre",
"QC", 3, "Pre",
"QC", 2, "Post",
"QC", 3, "Post",
"QC", 6, "Post")
dat %>%
group_by(Question) %>%
summarise(d = effectsize::cohens_d(Score ~ Test)) %>%
unnest(d)
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> # A tibble: 3 × 5
#> Question Cohens_d CI CI_low CI_high
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 QA 0.371 0.95 -1.47 2.15
#> 2 QB 1.12 0.95 -0.933 3.03
#> 3 QC -0.566 0.95 -2.36 1.32
Created on 2022-07-14 by the reprex package (v2.0.1)
If you wanted to do the loop instead, you could do it this way:
questions<-data.frame(q = unique(dat$Question))
er<-vector(mode="list", length=nrow(questions))
names(er) <- questions$q
for(i in questions$q){
er[[i]]<-effectsize::cohens_d(dat$Score[dat$Question== i] ~ dat$Test[dat$Question== i])
}
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
er
#> $QA
#> Cohen's d | 95% CI
#> -------------------------
#> 0.37 | [-1.47, 2.15]
#>
#> - Estimated using pooled SD.
#> $QB
#> Cohen's d | 95% CI
#> -------------------------
#> 1.12 | [-0.93, 3.03]
#>
#> - Estimated using pooled SD.
#> $QC
#> Cohen's d | 95% CI
#> -------------------------
#> -0.57 | [-2.36, 1.32]
#>
#> - Estimated using pooled SD.
Here, the loop counter i stands in for the question names (i.e., it is a string and can be used like any other string in R). We initialize the er object as a list with the right number of elements and then name the elements after the questions. Inside the loop, i takes the values "QA", "QB" and "QC" in turn.
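If you want a named list but would rather not manage the loop yourself, a base R sketch with split() and lapply() is one option. It rebuilds the same data as a plain data frame; the names by_question and sub are only illustrative, and passing data = sub to cohens_d() assumes the formula interface of effectsize:

```r
# Same data as in the tribble above, as a plain data frame.
dat <- data.frame(
  Question = rep(c("QA", "QB", "QC"), each = 5),
  Score    = c(5, 2, 3, 7, 3,  2, 1, 4, 3, 5,  7, 3, 2, 3, 6),
  Test     = c("Pre", "Pre", "Post", "Post", "Post",
               "Pre", "Pre", "Pre",  "Post", "Post",
               "Pre", "Pre", "Post", "Post", "Post")
)

# split() returns a list of data frames, one per question,
# already named "QA", "QB" and "QC".
by_question <- split(dat, dat$Question)

# Compute Cohen's d on each piece; lapply() keeps the list names.
er <- lapply(by_question, function(sub) {
  effectsize::cohens_d(Score ~ Test, data = sub)
})

er[["QB"]]  # result for a single question
```

The names come from split(), so there is no need to pre-allocate or name the list by hand.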
CodePudding user response:
The correct syntax for a for loop is:
for (i in seq_len(nrow(questions))) {
# code here
}
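Note that with this form i is an integer row number, not a question name, so the name has to be looked up inside the loop. Also, er$i always refers to an element literally called "i", so the results need to be stored with [[ ]] instead. A minimal sketch of the whole loop, assuming the example data with both Pre and Post scores for every question (the names q and sub are only illustrative):

```r
# Example data with Pre and Post scores for every question.
dat <- data.frame(
  Question = rep(c("QA", "QB", "QC"), each = 5),
  Score    = c(5, 2, 3, 7, 3,  2, 1, 4, 3, 5,  7, 3, 2, 3, 6),
  Test     = c("Pre", "Pre", "Post", "Post", "Post",
               "Pre", "Pre", "Pre",  "Post", "Post",
               "Pre", "Pre", "Post", "Post", "Post")
)

# One row per distinct question.
questions <- data.frame(q = unique(as.character(dat$Question)))

# Pre-allocate a list with one named slot per question.
er <- vector(mode = "list", length = nrow(questions))
names(er) <- questions$q

for (i in seq_len(nrow(questions))) {
  q   <- questions$q[i]             # i is a row number, q is the name
  sub <- dat[dat$Question == q, ]   # rows for this question only
  er[[q]] <- effectsize::cohens_d(Score ~ Test, data = sub)
  print(er[[q]])
}
```

Subsetting the data frame first and passing data = sub keeps the formula short and avoids repeating the df$Question == i filter on both sides.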