Using a loop for repetitive analyses and outputs in R


I have a dataset that contains numerous items that were measured using a pre- and posttest instrument. Here is an example dataset:

Question    Score   Test
  QA         5       Pre
  QA         2       Pre
  QA         3       Post
  QA         7       Post
  QA         3       Post
  QB         2       Pre
  QB         1       Pre
  QB         4       Pre
  QC         7       Pre
  QC         3       Pre
  QC         2       Post
  QC         3       Post
  QC         6       Post

I want to compute Cohen's d for each question in this data and store the result as an object in my environment, such as:

Effectsize1<-effectsize::cohens_d(df$Score[df$Question== "QA"]~ df$Test[df$Question== "QA"], data = df)

Instead of writing out this code for each item, I tried to do this with a loop:


i for (1:rnow(questions)){
 er$i<-effectsize::cohens_d(df$Score[df$Question== i] ~ df$Test[df$Question== i] data = df)

I am not sure if I am close, or far off. Any help is much appreciated. Thanks so much!

If d is your data:

library(data.table)
setDT(d)[, effectsize::cohens_d(Score~Test), Question]


   Question   Cohens_d    CI     CI_low  CI_high
     <char>      <num> <num>      <num>    <num>
1:       QA  0.3706247  0.95 -1.4690832 2.153623
2:       QB  1.9611614  0.95 -0.7822302 4.510194
3:       QC -0.5656854  0.95 -2.3643683 1.316452


You don't need a loop, you could do it all with tidy functions:

library(dplyr)

dat %>% 
  group_by(Question) %>%
  summarise(d = effectsize::cohens_d(Score ~ Test)) %>% 
  tidyr::unpack(d)  # spread the data-frame column d into regular columns
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.

#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> # A tibble: 3 × 5
#>   Question Cohens_d    CI CI_low CI_high
#>   <chr>       <dbl> <dbl>  <dbl>   <dbl>
#> 1 QA          0.371  0.95 -1.47     2.15
#> 2 QB          1.12   0.95 -0.933    3.03
#> 3 QC         -0.566  0.95 -2.36     1.32

If you wanted to do the loop instead, you could do it this way:

questions<-data.frame(q = unique(dat$Question))
# pre-allocate a list with one named slot per question
er<-vector(mode="list", length=nrow(questions))
names(er) <- questions$q
for(i in questions$q){
  er[[i]]<-effectsize::cohens_d(dat$Score[dat$Question== i] ~ dat$Test[dat$Question== i])
}
er
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.

#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> $QA
#> Cohen's d |        95% CI
#> -------------------------
#> 0.37      | [-1.47, 2.15]
#> - Estimated using pooled SD.
#> $QB
#> Cohen's d |        95% CI
#> -------------------------
#> 1.12      | [-0.93, 3.03]
#> - Estimated using pooled SD.
#> $QC
#> Cohen's d |        95% CI
#> -------------------------
#> -0.57     | [-2.36, 1.32]
#> - Estimated using pooled SD.

Here, the loop counter i stands in for the question names; that is, it is a string and can be used like any other string in R. We initialize the er object as a list with the right number of elements and then name the elements after the questions. Inside the loop, i takes the values "QA", "QB" and "QC" in turn, so er[[i]] fills the element with the matching name.
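To make the role of a string-valued loop counter concrete, here is a minimal base-R sketch of the same pattern. It assumes a small made-up data frame toy and a hypothetical helper pooled_d() that computes Cohen's d by hand with a pooled SD, so that it runs without the effectsize package; neither comes from the answer above.

```r
# Made-up data for illustration only (not the asker's dataset).
toy <- data.frame(
  Question = rep(c("QA", "QB"), each = 4),
  Score    = c(1, 3, 4, 6, 2, 4, 2, 4),
  Test     = rep(c("Pre", "Pre", "Post", "Post"), times = 2)
)

# Hypothetical helper: Cohen's d (mean of x minus mean of y) with pooled SD.
pooled_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

# One named, pre-allocated list slot per question.
qs <- unique(toy$Question)
res <- vector(mode = "list", length = length(qs))
names(res) <- qs

for (q in qs) {
  # q is a string such as "QA", so it can index the list by name.
  rows <- toy[toy$Question == q, ]
  res[[q]] <- pooled_d(rows$Score[rows$Test == "Post"],
                       rows$Score[rows$Test == "Pre"])
}

res[["QA"]]   # one result, looked up by question name
unlist(res)   # all results as a named numeric vector
```

Because the list is named up front, each result can be retrieved by question name afterwards, and unlist() collapses the scalar results into a single named vector.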

The correct syntax for a for loop is:

for (i in 1:nrow(questions)) {

  # code here

}
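As a minimal sketch of that index-based form, the loop below runs over the row numbers of a hypothetical one-column questions data frame (made up here for illustration). In this form i is an integer, so the question name has to be looked up with questions$q[i], and results are stored with [[i]] rather than $i, which would repeatedly overwrite a single element literally named "i". The sketch uses seq_len(nrow(questions)) in place of 1:nrow(questions) so that the loop body is skipped when there are no rows.

```r
# Hypothetical lookup table of question names.
questions <- data.frame(q = c("QA", "QB", "QC"))

# Pre-allocate one list slot per row.
labels <- vector(mode = "list", length = nrow(questions))

for (i in seq_len(nrow(questions))) {
  # i is 1, 2, 3; fetch the matching name before using it.
  labels[[i]] <- paste("Question", questions$q[i])
}

labels
```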
