Home > Net >  Saving results of a simulation using a for loop
Saving results of a simulation using a for loop

Time:11-12

I am trying to run a more complicated simulation and save the results of reach replication to a data frame. A very simple abstraction of what I am trying to do is like:

sim.res <- data.frame(mn1 = mn1, mn2 = mn2, rep = rep)

for (k in 1:10){
  
  set.seed(k*8)
  mn1 <- mean(rnorm(25, 1, 1))
  mn2 <- mean(rnorm(25, 2, 1))

  sim.res$mn1 <- mn1
  sim.res$mn2 <- mn2
  sim.res$rep <- k
}

The output should be a dataframe with three columns and, in this case 10 rows, with the results of each replication stored in each row. What I get is the 10th replication. Where am I going wrong?

Thanks.

CodePudding user response:

You're overwriting the entirity of each column each time you iterate through the loop because R is vectorised (runs on columns) by default. Also, theres no need to reset your seed each time you iterate through the loop. Try something like this.

library(tidyverse)

set.seed(123)

lapply(
  1:10,
  function(k) {
    data.frame(
      mn1 = mean(rnorm(25, 1, 1)), 
      mn2 = mean(rnorm(25, 2, 1)), 
      rep = k
    )
  }
) %>%
bind_rows()

         mn1      mn2 rep
1  0.9666697 2.102137   1
2  1.0102410 2.282576   2
3  0.7231910 1.769008   3
4  1.2152427 1.862371   4
5  1.1074981 1.875461   5
6  1.1727220 2.326180   6
7  0.9789770 2.025131   7
8  1.0981718 1.752828   8
9  0.9402713 1.928519   9
10 1.2714378 2.283175  10

Or, taking advantage of the fact that the mean of n draws from N(mu, sd) is N(mu, sd/sqrt(n)),

data.frame(
  rep=rep(1:10),
  mn1=rnorm(10, 1, sqrt(1/25)),
  mn2=rnorm(10, 2, sqrt(1/25))
)

   rep       mn1      mn2
1    1 1.2053570 2.176493
2    2 1.1502123 2.041120
3    3 0.6981667 1.876713
4    4 0.9809705 1.853040
5    5 0.8208104 1.973639
6    6 0.5858498 2.062003
7    7 1.0300240 1.792064
8    8 0.9841577 1.963138
9    9 0.9805261 2.193453
10  10 1.0432305 1.978344

CodePudding user response:

In R, the most basic object is still a vector, so this doesn't require a loop at all.

When you are setting sim.res$mn1 <- mn1 you are putting mn1 into the data.frame, but since mn1 is only length 1 and sim.res has 10 rows, R repeats the value 10 times and puts it in your data.frame.

Also set.seed only needs to be used once for reproducibility.

set.seed(42)
sim.res <- data.frame(mn1 = replicate(10, mean(rnorm(25, 1, 1))),
                      mn2 = replicate(10, mean(rnorm(25, 2, 1))), 
                      rep = 1:10)
sim.res
         mn1      mn2 rep
1  1.0818928 1.934759   1
2  1.0153544 2.185476   2
3  0.9898056 1.523066   3
4  0.9132660 2.229027   4
5  1.1003520 1.754576   5
6  0.7468452 2.211500   6
7  0.8480360 2.303661   7
8  1.2965247 2.014611   8
9  0.7511059 1.801408   9
10 1.0231177 1.924155  10

replicate is just a wrapper for sapply that makes this code easier to follow.
Maybe technically that's using a loop, but it sure doesn't look like one.

  • Related