How to call a function multiple times with changing arguments and store them in a data frame in R?

Time:11-24

I'm an R beginner and want to run a simulation in which I draw from the normal distribution multiple times with different values for sd.

For example, I want to run rnorm once for each sd, with sd increasing from 1 to M across the runs, and store the results in a list or data frame. I know this is a pretty basic step, but I couldn't get it to work on my own.

rnorm(n=1, mean=0, sd=1)

rnorm(n=1, mean=0, sd=2)

... rnorm(n=1, mean=0, sd=M)

I tried to code this with a for loop, but it didn't work.

test <- for(i in 1:10){
  test <- rnorm(n=1, mean=0, sd=i)
  return(test)
}

If I enter test in the console, I just get NULL as output. Hope someone can point me in the right direction.

Thanks in advance.

CodePudding user response:

If you want to generate one value per i, you could try:

res <- c()
for(i in 1:10){
  res <- c(res,rnorm(n=1, mean=0, sd=i))
}
res
[1]  1.3586796 -0.2055755  1.1630148 -0.2152202 -6.8852978 -2.4899674 -2.7600297 -0.4745072  9.9002283  7.6317575

The result will be saved in a vector.
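Since the question also asks about a data frame, the drawn values can be paired with the sd that produced each of them. A minimal sketch, assuming the column names sd and value (chosen here for illustration, they are not from the original post) and an arbitrary seed so that the run is reproducible:

```r
set.seed(1)  # arbitrary seed, only so that the run is reproducible

sds <- 1:10

# vapply() calls the function once per sd and checks that every call
# returns exactly one numeric value
values <- vapply(sds, function(s) rnorm(n = 1, mean = 0, sd = s), numeric(1))

# One row per run: the sd that was used and the value that was drawn
# (the column names are illustrative assumptions)
res_df <- data.frame(sd = sds, value = values)
str(res_df)
```

Keeping sd next to each value makes it easy to filter or plot by sd later on.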

If you want to generate several values per i, for example 5 values per i, you can fill a matrix column by column:

res <- matrix(0, nrow = 5, ncol = 10)
for(i in 1:10){
  res[,i] <- rnorm(n=5, mean=0, sd=i)
}
res
           [,1]       [,2]      [,3]       [,4]       [,5]       [,6]      [,7]         [,8]       [,9]     [,10]
[1,] -0.1645236 -1.4149903  1.194318  7.9215996 12.0080888   1.132754  3.328567  2.331569884  -5.118019  3.329504
[2,] -0.2533617  0.7291639 -1.836079 -1.4688859 -0.1962000 -10.829752 -4.969625 -3.546334986  -1.216608 10.630998
[3,]  0.6969634  1.5370658  1.023359 -4.1765385  3.4486968   8.793329  4.275084  0.008842813  10.602783 -3.041839
[4,]  0.5566632 -0.2246924 -3.388089  2.2788785  0.1400108   0.919520 -6.538683  0.594730593 -13.712101  3.700188
[5,] -0.6887557  1.7622155  4.299071 -0.5402184 -3.7163660  13.035670 -8.775434 -4.716167570   5.345516  2.670988
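The same kind of matrix can be built without an explicit loop. A sketch, assuming sapply() is acceptable here: when every call returns a vector of the same length, sapply() binds the results together as columns. The long data frame at the end, with its column names sd and value, is an illustrative addition rather than part of the answer above.

```r
set.seed(1)  # arbitrary seed, only so that the run is reproducible

# Each call returns 5 draws; sapply() binds them as the columns of a matrix
res <- sapply(1:10, function(s) rnorm(n = 5, mean = 0, sd = s))
dim(res)  # 5 rows (draws) by 10 columns (one per sd)

# Optional: reshape into a long data frame with one row per draw.
# as.vector() reads the matrix column by column, so each sd is repeated 5 times.
res_long <- data.frame(sd = rep(1:10, each = 5), value = as.vector(res))
head(res_long)
```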

CodePudding user response:

It is good R practice to preallocate arrays. On a sample this small it won't make any functional difference, but your code would be more idiomatic as follows:

> k <- 10
> test <- rep(NA, k)
> for(i in 1:k){
      test[i] <- rnorm(n=1, mean=0, sd=i)
  }
> test
 [1]  -1.0083914   0.4404118   3.0581583  -8.2564404   3.7986092 -10.9064909  -6.5173390  -3.8936812
 [9]   5.2027044 -11.1803524

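(In your code above: a for loop does not return a value. The loop itself always evaluates to NULL, which is why test ends up as NULL, and return() is only meant for use inside functions. The assignment inside the loop would also overwrite test in every iteration, so only the last value would survive.)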
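To make the NULL result concrete, here is a short sketch. It shows that the value of a for loop is NULL whatever happens in its body, and where return() does belong, namely inside a function. The helper name draw_one is made up for illustration.

```r
# The value of a for loop itself is NULL, regardless of its body
x <- for (i in 1:3) i
is.null(x)  # TRUE

# return() belongs inside a function; draw_one is a hypothetical helper
draw_one <- function(s) {
  return(rnorm(n = 1, mean = 0, sd = s))
}

# Collect the results by calling the function once per sd
test <- sapply(1:10, draw_one)
length(test)  # 10
```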

CodePudding user response:

The above solutions work, but here is an alternative that I think is useful for R beginners.

We can achieve this without for-loops by adapting the function to the use case. Using a vectorized solution is usually preferable in R.

v <- Vectorize(rnorm, vectorize.args = c("sd"))

We can now call it like this:

set.seed(10)
v(n = 1, mean = 0, sd = 1:10)
#> 0.01874617  -0.36850508  -4.11399165  -2.39667086   1.47272563   2.33876580  -8.45653323  -2.90940814 -14.64005414  -2.56478394

This is the same as:

set.seed(10)
sds <- 1:10
out <- vector(mode = "numeric", 10)
for (i in seq_along(sds)) {
  out[i] <- rnorm(1, 0, sds[i])
}
out
#> 0.01874617  -0.36850508  -4.11399165  -2.39667086   1.47272563   2.33876580  -8.45653323  -2.90940814 -14.64005414  -2.56478394

but v is much faster (about 0.05 milliseconds vs. 2 milliseconds on this trivial example) and requires less memory.
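It may also be worth noting that rnorm already accepts a vector for sd, so for this particular function no wrapper is strictly needed. A hedged sketch: with n = 10 and sd = 1:10, the i-th draw uses the i-th sd. With the same seed I would expect the same numbers as the loop, since the random draws are consumed in the same order, but the comparison at the end is there so you can check this on your own installation rather than take it on trust.

```r
# rnorm() recycles sd across the n draws, so draw i uses sd[i]
set.seed(10)
direct <- rnorm(n = 10, mean = 0, sd = 1:10)

# The explicit loop, started from the same seed, for comparison
set.seed(10)
looped <- numeric(10)
for (i in 1:10) {
  looped[i] <- rnorm(n = 1, mean = 0, sd = i)
}

# Compare the two results
all.equal(direct, looped)
```

Note that n must match the length of sd here. With n = 1 and sd = 1:10, only the first sd would be used.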
