Why does it take so much time for R to compute a for loop with basic calculations?

Time:01-16

Why does the computation of the following code in R take so much time? It runs for many minutes, so I interrupted the calculation.

My aim is to adapt my simulated random numbers (sumzv, a 1000000 x 10 matrix) to my market model S_t (geometric Brownian motion). The vectors m and s describe the drift and the deviation of the GBM; each contains 10 numbers. DEL is the variable for the time steps. S_0 is a vector containing 10 stock prices at time 0.

n <- 1000000
k <- 10

S_t <- data.frame(matrix(0, nrow = n, ncol = k))

i <- 1
j <- 1
t <- 10

for (j in 1:k) {
  
  for (i in 1:n) {
    S_t[i, j] <- S_0[j] * exp(m[j] * t * DEL + s[j] * sqrt(DEL) * sumzv[i, j])
  
  }
  
}

Thank you for your help. Please keep in mind that I'm a beginner :)

Unfortunately, I couldn't find any helpful information on the internet so far. Some pages said that vectorization helps to speed up R code, but I couldn't see how to apply it here. I tried to break down the data frames into vectors, but this got very complex.

CodePudding user response:

The following code with vectorized inner loop is equivalent to the posted code.
It also pre-computes some inner loop vectors, fac1 and fac2.

S_t <- data.frame(matrix(0, nrow = n, ncol = k))
fac1 <- m * t * DEL
fac2 <- s * sqrt(DEL)
for (j in 1:k) {
  S_t[, j] <- S_0[j] * exp(fac1[j] + fac2[j] * sumzv[, j])
}

The fully vectorized version of the loop on j above is the one-liner below. The transposes are needed because R is column major and we are multiplying by row vectors indexed on j = 1:k.

S_t2 <- t(S_0 * exp(fac1 + fac2 * t(sumzv)))
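
To see why the transposes matter, here is a minimal sketch on tiny inputs. All numeric values (S_0, m, s, DEL and the random sumzv) are made up for illustration and are not taken from the question. It compares the one-liner against an explicit double loop and shows that dropping the transposes gives a different result.

```r
# Small, made-up inputs (assumptions for illustration only)
set.seed(1)
n <- 5
k <- 3
t <- 10          # a numeric `t` does not hide the function t(): R skips
DEL <- 1 / 250   # non-function objects when looking up a function call
S_0 <- c(100, 50, 20)
m <- c(0.05, 0.02, 0.08)
s <- c(0.2, 0.3, 0.1)
sumzv <- matrix(rnorm(n * k), nrow = n, ncol = k)

fac1 <- m * t * DEL
fac2 <- s * sqrt(DEL)

# Reference result: explicit double loop, one element at a time
ref <- matrix(0, nrow = n, ncol = k)
for (j in 1:k) {
  for (i in 1:n) {
    ref[i, j] <- S_0[j] * exp(fac1[j] + fac2[j] * sumzv[i, j])
  }
}

# t(sumzv) is k x n, so the length-k vectors are recycled down each column
# and element j of S_0, fac1, fac2 meets row j of t(sumzv)
one_liner <- t(S_0 * exp(fac1 + fac2 * t(sumzv)))

# Without the transposes the length-k vectors are recycled down the n rows
# of sumzv, so the wrong parameters are paired with most elements
no_transpose <- S_0 * exp(fac1 + fac2 * sumzv)

print(isTRUE(all.equal(ref, one_liner)))
print(isTRUE(all.equal(ref, no_transpose)))
```

The first comparison should agree and the second should not. Note that the no-transpose version raises no warning here, because the matrix length is a multiple of k, so the mistake is silent.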

CodePudding user response:

I'm not going to try to solve your entire issue, but you mention vectorisation, so here is an example of it and its benefits. I calculate the same result twice: once in a double-loop form like your general approach, and once where the outer loop is switched out for vectorisation. On my system the first version takes 1 second to calculate and the second 0.2 seconds (5x faster).

If the double for loop approach is the one you find easiest to understand, you may find it easier to implement the double loop in Rcpp or similar.
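
A minimal sketch of such a comparison could look like the following. It is not the answerer's original code: the inputs are made-up values, n is much smaller than in the question so that the slow version finishes quickly, and a matrix is used as the target instead of a data frame. Timings depend on the machine, so none are claimed here.

```r
# Made-up inputs (assumptions for illustration only)
set.seed(42)
n <- 20000
k <- 10
t <- 10
DEL <- 1 / 250
S_0 <- seq(10, 100, length.out = k)
m <- rep(0.05, k)
s <- rep(0.2, k)
sumzv <- matrix(rnorm(n * k), nrow = n, ncol = k)

# Version 1: double loop, assigning one element at a time
time_loop <- system.time({
  out_loop <- matrix(0, nrow = n, ncol = k)
  for (j in 1:k) {
    for (i in 1:n) {
      out_loop[i, j] <- S_0[j] * exp(m[j] * t * DEL + s[j] * sqrt(DEL) * sumzv[i, j])
    }
  }
})

# Version 2: the loop over rows is replaced by one vectorized
# assignment per column
time_vec <- system.time({
  out_vec <- matrix(0, nrow = n, ncol = k)
  for (j in 1:k) {
    out_vec[, j] <- S_0[j] * exp(m[j] * t * DEL + s[j] * sqrt(DEL) * sumzv[, j])
  }
})

# Elapsed wall-clock seconds for each version
print(time_loop["elapsed"])
print(time_vec["elapsed"])

# Both versions must produce the same numbers
stopifnot(isTRUE(all.equal(out_loop, out_vec)))
```

In the question the target is a data frame rather than a matrix. Element-wise assignment such as S_t[i, j] <- ... on a data frame tends to be far slower than on a matrix, which is likely a large part of the run time observed there.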
