How to parallelize a for loop that is looping over a vector in R

Time:08-25

set.seed(3)
myvec <- rnorm(1000)

output <- vector("list", length = length(myvec))
for (i in seq_along(myvec)) {
  output[[i]] <- floor(myvec[i])^2 + exp(myvec[i])^2/2
}

Suppose I have a pre-specified vector of numbers called myvec. I would like to loop over each element, and the final output is a list.

Using a for loop can be very inefficient. Similarly, using lapply is also quite slow.

output <- lapply(seq_along(myvec), function(i) {
  floor(myvec[i])^2 + exp(myvec[i])^2/2
})

Is there an alternative that's much faster? The function above is a made-up toy function. In reality, the function I'm running is much more complicated than just floor(myvec[i])^2 + exp(myvec[i])^2/2, so I'm looking for alternatives to a for loop and lapply.

CodePudding user response:

Here's a foreach example:

library(foreach)
library(doParallel)

registerDoParallel(cores = 6)
output <- foreach(x = myvec) %dopar% {floor(x)^2 + exp(x)^2/2}
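
If a plain numeric vector is more convenient than a list, foreach can combine the results for you. The following is only a sketch under a few assumptions: it uses an explicit cluster of 2 workers (an arbitrary number, adjust it to your machine) and stops the cluster when done. The name output_vec is just for illustration.

```r
library(foreach)
library(doParallel)

set.seed(3)
myvec <- rnorm(1000)

# Assumption: 2 workers; pick a number that suits your machine.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# .combine = c concatenates the per-element results into one numeric vector
# instead of returning a list.
output_vec <- foreach(x = myvec, .combine = c) %dopar% {
  floor(x)^2 + exp(x)^2/2
}

# Release the worker processes once the loop has finished.
parallel::stopCluster(cl)
```

For a function as cheap as this toy one, the overhead of sending work to the workers will likely outweigh the gain; parallelizing tends to pay off when each iteration is expensive.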

CodePudding user response:

There are several ways to accomplish this, but my go-to is purrr. The purrr implementation would be as follows:

library(purrr)

# myvec is the vector defined in the question
output <- map(myvec, function(x) {
  floor(x)^2 + exp(x)^2/2
})

There are several ways you could rewrite the above code, including a shorter anonymous-function syntax or map_dbl, which returns a numeric vector instead of a list, but the above is the most basic explicit version.
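
As a rough sketch of those two variations, assuming the same myvec as in the question (the names output_list and output_dbl are only for illustration):

```r
library(purrr)

set.seed(3)
myvec <- rnorm(1000)

# Formula shorthand: ~ creates an anonymous function and .x is its argument.
output_list <- map(myvec, ~ floor(.x)^2 + exp(.x)^2/2)

# map_dbl returns a plain numeric vector rather than a list; it errors
# if any result is not a single number.
output_dbl <- map_dbl(myvec, ~ floor(.x)^2 + exp(.x)^2/2)
```

map_dbl is usually the more convenient choice when every call returns exactly one number, since it saves an unlist() afterwards.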

The beauty of purrr is that you can also parallelize it very easily with furrr. The same chunk could be parallelized as follows:

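library(furrr)
# multiprocess is deprecated in recent versions of the future package;
# multisession runs the work in background R sessions instead.
plan(multisession)

output <- future_map(myvec, function(x) {
  floor(x)^2 + exp(x)^2/2
})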
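
For completeness, here is a self-contained sketch of the furrr version. It assumes 2 workers (an arbitrary choice) and switches back to sequential processing afterwards so the background sessions are shut down.

```r
library(furrr)  # attaches the future package, which provides plan()

set.seed(3)
myvec <- rnorm(1000)

# Assumption: 2 background R sessions; adjust workers to your machine.
plan(multisession, workers = 2)

output <- future_map(myvec, function(x) {
  floor(x)^2 + exp(x)^2/2
})

# Return to sequential processing, which also shuts down the workers.
plan(sequential)
```

As with any parallel approach, expect a real speedup only when the function applied to each element is expensive enough to outweigh the cost of starting the workers and moving data to them.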