In R: Force `foreach` loop to assign each computation inside the loop to a separate variable

Time:08-17

I have a foreach loop that assigns several different calculations to separate variables inside the loop body. With a normal for loop, all of those variables are accessible afterwards, but after switching to a foreach loop, only the last assigned variable is returned. How do I change this?

# assume a parallel backend is already registered;
# it won't affect the result
library(foreach)

set.seed(1)

data <- dplyr::tibble(
  x = rnorm(5),
  y = rnorm(5, sd = 0.5),
  z = rnorm(5, sd = 0.25)
)

sims = 10

df = foreach(sim = 1:sims) %dopar% {
  
  calc <- data |> 
          furrr::future_pmap_dfr(
            function(x, y, z) {
              res <- (x + y) / z
              names(res) <- 'col'
              return(res)
            }
          )
  
  calc_2 <- data |>
            furrr::future_pmap_dfr(
              function(x, y, z) {
                res <- (x + y + z)^2
                names(res) <- 'col'
                return(res)
              }
            )
}

df[[1]] returns:

# A tibble: 5 × 1
    col
  <dbl>
1 0.434
2 0.275
3 0.387
4 1.77 
5 0.210

After swapping the order of calc and calc_2, df[[1]] now returns:

# A tibble: 5 × 1
     col
   <dbl>
1 -2.74 
2  4.38 
3  3.00 
4 -3.40 
5  0.629

The foreach loop does not make the other calculation accessible. When I run the foreach inside my real function, which works as intended with a for loop, the function fails because only one of the three variables assigned in the loop body is stored and available to the rest of the function. How do I make the foreach loop, like a for loop, keep all the variables calculated inside it so that they can be used throughout the function?

Answer:

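With %dopar%, each iteration runs on a worker, so assignments made inside the body do not reach the calling environment; foreach collects only the value of the last expression of each iteration. You can return a list of the results as that last expression and process it accordingly.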

df = foreach(sim = 1:sims) %dopar% {
  
  calc <- data |> 
          furrr::future_pmap_dfr(
            function(x, y, z) {
              res <- (x + y) / z
              names(res) <- 'col'
              return(res)
            }
          )
  
  calc_2 <- data |>
            furrr::future_pmap_dfr(
              function(x, y, z) {
                res <- (x + y + z)^2
                names(res) <- 'col'
                return(res)
              }
            )
  list(calc, calc_2)
}

df is now a list of length sims; each element is a list of length 2 containing calc and calc_2, so df[[1]][[1]] is calc and df[[1]][[2]] is calc_2 for the first simulation.
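
To carry each variable through the rest of a function, it may help to name the list elements and then regroup the results by name. The following is only a minimal sketch: the small data frames stand in for the real calculations, the names calc_all, calc_2_all and calc_df are made up for illustration, and %do% is used so that it runs without a parallel backend (the returned structure should be the same with %dopar%).

```r
library(foreach)

sims <- 3

# Return a *named* list as the last expression of the loop body,
# so each piece can later be extracted by name.
# The data frames below are placeholders for the real calculations.
res <- foreach(sim = 1:sims) %do% {
  calc   <- data.frame(col = sim * c(1, 2, 3))
  calc_2 <- data.frame(col = (sim + c(1, 2, 3))^2)
  list(calc = calc, calc_2 = calc_2)
}

# Regroup: one list per variable, each of length `sims`.
calc_all   <- lapply(res, `[[`, "calc")
calc_2_all <- lapply(res, `[[`, "calc_2")

# Optionally stack one group into a single data frame,
# tagging each row with the simulation it came from.
calc_df <- do.call(
  rbind,
  Map(function(d, i) data.frame(sim = i, d), calc_all, seq_along(calc_all))
)
```

After this, calc_all[[i]] and calc_2_all[[i]] play the role that calc and calc_2 had in iteration i of the original for loop.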
