How to use the `purrr` package in R instead of for-loop to iterate over indices

Time:12-14

I have a list of S4 objects, and I'm trying to apply a function to each element: I select an index position and then, from that position, extract the keywords I'm interested in. A for loop applies the function successfully, but is there a way to do this with the purrr package? I'm not sure how to replicate the S4 objects exactly, so I've included a very high-level example just to give an idea of my process.

list_1 <- list("Sample", "test", "test Date")
list_2 <- list("test", "sample", "test Date")
listoflists <- list(list_1, list_2)

I created a list of indices of "Sample":

groupList <- map(listoflists,~which(toupper(.) == "SAMPLE"))

As well as a list of keywords that I'd like to extract:

keywordsList <- list(c("One test", "two test"), c("one test", "two test"))

I have a function that takes the S4 objects, selects the index where "sample" is found, and from that extracts the keywords.

for(i in seq_along(listoflists){
output[[i]] <- some_function(listoflists[[i]], index = groupList[[i]], keywords = keywordsList[[i]]) }

I tried using imap, but it seems like when I do this, the output's sublist only has 1 keyword (say "One test" in first list and "two test" in second list) instead of 3:

output <- listoflists %>% imap(~some_function(.x,index = groupList[[.y]], keywords = keywordsList[[.y]])

CodePudding user response:

You are missing a closing parenthesis in your for loop, but other than that your code should work. I am going to define a trivial some_function() to demonstrate:

# Trivial stand-in: return the element at `index` followed by the keywords.
some_function <- function(x, index, keywords) {
    c(x[[index]], keywords)
}

loop_output <- vector(mode = "list", length = length(listoflists))
for (i in seq_along(listoflists)) {
    loop_output[[i]] <- some_function(listoflists[[i]], index = groupList[[i]], keywords = keywordsList[[i]])
}

purr_output <- imap(
    listoflists,
    ~ some_function(
        .x,
        index = groupList[[.y]],
        keywords = keywordsList[[.y]]
    )
)

identical(loop_output, purr_output)
# TRUE

If, even with the brackets corrected, your real code works in a loop but not with imap(), I doubt that the use of S4 objects is the cause.

You can be tripped up if you have a named list. From the imap docs:

imap_xxx(x, ...), an indexed map, is short hand for map2(x, names(x), ...) if x has names, or map2(x, seq_along(x), ...) if it does not.

See for example:

listoflists <- list(list_1, list_2)
imap(listoflists, ~.y)
# [[1]]
# [1] 1

# [[2]]
# [1] 2

listoflists <- list(l1 = list_1, l2 = list_2)
imap(listoflists, ~.y)
# $l1
# [1] "l1"

# $l2
# [1] "l2"

Make sure you are iterating over the integer positions rather than the names, and the output should be identical to the loop's.
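As a minimal sketch of how to force positional indexing, assuming the outer list is named while groupList and keywordsList are not (the names `named_lists`, `via_unname`, and `via_positions` are made up for this illustration), you could either drop the names before calling imap() or map over seq_along() directly:

```r
library(purrr)

# Trivial stand-in for the real function, as defined earlier in this answer.
some_function <- function(x, index, keywords) {
    c(x[[index]], keywords)
}

list_1 <- list("Sample", "test", "test Date")
list_2 <- list("test", "sample", "test Date")

# Assumed setup: a named outer list, but unnamed helper lists.
named_lists <- list(l1 = list_1, l2 = list_2)
groupList <- map(unname(named_lists), ~ which(toupper(.) == "SAMPLE"))
keywordsList <- list(c("One test", "two test"), c("one test", "two test"))

# Option 1: drop the names so that .y is the integer position.
via_unname <- imap(
    unname(named_lists),
    ~ some_function(.x, index = groupList[[.y]], keywords = keywordsList[[.y]])
)

# Option 2: map over the positions explicitly.
via_positions <- map(
    seq_along(named_lists),
    function(i) {
        some_function(named_lists[[i]],
                      index = groupList[[i]],
                      keywords = keywordsList[[i]])
    }
)

identical(via_unname, via_positions)
```

Both options return an unnamed list. If you need the original names on the result, something like `set_names(via_positions, names(named_lists))` should restore them.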

CodePudding user response:

You could also do this with purrr::pmap(), which maps in parallel over an arbitrary number of lists (passed within a super-list):

output <-
  pmap(.l = list(listoflists, index = groupList, keywords = keywordsList),
       .f = some_function)
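To check this against the toy data from the question, here is a self-contained sketch. It reuses the trivial some_function() from the other answer as a stand-in for the real function, so the expected values apply only to that stand-in. The unnamed first element of `.l` is passed as the first argument, while the named elements are matched to the `index` and `keywords` arguments by name:

```r
library(purrr)

# Trivial stand-in for the real function (an assumption for illustration).
some_function <- function(x, index, keywords) {
    c(x[[index]], keywords)
}

list_1 <- list("Sample", "test", "test Date")
list_2 <- list("test", "sample", "test Date")
listoflists <- list(list_1, list_2)

groupList <- map(listoflists, ~ which(toupper(.) == "SAMPLE"))
keywordsList <- list(c("One test", "two test"), c("one test", "two test"))

# Each step i calls:
#   some_function(listoflists[[i]], index = groupList[[i]], keywords = keywordsList[[i]])
pmap_output <- pmap(
    .l = list(listoflists, index = groupList, keywords = keywordsList),
    .f = some_function
)

# Reference result from the equivalent for loop.
loop_output <- vector(mode = "list", length = length(listoflists))
for (i in seq_along(listoflists)) {
    loop_output[[i]] <- some_function(listoflists[[i]],
                                      index = groupList[[i]],
                                      keywords = keywordsList[[i]])
}

identical(pmap_output, loop_output)
```

Because pmap() walks the lists in parallel by position, it avoids the index-versus-name lookup entirely, which makes it less sensitive to whether the outer list is named.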