Using invoke_map() or exec() on a data.frame

Time:05-28

I have a data frame in which different rows require different computations to produce a result. Each computation is implemented in its own function, and the function to use for a given row is named in a column of the data frame. Here is a minimal example:

f1 = function(a,...){return(2*a)}
f2 = function(a,b,...){return(a + b)}

df = data.frame(a=1:4,b=5:8,f=c('f1','f2','f2','f1'))

#Expected result:
  a b  f result
1 1 5 f1      2
2 2 6 f2      8
3 3 7 f2     10
4 4 8 f1      8

With pmap, I am able to apply a function to each row of a dataframe, and I also read about exec() replacing invoke_map(), but none of my attempts to combine both seem to work because exec() only seems to work with lists:

df$result = purrr::pmap(df,df$f)
df$result = purrr::pmap(df$f,exec,df)
...

Is there a more elegant way than filtering the dataframe for each function, using pmap on each filtered dataframe and then binding everything back together?

Thank you in advance!

Edit: I should mention that my data frame has a lot of columns, and that the functions do not all need the same arguments (e.g. some may skip `a` but require `b`). Therefore I need a method that does not require passing the arguments explicitly.

CodePudding user response:

You can do this with exec() and pmap()

f1 = function(a,...){return(2*a)}
f2 = function(a,b,...){return(a + b)}

df = data.frame(a = 1:4, b = 5:8, f = c('f1', 'f2', 'f2', 'f1'))

require(purrr)
require(dplyr)

df |> mutate(result = pmap(list(f, a, b), exec))
#>   a b  f result
#> 1 1 5 f1      2
#> 2 2 6 f2      8
#> 3 3 7 f2     10
#> 4 4 8 f1      8

Created on 2022-05-27 by the reprex package (v2.0.1)


PS. You might have been getting an error because you were passing named arguments to exec(). When you call pmap(list(f = "f1", a = 1, b = 1), exec), all the named arguments are passed to ... in exec(.fn, ...), because none of the list elements is named .fn.

In the above example, the list elements are passed without their names, and the first argument is therefore assumed (by exec()) to be .fn.
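
The difference can be checked by calling exec() directly, outside pmap(). This is only a minimal sketch: it assumes f1 and f2 are defined at the top level as above, and the tryCatch() wrapper is there purely to capture the error message for illustration.

```r
library(purrr)

f1 <- function(a, ...) 2 * a
f2 <- function(a, b, ...) a + b

# Unnamed first argument: exec() takes it as .fn
positional <- exec("f2", 2, 6)

# Explicitly named .fn: the remaining named arguments go to the function
named_fn <- exec(.fn = "f1", a = 1, b = 5)

# First argument named f: it ends up in ..., so .fn is missing
wrong_name <- tryCatch(
  exec(f = "f1", a = 1, b = 5),
  error = function(e) conditionMessage(e)
)
```

The first two calls return the function results, while the third fails because nothing was matched to .fn.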

So you can use the method you suggested in conjunction with base::unname():

df |> relocate(f) |> unname() |> pmap(exec)
# [[1]]
# [1] 2
#
# [[2]]
# [1] 8
# 
# [[3]]
# [1] 10
#
# [[4]]
# [1] 8

Whereas without unname() you will get an error:

df |> relocate(f) |> pmap(exec)
# Error in .f(f = .l[[1L]][[i]], a = .l[[2L]][[i]], b = .l[[3L]][[i]], ...):
#   argument ".fn" is missing, with no default

Alternatively, you could rename df$f to df$.fn and pass the whole data.frame:

df |> rename(.fn = "f") |> pmap(exec)
# [[1]]
# [1] 2
#
# [[2]]
# [1] 8
# 
# [[3]]
# [1] 10
#
# [[4]]
# [1] 8
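
Because the renamed data frame passes every remaining column as a named argument, this form should also cope with the case from the question's edit, where there are many columns and each function uses only some of them. The sketch below is an illustration under assumptions: f3 and the extra column are hypothetical, every function is assumed to accept ... so that unused columns are ignored, and the functions are assumed to be defined at the top level. pmap_dbl() is used instead of pmap() to get a numeric vector rather than a list.

```r
library(purrr)
library(dplyr)

f1 <- function(a, ...) 2 * a
f2 <- function(a, b, ...) a + b
f3 <- function(b, ...) b * 10   # hypothetical: uses b only, ignores a

wide <- data.frame(
  a = 1:4,
  b = 5:8,
  extra = letters[1:4],          # hypothetical column that no function uses
  f = c("f1", "f2", "f3", "f1")
)

# Every column except .fn is passed by name; ... absorbs the unused ones
wide$result <- wide |> rename(.fn = f) |> pmap_dbl(exec)
```

If a function lacked the ... parameter, the unused columns would trigger an "unused argument" error, so the ... in each signature is what makes this work.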

CodePudding user response:

In base R, you can loop over the rows with lapply() and call each row's function with do.call():


df$result = lapply(1:nrow(df), \(i) {
  do.call(df[i,"f"],as.list(subset(df[i,],select=-f)))
})

Output:

      a     b f     result
  <int> <int> <chr>  <dbl>
1     1     5 f1         2
2     2     6 f2         8
3     3     7 f2        10
4     4     8 f1         8
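
Note that lapply() returns a list, so result is stored as a list column. If a plain numeric vector is preferred, a variant with vapply() might look like the following. This is a sketch rather than part of the original answer: arg_cols is a hypothetical helper name, the functions are looked up by name with get(), and as.numeric() is added so that integer and double results end up in one type.

```r
f1 <- function(a, ...) 2 * a
f2 <- function(a, b, ...) a + b

df <- data.frame(a = 1:4, b = 5:8, f = c("f1", "f2", "f2", "f1"))

# All columns except the function-name column are passed as arguments
arg_cols <- setdiff(names(df), "f")

df$result <- vapply(seq_len(nrow(df)), function(i) {
  fn <- get(df$f[i], mode = "function")             # resolve the name to a function
  args <- as.list(df[i, arg_cols, drop = FALSE])    # named list for this row
  as.numeric(do.call(fn, args))
}, numeric(1))
```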