Evaluate R function stored in column that references other column names

I have the following dataset:

my.df <- data.frame(my_function=rep(c("Var1+Var 2","Var 2-Var1","(Var 2-(Var 2-Var1))/Var 2"), 1),
                    `Var1`=rep(1:1,3),
                    `Var 2`=rep(5:5,3), check.names = FALSE)

my.df
#                  my_function Var1 Var 2
# 1                 Var1+Var 2    1     5
# 2                 Var 2-Var1    1     5
# 3 (Var 2-(Var 2-Var1))/Var 2    1     5

I want to use the column my_function to calculate a value for each row and store it in a new column called outcome.

The outcome would be 1+5=6, 5-1=4, and (5-(5-1))/5=0.2 for the three rows.

EDIT: Correct answers also refer to the following original dataset:

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))

CodePudding user response：

Loop over the rows; for each row, use gsub() to replace every column name in my_function with that row's value, then evaluate the resulting string with the (evil) parse():

vars <- colnames(my.df)[ -1 ]

sapply(seq(nrow(my.df)), function(i){
  res <- my.df[i, 1]
  for(v in vars){
    res <- gsub(v, my.df[i, v], res, fixed = TRUE)
  }
  eval(parse(text = res))
})
# [1] 6.0 4.0 0.2

Note:

fortunes::fortune("answer is parse")
# If the answer is parse() you should usually rethink the question.
#    -- Thomas Lumley
#       R-help (February 2005)
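
A variation on the same idea, offered only as a sketch: instead of pasting the values into the text, wrap each column name in backticks so that a name like "Var 2" parses as a single symbol, then evaluate the expression with the row as its data. This still parses text, so the warning above applies. The helper name quote_names is hypothetical, and the data are redefined as doubles to keep the block self-contained.

```r
# Example data from the question (one column name contains a space).
my.df <- data.frame(
  my_function = c("Var1+Var 2", "Var 2-Var1", "(Var 2-(Var 2-Var1))/Var 2"),
  `Var1` = 1, `Var 2` = 5,
  check.names = FALSE
)

vars <- colnames(my.df)[-1]

# Hypothetical helper: backtick-quote every known column name in one pass.
# Names are sorted longest first so the alternation prefers the longest match,
# and \Q...\E makes each name match literally (requires perl = TRUE).
quote_names <- function(txt, vars) {
  vars <- vars[order(nchar(vars), decreasing = TRUE)]
  pattern <- paste0("(", paste0("\\Q", vars, "\\E", collapse = "|"), ")")
  gsub(pattern, "`\\1`", txt, perl = TRUE)
}

my.df$outcome <- vapply(seq_len(nrow(my.df)), function(i) {
  expr <- str2lang(quote_names(my.df$my_function[i], vars))
  # Look up symbols in the current row first, then in base R for the operators.
  eval(expr, envir = my.df[i, vars, drop = FALSE], enclos = baseenv())
}, numeric(1))

my.df$outcome
```

Because the row is used as the evaluation environment, the values never pass through a string, which avoids formatting surprises when they are negative or have many decimals.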

CodePudding user response：

A solution could be:

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))

my.df
#>               my_function X1000 X2000
#> 1               1000+2000     1     5
#> 2               2000-1000     1     5
#> 3 (2000-(2000-1000))/2000     1     5

my.df$my_function = gsub("1000", "X1000", my.df$my_function)
my.df$my_function = gsub("2000", "X2000", my.df$my_function)

my.df$outcome = sapply(split(my.df, 1:NROW(my.df)), function(x)
  eval(str2lang(x$my_function),x))

my.df
#>                   my_function X1000 X2000 outcome
#> 1                 X1000+X2000     1     5     6.0
#> 2                 X2000-X1000     1     5     4.0
#> 3 (X2000-(X2000-X1000))/X2000     1     5     0.2

However, you should read the comments, since evaluating arbitrary code raises security concerns. See https://stackoverflow.com/a/18391779/6912817 for an example.
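
One possible mitigation, shown here as a rough sketch rather than a complete sandbox: parse the text first, inspect which symbols it uses, and refuse to evaluate anything outside an allow-list of arithmetic operators and known column names. The function safe_eval and its allowed_ops argument are hypothetical names, and the sketch assumes that no column shares its name with an R function.

```r
# Hypothetical helper: evaluate an arithmetic expression only if every
# function and variable it mentions is on an allow-list.
safe_eval <- function(txt, data,
                      allowed_ops = c("+", "-", "*", "/", "^", "(")) {
  # str2lang() accepts exactly one expression, so "a; b" is already an error.
  expr <- str2lang(txt)
  # all.names() returns every symbol used, including operators and function names.
  used <- all.names(expr, unique = TRUE)
  bad <- setdiff(used, c(allowed_ops, names(data)))
  if (length(bad) > 0) {
    stop("disallowed symbol(s): ", paste(bad, collapse = ", "))
  }
  # Resolve variables in `data`, operators in base R only.
  eval(expr, envir = data, enclos = baseenv())
}

row <- list(X1000 = 1, X2000 = 5)
safe_eval("(X2000-(X2000-X1000))/X2000", row)  # plain arithmetic on known columns
try(safe_eval("system('echo hi')", row))       # rejected before anything is evaluated
```

The check runs on the parsed expression rather than on the raw string, so it does not depend on spacing or quoting tricks in the input.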

CodePudding user response：

As expressed in the comments, I don't love parsing code from text, especially if the code text was generated from user input. Here is what I consider a safe way to evaluate these expressions:

library(tidyverse)

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))

my.df |>
  mutate(sub_function = pmap_chr(list(my_function, X1000, X2000),
                                 ~gsub(pattern = "1000", 
                                      replacement = ..2,
                                      x = ..1) |> 
                                   gsub(pattern = "2000",
                                       replacement = ..3)),
         eval = map_chr(sub_function, ~as.character(Ryacas::yac_symbol(.x))))
#>               my_function X1000 X2000 sub_function eval
#> 1               1000+2000     1     5          1+5    6
#> 2               2000-1000     1     5          5-1    4
#> 3 (2000-(2000-1000))/2000     1     5  (5-(5-1))/5  1/5

CodePudding user response：

Using rlang and purrr::pmap_dbl() (this assumes the value columns have the syntactic names Var1 and Var2, without a space):

library(rlang)
library(purrr)

my.df$outcome <- pmap_dbl(
  my.df,
  \(my_function, Var1, Var2, ...) {
    eval(parse_expr(enexpr(my_function)))
  }
)

my.df

#               my_function Var1 Var2 outcome
# 1               Var1+Var2    1    5     6.0
# 2               Var2-Var1    1    5     4.0
# 3 (Var2-(Var2-Var1))/Var2    1    5     0.2

CodePudding user response：

Here is another approach using bquote and deparse. Since your example data uses integers I first transform those to numeric to get rid of the L in the output.

my.df <- data.frame(
  my_function = rep(c("Var1+Var 2",
                      "Var 2-Var1",
                      "(Var 2-(Var 2-Var1))/Var 2"),
                    1),
  `Var1` = rep(1:1,3),
  `Var 2` = rep(5:5,3),
  check.names = FALSE)

library(dplyr)
library(stringr)

my.df %>% 
  mutate(across(starts_with("Var"), as.double)) %>%
  rowwise() %>% 
  mutate(outcome = str_replace_all(my_function,
                                   "(Var\\s{0,1}[0-9]+)",
                                   '.(.data[["\\1"]])') %>% 
           paste0("bquote(", ., ")") %>%
           str2lang %>%
           eval %>%
           list,
         outcome = paste0(deparse(outcome), " = ", res = eval(outcome)))

#> # A tibble: 3 x 4
#> # Rowwise: 
#>   my_function                 Var1 `Var 2` outcome              
#>   <chr>                      <dbl>   <dbl> <chr>                
#> 1 Var1+Var 2                     1       5 1 + 5 = 6
#> 2 Var 2-Var1                     1       5 5 - 1 = 4            
#> 3 (Var 2-(Var 2-Var1))/Var 2     1       5 (5 - (5 - 1))/5 = 0.2

Created on 2022-11-07 by the reprex package (v2.0.1)