I would like to substitute values for symbols in equations, and then simplify the resulting equations, in R.
data:
num_names <- c("num_a","num_aa","num_aaa","num_b")
num_values <- c(1,2,3,4)
df <- data.frame(id=c(1:3),
equation=c("2*x_a*num_a","num_a*(num_aa + 1)^2","num_aaa + num_b*x_b"),
stringsAsFactors = FALSE)
df
id equation
1 1 2*x_a*num_a
2 2 num_a*(num_aa + 1)^2
3 3 num_aaa + num_b*x_b
expected output:
id equation
1 1 2*x_a
2 2 9
3 3 3 + 4*x_b
CodePudding user response:
Try this:
df$eqn2 <- Reduce(function(prev, this) gsub(paste0("\\b", num_names[this], "\\b"), num_values[this], prev),
seq_along(num_names), init = df$equation)
df$eqn2 <- sapply(df$eqn2, function(eq) if (grepl("[A-Za-z_]", eq)) eq else eval(parse(text = eq)))
df$eqn2 <- gsub("(\\b1\\*|\\*1\\b)", "", df$eqn2)
df
# id equation eqn2
# 1 1 2*x_a*num_a 2*x_a
# 2 2 num_a*(num_aa + 1)^2 9
# 3 3 num_aaa + num_b*x_b 3 + 4*x_b
Not the most elegant, but it works well enough here.
One problem with doing this fully symbolically is that some symbols have values (those listed in the num_* variables, which is not the preferred format for true symbolic lookup) and some do not. I don't know of a way to evaluate only part of an equation without running into "not found" errors.
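One possible way around the "not found" errors, offered as a sketch and not as part of the answer above, is base R's substitute(): it replaces only the symbols that appear in a supplied list and leaves every other symbol untouched, so nothing is evaluated until no unknowns remain. The helper name partial_sub and the list vals are made up for this illustration; str2lang() needs R 3.6.0 or later.

```r
num_names  <- c("num_a", "num_aa", "num_aaa", "num_b")
num_values <- c(1, 2, 3, 4)

# Named list mapping each known symbol to its value (hypothetical name `vals`).
vals <- setNames(as.list(num_values), num_names)

# Hypothetical helper: substitute known symbols, evaluate only if nothing
# symbolic is left, otherwise return the partially substituted expression.
partial_sub <- function(ch, env) {
  e <- do.call(substitute, list(str2lang(ch), env))
  if (length(all.vars(e)) == 0) as.character(eval(e)) else deparse(e)
}

equations <- c("2*x_a*num_a", "num_a*(num_aa + 1)^2", "num_aaa + num_b*x_b")
res <- vapply(equations, partial_sub, character(1), env = vals, USE.NAMES = FALSE)
res
# "2 * x_a * 1"  "9"  "3 + 4 * x_b"
```

Note that this only substitutes; it does not simplify. The first result still carries the redundant "* 1", and deparse() adds spaces around operators, so a clean-up regex like the one above would have to allow for that whitespace.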
CodePudding user response:
We use the polynom package to convert the string to a polynomial class object and then to a string. e is the equation converted to an R expression and L is a list of the values of the variables, with names being the names of the variables. We also include polynomial() in L, with its name held in xname (which is the only undefined variable in e, or dummy if there is none). Then we evaluate e with respect to L and convert the result to character. polynomial always results in the name x, so at the end we replace x with xname.
library(polynom)
transform(df, equation = sapply(equation, function(ch) {
e <- str2lang(ch)
L <- c(list(polynomial()), as.list(num_values))
xname <- setdiff(all.vars(e), num_names)
if (length(xname) == 0) xname <- "dummy"
names(L) <- c(xname, num_names)
gsub("x", xname, as.character(eval(e, L)))
}))
giving
id equation
1 1 2*x_a
2 2 9
3 3 3 + 4*x_b