Text Argument in R Function With GLM: Object Not Found

Time:12-06

A function included in a package throws an error when I attempt to supply weights to it. The weights argument in the package call must be specified like this:

weights = c("kernel_wght")

Inside the function, the following two lines of code are used to specify a data frame object called weight:

weight1 <- sprintf("dataarg$%s", weights)
weight <- as.data.frame(eval(parse(text = weight1)))

However, the analytic portion of the function attempts to use glm to conduct an analysis of data, using the weights provided.

result1 <- glm(f1, family="gaussian", weights=weight, data=dataarg)

Doing so yields the following error:

Error in (function (arg) : object 'weight' not found

I've seen some recommendations that the whole glm call should be re-specified, and I've seen some references to global environment objects. Why can I print the data frame, verifying that it is indeed created, but not refer to it in the call to glm? Is there a fix that I have overlooked?

As requested, here is a small example. I created some sample data, as if it had come from a multiple imputation generating process:

dat <- c(1, 1, 0, .5,  1, 3, 0,  1,  1, 4, 0, .5,  1, 5, 1,  1,  1, 2, 1, .5,
         2, 7, 1,  1,  2, 3, 0, .5,  2, 2, 0,  1,  2, 4, 1, .5)
dat <- data.frame(matrix(dat, ncol = 4, byrow = TRUE))
colnames(dat) <- c("id", "y", "tx", "wt")

imp_lst <- lapply(1:2, function(s) dplyr::filter(dat, id == s))
for (i in seq_along(imp_lst)) {
  assign(paste0("imp", i), as.data.frame(imp_lst[[i]]))
}

df_lst <- list()
for (i in 1:length(imp_lst)) { 
  assign(paste0("imp", i), as.data.frame(imp_lst[[i]]))
  df_lst <- append(df_lst, list(get(paste0("imp", i))))
  names(df_lst)[i] <- paste0("imp", i)
}

And here is a small example, mostly taken from the package, that re-creates the problem:

my_ex <- function(datasets, y, treatment, weights=NULL, ...) {
  data <- names(datasets)

  for (i in 1:length(treatment)) {
    d1 <- sprintf("datasets$%s", data[i])
    dataarg <- eval(parse(text=d1))
    print(dataarg)

    if(!is.null(weights)) {
      weight1 <- sprintf("dataarg$%s", weights)
      weight <- as.data.frame(eval(parse(text = weight1)))
      print(weight)
    } else {
      dataarg$weight <- weight <- rep(1,nrow(dataarg))
    }

    f1 <- sprintf("%s ~ %s ", y, treatment)
    print(f1)
    result1 <- glm(f1, family="gaussian", weights=weight, data=dataarg)
    print(summary(result1))
  }
}

Using the following call, the error appears:

testrun <- my_ex(df_lst, y = c("y","y"), treatment = c("tx","tx"), weights = c("wt","wt"))

Answer:

The proximal problem is that you define the formula as a character string and pass it to glm. The string is converted to a formula inside glm's internals, so the formula's environment is an internal one rather than your function's, and glm does not know where to look for the weight variable. (Loosely speaking, glm looks for variables (1) in the data frame provided as data and (2) in the environment of the formula.) You can work around this by using as.formula() to convert the string to a formula before passing it to glm, e.g. glm(as.formula(f1), ...).
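To see the environment issue in isolation, here is a minimal sketch. The data frame toy and the helper names fit_string and fit_formula are made up for illustration; they are not from the package. The first helper hands glm the formula as a string and should fail to find the local weights, while the second converts it with as.formula() inside the function first, so the formula's environment is the function's own frame.

```r
## Toy data, invented for this illustration only
toy <- data.frame(y  = c(1, 3, 4, 5, 2, 7, 3, 2),
                  tx = c(0, 0, 0, 1, 1, 1, 0, 1),
                  wt = c(.5, 1, .5, 1, .5, 1, .5, 1))

## Formula stays a character string: it is only converted inside
## glm's internals, so `local_wts` is not visible from its environment.
fit_string <- function(d) {
  local_wts <- d$wt
  f <- "y ~ tx"
  glm(f, family = "gaussian", weights = local_wts, data = d)
}

## Formula is created here, so environment(f) is this function's
## frame, which is where `local_wts` lives.
fit_formula <- function(d) {
  local_wts <- d$wt
  f <- as.formula("y ~ tx")
  glm(f, family = "gaussian", weights = local_wts, data = d)
}

## Capture the error message instead of stopping
res_string  <- tryCatch(fit_string(toy),
                        error = function(e) conditionMessage(e))
res_formula <- fit_formula(toy)
```

If this behaves as described, res_string holds an error message and res_formula holds a fitted glm object.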

However, using functions like eval, parse, and assign is a code smell in R: it usually means there is a more natural, simpler, more robust way to do what you want. For example, I think the function below does what your function is trying to do, relying on indexing within lists rather than using eval(parse(...)) and friends.

my_ex2 <- function(datasets, y, treatment, weights = NULL, ...) {
    result <- vector("list", length(treatment))
    for (i in seq_along(treatment)) {
        form <- reformulate(treatment[i], response = y[i])
        data <- datasets[[i]]
        ## note double brackets around second term - we want
        ## the results to be a vector, not a data frame
        ## (weights = NULL gives an unweighted fit)
        weight <- if (is.null(weights)) NULL else data[[weights[i]]]
        result[[i]] <- glm(formula = form, weights = weight, data = data)
    }
    result
}

Then, to print all the summaries, apply summary to the list that the function returns, e.g. lapply(my_ex2(...), summary). If you really think you only need the summary, you can save the summary instead of the fitted object inside the loop.
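As a rough end-to-end sketch of that workflow, the block below rebuilds the sample data from the question and fits one model per data set. Two assumptions to flag: split() is swapped in for the dplyr::filter()/assign() setup so the example needs only base R, and my_ex2 is repeated (with the full argument name weights) so the block runs on its own.

```r
## Sample data from the question: 5 rows for id 1, 4 rows for id 2
dat <- c(1, 1, 0, .5,  1, 3, 0,  1,  1, 4, 0, .5,  1, 5, 1,  1,  1, 2, 1, .5,
         2, 7, 1,  1,  2, 3, 0, .5,  2, 2, 0,  1,  2, 4, 1, .5)
dat <- data.frame(matrix(dat, ncol = 4, byrow = TRUE))
colnames(dat) <- c("id", "y", "tx", "wt")

## One data frame per id, named imp1, imp2 (base R stand-in for the
## dplyr::filter() + assign() loop in the question)
df_lst <- split(dat, dat$id)
names(df_lst) <- paste0("imp", names(df_lst))

my_ex2 <- function(datasets, y, treatment, weights = NULL, ...) {
  result <- vector("list", length(treatment))
  for (i in seq_along(treatment)) {
    ## reformulate() builds the formula here, so its environment is
    ## this function's frame, where `weight` is defined
    form <- reformulate(treatment[i], response = y[i])
    data <- datasets[[i]]
    ## double brackets return a vector, not a data frame
    weight <- if (is.null(weights)) NULL else data[[weights[i]]]
    result[[i]] <- glm(formula = form, weights = weight, data = data)
  }
  result
}

fits <- my_ex2(df_lst, y = c("y", "y"), treatment = c("tx", "tx"),
               weights = c("wt", "wt"))

## Summaries for printing, plus a compact matrix of coefficients
summaries <- lapply(fits, summary)
coefs <- sapply(fits, coef)
```

Keeping the fitted objects in fits, rather than only the summaries, leaves room to call coef(), predict(), or similar on them later.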
