Dynamically calling variables within a function for a function using formula notation (in this case

Time:06-22

I have data as follows:

library(data.table)

dat <- data.table(type=c("V","X","Y","Z","V","X","Y","Z"), cat=c("A", "B", "A", "B","C", "D","C", "D"), val=c(2,2,3,4,5,6,7,6) )

# Let's say I want to make a table of the means by a category as follows:

# dat[, mean_val_by_cat := mean(val), by=cat]

# xtabs(mean_val_by_cat ~cat,dat)

# cat
# A  B  C  D 
# 5  6 12 12 

# Now I would like to wrap this in a function:

myfun <- function(dat, by_var, value="val") {
  mean_val_by_var <- paste0("mean_val_by_", by_var)
  print(mean_val_by_var) # [1] "mean_val_by_cat"
  print(value) # [1] "val"
  print(by_var) # [1] "cat"
  dat[, (mean_val_by_var) :=  mean(get(value)), by=c(by_var)]
  xtabs(mean_val_by_var ~ cat, dat)
}

myfun(dat, by_var="cat")

I get the following error:

Error in model.frame.default(formula = mean_val_by_var ~ cat, data = dat) : 
  variable lengths differ (found for 'cat')
Called from: model.frame.default(formula = mean_val_by_var ~ cat, data = dat)

How should I make sure that mean_val_by_var is read as the variable name mean_val_by_cat?

CodePudding user response:

Use as.formula():

xtabs(as.formula(paste(mean_val_by_var, "~", by_var)), dat)
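
For context, the original call most likely fails because the formula mean_val_by_var ~ cat refers to a symbol literally named mean_val_by_var. No such column exists in dat, so the lookup falls back to the function's local variable, a length-1 character string, while cat is found in dat with 8 rows, hence "variable lengths differ". Building the formula from strings avoids this. Below is a minimal sketch of the whole function with that fix. It uses base R only (a data.frame and ave() in place of data.table's := with by) so that it runs without extra packages, and the name myfun_formula is hypothetical:

```r
# Base R stand-in for the question's data (no data.table needed here).
dat <- data.frame(
  cat = c("A", "B", "A", "B", "C", "D", "C", "D"),
  val = c(2, 2, 3, 4, 5, 6, 7, 6)
)

# Hypothetical variant of myfun that builds the formula from strings.
myfun_formula <- function(dat, by_var, value = "val") {
  mean_col <- paste0("mean_val_by_", by_var)
  # ave() repeats each group's mean on every row of that group,
  # similar to what `:=` with `by` does in data.table.
  dat[[mean_col]] <- ave(dat[[value]], dat[[by_var]], FUN = mean)
  # Paste the column names into a string, then convert it to a formula.
  f <- as.formula(paste(mean_col, "~", by_var))
  xtabs(f, data = dat)
}

res <- myfun_formula(dat, by_var = "cat")
print(res)
# cat
#  A  B  C  D
#  5  6 12 12
```

As in the question, xtabs sums the per-row group means, so each group shows its mean multiplied by its row count.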

CodePudding user response:

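We assume the goal is to take the mean of value by by_var and then use xtabs to tabulate it. Both versions below first aggregate to one row per group, so the table shows the group means themselves (2.5, 3, 6, 6) rather than the per-row sums shown in the question (5, 6, 12, 12).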

1) reformulate. We can use the reformulate function to create the formula from character strings:

myfun2 <- function(dat, by_var, value = "val") {
  Mean <- dat[, .(mean = mean(get(value))), by = c(by_var)]     
  xtabs(reformulate(by_var, "mean"), Mean)
}

myfun2(dat, by_var = "cat")
## cat
##   A   B   C   D 
## 2.5 3.0 6.0 6.0 
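
To see what reformulate builds, it can help to inspect the formula on its own. This is a small sketch; the second call, with two grouping variables, is an illustrative extension and not something the question asks for:

```r
# One term label on the right-hand side, "mean" as the response.
f1 <- reformulate("cat", response = "mean")
print(f1)
# mean ~ cat

# Several term labels are joined with "+".
f2 <- reformulate(c("cat", "type"), response = "mean")
print(f2)
# mean ~ cat + type
```

Because the pieces are passed as separate strings, no manual paste() of "~" and "+" is needed.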

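2) data.frame. Another possibility is to pass a data frame to xtabs in place of a formula. The first column is then treated as the response and the remaining columns as the classifying variables, which is why the columns are reordered with 2:1: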

myfun3 <- function(dat, by_var, value = "val") {
  Mean <- dat[, .(mean = mean(get(value))), by = c(by_var)]     
  xtabs(Mean[, 2:1])
}

myfun3(dat, by_var = "cat")
## cat
##   A   B   C   D 
## 2.5 3.0 6.0 6.0 
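
The column reordering can be checked in isolation. The sketch below uses a plain data.frame with the aggregated values typed in by hand (an assumption made so that it runs without data.table) and shows the formula R derives from the reordered columns:

```r
# Hand-typed stand-in for the aggregated table from the answer above.
Mean <- data.frame(cat = c("A", "B", "C", "D"), mean = c(2.5, 3, 6, 6))

# Converting a data frame to a formula puts the first column on the left.
f <- formula(Mean[, 2:1])
print(f)
# mean ~ cat

# xtabs performs the same conversion when handed the data frame directly.
tab <- xtabs(Mean[, 2:1])
print(tab)
```

Without the reordering, the conversion would yield cat ~ mean, which is not the table wanted here.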