Input list of variables one at a time into the same formula to test how formula output changes

Time:06-04

How can I feed variables one at a time from a list of variables (for example apples = (5, 6, 7), oranges = (9, 10, 4), bananas = (3, 2, 5), matched with Trend = (1, 2, 3) and revenue = (32, 44, 56)) into a model in R and save the summary of each model? For example, I want apples, then oranges, then bananas to be tested in the following model:

model=gam(price~s(Trend,k=2) + s(apples,k=2),
          family="quasipoisson")

without having to type a new model for each variable. I have over 100 variables to test in this model, and the only change each time is swapping apples for another fruit.

I also want the output of each model to be compiled automatically into a table that records which variable was tested, along with either the entire model summary or just one specific value from it (such as summary$GCV).

Answer:

Let's say you have a data frame df with a number of columns, including Trend, Price, and some fruit columns such as apples, bananas, and oranges. You can use lapply() over the column names of interest, passing each name to a small function that uses as.formula() to build the model formula and returns the value you care about. Here is an example:

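library(mgcv)  # provides gam() and s()

# Fit the model with `col` as the second smooth term and return its GCV score
f <- function(col,df) {
  form = as.formula(paste0("Price~s(Trend,k=2) + s(",col,",k=2)"))
  # gcv.ubre is the full name of the component that $gcv partially matches
  data.frame(col = col, gcv = gam(form, data = df, family="quasipoisson")$gcv.ubre)
}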
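As a side note, the formula-building step can also be written with base R's reformulate(), which joins the term labels with `+` for you, so a separator cannot be dropped by accident. This is only a sketch of an alternative, not part of the answer's code: the helper name make_formula is made up for illustration, and k = 3 is my own choice (as far as I know, mgcv raises k = 2 to that minimum for its default smooth and warns about it).

```r
# Hypothetical helper: build "Price ~ s(Trend, k = 3) + s(<col>, k = 3)"
# for one candidate column name.
make_formula <- function(col) {
  terms <- c("s(Trend, k = 3)", paste0("s(", col, ", k = 3)"))
  # reformulate() pastes the term labels together with "+" and adds the response
  reformulate(terms, response = "Price")
}

form <- make_formula("apples")
print(form)
```

The returned object is an ordinary formula, so it can be passed to gam() in the same way as the as.formula() result.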

Now apply the function to the columns of interest, and wrap the result in do.call(rbind, ...) to return a single table:

do.call(rbind, lapply(c("apples","bananas", "oranges"), f, df=df))

Output:

      col      gcv
1  apples 1.658002
2 bananas 1.649134
3 oranges 1.637182

Input:

set.seed(123)
df = data.frame(
  Trend = sample(1:20, 100, replace=T),
  Price = sample(30:60, 100, replace=T),
  apples = sample(5:15, 100, replace=T),
  oranges = sample(2:9, 100, replace=T),
  bananas = sample(4:8, 100, replace=T)
)
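With over 100 variables, as in the question, typing the column names by hand is not practical. One possible way to scale the approach is sketched below: take every column except the response and the trend as a candidate, keep the fitted models in a named list so the full summaries stay available, and pull a few values into one table. The names candidates, fit_one, fits, summaries, and results are illustrative only; k = 3 and the dev_expl column are my additions, and the data is the same made-up sample as above.

```r
library(mgcv)  # provides gam() and s()

# Toy data in the same shape as the example input (values are random).
set.seed(123)
df <- data.frame(
  Trend   = sample(1:20, 100, replace = TRUE),
  Price   = sample(30:60, 100, replace = TRUE),
  apples  = sample(5:15, 100, replace = TRUE),
  oranges = sample(2:9, 100, replace = TRUE),
  bananas = sample(4:8, 100, replace = TRUE)
)

# Every column except the response and the trend is treated as a candidate.
candidates <- setdiff(names(df), c("Price", "Trend"))

# Hypothetical helper: fit the model for one candidate column and return it.
fit_one <- function(col, data) {
  form <- as.formula(paste0("Price ~ s(Trend, k = 3) + s(", col, ", k = 3)"))
  gam(form, data = data, family = quasipoisson())
}

# Named list of fitted models, plus their full summaries for later inspection.
fits <- lapply(candidates, fit_one, data = df)
names(fits) <- candidates
summaries <- lapply(fits, summary)

# One row per variable tested, with the values of interest pulled out.
results <- data.frame(
  variable = candidates,
  gcv      = vapply(fits, function(m) m$gcv.ubre, numeric(1), USE.NAMES = FALSE),
  dev_expl = vapply(summaries, function(s) s$dev.expl, numeric(1), USE.NAMES = FALSE)
)
print(results)
```

Any single summary can then be looked up by name, for example summaries$apples, while results serves as the compact comparison table.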