Briefly, I am working with data sets from two different countries. My aim is to ensemble the models from both countries to see how well the resulting ensemble generalizes.
My set-up is: I have trained one workflow_set for each country (10 model specifications, with resampling and a grid search of size 20).
This is the error I get when trying to add them as candidates:
predictions <- stacks() %>%
add_candidates(wf_set_1) %>%
add_candidates(wf_set_2)
Error: It seems like the new candidate member 'Logistic Regression' doesn't make use of the same resampling object as the existing candidates.
CodePudding user response:
Thanks for the question!
Unfortunately, we don't support ensembling models trained on different data sets in stacks. There are a few operations that are no longer well-defined when this is the case.
Given your description of the problem, though, this sounds like a setting where, rather than fitting a model for each country, you would include the country as a feature in one model fit across both countries. For any covariate x_i whose effect you think may depend on country, you can create an interaction term with step_interact(terms = ~ x_i:country). If country is a factor, convert it to dummy variables with step_dummy() first.
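As a rough illustration of the pooled approach, here is a minimal sketch using the recipes package. The data frame `pooled`, its columns `y`, `x_i` and `country`, and the simulated values are all hypothetical placeholders, not taken from the question; the real data would be the two country data sets row-bound together with a country column added.

```r
library(recipes)

# Hypothetical pooled data: both countries stacked into one data frame,
# with a factor column recording which country each row came from.
set.seed(1)
pooled <- data.frame(
  y       = rnorm(100),
  x_i     = rnorm(100),
  country = factor(rep(c("A", "B"), each = 50))
)

rec <- recipe(y ~ x_i + country, data = pooled) %>%
  # Turn the country factor into indicator column(s) first ...
  step_dummy(country) %>%
  # ... then interact x_i with every resulting country indicator.
  step_interact(terms = ~ x_i:starts_with("country_"))

# Estimate the recipe and return the processed training data.
baked <- rec %>% prep() %>% bake(new_data = NULL)
names(baked)
```

A recipe like this can then be used in a single workflow_set resampled on the pooled data, so every candidate shares one resampling object and can be passed to add_candidates().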