Here is a small example of data. Imagine I have many more covariates than this.
install.packages("mltools")
library(mltools)
library(dplyr)
set.seed(1234)
# Simulated example data: three covariates plus a four-level group label
data <- tibble::tibble(
  age = round(runif(60, min = 48, max = 90)),
  gender = sample(c(0, 1), replace = TRUE, size = 60),
  weight = round(runif(60, min = 100, max = 300)),
  group = sample(letters[1:4], size = 60, replace = TRUE)
)
# One-hot encode group into the columns groupa, groupb, groupc, groupd
one_hot <- data[, c("group")] %>%
  glmnet::makeX() %>%
  data.frame()
data$group <- NULL
data <- cbind(data, one_hot)
I want to create a data.frame containing the interaction of each group dummy (groupa, groupb, groupc, groupd) with every other variable (age, gender, weight), for example:
groupa * age
groupa * gender
groupa * weight
The same for groupb, groupc, and groupd.
I've seen many questions about generating all possible pairwise interactions, but none that show how to interact one set of columns with all the rest.
Hope this question was clear enough.
Thanks.
Answer:
I am sure there is a more elegant solution, but you could write your own function that computes the interactions for a single covariate, use apply to go over the covariate columns, and use do.call to combine everything:
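# Multiply every group dummy by one covariate vector and keep only those products
intfun <- function(var) {
  data %>%
    mutate(across(starts_with("group"), ~ .x * var)) %>%
    select(starts_with("group"))
}
# Run intfun on each covariate column (age, gender, weight) and bind the results to data
int_terms <- cbind(
  data,
  do.call(cbind, apply(data[, 1:3], 2, function(x) intfun(x)))
)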
Output (not all columns are shown):
# > head(int_terms)
#   age gender weight groupa groupb groupc groupd age.groupa age.groupb age.groupc age.groupd gender.groupa gender.groupb gender.groupc gender.groupd weight.groupa
# 1  88     33    113      0      1      0      0          0         88          0          0             0            33             0             0             0
# 2  49     33    213      1      0      0      0         49          0          0          0            33             0             0             0           213
# 3  83     33    152      1      0      0      0         83          0          0          0            33             0             0             0           152
# 4  75     33    101      0      1      0      0          0         75          0          0             0            33             0             0             0
# 5  61     33    218      0      1      0      0          0         61          0          0             0            33             0             0             0
# 6  79     33    204      1      0      0      0         79          0          0          0            33             0             0             0           204
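As a possibly more compact alternative to the apply/do.call approach above, base R's model.matrix can expand a formula of the form (covariates):(group dummies) into exactly these product columns. This is only a sketch under some assumptions: the names dat, covars, groups, and f are made up for the example, the data is re-simulated so the snippet runs on its own, and the one-hot step uses model.matrix rather than glmnet::makeX to avoid package dependencies.

```r
# Re-create a data frame shaped like the one in the question (hypothetical name: dat)
set.seed(1234)
n <- 60
dat <- data.frame(
  age    = round(runif(n, min = 48, max = 90)),
  gender = sample(c(0, 1), size = n, replace = TRUE),
  weight = round(runif(n, min = 100, max = 300)),
  group  = factor(sample(letters[1:4], size = n, replace = TRUE), levels = letters[1:4])
)

# One-hot encode group; "- 1" drops the intercept so all four levels get a column
dummies <- model.matrix(~ group - 1, data = dat)
dat <- cbind(dat[, c("age", "gender", "weight")], dummies)

covars <- c("age", "gender", "weight")
groups <- grep("^group", names(dat), value = TRUE)

# Build ~ 0 + (age + gender + weight):(groupa + groupb + groupc + groupd)
f <- as.formula(paste0(
  "~ 0 + (", paste(covars, collapse = " + "), "):(",
  paste(groups, collapse = " + "), ")"
))

# Each column of the model matrix is the product of one covariate and one dummy
int_terms <- as.data.frame(model.matrix(f, data = dat))

# Rename "age:groupa" to "age.groupa" to match the naming used above
colnames(int_terms) <- sub(":", ".", colnames(int_terms), fixed = TRUE)

result <- cbind(dat, int_terms)
```

With many more covariates, only covars needs to change, for example to setdiff(names(dat), groups), since the formula is built from the column names.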