I want to create residuals vs fitted plots for each of my numeric variables.
An example:
plot(glm(Outcome~Age, family="binomial", data=pima_train))
This produces the usual diagnostic plots (image not included).
My current code:
mod_summaries <- list() # create an empty list
for (i in 2:ncol(select_if(pima_train, is.numeric))) { # head of the for loop
  predictors_i <- colnames(pima_train)[2:i] # vector of predictor names
  mod_summaries[[i - 1]] <- summary( # store regression model summary in list
    lm(y ~ ., data[ , c("y", predictors_i)]))
}
Is it possible to do this without creating a separate dataset that contains only the numeric variables? Saving memory would be good, and I would like to see what that code looks like.
CodePudding user response:
You could identify the numeric covariates when setting up the loop, which avoids creating a separate numeric-only dataset such as numeric_train:
Data
set.seed(123)
pima_train <- data.frame(Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
                         pred1 = runif(100),
                         pred2 = runif(100),
                         pred3 = sample(LETTERS, 100, replace = TRUE),
                         pred4 = runif(100))
Use is.numeric with lapply over the columns to find the numeric ones, then loop over their names:
for (i in names(pima_train)[unlist(lapply(pima_train, is.numeric))]) {
  plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train), main = i)
}
## or alternatively, use an `if` statement in the loop
for (i in names(pima_train)) {
  if (is.numeric(pima_train[[i]])) {
    plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train), main = i)
  }
}
You could also use lapply to avoid a for loop altogether:
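# pima_train[-1] drops the Outcome column; invisible() hides the list of NULLs
invisible(lapply(pima_train[-1], function(x) if (is.numeric(x)) {
  plot(glm(Outcome ~ x, family = "binomial", data = pima_train))
}))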
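If only the residuals vs fitted panel is wanted, one possible variation on the approaches above is to build each formula with reformulate() and pass which = 1 to plot(). This is a sketch rather than part of the original answer: the names num_vars, fit_one and fits are made up for illustration, and the toy data is repeated so the snippet runs on its own. Only the column names are collected, so no numeric-only copy of the data frame is created.

```r
# Toy data, repeated here so the snippet is self-contained
set.seed(123)
pima_train <- data.frame(
  Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
  pred1 = runif(100),
  pred2 = runif(100),
  pred3 = sample(LETTERS, 100, replace = TRUE),
  pred4 = runif(100)
)

# Names of the numeric columns only; the data frame itself is not subset
num_vars <- names(pima_train)[vapply(pima_train, is.numeric, logical(1))]

# Hypothetical helper: fit one single-predictor logistic regression.
# reformulate() builds e.g. Outcome ~ pred1, so the model terms carry the
# real variable name instead of get(i)
fit_one <- function(v) {
  glm(reformulate(v, response = "Outcome"), family = "binomial", data = pima_train)
}

# Keep the fitted models in a named list in case they are needed later
fits <- setNames(lapply(num_vars, fit_one), num_vars)

# which = 1 selects only the residuals vs fitted panel of plot.lm()
for (v in names(fits)) {
  plot(fits[[v]], which = 1, main = v)
}
```

Keeping the models in a named list means a single plot can be redrawn later, for example with plot(fits[["pred2"]], which = 1), without refitting anything.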