I'm working on the NHIS 2020 survey and can't find a function for ROC or sensitivity analysis for a logistic regression model. Is there a known function for this? One more question: is there a function for splitting the survey data into training and test sets?
CodePudding user response:
You can try the pROC package.
To split the data, you first need to decide how to split it. For example, you may use half as training data and half as test data. Assume dataset is your data frame and that it has 10,000 rows:
default_idx = sample(nrow(dataset), 5000)
default_trn = dataset[default_idx, ]
default_tst = dataset[-default_idx, ]
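The split above changes on every run unless a seed is set. A minimal sketch of a reproducible version, assuming a simulated stand-in for the real data (the data frame and the column names DV and IV are placeholders, not NHIS variables):

```r
# Hypothetical stand-in for the real data: 10,000 rows, binary DV, numeric IV
set.seed(42)
dataset <- data.frame(IV = rnorm(10000))
dataset$DV <- rbinom(10000, size = 1, prob = plogis(0.8 * dataset$IV))

# Draw 5,000 row indices without replacement for the training half
default_idx <- sample(nrow(dataset), 5000)
default_trn <- dataset[default_idx, ]
default_tst <- dataset[-default_idx, ]

# Sanity checks: each half has the expected size and the halves share no rows
nrow(default_trn)
nrow(default_tst)
length(intersect(rownames(default_trn), rownames(default_tst)))
```

Re-running with the same seed gives the same training and test rows, which makes later ROC results comparable between runs.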
You can then get the ROC like this:
library(pROC)  # provides roc()
# DV is the binary outcome and IV the predictor (placeholder names)
model_glm = glm(DV ~ IV, data = default_trn, family = "binomial")
test_prob = predict(model_glm, newdata = default_tst, type = "response")
test_roc = roc(default_tst$DV ~ test_prob, plot = TRUE, print.auc = TRUE)
See for example here for more detailed explanations: https://daviddalpiaz.github.io/r4sl/logistic-regression.html#roc-curves
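The question also asks about sensitivity. As a rough sketch in base R only, sensitivity and specificity at one cutoff can be read off a confusion table, and the AUC can be checked with the rank (Mann-Whitney) identity. The simulated data, the names DV and IV, and the 0.5 cutoff are all assumptions made for illustration:

```r
# Simulated stand-in data; DV and IV are placeholder names
set.seed(1)
n <- 2000
dat <- data.frame(IV = rnorm(n))
dat$DV <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * dat$IV))
idx <- sample(n, n / 2)
trn <- dat[idx, ]
tst <- dat[-idx, ]

model_glm <- glm(DV ~ IV, data = trn, family = "binomial")
test_prob <- predict(model_glm, newdata = tst, type = "response")

# Classify at a 0.5 cutoff (an arbitrary choice) and cross-tabulate
test_pred <- factor(as.integer(test_prob > 0.5), levels = c(0, 1))
conf_mat <- table(predicted = test_pred,
                  actual = factor(tst$DV, levels = c(0, 1)))

sensitivity <- conf_mat["1", "1"] / sum(conf_mat[, "1"])  # TP / (TP + FN)
specificity <- conf_mat["0", "0"] / sum(conf_mat[, "0"])  # TN / (TN + FP)

# AUC from average ranks of the predicted probabilities
pos <- tst$DV == 1
r <- rank(test_prob)
auc_rank <- (sum(r[pos]) - sum(pos) * (sum(pos) + 1) / 2) /
  (sum(pos) * sum(!pos))

c(sensitivity = sensitivity, specificity = specificity, auc = auc_rank)
```

With pROC loaded, coords() on a roc object can report the same quantities across thresholds, which is usually more convenient than fixing a single cutoff by hand.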
CodePudding user response:
I have created some sample data and code below that may point you in the right direction, but your question is quite broad, so this might not answer everything. The caTools package can help with splitting into test/train sets. The pROC package can help with ROC.
set.seed(05062020)
# Create sample data
alldata <- data.frame(outcome = sample(0:1, 100, replace = TRUE),
                      predictor1 = sample(1:3, 100, replace = TRUE),
                      predictor2 = sample(1:5, 100, replace = TRUE))
# Split into testing and training
library(caTools)
# is_train is a logical vector; naming it "sample" would mask base::sample()
is_train <- sample.split(alldata$outcome, SplitRatio = 0.7)
train <- subset(alldata, is_train == TRUE)
test <- subset(alldata, is_train == FALSE)
# Run example logistic model
example_model <- glm(outcome ~., family = binomial, data = train)
# get prediction from fitted model
predicts <- predict(example_model, type = "response", newdata = test[,-which(names(test) == "outcome")])
# ROC and plot
library(pROC)
roc_obj <- roc(test$outcome, predicts)  # compute the ROC once and reuse it
roc_obj                                 # printing shows the AUC
plot.roc(smooth(roc_obj), col = 1, lwd = 3,
         main = "Smoothed ROC curve", xlab = "1 - Specificity", legacy.axes = TRUE)
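Neither ROC call above uses the sampling weights that come with a survey such as NHIS. As a rough sketch only, weighted sensitivity, specificity and AUC can be computed by hand in base R. The function names weighted_sens_spec and weighted_auc and the toy vectors are hypothetical, and this approach accounts for weights only, not for strata or clusters:

```r
# Weighted sensitivity and specificity at a cutoff
# y: 0/1 outcome, p: predicted probability, w: sampling weight
weighted_sens_spec <- function(y, p, w, cutoff = 0.5) {
  pred <- p > cutoff
  c(sensitivity = sum(w[pred & y == 1]) / sum(w[y == 1]),
    specificity = sum(w[!pred & y == 0]) / sum(w[y == 0]))
}

# Weighted AUC: weighted share of (case, non-case) pairs in which the
# case has the higher predicted probability; ties count as one half
weighted_auc <- function(y, p, w) {
  pos <- y == 1
  neg <- y == 0
  cmp <- outer(p[pos], p[neg], ">") + 0.5 * outer(p[pos], p[neg], "==")
  sum(cmp * outer(w[pos], w[neg])) / (sum(w[pos]) * sum(w[neg]))
}

# Toy example with made-up values
y <- c(1, 1, 0, 0, 1, 0)
p <- c(0.9, 0.6, 0.4, 0.7, 0.3, 0.2)
w <- c(1, 2, 1, 3, 1, 2)

weighted_sens_spec(y, p, w)
weighted_auc(y, p, w)
weighted_auc(y, p, rep(1, 6))  # equal weights give the ordinary AUC
```

The pairwise outer() calls build a cases-by-non-cases matrix, so this is only practical for modest test sets; for a full NHIS file a sorted, cumulative-weight approach would scale better.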