I would like to ask for help with my project. My goal is to get a ROC curve from an existing logistic regression model.
First of all, here is what I'm analyzing.
glm.fit <- glm(Severity_Binary ~ Side + State + Timezone + Temperature.F. + Wind_Chill.F. + Humidity... + Pressure.in. + Visibility.mi. + Wind_Direction + Wind_Speed.mph. + Precipitation.in. + Amenity + Bump + Crossing + Give_Way + Junction + No_Exit + Railway + Station + Stop + Traffic_Calming + Traffic_Signal + Sunrise_Sunset, data = train_data, family = binomial)
glm.probs <- predict(glm.fit,type = "response")
glm.probs = predict(glm.fit, newdata = test_data, type = "response")
glm.pred = ifelse(glm.probs > 0.5, "1", "0")
This part works fine: I am able to show a table of predictions and the mean result. But here is my problem. I'm using the pROC library, but I am open to using anything else you can help me with. My test_data has approximately 975 rows, but the variable proc has only 3 sensitivity/specificity values.
library(pROC)
proc <- roc(test_data$Severity_Binary,glm.probs)
test_data$sens <- proc$sensitivities[1:975]
test_data$spec <- proc$specificities[1:975]
ggplot(test_data, aes(x=spec, y=sens)) + geom_line()
Here's what I get as a result:
With warning message:
Removed 972 row(s) containing missing values (geom_path).
As I found out, proc contains only 3 values, as I said.
CodePudding user response:
You can't (and shouldn't) assign the sensitivity and specificity to the data. They are summary data and exist in a different dimension than your data.
Specifically, these two lines are wrong and make no sense at all:
test_data$sens <- proc$sensitivities[1:975]
test_data$spec <- proc$specificities[1:975]
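To see why the lengths do not line up, here is a minimal sketch on simulated data. The `toy` data frame and its columns are hypothetical stand-ins for `test_data`, not the asker's data. It assumes pROC is installed and compares a ROC built from continuous probabilities with one built from hard 0/1 predictions. The number of ROC points follows the number of distinct predictor values, not the number of rows. One possible explanation for seeing only three values (an assumption, the question does not confirm it) is that the predictor passed to `roc()` had only two distinct values.

```r
library(pROC)

set.seed(1)

# Hypothetical stand-in for test_data: 975 rows with a binary outcome
n <- 975
toy <- data.frame(x = rnorm(n))
toy$Severity_Binary <- rbinom(n, 1, plogis(toy$x))
toy$prob <- plogis(toy$x)                 # continuous "predicted probability"
toy$hard <- as.numeric(toy$prob > 0.5)    # hard 0/1 prediction

# ROC from continuous probabilities: many thresholds
roc_prob <- roc(toy$Severity_Binary, toy$prob, quiet = TRUE)

# ROC from hard 0/1 predictions: very few thresholds
roc_hard <- roc(toy$Severity_Binary, toy$hard, quiet = TRUE)

# One sensitivity/specificity pair per threshold, in both cases
length(roc_prob$sensitivities) == length(roc_prob$thresholds)
length(roc_hard$sensitivities) == length(roc_hard$thresholds)

# Compare the number of ROC points with the number of rows
c(rows = nrow(toy),
  points_prob = length(roc_prob$sensitivities),
  points_hard = length(roc_hard$sensitivities))
```

Indexing `sensitivities[1:975]` on a vector shorter than 975 pads the result with `NA`, which is where the "Removed 972 row(s)" warning comes from.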
Instead you must either save them to a new data.frame, or use some of the existing functions like
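A minimal sketch of the first option, a separate data frame built from the ROC object, might look like the following. It runs on simulated data: `toy`, `toy_roc` and `roc_df` are hypothetical names, not from the question. It assumes pROC and ggplot2 are installed. `geom_path()` is used rather than `geom_line()` so that the points are drawn in the order pROC stores them.

```r
library(pROC)
library(ggplot2)

set.seed(1)

# Hypothetical stand-in for test_data and glm.probs
n <- 975
toy <- data.frame(x = rnorm(n))
toy$Severity_Binary <- rbinom(n, 1, plogis(toy$x))
toy$prob <- plogis(toy$x)

toy_roc <- roc(toy$Severity_Binary, toy$prob, quiet = TRUE)

# Keep the ROC summary in its own data frame, one row per threshold
roc_df <- data.frame(
  spec = toy_roc$specificities,
  sens = toy_roc$sensitivities
)

# Plot sensitivity against specificity, with the x axis reversed
# so the curve runs from the bottom left to the top right
p <- ggplot(roc_df, aes(x = spec, y = sens)) +
  geom_path() +
  scale_x_reverse() +
  labs(x = "Specificity", y = "Sensitivity")
print(p)
```

With the real objects, the same pattern would use `proc$specificities` and `proc$sensitivities` in place of the `toy_roc` fields, leaving `test_data` untouched.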