I have a rather small dataset that resulted from linking two different datasets. I would like to know how to calculate sensitivity, specificity, and predictive values, and how to plot the ROC curve. This is the first time I'm using this kind of statistics in R, so I don't even know where to start.
Part of the data looks like this:
data <- data.frame(NMM_TOTAL = c(1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1),
CPAV_TOTAL = c(0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0),
SIH_NMM_TOTAL = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 0 , 0, 1, 1, 0, 1),
SIH_CPAV_TOTAL = c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1))
And the two-way tables would be built from these combinations:
tab1 <- table(data$SIH_NMM_TOTAL, data$NMM_TOTAL)
tab2 <- table(data$SIH_CPAV_TOTAL, data$CPAV_TOTAL)
Where NMM_TOTAL and CPAV_TOTAL are the "gold standard". I don't know if any of this makes sense. Thanks in advance!
Note: 1 stands for positive and 0 for negative.
CodePudding user response:
Let's work with tab1 to demonstrate sensitivity, specificity, and predictive values. Consider naming the rows and columns of your tables to make them easier to read:
act <- data$SIH_NMM_TOTAL
ref <- data$NMM_TOTAL
table(act,ref)
Load this library
library(caret)
The input data needs to be factors
act <- factor(act)
ref <- factor(ref)
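The commands look like this. By default caret treats the first factor level (here "0") as the positive class, so name the positive and negative levels explicitly, since 1 means positive in your data:
sensitivity(act, ref, positive = "1")
specificity(act, ref, negative = "0")
posPredValue(act, ref, positive = "1")
negPredValue(act, ref, negative = "0")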
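As a cross-check, the same four measures can be computed by hand from the cell counts of the two-way table. This is only a minimal base-R sketch, assuming 1 is the positive class as stated in the question; the names tp, fn, fp and tn are introduced here for illustration and do not come from caret.

```r
# Test result (SIH_NMM_TOTAL) and gold standard (NMM_TOTAL) from the question
act <- c(0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1)
ref <- c(1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1)

# Cell counts of the 2x2 table, with 1 = positive (assumption from the question)
tp <- sum(act == 1 & ref == 1)  # true positives
fn <- sum(act == 0 & ref == 1)  # false negatives
fp <- sum(act == 1 & ref == 0)  # false positives
tn <- sum(act == 0 & ref == 0)  # true negatives

sens <- tp / (tp + fn)  # sensitivity: positives correctly detected
spec <- tn / (tn + fp)  # specificity: negatives correctly ruled out
ppv  <- tp / (tp + fp)  # positive predictive value
npv  <- tn / (tn + fn)  # negative predictive value

round(c(sensitivity = sens, specificity = spec, PPV = ppv, NPV = npv), 3)
```

For this sample the counts are 5 true positives, 5 false negatives, 4 false positives and 1 true negative, so sensitivity is 0.5 and specificity is 0.2. If the package output differs from these numbers, the positive level is probably not set to "1".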
ROC curve. The Receiver Operating Characteristic (ROC) curve is used to assess how well a continuous measurement predicts a binary outcome. Both of your test variables are binary, so each one gives a single sensitivity/specificity pair rather than a curve, and it is not clear that you can plot an ROC curve from your data. Let me show you a simple example of how to generate one. The example is drawn from https://cran.r-project.org/web/packages/plotROC/vignettes/examples.html
library(ggplot2)
library(plotROC)
set.seed(1)
D.ex <- rbinom(200, size = 1, prob = .5)
M1 <- rnorm(200, mean = D.ex, sd = .65)
test <- data.frame(D = D.ex, D.str = c("Healthy", "Ill")[D.ex + 1],
M1 = M1, stringsAsFactors = FALSE)
head(test)
ggplot(test, aes(d = D, m = M1)) + geom_roc()
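To see what the curve is made of, the same kind of plot can be built in base R without extra packages. This is a rough sketch, not how plotROC does it internally: it simulates a marker the same way as above, sweeps a cutoff over every observed value, and records the true and false positive rates at each cutoff. The names d, m, thr, tpr, fpr and auc are made up for this illustration, and the area under the curve uses a simple trapezoid rule.

```r
set.seed(1)
d <- rbinom(200, size = 1, prob = 0.5)  # simulated true status (1 = ill)
m <- rnorm(200, mean = d, sd = 0.65)    # simulated continuous marker

# Candidate cutoffs from strictest to most lenient; Inf classifies nobody as positive
thr <- c(Inf, sort(unique(m), decreasing = TRUE))

# At each cutoff, call an observation positive when its marker is >= the cutoff
tpr <- sapply(thr, function(t) mean(m[d == 1] >= t))  # sensitivity
fpr <- sapply(thr, function(t) mean(m[d == 0] >= t))  # 1 - specificity

# Area under the curve by the trapezoid rule
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

plot(fpr, tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # reference line for a test no better than chance
auc
```

With a binary test such as SIH_NMM_TOTAL there is only one meaningful cutoff, so this procedure would give a single point between (0, 0) and (1, 1) instead of a curve.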