Home > Blockchain >  Extracting only some coefficients for the logistic regression model and its plot
Extracting only some coefficients for the logistic regression model and its plot

Time:06-15

sample_data = read.table("http://freakonometrics.free.fr/db.txt",
                         header=TRUE, sep=";")
head(sample_data)
model = glm(Y~0 X1 X2 X3,family=binomial,data=sample_data)
summary(model)
exp(coef(model ))
exp(cbind(OR = coef(model ), confint(model )))

I have the above sample data on logistic regression with categorical predictor I try the above code i get the following output,

            OR       2.5 %     97.5 %
X1  1.67639337 1.352583976 2.09856514
X2  1.23377720 1.071959330 1.42496949
X3A 0.01157565 0.001429430 0.08726854
X3B 0.06627849 0.008011818 0.54419759
X3C 0.01118084 0.001339984 0.08721028
X3D 0.01254032 0.001545240 0.09539880
X3E 0.10654454 0.013141540 0.87369972

but I am wondering how to extract OR and CI only for factors. My desired output will be:

 OR       2.5 %     97.5 %
X3A 0.01157565 0.001429430 0.08726854
X3B 0.06627849 0.008011818 0.54419759
X3C 0.01118084 0.001339984 0.08721028
X3D 0.01254032 0.001545240 0.09539880
X3E 0.10654454 0.013141540 0.87369972

Can any one help me the code to extract it? additionally I want to plot the above OR with confidence interval for the extracted one. Can you also help me the code with plot,or box plot?

CodePudding user response:

One possibility, using broom to extract the coefficients, dplyr::filter to select the terms you want, and dwplot to plot.

library(broom)
library(dotwhisker)
library(dplyr)
tt <- (tidy(model, exponentiate = TRUE, conf.int = TRUE)
   |> filter(stringr::str_detect(term, "^X3"))
)
dwplot(tt)

In addition, I would suggest:

library(ggplot2)
dwplot(tt)   scale_x_log10()   geom_vline(xintercept = 1, lty = 2)  
   labs(x="Odds ratio")

CodePudding user response:

You could filter out the rows that are the same as variable names in your data frame, since those row names with factor levels appended will not match:

result <- exp(cbind(OR = coef(model ), confint(model )))

result[!rownames(result) %in% names(sample_data),]
#>             OR       2.5 %     97.5 %
#> X3A 0.01157565 0.001429430 0.08726854
#> X3B 0.06627849 0.008011818 0.54419759
#> X3C 0.01118084 0.001339984 0.08721028
#> X3D 0.01254032 0.001545240 0.09539880
#> X3E 0.10654454 0.013141540 0.87369972

To extract the necessary rows and plot them, the full reproducible code would be:

library(tidyverse)

sample_data <-  read.table("http://freakonometrics.free.fr/db.txt",
                           header = TRUE, sep = ";")

model <- glm(Y ~ 0   X1   X2   X3,family = binomial, data = sample_data)
result <- exp(cbind(OR = coef(model), confint(model)))
#> Waiting for profiling to be done...

result %>% 
  as.data.frame(check.names = FALSE) %>%
  rownames_to_column(var = "Variable") %>%
  filter(!Variable %in% names(sample_data)) %>%
  ggplot(aes(x = OR, y = Variable))  
  geom_vline(xintercept = 1, linetype = 2)  
  geom_errorbarh(aes(xmin = `2.5 %`, xmax = `97.5 %`), height = 0.1)  
  geom_point(size = 2)  
  scale_x_log10(name = "Odds ratio (log scale)")  
  theme_minimal(base_size = 16)

Created on 2022-06-14 by the reprex package (v2.0.1)

  • Related