Creating a full vs. reduced model with one explanatory variable-CodePudding

I am attempting to create a Cox proportional hazards model with only one explanatory variable. To perform a likelihood ratio test, I know I need a full and reduced model. I also know that a full model would be a separate mean for each group, and a reduced model would use an overall mean for the whole data set. How can I ensure I am setting this up properly in R? In this model z is 1 if the patient had heart surgery, and z is 0 otherwise

I have:

model<-coxph(Surv(time,delta)~z,method='breslow',data=heartdata)

X.lr <- 2*(model$loglik[2]-model$loglik[1])

Does this achieve that? I get an answer I just want to know whether this makes a full vs. reduced model since I don't have other variables to use?

CodePudding user response：

In this case, this does work, but I think there's a better/more transparent solution using update() and anova() (I wasn't even aware that the log-likelihood component of coxph models included both the full and the null deviances).

Using a built-in data set from the survival package:

## drop NAs so we are using the same data set for full & reduced models
lungna <- na.omit(lung)
## fit full model
m1 <- coxph(Surv(time, status) ~ ph.ecog, data=lungna)
## update model to fit intercept only (` ~ 1 ` replaces the RHS of the formula):
## ~ 1 means "intercept only" in R formula notation
m0 <- update(m1, . ~ 1)
## anova() runs a likelihood-ratio test
anova(m0,m1)

Results:

Analysis of Deviance Table
 Cox model: response is  Surv(time, status)
 Model 1: ~ 1
 Model 2: ~ ph.ecog
   loglik  Chisq Df P(>|Chi|)    
1 -508.12                        
2 -501.91 12.409  1 0.0004273 ***

You can confirm that 2*diff(m1$loglik) gives 12.409, the same value reported by anova() for the deviance ("Chisq") difference, and that pchisq(chisq_val, df = 1, lower.tail = FALSE) gives the reported p-value.