How to fill the area between two curves (Kaplan Meier)

Time:11-30

How can I fill the area between two Kaplan-Meier curves, limited to a specific time (for example, up to x = 40 in the figure), when the "x" and "y" vectors of the two curves differ in length?

I am looking for something like this: (image not shown)

CodePudding user response:

You have tagged with ggplot2, so here is a method using geom_ribbon. It requires quite a bit of data wrangling to get the model into the correct format though:

library(tidyverse)
library(survival)

# Assumed fit (same as in the answer below); the question's own code is not shown
km <- survfit(Surv(time, status) ~ x, data = aml)

with(km, 
  data.frame(time, surv, strata = rep(names(strata), times = strata))
  ) %>%
  mutate(strata = sub("x=", "", strata)) %>%
  group_by(strata) %>%
  # Add a point just before each event time so the curve is drawn as steps.
  # dplyr >= 1.1.0 prefers reframe() for multi-row results like this.
  summarise(time = c(time - 0.001, time),
            surv = c(1, head(surv, -1), surv),
            .groups = "drop") %>%
  arrange(time) %>%
  # One column per group, then carry values across the gaps
  pivot_wider(names_from = "strata", values_from = "surv") %>%
  fill(2:3, .direction = "downup") %>%
  ggplot(aes(time, Maintained)) +
  geom_line(aes(linetype = "Maintained")) +
  geom_line(aes(y = Nonmaintained, linetype = "Nonmaintained")) +
  geom_ribbon(aes(ymin = Nonmaintained, ymax = Maintained, fill = "Difference")) +
  scale_y_continuous(labels = scales::percent, name = "Percentage survival",
                     expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_fill_manual(values = "#6dd8ae", name = NULL) +
  labs(linetype = "Group") +
  guides(linetype = guide_legend(order = 1)) +
  theme_classic(base_size = 20) +
  theme(legend.position = "bottom",
        panel.grid.major.y = element_line(color = "gray95"))


CodePudding user response:

It is probably easier to create a table with all the x observations and fill in the blanks:

library(survival)
data(cancer)
km <- survfit(Surv(time, status) ~ x, aml)

library(ggplot2)
library(data.table)
library(zoo)

# Build the step corners for one stratum: start at (0, 1), then list each
# event time twice (previous survival value, then the new value)
fill_data <- function(this_km) {
  out <- rbind(data.table(time = 0, surv = 1),
               data.table(time = this_km$time, surv = this_km$surv))
  out <- rbind(na.omit(out[, .(time, surv = shift(surv))]), out)
  setkeyv(out, "time")
  return(out)
}
data <- merge(fill_data(km[1]), fill_data(km[2]), by = "time", all = TRUE,
              suffixes = c("1", "2"))

# Fill in the blanks (last observation carried forward)
data[, surv1 := na.locf(surv1)]
data[, surv2 := na.locf(surv2)]

# Plot with geom_path and geom_ribbon
ggplot(data = data) +
  geom_ribbon(mapping = aes(x = time, ymin = surv1, ymax = surv2), fill = "purple", alpha = .2) +
  geom_path(aes(x = time, y = surv1), col = "blue") +
  geom_path(aes(x = time, y = surv2), col = "red")
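
The question also asks for the shading to stop at a given time, which the code above does not do. Here is a minimal sketch of one possible way, assuming the same `km` fit on `aml` and a cutoff of `t_max = 40` taken from the question's example. It evaluates both curves as step functions with base R's `stepfun()` and draws one rectangle per interval between consecutive event times, so the last rectangle ends exactly at the cutoff. The names `km_stepfun`, `rects` and `curves` are illustrative only.

```r
library(survival)
library(ggplot2)

km <- survfit(Surv(time, status) ~ x, data = aml)
t_max <- 40  # assumed cutoff, from the question's example

# Right-continuous step function S(t) for one stratum, with S(t) = 1 before the first event
km_stepfun <- function(fit) stepfun(fit$time, c(1, fit$surv))
s1 <- km_stepfun(km[1])
s2 <- km_stepfun(km[2])

# Common breakpoints of both curves, clipped to [0, t_max]
breaks <- sort(unique(c(0, km$time[km$time < t_max], t_max)))

# Both curves are constant on each interval [xmin, xmax)
rects <- data.frame(
  xmin = head(breaks, -1),
  xmax = tail(breaks, -1),
  s1   = s1(head(breaks, -1)),
  s2   = s2(head(breaks, -1))
)

# Full curves for geom_step, with the starting point (0, 1) added per stratum
curves <- rbind(
  data.frame(time = 0, surv = 1, strata = names(km$strata)),
  data.frame(time = km$time, surv = km$surv,
             strata = rep(names(km$strata), times = km$strata))
)

p <- ggplot() +
  geom_rect(aes(xmin = xmin, xmax = xmax,
                ymin = pmin(s1, s2), ymax = pmax(s1, s2)),
            data = rects, fill = "purple", alpha = 0.2) +
  geom_step(aes(time, surv, colour = strata), data = curves) +
  labs(x = "Time", y = "Survival", colour = "Group")
p
```

Using rectangles avoids the `time - 0.001` offset and the duplicated rows, since each interval between breakpoints is shaded on its own. Changing `t_max` moves the right edge of the shaded region without touching the curves.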

