Is there an R function that can take time series and transform it so excel can make a stepwise chart

Time:11-14

I am working on a project for a client who wants their charts done in Excel.

One of the charts I need to make shows cumulative hazards, which I get from survival::survfit.

My problem is that Excel can't draw stepwise charts, so I need to transform the data so that every time point occurs twice: once with the previous cumulative hazard and once with the cumulative hazard at that time point.

It is relatively easy, but annoying and time-consuming, to do this in Excel. Is there a smart way of doing it in R?

I am a relatively new R user, and I have not been able to figure out a way to do what I want.

Below I have tried to show what I get and what I want:

#Load survival package
library(survival)

#Create survfit object
Survival_Function <- survfit(Surv(lung$time,
                                 lung$status == 2)~1)

#extract cumulative hazards
cumhaz <- data.frame(Survival_Function$time, Survival_Function$cumhaz)

head(cumhaz)

Gives me the following:

  Survival_Function.time Survival_Function.cumhaz
1                      5              0.004385965
2                     11              0.017601824
3                     12              0.022066110
4                     13              0.031034720
5                     15              0.035559606
6                     26              0.040105061

But for Excel to draw the chart properly, I need it to look like this:

  Survival_Function.time Survival_Function.cumhaz
1                      5              0.004385965
2                     11              0.004385965
3                     11              0.017601824
4                     12              0.017601824
5                     12              0.022066110
6                     13              0.022066110
7                     13              0.031034720
8                     15              0.031034720
9                     15              0.035559606
10                    26              0.035559606
11                    26              0.040105061

CodePudding user response:

Based on your code, one simple approach is to repeat each element of both columns twice, then drop the first element of the repeated time column and the last element of the repeated cumhaz column, and combine the two. For example:

x <- data.frame(
    Time = c(1,2,3,4,5), 
    Hazard = c(6,7,8,9,10)
)

data.frame(
    # Repeat each time twice, then drop the first element
    Time = rep(x$Time, each = 2)[-1],
    # Repeat each hazard twice, then drop the last element so both columns have the same length
    Hazard = head(rep(x$Hazard, each = 2), -1)
)

This gives you the desired output: every time point after the first appears twice, once with the previous value and once with its own.
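To connect this back to the question, here is a minimal sketch that wraps the same trick in a function and runs it on the six rows shown in the question's head(cumhaz) output. The function name to_step_data, the column names, and the CSV file name are my own choices for illustration, not anything from the survival package.

```r
# Hypothetical helper: expand (time, value) pairs into step-chart coordinates.
to_step_data <- function(time, value) {
  stopifnot(length(time) == length(value))
  n <- length(time)
  data.frame(
    time  = rep(time, each = 2)[-1],        # drop the first duplicated time
    value = rep(value, each = 2)[-(2 * n)]  # drop the last duplicated value
  )
}

# The six rows printed by head(cumhaz) in the question
cumhaz <- data.frame(
  time   = c(5, 11, 12, 13, 15, 26),
  cumhaz = c(0.004385965, 0.017601824, 0.022066110,
             0.031034720, 0.035559606, 0.040105061)
)

step_data <- to_step_data(cumhaz$time, cumhaz$cumhaz)
print(step_data)

# Write a CSV that Excel can open; the file name is an arbitrary choice.
write.csv(step_data, file.path(tempdir(), "cumhaz_steps.csv"), row.names = FALSE)
```

With the real object you would presumably call to_step_data(Survival_Function$time, Survival_Function$cumhaz) instead of using the hard-coded rows.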

Note: If you have a large number of columns this will be cumbersome, but for just two or so it should be fine.
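If you do end up with many columns, one possible way around the repetition is to build the doubled row indices once and use them to subset the whole data frame. This is only a sketch: the helper name to_step_frame and the example columns Lower and Upper are made up for illustration.

```r
# Hypothetical helper: same reshaping, but for any number of value columns.
to_step_frame <- function(df, time_col) {
  n <- nrow(df)
  idx <- rep(seq_len(n), each = 2)   # 1, 1, 2, 2, ..., n, n
  time_idx  <- idx[-1]               # times: drop the first index
  value_idx <- idx[-length(idx)]     # values: drop the last index
  value_cols <- setdiff(names(df), time_col)
  out <- data.frame(df[time_idx, time_col, drop = FALSE],
                    df[value_idx, value_cols, drop = FALSE])
  rownames(out) <- NULL              # reset the duplicated row names
  out
}

x <- data.frame(Time = c(1, 2, 3), Lower = c(6, 7, 8), Upper = c(9, 10, 11))
wide_steps <- to_step_frame(x, "Time")
print(wide_steps)
```

Every non-time column is shifted the same way, so the time column pairs with the previous row's values first and then with its own.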
