Calculating rolling mean by ID with variable window width

Time:05-19

I have a repeated measures dataset of vital signs. I'm trying to calculate some summary statistics (mean, min, max, slope, etc.) over each patient's prior 24 hours of observations, with time measured by the Admit_to_Perform variable and the current observation excluded. Here is an extract of the first patient's first 15 observations:

df1 <- data.frame(ID = rep(1, 15), 
Admit_to_Perform = c(1.07, 1.07, 1.70, 3.73, 3.73, 4.20, 8.87, 11.68, 14.80, 15.67, 19.08, 23.15, 29.68, 36.03, 39.08), 
Resp_Rate = c(18, 17, 18, 17, 16, 16, 16, 16, 16, 17, 16, 16, 16, 16, 16))

     ID     Admit_to_Perform Resp_Rate
 1:  1             1.07        18
 2:  1             1.07        17
 3:  1             1.70        18
 4:  1             3.73        17
 5:  1             3.73        16
 6:  1             4.20        16
 7:  1             8.87        16
 8:  1            11.68        16
 9:  1            14.80        16
10:  1            15.67        17
11:  1            19.08        16
12:  1            23.15        16
13:  1            29.68        16
14:  1            36.03        16
15:  1            39.08        16

What I would like is to add a column for each summary statistic of Resp_Rate. The first row has no prior observations in the past 24 hours, so it can be blank. For the second row the mean would be 18, for the third row 17.5, for the fourth row 17.667, and so on. For the 13th row, however, Admit_to_Perform is more than 24 hours after the first 6 observations, so it would only take the mean of rows 7-12.

I've tried using some of the zoo and data.table functions but don't seem to be getting anywhere.

CodePudding user response:

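This is a bit of a quick and dirty solution: Resp_Rate is hard-coded into f(), and it will be slow because it filters the whole dataset once for every row. It does what you want, with one caveat: because the time comparison is strict, rows that share the current row's Admit_to_Perform value (such as rows 1 and 2) are not counted as prior observations.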

library(tidyverse)

df1 <- data.frame(ID = rep(1, 15), 
                  Admit_to_Perform = c(1.07, 1.07, 1.70, 3.73, 3.73, 4.20, 8.87, 11.68, 14.80, 15.67, 19.08, 23.15, 29.68, 36.03, 39.08), 
                  Resp_Rate = c(18, 17, 18, 17, 16, 16, 16, 16, 16, 17, 16, 16, 16, 16, 16))


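f <- function(data, id, outcome, time, window=24) {
  # Keep this ID's observations from the `window` hours before `time`,
  # excluding anything recorded at `time` itself
  data <- filter(data, 
                 ID==id,
                 Admit_to_Perform>(time-window),
                 Admit_to_Perform<time)
  # `outcome` is unused: Resp_Rate is hard-coded below
  # No non-missing prior observations in the window, so return NA
  if(sum(!is.na(data$Resp_Rate))==0) return(NA_real_)
  mean(data$Resp_Rate, na.rm=TRUE)
}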



df1 %>%
  rowwise() %>%
  mutate(roll=f(data=., id=ID, outcome=Resp_Rate, time=Admit_to_Perform))
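
If rows that share a timestamp should still count as prior observations (as in the question's expected output, where row 2 gets a mean of 18 from row 1), one option is to define "prior" by row order instead of by time alone. The following is a minimal base R sketch of that idea, not part of the original answer. The helper name prior_window_mean is made up for illustration, and it assumes the data are already sorted by Admit_to_Perform within each ID.

```r
df1 <- data.frame(ID = rep(1, 15),
                  Admit_to_Perform = c(1.07, 1.07, 1.70, 3.73, 3.73, 4.20, 8.87, 11.68,
                                       14.80, 15.67, 19.08, 23.15, 29.68, 36.03, 39.08),
                  Resp_Rate = c(18, 17, 18, 17, 16, 16, 16, 16, 16, 17, 16, 16, 16, 16, 16))

# Hypothetical helper: for each row, the mean of `value` over earlier rows
# of the same ID whose time lies within `window` hours before that row.
# Assumes rows are sorted by time within each ID.
prior_window_mean <- function(id, time, value, window = 24) {
  vapply(seq_along(time), function(i) {
    idx  <- seq_len(i - 1)                         # rows that come before row i
    keep <- id[idx] == id[i] & time[idx] > time[i] - window
    x    <- value[idx][keep]
    x    <- x[!is.na(x)]                           # drop missing readings
    if (length(x) == 0) NA_real_ else mean(x)      # NA when nothing is in the window
  }, numeric(1))
}

df1$roll_mean <- prior_window_mean(df1$ID, df1$Admit_to_Perform, df1$Resp_Rate)
head(df1, 4)
```

Because the column to summarise is passed in as a vector, the same helper can be called on other vital signs. Swapping mean() for min(), max(), or a slope calculation inside the function would give the other summary statistics. It is still quadratic in the number of rows, so it is a sketch for clarity, not speed.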