R - Calculate the average if 2 values present, calculate the median if 3 values

Time:05-30

I have a dataset in which each individual has 3 possible readings of systolic blood pressure (SBP) and 3 possible readings of diastolic blood pressure (DBP):

a = data.frame(
  ID = c(1:10),
  SBP1 = c(120, 121, 122, as.numeric(NA), 123, 124, 145, as.numeric(NA), 101, 110),
  SBP2 = c(134, 124, as.numeric(NA), as.numeric(NA), 102, 133, 123, as.numeric(NA), as.numeric(NA), 109),
  SBP3 = c(111, 123, as.numeric(NA), as.numeric(NA), as.numeric(NA), 133, 132, 111, 110, 123),
  DBP1 = c(89, 90, 87, as.numeric(NA), 65, 98, 80, as.numeric(NA), 66, 65),
  DBP2 = c(90, 92, as.numeric(NA), as.numeric(NA), 65, 78, 88, as.numeric(NA), as.numeric(NA), 91),
  DBP3 = c(91, 93, as.numeric(NA), as.numeric(NA), as.numeric(NA), 92, 78, 88, 88, 54)
)

I would like to create two new variables (one for the SBP called 'SBP_new', and the other for the DBP called 'DBP_new') using the following rules:

  1. If all 3 of the SBP/DBP readings are present, then calculate the median (e.g., for ID1, SBP_new = 120, DBP_new = 90)
  2. If two of the 3 SBP/DBP readings are present, then calculate the mean (e.g., for ID5, SBP_new = (123 + 102)/2 and DBP_new = (65 + 65)/2)
  3. If only 1 SBP/DBP reading is available, then take that reading (e.g., for ID3, SBP_new = 122, DBP_new = 87)
  4. Finally, if all readings are NA, then assign NA (e.g., for ID4, SBP_new = NA, DBP_new = NA)
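
Written out literally, the four rules above could look like the following sketch. The helper name `bp_summary` is hypothetical (it is not part of the question), and it assumes it is handed the three readings of one type for one individual.

```r
# Hypothetical helper: apply the four rules to one individual's readings.
bp_summary <- function(x) {
  obs <- x[!is.na(x)]            # keep only the readings that are present
  n <- length(obs)
  if (n == 3) {
    median(obs)                  # rule 1: all three present -> median
  } else if (n == 2) {
    mean(obs)                    # rule 2: two present -> mean
  } else if (n == 1) {
    obs                          # rule 3: one present -> that reading
  } else {
    NA_real_                     # rule 4: nothing present -> NA
  }
}

bp_summary(c(120, 134, 111))     # ID1 SBP: 120
bp_summary(c(123, 102, NA))      # ID5 SBP: 112.5
bp_summary(c(122, NA, NA))       # ID3 SBP: 122
bp_summary(c(NA, NA, NA))        # ID4 SBP: NA
```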

I could split the dataset into 4 subsets, do the calculation in each one separately, and then combine the results.

But is there a more efficient way to do this?

CodePudding user response:

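As @Ritchie Sacramento says in his comment on the question, compute the median in all cases: the median of two values equals their mean, and the median of a single value is that value. Set na.rm so that NAs are dropped only when at least one reading is present; a row that is all NA then yields NA.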
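
A quick check of why the median alone can cover every rule, using values from the question's data (a small illustration, not part of the original answer):

```r
# Two values: the median is the midpoint, which equals the mean.
median(c(123, 102)) == mean(c(123, 102))                  # TRUE (both 112.5)

# One value: the median is that value.
median(122)                                               # 122

# na.rm = TRUE drops missing readings before the median is taken.
median(c(123, 102, NA), na.rm = TRUE)                     # 112.5

# na.rm = FALSE with any NA gives NA, which handles the all-NA case.
median(c(NA_real_, NA_real_, NA_real_), na.rm = FALSE)    # NA
```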

i_sbp <- grep("SBP", names(a))
i_dbp <- grep("DBP", names(a))

a$SBP_new <- apply(a[i_sbp], 1, \(x) median(x, na.rm = any(!is.na(x))))
a$DBP_new <- apply(a[i_dbp], 1, \(x) median(x, na.rm = any(!is.na(x))))

Created on 2022-05-29 by the reprex package (v2.0.1)
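
The same idea can be looped over both measurement types. The sketch below is a variation, not the answer's own code: the data frame `bp` (the first five IDs from the question) and the helper `row_median` are names introduced here for illustration. It uses `function(x)` instead of `\(x)`, since the backslash shorthand needs R 4.1.0 or later, and it anchors the pattern so that a rerun does not pick up the newly created `SBP_new`/`DBP_new` columns, which a plain `grep("SBP", ...)` would match.

```r
# Illustrative subset of the question's data (IDs 1 to 5).
bp <- data.frame(
  ID   = 1:5,
  SBP1 = c(120, 121, 122, NA, 123),
  SBP2 = c(134, 124, NA, NA, 102),
  SBP3 = c(111, 123, NA, NA, NA),
  DBP1 = c(89, 90, 87, NA, 65),
  DBP2 = c(90, 92, NA, NA, 65),
  DBP3 = c(91, 93, NA, NA, NA)
)

# Hypothetical helper: median of the available readings, NA if there are none.
row_median <- function(x) median(x, na.rm = any(!is.na(x)))

for (prefix in c("SBP", "DBP")) {
  # Match only the numbered reading columns, e.g. SBP1..SBP3, not SBP_new.
  cols <- grep(paste0("^", prefix, "[0-9]+$"), names(bp))
  bp[[paste0(prefix, "_new")]] <- apply(bp[cols], 1, row_median)
}

bp[c("ID", "SBP_new", "DBP_new")]
```

For these five rows, `SBP_new` comes out as 120, 123, 122, NA, 112.5 and `DBP_new` as 90, 92, 87, NA, 65, matching the examples given in the question.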
