Home > Net >  Long to Wide Patient Data in R
Long to Wide Patient Data in R

Time:12-04

I have patient data that is in the following format

Exam_Report    PatientID   ExamDate   LVEF LA_size
Lorem ipsum1   101         1stQuarter 55   20
Lorem ipsum2   101         3rdQuarter 58   20
Lorem ipsum3   102         2ndQuarter 39   24
Lorem ipsum4   102         3rdQuarter 49   24

And I would like to reshape the file so that it is as follows:

Exam_Report  PatientID  1stQ_LVEF 2ndQ_LVEF 3rdQ_LVEF 1stQ_LA_size 2nQ_LA_size 3rdQ_LA_size(...)
Lorem Ipsum1 101        55        NA        58        20           NA          20
Lorem Ipsum2 101        55        NA        58        20           NA          20
Lorem Ipsum3 102        NA        39        49        NA           24          24
Lorem Ipsum4 102        NA        39        49        NA           24          24

So, in summary: I have a table with variables that must be maintained for the wide format (for example, exam report, as the report differs from timepoint to timepoint), thus jusfifying that patientID must be repeated in the wide format. I need to create new variables (with a prefix containing the quarter that the exam was performed on) that hold the echocardiographic variables.

I'm stuck with this problem. Does anyone have a solution? Thanks

CodePudding user response:

We can pivot_wider (from package {tidyr}) on PatientID, then join back by PatientID:

library(tidyr)
library(dplyr)

df %>% 
  mutate(across(ExamDate, sub, pattern = "Quarter", replacement = "Q")) %>% 
  pivot_wider(
    id_cols = PatientID,
    names_from = ExamDate,
    values_from = LVEF:LA_size,
    names_glue = "{ExamDate}_{.value}",
    names_sort = TRUE
  ) %>% 
  left_join(
    x = select(df, Exam_Report:PatientID),
    y = .,
    by = "PatientID"
  )
   Exam_Report PatientID 1stQ_LVEF 2ndQ_LVEF 3rdQ_LVEF 1stQ_LA_size 2ndQ_LA_size 3rdQ_LA_size
1 Lorem ipsum1       101        55        NA        58           20           NA           20
2 Lorem ipsum2       101        55        NA        58           20           NA           20
3 Lorem ipsum3       102        NA        39        49           NA           24           24
4 Lorem ipsum4       102        NA        39        49           NA           24           24

With data:

df <- read.table(text = 
'Exam_Report    PatientID   ExamDate   LVEF LA_size
"Lorem ipsum1"   101         1stQuarter 55   20
"Lorem ipsum2"   101         3rdQuarter 58   20
"Lorem ipsum3"   102         2ndQuarter 39   24
"Lorem ipsum4"   102         3rdQuarter 49   24',
  header = TRUE)
  •  Tags:  
  • r
  • Related