I have patient data that is in the following format
Exam_Report PatientID ExamDate LVEF LA_size
Lorem ipsum1 101 1stQuarter 55 20
Lorem ipsum2 101 3rdQuarter 58 20
Lorem ipsum3 102 2ndQuarter 39 24
Lorem ipsum4 102 3rdQuarter 49 24
And I would like to reshape the file so that it is as follows:
Exam_Report PatientID 1stQ_LVEF 2ndQ_LVEF 3rdQ_LVEF 1stQ_LA_size 2nQ_LA_size 3rdQ_LA_size(...)
Lorem Ipsum1 101 55 NA 58 20 NA 20
Lorem Ipsum2 101 55 NA 58 20 NA 20
Lorem Ipsum3 102 NA 39 49 NA 24 24
Lorem Ipsum4 102 NA 39 49 NA 24 24
So, in summary: I have a table with variables that must be maintained for the wide format (for example, exam report, as the report differs from timepoint to timepoint), thus jusfifying that patientID must be repeated in the wide format. I need to create new variables (with a prefix containing the quarter that the exam was performed on) that hold the echocardiographic variables.
I'm stuck with this problem. Does anyone have a solution? Thanks
CodePudding user response:
We can pivot_wider
(from package {tidyr}) on PatientID
, then join back by PatientID
:
library(tidyr)
library(dplyr)
df %>%
mutate(across(ExamDate, sub, pattern = "Quarter", replacement = "Q")) %>%
pivot_wider(
id_cols = PatientID,
names_from = ExamDate,
values_from = LVEF:LA_size,
names_glue = "{ExamDate}_{.value}",
names_sort = TRUE
) %>%
left_join(
x = select(df, Exam_Report:PatientID),
y = .,
by = "PatientID"
)
Exam_Report PatientID 1stQ_LVEF 2ndQ_LVEF 3rdQ_LVEF 1stQ_LA_size 2ndQ_LA_size 3rdQ_LA_size
1 Lorem ipsum1 101 55 NA 58 20 NA 20
2 Lorem ipsum2 101 55 NA 58 20 NA 20
3 Lorem ipsum3 102 NA 39 49 NA 24 24
4 Lorem ipsum4 102 NA 39 49 NA 24 24
With data:
df <- read.table(text =
'Exam_Report PatientID ExamDate LVEF LA_size
"Lorem ipsum1" 101 1stQuarter 55 20
"Lorem ipsum2" 101 3rdQuarter 58 20
"Lorem ipsum3" 102 2ndQuarter 39 24
"Lorem ipsum4" 102 3rdQuarter 49 24',
header = TRUE)