I have a dataset that looks similar to this:
Patient ID | Unit Type | Status |
---|---|---|
1234 | ABC | Returned |
1234 | ABC | Returned |
1234 | ABC | Returned |
1234 | ABC | Returned |
1234 | ABC | Transfused |
1234 | DEF | Transfused |
1234 | DEF | Transfused |
1234 | DEF | Transfused |
1234 | GHI | Transfused |
Table Code:
> dput(df)
structure(list(Patient_ID = c("1234", "1234", "1234", "1234",
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC",
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned",
"Returned", "Returned", "Returned", "Transfused", "Transfused",
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA,
-9L))
I want my output to look like this (preferably in a new dataframe):
Patient ID | ABC_Returned | ABC_Transfused | DEF_Tranfused | GHI_Transfused |
---|---|---|---|---|
1234 | 4 | 1 | 3 | 1 |
How can I create a function that would automate this process?
CodePudding user response:
library(tidyverse)
df %>%
count(Patient_ID, Unit_Type, Status) %>%
pivot_wider(names_from = c(Unit_Type, Status), values_from = n)
result
# A tibble: 1 × 5
Patient_ID ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused
<chr> <int> <int> <int> <int>
1 1234 4 1 3 1
CodePudding user response:
library(dplyr)
df <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234",
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC",
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned",
"Returned", "Returned", "Returned", "Transfused", "Transfused",
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA,
-9L))
df2 <- df %>%
mutate(unit_stat = paste(Unit_Type, Status, sep = "_")) %>%
count(Patient_ID, unit_stat) %>%
tidyr::pivot_wider(id_cols = "Patient_ID", names_from = unit_stat, values_from = n)