I have a dataframe of about 2000 rows and 3 columns. In essence, I want to reshape this dataframe to be wider than longer. This is an example of my current data:
ID | Procedure | Date |
---|---|---|
D55 | Sedation | 01/01/2001 |
D55 | Excision | 01/01/2001 |
D55 | Biopsy | 01/01/2001 |
A66 | Sedation | 02/02/2001 |
A66 | Excision | 02/02/2001 |
T44 | Sedation | 03/03/2001 |
T44 | Biopsy | 03/03/2001 |
T44 | Sedation | 04/04/2001 |
T44 | Excision | 04/04/2001 |
G88 | Sedation | 05/05/2001 |
G88 | Biopsy | 05/05/2001 |
G88 | Sedation | 06/06/2001 |
G88 | Excision | 06/06/2001 |
G88 | Sedation | 07/07/2001 |
G88 | Re-excision | 07/07/2001 |
I want the each row to be one line for the ID, so I'd want to create something like this:
ID | Date 1 | Procedure(s) | Date 2 | Procedure(s) | Date 3 | Procedure(s) |
---|---|---|---|---|---|---|
D55 | 01/01/2001 | Sedation, Excision, Biopsy | ||||
A66 | 02/02/2001 | Sedation, Excision | ||||
T44 | 03/03/2001 | Sedation, Biopsy | 04/04/2001 | Sedation, Excision | ||
G88 | 05/05/2001 | Sedation, Biopsy | 06/06/2001 | Sedation, Excision | 07/07/2001 | Sedation, Re-excision |
The majority of IDs all have the same date, but different procedures documented. There are a handful that came in for further procedures on subsequent dates. I can't see any that came in for more than 3 different dates, but a way to count the dates documented per ID would be useful.
I've tried using cast and dcast so far, but I'm not really getting anywhere. I'm very new to R, so any help would be greatly appreciated! Thanks for reading.
CodePudding user response:
library(tidyverse)
df %>%
group_by(ID, Date) %>%
summarize(Procedure = paste0(Procedure, collapse = ", ")) %>%
mutate(col = row_number()) %>%
ungroup() %>%
pivot_wider(names_from = col, values_from = c(Date, Procedure))
This currently requires some reordering afterwards, which could be done like in this answer: https://stackoverflow.com/a/60400134/6851825
# A tibble: 4 x 7
ID Date_1 Date_2 Date_3 Procedure_1 Procedure_2 Procedure_3
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A66 2/2/01 NA NA Sedation, Excision NA NA
2 D55 1/1/01 NA NA Sedation, Excision, Biopsy NA NA
3 G88 5/5/01 6/6/01 7/7/01 Sedation, Biopsy Sedation, Excision Sedation, Re-excision
4 T44 3/3/01 4/4/01 NA Sedation, Biopsy Sedation, Excision NA