I have data in this format:
ID | risk | severity |
---|---|---|
1 | A | green |
1 | A | amber |
1 | A | amber |
1 | B | amber |
1 | B | amber |
1 | B | amber |
2 | B | green |
2 | B | green |
2 | B | amber |
2 | C | green |
2 | C | red |
2 | C | red |
I want to convert it to this format using dplyr on R-Studio:
ID | A | min_sev | max_sev | B | min_sev | max_sev | C | min_sev | max_sev |
---|---|---|---|---|---|---|---|---|---|
1 | 3 | green | amber | 3 | amber | amber | 0 | NA | NA |
2 | 0 | NA | NA | 3 | green | amber | C | green | red |
where,
min_sev = minimum severity
max_sev = maximum severity
Any help is appreciated, thanks in advance.
CodePudding user response:
Good catch @r2evans, here is an even simpler solution with nest and unnest:
library(dplyr)
library(tidyr)
df %>%
group_by(ID, risk) %>%
summarise(min_sev = first(severity),
max_sev = last(severity),
count = n()) %>%
group_by(ID, risk) %>%
nest() %>%
pivot_wider(names_from = risk, values_from = data) %>%
unnest(cols = everything(), names_sep = "_")