I have a dataset similar to this:
df <- structure(list(Person = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), Stab = c(1,
0, 1, 0, 0, 1, 0, 0, 0, 0), Shot = c(0, 0, 1, 1, 0, 0, 0, 0,
0, 1), Cut = c(0, 1, 1, 1, 0, 0, 0, 1, 0, 1), ShotBow = c(0,
0, 1, 0, 1, 0, 0, 0, 0, 0), Punched = c(0, 0, 1, 0, 1, 0, 0,
1, 0, 0), Slapped = c(0, 0, 1, 0, 0, 1, 0, 0, 1, 0), `Car Accident` = c(0,
0, 1, 0, 0, 0, 0, 0, 0, 0), `Bicycle Accident` = c(0, 0, 1, 0,
0, 0, 1, 0, 0, 0), FellOver = c(0, 0, 1, 0, 0, 0, 1, 0, 1, 0)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
As you can see, the data is about different patients and what happened to them. Variables in the real dataset are slightly different but kinda like: "stabbed", "shot", "slapped" etc...
I want to squish all these columns into one column that is effectively "how bad is the injury", and working with some of my medical colleagues we've decided on some rankings (again, these aren't the real injuries, I made these ones up).
The rankings for this fake one are:
Level 1 Severity (worst)
- Car Accident
- Shot
- ShotBow
Level 2 Severity (not as bad)
- Stab
- Cut
- Bicycle Accident
Level 3 Severity (really not bad)
- Punched
- Slapped
- FellOver
What I want to do is create a variable called "Severity" and give each patient a 1, 2, or 3 based on which of those columns they have a 1 in, prioritizing the most severe injury. For example, Patient 1 was stabbed, so they get a "2" (for level 2). Patient 8 was cut and punched, so they'd get a 2 for the cut, which overrules less severe injuries such as the punch. Patient 10 was shot and cut, so they'd get a "1" because shot is more severe than cut.
My expected output would look like this:
CodePudding user response:
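One option is a named vector that maps each injury column to its severity level. Replace the 0s with NA, multiply each column by its level, and take the row-wise minimum: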
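library(dplyr)

# Severity level for each injury column, in the same order as names(df)[-1]
nm1 <- setNames(c(2, 1, 2, 1, 3, 3, 1, 2, 3), names(df)[-1])
df %>%
  # Turn 0 into NA, scale each 1 to its column's level, then take the row-wise minimum
  mutate(Severity = across(-Person, ~ na_if(., 0) * nm1[[cur_column()]]) %>%
    {do.call(pmin, c(., na.rm = TRUE))})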
-output
# A tibble: 10 × 11
   Person  Stab  Shot   Cut ShotBow Punched Slapped `Car Accident` `Bicycle Accident` FellOver Severity
    <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>          <dbl>              <dbl>    <dbl>    <dbl>
 1      1     1     0     0       0       0       0              0                  0        0        2
 2      2     0     0     1       0       0       0              0                  0        0        2
 3      3     1     1     1       1       1       1              1                  1        1        1
 4      4     0     1     1       0       0       0              0                  0        0        1
 5      5     0     0     0       1       1       0              0                  0        0        1
 6      6     1     0     0       0       0       1              0                  0        0        2
 7      7     0     0     0       0       0       0              0                  1        1        2
 8      8     0     0     1       0       1       0              0                  0        0        2
 9      9     0     0     0       0       0       1              0                  0        1        3
10     10     0     1     1       0       0       0              0                  0        0        1
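
If you'd rather not depend on dplyr, the same idea can probably be sketched in base R. This is only a rough equivalent, not a drop-in replacement: the data frame is rebuilt here as a plain data.frame so the snippet runs on its own, and the names `lvl`, `flags` and `scored` are made up for illustration.

```r
# Same values as the question's data, rebuilt as a plain data.frame
df <- data.frame(
  Person  = 1:10,
  Stab    = c(1, 0, 1, 0, 0, 1, 0, 0, 0, 0),
  Shot    = c(0, 0, 1, 1, 0, 0, 0, 0, 0, 1),
  Cut     = c(0, 1, 1, 1, 0, 0, 0, 1, 0, 1),
  ShotBow = c(0, 0, 1, 0, 1, 0, 0, 0, 0, 0),
  Punched = c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0),
  Slapped = c(0, 0, 1, 0, 0, 1, 0, 0, 1, 0),
  `Car Accident`     = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
  `Bicycle Accident` = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0),
  FellOver = c(0, 0, 1, 0, 0, 0, 1, 0, 1, 0),
  check.names = FALSE  # keep the column names that contain spaces
)

# Severity level per injury column, taken from the rankings in the question
# (hypothetical name: lvl)
lvl <- c(Stab = 2, Shot = 1, Cut = 2, ShotBow = 1, Punched = 3, Slapped = 3,
         `Car Accident` = 1, `Bicycle Accident` = 2, FellOver = 3)

# Select the injury columns by name so they line up with lvl
flags <- as.matrix(df[names(lvl)])
flags[flags == 0] <- NA              # an absent injury must not count

# Multiply every column by its level, so each 1 becomes 1, 2 or 3
scored <- sweep(flags, 2, lvl, `*`)

# Row-wise minimum = most severe level present; NA if no injury at all
df$Severity <- do.call(pmin, c(as.data.frame(scored), na.rm = TRUE))
df$Severity
# 2 2 1 1 1 2 2 2 3 1
```

Selecting the columns with `names(lvl)` means the levels are matched by name rather than by position, so reordering the columns in the real dataset shouldn't silently change the result.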