I have a data.frame with risk-of-bias categories in separate columns, in the form:
a <- data.frame(
  Q1_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q2_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q3_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q4_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q5_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q6_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q7_long_name = sample(c("y", "n", "m"), 21, replace = TRUE)
)
As my variable names are really long (they are required by another function), my case_when() statements end up long and hard to read. Something like:
a %>%
  mutate(overall_rob =
    case_when(
      Q1_long_name == "y" & Q2_long_name == "n" & Q3_long_name == "n" & Q5_long_name != "m" ~ "high",
      Q1_long_name == "n" | Q2_long_name == "n" | Q3_long_name == "n" | Q5_long_name != "m" ~ "low",
      TRUE ~ "unclear"))
I managed to do it by renaming my variables before using case_when() and then changing them back, but it still looks messy (as pointed out by TarJae):
a %>%
rename_with(.cols=matches("^Q"), ~ gsub("^(Q[0-9]).*","\\1", .x))
So I was wondering whether there is any way to streamline case_when(), using %in% or something similar, to specify multiple conditions at once. If not, TarJae's way would definitely be easier.
CodePudding user response:
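Are you looking for a solution like this?
library(dplyr)
library(stringr)  # for str_extract()
a %>%
  # keep only the part of each name before the first underscore (Q1, Q2, ...)
  rename_with(~ str_extract(., "^[^_]+(?=_)")) %>%
  mutate(overall_rob =
    case_when(
      Q1 == "y" & Q2 == "n" & Q3 == "n" & Q5 != "m" ~ "high",
      Q1 == "n" | Q2 == "n" | Q3 == "n" | Q5 != "m" ~ "low",
      TRUE ~ "unclear"))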
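If the long names have to survive for the other function, one possible variation on the renaming idea is to evaluate case_when() against a temporary, short-named copy and write the result back to the original data frame. This is only a sketch: the object name `short`, the `set.seed()` call, and the trimmed-down four-column version of `a` are assumptions made for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: only here so the illustration is reproducible
a <- data.frame(
  Q1_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q2_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q3_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q5_long_name = sample(c("y", "n", "m"), 21, replace = TRUE)
)

# Temporary copy with short names (Q1, Q2, ...); `a` itself keeps its long names.
short <- setNames(a, sub("_.*$", "", names(a)))

# Evaluate the rules against the short names and store the result in `a`.
a$overall_rob <- with(short, case_when(
  Q1 == "y" & Q2 == "n" & Q3 == "n" & Q5 != "m" ~ "high",
  Q1 == "n" | Q2 == "n" | Q3 == "n" | Q5 != "m" ~ "low",
  TRUE ~ "unclear"
))
```

The rules stay short, and no second rename is needed afterwards because the original columns are never touched.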
CodePudding user response:
Maybe like this?
a %>%
  # case1: all four conditions must hold
  mutate(case1 = Q1_long_name == "y" &
                 Q2_long_name == "n" &
                 Q3_long_name == "n" &
                 Q5_long_name != "m") %>%
  # case2: at least one condition must hold
  mutate(case2 = Q1_long_name == "n" |
                 Q2_long_name == "n" |
                 Q3_long_name == "n" |
                 Q5_long_name != "m") %>%
  mutate(overall_rob =
    case_when(
      case1 ~ "high",
      case2 ~ "low",
      TRUE ~ "unclear"))
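On the "multiple conditions at once" part of the question: dplyr's if_all() and if_any() (available in dplyr 1.0.4 and later, as far as I know) apply one test to several columns chosen with a tidyselect helper, so the long names do not have to be typed for those columns. A minimal sketch that builds on the helper-flag idea above; the flag names `both_n` and `any_n`, the `set.seed()` call, and the four-column version of `a` are assumptions for illustration only.

```r
library(dplyr)

set.seed(1)  # assumption: only here so the illustration is reproducible
a <- data.frame(
  Q1_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q2_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q3_long_name = sample(c("y", "n", "m"), 21, replace = TRUE),
  Q5_long_name = sample(c("y", "n", "m"), 21, replace = TRUE)
)

res <- a %>%
  mutate(
    # TRUE when every column starting with Q2_ or Q3_ equals "n"
    both_n = if_all(matches("^Q[23]_"), ~ .x == "n"),
    # TRUE when at least one column starting with Q1_, Q2_ or Q3_ equals "n"
    any_n = if_any(matches("^Q[1-3]_"), ~ .x == "n"),
    overall_rob = case_when(
      Q1_long_name == "y" & both_n & Q5_long_name != "m" ~ "high",
      any_n | Q5_long_name != "m" ~ "low",
      TRUE ~ "unclear"
    )
  ) %>%
  select(-both_n, -any_n)  # drop the helper flags again
```

This only shortens conditions that test several columns against the same value; columns with their own test (Q1 and Q5 in the "high" rule) still have to be written out.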