In the dplyr package can you mutate a column based on the values in a different column

Time:07-30

My aim is to create a new df column called audit_cat, based on two existing columns: audit_score (a discrete numeric variable) and gender ("Male", "Female").

audit_cat will be a dichotomous string variable. For "Female", scores 0-2 map to "Not hazardous" and 3-12 to "Hazardous"; for "Male", scores 0-3 map to "Not hazardous" and 4-12 to "Hazardous".

Ideally, audit_cat would hold the categories for both females and males in a single column.

Filtering by gender and then applying mutate with case_when to audit_score works, except that the filter drops the rows for the other gender.

df2 <- df2 %>%
  filter(gender == "Male") %>%
  mutate(audit_score_cat =
           case_when(audit_score >= 0 & audit_score <= 3 ~ "Not hazardous",
                     audit_score >= 4 & audit_score <= 12 ~ "Hazardous"))

Is there a way to create the new column audit_cat from the two scoring systems, with the cut-off depending on gender?

Thank you.

CodePudding user response:

You could list the conditions for "Not hazardous" and assign the rest to "Hazardous".

df2 %>%
  mutate(audit_score_cat = case_when(
    # gender values are "Male" and "Female", as stated in the question
    gender == "Male"   & between(audit_score, 0, 3) ~ "Not hazardous",
    gender == "Female" & between(audit_score, 0, 2) ~ "Not hazardous",
    TRUE                                            ~ "Hazardous"
  ))
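Note that the catch-all TRUE branch also labels rows with a missing gender, a missing score, or a score outside 0-12 as "Hazardous". If that is not wanted, one option is to spell out all four ranges and let everything else fall through to NA. The following is only a sketch: the data frame is made up for illustration, and the new column is named audit_cat as in the question.

```r
library(dplyr)

# Made-up example data; only the column names come from the question
df2 <- data.frame(
  gender      = c("Male", "Male", "Female", "Female", "Male", NA),
  audit_score = c(3, 4, 2, 3, NA, 5)
)

df2 <- df2 %>%
  mutate(audit_cat = case_when(
    gender == "Male"   & between(audit_score, 0, 3)  ~ "Not hazardous",
    gender == "Male"   & between(audit_score, 4, 12) ~ "Hazardous",
    gender == "Female" & between(audit_score, 0, 2)  ~ "Not hazardous",
    gender == "Female" & between(audit_score, 3, 12) ~ "Hazardous"
    # no catch-all branch: rows matching none of the conditions become NA
  ))

print(df2)
```

With no default branch, case_when returns NA for any row that matches none of the conditions, so unexpected values stay visible instead of being silently classified.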