Use mutate and case_when on a group of variables

Time:08-27

I am trying to make a new variable that depends on a few conditions. Here is an example of data similar to mine:

df <- read.table(text="
color     num_1   shape      num_2   season    num_3     num_4  
red        1      triangle    4       Fall      2          8
blue       5      square      4       Summer    8          1
green      3      square      11      Summer    4          1
red        3      circle      2       Summer    1          5
red        7      triangle    6       Winter    7          9
blue       9      square      2       Fall      7          4", header=TRUE)

I want to use mutate and case_when to create a new variable. For example, if color is "red" and any of the "num" columns is less than 3, the new variable should be "yes"; likewise, if color is "blue" and any of the "num" columns is less than 5, it should be "yes". For example:

color     num_1   shape      num_2   season    num_3     num_4     new_var
red        1      triangle    4       Fall      2          8         yes 
blue       5      square      4       Summer    8          1         yes
blue       9      square      11      Summer    8          7         no
red        3      circle      2       Summer    1          5         yes
red        7      triangle    6       Winter    7          9         no
blue       9      square      2       Fall      7          4         yes

I think I can do something like:


df <-df %>%
 mutate(new_var=case_when(
   color=="red" & c(2,4,6,7) < 3 ~ "Yes",
   color=="blue" & c(2,4,6,7) < 5 ~ "Yes" ,
   TRUE~"No"))

But I don't know whether it is possible to choose the columns by position like this. Any advice would be great!

CodePudding user response:

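You can't use raw column indexes like that: inside case_when, c(2,4,6,7) < 3 compares the numbers 2, 4, 6, and 7 themselves to 3, not the columns at those positions. You can use if_any instead: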

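library(dplyr)

df %>% 
  mutate(
    new_var = case_when(
      # if_any() is TRUE for a row when any selected column meets the condition
      color == "red" & if_any(starts_with("num"), ~ . < 3) ~ "Yes",
      color == "blue" & if_any(starts_with("num"), ~ . < 5) ~ "Yes",
      TRUE ~ "No")  # every other row
  )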

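The functions across, if_any, and if_all are related: each lets you apply the tidyselect helpers (such as starts_with) to several columns at once. if_any is TRUE for a row when the condition holds in at least one of the selected columns, and if_all is TRUE only when it holds in all of them.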
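
Since the question asked about choosing columns by position, here is a minimal sketch of how that could look. It relies on tidyselect accepting integer column positions as well as helpers, so c(2, 4, 6, 7) can be passed to if_any directly. The names res, by_name, by_pos, and all_small, and the cutoff of 10 in the if_all line, are made up for illustration and are not part of the original question or answer.

```r
library(dplyr)

# Sample data from the question
df <- read.table(text = "
color num_1 shape    num_2 season num_3 num_4
red   1     triangle 4     Fall   2     8
blue  5     square   4     Summer 8     1
green 3     square   11    Summer 4     1
red   3     circle   2     Summer 1     5
red   7     triangle 6     Winter 7     9
blue  9     square   2     Fall   7     4", header = TRUE)

res <- df %>%
  mutate(
    # Select by name prefix: "Yes" if at least one num_* column is below the cutoff
    by_name = case_when(
      color == "red"  & if_any(starts_with("num"), ~ .x < 3) ~ "Yes",
      color == "blue" & if_any(starts_with("num"), ~ .x < 5) ~ "Yes",
      TRUE ~ "No"
    ),
    # Select by position: the same test on columns 2, 4, 6, and 7
    by_pos = case_when(
      color == "red"  & if_any(c(2, 4, 6, 7), ~ .x < 3) ~ "Yes",
      color == "blue" & if_any(c(2, 4, 6, 7), ~ .x < 5) ~ "Yes",
      TRUE ~ "No"
    ),
    # if_all() needs every selected column to pass (illustrative cutoff of 10)
    all_small = if_all(starts_with("num"), ~ .x < 10)
  )

print(res)
```

On the sample data, by_name and by_pos agree row for row. Selecting by name is usually the safer choice, because positions silently point at different columns if the data frame is reordered or gains a column.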
