I'm trying to create a new column in an R data frame based on a set of mutually exclusive conditions. There is a clever way to do this in Python using np.select(conditions, choices) instead of np.where (see this solved question). I've been looking for an equivalent in R that lets me avoid writing a gigantic nested ifelse (which is the equivalent of np.where), without any success.
The number of conditions I have can change, and I'm implementing a function for this, so an equivalent would be really helpful. Is there any way to do this? I'm new to R and come from Python.
Thank you!
CodePudding user response:
Yes, you can use case_when() from dplyr in R. Conditions are checked in order, and the first one that matches wins:
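library(dplyr)

# Each formula is condition ~ value; TRUE acts as the catch-all default.
# Only the first few rows of the result are shown below.
mtcars %>%
  mutate(cyl2 = case_when(cyl > 7  ~ "High",
                          cyl == 6 ~ "Medium",
                          TRUE     ~ "Low"))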
   mpg cyl  disp  hp drat    wt  qsec vs am gear carb   cyl2
1 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 Medium
2 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 Medium
3 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1    Low
4 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 Medium
5 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2   High
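Since the question mentions that the number of conditions can change, here is a rough base R sketch of something closer to np.select(conditions, choices, default). The helper name select_first and its arguments are made up for illustration and are not part of dplyr or base R. It assumes every condition is a logical vector of the same length and every choice is a single value:

```r
# Hypothetical helper (not a real library function): mimics np.select.
# conditions: list of logical vectors, all of the same length
# choices:    one value per condition (each assumed to be length 1)
# default:    value used where no condition is TRUE
select_first <- function(conditions, choices, default = NA) {
  stopifnot(length(conditions) == length(choices))
  out <- rep(default, length(conditions[[1]]))
  # Go through the conditions in reverse so earlier ones overwrite later ones,
  # which gives "first match wins" semantics.
  for (i in rev(seq_along(conditions))) {
    hit <- conditions[[i]] & !is.na(conditions[[i]])  # treat NA as no match
    out[hit] <- choices[[i]]
  }
  out
}

# Same rule as the case_when() example, but built from lists
conditions <- list(mtcars$cyl > 7, mtcars$cyl == 6)
choices    <- c("High", "Medium")
cyl2 <- select_first(conditions, choices, default = "Low")
```

Because conditions and choices are ordinary lists/vectors, they can be built up in a loop or passed into your own function, which is the part that gets awkward with a hand-written nested ifelse.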
CodePudding user response:
There's also cut() ("Convert Numeric to Factor"), with or without your own labels:
df <- data.frame(a = 1:10)

# With custom labels
df$b <- cut(df$a,
            breaks = c(-Inf, 3, 7, Inf),
            labels = c("lo", "med", "hi"))

# Without labels: the interval itself is used as the level name
df$c <- cut(df$a,
            breaks = c(-Inf, 3, 7, Inf))
df
#>     a   b        c
#> 1   1  lo (-Inf,3]
#> 2   2  lo (-Inf,3]
#> 3   3  lo (-Inf,3]
#> 4   4 med    (3,7]
#> 5   5 med    (3,7]
#> 6   6 med    (3,7]
#> 7   7 med    (3,7]
#> 8   8  hi (7, Inf]
#> 9   9  hi (7, Inf]
#> 10 10  hi (7, Inf]
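If the cut points themselves vary from call to call, the cut() approach can be wrapped in a small function. This is only a sketch: label_by_breaks, cutpoints and labels are hypothetical names, and it assumes the conditions are intervals on a single numeric variable (anything more general needs a different approach, such as case_when):

```r
# Hypothetical wrapper around cut(): cutpoints are the interior boundaries,
# so there must be exactly one more label than cutpoints.
label_by_breaks <- function(x, cutpoints, labels) {
  stopifnot(length(labels) == length(cutpoints) + 1)
  # -Inf and Inf make the outer intervals open-ended; intervals are
  # right-closed by default, e.g. (3, 7]
  cut(x, breaks = c(-Inf, cutpoints, Inf), labels = labels)
}

b <- label_by_breaks(1:10, cutpoints = c(3, 7), labels = c("lo", "med", "hi"))
```

Note that the result is a factor, not a character vector; wrap it in as.character() if you need plain strings.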