This is my first post here and I am quite new to R.
I am having some problems when trying to categorize a variable, waist circumference, using case_when.
data_norm <- data_norm %>%
  mutate(
    Waist_C_Classification = case_when(
      Sex = 1 & Waist_C < 94 ~ "low",
      Sex = 1 & Waist_C <= 102 ~ "medium",
      Sex = 1 & Waist_C > 102 ~ "high",
      Sex = 2 & Waist_C < 80 ~ "low",
      Sex = 2 & Waist_C <= 88 ~ "medium",
      Sex = 2 & Waist_C > 88 ~ "high"
    )
  )
Sex Waist_C Waist_C_Classification
2 86.00 low
2 73.00 low
2 94.00 medium
In this case the last one should be high, as it is Sex 2 and more than 88 cm.
I have tried using == instead of =, and using "Male" and "Female" since the variable is labelled, but I obtained the same result.
The idea would be to obtain one variable with the categories per sex.
Thanks!
CodePudding user response:
As noted by others above, using == should work perfectly:
library(tidyverse)
data_norm <- tibble(Sex = 2, Waist_C = c(86.0, 73.0, 94.0))
data_norm %>%
  mutate(
    Waist_C_Classification = case_when(
      Sex == 1 & Waist_C < 94 ~ "low",
      Sex == 1 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102 ~ "high",
      Sex == 2 & Waist_C < 80 ~ "low",
      Sex == 2 & Waist_C <= 88 ~ "medium",
      Sex == 2 & Waist_C > 88 ~ "high"
    )
  )
#> # A tibble: 3 × 3
#> Sex Waist_C Waist_C_Classification
#> <dbl> <dbl> <chr>
#> 1 2 86 medium
#> 2 2 73 low
#> 3 2 94 high
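As for why = gives the output shown in the question: inside a function call, R reads Sex = ... as a named argument, so Sex is no longer part of the condition. The output in the question is consistent with case_when ignoring that name and testing only the waist threshold. Here is a minimal base-R sketch of the parsing step, where f is a hypothetical placeholder function name used only so the call can be captured without running it:

```r
# Capture the call without evaluating it (f is a hypothetical placeholder)
call_with_equals <- quote(f(Sex = 1 & Waist_C < 94 ~ "low"))

# "Sex" has become the name of the argument, not part of the test
arg_names <- names(as.list(call_with_equals))
arg_names[2]
#> [1] "Sex"

# The left-hand side of the formula only mentions Waist_C
lhs <- call_with_equals[[2]][[2]]
all.vars(lhs)
#> [1] "Waist_C"

# What the first two conditions effectively test for the question's rows
Waist_C <- c(86, 73, 94)
low_without_sex <- 1 & Waist_C < 94
medium_without_sex <- 1 & Waist_C <= 102
low_without_sex
#> [1]  TRUE  TRUE FALSE
medium_without_sex
#> [1] TRUE TRUE TRUE
```

With the Sex test gone, 86 and 73 match the first rule ("low") and 94 falls through to the second ("medium"), which is exactly the output in the question.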
CodePudding user response:
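As everyone here has mentioned, == solves most of the issue. I think the code could also benefit from an explicit lower bound on some of the conditions. For example, you have Waist_C < 94 ~ "low" followed by Waist_C <= 102 ~ "medium" for Sex 1, so any value below 94 satisfies both conditions. case_when uses the first condition that matches, so such a value is still classified as "low", but only because of the order in which the conditions are written. Stating both bounds makes each category independent of that order.
Try this: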
library(tidyverse)
data_norm <- data.frame(Sex = c(1, 1, 2, 2, 1, 2), Waist_C = c(86, 96, 104, 94, 73, 88))
data_norm <- data_norm %>%
  mutate(
    # Each condition states both bounds; cut-offs match the question (94/102 and 80/88)
    Waist_C_Classification = case_when(
      Sex == 1 & Waist_C < 94 ~ "low",
      Sex == 1 & Waist_C >= 94 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102 ~ "high",
      Sex == 2 & Waist_C < 80 ~ "low",
      Sex == 2 & Waist_C >= 80 & Waist_C <= 88 ~ "medium",
      Sex == 2 & Waist_C > 88 ~ "high"
    )
  )
data_norm
Result:
Sex Waist_C Waist_C_Classification
1 86 low
1 96 medium
2 104 high
2 94 high
1 73 low
2 88 medium
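To check how much the explicit lower bounds matter, the two styles can be compared on values that sit on or next to each cut-off. This is only a sketch: the check data frame and its values are made up for illustration, and the column names implicit and explicit are hypothetical.

```r
library(dplyr)

# Hypothetical test values placed on or next to each threshold
check <- tibble(
  Sex     = c(1, 1, 1, 2, 2, 2),
  Waist_C = c(93.9, 94, 102.1, 79.9, 88, 88.1)
)

check <- check %>%
  mutate(
    # Relies on case_when taking the first matching condition
    implicit = case_when(
      Sex == 1 & Waist_C < 94   ~ "low",
      Sex == 1 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102  ~ "high",
      Sex == 2 & Waist_C < 80   ~ "low",
      Sex == 2 & Waist_C <= 88  ~ "medium",
      Sex == 2 & Waist_C > 88   ~ "high"
    ),
    # States both bounds, so the order of conditions no longer matters
    explicit = case_when(
      Sex == 1 & Waist_C < 94                   ~ "low",
      Sex == 1 & Waist_C >= 94 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102                  ~ "high",
      Sex == 2 & Waist_C < 80                   ~ "low",
      Sex == 2 & Waist_C >= 80 & Waist_C <= 88  ~ "medium",
      Sex == 2 & Waist_C > 88                   ~ "high"
    )
  )

# TRUE when both styles classify every row the same way
same_result <- identical(check$implicit, check$explicit)
same_result
```

Both columns should agree as long as the conditions stay in low, medium, high order. The explicit version is mainly a safeguard against someone reordering the conditions later.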