Case_when - Not returning the correct values

Time:12-16

This is my first post here and I am quite new to R.

I am having some problems when trying to categorize a variable, waist circumference, using case_when.

data_norm <- data_norm  %>%
  mutate(
    Waist_C_Classification = case_when(
      Sex = 1 & Waist_C < 94 ~ "low",
      Sex = 1 & Waist_C <= 102 ~ "medium",
      Sex = 1 & Waist_C > 102 ~ "high",
      Sex = 2 & Waist_C < 80 ~ "low",
      Sex = 2 & Waist_C <= 88 ~ "medium",      
      Sex = 2 & Waist_C > 88 ~ "high"
    )     
)

Sex    Waist_C  Waist_C_Classification
2      86.00    low
2      73.00    low
2      94.00    medium

In this case the last one should be high, as it is Sex 2 and more than 88 cm.

I have tried to use == instead of = and to use "Male" and "Female", as the variable is labelled, but I obtained the same result.

The idea would be to obtain one variable with the categories per sex.

Thanks!

CodePudding user response:

As noted by others above, using == should work perfectly:

library(tidyverse)
data_norm <- tibble(Sex = 2, Waist_C = c(86.0, 73.0, 94.0))

data_norm  %>%
  mutate(
    Waist_C_Classification = case_when(
      Sex == 1 & Waist_C < 94 ~ "low",
      Sex == 1 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102 ~ "high",
      Sex == 2 & Waist_C < 80 ~ "low",
      Sex == 2 & Waist_C <= 88 ~ "medium",      
      Sex == 2 & Waist_C > 88 ~ "high"
    )     
  )
#> # A tibble: 3 × 3
#>     Sex Waist_C Waist_C_Classification
#>   <dbl>   <dbl> <chr>                 
#> 1     2      86 medium                
#> 2     2      73 low                   
#> 3     2      94 high
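
A small base R sketch of why = and == behave so differently inside a function call may help here. The helper arg_names() is hypothetical and exists only for this illustration; the vectors waist and sex are made-up stand-ins for the columns in the question. Inside a call, = names an argument instead of comparing, so the Sex test is never evaluated. That would be consistent with the output in the question, where the Sex 1 thresholds appear to be applied to every row.

```r
# Hypothetical helper, only for illustration:
# it returns the names of the arguments it receives.
arg_names <- function(...) names(list(...))

waist <- c(86, 73, 94)  # made-up stand-in for the Waist_C column
sex   <- c(2, 2, 2)     # made-up stand-in for the Sex column

# With `=`, "Sex" becomes the name of an argument; nothing is compared.
arg_names(Sex = 1 & waist < 94)
#> [1] "Sex"

# What is left of the condition is `1 & waist < 94`. The 1 is coerced to TRUE,
# so it reduces to `waist < 94`, whatever the value of Sex.
1 & waist < 94
#> [1]  TRUE  TRUE FALSE

# With `==`, Sex is actually compared, element by element.
sex == 1 & waist < 94
#> [1] FALSE FALSE FALSE
```
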

CodePudding user response:

As everyone here has mentioned, == solves most of the issue. I think the code can also benefit from an explicit lower bound in some of the conditions. For example, you have Sex = 1 & Waist_C < 94 ~ "low" followed by Sex = 1 & Waist_C <= 102 ~ "medium", so any value below 94 satisfies both conditions. case_when uses the first condition that matches, so the result then depends on the order in which the conditions are written.

Try this:

library(tidyverse)

data_norm <- data.frame(Sex = c(1, 1, 2, 2, 1, 2), Waist_C = c(86, 96, 104, 94, 73, 88))


data_norm <- data_norm  %>%
  mutate(
    Waist_C_Classification = case_when(
      # Sex 1: low below 94, medium from 94 to 102, high above 102
      Sex == 1 & Waist_C < 94 ~ "low",
      Sex == 1 & Waist_C >= 94 & Waist_C <= 102 ~ "medium",
      Sex == 1 & Waist_C > 102 ~ "high",
      # Sex 2: low below 80, medium from 80 to 88, high above 88
      Sex == 2 & Waist_C < 80 ~ "low",
      Sex == 2 & Waist_C >= 80 & Waist_C <= 88 ~ "medium",
      Sex == 2 & Waist_C > 88 ~ "high"
    )
  )

data_norm

Result:


  Sex Waist_C Waist_C_Classification
   1      86                    low
   1      96                 medium
   2     104                   high
   2      94                   high
   1      73                    low
   2      88                 medium
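
The point about overlapping conditions can be checked directly. The sketch below uses a made-up vector waist and the Sex 1 thresholds from the question, and compares ordered conditions without lower bounds against the same thresholds with an explicit lower bound. Because case_when takes the first condition that matches, both versions should agree as long as the conditions stay in this order; the explicit bounds mainly make each line correct on its own if the order ever changes.

```r
library(dplyr)

waist <- c(73, 94, 102, 104)  # made-up values around the thresholds

# Ordered conditions without explicit lower bounds:
# a value below 94 also satisfies `<= 102`, but the first match wins.
implicit <- case_when(
  waist < 94   ~ "low",
  waist <= 102 ~ "medium",
  waist > 102  ~ "high"
)

# Same thresholds, with the lower bound of "medium" written out.
explicit <- case_when(
  waist < 94                 ~ "low",
  waist >= 94 & waist <= 102 ~ "medium",
  waist > 102                ~ "high"
)

identical(implicit, explicit)
#> [1] TRUE
```
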