How to group individual values from an existing variable into a new variable in R

Time:09-07

I'm new to R and I'm stuck. I'm working on a health dataset with each row as one patient's information.

I have a variable called diag_code. It holds the patient's medical condition as a diagnostic code (a number). I want to group the individual condition codes into broader categories (heart disease, resp disease, liver disease) and store the result in a new variable.

E.g. I know that 1,2,3,4,84 are all respiratory diseases. I also know that 5, 6, 7, 32, 56 are all cardiovascular diseases. I want to create a new variable called diagnosis.

diag_code diagnosis
1 "resp disease"
2 "resp disease"
56 "CVD disease"
3 "resp disease"
4 "resp disease"
84 "resp disease"
5 "CVD disease"
6 "CVD disease"
7 "CVD disease"
32 "CVD disease"

I have tried to use case_when() and mutate(), or ifelse() and mutate(), but they usually involve a single true or false condition.

I want to be able to do something like this (I know this is incorrect):

data <- data %>%
mutate(diagnosis = case_when(diag_code==c(1,2,3,5,84)) ~ "Resp disease",
                   case_when(diag_code==c(5,6,7,32,56)) ~ "CVD disease", 
                   TRUE ~ "Unknown)

Answer:

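Two things need to change to make this work:

First, use a single case_when() call with one condition ~ value pair per category, rather than a separate case_when() for each. Second, to test whether a value appears in a vector of codes, use %in% instead of ==, because == compares element by element and does not test membership. It should then look like this: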

library(dplyr)

data <- data %>%
  mutate(diagnosis = case_when(
    diag_code %in% c(1, 2, 3, 4, 84)  ~ "Resp disease",  # respiratory codes
    diag_code %in% c(5, 6, 7, 32, 56) ~ "CVD disease",   # cardiovascular codes
    TRUE                              ~ "Unknown"        # any other code
  ))
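
To see the difference between == and %in% and to check the result, here is a minimal self-contained sketch. It assumes dplyr is installed. The patients data frame, the resp_codes and cvd_codes names, and the extra code 99 are hypothetical and used only for illustration; they are not part of the original dataset.

```r
library(dplyr)

# Hypothetical sample data: one row per patient, 99 is an unlisted code.
patients <- data.frame(diag_code = c(1, 2, 56, 3, 4, 84, 5, 6, 7, 32, 99))

# == compares element by element, so it is not a membership test.
c(56, 1) == c(1, 56)    # FALSE FALSE
# %in% asks whether each left-hand value occurs anywhere on the right.
c(56, 1) %in% c(1, 56)  # TRUE TRUE

# Naming the code groups keeps the case_when() call readable (assumed names).
resp_codes <- c(1, 2, 3, 4, 84)
cvd_codes  <- c(5, 6, 7, 32, 56)

patients <- patients %>%
  mutate(diagnosis = case_when(
    diag_code %in% resp_codes ~ "Resp disease",
    diag_code %in% cvd_codes  ~ "CVD disease",
    TRUE                      ~ "Unknown"   # fallback for unlisted codes
  ))

print(patients)
```

Conditions in case_when() are evaluated in order and the first match wins, so a code listed in two groups by mistake would silently get the first label. Keeping the groups in named vectors makes such overlaps easier to spot.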