Create new column based on multiple conditions in R

Time:05-31

I have a data frame in which I want to create a new column ("path") based on two existing columns ("th" and "br"). If the value in "th" is NA, I want to base the new column on the columns "ce" and "br" instead.

A reproducible sample of the data can be created with the following code:

df <- structure(list(
  th = c(3, 1, NA, 2, 2, 0, 3, 3, 0, 2, 3, 2, 1, NA, 3, 4, 3, 3, 1, 3), 
  br = c(1, 2, 4, 1, 2, 2, 1, 2, 2, 5, 4, 1, 1, 2, 1, 5, 2, 1, 1, 1), 
  ce = c(2, 3, 2, 0, 1, 0, 2, 1, 1, 1, 1, 0, 0, 1, 2, 0, 0, 1, 1, 2)), 
  row.names = c(NA, 20L), class = "data.frame")

I have tried the following code with an if/else statement:

df <- df %>% if(!is.na(th)) {
  mutate(path = case_when(
    (th %in% c(0:2) & br %in% c(0:3) ~ "Low"), 
    (th %in% c(3:5) & br %in% c(0:3) ~ "Intermediate"),
    (th %in% c(3:5) & br %in% c(4:6) ~ "High")))
} else {
  mutate(path = case_when(
    (ce %in% c(0:1) & br %in% c(0:3) ~ "Low"), 
    (ce %in% c(1:3) & br %in% c(3) ~ "Intermediate"),
    (ce %in% c(2:3) & br %in% c(4:6) ~ "High")))
}

This results in an error "Error in if (.) !is.na(th) else { : the condition has length > 1"

What is wrong with the code? Any suggestions how to improve the code or for an alternative solution?

CodePudding user response:

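if is not vectorized: it needs a single TRUE or FALSE, so it cannot choose a branch row by row. Piping into if also makes the whole data frame the condition, as the "if (.)" in the error message suggests. Use the vectorized ifelse() inside mutate() instead: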

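library(dplyr)

# ifelse() is vectorized, so it picks a branch for each row.
# Rows that match no rule get NA. Assign the result (df <- ...) to keep it.
df %>%
  mutate(path = ifelse(!is.na(th),
                       # th is present: classify on th and br
                       case_when(
                         th %in% 0:2 & br %in% 0:3 ~ "Low",
                         th %in% 3:5 & br %in% 0:3 ~ "Intermediate",
                         th %in% 3:5 & br %in% 4:6 ~ "High"),
                       # th is NA: classify on ce and br
                       case_when(
                         ce %in% 0:1 & br %in% 0:3 ~ "Low",
                         ce %in% 1:3 & br %in% 3 ~ "Intermediate",
                         ce %in% 2:3 & br %in% 4:6 ~ "High")))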
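
As a possible alternative, the two branches can be folded into a single case_when() by guarding the "ce" rules with is.na(th). This is only a sketch, assuming dplyr is installed. The seven-row data frame is an illustrative subset of the question's sample data (rows 1 to 4, 14, 16 and 10), not the full data set.

```r
library(dplyr)

# Illustrative subset of the question's sample data
df <- data.frame(
  th = c(3, 1, NA, 2, NA, 4, 2),
  br = c(1, 2, 4, 1, 2, 5, 5),
  ce = c(2, 3, 2, 0, 1, 0, 1)
)

df <- df %>%
  mutate(path = case_when(
    # `NA %in% x` is FALSE, so rows with a missing th skip these rules
    th %in% 0:2 & br %in% 0:3 ~ "Low",
    th %in% 3:5 & br %in% 0:3 ~ "Intermediate",
    th %in% 3:5 & br %in% 4:6 ~ "High",
    # Fallback rules, applied only where th is missing
    is.na(th) & ce %in% 0:1 & br %in% 0:3 ~ "Low",
    is.na(th) & ce %in% 1:3 & br %in% 3   ~ "Intermediate",
    is.na(th) & ce %in% 2:3 & br %in% 4:6 ~ "High"
    # Rows matching no rule are left as NA
  ))

print(df)
```

The is.na(th) guard keeps the "ce" rules from firing on rows where "th" is present but matches none of the "th" rules, which is the same behaviour as the ifelse() version: such rows (the last one here) stay NA.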