dplyr Replace specific cases in a column based on row conditions, leaving the other cases untouched

Time:03-07

How can I apply two (or more) replacements in one column based on the values in another column? For example, given the following data frame:

data <- data.frame(x1 = c(1, 2, 3, 2, 4, 1, 2, 3, 4, 2),
                   x2 = c(4, 36, 45, 22, 36, 1, 11, 12, 31, 22))

I want to apply 2 conditions:

  • If x1 = 1, replace the corresponding value in x2 with "YES"
  • If x1 = 2, replace the corresponding value in x2 with "NO"
  • Leave all other values unchanged, i.e. keep the original values in x2

I wanted to combine "replace" with "case_when" for this, but I did not find a solution. I then used "mutate" with "case_when", but it replaced the values I wanted to leave untouched with NA. (I want to use case_when or something similar to keep my script short: this is just an example, but I may need to substitute many cases and I do not want to repeat "replace" many times in my pipe.)

I used the following:

data %>%
  mutate(x2 = case_when(x1 %in% 1 ~ "YES",
                        x1 %in% 2 ~ "NO"))

What I obtained was x2 with YES and NO in the right rows, but NA in every row I wanted to leave untouched.

How can I solve this? Thank you in advance.


CodePudding user response:

You almost had it.

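case_when returns NA for rows that match none of the conditions. To leave those rows as they are, add a final catch-all condition that returns the original value. Because YES and NO are strings, x2 has to be converted to character there: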

data %>%
  mutate(x2 = case_when(x1 %in% 1 ~ 'YES',
                        x1 %in% 2 ~ 'NO',
                        TRUE ~ as.character(x2)))


   x1  x2
1   1 YES
2   2  NO
3   3  45
4   2  NO
5   4  36
6   1 YES
7   2  NO
8   3  12
9   4  31
10  2  NO
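
To see why the catch-all matters, here is a small sketch that calls case_when directly on the columns of the question's data frame, once without and once with the fallback. The names no_fallback and with_fallback are only illustrative.

```r
library(dplyr)

data <- data.frame(x1 = c(1, 2, 3, 2, 4, 1, 2, 3, 4, 2),
                   x2 = c(4, 36, 45, 22, 36, 1, 11, 12, 31, 22))

# Without a fallback, rows that match no condition come back as NA
no_fallback <- case_when(data$x1 == 1 ~ "YES",
                         data$x1 == 2 ~ "NO")

# With a fallback, unmatched rows keep their original value (as text)
with_fallback <- case_when(data$x1 == 1 ~ "YES",
                           data$x1 == 2 ~ "NO",
                           TRUE ~ as.character(data$x2))

print(no_fallback)
print(with_fallback)
```

In dplyr 1.1.0 and later, the fallback can, as far as I know, also be written as .default = as.character(x2) instead of the TRUE ~ line.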

CodePudding user response:

A base R option using match and replace:

transform(
  data,
  x2 = {
    v <- c("yes", "no")[match(x1, c(1, 2))]
    replace(data$x2, !is.na(v), na.omit(v))
  }
)

gives

   x1  x2
1   1 yes
2   2  no
3   3  45
4   2  no
5   4  36
6   1 yes
7   2  no
8   3  12
9   4  31
10  2  no
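
If there are many values to substitute, the same match idea can be written with a named lookup vector, so that adding a case means adding one entry. This is only a sketch in base R; lookup, key and hit are hypothetical names, not something taken from the answer above.

```r
data <- data.frame(x1 = c(1, 2, 3, 2, 4, 1, 2, 3, 4, 2),
                   x2 = c(4, 36, 45, 22, 36, 1, 11, 12, 31, 22))

# Hypothetical lookup table: names are x1 values, elements are the replacements
lookup <- c("1" = "YES", "2" = "NO")

key <- as.character(data$x1)   # x1 as text, so it can index lookup by name
hit <- key %in% names(lookup)  # TRUE for rows that have a replacement

data$x2 <- as.character(data$x2)          # the column has to hold text
data$x2[hit] <- unname(lookup[key[hit]])  # overwrite only the matched rows

print(data)
```

Rows whose x1 value is not in names(lookup) are never touched, so no NA is introduced.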

CodePudding user response:

Here is a more 'fun' approach with nested ifelse, although the answers by @deschen and @ThomasIsCoding are my favorites:

library(dplyr)

data %>%
  mutate(x2 = ifelse(x1 > 2, x2,
                     ifelse(x1 == 2, "NO",
                            ifelse(x1 == 1, "YES", x2))))

   x1  x2
1   1 YES
2   2  NO
3   3  45
4   2  NO
5   4  36
6   1 YES
7   2  NO
8   3  12
9   4  31
10  2  NO
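
One side effect shared by all of the approaches here is that x2 ends up as a character column, because a single vector cannot hold both numbers and strings. A minimal base R sketch, using made-up three-element vectors rather than the full data frame:

```r
x1 <- c(1, 2, 3)
x2 <- c(4, 36, 45)

# Mixing strings and numbers silently coerces everything to character
res <- ifelse(x1 == 1, "YES", ifelse(x1 == 2, "NO", x2))
print(is.character(res))

# The untouched values can be converted back to numbers if needed
kept <- as.numeric(res[x1 > 2])
print(kept)
```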