Mutate with ifelse multiple conditions in R

Time:02-24

I have a data frame like the one below:

monkey = data.frame(girl = 1:10, kn = NA, boy = 5)

I want to understand, step by step, what the following code means:

monkey %>%
  mutate(t = ifelse(is.na(kn),.[,grepl('a',names(.))],ll))

Thank you everyone in advance for your support.

CodePudding user response:

In my opinion, this is not good code, but I'll try to explain what it is doing.

  • is.na(kn) (in the context of monkey) returns a logical vector indicating whether each value in that column is NA:

    with(monkey, is.na(kn))
    #  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    
  • The . in .[,grepl(*)] refers to the data as it was at the start of this call to mutate. It would be more dplyr-canonical to use cur_data() (deprecated in dplyr 1.1.0 in favor of pick()), which is also more complete: it takes into account columns created earlier in the same mutate call, which . does not see (not a factor here). I believe this .[,*] code is trying to select a column dynamically based on the current data.

    Why this is a problem: 1. There is no column here whose name contains "a"; 2. There could be more than one column whose name contains "a", in which case the yes= argument to ifelse would produce a nested frame in the new t= column; 3. The behavior of .[,*] differs depending on whether the original frame is a base-R data.frame or the tibble variant tbl_df: compare monkey[,1] with tibble(monkey)[,1].

  • The no= argument refers to an object ll that is not defined. Intuitively, this should fail with Error: object 'll' not found or similar, but since every element of the test= argument is TRUE, the no= argument is never needed and so it is never evaluated. Compare ifelse(c(TRUE, TRUE), 1:2, stop("oops")) (no error) with ifelse(c(TRUE, FALSE), 1:2, stop("oops")) (error).
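
The lazy evaluation described in the last bullet can be checked in base R alone. This is a small sketch; the variable names all_true and mixed are made up for illustration, and tryCatch is used only so the script keeps running after the error.

```r
# ifelse() only evaluates no= when at least one element of test= is FALSE.
all_true <- ifelse(c(TRUE, TRUE), 1:2, stop("oops"))  # no= is never touched
all_true
# [1] 1 2

# With one FALSE in test=, no= is evaluated and stop() fires.
mixed <- tryCatch(
  ifelse(c(TRUE, FALSE), 1:2, stop("oops")),
  error = function(e) conditionMessage(e)  # capture the message instead of aborting
)
mixed
# [1] "oops"
```

This is why the undefined ll in the question goes unnoticed: with kn entirely NA, the test is TRUE everywhere.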

Ultimately, this code is not defensive enough to be safe (base-vs-tibble variant), and its intent is unclear.
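
To make the base-vs-tibble point concrete, here is a short sketch using the data from the question. It assumes the tibble package is installed; has_a, base_col and tbl_col are names chosen only for this illustration.

```r
library(tibble)  # assumption: tibble is installed

monkey <- data.frame(girl = 1:10, kn = NA, boy = 5)

# No column name contains "a", so the selector is FALSE for every column.
has_a <- grepl("a", names(monkey))
has_a
# [1] FALSE FALSE FALSE

# Base data.frame: selecting a single column with [, j] drops to a plain vector.
base_col <- monkey[, 1]
is.data.frame(base_col)
# [1] FALSE

# tibble: [, j] never drops, so the result is still a one-column frame.
tbl_col <- as_tibble(monkey)[, 1]
is.data.frame(tbl_col)
# [1] TRUE
```

The same .[, ...] expression can therefore hand ifelse either a vector or a frame, depending only on the class of the input.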

My advice when using dplyr is to use dplyr::if_else instead of base R's ifelse. For one, ifelse has some issues and limitations (e.g., How to prevent ifelse() from turning Date objects into numeric objects); for another, if_else protects you from ambiguous, inconsistent-results code such as in your question.
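
As a rough illustration of both points, the sketch below compares the two functions on a Date vector and on mismatched types. It assumes dplyr is installed; the dates, the cutoff and the variable names are invented for the example.

```r
library(dplyr)  # assumption: dplyr is installed

d <- as.Date(c("2024-01-01", "2024-06-01"))
cutoff <- as.Date("2024-03-01")

# Base ifelse() strips attributes, so the Date class is lost.
base_res <- ifelse(d > cutoff, d, as.Date(NA))
class(base_res)
# [1] "numeric"

# if_else() keeps the class of its true=/false= arguments.
dplyr_res <- if_else(d > cutoff, d, as.Date(NA))
class(dplyr_res)
# [1] "Date"

# Mismatched types: ifelse() silently coerces to character ...
mixed_base <- ifelse(c(TRUE, FALSE), 1L, "a")
# ... while if_else() refuses to combine integer and character.
strict_err <- tryCatch(
  if_else(c(TRUE, FALSE), 1L, "a"),
  error = function(e) "error"  # exact message wording varies by dplyr version
)
```

Note that if_else also evaluates both branches up front, so an undefined object like ll in the question would be reported instead of being skipped.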
