Error in using custom vectorized function in mutate/case_when

Time:02-18

Below is a minimal example that reproduces the error. I define a simple function, vectorize it with a wrapper that uses purrr::map2_dbl, and then call it inside mutate/case_when, where the condition should ensure that the arguments are valid. The error occurs in the condition if (arg1 > 0) when arg1 is NA, but I don't understand why that row is evaluated at all. If I apply the filter (commented out below), the error disappears. Does anyone have an idea what I'm doing wrong? My feeling is that it should work.

library(tidyverse)

f_single <- function(arg1, arg2) {
  if (arg1 > 0) {
    return(arg1 * arg2)
  }
}

f_vector <- function(arg1, arg2) {
  result <- map2_dbl(arg1, arg2, f_single)
  return(result)
}

x <- tribble(~ arg1, ~ arg2,
             NA, 1,
             2, 3,
             4, 5,)

x %>%
  # filter(!is.na(arg1)) %>%
  mutate(y = case_when(arg1 > 0 ~ f_vector(arg1, arg2)))

The error is the following:

Error in `mutate()`:
! Problem while computing `y = case_when(arg1 > 0 ~ f_vector(arg1, arg2))`.
Caused by error in `if (arg1 > 0) ...`:
! missing value where TRUE/FALSE needed

CodePudding user response:

Two issues:

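  1. An if condition must be a single TRUE or FALSE, so if (NA > 0) throws "missing value where TRUE/FALSE needed". The case_when condition does not prevent this: case_when evaluates the right-hand side f_vector(arg1, arg2) for every row and only afterwards keeps the values where arg1 > 0 is TRUE, so f_single still receives the NA. You can avoid the error by wrapping the condition in isTRUE().
  2. Your code will still throw an error after that, because f_single then returns NULL when arg1 is missing or <= 0, but map2_dbl() expects a single double for every pair of inputs.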
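
Both points can be checked in isolation with base R. The following is only a small sketch; f_no_else is a hypothetical stand-in for the original f_single with the isTRUE() fix applied but no else branch:

```r
# if () needs a single TRUE or FALSE; NA is neither, so this raises an error.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  if (NA > 0) "positive",
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() returns FALSE for NA, so it is safe to use inside if ().
print(isTRUE(NA > 0))
print(isTRUE(2 > 0))

# Hypothetical helper: an if without an else branch yields NULL
# whenever the condition is FALSE.
f_no_else <- function(arg1, arg2) {
  if (isTRUE(arg1 > 0)) {
    arg1 * arg2
  }
}
print(is.null(f_no_else(NA, 1)))
print(f_no_else(2, 3))
```

The NULL in the last case is what map2_dbl() cannot turn into a double, which is why an explicit else branch is needed.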

Changing f_single as follows will solve both problems:

f_single <- function(arg1, arg2) {
  if (isTRUE(arg1 > 0)) {
    arg1 * arg2
  } else {
    NA_real_
  }
}

# rest of code unchanged from original

# # A tibble: 3 x 3
#    arg1  arg2     y
#   <dbl> <dbl> <dbl>
# 1    NA     1    NA
# 2     2     3     6
# 3     4     5    20
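
As a side note on why the commented-out filter in the question avoids the error: case_when() computes each right-hand side on the whole column before it selects values by condition, so the NA row reaches the function unless it is removed first. A rough sketch of that behaviour, assuming dplyr is installed; doubled and n_seen are hypothetical names used only for illustration:

```r
library(dplyr)

# Hypothetical helper that records how many elements it was asked to process.
n_seen <- 0
doubled <- function(x) {
  n_seen <<- n_seen + length(x)
  x * 2
}

arg1 <- c(NA, 2, 4)

# The right-hand side is computed for the whole vector, including the NA
# element; case_when() then keeps the values where the condition is TRUE.
y <- case_when(arg1 > 0 ~ doubled(arg1))

print(y)
print(n_seen)  # counts all three elements, not just the two positive ones
```

So the condition in case_when() only decides which results are kept; it does not guard the function call itself.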