Function breaks when looped within dplyr::case_when()

Time:09-16

I have a function that extracts the min or max of a range of values (within a character string) and appears to work fine on individual cases.

However, when I try to use it within case_when() it does not behave as expected.

Reproducible example

library(dplyr)
library(tibble)
library(stringr)


val_from_range <- function(.str, .fun = "min"){
  str_extract_all(.str, "\\d*\\.?\\d+") |> 
    unlist() |> 
    as.numeric() |> 
    (\(x) if (.fun == "min") x |> min() 
     else if (.fun == "max") x |> max())()
  
}

tibble(x = c("5-6", "4", "6-9", "5", "NA")) |> 
  mutate(min = case_when(str_detect(x, "-") ~ val_from_range(x, "min"))) |> 
  mutate(max = case_when(str_detect(x, "-") ~ val_from_range(x, "max")))

# A tibble: 5 x 3
  x       min   max
  <chr> <dbl> <dbl>
1 5-6       4     9
2 4        NA    NA
3 6-9       4     9
4 5        NA    NA
5 NA       NA    NA

However, I want:

# A tibble: 5 x 3
  x       min   max
  <chr> <dbl> <dbl>
1 5-6       5     6
2 4        NA    NA
3 6-9       6     9
4 5        NA    NA
5 NA       NA    NA

The function performs as expected on individual cases

> val_from_range("5-6", "min")
[1] 5
> val_from_range("5-6", "max")
[1] 6
> val_from_range("5-6-8-10", "max")
[1] 10

Any help would be greatly appreciated. Thanks in advance.

Answer:

Two changes are required. First, the function is not vectorised: it works for only one string at a time. If you pass in more than one string, it pools the numbers from all of them and returns a single value rather than one value per string.

val_from_range("5-6", "min")
#[1] 5

val_from_range(c("5-6", "8-10"), "min")
#[1] 5
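
The pooling can be seen by running the pipeline step by step. This is only an illustrative sketch that reuses the regex from the question; the names x, parts and pooled are made up for the example.

```r
library(stringr)

x <- c("5-6", "8-10")

# str_extract_all() returns a list with one character vector per input string
parts <- str_extract_all(x, "\\d*\\.?\\d+")

# unlist() merges those vectors, so the per-string grouping is lost
pooled <- as.numeric(unlist(parts))

# min() then sees all four numbers and returns one value for the whole input
min(pooled)
#[1] 5
```

Inside mutate(), that single value is recycled to every row, which is why rows 1 and 3 in the question both show 4 and 9.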

To pass the strings one at a time, you can use rowwise(). Second, case_when() still evaluates the function for values that do not satisfy the condition, so it gives a warning for the "NA" value. We can use if/else here to avoid that.

library(dplyr)
library(stringr)

tibble(x = c("5-6", "4", "6-9", "5", "NA")) %>%
  rowwise() %>%
  mutate(min = if (str_detect(x, "-")) val_from_range(x, "min") else NA_real_,
         max = if (str_detect(x, "-")) val_from_range(x, "max") else NA_real_) %>%
  ungroup()

#   x       min   max
#  <chr> <dbl> <dbl>
#1 5-6       5     6
#2 4        NA    NA
#3 6-9       6     9
#4 5        NA    NA
#5 NA       NA    NA
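
An alternative that may be worth considering is to vectorise the function itself, so that it returns one value per input string and can be used in case_when() without rowwise(). The sketch below is one possible way to do that, not the original author's approach: val_from_range_vec is a hypothetical name, and returning NA when a string contains no digits is an assumption about the desired behaviour.

```r
library(dplyr)
library(stringr)

# Hypothetical vectorised variant: returns one value per input string
val_from_range_vec <- function(.str, .fun = c("min", "max")) {
  .fun <- match.arg(.fun)
  f <- if (.fun == "min") min else max
  # One character vector of matches per element of .str
  nums <- str_extract_all(.str, "\\d*\\.?\\d+")
  vapply(nums, function(v) {
    # Assumption: strings without any digits yield NA instead of Inf/-Inf
    if (length(v) == 0) NA_real_ else f(as.numeric(v))
  }, numeric(1))
}

res <- tibble(x = c("5-6", "4", "6-9", "5", "NA")) |>
  mutate(min = case_when(str_detect(x, "-") ~ val_from_range_vec(x, "min")),
         max = case_when(str_detect(x, "-") ~ val_from_range_vec(x, "max")))
```

Because the empty-match case is handled inside the function, case_when() can evaluate the right-hand side for every row without producing a warning for the "NA" string.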