I have a function that extracts the min or max of a range of values (within a character string) that appears to work fine on individual cases.
However, when I try to use it within case_when() it does not behave as expected.
Reproducible example
library(dplyr)
library(tibble)
library(stringr)
val_from_range <- function(.str, .fun = "min"){
  # pull every number (integer or decimal) out of the string(s)
  str_extract_all(.str, "\\d*\\.?\\d+") |>
    unlist() |>
    as.numeric() |>
    (\(x) if (.fun == "min") x |> min()
     else if (.fun == "max") x |> max())()
}
tibble(x = c("5-6", "4", "6-9", "5", "NA")) |>
mutate(min = case_when(str_detect(x, "-") ~ val_from_range(x, "min"))) |>
mutate(max = case_when(str_detect(x, "-") ~ val_from_range(x, "max")))
# A tibble: 5 x 3
  x       min   max
  <chr> <dbl> <dbl>
1 5-6       4     9
2 4        NA    NA
3 6-9       4     9
4 5        NA    NA
5 NA       NA    NA
However, I want:
# A tibble: 5 x 3
  x       min   max
  <chr> <dbl> <dbl>
1 5-6       5     6
2 4        NA    NA
3 6-9       6     9
4 5        NA    NA
5 NA       NA    NA
The function performs as expected on individual cases:
> val_from_range("5-6", "min")
[1] 5
> val_from_range("5-6", "max")
[1] 6
> val_from_range("5-6-8-10", "max")
[1] 10
Any help would be greatly appreciated. Thanks in advance.
CodePudding user response:
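A couple of changes are required. The function is not vectorised: it returns a single value no matter how many strings you pass in. unlist() pools the numbers from every element, so min() or max() is taken over all of them together, which is why every matching row in your output shows 4 and 9.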
val_from_range("5-6", "min")
#[1] 5
val_from_range(c("5-6", "8-10"), "min")
#[1] 5
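To see where the single value comes from, the intermediate steps can be run by hand. This is only a sketch: it assumes the pattern in the function is "\\d*\\.?\\d+", and the variable names matches and pooled are made up for illustration.

```r
library(stringr)

x <- c("5-6", "8-10")

# str_extract_all() returns a list with one character vector per input string
matches <- str_extract_all(x, "\\d*\\.?\\d+")

# unlist() flattens that list, so the link between numbers and rows is lost
pooled <- as.numeric(unlist(matches))
pooled       # 5 6 8 10

min(pooled)  # 5
max(pooled)  # 10
```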
To pass the values one at a time you can use rowwise(). Secondly, case_when() still evaluates the function for rows that do not satisfy the condition. Under rowwise() the "NA" row contains no digits, so min() is called on an empty vector and raises a warning. We can use if/else here to avoid that.
library(dplyr)
library(stringr)
tibble(x = c("5-6", "4", "6-9", "5", "NA")) %>%
  rowwise() %>%
  mutate(min = if(str_detect(x, "-")) val_from_range(x, "min") else NA,
         max = if(str_detect(x, "-")) val_from_range(x, "max") else NA) %>%
  ungroup()
# x min max
# <chr> <dbl> <dbl>
#1 5-6 5 6
#2 4 NA NA
#3 6-9 6 9
#4 5 NA NA
#5 NA NA NA
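If you would rather avoid rowwise(), another option is to make the function itself return one value per input string. The following is a rough sketch, not a drop-in replacement: val_from_range_vec is a hypothetical name, and it assumes that a string containing no digits should give NA.

```r
library(dplyr)
library(stringr)

# Hypothetical vectorised variant: one result per element of .str
val_from_range_vec <- function(.str, .fun = c("min", "max")) {
  .fun <- match.arg(.fun)
  f <- if (.fun == "min") min else max
  # list with one character vector of matches per input string
  nums <- str_extract_all(.str, "\\d*\\.?\\d+")
  # apply min/max within each element; no digits found gives NA, not a warning
  vapply(nums, function(m) {
    if (length(m) == 0) NA_real_ else f(as.numeric(m))
  }, numeric(1))
}

val_from_range_vec(c("5-6", "8-10"), "min")
# 5 8

res <- tibble(x = c("5-6", "4", "6-9", "5", "NA")) %>%
  mutate(min = if_else(str_detect(x, "-"), val_from_range_vec(x, "min"), NA_real_),
         max = if_else(str_detect(x, "-"), val_from_range_vec(x, "max"), NA_real_))
```

Because the helper already returns a vector as long as its input, it should also work inside case_when(). if_else() is used here only because there is a single condition.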