I have a vectorized function defined like this:
hai_hyperbolic_vec <- function(.x, .scale_type = c("sin","cos","tan","sincos")){
scale_type = base::as.character(.scale_type)
term = .x
if (scale_type == "sin"){
ret <- base::sin(term)
} else if (scale_type == "cos") {
ret <- base::cos(term)
} else if (scale_type == "tan") {
ret <- base::tan(term)
} else if (scale_type == "sincos") {
ret <- base::sin(term) * base::cos(term)
}
return(ret)
}
This works just fine.
library(tidyverse)
len_out = 10
by_unit = "month"
start_date = as.Date("2021-01-01")
data_tbl <- tibble(
date_col = seq.Date(from = start_date, length.out = len_out, by = by_unit),
a = rnorm(len_out),
b = runif(len_out)
)
hai_hyperbolic_vec(data_tbl$b, .scale_type = "sin")
> hai_hyperbolic_vec(data_tbl$b, .scale_type = "sin")
[1] 0.02405150 0.40920185 0.39953987 0.16234068 0.04183186 0.57301045 0.74441929 0.60728533
[9] 0.69755824 0.46611496
I have another function that will augment a data.frame/tibble
.
hai_hyperbolic_augment <- function(.data
, .value
, .names = "auto"
, .scale_type = c("sin","cos","tan","sincos")
){
column_expr <- rlang::enquo(.value)
if(rlang::quo_is_missing(column_expr)) stop(call. = FALSE, "hyperbolic_augment(.value) is missing.")
col_nms <- names(tidyselect::eval_select(rlang::enquo(.value), .data))
make_call <- function(col, scale_type){
rlang::call2(
"hai_hyperbolic_vec",
.x = rlang::sym(col)
, .scale_type = .scale_type
, .ns = "healthyR.ai"
)
}
grid <- expand.grid(
col = col_nms
, scale_type = .scale_type
, stringsAsFactors = FALSE
)
calls <- purrr::pmap(.l = list(grid$col, grid$scale_type), make_call)
if(any(.names == "auto")) {
newname <- paste0(grid$col, "_", grid$scale_type)
} else {
newname <- as.list(.names)
}
calls <- purrr::set_names(calls, newname)
ret <- tibble::as_tibble(dplyr::mutate(.data, !!!calls))
return(ret)
}
The function works but will spit out a warning message if I choose more than one .scale_type
even though the calculations are performed. I do not understand why this is happening as the vector function is being applied via purrr to a list. Can I silence this warning or is there a better way to write the function/or use the function so that this does not occur? Enforce that strictly one scale_type be called? I prefer to be able to call as I do, as a grid is made inside the augment function.
> hai_hyperbolic_augment(.data = data_tbl, .value = c(a,b), .scale_type = c("sin","tan"))
# A tibble: 10 x 7
date_col a b a_sin b_sin a_tan b_tan
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-01-01 -1.96 0.0241 -0.925 0.0241 -0.925 0.0241
2 2021-02-01 -1.03 0.422 -0.856 0.409 -0.856 0.409
3 2021-03-01 1.55 0.411 1.00 0.400 1.00 0.400
4 2021-04-01 0.108 0.163 0.108 0.162 0.108 0.162
5 2021-05-01 -0.627 0.0418 -0.587 0.0418 -0.587 0.0418
6 2021-06-01 -0.556 0.610 -0.528 0.573 -0.528 0.573
7 2021-07-01 -0.0544 0.840 -0.0544 0.744 -0.0544 0.744
8 2021-08-01 -0.714 0.653 -0.655 0.607 -0.655 0.607
9 2021-09-01 -0.646 0.772 -0.602 0.698 -0.602 0.698
10 2021-10-01 -1.06 0.485 -0.873 0.466 -0.873 0.466
Warning messages:
1: Problem with `mutate()` column `a_sin`.
i `a_sin = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used
2: Problem with `mutate()` column `b_sin`.
i `b_sin = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used
3: Problem with `mutate()` column `a_tan`.
i `a_tan = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used
4: Problem with `mutate()` column `b_tan`.
i `b_tan = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used
CodePudding user response:
The problem comes because you are passing two arguments to the .scale_type
parameter of hai_hyperbolic_vec
.
If you go into the debugger and look at the calls
object created by the line calls <- purrr::set_names(calls, newname)
you will see:
calls
#> $a_sin
healthyR.ai::hai_hyperbolic_vec(.x = a, .scale_type = c("sin", "tan"))
#>
#> $b_sin
#> healthyR.ai::hai_hyperbolic_vec(.x = b, .scale_type = c("sin", "tan"))
#>
#> $a_tan
#> healthyR.ai::hai_hyperbolic_vec(.x = a, .scale_type = c("sin", "tan"))
#>
#> $b_tan
#> healthyR.ai::hai_hyperbolic_vec(.x = b, .scale_type = c("sin", "tan"))
But inside the hai_hyperbolic_vec
function, we see the line if (scale_type == "sin")
. So, with each call as shown above in the call
object, you are passing a length-two vector to this logical test. It will only check the first member of the vector, and emit a warning to say it has done so.
You will notice that your output is actually wrong too - the a_tan
and b_tan
columns are the same as the a_sin
and b_sin
columns, because the logic means that only the sin
is being calculated.
I think this comes from a typo (the single addition of a period) in the function make_call
where you accidentally use .scale_type
when you should be using scale_type
:
make_call <- function(col, scale_type){
rlang::call2(
"hai_hyperbolic_vec",
.x = rlang::sym(col)
, .scale_type = .scale_type # <- here is the problem
, .ns = "healthyR.ai"
)
}
Should be
make_call <- function(col, scale_type){
rlang::call2(
"hai_hyperbolic_vec",
.x = rlang::sym(col)
, .scale_type = scale_type
, .ns = "healthyR.ai"
)
}
If you make this change, you get no warnings as well as a correct result:
hai_hyperbolic_augment(.data = data_tbl, .value = c(a,b), .scale_type = c("sin","tan"))
#> # A tibble: 10 x 7
#> date_col a b a_sin b_sin a_tan b_tan
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2021-01-01 -1.01 0.977 -0.849 0.829 -1.61 1.48
#> 2 2021-02-01 0.424 0.719 0.411 0.658 0.451 0.875
#> 3 2021-03-01 -0.133 0.338 -0.132 0.332 -0.134 0.352
#> 4 2021-04-01 0.259 0.238 0.256 0.235 0.265 0.242
#> 5 2021-05-01 0.631 0.110 0.590 0.109 0.731 0.110
#> 6 2021-06-01 -0.0500 0.995 -0.0500 0.839 -0.0500 1.54
#> 7 2021-07-01 0.302 0.569 0.298 0.539 0.312 0.639
#> 8 2021-08-01 -0.681 0.901 -0.629 0.784 -0.810 1.26
#> 9 2021-09-01 -0.296 0.374 -0.292 0.365 -0.305 0.393
#> 10 2021-10-01 -0.384 0.506 -0.374 0.484 -0.404 0.554