Home > Software design >  Problem with `mutate()`...first element used
Problem with `mutate()`...first element used

Time:11-09

I have a vectorized function defined like this:

hai_hyperbolic_vec <- function(.x, .scale_type = c("sin","cos","tan","sincos")){

    scale_type = base::as.character(.scale_type)
    term       = .x

    if (scale_type == "sin"){
        ret <- base::sin(term)
    } else if (scale_type == "cos") {
        ret <- base::cos(term)
    } else if (scale_type == "tan") {
        ret <- base::tan(term)
    } else if (scale_type == "sincos") {
        ret <- base::sin(term) * base::cos(term)
    }

    return(ret)

}

This works just fine.

library(tidyverse)

len_out    = 10
by_unit    = "month"
start_date = as.Date("2021-01-01")

data_tbl <- tibble(
    date_col = seq.Date(from = start_date, length.out = len_out, by = by_unit),
    a    = rnorm(len_out),
    b    = runif(len_out)
)

hai_hyperbolic_vec(data_tbl$b, .scale_type = "sin")
> hai_hyperbolic_vec(data_tbl$b, .scale_type = "sin")
 [1] 0.02405150 0.40920185 0.39953987 0.16234068 0.04183186 0.57301045 0.74441929 0.60728533
 [9] 0.69755824 0.46611496

I have another function that will augment a data.frame/tibble.

hai_hyperbolic_augment <- function(.data
                                   , .value
                                   , .names = "auto"
                                   , .scale_type = c("sin","cos","tan","sincos")
){

    column_expr <- rlang::enquo(.value)

    if(rlang::quo_is_missing(column_expr)) stop(call. = FALSE, "hyperbolic_augment(.value) is missing.")

    col_nms <- names(tidyselect::eval_select(rlang::enquo(.value), .data))

    make_call <- function(col, scale_type){
        rlang::call2(
            "hai_hyperbolic_vec",
            .x            = rlang::sym(col)
            , .scale_type = .scale_type
            , .ns         = "healthyR.ai"
        )
    }

    grid <- expand.grid(
        col                = col_nms
        , scale_type       = .scale_type
        , stringsAsFactors = FALSE
    )

    calls <- purrr::pmap(.l = list(grid$col, grid$scale_type), make_call)

    if(any(.names == "auto")) {
        newname <- paste0(grid$col, "_", grid$scale_type)
    } else {
        newname <- as.list(.names)
    }

    calls <- purrr::set_names(calls, newname)

    ret <- tibble::as_tibble(dplyr::mutate(.data, !!!calls))

    return(ret)

}

The function works but will spit out a warning message if I choose more than one .scale_type even though the calculations are performed. I do not understand why this is happening as the vector function is being applied via purrr to a list. Can I silence this warning or is there a better way to write the function/or use the function so that this does not occur? Enforce that strictly one scale_type be called? I prefer to be able to call as I do, as a grid is made inside the augment function.

> hai_hyperbolic_augment(.data = data_tbl, .value = c(a,b), .scale_type = c("sin","tan"))
# A tibble: 10 x 7
   date_col         a      b   a_sin  b_sin   a_tan  b_tan
   <date>       <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>
 1 2021-01-01 -1.96   0.0241 -0.925  0.0241 -0.925  0.0241
 2 2021-02-01 -1.03   0.422  -0.856  0.409  -0.856  0.409 
 3 2021-03-01  1.55   0.411   1.00   0.400   1.00   0.400 
 4 2021-04-01  0.108  0.163   0.108  0.162   0.108  0.162 
 5 2021-05-01 -0.627  0.0418 -0.587  0.0418 -0.587  0.0418
 6 2021-06-01 -0.556  0.610  -0.528  0.573  -0.528  0.573 
 7 2021-07-01 -0.0544 0.840  -0.0544 0.744  -0.0544 0.744 
 8 2021-08-01 -0.714  0.653  -0.655  0.607  -0.655  0.607 
 9 2021-09-01 -0.646  0.772  -0.602  0.698  -0.602  0.698 
10 2021-10-01 -1.06   0.485  -0.873  0.466  -0.873  0.466 
Warning messages:
1: Problem with `mutate()` column `a_sin`.
i `a_sin = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used 
2: Problem with `mutate()` column `b_sin`.
i `b_sin = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used 
3: Problem with `mutate()` column `a_tan`.
i `a_tan = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used 
4: Problem with `mutate()` column `b_tan`.
i `b_tan = healthyR.ai::hai_hyperbolic_vec(...)`.
i the condition has length > 1 and only the first element will be used

CodePudding user response:

The problem comes because you are passing two arguments to the .scale_type parameter of hai_hyperbolic_vec.

If you go into the debugger and look at the calls object created by the line calls <- purrr::set_names(calls, newname) you will see:

calls
#> $a_sin
healthyR.ai::hai_hyperbolic_vec(.x = a, .scale_type = c("sin", "tan"))
#> 
#> $b_sin
#> healthyR.ai::hai_hyperbolic_vec(.x = b, .scale_type = c("sin", "tan"))
#> 
#> $a_tan
#> healthyR.ai::hai_hyperbolic_vec(.x = a, .scale_type = c("sin", "tan"))
#> 
#> $b_tan
#> healthyR.ai::hai_hyperbolic_vec(.x = b, .scale_type = c("sin", "tan"))

But inside the hai_hyperbolic_vec function, we see the line if (scale_type == "sin"). So, with each call as shown above in the call object, you are passing a length-two vector to this logical test. It will only check the first member of the vector, and emit a warning to say it has done so.

You will notice that your output is actually wrong too - the a_tan and b_tan columns are the same as the a_sin and b_sin columns, because the logic means that only the sin is being calculated.

I think this comes from a typo (the single addition of a period) in the function make_call where you accidentally use .scale_type when you should be using scale_type:

    make_call <- function(col, scale_type){
        rlang::call2(
            "hai_hyperbolic_vec",
            .x            = rlang::sym(col)
            , .scale_type = .scale_type # <- here is the problem
            , .ns         = "healthyR.ai"
        )
    }

Should be

    make_call <- function(col, scale_type){
        rlang::call2(
            "hai_hyperbolic_vec",
            .x            = rlang::sym(col)
            , .scale_type = scale_type
            , .ns         = "healthyR.ai"
        )
    }

If you make this change, you get no warnings as well as a correct result:

hai_hyperbolic_augment(.data = data_tbl, .value = c(a,b), .scale_type = c("sin","tan"))
#> # A tibble: 10 x 7
#>    date_col         a     b   a_sin b_sin   a_tan b_tan
#>    <date>       <dbl> <dbl>   <dbl> <dbl>   <dbl> <dbl>
#>  1 2021-01-01 -1.01   0.977 -0.849  0.829 -1.61   1.48 
#>  2 2021-02-01  0.424  0.719  0.411  0.658  0.451  0.875
#>  3 2021-03-01 -0.133  0.338 -0.132  0.332 -0.134  0.352
#>  4 2021-04-01  0.259  0.238  0.256  0.235  0.265  0.242
#>  5 2021-05-01  0.631  0.110  0.590  0.109  0.731  0.110
#>  6 2021-06-01 -0.0500 0.995 -0.0500 0.839 -0.0500 1.54 
#>  7 2021-07-01  0.302  0.569  0.298  0.539  0.312  0.639
#>  8 2021-08-01 -0.681  0.901 -0.629  0.784 -0.810  1.26 
#>  9 2021-09-01 -0.296  0.374 -0.292  0.365 -0.305  0.393
#> 10 2021-10-01 -0.384  0.506 -0.374  0.484 -0.404  0.554
  • Related