Home > Software design >  Pass a variable to .x$ in a formula of a map function
Pass a variable to .x$ in a formula of a map function

Time:05-25

The first example works, but I'd like to move more of the repetitive code inside a function along the lines of example 2. I haven't found the right rlang approach to deal with the .x$x.

library(tidyverse)

# This produces the desired result:
slope <- function(y, x){
  coef(lm(y ~ x))[2]
}

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = map_dbl(data, ~ slope(.x$y, .x$x1)),
    slope2 = map_dbl(data, ~ slope(.x$y, .x$x2))
  )
#> # A tibble: 2 × 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 × 3]>    1     1   
#> 2 b     <tibble [3 × 3]>    0.5   0.25

# I would like to use rlang to further simplify with something like this,
# but I can't see how to combine the variable with .x$:
slope <- function(x) {
  map_dbl(data, ~ coef(lm(.x$y ~ .x$x))[2])
}

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(x1),
    slope2 = slope(x2)
  )
#> Error in `mutate()`:
#> ! Problem while computing `slope1 = slope(x1)`.
#> Caused by error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function

Created on 2022-05-24 by the reprex package (v2.0.1)

CodePudding user response:

To get the code working in the second version, your slope function will need to take a data argument as well as the name of the column you wish to regress. The column name then needs to make it into the formula of the lm call, which can be done in a few ways, perhaps the easiest being with some string parsing and as.formula. I would also use the data argument of lm for simplicity.

slope <- function(data, var) {
  
  f <- as.formula(paste('y', deparse(substitute(var)), sep = "~"))
  map_dbl(data, ~ coef(lm(f, data = .x))[2])
}

So that you have:

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(data, x1),
    slope2 = slope(data, x2)
  )

#> # A tibble: 2 x 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 x 3]>    1     1   
#> 2 b     <tibble [3 x 3]>    0.5   0.25

Personally, I think that a more robust and versatile way to do it is by passing a formula:

slope <- function(data, f) map_dbl(data, ~ coef(lm(f, data = .x))[2])

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(data, y ~ x1),
    slope2 = slope(data, y ~ x2)
  )

#> # A tibble: 2 x 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 x 3]>    1     1   
#> 2 b     <tibble [3 x 3]>    0.5   0.25
  • Related