The first example works, but I'd like to move more of the repetitive code inside a function along the lines of example 2. I haven't found the right rlang approach to deal with the .x$x
.
library(tidyverse)
# This produces the desired result:
slope <- function(y, x){
coef(lm(y ~ x))[2]
}
tribble(
~group, ~y, ~x1, ~x2,
"a", 1, 1, 4,
"a", 2, 2, 5,
"a", 3, 3, 6,
"b", 1, 4, 8,
"b", 2, 6, 12,
"b", 3, 8, 16
) |>
nest(data = -group) |>
mutate(
slope1 = map_dbl(data, ~ slope(.x$y, .x$x1)),
slope2 = map_dbl(data, ~ slope(.x$y, .x$x2))
)
#> # A tibble: 2 × 4
#> group data slope1 slope2
#> <chr> <list> <dbl> <dbl>
#> 1 a <tibble [3 × 3]> 1 1
#> 2 b <tibble [3 × 3]> 0.5 0.25
# I would like to use rlang to further simplify with something like this,
# but I can't see how to combine the variable with .x$:
slope <- function(x) {
map_dbl(data, ~ coef(lm(.x$y ~ .x$x))[2])
}
tribble(
~group, ~y, ~x1, ~x2,
"a", 1, 1, 4,
"a", 2, 2, 5,
"a", 3, 3, 6,
"b", 1, 4, 8,
"b", 2, 6, 12,
"b", 3, 8, 16
) |>
nest(data = -group) |>
mutate(
slope1 = slope(x1),
slope2 = slope(x2)
)
#> Error in `mutate()`:
#> ! Problem while computing `slope1 = slope(x1)`.
#> Caused by error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function
Created on 2022-05-24 by the reprex package (v2.0.1)
CodePudding user response:
To get the code working in the second version, your slope
function will need to take a data argument as well as the name of the column you wish to regress. The column name then needs to make it into the formula of the lm
call, which can be done in a few ways, perhaps the easiest being with some string parsing and as.formula
. I would also use the data
argument of lm
for simplicity.
slope <- function(data, var) {
f <- as.formula(paste('y', deparse(substitute(var)), sep = "~"))
map_dbl(data, ~ coef(lm(f, data = .x))[2])
}
So that you have:
tribble(
~group, ~y, ~x1, ~x2,
"a", 1, 1, 4,
"a", 2, 2, 5,
"a", 3, 3, 6,
"b", 1, 4, 8,
"b", 2, 6, 12,
"b", 3, 8, 16
) |>
nest(data = -group) |>
mutate(
slope1 = slope(data, x1),
slope2 = slope(data, x2)
)
#> # A tibble: 2 x 4
#> group data slope1 slope2
#> <chr> <list> <dbl> <dbl>
#> 1 a <tibble [3 x 3]> 1 1
#> 2 b <tibble [3 x 3]> 0.5 0.25
Personally, I think that a more robust and versatile way to do it is by passing a formula:
slope <- function(data, f) map_dbl(data, ~ coef(lm(f, data = .x))[2])
tribble(
~group, ~y, ~x1, ~x2,
"a", 1, 1, 4,
"a", 2, 2, 5,
"a", 3, 3, 6,
"b", 1, 4, 8,
"b", 2, 6, 12,
"b", 3, 8, 16
) |>
nest(data = -group) |>
mutate(
slope1 = slope(data, y ~ x1),
slope2 = slope(data, y ~ x2)
)
#> # A tibble: 2 x 4
#> group data slope1 slope2
#> <chr> <list> <dbl> <dbl>
#> 1 a <tibble [3 x 3]> 1 1
#> 2 b <tibble [3 x 3]> 0.5 0.25