In a dplyr mutate() context, I would like to use the value of another column to select the column that a function is applied to via purrr::map.
Let's take a test data frame:
test <- data.frame(a = c(1,2), b = c(3,4), selector = c("a","b"))
I want to apply the following function:
calc <- function(col) {
  res <- col ^ 2
  return(res)
}
I am trying something like this:
test_2 <- test %>% mutate(quad = map(.data[[selector]], ~ calc(.x)))
My expected result would be:
a b selector quad
1 1 3 a 1
2 2 4 b 16
but I get
Error in local_error_context(dots = dots, .index = i, mask = mask) :
promise already under evaluation: recursive default argument reference or earlier problems?
I know .data[[var]] is supposed to be used only in the special context of programming with functions, but even if I wrap this in functions or similar I cannot get it to work. Trying to use tidy-selection gives the error that selection helpers can only be used in special dplyr verbs, not in functions like purrr::map.
The question "how to use dynamic variable in purrr map within dplyr" hinted that I should use get() and anonymous functions, but this did not work in this context either.
CodePudding user response:
Here's one way:
test %>%
mutate(quad = map(seq_along(selector), ~ calc(test[[selector[.x]]])[.x]))
# a b selector quad
# 1 1 3 a 1
# 2 2 4 b 16
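Note that map() returns a list, so quad above is a list-column even though it prints like a numeric one. A minimal sketch, assuming a plain numeric column is wanted, swaps in purrr::map_dbl() and leaves the rest of the approach unchanged (the name result is only for illustration):

```r
library(dplyr)
library(purrr)

test <- data.frame(a = c(1, 2), b = c(3, 4), selector = c("a", "b"))
calc <- function(col) col^2

# For each row index, square the whole column named by selector,
# then keep only the element belonging to that row.
result <- test %>%
  mutate(quad = map_dbl(seq_along(selector), ~ calc(test[[selector[.x]]])[.x]))

result
```

map_dbl() fails if any element is not a single number, which makes it a slightly stricter choice than map().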
Instead of .data, you can also use cur_data() (which accounts for grouping):
test %>%
mutate(quad = map(seq_along(selector), ~ calc(cur_data()[[selector[.x]]])[.x]))
Or, with diag():
test %>%
mutate(quad = diag(as.matrix(calc(cur_data()[selector]))))
# a b selector quad
#1 1 3 a 1
#2 2 4 b 16
CodePudding user response:
You could also change the function to return a single number and use purrr:
calc <- function(col, id) {test[[col]][[id]]^2}
test %>%
mutate(
quad = purrr::map2_dbl(selector, row_number(), calc)
)
a b selector quad
1 1 3 a 1
2 2 4 b 16
CodePudding user response:
Not quite what you asked for, but an alternative might be to restructure the data so that the calculation is easier:
test %>%
pivot_longer(
cols = c(a, b)
) %>%
filter(name == selector) %>%
mutate(quad = value^2)
# A tibble: 2 × 4
selector name value quad
<chr> <chr> <dbl> <dbl>
1 a a 1 1
2 b b 4 16
You can join the results back onto the original data using an id column.
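One way the join could look, as a sketch only: it assumes a row id is added with row_number() before reshaping. The column name id and the intermediate names test_id and quad_by_id are made up for this example.

```r
library(dplyr)
library(tidyr)

test <- data.frame(a = c(1, 2), b = c(3, 4), selector = c("a", "b"))

# Add a row id (hypothetical column name) so results can be matched back
test_id <- test %>% mutate(id = row_number())

# Reshape, keep only the value whose column name matches selector, and square it
quad_by_id <- test_id %>%
  pivot_longer(cols = c(a, b)) %>%
  filter(name == selector) %>%
  transmute(id, quad = value^2)

# Join the computed values back onto the original wide data
result <- test_id %>%
  left_join(quad_by_id, by = "id") %>%
  select(-id)

result
```

Rows whose selector does not name one of the pivoted columns would get NA in quad after the left join.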
CodePudding user response:
You can use rowwise() and get() the selector variable:
library(dplyr)
test %>%
rowwise() %>%
mutate(quad = calc(get(selector))) %>%
ungroup()
# A tibble: 2 × 4
a b selector quad
<dbl> <dbl> <chr> <dbl>
1 1 3 a 1
2 2 4 b 16
Or, if the selector repeats, group_by() will be more efficient:
test <- data.frame(a = c(1,2,5), b = c(3,4,6), selector = c("a","b","a"))
test %>%
group_by(selector) %>%
mutate(quad = calc(get(selector[1]))) %>%
ungroup()
# A tibble: 3 × 4
a b selector quad
<dbl> <dbl> <chr> <dbl>
1 1 3 a 1
2 2 4 b 16
3 5 6 a 25