I already posted this as an issue in dplyr's GitHub repo, and the maintainers said that across is not a good fit for this type of problem, but I'm posting here to see if anyone can explain why this doesn't actually work.
I'm trying to write a function that takes a data frame (y) and uses it to update the column classes of another data frame (x) by matching column names. I wrote this code about a year ago and could have sworn it worked then, but it no longer does.
library(tidyverse)
(x = tibble(data.frame(a = as.character(c(1,2,3,4)),
b = as.character(c(1,2,3,4)),
c = as.character(c(1,2,3,4)),
d = as.character(c('a', 'b', 'c', 'd')))))
#> # A tibble: 4 × 4
#> a b c d
#> <chr> <chr> <chr> <chr>
#> 1 1 1 1 a
#> 2 2 2 2 b
#> 3 3 3 3 c
#> 4 4 4 4 d
(y = tibble(data.frame(a = as.numeric(c(1,1,1,1)),
b = as.character(c(1,1,1,1)),
c = as.numeric(c(1,1,1,1)),
d = as.character(c('a', 'a', 'a', 'a')))))
#> # A tibble: 4 × 4
#> a b c d
#> <dbl> <chr> <dbl> <chr>
#> 1 1 1 1 a
#> 2 1 1 1 a
#> 3 1 1 1 a
#> 4 1 1 1 a
## this code I have in my function gives an error
result <-
  x %>%
  dplyr::mutate(dplyr::across(
    .cols = tidyselect::all_of(colnames(y)),
    .fns = ~ eval(parse(text = paste0(
      "as.",
      class(y[[dplyr::cur_column()]]),
      "(.)"
    )))
  ))
#> Error in `dplyr::mutate()`:
#> ! Problem while computing `..1 = dplyr::across(...)`.
#> Caused by error in `across()`:
#> ! Problem while computing column `a`.
#> Caused by error:
#> ! 'list' object cannot be coerced to type 'double'
#> Backtrace:
#> ▆
#> 1. ├─x %>% ...
#> 2. ├─dplyr::mutate(...)
#> 3. ├─dplyr:::mutate.data.frame(...)
#> 4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
#> 5. │ ├─base::withCallingHandlers(...)
#> 6. │ ├─base::withCallingHandlers(...)
#> 7. │ └─mask$eval_all_mutate(quo)
#> 8. ├─base::eval(...)
#> 9. │ └─base::eval(...)
#> 10. └─base::.handleSimpleError(...)
#> 11. └─dplyr (local) h(simpleError(msg, call))
#> 12. └─rlang::abort(msg, call = call("across"), parent = cnd)
## and my desired result is:
result = data.frame(a = as.numeric(c(1,2,3,4)),
                    b = as.character(c(1,2,3,4)),
                    c = as.numeric(c(1,2,3,4)),
                    d = as.character(c('a', 'b', 'c', 'd')))
CodePudding user response:
A similar option with across:
library(dplyr)
x %>%
mutate(across(all_of(names(y)), ~ `class<-`(.x, class(y[[cur_column()]]))))
CodePudding user response:
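This seems like a crazy way to achieve the end goal here. A short one-liner using map2_df (from purrr, which the tidyverse loads) would do the same thing. Note that it pairs the columns of x and y by position, so both data frames need the same columns in the same order: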
map2_df(x, y, ~ `class<-`(.x, class(.y)))
#> # A tibble: 4 x 4
#> a b c d
#> <dbl> <chr> <dbl> <chr>
#> 1 1 1 1 a
#> 2 2 2 2 b
#> 3 3 3 3 c
#> 4 4 4 4 d
As for why your code doesn't work, you are right in the sense that this is due to the way that eval works inside a lambda function: it is taking the . as referring to the data frame that was passed into the function call, not as a placeholder to be used inside the lambda function. This is why the error mentions a list (a data frame is a list of columns).
If you change the lambda function to a standard function it will work as expected.
x %>%
  dplyr::mutate(dplyr::across(
    .cols = tidyselect::all_of(colnames(y)),
    .fns = function(x) eval(parse(text = paste0(
      "as.", class(y[[dplyr::cur_column()]]),
      "(x)"
    )))
  ))
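If the aim is only to pick the conversion function from the class name, the eval(parse()) step can probably be avoided by looking the function up with match.fun(). This is a minimal sketch, assuming every class in y has a matching as.<class>() function (true for numeric and character, not for every class). The small x and y below are illustrative stand-ins, not the full data from the question.

```r
library(dplyr)

# Illustrative stand-ins for the question's data frames
x <- tibble(a = as.character(1:4), b = as.character(1:4), d = letters[1:4])
y <- tibble(a = rep(1, 4), b = "1", d = "a")

res <- x %>%
  mutate(across(
    all_of(colnames(y)),
    # Build the name "as.numeric" / "as.character" from the class of the
    # matching column in y, fetch that function, and call it on the column.
    function(col) match.fun(paste0("as.", class(y[[cur_column()]])[1]))(col)
  ))

sapply(res, class)
```

Because the column is passed as a real argument rather than spliced into a string, there is no ambiguity about what the placeholder refers to.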
CodePudding user response:
It is a matter of which environment eval works with. If you specify the current environment then the code in the question works.
library(dplyr)
x %>%
mutate(across(all_of(colnames(y)),
~eval(parse(text = paste0("as.", class(y[[cur_column()]]),"(.)"))),
environment()))
giving
# A tibble: 4 × 4
a b c d
<dbl> <chr> <dbl> <chr>
1 1 1 1 a
2 2 2 2 b
3 3 3 3 c
4 4 4 4 d
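To see in isolation how the environment given to eval decides what . means, here is a small base R sketch. It does not use dplyr at all; the function f and the environment outer_env are hypothetical names made up for the illustration.

```r
# An environment that holds its own binding for `.`
outer_env <- new.env()
assign(".", "outer", envir = outer_env)

f <- function(.) {
  code <- parse(text = "toupper(.)")
  c(
    default  = eval(code),            # evaluated in f's own frame, so `.` is the argument
    explicit = eval(code, outer_env)  # evaluated in outer_env, so `.` is "outer"
  )
}

res <- f("inner")
res
#> default is "INNER", explicit is "OUTER"
```

The same parsed text gives different results depending only on the environment in which it is evaluated, which is the effect at play in the question.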
The code could be simplified by using the as function, but you may already know that and are really just looking for info on eval/parse.
x %>%
mutate(across(all_of(colnames(y)), ~ as(., class(y[[cur_column()]]))))
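Since the question asks for a function, the as approach can also be wrapped up in plain base R. This is only a sketch: match_col_classes is a hypothetical name, it converts just the columns whose names appear in both data frames, and it assumes basic classes such as numeric and character that methods::as() can handle.

```r
# Hypothetical helper: coerce the columns of `x` to the classes of the
# same-named columns in `y`; columns without a match are left untouched.
match_col_classes <- function(x, y) {
  shared <- intersect(names(x), names(y))
  for (nm in shared) {
    x[[nm]] <- methods::as(x[[nm]], class(y[[nm]])[1])
  }
  x
}

x <- data.frame(a = as.character(1:4), d = letters[1:4], e = as.character(5:8),
                stringsAsFactors = FALSE)
y <- data.frame(a = rep(1, 4), d = rep("a", 4), stringsAsFactors = FALSE)

res <- match_col_classes(x, y)
sapply(res, class)
```

Here column e has no counterpart in y, so it stays character, while a becomes numeric.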
Note
The inputs simplified:
library(dplyr)
x = tibble(a = 1:4, b = 1:4, c = 1:4, d = letters[1:4])
x[] <- lapply(x, as.character)
y <- tibble(a = rep(1, 4), b = "1", c = 1, d = "a")