I'd like to take a tibble (or data frame), convert one of its columns to numeric, keep only that column plus a third column (c), and filter out the rows where the converted column is NA.
Given the following data:
library(tidyverse)
set.seed(1)
mytib <- tibble(a = as.character(c(1:5, NA)),
                b = as.character(c(6:8, NA, 9:10)),
                c = as.character(sample(x = c(0,1), size = 6, replace = TRUE)))
vars <- c("a", "b")
I have created the following function
convert_tib <- function(var, tib){
  tib <- tib %>%
    mutate("{var}" = as.numeric({{ var }})) %>%
    dplyr::select({{ var }}, c) %>%
    filter(!is.na({{ var }}))
  return(tib)
}
And run it with purrr::map
map(vars, ~ convert_tib(var = ., tib = mytib))
The output of this code unfortunately does not convert the column to numeric, and it also doesn't filter out the NA. I have tried many different strategies, such as using ensym(var) and enquo(var) inside the function and leaving out the curly-curly operators.
What I'd like to get is the following:
> map(vars, ~ convert_tib(var = ., tib = mytib))
[[1]]
# A tibble: 5 × 2
a c
<int> <int>
1 1 0
2 2 1
3 3 0
4 4 0
5 5 1
[[2]]
# A tibble: 5 × 2
b c
<int> <int>
1 6 0
2 7 1
3 8 0
4 9 1
5 10 0
CodePudding user response:
You may try this. I used ensym() inside your custom function, since I noticed you would like to specify the variable names as strings. I then used !!, the bang-bang operator, to unquote the resulting symbol. Finally, you need := in place of = inside mutate(), because R does not allow an unquoted expression such as !!var on the left-hand side of =:
library(dplyr)
library(rlang)
library(purrr)
convert_tib <- function(var, tib){
  var <- ensym(var)
  tib <- tib %>%
    dplyr::select(!!var, c) %>%
    mutate(!!var := as.integer(!!var),
           c = as.integer(c)) %>%
    filter(!is.na(!!var))
  return(tib)
}
map(vars, convert_tib, mytib)
The output:
[[1]]
# A tibble: 5 x 2
a c
<int> <int>
1 1 0
2 2 1
3 3 0
4 4 0
5 5 1
[[2]]
# A tibble: 5 x 2
b c
<int> <int>
1 6 0
2 7 1
3 8 0
4 9 1
5 10 0
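As for why the original function seemed to do nothing: when var holds a string, {{ var }} inside mutate() and filter() evaluates to that string rather than to the column, so as.numeric("a") is NA and !is.na("a") is always TRUE. As far as I know, the "{var}" name on the left is also only interpolated when it is followed by :=, not =. Here is a small sketch of the filter() part; tib and col_name are made-up names for illustration and are not from the question.

```r
library(dplyr)

# Toy data; `col_name` is an illustrative name, not from the question.
tib <- tibble(a = c("1", NA))
col_name <- "a"

# The string is evaluated as-is: !is.na("a") is TRUE, so no row is dropped.
kept_string <- tib %>% filter(!is.na(col_name))

# .data[[col_name]] looks up column `a`, so the NA row is dropped.
kept_column <- tib %>% filter(!is.na(.data[[col_name]]))

nrow(kept_string)  # 2
nrow(kept_column)  # 1
```

The same reasoning applies to the mutate() call in the question, which is why the column has to be referred to as a symbol (as above with ensym() and !!) or through the .data pronoun.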
CodePudding user response:
You can do this without injection or embracing:
library(dplyr)
library(purrr)
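convert_tib <- function(tib, var) {
  tib %>%
    # all_of() marks var as a character vector of column names
    transmute(across(c(all_of(var), c), as.integer)) %>%
    # .data[[var]] looks the column up by its string name
    filter(!is.na(.data[[var]]))
}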
map(vars, convert_tib, tib = mytib)
[[1]]
# A tibble: 5 x 2
a c
<int> <int>
1 1 0
2 2 1
3 3 0
4 4 0
5 5 1
[[2]]
# A tibble: 5 x 2
b c
<int> <int>
1 6 0
2 7 1
3 8 0
4 9 1
5 10 0
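If you would rather keep the separate mutate / select / filter steps from the question, the same idea can be written out step by step. This is only a sketch under a few assumptions: convert_tib_steps is a hypothetical name, var is assumed to be a single column name passed as a string, the second column is assumed to always be c, and column c is hard-coded below instead of sampled so the example does not depend on the seed.

```r
library(dplyr)
library(purrr)

# Data shaped like the question's; column c is hard-coded (assumption).
mytib <- tibble(a = as.character(c(1:5, NA)),
                b = as.character(c(6:8, NA, 9:10)),
                c = as.character(c(0, 1, 0, 0, 1, 0)))
vars <- c("a", "b")

# Hypothetical helper: convert, select, then drop rows where `var` is NA.
convert_tib_steps <- function(tib, var) {
  tib %>%
    mutate(across(all_of(c(var, "c")), as.integer)) %>%  # convert both columns
    select(all_of(var), c) %>%                           # keep var and c only
    filter(!is.na(.data[[var]]))                         # drop missing values
}

# set_names() names the input, so the result is a list named "a" and "b".
res <- map(set_names(vars), convert_tib_steps, tib = mytib)
```

Naming the input with set_names() is optional; it just makes the results easier to pick out afterwards, for example res$a or res$b.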