I'm very new to R. I have this data frame named "log"
a b c d e
x 1 2 3 4 5
y 1 na TRUE 2 five
z 1 2 3 TRUE FALSE
t TRUE TRUE FALSE FALSE five
This code runs fine:
> filter(log, grepl("^[A-Za-z]+$", log$b)==TRUE)
Output
a b c d e
y 1 na TRUE 2 five
t TRUE TRUE FALSE FALSE five
If I try to put this code in a function, where object = log (my data frame) and column = a column of my data frame:
wrong <- function(object, column){
  filter(object, grepl("^[A-Za-z]+$", object$column)==TRUE)
}
I get this error
Error in `filter()`:
! Problem while computing `..1 = grepl("^[A-Za-z]+$", object$column) == TRUE`.
✖ Input `..1` must be of size 4 or 1, not size 0.
---
Backtrace:
▆
1. ├─global wrong(log, b)
2. │ ├─dplyr::filter(...)
3. │ └─dplyr:::filter.data.frame(object, grepl("^[A-Za-z]+$", object$column) == TRUE)
4. │ └─dplyr:::filter_rows(.data, ..., caller_env = caller_env())
5. │ └─dplyr:::filter_eval(dots, mask = mask, error_call = error_call)
6. │ ├─base::withCallingHandlers(...)
7. │ └─mask$eval_all_filter(dots, env_filter)
8. ├─dplyr:::dplyr_internal_error(...)
9. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data)
10. │ └─rlang:::signal_abort(cnd, .file)
11. │ └─base::signalCondition(cnd)
12. └─dplyr (local) `<fn>`(`<dpl:::__>`)
13. └─rlang::abort(bullets, call = error_call, parent = skip_internal_condition(e))
I don't understand what's wrong: why does the first code work but the second doesn't? I tried to modify my code but it didn't work. Any help appreciated.
Thanks!
CodePudding user response:
The $column form will not work here: $ does not evaluate column, it looks for a column literally named "column" instead of using the value passed in. No such column exists, so object$column is NULL and grepl returns a zero-length vector, which is the "size 0" in the error. It is also not clear whether the column name is passed to the function unquoted or quoted (i.e. as a string). If it is passed unquoted, use the {{}} operator. Also, note that grepl returns a logical vector (TRUE/FALSE), so there is no need for another comparison (== TRUE) on top of it.
library(dplyr)
f1 <- function(object, column){
  filter(object, grepl("^[A-Za-z]+$", {{column}}))
}
-testing
> f1(log, b)
a b c d e
y 1 na TRUE 2 five
t TRUE TRUE FALSE FALSE five
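As a side check on why the original wrong() produced a size-0 input, the lookup can be reproduced in base R without dplyr. This is only a sketch: lookup_dollar and lookup_brackets are hypothetical helper names introduced for illustration, and only column b of the data is recreated.

```r
# Only column b of the example data is needed for this check
log <- data.frame(b = c("2", "na", "2", "TRUE"), stringsAsFactors = FALSE)

# Hypothetical helpers: `$` uses the literal name "column",
# `[[` uses the value stored in the argument
lookup_dollar   <- function(object, column) object$column
lookup_brackets <- function(object, column) object[[column]]

res_dollar   <- lookup_dollar(log, "b")    # no column named "column", so NULL
res_brackets <- lookup_brackets(log, "b")  # the actual contents of column b

# grepl on NULL gives a zero-length logical vector, hence "size 0" in filter()
mask_dollar   <- grepl("^[A-Za-z]+$", res_dollar)
mask_brackets <- grepl("^[A-Za-z]+$", res_brackets)

print(res_dollar)
print(mask_dollar)
print(mask_brackets)
```

The first mask has length 0, which filter() cannot recycle to the 4 rows of the data frame; the second has one value per row.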
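f1 will not filter correctly if we test it as f1(log, "b"): the string "b" itself is matched against the pattern instead of the column, so every row is kept. To make it more general, i.e. to accept either an unquoted or a quoted name, convert the argument to a symbol with ensym and evaluate it with !!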
f2 <- function(object, column)
{
  filter(object, grepl("^[A-Za-z]+$", !! rlang::ensym(column)))
}
-testing
> f2(log, b)
a b c d e
y 1 na TRUE 2 five
t TRUE TRUE FALSE FALSE five
> f2(log, "b")
a b c d e
y 1 na TRUE 2 five
t TRUE TRUE FALSE FALSE five
Another option is to use .data[[column]] if the column name is passed as a string. If there are one or more columns, if_any/if_all can be used as well
f3 <- function(object, column){
  filter(object, if_any(all_of(column), ~ grepl("^[A-Za-z]+$", .x)))
}
-testing
> f3(log, "b")
a b c d e
y 1 na TRUE 2 five
t TRUE TRUE FALSE FALSE five
> f3(log, c("b", "c", "e"))
a b c d e
y 1 na TRUE 2 five
z 1 2 3 TRUE FALSE
t TRUE TRUE FALSE FALSE five
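The .data[[column]] and if_all options mentioned above are not shown in the answer, so here is a minimal sketch of both. The names f4 and f5 are hypothetical, f4 assumes a single column name given as a string, and the data frame is rebuilt inline so the block runs on its own.

```r
library(dplyr)

# Same values as the example data, rebuilt with data.frame()
log <- data.frame(
  a = c("1", "1", "1", "TRUE"),
  b = c("2", "na", "2", "TRUE"),
  c = c("3", "TRUE", "3", "FALSE"),
  d = c("4", "2", "TRUE", "FALSE"),
  e = c("5", "five", "FALSE", "five"),
  row.names = c("x", "y", "z", "t"),
  stringsAsFactors = FALSE
)

# f4 (hypothetical name): column is one string, looked up via the .data pronoun
f4 <- function(object, column) {
  filter(object, grepl("^[A-Za-z]+$", .data[[column]]))
}

# f5 (hypothetical name): keep rows where ALL of the named columns match
f5 <- function(object, columns) {
  filter(object, if_all(all_of(columns), ~ grepl("^[A-Za-z]+$", .x)))
}

res_single <- f4(log, "b")          # rows whose b is letters only
res_all    <- f5(log, c("c", "d"))  # rows where both c and d are letters only

print(res_single)
print(res_all)
```

With if_all a row is kept only when every listed column matches, whereas the if_any version in f3 keeps a row as soon as one column matches.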
data
log <- structure(list(a = c("1", "1", "1", "TRUE"), b = c("2", "na",
"2", "TRUE"), c = c("3", "TRUE", "3", "FALSE"), d = c("4", "2",
"TRUE", "FALSE"), e = c("5", "five", "FALSE", "five")),
class = "data.frame", row.names = c("x",
"y", "z", "t"))