I am getting myself confused with dplyr and if_any. I am trying to perform something along these lines:
If a column is present then evaluate an expression. If not, return FALSE.
So these three scenarios capture what I am thinking:
library(dplyr)
dat <- data.frame(x = 1)
## GOOD: if foo_col is NA then return FALSE
dat %>%
  mutate(foo_col = NA_character_) %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1    <NA>   FALSE
## GOOD: if foo_col is not NA then return TRUE
dat %>%
  mutate(foo_col = "value") %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1   value    TRUE
## NOT GOOD: if foo_col is absent, return TRUE? Want this to be FALSE.
dat %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x present
#> 1 1    TRUE
So can anyone suggest how I could check the is.na condition and also whether the column is actually there?
Answer:
If the last case needs to be FALSE while the other two cases still return TRUE/FALSE, add a check that the selection contains at least one column:
library(dplyr)
dat %>%
  mutate(present = ncol(pick(matches("foo_col"))) > 0 &
    if_any(matches("foo_col"), ~ !is.na(.x)))
-output
  x present
1 1   FALSE
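The guard works because a selection that matches nothing has zero columns. Here is a minimal sketch of that idea, using select() outside of mutate() on the assumption that pick() follows the same selection semantics; the names n_absent and n_present are only for illustration:

```r
library(dplyr)

dat <- data.frame(x = 1)

# No column matches "foo_col", so the selection has zero columns
n_absent <- ncol(select(dat, matches("foo_col")))

# Once foo_col exists, the same selection has one column
n_present <- ncol(select(mutate(dat, foo_col = "value"), matches("foo_col")))
```

So `ncol(...) > 0` is FALSE only when the column is missing, and the `&` then forces the whole result to FALSE in that case.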
Or, as @boshek mentioned in the comments, rlang::is_empty should work as well:
dat %>%
  mutate(present = !rlang::is_empty(across(matches("foo_col"))) &
    if_any(matches("foo_col"), ~ !is.na(.x)))
-output
  x present
1 1   FALSE
For the other cases:
> dat %>%
    mutate(foo_col = NA_character_) %>%
    mutate(present = ncol(pick(matches("foo_col"))) > 0 &
      if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1    <NA>   FALSE
> dat %>%
    mutate(foo_col = "value") %>%
    mutate(present = ncol(pick(matches("foo_col"))) > 0 &
      if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1   value    TRUE
NOTE: This test cannot distinguish a FALSE caused by an NA value from a FALSE caused by the column not being found.
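If the two kinds of FALSE do need to be told apart, one option is a three-valued result that returns NA when the column is absent. This is only a sketch: flag_present() is a hypothetical helper, not part of dplyr, and it uses grep() with ignore.case = TRUE on the assumption that this roughly mirrors the default matching of matches().

```r
library(dplyr)

# Hypothetical helper (not part of dplyr). The `present` column is:
#   NA    when no column name matches `pattern`
#   FALSE when matching columns exist but are all NA in that row
#   TRUE  when at least one matching column is non-NA in that row
flag_present <- function(data, pattern) {
  cols <- grep(pattern, names(data), value = TRUE, ignore.case = TRUE)
  if (length(cols) == 0) {
    return(mutate(data, present = NA))
  }
  mutate(data, present = if_any(all_of(cols), ~ !is.na(.x)))
}

dat <- data.frame(x = 1)
res_absent <- flag_present(dat, "foo_col")
res_na     <- flag_present(mutate(dat, foo_col = NA_character_), "foo_col")
res_value  <- flag_present(mutate(dat, foo_col = "value"), "foo_col")
```

Here res_absent$present is NA, while the other two cases keep the FALSE and TRUE results from the examples above.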