dplyr: If a column is present then evaluate an expression. If not return a `FALSE`

Time:12-14

I am getting myself confused with dplyr and if_any. I am trying to perform something along these lines:

If a column is present, evaluate an expression on it. If it is not present, return FALSE.

So these three scenarios capture what I am thinking:

library(dplyr)

dat <- data.frame(x = 1)

## GOOD: if foo_col is NA then return FALSE
dat %>%
  mutate(foo_col = NA_character_) %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1    <NA>   FALSE

## GOOD: if foo_col is not NA then return TRUE
dat %>%
  mutate(foo_col = "value") %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1   value    TRUE

## NOT GOOD: if foo_col is absent, return TRUE? Want this to be FALSE.
dat %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x present
#> 1 1    TRUE

Can anyone suggest a way to check the is.na condition and also check whether the column is actually there?

CodePudding user response:

If the last case needs to be FALSE while the other two cases keep their TRUE/FALSE results, also check that the selection matched at least one column:

library(dplyr)
dat %>%
   mutate(present = ncol(pick(matches("foo_col"))) > 0 & 
                   if_any(matches("foo_col"), ~ !is.na(.x)))

-output

  x present
1 1   FALSE

Or as @boshek mentioned in the comments, rlang::is_empty should work as well

dat %>% 
  mutate(present = !rlang::is_empty(pick(matches("foo_col"))) & 
                if_any(matches("foo_col"), ~ !is.na(.x)))

-output

  x present
1 1   FALSE

For the other cases

> dat %>%
    mutate(foo_col = NA_character_) %>%
    mutate(present = ncol(pick(matches("foo_col"))) > 0 & if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1    <NA>   FALSE
> dat %>%
    mutate(foo_col = "value") %>%
    mutate(present = ncol(pick(matches("foo_col"))) > 0 & if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1   value    TRUE

NOTE: This test cannot distinguish a FALSE that comes from foo_col being NA from a FALSE that comes from foo_col not being found.
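If that distinction matters, one possible approach is to record the column check separately. The following is only a minimal sketch: flag_present() and the col_found column are hypothetical names (not part of dplyr), and it assumes an exact column name rather than the regex match that matches() performs.

```r
library(dplyr)

# Hypothetical helper, not a dplyr function.
# Assumption: `col` is an exact column name, not a regex as with matches().
flag_present <- function(data, col = "foo_col") {
  if (col %in% names(data)) {
    # Column exists: present reflects the is.na() check on that column
    mutate(data, present = !is.na(.data[[col]]), col_found = TRUE)
  } else {
    # Column absent: present is FALSE, and col_found records why
    mutate(data, present = FALSE, col_found = FALSE)
  }
}

dat <- data.frame(x = 1)

flag_present(dat)                                       # column absent
flag_present(mutate(dat, foo_col = NA_character_))      # column is NA
flag_present(mutate(dat, foo_col = "value"))            # column has a value
```

Here present is FALSE in both of the first two calls, while col_found tells the absent-column case apart from the NA case.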
