dplyr negative select if df not null

Time:11-13

I run an API extract each morning. There is a variable bla that may or may not be NULL. If it's not NULL, it is a dataframe that contains a field sessions, in which case I would like to deselect sessions.

Normally, bla is indeed a dataframe and not null and the following block runs fine:

bla |> ... dplyr chain here ... |> select(-sessions)

But in those cases where bla is just NULL, I need this code to still run and not break my workflow.

Tried:

bla |> ... dplyr chain here ... |> select(!any_of('sessions'))

But this errors with:

Error in UseMethod("select") : 
  no applicable method for 'select' applied to an object of class "NULL"

How can I tell dplyr to run the select command only if bla is not null?

CodePudding user response:

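There are a couple of ways to deal with this. Use an if condition that checks for NULL. Note that exists('bla') on its own is not enough: it only tells us whether an object called bla was created earlier in the step, and it returns TRUE even when bla is NULL, so combine it with is.null()

library(dplyr)
# exists() guards against bla never having been created;
# is.null() guards against the extract having returned NULL
if (exists('bla') && !is.null(bla)) {
  bla |>
    select(!any_of('sessions'))
}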

Or wrap with tryCatch

tryCatch(
  bla |>
    select(!any_of('sessions')),
  error = function(e) NULL
)

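Based on the OP's update in comments, if this is part of a chain, we can check whether the intermediate result is NULL before we select. The braces block below relies on the magrittr pipe (%>%) and its . placeholder, and it only helps if the earlier steps in the chain pass a NULL through without erroring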

ab_tests <- get_ga_df(viewId, c(start_date, end_date), ab_metrics, ab_dims) %>%
  index_zen_name(dimension_lookup_marketing) %>%
  dedup_key(c('session_id', 'date')) %>%
  {
    if (!is.null(.)) {
      select(., !any_of('sessions'))
    } else {
      .
    }
  }
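
If the chain uses the native pipe (|>) as in the question, the braces trick above is not available, because |> has no . placeholder for a bare block. One possible workaround is to move the NULL check into a small helper function. The sketch below is only an illustration: drop_if_df is a made-up name, not a dplyr function, and it assumes that anything which is not a data frame (including NULL) should be passed through unchanged.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): drop the named columns only
# when the input is a data frame; return anything else, such as NULL, as is.
drop_if_df <- function(x, cols) {
  if (is.data.frame(x)) {
    select(x, !any_of(cols))
  } else {
    x
  }
}

# Normal case: bla is a data frame, so sessions is removed
bla <- data.frame(id = 1:2, sessions = c(10, 20))
res_df <- bla |> drop_if_df("sessions")
names(res_df)  # "id"

# NULL case: the helper passes NULL through instead of erroring
bla <- NULL
res_null <- bla |> drop_if_df("sessions")
is.null(res_null)  # TRUE
```

Because the check is is.data.frame() rather than !is.null(), the helper also skips the select when the extract returns something unexpected, such as a character error message. Whether that is the right behaviour depends on what the extract can return.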