How to only show true rows in a vector including NA values in R

Time:09-10

The basic setting looks as follows:

example_data <- data.frame(num=1:5, let=letters[1:5])

complicated_criterium <- function(data) { c(TRUE, NA, FALSE, TRUE, FALSE) }

example_data[complicated_criterium(example_data), ]

This shows the following:

   num  let
1    1    a
NA  NA <NA>
4    4    d

I would like to see only rows 1 and 4, the rows where my complicated criterion is TRUE. What is a simple way to achieve that?

It is correct that the complicated criterion returns NA for row 2, so I don't want to change that behavior. I played around with is.na and na.omit, but logically combining NA with TRUE or FALSE just gave me NA again, and omitting the NA changes the length of the vector, so I no longer get the correct rows.

CodePudding user response:

You can just wrap it in which(), which returns only the positions of the TRUE elements and ignores both FALSE and NA:

example_data[which(complicated_criterium(example_data)),]

Output:

  num let
1   1   a
4   4   d
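
To see why this works, here is a minimal sketch with the criterion values from the question hard-coded. The names keep, idx and rows are only for illustration.

```r
# Same values the criterion function in the question returns.
keep <- c(TRUE, NA, FALSE, TRUE, FALSE)

# which() gives the integer positions of the TRUE elements;
# FALSE and NA are both left out.
idx <- which(keep)
idx
#> [1] 1 4

# Indexing by position therefore never produces an NA row.
example_data <- data.frame(num = 1:5, let = letters[1:5])
rows <- example_data[idx, ]
rows$num
#> [1] 1 4
```

Because the result is a vector of positions rather than a logical mask, its length no longer has to match the number of rows.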

CodePudding user response:

I am sure there are more elegant ways, but the following base R approach will do the trick:

example_data[complicated_criterium(example_data) &
               !is.na(complicated_criterium(example_data)),]

# or

keeps <- complicated_criterium(example_data)
example_data[keeps & !is.na(keeps),]

Output:

#  num let
#1   1   a
#4   4   d
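
The reason this works is that NA & FALSE evaluates to FALSE rather than NA, so every NA position in the mask is turned into FALSE. A small sketch with the criterion values hard-coded follows; the `%in% TRUE` variant at the end is an extra shorthand that is not part of the answer above.

```r
keeps <- c(TRUE, NA, FALSE, TRUE, FALSE)

# Three-valued logic: NA combined with FALSE is known to be FALSE,
# while NA combined with TRUE stays unknown.
NA & FALSE
#> [1] FALSE
NA & TRUE
#> [1] NA

# At the NA position, !is.na(keeps) is FALSE, so the result is FALSE.
mask <- keeps & !is.na(keeps)
mask
#> [1]  TRUE FALSE FALSE  TRUE FALSE

# Shorthand (an addition here): %in% never returns NA,
# so NA elements simply do not match TRUE.
alt <- keeps %in% TRUE
identical(alt, mask)
#> [1] TRUE
```

The mask keeps its original length, so it can still be used directly as a row index.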

CodePudding user response:

dplyr::filter() keeps only the rows where the condition is TRUE, so rows where it evaluates to NA are dropped by default:

library(dplyr, warn.conflicts = FALSE)

data.frame(x = 1:5) %>% 
  filter(c(TRUE, NA, FALSE, TRUE, FALSE))
#>   x
#> 1 1
#> 2 4

Created on 2022-09-09 with reprex v2.0.2
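
Applied to the data from the question, the same idea might look like the sketch below. It assumes dplyr is installed; the name result is only for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Data and criterion as defined in the question.
example_data <- data.frame(num = 1:5, let = letters[1:5])
complicated_criterium <- function(data) c(TRUE, NA, FALSE, TRUE, FALSE)

# filter() keeps rows where the condition is TRUE;
# the NA row and the FALSE rows are dropped.
result <- filter(example_data, complicated_criterium(example_data))
result$num
#> [1] 1 4
```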
