The basic setting looks as follows:
example_data <- data.frame(num = 1:5, let = letters[1:5])
complicated_criterium <- function(data) { c(TRUE, NA, FALSE, TRUE, FALSE) }
example_data[complicated_criterium(example_data), ]
This shows the following:
   num  let
1    1    a
NA  NA <NA>
4    4    d
I would like to see only rows 1 and 4, the rows where my complicated criterion is TRUE. What is a simple way to achieve that?
It is correct that the complicated criterion returns NA for row 2, so I don't want to change that behavior. I played around with is.na and na.omit, but logically combining NA with T or F just gives NA again, and omitting the NA changes the length of the vector, so I no longer get the correct rows.
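To make the problem concrete, the behavior described above can be reproduced in isolation. This is only a small sketch; the vector name `keep` is made up for illustration and stands in for the result of the criterion function.

```r
keep <- c(TRUE, NA, FALSE, TRUE, FALSE)

# An NA in a logical index produces an NA element instead of dropping it.
(1:5)[keep]

# na.omit() shortens the vector, so its positions no longer match the rows.
length(na.omit(keep))

# NA only propagates when the result actually depends on the unknown value.
NA & TRUE
NA & FALSE
```

The last line is the useful one here: `NA & FALSE` is `FALSE`, so an NA can be neutralized by combining it with a condition that is FALSE exactly where the NA sits.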
CodePudding user response:
You can just wrap it in which(), which returns the positions of the TRUE elements and skips both FALSE and NA:
example_data[which(complicated_criterium(example_data)),]
Output:
  num let
1   1   a
4   4   d
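For anyone wondering why this works, here is a minimal sketch of what which() hands to the subsetting operator, plus one caveat worth knowing. The names `keep` and `idx` are illustrative only and not part of the question.

```r
example_data <- data.frame(num = 1:5, let = letters[1:5])
keep <- c(TRUE, NA, FALSE, TRUE, FALSE)  # stand-in for the criterion's result

# which() returns the integer positions of the TRUE elements only.
idx <- which(keep)
idx
example_data[idx, ]

# Caveat: avoid negating which() to drop rows. If nothing matches,
# which() returns integer(0), and indexing with -integer(0) selects
# zero rows rather than all of them.
nrow(example_data[-which(rep(FALSE, 5)), ])
```

So which() is safe for keeping rows, but for dropping rows a logical index is the more robust choice.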
CodePudding user response:
I am sure there are more elegant ways, but the following base R approach will do the trick:
# keep a row only if the criterion is TRUE and not NA
example_data[complicated_criterium(example_data) &
               !is.na(complicated_criterium(example_data)), ]
# or, calling the criterion only once
keeps <- complicated_criterium(example_data)
example_data[keeps & !is.na(keeps), ]
Output:
#  num let
#1   1   a
#4   4   d
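Two other base R spellings of the same idea may be worth a look. This is a sketch rather than a recommendation; the result names `by_in` and `by_subset` are made up for illustration, and the criterion is redefined here only so that the block runs on its own.

```r
example_data <- data.frame(num = 1:5, let = letters[1:5])
complicated_criterium <- function(data) c(TRUE, NA, FALSE, TRUE, FALSE)

keeps <- complicated_criterium(example_data)

# %in% never returns NA, so NA entries are treated like FALSE.
by_in <- example_data[keeps %in% TRUE, ]

# subset() on a data frame also treats NA in the condition as FALSE.
by_subset <- subset(example_data, complicated_criterium(example_data))

by_in
by_subset
```

Both should return rows 1 and 4. Note that subset() is intended mainly for interactive use, so the explicit `keeps & !is.na(keeps)` form may be the safer choice inside functions.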
CodePudding user response:
dplyr::filter() drops rows where the condition is NA by default:
library(dplyr, warn.conflicts = FALSE)
data.frame(x = 1:5) %>%
  filter(c(TRUE, NA, FALSE, TRUE, FALSE))
#>   x
#> 1 1
#> 2 4
Created on 2022-09-09 with reprex v2.0.2