dplyr filter keep the NAs AND OR conditions

Time:05-13

I have the following data with 10 entries:

test_data_1 <- structure(list(Art = c(188, NA, NA, 140, NA, 182, NA, NA, 182, 
                       NA)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
                       ))

Let's say I want to keep only the NAs, 188 and 140. So I tried the following command:

test_data_1 %>% filter(is.na(Art), Art != 182) # with | instead of a comma, it works

This command returns a tibble with zero rows. Why do I have to use | instead of a comma? This site (https://sebastiansauer.github.io/dplyr_filter/) states: "Multiple logical comparisons can be combined. Just add ‘em up using commas; that amounts to logical OR “addition”:" So the comma should act as an OR, but it doesn't. Another approach:

test_data_1 %>% filter(Art != 182)

Here dplyr drops the 6 NA entries by default, which is not what I want. Adding na.rm=FALSE (see the command below) doesn't help either: now zero entries are kept. Why is that? Why aren't at least the entries 188 and 140 kept?

test_data_1 %>% filter(Art != 182, na.rm=FALSE)

Last question: If I want to keep various numbers in a column, I could use %in% followed by a vector, e.g.:

test_data_1 %>% filter(Art %in% c(140,188))

But how could I combine %in% with is.na if I would just like to keep the NAs and e.g. 140?

CodePudding user response:

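Use | instead of a comma. In filter, multiple expressions separated by , are combined with &, not with |, so the quoted site is wrong on this point. No row can satisfy both conditions at once: where Art is NA, Art != 182 evaluates to NA rather than TRUE, and where Art is a number, is.na(Art) is FALSE. filter keeps only rows where the combined condition is TRUE, so nothing is left.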

library(dplyr)
test_data_1 %>% 
   filter(is.na(Art) | Art != 182)

-output

# A tibble: 8 × 1
    Art
  <dbl>
1   188
2    NA
3    NA
4   140
5    NA
6    NA
7    NA
8    NA
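To see why the comma (i.e. &) version returns nothing, it can help to evaluate the two conditions by hand on a plain vector. This is only a sketch in base R; the names not_182, and_cond and or_cond are made up for illustration.

```r
# Same values as the Art column, as a plain numeric vector
Art <- c(188, NA, NA, 140, NA, 182, NA, NA, 182, NA)

# Comparing NA with anything gives NA, not TRUE or FALSE
not_182 <- Art != 182

# What filter(is.na(Art), Art != 182) tests: never TRUE
# (FALSE for the numbers, NA for the NAs)
and_cond <- is.na(Art) & not_182

# What filter(is.na(Art) | Art != 182) tests: TRUE | NA is TRUE,
# so the NA rows pass along with 188 and 140
or_cond <- is.na(Art) | not_182

any(and_cond, na.rm = TRUE)  # FALSE: no row would be kept
sum(or_cond)                 # 8: the rows kept in the output above
```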

For the second part of the question, %in% can be combined with is.na in the same way, using | again

test_data_1 %>%
   filter(Art %in% c(140,188) | is.na(Art))
# A tibble: 8 × 1
    Art
  <dbl>
1   188
2    NA
3    NA
4   140
5    NA
6    NA
7    NA
8    NA
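If only the NAs and 140 should be kept, as in the question, the same pattern works with a single value. A possibly shorter variant relies on the fact that %in% is built on match(), which treats NA as matching NA, so NA can be listed among the values. A minimal sketch, rebuilding the data with tibble() and using the made-up names explicit and short:

```r
library(dplyr)

test_data_1 <- tibble(Art = c(188, NA, NA, 140, NA, 182, NA, NA, 182, NA))

# Explicit form: the value is 140, or it is NA
explicit <- test_data_1 %>%
  filter(Art %in% 140 | is.na(Art))

# Shorter form: NA inside the lookup vector matches the NA entries
short <- test_data_1 %>%
  filter(Art %in% c(140, NA))

nrow(short)  # 7: one row with 140 plus the six NAs
```

The explicit form is probably easier to read for someone who does not know that %in% matches NA.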

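NOTE: filter keeps only the rows where the condition is TRUE, so rows where it evaluates to NA are dropped just like rows where it is FALSE. In addition, there is no na.rm argument in filter. As far as I can tell, na.rm=FALSE is not read as an option at all: older dplyr versions treat it as one more condition that is FALSE for every row (hence zero rows), while recent versions stop with an error about a named input.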
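As a sketch of that default, the example below compares Art != 182 with a negated %in%. Since NA %in% 182 is FALSE rather than NA, negating it gives TRUE and the NA rows survive. The names dropped and kept are made up for illustration.

```r
library(dplyr)

test_data_1 <- tibble(Art = c(188, NA, NA, 140, NA, 182, NA, NA, 182, NA))

# != yields NA for the NA rows, and filter drops them
dropped <- test_data_1 %>%
  filter(Art != 182)

# %in% never yields NA, so the negation keeps the NA rows
kept <- test_data_1 %>%
  filter(!(Art %in% 182))

nrow(dropped)  # 2: only 188 and 140
nrow(kept)     # 8: 188, 140 and the six NAs
```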
