If x or y is NA, I want to keep that row, and discard the rows where both x and y are not NA. I tried dplyr::filter(), purrr::keep() and more, but nothing worked.
It is essential to do this conditionally rather than by row number, since my data set is too large for that.
library(tibble, quietly = TRUE, warn.conflicts = FALSE)
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
df <- tribble(
~name, ~x, ~y,
"id_1", 1, NA,
"id_2", 3, NA,
"id_3", NA, 29,
"id_4", -99, 0,
"id_5", -98, 28,
) %>%
mutate(name = factor(name))
df
#> # A tibble: 5 x 3
#> name x y
#> <fct> <dbl> <dbl>
#> 1 id_1 1 NA
#> 2 id_2 3 NA
#> 3 id_3 NA 29
#> 4 id_4 -99 0
#> 5 id_5 -98 28
Created on 2022-11-21 with reprex v2.0.2
The goal is to keep rows such as 1 to 3, i.e. those with an NA in x or y.
CodePudding user response:
You can use filter() with if_any() to keep the rows that contain NA values. For example:
df %>% filter(if_any(everything(), is.na))
If you only want to check some of the columns rather than all of them, you could use, for example, any of the following:
df %>% filter(if_any(c(x, y), is.na))
df %>% filter(if_any(x:y, is.na))
df %>% filter(if_any(-name, is.na))
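As a minimal, self-contained sketch of the approach above (assuming dplyr 1.0.4 or later, where if_any() and if_all() are available), the following rebuilds the data from the question and contrasts the two helpers. The names any_na and all_na are only illustrative.

```r
library(tibble)
library(dplyr)

# Same data as in the question
df <- tribble(
  ~name,  ~x,  ~y,
  "id_1",   1,  NA,
  "id_2",   3,  NA,
  "id_3",  NA,  29,
  "id_4", -99,   0,
  "id_5", -98,  28
)

# if_any(): keep a row when x OR y is NA
any_na <- df %>% filter(if_any(c(x, y), is.na))

# if_all(): keep a row only when x AND y are both NA
all_na <- df %>% filter(if_all(c(x, y), is.na))

any_na$name   # "id_1" "id_2" "id_3"
nrow(all_na)  # 0
```

Here if_any() is the one that matches the question, because a single NA in either column is enough to keep the row.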
CodePudding user response:
Using rowSums, check whether there is at least one NA in a row:
df[rowSums(is.na(df)) > 0, ]
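The choice of comparison matters once a row can contain more than one NA. A small sketch with hypothetical data (not from the question), in which the second row has NA in both x and y, shows the difference between "at least one NA" and "exactly one NA":

```r
# Hypothetical data: the second row has NA in both x and y
d <- data.frame(
  name = c("a", "b", "c"),
  x    = c(1, NA, 3),
  y    = c(NA, NA, 4)
)

# Number of NA values per row
na_count <- rowSums(is.na(d))

d[na_count > 0, ]    # rows "a" and "b": at least one NA
d[na_count == 1, ]   # row "a" only: exactly one NA
```

For the data in the question both comparisons return the same rows, since no row has more than one NA.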
CodePudding user response:
Base R solutions
df[!complete.cases(df),]
df[is.na(df$x) | is.na(df$y),] # to check specific columns only
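The two base R lines above differ when a column other than x or y contains NA. As a sketch with hypothetical data (a row whose name is missing), complete.cases() can also be restricted to selected columns by subsetting the data frame first:

```r
# Hypothetical data: row 3 is missing only its name
d <- data.frame(
  name = c("id_1", "id_2", NA),
  x    = c(1, -99, 5),
  y    = c(NA, 0, 7)
)

# All columns considered: rows 1 and 3 are kept
d[!complete.cases(d), ]

# Only x and y considered: row 1 is kept
d[!complete.cases(d[, c("x", "y")]), ]
```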
Alternative solution using the hacksaw package
library(hacksaw)
df %>% keep_na(x, y, .logic = 'OR')
Output
#> # A tibble: 3 × 3
#> name x y
#> <fct> <dbl> <dbl>
#> 1 id_1 1 NA
#> 2 id_2 3 NA
#> 3 id_3 NA 29