I would like to filter my dataframe to obtain all rows that have no zero in any columns.
Here's my dataframe:
df1 = structure(list(item = c(NA, "polygon", NA, NA, NA, NA, NA, "polygon",
"polygon", "0", NA, "polygon", "polygon"), identity = c("pol",
"0", NA, NA, NA, NA, "0", "pol", "pol", "0", NA, "pol", "pol"
), unique = c("polx", "0", "0", "0", "0", "0", "0", "polx", "0",
"0", NA, "polx", "0"), detect = c("ox", "0", "0", "0", "0", "0",
NA, NA, "ox", "0", NA, "ox", NA), structure = c("hex", "0", "0",
"0", "0", "0", NA, "hex", "hex", "0", NA, "hex", "hex"), key = c("IUGDG",
"0", "0", "0", NA, "0", NA, "IUGDG", "IUGDG", "0", NA, "IUGDG",
"IUGDG"), code = c("adg434kk", "0", "0", "0", NA, "0", NA, "adg434kk",
"adg434kk", "0", NA, "adg434kk", "adg434kk")), class = "data.frame", row.names = c(NA,
-13L))
item identity unique detect structure key code
1 <NA> pol polx ox hex IUGDG adg434kk
2 polygon 0 0 0 0 0 0
3 <NA> <NA> 0 0 0 0 0
4 <NA> <NA> 0 0 0 0 0
5 <NA> <NA> 0 0 0 <NA> <NA>
6 <NA> <NA> 0 0 0 0 0
7 <NA> 0 0 <NA> <NA> <NA> <NA>
8 polygon pol polx <NA> hex IUGDG adg434kk
9 polygon pol 0 ox hex IUGDG adg434kk
10 0 0 0 0 0 0 0
11 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
12 polygon pol polx ox hex IUGDG adg434kk
13 polygon pol 0 <NA> hex IUGDG adg434kk
Expected output
item identity unique detect structure key code
1 <NA> pol polx ox hex IUGDG adg434kk
8 polygon pol polx <NA> hex IUGDG adg434kk
11 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
12 polygon pol polx ox hex IUGDG adg434kk
All the attempts I have tried below failed:
df1 %>%
filter(!if_any(item:code, ~ .x == 0))
df1 %>%
filter(if_any(item:code, ~ .x != 0))
Using anti_join provides the solution but is there a more direct way?
df2 <- df1 %>%
filter(if_any(item:code, ~ .x == 0))
anti_join(df1, df2)
Most of the solutions I found is changing 0 to NA but I want to find all rows with 0. Can someone provides a solution using dplyr, that is with piping approach? thanks.
CodePudding user response:
We could use if_all
with %in%
library(dplyr)
df1 %>%
filter(if_all(item:code, ~ !.x %in% 0))
-output
item identity unique detect structure key code
1 <NA> pol polx ox hex IUGDG adg434kk
2 polygon pol polx <NA> hex IUGDG adg434kk
3 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
4 polygon pol polx ox hex IUGDG adg434kk
Or with if_any
with %in%
df1 %>%
filter(!if_any(item:code, ~ .x %in% 0))
item identity unique detect structure key code
1 <NA> pol polx ox hex IUGDG adg434kk
2 polygon pol polx <NA> hex IUGDG adg434kk
3 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
4 polygon pol polx ox hex IUGDG adg434kk
If we want to use ==
or !=
, make sure that NA
values are taken care of
df1 %>%
filter(!if_any(item:code, ~ .x == 0 & !is.na(.x)))
item identity unique detect structure key code
1 <NA> pol polx ox hex IUGDG adg434kk
2 polygon pol polx <NA> hex IUGDG adg434kk
3 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
4 polygon pol polx ox hex IUGDG adg434kk
The difference is that ==
or !=
returns NA
when there is an NA
whereas %in%
returns FALSE
for the NA
. When there is NA
as return value, filter
removes those rows. If we create a column, it becomes more clear
> df1 %>%
mutate(new = if_any(item:code, ~ .x != 0))
item identity unique detect structure key code new
1 <NA> pol polx ox hex IUGDG adg434kk TRUE
2 polygon 0 0 0 0 0 0 TRUE
3 <NA> <NA> 0 0 0 0 0 NA
4 <NA> <NA> 0 0 0 0 0 NA
5 <NA> <NA> 0 0 0 <NA> <NA> NA
6 <NA> <NA> 0 0 0 0 0 NA
7 <NA> 0 0 <NA> <NA> <NA> <NA> NA
8 polygon pol polx <NA> hex IUGDG adg434kk TRUE
9 polygon pol 0 ox hex IUGDG adg434kk TRUE
10 0 0 0 0 0 0 0 FALSE
11 <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA
12 polygon pol polx ox hex IUGDG adg434kk TRUE
13 polygon pol 0 <NA> hex IUGDG adg434kk TRUE
whereas with %in%
> df1 %>% mutate(new = !if_any(item:code, ~ .x %in% 0))
item identity unique detect structure key code new
1 <NA> pol polx ox hex IUGDG adg434kk TRUE
2 polygon 0 0 0 0 0 0 FALSE
3 <NA> <NA> 0 0 0 0 0 FALSE
4 <NA> <NA> 0 0 0 0 0 FALSE
5 <NA> <NA> 0 0 0 <NA> <NA> FALSE
6 <NA> <NA> 0 0 0 0 0 FALSE
7 <NA> 0 0 <NA> <NA> <NA> <NA> FALSE
8 polygon pol polx <NA> hex IUGDG adg434kk TRUE
9 polygon pol 0 ox hex IUGDG adg434kk FALSE
10 0 0 0 0 0 0 0 FALSE
11 <NA> <NA> <NA> <NA> <NA> <NA> <NA> TRUE
12 polygon pol polx ox hex IUGDG adg434kk TRUE
13 polygon pol 0 <NA> hex IUGDG adg434kk FALSE