I have difficulty removing rows in grouped data where all value
s are NA
:
df <- data.frame(
File = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
value = c(1.111, 1.222, 1.121, 3.325, NA, 3.455, NA, NA, NA)
)
Filtering with if_any
on the condition that any value
is NA
works as expected:
library(dplyr)
df %>%
# group_by(File) %>%
filter(!if_any(value, is.na))
File value
1 A 1.111
2 A 1.222
3 A 1.121
4 B 3.325
5 B 3.455
But filtering with if_all
on the condition that all value
s of a group be NA
does not work as expected - I get the same result as for if_any
:
df %>%
group_by(File) %>%
filter(!if_all(value, is.na))
# A tibble: 5 × 2
# Groups: File [2]
File value
<chr> <dbl>
1 A 1.11
2 A 1.22
3 A 1.12
4 B 3.32
5 B 3.46
The expected result is:
1 A 1.11
2 A 1.22
3 A 1.12
Where am I going wrong?
CodePudding user response:
is this what you are looking for:
df %>%
group_by(File) %>%
filter(all(!is.na(value)))
You group by File
and only return groups rows where there are no NA
values.
This matches your expected output.
But in your question you state that you only want to remove rows where all values are NA
. You could achieve that with
df %>%
group_by(File) %>%
filter(any(!is.na(value)))
if_any
and if_all
are intended to use within filer
when you want to filter across multiple columns. This is not what you want I guess.
Here is an example on how you would use if_all
:
df2 <- tibble(
x1 = c(TRUE, TRUE, FALSE, FALSE),
x2 = c(TRUE, FALSE, TRUE, FALSE),
)
df2 %>%
filter(if_all(x1:x2, ~.)) #rows where x1 and x2 are TRUE
df2 %>%
filter(if_any(x1:x2, ~.)) #rows where x1 or x2 are TRUE
CodePudding user response:
This was my approach :
my_df <- data.frame(
File = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
value = c(1.111, 1.222, 1.121, 3.325, NA, 3.455, NA, NA, NA)
)
my_df2 <- my_df %>% group_by(File) %>% filter(n() == sum(!is.na(value)))
Edit : My bad I thought you wanted only group where the value was equal to NA but it was the opposite so by adding the ! before the is.na will give you the expected result