I would like to be able to find columns that ONLY have ALL zero values and replace these with NAs. Found a partial solution here: Replace all zero columns with NA but my data frame also contains columns with all NAs and this fails. Here is an example:
DF <- tibble(
y1 = c(0.0, 0.0, 0.0, 0.0, 0.0),
y2 = c(0.0, 0.75, 1.5, 3.5, 6.0),
y3 = c(0.0, 1.0, 2.0, 4.0, 3.0),
y4 = c(NA, NA, NA, NA, NA))
DF %>% mutate_all(~case_when(all(. == 0, na.rm = TRUE) ~ NA_real_, TRUE ~ .))
So the goal is to change y1 to all NAs and leave everything else as is. This code works if DF is just y1:y3. Including the NA column throws an error. Note that I altered the original solution to include na.rm in the all() function, but this does not solve the issue. Error returned is
Caused by error in `` names(message) <- `*vtmp*` ``:
! 'names' attribute [1] must be the same length as the vector [0]
CodePudding user response:
A possible solution (notice that mutate_all
is superseded):
library(dplyr)
DF %>%
mutate(across(y1:y4, ~ {if (all(.x == 0, na.rm = T)) NA_real_ else .x}))
#> # A tibble: 5 × 4
#> y1 y2 y3 y4
#> <dbl> <dbl> <dbl> <dbl>
#> 1 NA 0 0 NA
#> 2 NA 0.75 1 NA
#> 3 NA 1.5 2 NA
#> 4 NA 3.5 4 NA
#> 5 NA 6 3 NA
CodePudding user response:
Using base R
DF[!colSums(DF != 0, na.rm = TRUE)] <- NA
-output
> DF
# A tibble: 5 × 4
y1 y2 y3 y4
<lgl> <dbl> <dbl> <lgl>
1 NA 0 0 NA
2 NA 0.75 1 NA
3 NA 1.5 2 NA
4 NA 3.5 4 NA
5 NA 6 3 NA
With dplyr
, we can use a condition in where
DF %>%
mutate(across(where(~ all(.x== 0, na.rm = TRUE)), ~ NA))
-output
# A tibble: 5 × 4
y1 y2 y3 y4
<lgl> <dbl> <dbl> <lgl>
1 NA 0 0 NA
2 NA 0.75 1 NA
3 NA 1.5 2 NA
4 NA 3.5 4 NA
5 NA 6 3 NA