I have a dataset and I would like to delete the rows in which columns 456:555 are all NA. I want to keep rows that have some NAs in those columns and drop only the rows where every one of them is NA.
I have tried
final[complete.cases(final[ , 456:555]),]
but this doesn't work. It says
Error in help.search(c("[", "final", "complete.cases(final[, c(456:555)])", : argument ‘pattern’ must be a single character string
I then thought this would probably work:
data[rowSums(is.na(data)) != ncol(data),]
but I don't know where to put 456:555 in it.
What should I do?
Thanks!
CodePudding user response:
Maybe you can do something like this, although it is not the cleanest approach:
# data frame with one row complete NA
df <- data.frame(V1 = c(NA, 3, NA, 2, 3),
V2 = c(NA, 3, 1, NA, 5),
V3 = c(NA, NA, NA ,NA, NA))
df
V1 V2 V3
1 NA NA NA
2 3 3 NA
3 NA 1 NA
4 2 NA NA
5 3 5 NA
old_df <- df[4:5,] # get rows you wanna keep regardless of number of NAs
new_df <- df[1:3,] # get rows where you wanna delete complete NAs
# "delete" complete NAs (dplyr provides %>% and filter())
library(dplyr)
new_df <- new_df %>%
  filter(is.na(new_df) %>% rowSums() != length(new_df))
# bind the two data frames back together
df <- rbind(old_df, new_df)
df
V1 V2 V3
4 2 NA NA
5 3 5 NA
1 3 3 NA
2 NA 1 NA
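The example above splits the data by rows, whereas the question asks about a range of columns. As a rough sketch of how the same dplyr idea could be pointed at columns instead, the snippet below uses `if_any()` (this assumes dplyr 1.0.4 or later). The `cols` variable and the `kept` name are illustrative only, and `2:3` stands in for `456:555` on the toy data.

```r
library(dplyr)

# Same toy data as above
df <- data.frame(V1 = c(NA, 3, NA, 2, 3),
                 V2 = c(NA, 3, 1, NA, 5),
                 V3 = c(NA, NA, NA, NA, NA))

cols <- 2:3  # assumption: stand-in for 456:555 in the real data

# Keep a row if at least one of the chosen columns is not NA,
# i.e. drop only the rows that are NA in every chosen column
kept <- df %>%
  filter(if_any(all_of(cols), ~ !is.na(.x)))

kept
```

Rows that are only partly NA in the chosen columns are kept, and values outside `cols` (here `V1`) play no part in the decision.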
CodePudding user response:
Here is one simple solution with the `sjmisc` package:
df[!apply(df[456:555],1,sjmisc::all_na),]
To check that it does what you want, please find a little reprex:
REPREX
df <- data.frame(V1 = c(NA, 3, NA, 2, 3),
V2 = c(NA, 3, 1, NA, 5),
V3 = c(NA, NA, NA ,NA, NA))
df
#> V1 V2 V3
#> 1 NA NA NA
#> 2 3 3 NA
#> 3 NA 1 NA
#> 4 2 NA NA
#> 5 3 5 NA
# Select all rows except those that are `all_na` in the selected columns:
df[!apply(df[2:3],1,sjmisc::all_na),]
#> V1 V2 V3
#> 2 3 3 NA
#> 3 NA 1 NA
#> 5 3 5 NA
Created on 2021-10-11 by the reprex package (v2.0.1)
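If adding a package is not desirable, the `rowSums(is.na(...))` expression from the question can probably be adapted by restricting it to the columns of interest. This is a minimal base R sketch on the same toy data. `cols`, `all_na_rows` and `result` are names made up for the illustration, and `2:3` again stands in for `456:555`.

```r
# Same toy data as in the reprex
df <- data.frame(V1 = c(NA, 3, NA, 2, 3),
                 V2 = c(NA, 3, 1, NA, 5),
                 V3 = c(NA, NA, NA, NA, NA))

cols <- 2:3  # assumption: replace with 456:555 for the real data

# Count the NAs per row within the chosen columns only;
# a row is "all NA" when that count equals the number of chosen columns
all_na_rows <- rowSums(is.na(df[, cols, drop = FALSE])) == length(cols)

# Keep every row that is not entirely NA in those columns
result <- df[!all_na_rows, ]
result
```

Note that the comparison uses `length(cols)` rather than `ncol(df)`, since only the selected columns are being counted.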