R filter_at: remove rows with specific multiple columns with only NAs

Time:09-27

I have a data frame (dat) with 64 columns that looks like this:

   ID A  B  C 
   1  NA NA NA
   2  5  5  5
   3  5  5  NA

I would like to remove rows that contain only NA values in columns 3 to 64 (in the example, columns A, B and C), while ignoring the ID column. The result should look like this:

   ID A  B  C 
   2  5  5  5
   3  5  5  NA

I tried the following code, but it leaves me with an empty data frame:

features <- names(dat)[3:64] # define vector with column names to be filtered at
dat <- dat %>%
  filter_at(vars(features), all_vars(!is.na(.)))

Does anyone have an idea why, or how to solve this in a better way?

CodePudding user response:

You can use if_all to remove (!) rows that are all NA in columns 3 to 64. Also note that filter_at is superseded in favor of if_any and if_all.

library(dplyr)
dat %>% 
  filter(!(if_all(3:64, is.na)))

In your example, this translates to:

dat %>% 
  filter(!(if_all(2:4, is.na)))

#  ID A B  C
#1  2 5 5  5
#2  3 5 5 NA
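
As for why the original attempt returned an empty data frame: all_vars(!is.na(.)) keeps a row only when every selected column is non-NA, so a row with even one NA is dropped. With 62 columns, presumably every row of the real data has at least one NA. Below is a minimal sketch on the question's example. The toy data frame is rebuilt by hand, and columns A, B, C (positions 2:4) stand in for columns 3:64. It compares the original filter with the any_vars() form and the equivalent if_any() form.

```r
library(dplyr)

# Toy data rebuilt from the question (an assumption for illustration)
dat <- data.frame(
  ID = 1:3,
  A  = c(NA, 5, 5),
  B  = c(NA, 5, 5),
  C  = c(NA, 5, NA)
)

# Original attempt: keeps only rows with no NA at all in A, B, C
strict <- dat %>% filter_at(vars(A, B, C), all_vars(!is.na(.)))

# Keep rows with at least one non-NA value (superseded scoped verb)
kept_old <- dat %>% filter_at(vars(A, B, C), any_vars(!is.na(.)))

# Same filter written with if_any
kept_new <- dat %>% filter(if_any(2:4, ~ !is.na(.x)))
```

Here strict retains only the row with ID 2, while kept_old and kept_new both retain the rows with ID 2 and 3.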

CodePudding user response:

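This removes all rows that are entirely NA in columns 3 to 64. The check has to be restricted to those columns, because ID is never NA and a check over the whole data frame would keep every row:

dat[rowSums(is.na(dat[, 3:64])) != length(3:64), ]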

CodePudding user response:

We can also use rowMeans, keeping the rows where the fraction of NA values in columns 3 to 64 is below 1:

dat[rowMeans(is.na(dat[, 3:64])) < 1, ]
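
A quick way to check the base R approaches is to run them on the question's example. This is only a sketch: the toy data frame and the cols helper are assumptions for illustration, with positions 2:4 standing in for 3:64.

```r
# Toy data rebuilt from the question (an assumption for illustration)
dat <- data.frame(
  ID = 1:3,
  A  = c(NA, 5, 5),
  B  = c(NA, 5, 5),
  C  = c(NA, 5, NA)
)

cols <- 2:4  # hypothetical helper: positions of the columns to check

# Logical matrix marking NA cells in the checked columns only
na_mat <- is.na(dat[, cols, drop = FALSE])

# Keep rows where not every checked column is NA
by_sums  <- dat[rowSums(na_mat) != length(cols), ]
by_means <- dat[rowMeans(na_mat) < 1, ]
```

Both by_sums and by_means contain the rows with ID 2 and 3. Note that base R subsetting keeps the original row names (2 and 3) rather than renumbering them.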