I have a data frame (dat) with 64 columns that looks like this:
ID A B C
1 NA NA NA
2 5 5 5
3 5 5 NA
I would like to remove rows that contain only NA values in columns 3 to 64 (in the example, columns A, B and C), while ignoring the ID column. The result should look like this:
ID A B C
2 5 5 5
3 5 5 NA
I tried the following code, but it leaves me with an empty data frame:
features <- names(dat)[3:64] # define vector with column names to be filtered at
dat <- dat %>%
filter_at(vars(features), all_vars(!is.na(.)))
Does anyone have an idea why this happens, or how to solve it in a better way?
CodePudding user response:
You can use if_all to remove (!) rows that are all NA between index 3 and 64. Also, note that filter_at is superseded by the use of if_any and if_all.
library(dplyr)
dat %>%
filter(!(if_all(3:64, is.na)))
In your example, this translates to:
dat %>%
filter(!(if_all(2:4, is.na)))
# ID A B C
#1 2 5 5 5
#2 3 5 5 NA
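As for why the original attempt returns nothing: all_vars(!is.na(.)) keeps a row only if every selected column is non-NA, so presumably each row in the real data has at least one NA somewhere in columns 3 to 64. Here is a minimal sketch on the example data from the question, assuming dplyr 1.0.4 or later (for if_any and if_all); the names strict and kept are only for illustration:

```r
library(dplyr)

# Example data from the question
dat <- data.frame(
  ID = 1:3,
  A  = c(NA, 5, 5),
  B  = c(NA, 5, 5),
  C  = c(NA, 5, NA)
)

# Same logic as the original attempt: keep rows where ALL of A..C are non-NA.
# Row 3 is dropped too, because C is NA there.
strict <- dat %>% filter(if_all(A:C, ~ !is.na(.x)))

# Keep rows where AT LEAST ONE of A..C is non-NA.
# This is equivalent to !(if_all(A:C, is.na)).
kept <- dat %>% filter(if_any(A:C, ~ !is.na(.x)))

print(strict)
print(kept)
```

Selecting by name (A:C) instead of by position can be a bit more robust if the column order changes later.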
CodePudding user response:
This should remove all rows that are NA in every column from 3 to 64 (the ID column has to be left out of the count, since it is never NA):
dat[rowSums(is.na(dat[, 3:64])) != length(3:64), ]
CodePudding user response:
We can also use rowMeans:
dat[rowMeans(is.na(dat[, 3:64])) < 1, ]
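A quick check of the base R variants on the example data from the question, as a sketch. Here cols is an illustrative name, and 2:4 plays the role that 3:64 has in the real data:

```r
# Example data from the question
dat <- data.frame(
  ID = 1:3,
  A  = c(NA, 5, 5),
  B  = c(NA, 5, 5),
  C  = c(NA, 5, NA)
)

cols <- 2:4  # columns to check; 3:64 in the real data

# Logical matrix of NA flags for the checked columns only
# (drop = FALSE keeps a data frame even if cols has length 1)
na_mat <- is.na(dat[, cols, drop = FALSE])

# Keep rows where the number of NAs is below the number of checked columns
by_sum <- dat[rowSums(na_mat) != length(cols), ]

# Same idea: keep rows where the share of NAs is below 100%
by_mean <- dat[rowMeans(na_mat) < 1, ]

# Counting NAs over the whole data frame does not work here, because
# ID is never NA and so no row ever reaches ncol(dat) NAs
whole <- dat[rowSums(is.na(dat)) != ncol(dat), ]

print(by_sum)
print(nrow(whole))
```

Unlike dplyr's filter, base subsetting keeps the original row names, so the remaining rows are still labelled 2 and 3.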