Home > Back-end >  R - Remove all row that only have NA
R - Remove all row that only have NA

Time:01-03

I have this type of dataset with ID, age and gender and then questions the ID's have responded in a survey:

> ID gender age Q1 Q2 Q3
  1  male   35  1  1   NA
  2  male   45  NA NA  NA
  3 female  55  NA NA  NA

I want to remove all rows where all Q´s is NA, so in this case the result will be:

> ID gender age Q1 Q2 Q3
  1  male   35  1  1   NA

How can I do so with data.table?

CodePudding user response:

library(tidyverse)

df %>%
  filter(if_any(starts_with("Q"), ~ !is.na(.x)))

# A tibble: 1 × 6
     ID gender   age    Q1    Q2 Q3   
  <dbl> <chr>  <dbl> <dbl> <dbl> <lgl>
1     1 male      35     1     1 NA   

CodePudding user response:

dt[rowSums(is.na(dt[, paste0("Q", 1:3)])) != length(paste0("Q", 1:3)), ]

Or to make it agnostic over the names:

cols <- startsWith(colnames(dt), "Q")
dt[rowSums(is.na(dt[, ..cols])) != sum(cols), ]

CodePudding user response:

One data.table option:

DT[ DT[, apply(.SD, 1, \(x) any(!is.na(x))), .SDcols = q1:q3] ]
  • Related