I have this type of dataset with ID, age and gender and then questions the ID's have responded in a survey:
> ID gender age Q1 Q2 Q3
1 male 35 1 1 NA
2 male 45 NA NA NA
3 female 55 NA NA NA
I want to remove all rows where all Q´s is NA, so in this case the result will be:
> ID gender age Q1 Q2 Q3
1 male 35 1 1 NA
How can I do so with data.table
?
CodePudding user response:
library(tidyverse)
df %>%
filter(if_any(starts_with("Q"), ~ !is.na(.x)))
# A tibble: 1 × 6
ID gender age Q1 Q2 Q3
<dbl> <chr> <dbl> <dbl> <dbl> <lgl>
1 1 male 35 1 1 NA
CodePudding user response:
dt[rowSums(is.na(dt[, paste0("Q", 1:3)])) != length(paste0("Q", 1:3)), ]
Or to make it agnostic over the names:
cols <- startsWith(colnames(dt), "Q")
dt[rowSums(is.na(dt[, ..cols])) != sum(cols), ]
CodePudding user response:
One data.table option:
DT[ DT[, apply(.SD, 1, \(x) any(!is.na(x))), .SDcols = q1:q3] ]