Remove data.table rows whose vector elements contain nested NAs

Time:01-03

I need to remove from a data.table any row in which column a contains any NA nested in a vector:

library(data.table)

a <- list(as.numeric(c(NA, NA)), 2, as.numeric(c(3, NA)), c(4, 5))
b <- 11:14

dt <- data.table(a,b)

Thus, rows 1 and 3 should be removed.

I tried three solutions without success:

dt1 <- dt[!is.na(a)] 
dt2 <- dt[!is.na(unlist(a))]
dt3 <- dt[dt[,!Reduce(`&`, lapply(a, is.na))]]

Any ideas? Thank you.

CodePudding user response:

You can do the following:

dt[sapply(dt$a, \(l) !any(is.na(l)))]

This alternative also works on this data, but you will get warnings because all() coerces the numeric vectors to logical:

dt[sapply(dt$a, all)]

Output:

     a  b
1:   2 12
2: 4,5 14
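One caveat with the all() shortcut above, sketched in base R with made-up vectors (x is not from the question): because all() coerces numbers to logical, a vector containing 0 comes out as FALSE even though it has no NA, so that row would be dropped as well. Checking for NA directly avoids this.

```r
# Made-up example data: the first vector has a 0 but no NA
x <- list(c(0, 1), c(4, 5), c(3, NA))

# all() coerces doubles to logical (0 becomes FALSE, NA stays NA)
via_all <- suppressWarnings(sapply(x, all))

# Testing for NA directly keeps the vector that contains 0
via_na <- sapply(x, function(v) !anyNA(v))

print(via_all)
print(via_na)
```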

A third option that you might prefer: move the logic into a separate helper function that takes a list of vectors (nl) and returns a logical vector of the same length as nl, then apply that function as below. In this example, I explicitly call unlist() on the result of lapply() rather than letting sapply() simplify it for me, but sapply() would work as well.

f <- \(nl) unlist(lapply(nl,\(l) !any(is.na(l))))

dt[f(a)]
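If you want the helper to be type-stable, a minimal sketch with vapply() could look like this. The name keep_complete is hypothetical, and the data is the dt from the question. Unlike sapply() or unlist(lapply()), vapply() with logical(1) returns a logical vector even when the input list is empty.

```r
library(data.table)

# Data from the question
a <- list(as.numeric(c(NA, NA)), 2, as.numeric(c(3, NA)), c(4, 5))
b <- 11:14
dt <- data.table(a, b)

# Hypothetical helper: TRUE for each list element that contains no NA.
# logical(1) tells vapply() that every call must return a single logical value.
keep_complete <- function(nl) vapply(nl, function(l) !anyNA(l), logical(1))

# Keep only the rows whose vector in column a is free of NA
res <- dt[keep_complete(a)]
print(res)
```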

CodePudding user response:

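An alternative to *apply(). It groups by row with by = .I, so it needs a data.table version that supports that, and the trailing [, !"I"] drops the I column that the grouping adds: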

dt[, .SD[!anyNA(a, TRUE)], by = .I][, !"I"]

#         a     b
#    <list> <int>
# 1:      2    12
# 2:    4,5    14
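Why recursive = TRUE matters here, as a small base R sketch (x and one_row are illustrative stand-ins for the list column, not code from the answers): on a list, is.na() and the default anyNA() only flag elements that are themselves a single NA, so they never look inside a longer vector. That is also why !is.na(a) in the question keeps every row.

```r
# Stand-in for the list column a from the question
x <- list(c(NA, NA), 2, c(3, NA), c(4, 5))

flat   <- is.na(x)                      # element-wise on the list itself
nested <- vapply(x, anyNA, logical(1))  # looks inside each vector

# Stand-in for the one-row list that a single by = .I group sees
one_row <- list(c(3, NA))
shallow <- anyNA(one_row)                    # does not descend into the vector
deep    <- anyNA(one_row, recursive = TRUE)  # descends into the vector

print(flat)
print(nested)
print(c(shallow = shallow, deep = deep))
```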