In the toy example below, I want to delete all rows that contain Inf, -Inf or NaN values. My actual data.table has many more columns.
Group<-c("A","B","C","D","E","F","G")
LRR <- c(Inf, 1,2,3,-Inf,4, 5)
LRR.var <- c(NaN, Inf, 3, -Inf, -Inf, 6,7)
data<-data.table(cbind(Group, LRR, LRR.var))
data
Group LRR LRR.var
A Inf NaN
B 1 Inf
C 2 3
D 3 -Inf
E -Inf -Inf
F 4 6
G 5 7
To delete all such rows in one go, I tried the following code, but it fails with an error.
Code:
data[!is.finite(data)]
Error:
Error: default method not implemented for type 'list'
Can someone suggest a method to delete all rows with any NaN or Inf values from data.table in one go?
I do not want to use code like the one below, because it forces me to name every column one by one to check for non-finite values.
data[is.finite(data$LRR) & is.finite(data$LRR.var), ]
CodePudding user response:
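The columns are of class character, so is.finite and is.infinite return FALSE for every element: they only give meaningful results on numeric (or complex) input. The error in the question has a separate cause: is.finite was called on the whole data.table, which is a list, and there is no method for lists. According to ?is.infinite: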
is.infinite returns a vector of the same length as x the jth element of which is TRUE if x[j] is infinite (i.e., equal to one of Inf or -Inf) and FALSE otherwise. This will be false unless x is numeric or complex. Complex numbers are infinite if either the real or the imaginary part is.
> str(data)
Classes ‘data.table’ and 'data.frame': 7 obs. of 3 variables:
$ Group : chr "A" "B" "C" "D" ...
$ LRR : chr "Inf" "1" "2" "3" ...
$ LRR.var: chr "NaN" "Inf" "3" "-Inf" ...
> is.finite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> is.infinite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
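Both failure modes can be reproduced in a few lines of base R. This is only a minimal sketch; the names df, chr and num are made up for illustration and do not come from the question.

```r
# A data.frame (like a data.table) is a list of columns,
# and is.finite() has no method for lists, so this call errors.
df <- data.frame(x = c(1, Inf, NaN))
res <- tryCatch(is.finite(df), error = function(e) conditionMessage(e))
print(res)

# On character input there is no error, just FALSE for every element.
chr <- c("Inf", "1", "2")
print(is.finite(chr))
print(is.infinite(chr))

# Once the strings are converted to numeric, the test behaves as intended.
num <- as.numeric(chr)
print(is.finite(num))
```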
So the columns need to be converted to numeric first. As data is a data.table, we can use data.table methods to subset:
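library(data.table)
# Re-infer column types so the "Inf", "NaN" and digit strings become numeric
data <- type.convert(data, as.is = TRUE)
# Keep rows where every numeric column is finite
data[data[, Reduce(`&`, lapply(.SD, is.finite)), .SDcols = is.numeric]]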
-output
Group LRR LRR.var
1: C 2 3
2: F 4 6
3: G 5 7
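To make the Reduce/lapply step easier to follow, here is the same idea unrolled on a plain data.frame, so it runs without extra packages. It is a sketch, not the answer's exact code: the names df, num_cols, per_col, keep and clean are illustrative.

```r
df <- data.frame(
  Group   = c("A", "B", "C", "D", "E", "F", "G"),
  LRR     = c(Inf, 1, 2, 3, -Inf, 4, 5),
  LRR.var = c(NaN, Inf, 3, -Inf, -Inf, 6, 7),
  stringsAsFactors = FALSE
)

# One logical vector per numeric column: TRUE where the value is finite
num_cols <- names(df)[sapply(df, is.numeric)]
per_col <- lapply(df[num_cols], is.finite)

# Combine the per-column tests with `&`:
# a row is kept only if it is finite in every numeric column
keep <- Reduce(`&`, per_col)

clean <- df[keep, ]
print(clean)
```

Because the numeric columns are selected by type, no column has to be named by hand, which is what the question asked for.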
Note: All the columns are character because cbind on plain vectors builds a matrix, and a matrix can hold only a single type. Because Group is character, the numeric columns are coerced to character as well. Instead, create the data.table or data.frame directly:
data <- data.table(Group, LRR, LRR.var)
> str(data)
Classes ‘data.table’ and 'data.frame': 7 obs. of 3 variables:
$ Group : chr "A" "B" "C" "D" ...
$ LRR : num Inf 1 2 3 -Inf ...
$ LRR.var: num NaN Inf 3 -Inf -Inf ...
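The coercion is easy to check on a two-row example. This is a small base-R sketch using shortened copies of the question's vectors; m and df are illustrative names.

```r
Group <- c("A", "B")
LRR <- c(Inf, 1)

# cbind() on plain vectors returns a matrix, which has a single type
m <- cbind(Group, LRR)
print(is.matrix(m))
print(typeof(m))

# Building the table directly keeps each column's own type
df <- data.frame(Group, LRR, stringsAsFactors = FALSE)
print(sapply(df, class))
```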
Another option is if_all with filter from dplyr:
library(dplyr)
data %>%
filter(if_all(where(is.numeric), is.finite))
Group LRR LRR.var
1: C 2 3
2: F 4 6
3: G 5 7
CodePudding user response:
To avoid the conversion from numeric to character when creating your data.table, you can use cbind.data.frame instead of cbind:
Group<-c("A","B","C","D","E","F","G")
LRR <- c(1, Inf,2,3, -Inf,4, 5)
LRR.var <- c(Inf, Inf, 3, -Inf, -Inf, 6,7)
data<-data.table(cbind.data.frame(Group, LRR, LRR.var))
str(data)
Output:
Classes ‘data.table’ and 'data.frame': 7 obs. of 3 variables:
$ Group : chr "A" "B" "C" "D" ...
$ LRR : num 1 Inf 2 3 -Inf ...
$ LRR.var: num Inf Inf 3 -Inf -Inf ...
- attr(*, ".internal.selfref")=<externalptr>
A possible solution is then to convert the infinite values to NA and drop the affected rows with drop_na from tidyr:
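library(tidyr)  # provides drop_na(); tidyr also re-exports %>%
# Mark every infinite cell as NA, then drop rows containing any NA
is.na(data) <- sapply(data, is.infinite)
data %>% drop_na()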
Output:
Group LRR LRR.var
1: C 2 3
2: F 4 6
3: G 5 7
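The same "turn Inf into NA, then drop incomplete rows" idea can also be written in base R, for anyone who prefers to avoid the extra packages. This is a hedged sketch on the question's data, not the answer's code; df, num_cols and clean are illustrative names. NaN needs no special handling here because is.na(NaN) is TRUE in R.

```r
df <- data.frame(
  Group   = c("A", "B", "C", "D", "E", "F", "G"),
  LRR     = c(Inf, 1, 2, 3, -Inf, 4, 5),
  LRR.var = c(NaN, Inf, 3, -Inf, -Inf, 6, 7),
  stringsAsFactors = FALSE
)

# Replace Inf and -Inf with NA in the numeric columns only
num_cols <- sapply(df, is.numeric)
df[num_cols] <- lapply(df[num_cols], function(col) replace(col, is.infinite(col), NA))

# complete.cases() is FALSE for any row that contains a missing value
clean <- df[complete.cases(df), ]
print(clean)
```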