Remove row with zeros in large data set does not work

Time:07-05

I have a large data set, Sachs, which is freely available in the gss package. It has 7466 observations and 12 variables. I want to remove every row that contains at least one zero: if any variable is zero in a given row, the whole row should be dropped across all variables. I know there are many similar questions on this website, and I have tried the approaches they suggest, but none of them works for me. Here is one of my attempts.

library(gss)
data <- data.frame(Sachs[,-12])
dat <- data[apply(data,1, function(x) all(data!= 0.0000000)),]
View(dat)

CodePudding user response:

To remove rows that contain at least one zero, you can use the following code:

library(gss)
data("Sachs")
Sachs[!apply(Sachs==0,1,any),]
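
As for why the attempt in the question returns nothing: the anonymous function receives each row as x but tests the whole data frame (data != 0) instead, so every row gets the same answer. A minimal sketch on a small made-up data frame (toy and its values are hypothetical, not taken from Sachs) shows the difference:

```r
# Hypothetical toy data standing in for Sachs
toy <- data.frame(a = c(1, 0, 3), b = c(4, 5, 0), c = c(7, 8, 9))

# Pattern from the question: the function ignores its argument x and
# tests the whole data frame, so every row gets the same result
keep_wrong <- apply(toy, 1, function(x) all(toy != 0))

# Corrected pattern: test the row that apply() passes in as x
keep_right <- apply(toy, 1, function(x) all(x != 0))

toy[keep_wrong, ]  # no rows are kept, because toy contains a zero somewhere
toy[keep_right, ]  # only the rows without any zero are kept
```

Replacing data with x inside the function should be enough to make the original line behave as intended.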

CodePudding user response:

Or using dplyr:

library(tidyverse)
library(gss)
data("Sachs")

Sachs |> filter(!if_any(everything(), ~ . == 0))
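
A vectorised base R variant is also possible and may be quicker on a data set of this size, since it avoids calling a function once per row. The sketch below uses a small made-up data frame (toy is hypothetical) and assumes there are no missing values; a row containing NA would produce an NA count and would need separate handling.

```r
# Hypothetical toy data standing in for Sachs
toy <- data.frame(a = c(1, 0, 3), b = c(4, 5, 0), c = c(7, 8, 9))

# toy == 0 is a logical matrix; rowSums() counts the zeros in each row
zero_count <- rowSums(toy == 0)

# Keep only the rows whose zero count is 0
toy_clean <- toy[zero_count == 0, ]
toy_clean
```

The same pattern applied to the real data would be Sachs[rowSums(Sachs == 0) == 0, ].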