Here is a representation of my dataset:
Number<-c(1:10)
AA<-c(head(LETTERS,4), rep(NA,6))
BB<-c(head(letters,6), rep(NA,4))
CC<-c(1:6, rep(NA,4))
DD<-c(10:14, rep(NA,5))
EE<-c(3:8, rep(NA,4))
FF<-c(6:1, rep(NA,4))
mydata<-data.frame(Number,AA,BB,CC,DD,EE,FF)
I want to delete all the IDs (Number)
that have no information in the remaining columns, automatically. I want to tell the function that if there is a value in Number
but there is only NA
in all the remaining columns, delete the row.
I must have the dataframe below:
Number AA BB CC DD EE FF
1 1 A a 1 10 3 6
2 2 B b 2 11 4 5
3 3 C c 3 12 5 4
4 4 D d 4 13 6 3
5 5 <NA> e 5 14 7 2
6 6 <NA> f 6 NA 8 1
CodePudding user response:
Another possible base R solution:
mydata[rowSums(is.na(mydata[,-1])) != ncol(mydata[,-1]), ]
Output
Number AA BB CC DD EE FF
1 1 A a 1 10 3 6
2 2 B b 2 11 4 5
3 3 C c 3 12 5 4
4 4 D d 4 13 6 3
5 5 <NA> e 5 14 7 2
6 6 <NA> f 6 NA 8 1
Or we could use apply
:
mydata[!apply(mydata[,-1], 1, function(x) all(is.na(x))),]
CodePudding user response:
We can use if_all/if_all
library(dplyr)
mydata %>%
filter(if_any(-Number, complete.cases))
-output
Number AA BB CC DD EE FF
1 1 A a 1 10 3 6
2 2 B b 2 11 4 5
3 3 C c 3 12 5 4
4 4 D d 4 13 6 3
5 5 <NA> e 5 14 7 2
6 6 <NA> f 6 NA 8 1
or
mydata %>%
filter(!if_all(-Number, is.na))
Or with base R
subset(mydata, rowSums(!is.na(mydata[-1])) >0 )
Number AA BB CC DD EE FF
1 1 A a 1 10 3 6
2 2 B b 2 11 4 5
3 3 C c 3 12 5 4
4 4 D d 4 13 6 3
5 5 <NA> e 5 14 7 2
6 6 <NA> f 6 NA 8 1
CodePudding user response:
A possible solution, using janitor::remove_empty
:
library(dplyr)
library(janitor)
inner_join(mydata, remove_empty(mydata[-1], which = "rows"))
#> Joining, by = c("AA", "BB", "CC", "DD", "EE", "FF")
#> Number AA BB CC DD EE FF
#> 1 1 A a 1 10 3 6
#> 2 2 B b 2 11 4 5
#> 3 3 C c 3 12 5 4
#> 4 4 D d 4 13 6 3
#> 5 5 <NA> e 5 14 7 2
#> 6 6 <NA> f 6 NA 8 1
CodePudding user response:
Try this:
df <- df[,colSums(is.na(df))<nrow(df)]
This makes a copy of your data though. If you have a large dataset then you can use:
Filter(function(x)!all(is.na(x)), df)
and depending on your approach you can use
library(data.table)
DT <- as.data.table(df)
DT[,which(unlist(lapply(DT, function(x)!all(is.na(x))))),with=F]
If you want to use a data.table which is usually a pretty solid go-to