I have data with many columns, but I want to remove NA values from specific columns. Suppose I have a df such as the one below:
structure(list(id = c(1, 1, 2, 2), admission = c("2001/01/01",
"2001/03/01", "NA", "2005/01/01"), discharged = c("2001/01/07",
"NA", "NA", "2005/01/03")), class = "data.frame", row.names = c(NA,
-4L))
and I want to exclude the records for each id that contain NAs, to get a df such as the one below:
structure(list(id2 = c(1, 2), admission2 = c("2001/01/01", "2005/01/01"
), discharged2 = c("2001/01/07", "2005/01/03")), class = "data.frame", row.names = c(NA,
-2L))
CodePudding user response:
The NAs in the data are the string "NA". Convert them to NA with is.na and then use na.omit:
is.na(df1) <- df1 == "NA"
df1 <- na.omit(df1)
df1
  id  admission discharged
1  1 2001/01/01 2001/01/07
4  2 2005/01/01 2005/01/03
If only specific columns should be checked, use
df1[complete.cases(df1[c("admission", "discharged")]),]
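Putting the two steps together for specific columns only, here is a minimal sketch. It assumes the data frame is called df1 as above; the cols vector and the result name are just illustrative choices, and the conversion of "NA" is restricted to the chosen columns.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  id = c(1, 1, 2, 2),
  admission = c("2001/01/01", "2001/03/01", "NA", "2005/01/01"),
  discharged = c("2001/01/07", "NA", "NA", "2005/01/03")
)

# Columns to check for missing values (illustrative choice)
cols <- c("admission", "discharged")

# Turn the literal string "NA" into a real missing value, only in those columns
df1[cols] <- lapply(df1[cols], function(x) replace(x, x %in% "NA", NA))

# Keep only the rows that have no missing value in the chosen columns
result <- df1[complete.cases(df1[cols]), ]
result
```

Any other column is left untouched, so a missing value outside cols would not cause a row to be dropped.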
CodePudding user response:
Base R option:
- convert the character "NA" to NA
- then use complete.cases
df[df=="NA"] <- NA
dfNA <- complete.cases(df)
df[dfNA,]
  id  admission discharged
1  1 2001/01/01 2001/01/07
4  2 2005/01/01 2005/01/03
CodePudding user response:
We can also do it like this, using tidyr:
library(tidyr)
df1[df1=="NA"] <- NA
df1_new <- df1 %>% drop_na()
df1_new
  id  admission discharged
1  1 2001/01/01 2001/01/07
2  2 2005/01/01 2005/01/03
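Since the question asks about specific columns, drop_na can also be given column names so that only those columns are checked. A small sketch, assuming tidyr is installed and reusing the example data from the question:

```r
library(tidyr)

# Rebuild the example data from the question
df1 <- data.frame(
  id = c(1, 1, 2, 2),
  admission = c("2001/01/01", "2001/03/01", "NA", "2005/01/01"),
  discharged = c("2001/01/07", "NA", "NA", "2005/01/03")
)

# Replace the literal string "NA" with a real missing value
df1[df1 == "NA"] <- NA

# Drop rows that are missing in admission or discharged only
df1_new <- drop_na(df1, admission, discharged)
df1_new
```

With no column names, drop_na checks every column, which is what the version above does.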