I have tried
- subset(df,df$date >= as.Date('2008-01-01'),na.rm = FALSE)
- subset(df,df$date >= as.Date('2008-01-01'),na.omit = FALSE)
but I'm also losing all the people who have NA dates. Please suggest a way to keep them.
CodePudding user response:
If you look at the ?subset help page, it doesn't have any arguments named na.rm or na.omit. Those aren't magic keywords. They're common arguments that some (but not all) functions take, and you need to look at the function's help page to see if they work with a certain function.
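To see why the NA rows disappear, here is a minimal sketch on a made-up data frame (the id column and its values are assumptions for illustration, not the asker's data). A comparison against NA returns NA, and subset treats an NA condition as false, so those rows are dropped. The extra na.rm = FALSE appears to be swallowed without any effect:

```r
# Hypothetical toy data; column names and values are illustrative only.
df <- data.frame(
  id   = 1:4,
  date = as.Date(c("2007-06-15", "2008-03-01", NA, "2009-11-30"))
)

# Comparing NA with anything yields NA, not TRUE or FALSE.
cond <- df$date >= as.Date("2008-01-01")

# subset() treats an NA condition as FALSE, so the NA row is dropped.
# na.rm is not an argument of subset(); passing it changes nothing.
with_arg    <- subset(df, date >= as.Date("2008-01-01"), na.rm = FALSE)
without_arg <- subset(df, date >= as.Date("2008-01-01"))

print(cond)
print(with_arg)
```

Both calls should return the same rows, which suggests the only way to keep the NA rows is to ask for them explicitly in the condition.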
Also, the point of using subset rather than just [ is that you don't have to use data$ after passing the data argument.
subset(df, date >= "2008-01-01" | is.na(date))
This should work to keep rows where the date is >= 2008-01-01 OR where the date is NA.
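As a quick check, here is a small sketch on invented data (the id column, the values, and the cutoff variable are assumptions for illustration). It applies the condition above and also shows the [ equivalent, where the columns need the df$ prefix:

```r
# Hypothetical toy data; column names and values are illustrative only.
df <- data.frame(
  id   = 1:5,
  date = as.Date(c("2007-06-15", "2008-01-01", NA, "2009-11-30", NA))
)
cutoff <- as.Date("2008-01-01")

# Keep rows on or after the cutoff, plus rows where date is NA.
kept <- subset(df, date >= cutoff | is.na(date))

# Same filter with `[`; here every column reference needs df$.
# NA | TRUE is TRUE, so the NA rows give a clean TRUE in the index.
kept_bracket <- df[df$date >= cutoff | is.na(df$date), ]

print(kept)
```

Only the row dated before the cutoff is removed; the NA rows stay in both versions.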
CodePudding user response:
Here is an example using filter from the dplyr package instead of subset:
library(dplyr)
# create tibble
dat <- tibble(x = c(rep(as.Date('2008-01-01'),10)))
# add NA to tibble
set.seed(123)
df <- as.data.frame(lapply(dat, \(x) replace(x, sample(length(x), .3*length(x)), NA)))
# keep rows where x is 2008-01-01 or NA
df %>%
  filter(x == "2008-01-01" | is.na(x))