I have a data frame like this:
df <- data.frame(id = c(1,2,3), date1 = c("2014-Dec 2018","2009-2010","Jan 2009-Aug 2010"), date2 = c("Feb 2016-Dec 2018","2014-Dec 2018","Oct 2013-Dec 2018"))
How can I keep only the rows in which any column contains a year without a month? Months are always abbreviated to three letters.
Example output:
data.frame(id = c(1,2), date1 = c("2014-Dec 2018","2009-2010"), date2 = c("Feb 2016-Dec 2018","2014-Dec 2018"))
CodePudding user response:
Here is one way to do it:
library(dplyr)
df %>%
  filter(!if_all(!id, ~ grepl('^[A-Z][a-z]{2}\\s+', .x)))
  id         date1             date2
1  1 2014-Dec 2018 Feb 2016-Dec 2018
2  2     2009-2010     2014-Dec 2018
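The pattern above only checks how each string starts. As an alternative sketch, assuming that a "year without month" means a four-digit year that is not immediately preceded by three letters and a space, a negative lookbehind can test every year in the string. The names `pattern`, `has_bare_year` and `result` are illustrative and do not come from the original answer.

```r
df <- data.frame(id = c(1, 2, 3),
                 date1 = c("2014-Dec 2018", "2009-2010", "Jan 2009-Aug 2010"),
                 date2 = c("Feb 2016-Dec 2018", "2014-Dec 2018", "Oct 2013-Dec 2018"))

# Assumption: a bare year is four digits not directly preceded by
# three letters plus one whitespace character (e.g. "Dec ").
pattern <- "(?<![A-Za-z]{3}\\s)\\b\\d{4}\\b"

# One logical column per date column; perl = TRUE enables the lookbehind.
has_bare_year <- sapply(df[-1], grepl, pattern = pattern, perl = TRUE)

# Keep rows where at least one date column has a bare year.
result <- df[rowSums(has_bare_year) > 0, ]
print(result)
```

With the sample data this keeps the rows with id 1 and 2, which matches the example output in the question.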
CodePudding user response:
df <- data.frame(id = c(1, 2, 3),
                 date1 = c('2014-Dec 2018', '2009-2010', 'Jan 2009-Aug 2010'),
                 date2 = c('Feb 2016-Dec 2018', '2014-Dec 2018', 'Oct 2013-Dec 2018'))
# Row indices where every date column contains three consecutive letters,
# i.e. a month abbreviation; with this data that selects rows 1 and 3.
idx <- apply(df[, 2:3], 1, function(x) {
  all(stringr::str_detect(x, '[A-Za-z]{3}'))
}) |> which()
df[idx, ] |> print()
CodePudding user response:
Using base R, this keeps the rows in which every date column contains a month abbreviation:
subset(df, Reduce(`&`, lapply(df[-1], grepl,
    pattern = paste(month.abb, collapse = "|"))))
  id             date1             date2
1  1     2014-Dec 2018 Feb 2016-Dec 2018
3  3 Jan 2009-Aug 2010 Oct 2013-Dec 2018