Is there a command to subset rows that contain a time string in a specific format?

Time:10-23

I have a data frame like this:

df <- data.frame(id = c(1,2,3), date1 = c("2014-Dec 2018","2009-2010","Jan 2009-Aug 2010"), date2 = c("Feb 2016-Dec 2018","2014-Dec 2018","Oct 2013-Dec 2018"))

How can I keep only the rows in which any column contains a year without a month? Months are written as three-letter abbreviations.

Example output:

data.frame(id = c(1,2), date1 = c("2014-Dec 2018","2009-2010"), date2 = c("Feb 2016-Dec 2018","2014-Dec 2018"))

CodePudding user response:

Here is one way to do it:

library(dplyr)

# Drop rows where every date column starts with a three-letter month and whitespace.
df %>%
  filter(!if_all(!id, ~ grepl('^[A-Z][a-z]{2}\\s+', .x)))

  id         date1             date2
1  1 2014-Dec 2018 Feb 2016-Dec 2018
2  2     2009-2010     2014-Dec 2018
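
The pattern above only tests the start of each string, so a value such as "Jan 2009-2010" would count as having a month even though its end is a bare year. A minimal base R sketch that checks both ends, assuming every value is a "start-end" range separated by a hyphen; `has_bare_year` is a hypothetical helper name, not part of any package:

```r
# Sample data from the question.
df <- data.frame(
  id = c(1, 2, 3),
  date1 = c("2014-Dec 2018", "2009-2010", "Jan 2009-Aug 2010"),
  date2 = c("Feb 2016-Dec 2018", "2014-Dec 2018", "Oct 2013-Dec 2018")
)

# Hypothetical helper: TRUE when either end of a "start-end" range is a
# four-digit year with nothing but optional whitespace in front of it.
has_bare_year <- function(x) {
  grepl("(^|-)\\s*[0-9]{4}\\s*(-|$)", x, perl = TRUE)
}

# Keep a row when any date column (everything except id) has a bare year.
keep <- Reduce(`|`, lapply(df[-1], has_bare_year))
result <- df[keep, ]
print(result)
```

With the question's data this should keep the rows with id 1 and 2.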

CodePudding user response:

df <- data.frame(id = c(1, 2, 3), 
                 date1 = c('2014-Dec 2018', '2009-2010', 'Jan 2009-Aug 2010'), 
                 date2 = c('Feb 2016-Dec 2018', '2014-Dec 2018', 'Oct 2013-Dec 2018'))
# Row indices where every date column contains three consecutive letters
# (a month abbreviation).
idx <- apply(df[ , 2:3 ], 1, function(x) {
  all(stringr::str_detect(x, '[A-Za-z]{3}'))
}) |> which()
df[ idx, ] |> print()

CodePudding user response:

Using base R:

# Keep rows where every column except id contains a month abbreviation.
subset(df1, Reduce(`&`, lapply(df1[-1], grepl, 
    pattern = paste(month.abb, collapse = "|"))))
  id             date1             date2
1  1     2014-Dec 2018 Feb 2016-Dec 2018
3  3 Jan 2009-Aug 2010 Oct 2013-Dec 2018
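
The filter above keeps rows in which every date column mentions a month, which gives rows 1 and 3 here. If the goal is instead the question's example output (rows 1 and 2), one possible adaptation reuses `month.abb`: strip every "month year" piece and look for a year that is left over. This is only a sketch, assuming months and years are separated by a single space; `year_without_month` is a hypothetical helper name.

```r
# Sample data from the question.
df <- data.frame(
  id = c(1, 2, 3),
  date1 = c("2014-Dec 2018", "2009-2010", "Jan 2009-Aug 2010"),
  date2 = c("Feb 2016-Dec 2018", "2014-Dec 2018", "Oct 2013-Dec 2018")
)

# Pattern for "month abbreviation, space, four-digit year", built from month.abb.
month_year <- paste0("(", paste(month.abb, collapse = "|"), ") [0-9]{4}")

# Hypothetical helper: remove every "Mon YYYY" piece; any four-digit number
# still present is a year that had no month attached.
year_without_month <- function(x) {
  grepl("[0-9]{4}", gsub(month_year, "", x))
}

# Keep a row when any date column has such a leftover year.
keep <- Reduce(`|`, lapply(df[-1], year_without_month))
result <- df[keep, ]
print(result)
```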