The readr package has a function called parse_number that returns the numbers in a string:
readr::parse_number("Hello 2022!")
[1] 2022
Is there a similar method for returning a date from a string? readr also has a function called parse_date, but it does something different:
readr::parse_date("X2018-01-11_poland")
Warning: 1 parsing failure.
row col expected actual
1 -- date like X2018-01-11_poland
[1] NA
Desired output:
# the raw string is "X2018-01-11_poland"
2018-01-11
P.S. I am not interested in doing this with a regular expression.
CodePudding user response:
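Here is a regex-free idea: split the string at the underscore and parse the first piece, treating the leading X as a literal in the format.
library(readr)
x <- "X2018-01-11_poland"
parse_date(strsplit(x, '_', fixed = TRUE)[[1]][1], format = 'X%Y-%m-%d')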
#[1] "2018-01-11"
However, if the _poland suffix is also fixed, you can include it in the format as well:
parse_date(x, format = 'X%Y-%m-%d_poland')
#[1] "2018-01-11"
CodePudding user response:
The lubridate package has parse_date_time2, which is easy to use.
library(lubridate)
dstring <- "X2018-01-11_poland"
date <- parse_date_time2(dstring, orders='Ymd')
date
#[1] "2018-01-11 UTC"
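The value printed above is a POSIXct date-time rather than a Date, while the question asks for a date. A minimal base R sketch of the conversion follows; the dt object here is a hand-built stand-in for the parse_date_time2 result, not something taken from the answer.

```r
# Stand-in for the POSIXct value shown above (an assumption for illustration).
dt <- as.POSIXct("2018-01-11", tz = "UTC")

# Drop the time-of-day component to get a plain Date.
d <- as.Date(dt)
class(d)
```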
CodePudding user response:
1) This uses only base R and no regular expressions. It assumes that (a) only letters and spaces precede the date, as in the question (this is easily relaxed by adding further characters to lets), and (b) the date is in the standard Date format. chartr translates the i-th character of its first argument into the i-th character of its second, so here every letter is replaced with a space; as.Date then parses the result. Note that as.Date ignores junk at the end, so it is fine if additional characters not in lets follow the date.
x <- "X2018-01-11_poland"
lets <- paste(letters, collapse = "")
as.Date(chartr(lets, strrep(" ", nchar(lets)), tolower(x)))
## [1] "2018-01-11"
2) If we knew that the string always starts with X and the Date appears right after it then we can just specify the prefix in the as.Date format string. It also does not use any regular expressions and only uses base R.
as.Date(x, "X%Y-%m-%d")
## [1] "2018-01-11"
3) If you are willing to compromise and use a very simple regular expression, gsub can remove every non-digit character. Here \D matches any non-digit, and backslashes must be doubled within quotes.
as.Date(gsub("\\D", "", x), "%Y%m%d")
## [1] "2018-01-11"
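As a rough sketch of how options 1 and 2 behave in practice, the snippet below applies option 2 to a whole vector and prints the intermediate string that option 1 hands to as.Date. Only the first input comes from the question; the other two are hypothetical values added for illustration.

```r
# Hypothetical inputs; only the first one appears in the question.
xs <- c("X2018-01-11_poland", "X2019-05-30_germany", "no date here")

# Option 2 on a vector: the literal "X" in the format must match, and
# trailing characters after the day are ignored. Non-matching strings give NA.
dates <- as.Date(xs, format = "X%Y-%m-%d")
dates

# Option 1, intermediate step: every letter is replaced by a space,
# while digits, "-" and "_" are left untouched.
lets <- paste(letters, collapse = "")
blanked <- chartr(lets, strrep(" ", nchar(lets)), tolower(xs[1]))
blanked
```

Because as.Date is vectorised, no explicit loop is needed, and strings that do not match the format come back as NA rather than raising an error.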
CodePudding user response:
Possible alternatives using base R, or stringr and lubridate:
as.Date(substr("X2018-01-11_poland", 2, 11), format = "%Y-%m-%d")
#> [1] "2018-01-11"
library(stringr)
library(lubridate)
ymd(str_sub("X2018-01-11_poland", 2, 11))
#> [1] "2018-01-11"
Created on 2021-12-22 by the reprex package (v2.0.1)