Extract numbers from a string then make it as a date

Time:04-24

Hello, I am trying to extract the year, month, and day from the following string:

"2020y 3m 1d 16h"

and I would like an output like the following:

"2020-03-01" (or "2020-3-1" but a date type)

I've tried searching Google, but I only found approaches that either extract by a fixed pattern (most of them relied on punctuation as the separator) or extract all the numbers (in which case I had a hard time dropping the 16, etc.).

Can somebody please help me out with this?

Thank you so much in advance!

CodePudding user response:

We can first remove the " __h" part from the input, then use the ymd() function from the lubridate package to turn the rest into a date.

Regex:

  • \\s matches any whitespace character (here, the space before "16h")
  • \\d{1,2} matches a digit occurring one or two times (the hour ranges from 00 to 23 or 24, so it has at most two digits)
  • h matches the literal letter "h"

library(lubridate)

ymd(gsub("\\s\\d{1,2}h", "", "2020y 3m 1d 16h"))
[1] "2020-03-01"

class(ymd(gsub("\\s\\d{1,2}h", "", "2020y 3m 1d 16h")))
[1] "Date"
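If you would rather not depend on lubridate, the same idea can be sketched in base R by capturing the numbers in front of "y", "m", and "d" and ignoring whatever follows the day. The function name extract_date and its pattern are hypothetical, written only for illustration, and assume a four-digit year followed by one- or two-digit month and day values.

```r
# Hypothetical helper (not from the original answers): capture the numbers
# that precede "y", "m" and "d", and discard everything after the day.
extract_date <- function(x) {
  pattern <- "^\\s*(\\d{4})y\\s*(\\d{1,2})m\\s*(\\d{1,2})d.*$"
  # Rebuild the three captured groups as "year-month-day"
  iso <- sub(pattern, "\\1-\\2-\\3", x, perl = TRUE)
  # Inputs that do not match the pattern become NA instead of being misparsed
  iso[!grepl(pattern, x, perl = TRUE)] <- NA_character_
  as.Date(iso, format = "%Y-%m-%d")
}

extract_date("2020y 3m 1d 16h")
# [1] "2020-03-01"
```

Because the hour is never captured, there is no need to strip it in a separate step, and the result has class "Date".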

CodePudding user response:

Convert the y, m, and d characters to dashes, then use as.POSIXct to convert the result to a date-time class. The spaces might not be present if the month or day were at or above 10.

test <- "2020y 3m 1d 16h"
# Replace each y/m/d marker and an optional following space with a dash
as.POSIXct(gsub("[ymd] ?", "-", test), format = "%Y-%m-%d-%Hh")
#[1] "2020-03-01 16:00:00 CST"

This also succeeds with input like:

test <- "2020y12m 1d 16h"

...whereas the answer from benson23 (the lubridate approach above) fails. If you intend to throw away the hour information, the format string could be:

..., format="%Y-%m-%d"

as.POSIXct(gsub("[ymd] ?", "-", test), format = "%Y-%m-%d")

In general, you should provide a larger set of possible inputs so that answers can be tested against them.
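As a rough illustration of that kind of testing, the sketch below runs the dash-replacement approach over several inputs at once. The input strings are made up for this example, and as.Date is used in place of as.POSIXct so that the result has the "Date" class the question asked for; as.Date ignores the trailing hour text once the format has been satisfied.

```r
# Illustrative inputs (assumed, not from the question): spacing and digit
# counts vary between entries
inputs <- c("2020y 3m 1d 16h", "2020y12m 1d 16h",
            "2020y 3m15d 2h", "2021y11m30d 23h")

# Replace each y/m/d marker (and an optional following space) with a dash
dashed <- gsub("[ymd] ?", "-", inputs)

# Parse only the year-month-day part; the trailing "-16h" etc. is ignored
dates <- as.Date(dashed, format = "%Y-%m-%d")

data.frame(input = inputs, date = dates)
```

Any input that comes back as NA, or as an unexpected date, points to a format the pattern does not yet handle.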
