Regex issue retrieving hour

Time:11-19

I have this regex to extract some data from a paragraph.

([0-9]{1,2}:{0,1}[0-9]{0,2}[a-z]{0,2})[\s\D\s]+([0-9]{1,2}:{0,1}[0-9]{0,2}[a-z]{0,2}),(.+),(\s\w{1,2} de [\wç]+ de \d{4})?(\s\w+ \d{1,2}, \d{4})?$

I need to get the two hours, the title, and the date from texts like these:

EXAMPLE 1: In this example the number 130 is causing the issue, and I can't get the first hour.

1:30pm to 4:30pm, Aniversário amigo matteo, Ana Montoya, Accepted, Location: Kids Buffet Infantil
Rua do Triunfo, 130, Brookling, Hello - SP, 04602-005, Brasil, November 23, 2022

EXAMPLE 2: This one works correctly.

8am to 9:30am, All Hearts meeting, Ana Montoya, Accepted, Location: https://us02web.zoom.us/j/1234?pwd=1234, November 21, 2022

Expected result: the two hours, the text of the title, and the final date.

CodePudding user response:

Here is a modified regex with your sample input strings:

[
  '1:30pm to 4:30pm, Aniversário amigo matteo, Ana Montoya, Accepted, Location: Kids Buffet Infantil Rua do Triunfo, 130, Brookling, Hello - SP, 04602-005, Brasil, November 23, 2022',
  '8am to 9:30am, All Hearts meeting, Ana Montoya, Accepted, Location: https://us02web.zoom.us/j/1234?pwd=1234, November 21, 2022'
].forEach(str => {
  let m = str.match(/^(\d\d?(?::\d\d)?[ap]m) to (\d\d?(?::\d\d)?[ap]m), *([^,]+).* ([a-z]+ \d+, \d{4})/i);
  console.log(m);
});

Output:

[
  "1:30pm to 4:30pm, Aniversário amigo matteo, Ana Montoya, Accepted, Location: Kids Buffet Infantil Rua do Triunfo, 130, Brookling, Hello - SP, 04602-005, Brasil, November 23, 2022",
  "1:30pm",
  "4:30pm",
  "Aniversário amigo matteo",
  "November 23, 2022"
]
[
  "8am to 9:30am, All Hearts meeting, Ana Montoya, Accepted, Location: https://us02web.zoom.us/j/1234?pwd=1234, November 21, 2022",
  "8am",
  "9:30am",
  "All Hearts meeting",
  "November 21, 2022"
]
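
One caveat: in the question, Example 1 spans two lines, whereas the sample string used here has that line break replaced by a space. In JavaScript, `.` does not match a newline unless the `s` flag is set, so the `.*` part of the pattern would fail on the raw two-line text. A minimal sketch, assuming the input may contain line breaks, collapses whitespace before matching (`EVENT_RE` and `parseLine` are hypothetical names used only for illustration):

```javascript
// Same pattern as in this answer; only the pre-processing step is new.
const EVENT_RE = /^(\d\d?(?::\d\d)?[ap]m) to (\d\d?(?::\d\d)?[ap]m), *([^,]+).* ([a-z]+ \d+, \d{4})/i;

// Hypothetical helper: collapse every run of whitespace (newlines included)
// into a single space, then apply the pattern.
function parseLine(text) {
  const normalized = text.replace(/\s+/g, ' ').trim();
  return normalized.match(EVENT_RE);
}

// Example 1 from the question, with its line break kept as "\n".
const raw =
  '1:30pm to 4:30pm, Aniversário amigo matteo, Ana Montoya, Accepted, Location: Kids Buffet Infantil\n' +
  'Rua do Triunfo, 130, Brookling, Hello - SP, 04602-005, Brasil, November 23, 2022';

console.log(raw.match(EVENT_RE));      // null, because `.` stops at the newline
console.log(parseLine(raw).slice(1));  // the four captured groups
```

Adding the `s` flag to the pattern would be an alternative; normalizing first also keeps any captured text free of embedded line breaks.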

Explanation of regex:

  • ^ -- anchor at start of string
  • ( -- capture group 1 start
  • \d\d? -- 1 or 2 digits
  • (?::\d\d)? -- optional non-capture group for colon and 2 digits
  • [ap]m -- literal am or pm
  • ) -- capture group 1 end
  • to -- literal text
  • (\d\d?(?::\d\d)?[ap]m) -- capture group 2, same as above
  • , * -- comma and optional spaces
  • ([^,]+) -- title up to next comma
  • .* -- greedy scan to last space, followed by:
  • ([a-z]+ \d+, \d{4}) -- date format Mmmmm dd, yyyy
  • ignore case flag i
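
The numbered groups explained above can also be given names, which makes the result easier to read than indexing into the match array. This is a sketch only, assuming an engine with ES2018 named capture groups; the group names and the `extractEvent` helper are hypothetical and not part of the original answer:

```javascript
// Same pattern as explained above, with named groups instead of numbered ones.
const EVENT_PATTERN = /^(?<start>\d\d?(?::\d\d)?[ap]m) to (?<end>\d\d?(?::\d\d)?[ap]m), *(?<title>[^,]+).* (?<date>[a-z]+ \d+, \d{4})/i;

// Hypothetical helper: returns { start, end, title, date },
// or null when the string does not match.
function extractEvent(str) {
  const m = str.match(EVENT_PATTERN);
  return m ? { ...m.groups } : null;
}

const event = extractEvent(
  '8am to 9:30am, All Hearts meeting, Ana Montoya, Accepted, Location: https://us02web.zoom.us/j/1234?pwd=1234, November 21, 2022'
);
console.log(event);
// start: '8am', end: '9:30am', title: 'All Hearts meeting', date: 'November 21, 2022'
```

Returning null for non-matching input lets the caller decide how to handle lines that do not follow the expected layout.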

CodePudding user response:

([0-9]{1,2}:{0,1}[0-9]{0,2}[a-z]{0,2})[\s\D\s]+([0-9]{1,2}:{0,1}[0-9]{0,2}[a-z]{0,2}),(.+),(\s\w{1,2} de [\wç]+ de \d{4})?(\s\w+ \d{1,2}, \d{4})?.*$
