Using Regex in Python to Extract Times from Email Text

Time:10-07

I have the content of an email, and I want to extract any times it contains. The times are in 24-hour format, use a colon separator (e.g. 13:00), and can appear anywhere in the text.

As an example:

"Some text some text some text 12:00 Some text some text some text"

When I use this line to extract the time, the result is blank:

tp_time = re.findall(r'(^[0-2][0-3]:[0-5][0-9]$)', tp_msg)
print(tp_time)

Can anyone see what I am doing wrong?

CodePudding user response:

Can anyone see what I am doing wrong?

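Your pattern r'(^[0-2][0-3]:[0-5][0-9]$)' is anchored at both ends, so it can only match when the time is the entire string (or an entire line in multiline mode):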

^ denotes start of line or start of string (depending on mode)

$ denotes end of line or end of string (depending on mode)
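
To see the effect of the anchors, here is a small sketch. The variable names and sample strings are illustrative and not part of the question; only the pattern is taken from it.

```python
import re

pattern = r'(^[0-2][0-3]:[0-5][0-9]$)'  # the pattern from the question

# The time is surrounded by other text, so ^ and $ cannot both be satisfied.
embedded = "Some text 12:00 some text"
embedded_result = re.findall(pattern, embedded)
print(embedded_result)   # []

# The anchored pattern matches only when the string is nothing but the time.
bare = "12:00"
bare_result = re.findall(pattern, bare)
print(bare_result)       # ['12:00']

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
lines = "Some text\n12:00\nsome text"
multiline_result = re.findall(pattern, lines, flags=re.MULTILINE)
print(multiline_result)  # ['12:00']
```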

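You should use \b (word boundary) in place of both ^ and $. Note also that [0-2][0-3] only accepts hours whose second digit is 0 to 3, so a time such as 18:30 is missed; (?:[01][0-9]|2[0-3]) covers 00 to 23, i.e.

import re
text = "Some text some text some text 12:00 Some text some text some text"
# \b on each side lets the time sit anywhere in the text; (?:...) is non-capturing
print(re.findall(r'\b(?:[01][0-9]|2[0-3]):[0-5][0-9]\b', text))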

output

['12:00']

To learn more about \b, see the Python re module documentation.
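
As a rough illustration of what \b contributes, the sketch below compares a pattern with and without word boundaries. The pattern is deliberately loose (any two digits on each side of the colon) so that only the effect of \b is visible; the sample text and variable names are made up for this example.

```python
import re

# Deliberately loose patterns: any two digits, a colon, any two digits.
loose = r'\d\d:\d\d'
bounded = r'\b\d\d:\d\d\b'

text = "Call at 12:00, ref 112:000, ends 09:30."

# Without boundaries, "12:00" is also picked out of the middle of "112:000".
without_boundaries = re.findall(loose, text)
print(without_boundaries)  # ['12:00', '12:00', '09:30']

# With \b, a match may not start or end between two word characters (digits here).
with_boundaries = re.findall(bounded, text)
print(with_boundaries)     # ['12:00', '09:30']
```

Since \b treats letters, digits and underscores as word characters, a time glued directly to a letter (for example "12:00pm") would not match the bounded pattern.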

CodePudding user response:

Use (0?[1-9]|1[0-2]):[0-5][0-9] instead of (^[0-2][0-3]:[0-5][0-9]$).
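
Two caveats apply to the pattern above. It covers 12-hour clock hours (1 to 12), so it would miss 00:xx and 13:xx to 23:xx, and because it contains exactly one capturing group, re.findall returns only that group (the hour) rather than the full time. Below is a minimal sketch of a 24-hour variant, assuming two-digit hours as in the question; the names TIME_24H and extract_times are hypothetical and chosen for this example.

```python
import re

# Hours 00-19 or 20-23, then minutes 00-59. (?:...) groups without capturing,
# so findall returns the whole match instead of just the hour.
TIME_24H = re.compile(r'\b(?:[01][0-9]|2[0-3]):[0-5][0-9]\b')

def extract_times(text):
    """Return every HH:MM 24-hour time found in text, in order of appearance."""
    return TIME_24H.findall(text)

sample = "Meet at 09:15, lunch 13:00, close 23:59, not a time: 24:00 or 12:60."
times = extract_times(sample)
print(times)       # ['09:15', '13:00', '23:59']

# For comparison: a single capturing group makes findall return only the group.
hour_only = re.findall(r'(0?[1-9]|1[0-2]):[0-5][0-9]', "Lunch at 12:00")
print(hour_only)   # ['12']
```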
