I have the content of an email and want to extract any times it contains. The times are in 24-hour format, contain a colon separator (e.g. 13:00), and can appear anywhere in the text.
As an example:
"Some text some text some text 12:00 Some text some text some text"
When I use this line to extract the time, the result is blank:
tp_time = re.findall(r'(^[0-2][0-3]:[0-5][0-9]$)', tp_msg)
print(tp_time)
Can anyone see what I am doing wrong?
CodePudding user response:
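Your pattern is r'(^[0-2][0-3]:[0-5][0-9]$)'.
^ denotes the start of the line or the start of the string (depending on mode), and $ denotes the end of the line or the end of the string (depending on mode). The pattern can therefore only match when the time makes up the whole string (or the whole line in multiline mode), not when it sits in the middle of other text.
You should use \b (word boundary) instead of ^ and \b instead of $, i.e.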
import re
text = "Some text some text some text 12:00 Some text some text some text"
# (?:[01][0-9]|2[0-3]) covers hours 00-23; [0-2][0-3] would miss e.g. 09:30 and 15:00
print(re.findall(r'\b(?:[01][0-9]|2[0-3]):[0-5][0-9]\b', text))
output
['12:00']
If you want to know more about \b, read the Python re docs.
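To make the difference between anchors and word boundaries concrete, here is a minimal sketch that runs both styles of pattern over the same text. The sample sentence and the variable names are made up for illustration, and the hour part uses an explicit 24-hour alternation, which is an assumption about what the asker wants rather than something taken from the question.

```python
import re

# Hypothetical sample text with several valid times and two invalid ones.
text = "Meet at 09:30, lunch 12:00, wrap up by 17:45; ignore 24:10 and 7:99."

# Anchored pattern from the question: can only match if the whole string is a time.
anchored = re.compile(r'^[0-2][0-3]:[0-5][0-9]$')

# Word-boundary pattern; the hour is 00-19 or 20-23, the minute is 00-59.
bounded = re.compile(r'\b(?:[01][0-9]|2[0-3]):[0-5][0-9]\b')

anchored_hits = anchored.findall(text)
bounded_hits = bounded.findall(text)

print(anchored_hits)  # nothing, because the string does not start with a time
print(bounded_hits)   # the three valid 24-hour times
```

The non-capturing group (?:...) keeps re.findall returning the full matched time instead of only the hour.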
CodePudding user response:
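Use (0?[1-9]|1[0-2]):[0-5][0-9]
instead of (^[0-2][0-3]:[0-5][0-9]$)
Note that this pattern only covers 12-hour clock hours (1 to 12); for 24-hour input such as 13:00, the hour part would need to be (?:[01][0-9]|2[0-3]) instead.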
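As a rough check of how a 12-hour style pattern behaves on 24-hour input, the sketch below compares it with a 24-hour pattern on a made-up sentence. The text and variable names are illustrative assumptions. It also shows that re.findall returns only the captured group when the pattern contains one.

```python
import re

# Hypothetical sample text using 24-hour times.
text = "Departs 13:00, arrives 23:15"

twelve_hour = r'(0?[1-9]|1[0-2]):[0-5][0-9]'
twenty_four_hour = r'\b(?:[01][0-9]|2[0-3]):[0-5][0-9]\b'

# With one capturing group, findall returns just that group (the hour part).
hours_only = re.findall(twelve_hour, text)

# finditer with group(0) exposes the full text each match covered.
twelve_full = [m.group(0) for m in re.finditer(twelve_hour, text)]

# The 24-hour pattern has no capturing group, so findall returns whole matches.
full_times = re.findall(twenty_four_hour, text)

print(hours_only)
print(twelve_full)
print(full_times)
```

Without word boundaries, the 12-hour pattern picks up the tail of each time (3:00 out of 13:00), which is why the boundaries and the 24-hour hour range matter here.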