I am not sure how to find every substring that matches a date pattern. I believe regex can do this, but I am not certain. Currently, I have a string that contains dates mixed in with other text.
Example: "Expiration Date: Mon Aug 20 16:07:24 2029 word other gibberish word Expiration Date: Mon Aug 20 16:08:16 2029 word gibberish word"
I do not want to match this exact string; I want to find all date-like values.
CodePudding user response:
If you want to find all date-like values in the same format as your example, here is one way to do so:
(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{2} \d{2}:\d{2}:\d{2} \d{4}
In Python:
import re
data = "Expiration Date: Mon Aug 20 16:07:24 2029 word other gibberish word Expiration Date: Mon Aug 20 16:08:16 2029 word gibberish word"
print(re.findall(r'(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) '
                 r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) '
                 r'\d{2} \d{2}:\d{2}:\d{2} \d{4}',
                 data))
# ['Mon Aug 20 16:07:24 2029', 'Mon Aug 20 16:08:16 2029']
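If the real data can contain single-digit days (for example "Aug  5", which C-style timestamps pad with a space) or values that only look like dates, a slightly looser pattern followed by a parse step may help. This is a minimal sketch, not a definitive implementation: the names DATE_RE and find_dates are made up for illustration, and it assumes English day and month abbreviations, since %a and %b depend on the active locale.

```python
import re
from datetime import datetime

# Assumption: dates follow the layout from the example, with an
# optionally space-padded day such as "Mon Aug  5 09:03:01 2029".
DATE_RE = re.compile(
    r'\b(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) '
    r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +'
    r'\d{1,2} \d{2}:\d{2}:\d{2} \d{4}\b'
)

def find_dates(text):
    """Return a datetime for every date-like substring that parses cleanly."""
    found = []
    for candidate in DATE_RE.findall(text):
        try:
            # strptime rejects impossible values such as "Feb 31" or hour 25.
            found.append(datetime.strptime(candidate, '%a %b %d %H:%M:%S %Y'))
        except ValueError:
            # Looked like a date but is not a valid one; skip it.
            pass
    return found
```

Note that strptime does not check that the weekday name agrees with the calendar date, so this only filters out impossible day, month, and time values.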
CodePudding user response:
If you want to match every date-like string between "Expiration Date: " and " word", you could use this regex:
(?<=Expiration Date: ).*?(?= word)
Of course, this assumes that " word" always comes right after the date you want to match. The positive lookbehind and positive lookahead keep "Expiration Date: " and " word" out of the matched text, and the lazy .*? stops at the first " word", so each date is returned as its own match instead of one big match.
Resulting code:
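>>> import re
>>> INPUT_STRING = "Expiration Date: Mon Aug 20 16:07:24 2029 word other gibberish word Expiration Date: Mon Aug 20 16:08:16 2029 word gibberish word"
>>> re.findall(r"(?<=Expiration Date: ).*?(?= word)", INPUT_STRING)
['Mon Aug 20 16:07:24 2029', 'Mon Aug 20 16:08:16 2029']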