python date regex

This is the data given:

reviews = ["2017-09-02T07:00:09Z It's really smooth but the taste isn't so good. Terrible absolutely terrible. More on the cough syrup side than black cherry and vanilla. It was a waste of money. The green apple and blood orange are the best ones. Slightly disappointed in the taste. "]

reviews = pd.DataFrame(reviews)

I need to give a regular expression for the date and, separately, one for the time.

This is my attempt:

pattern=r'(\d{4}[-/]\d{2}[-/]\d{2})'

sol=re.findall(pattern,reviews)

print(sol)

CodePudding user response:

An easy way is to slice the string, convert it to a datetime, and then extract the parts you need.

from datetime import datetime

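# Slice the raw review string, not the regex pattern.
# Assumes reviews is still the original list, before the DataFrame conversion.
review = reviews[0]
stamp = datetime.fromisoformat(review[:10] + ' ' + review[11:19])
print(stamp.date())
print(stamp.time())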
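
If the timestamp always has the fixed YYYY-MM-DDTHH:MM:SSZ layout shown in the question, another option is to let datetime.strptime parse the first 20 characters directly, so the string does not have to be rebuilt by hand. This is only a sketch: the names review and stamp are illustrative, and the review text is shortened here.

```python
from datetime import datetime

# Shortened copy of the review string from the question (illustrative).
review = "2017-09-02T07:00:09Z It's really smooth but the taste isn't so good."

# The timestamp occupies the first 20 characters; the literal T and Z are
# matched by the format string rather than sliced away by hand.
stamp = datetime.strptime(review[:20], "%Y-%m-%dT%H:%M:%SZ")

print(stamp.date())  # 2017-09-02
print(stamp.time())  # 07:00:09
```

Note that strptime raises ValueError if the first 20 characters do not match the format, so this only suits data where every review starts with a timestamp.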

CodePudding user response:

If you want to use regular expressions rather than simple string slicing then you could isolate the date and time from this string as follows:

import re

reviews = "2017-09-02T07:00:09Z It's really smooth but the taste isn't so good. Terrible absolutely terrible. More on the cough syrup side than black cherry and vanilla. It was a waste of money. The green apple and blood orange are the best ones. Slightly disappointed in the taste. "

rx = r'(?P<Date>\d{4}\-\d\d\-\d\dT)(?P<Time>\d\d:\d\d:\d\dZ)'

if (m := re.search(rx, reviews)):  # reviews is a plain string here; reviews[0] would be just "2"
    print(m.group('Date')[:-1])
    print(m.group('Time')[:-1])

Note that re.search returns only the first match, so this code will not cope with multiple occurrences of the date/time pattern in the same string.
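
For strings that may contain several timestamps, re.finditer can be used in place of re.search. The following is a minimal sketch: the sample text with two timestamps is made up for illustration, and the pattern is a variant that keeps T and Z outside the named groups so that no trailing character has to be trimmed.

```python
import re

# Hypothetical input containing two timestamps (not from the question).
text = "2017-09-02T07:00:09Z first review. 2018-01-15T13:45:30Z second review."

rx = re.compile(r'(?P<Date>\d{4}-\d\d-\d\d)T(?P<Time>\d\d:\d\d:\d\d)Z')

# finditer yields one match object per occurrence, from left to right.
found = [(m.group('Date'), m.group('Time')) for m in rx.finditer(text)]

for date, time in found:
    print(date, time)
```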