Assume I have a string as follows:
2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00
Here, each date is followed by one or more times. Is it possible for a regular expression to find all the times after each date, as follows?
[('2021/12/23', '13:00','14:00'), ('2021/12/24', '13:00','14:00','15:00')]
I tried the following code in Python, but it returns only the last time for each date:
re.findall(r'(\d+/\d+/\d+)(\s\d+\:\d+)+','2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')
>>>[('2021/12/23', ' 14:00'), ('2021/12/24', ' 15:00')]
CodePudding user response:
You can use the PyPI regex library to get the following to work:
import regex
pattern = regex.compile(r'(?P<date>\d+/\d+/\d+)(?:\s+(?P<time>\d+:\d+))+')
for m in pattern.finditer('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'):
    print(m.capturesdict())
Output:
{'date': ['2021/12/23'], 'time': ['13:00', '14:00']}
{'date': ['2021/12/24'], 'time': ['13:00', '14:00', '15:00']}
Since the PyPI regex library does not "forget" the earlier captures of a repeated group, and provided the groups are named, match.capturesdict() returns a dictionary of all named groups with their captures.
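For contrast, here is a minimal sketch of what the built-in re module does with the same pattern. The variable names s and last_only are only illustrative. With re, each repetition of a group overwrites the previous capture, so only the final time per date survives:

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
pattern = re.compile(r'(?P<date>\d+/\d+/\d+)(?:\s+(?P<time>\d+:\d+))+')

# The "time" group is matched several times per date, but re keeps
# only the capture from its last repetition.
last_only = [(m.group('date'), m.group('time')) for m in pattern.finditer(s)]
print(last_only)
```

This is the same behaviour seen in the question, which is why a library that keeps every capture (or a two-step approach) is needed.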
CodePudding user response:
Use re.findall:
import re
inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)
print(matches)
This prints:
['2021/12/23 13:00 14:00', '2021/12/24 13:00 14:00 15:00']
Explanation of regex:
\d{4}/\d{2}/\d{2} match a date in YYYY/MM/DD format
(?: \d{1,2}:\d{2})* match a space followed by hh:mm time, 0 or more times
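If you need the tuples from the question rather than whole substrings, one possible sketch is a two-step approach built on the same pattern: capture the date and the run of times, then extract the individual times from that run. The names block_re, time_re and result are assumptions made for this example.

```python
import re

inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Step 1: group 1 is the date, group 2 is the whole run of times after it.
block_re = re.compile(r'(\d{4}/\d{2}/\d{2})((?: \d{1,2}:\d{2})*)')
# Step 2: pull each individual time out of that run.
time_re = re.compile(r'\d{1,2}:\d{2}')

result = [
    (m.group(1), *time_re.findall(m.group(2)))
    for m in block_re.finditer(inp)
]
print(result)
```

This stays within the standard library, at the cost of scanning each run of times a second time.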
CodePudding user response:
You can use this findall + split solution:
import re
s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s):
    print(i.split())
Output:
['2021/12/23', '13:00', '14:00']
['2021/12/24', '13:00', '14:00', '15:00']
\d+/\d+/\d+(?:\s\d+\:\d+)+ matches a date string followed by one or more time strings.
You could also use:
print([i.split() for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s)])
To get output:
[['2021/12/23', '13:00', '14:00'], ['2021/12/24', '13:00', '14:00', '15:00']]