Python regular expression: find and output a part of the pattern that repeats multiple times

Time:01-13

Assume I have a string as follows:

2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00

Each date is followed by one or more times. Can a regular expression find all the times after each date, as follows?

[('2021/12/23', '13:00','14:00'), ('2021/12/24', '13:00','14:00','15:00')]

I tried the following code in Python, but it returns only the last time for each date:

re.findall(r'(\d+/\d+/\d+)(\s\d+\:\d+)+','2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')

>>>[('2021/12/23', ' 14:00'), ('2021/12/24', ' 15:00')]

CodePudding user response:

You can use the PyPI regex library to get the following to work:

import regex
pattern = regex.compile(r'(?P<date>\d+/\d+/\d+)(?:\s+(?P<time>\d+:\d+))+')
for m in pattern.finditer('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'):
    print(m.capturesdict())

Output:

{'date': ['2021/12/23'], 'time': ['13:00', '14:00']}
{'date': ['2021/12/24'], 'time': ['13:00', '14:00', '15:00']}

Unlike the built-in re module, which keeps only the last capture of a repeated group, the PyPI regex library does not "forget" earlier captures. Provided the groups are named, match.capturesdict() returns a dictionary that maps each group name to the list of all its captures.
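To illustrate the contrast, here is a minimal sketch using only the standard re module on the string from the question. The variable names are illustrative. It shows that a repeated capturing group in re reports just its final iteration, which is why the original attempt lost the earlier times.

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# The inner group (\d+:\d+) is repeated by the outer (?:...)+,
# but re stores only the text matched by its last iteration.
m = re.match(r'(\d+/\d+/\d+)(?:\s+(\d+:\d+))+', s)

date = m.group(1)
last_time = m.group(2)
print(date)       # 2021/12/23
print(last_time)  # 14:00 (13:00 was overwritten)
```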

CodePudding user response:

Use re.findall:

import re

inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)
print(matches)

This prints:

['2021/12/23 13:00 14:00', '2021/12/24 13:00 14:00 15:00']

Explanation of regex:

\d{4}/\d{2}/\d{2}    match a date in YYYY/MM/DD format
(?: \d{1,2}:\d{2})*  match a space followed by hh:mm time, 0 or more times
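If the tuple layout from the question is needed, one possible follow-up is to split each matched string on whitespace. This is a sketch built on the pattern above; the name result is only illustrative.

```python
import re

inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Each match is one date followed by its times, as a single string.
matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)

# Split every match on whitespace and freeze the pieces into a tuple.
result = [tuple(m.split()) for m in matches]
print(result)
# [('2021/12/23', '13:00', '14:00'), ('2021/12/24', '13:00', '14:00', '15:00')]
```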

CodePudding user response:

You can use this findall + split solution:

import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s):
    print(i.split())

Output:

['2021/12/23', '13:00', '14:00']
['2021/12/24', '13:00', '14:00', '15:00']

\d+/\d+/\d+(?:\s\d+\:\d+)+ matches a date string followed by 1 or more time strings.

You could also use:

print([i.split() for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s)])

To get output:

[['2021/12/23', '13:00', '14:00'], ['2021/12/24', '13:00', '14:00', '15:00']]
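A variation on the same idea, sketched here with the standard re module only, captures the date and the whole run of times in two named groups and then splits the run. The group names date and times and the variable schedule are assumptions made for illustration. Note that a dictionary would overwrite earlier entries if the same date appeared twice in the input.

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# "times" captures the entire run of times as one string,
# so nothing is lost to the repeated-group behaviour of re.
pattern = re.compile(r'(?P<date>\d+/\d+/\d+)(?P<times>(?:\s+\d+:\d+)+)')

# Map each date to the list of times that follow it.
schedule = {m.group('date'): m.group('times').split()
            for m in pattern.finditer(s)}
print(schedule)
# {'2021/12/23': ['13:00', '14:00'], '2021/12/24': ['13:00', '14:00', '15:00']}
```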