Regex for time-zones, special case being ignored

Time:02-11

I am creating a regex to parse a string of time zones. It must be able to read input in the following forms:

  • 0930

  • 0930 1

  • 0930-1

  • <0930

  • (>0930) (the brackets are just to avoid stack reading this as '<>')

  • (<0920 1)

  • (>0920 1)

  • 0920-1240 1

  • 1200-1-1430

  • 1200-1-1400 1

  • 0920-1240 <<<<<<<<<<<<<<<<<<<<<<<<<ISSUE HERE

The regex cannot differentiate between hhmm-1, and hhmm-hhmm. It will read '0900-1200' as '0900-1'.

I have attempted many variations of the regex, including:

r'([<>])?([0-9]{2})([0-9]{2})([ -]?)([0-1]?)|([0-9]{2})([0-9]{2})'

r'([<>])?([0-9]{2})([0-9]{2})([ -])?([0-1]?)(([0-1]?{4})()'

r'([<>])?([0-9]{2})([0-9]{2})([ -])?([0-1]?)(?([0-1]?)()'
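To make the failure concrete, here is a minimal sketch that runs the first attempt above with `re.findall`. The helper name `first_attempt` is made up for illustration. The optional `([0-1]?)` takes the leading `1` of `1200`, and the remaining `200` is too short to match anything, so the two inputs come out identical:

```python
import re

# First attempted pattern from the question, copied unchanged.
FIRST_ATTEMPT = re.compile(
    r'([<>])?([0-9]{2})([0-9]{2})([ -]?)([0-1]?)|([0-9]{2})([0-9]{2})'
)

def first_attempt(text):
    """Return every match as a tuple of groups (unmatched groups become '')."""
    return FIRST_ATTEMPT.findall(text)

# The end time 1200 is lost: only its leading '1' is captured.
print(first_attempt('0900-1200'))  # [('', '09', '00', '-', '1', '', '')]
print(first_attempt('0900-1'))     # [('', '09', '00', '-', '1', '', '')]
```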

Currently I am considering using two different regexes: one to test for the hyphenated time string, and another for the rest, which already works for me. I would like the output as a list of tuples, like

[('', '09', '30', '-', '','12','30', '-', '1'),
 ('', '09', '30', '-', '1','','', '', ''),
 ('>', '09', '30', '-', '1','','', '', '').....]

Answer:

You can use

([<>])?([0-9]{2})([0-9]{2})(?:([ -])([01])(?!\d{3}\b))?(?:([ -])([0-9]{2})([0-9]{2})(?:([ -])([01])(?!\d{3}\b))?)?

Details:

  • ([<>])? - Group 1 (optional): < or >
  • ([0-9]{2}) - Group 2: two digits
  • ([0-9]{2}) - Group 3: two digits
  • (?:([ -])([01])(?!\d{3}\b))? - an optional group matching a sequence of:
    • ([ -]) - Group 4: a space or -
    • ([01])(?!\d{3}\b) - Group 5: 1 or 0, provided it is not followed by 3 more digits and then a word boundary (i.e. it is not the first digit of an hhmm value)
  • (?: - start of a non-capturing group:
    • ([ -]) - Group 6: a space or -
    • ([0-9]{2}) - Group 7: two digits
    • ([0-9]{2}) - Group 8: two digits
    • (?:([ -])([01])(?!\d{3}\b))? - Optional sequence of a space or - (captured in Group 9) and then 1 or 0 (captured in Group 10), provided it is not followed by 3 more digits and then a word boundary
  • )? - end of non-capturing group, repeat 1 or 0 times.
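
As a quick check, the pattern above can be run through `re.findall` in Python. This is only a sketch: the names `TIME_RE` and `parse` are made up for illustration, and each input string is assumed to hold a single entry. Note that the pattern has 10 capturing groups, so each tuple has 10 items rather than the 9 shown in the question.

```python
import re

# The pattern from the answer, split across lines for readability only.
TIME_RE = re.compile(
    r'([<>])?([0-9]{2})([0-9]{2})'          # groups 1-3: optional < or >, then hh, mm
    r'(?:([ -])([01])(?!\d{3}\b))?'         # groups 4-5: separator + flag, not the start of hhmm
    r'(?:([ -])([0-9]{2})([0-9]{2})'        # groups 6-8: separator, then second hh, mm
    r'(?:([ -])([01])(?!\d{3}\b))?)?'       # groups 9-10: separator + flag after second time
)

def parse(text):
    """Return a list of 10-item tuples; groups that did not match are ''."""
    return TIME_RE.findall(text)

print(parse('0920-1240'))
# [('', '09', '20', '', '', '-', '12', '40', '', '')]
print(parse('1200-1-1400 1'))
# [('', '12', '00', '-', '1', '-', '14', '00', ' ', '1')]
print(parse('>0920 1'))
# [('>', '09', '20', ' ', '1', '', '', '', '', '')]
```

The negative lookahead is what separates the two cases: in `0920-1240` the `1` after the hyphen is followed by `240` and a word boundary, so it is rejected as a flag and the whole `1240` is read as the second time.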