Wrap timecode into brackets using regular expression


In a string, I need all timecodes to be formatted as follows: [HH:MM:SS] or [HH:MM:SS.ms]. Some of them are already in brackets. They can appear anywhere: at the beginning, in the middle, or at the end of a phrase.

I'd like to wrap the ones that are not yet in brackets.

To select all of them I use:

[\[]?\d\d:\d\d:\d\d(.\d+)?[\]]?

I tried

(?!\[.+\])(.|^)(\d\d:\d\d:\d\d(.\d+)?)(.|$)(?!\[.+\])

This is almost fine, except that my selection $2 includes space characters when the timecode is not at the beginning (^) or the end ($) of the string.

How can I get rid of this selection?

CodePudding user response:

You can use

re.sub(r'\[?\b(\d{2}:\d{2}:\d{2}(?:\.\d+)?)\b]?', r'[\1]', text)

See the regex demo. Details:

  • \[? - an optional [ char
  • \b - a word boundary
  • (\d{2}:\d{2}:\d{2}(?:\.\d+)?) - Group 1:
    • \d{2}:\d{2}:\d{2} - two digits, and then two occurrences of : and two digits
    • (?:\.\d+)? - an optional sequence of . and one or more digits
  • \b - a word boundary
  • ]? - an optional ] char
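
As a minimal sketch of how the pieces above fit together, the pattern can be compiled once and applied with a replacement that re-wraps Group 1. The names TIMECODE and wrap_timecodes and the sample string are made up for illustration and are not part of the original question.

```python
import re

# Optional "[", the timecode captured as Group 1, optional "]".
# The word boundaries keep the match from starting or ending inside a longer run of digits.
TIMECODE = re.compile(r'\[?\b(\d{2}:\d{2}:\d{2}(?:\.\d+)?)\b]?')

def wrap_timecodes(text: str) -> str:
    """Return text with every timecode wrapped in exactly one pair of brackets."""
    # Any existing brackets are consumed by the match and written back by r'[\1]',
    # so already-wrapped timecodes are not wrapped twice.
    return TIMECODE.sub(r'[\1]', text)

sample = "00:01:02 intro, then [00:03:04.500] and finally 01:02:03.25"
print(wrap_timecodes(sample))
# [00:01:02] intro, then [00:03:04.500] and finally [01:02:03.25]
```

Because the optional brackets sit outside Group 1, surrounding spaces are never part of the match, which avoids the stray-space problem from the question.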

To make sure you match 24-hour time format you can use a more precise pattern:

\[?\b((?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](?:\.[0-9]+)?)\b]?

See this demo.
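
A short sketch of the stricter pattern in use, assuming the goal is to leave out-of-range values such as 25:00:00 untouched. STRICT and wrap_valid_timecodes are hypothetical names chosen for this example.

```python
import re

# Hours 00-23, minutes and seconds 00-59, optional fractional part.
STRICT = re.compile(
    r'\[?\b((?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](?:\.[0-9]+)?)\b]?'
)

def wrap_valid_timecodes(text: str) -> str:
    """Wrap only timecodes whose fields fall within 24-hour clock ranges."""
    return STRICT.sub(r'[\1]', text)

print(wrap_valid_timecodes("23:59:59 ok, 25:00:00 and 12:60:00 left alone"))
# [23:59:59] ok, 25:00:00 and 12:60:00 left alone
```

Whether rejecting such values is desirable depends on the data: if the timecodes are durations that can exceed 24 hours, the looser \d{2} version is the better fit.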
