Python regex to parse the datetime.datetime object from string

Time:02-01

I have the following string:

"{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), 'bar': 'some data', 'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), 'barlist': ['hello', 'world']}"

I want to match every datetime.datetime(...) substring within this string and replace each one with just its numbers, in list form. This is the expected result:

"{'foo': [2022, 5, 23, 0, 0], 'bar': 'some data', 'foobar': [2022, 8, 3, 13, 57, 41], 'barlist': ['hello', 'world']}"

I have something like this:

DATETIME_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"
modified_input_str = re.sub(DATETIME_PATTERN, r"[\1]", input_str)

but it replaces a big chunk of data in between the matches. How can I modify the regex to accomplish what I want?

Conclusion: I modified the current best answer so that it fits my particular use case better:

DATETIME_PATTERN = r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:[^\s\d])*\)"

# The difference is that the text after 'tzinfo=' can be anything except whitespace or digits.
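
To illustrate why the original pattern swallowed the data between the matches, here is a minimal sketch that runs both patterns on the sample string. The names GREEDY_PATTERN, greedy_result and fixed_result are illustrative and not part of the original post. The greedy `.*` runs on to the last `)` in the whole string, while the restricted character class stops at the first whitespace and then backtracks to the closing parenthesis.

```python
import re

# Sample string from the question, split across lines for readability.
input_str = (
    "{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), "
    "'bar': 'some data', "
    "'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), "
    "'barlist': ['hello', 'world']}"
)

# Original attempt: the greedy ".*" extends to the LAST ")" in the string.
GREEDY_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"

# Modified pattern: the tzinfo value may not contain whitespace or digits,
# so each match ends at its own closing parenthesis.
DATETIME_PATTERN = r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:[^\s\d])*\)"

greedy_result = re.sub(GREEDY_PATTERN, r"[\1]", input_str)
fixed_result = re.sub(DATETIME_PATTERN, r"[\1]", input_str)

print(greedy_result)  # 'bar' and 'foobar' are lost
print(fixed_result)   # both datetimes are replaced in place
```

Note that this variant assumes the tzinfo value contains no whitespace or digits; a value such as tzoffset(None, 3600) would not be matched by it.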

Answer:

You can use

datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)

Details:

  • datetime\.datetime\( - a datetime.datetime( string
  • (\d+(?:,\s*\d+)*) - Group 1: one or more digits, then zero or more repetitions of a comma, zero or more whitespace characters, and one or more digits
  • , tzinfo= - a literal string
  • (?:\(\)|[^()])* - zero or more repetitions of a () string or any char other than ( and )
  • \) - a ) char.
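
The pattern described above can be applied with re.sub as in the following sketch, which uses the sample string from the question. The final ast.literal_eval step is an assumption about what one might do with the result and is not part of the original answer; it works here only because the rewritten string happens to be a valid Python literal.

```python
import ast
import re

DATETIME_PATTERN = r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)"

# Sample string from the question, split across lines for readability.
input_str = (
    "{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), "
    "'bar': 'some data', "
    "'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), "
    "'barlist': ['hello', 'world']}"
)

# Replace each datetime.datetime(...) call with its numbers in brackets.
modified_input_str = re.sub(DATETIME_PATTERN, r"[\1]", input_str)
print(modified_input_str)

# Assumed follow-up step: parse the cleaned string into a real dict.
parsed = ast.literal_eval(modified_input_str)
print(parsed["foobar"])
```

The tzinfo part of this pattern accepts an empty pair of parentheses, as in tzlocal(), but not parentheses with content inside them.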

