I have the following string:
"{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), 'bar': 'some data', 'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), 'barlist': ['hello', 'world']}"
I want to match every datetime.datetime(...) substring within this string and replace each one with only its numbers, in list form. So this is the expected result:
"{'foo': [2022, 5, 23, 0, 0], 'bar': 'some data', 'foobar': [2022, 8, 3, 13, 57, 41], 'barlist': ['hello', 'world']}"
I have something like this:
DATETIME_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"
modified_input_str = re.sub(DATETIME_PATTERN, r"[\1]", input_str)
but it replaces a big chunk of data in between the matches. How can I modify the regex to accomplish what I want?
Conclusion: I modified the current best answer so that it fits my particular use case better:
DATETIME_PATTERN = r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:[^\s\d])*\)"
# The difference is that the text after 'tzinfo=' can be anything except whitespace or digits.
CodePudding user response:
You can use
datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)
Details:
datetime\.datetime\( - a datetime.datetime( string
(\d+(?:,\s*\d+)*) - Group 1: one or more digits and then zero or more repetitions of a comma, zero or more whitespaces and then one or more digits
, tzinfo= - a literal string
(?:\(\)|[^()])* - zero or more repetitions of a () string or any char other than ( and )
\) - a ) char.
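To check the pattern against the string from the question, here is a minimal sketch. The variable names greedy_result and fixed_result are only illustrative, and the original pattern is written with the + quantifiers it presumably had. It runs both patterns so the effect of the greedy .* is visible next to the stricter tzinfo part.

```python
import re

# Input string taken from the question.
input_str = (
    "{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), "
    "'bar': 'some data', "
    "'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), "
    "'barlist': ['hello', 'world']}"
)

# Original attempt: the greedy .* runs to the end of the string and then
# backtracks to the LAST ')', so everything between the two datetime
# calls is swallowed by a single match.
GREEDY_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"
greedy_result = re.sub(GREEDY_PATTERN, r"[\1]", input_str)

# Suggested pattern: after 'tzinfo=' only '()' pairs or non-parenthesis
# characters are allowed, so each match stops at its own closing ')'.
DATETIME_PATTERN = r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)"
fixed_result = re.sub(DATETIME_PATTERN, r"[\1]", input_str)

print(greedy_result)
print(fixed_result)
```

Note that the tzinfo part only tolerates an empty () pair, so a value with arguments inside the parentheses (for example a hypothetical tzoffset(None, 3600)) would not match and would need a different pattern.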