I'm writing a Python script that uses re.sub to help me replace the logging framework in my C application.
The old syntax looks like this:
old_log_info("this is an integer: %i, this is a double: %d", 1, 2.0);
old_log_error("this is an integer: %i, this is a double: %d", 1, 2.0);
The new syntax:
new_log_inf("this is an integer: {}, this is a double: {}", 1, 2.0);
new_log_err("this is an integer: {}, this is a double: {}", 1, 2.0);
It has to work on multiline statements as well, that is:
old_log_info(
"this is an integer: %i, this is a double: %d",
1,
2.0);
Should turn into:
new_log_inf(
"this is an integer: {}, this is a double: {}",
1,
2.0);
Replacing the function names is trivial, but replacing the format specifiers (%i, %d, etc.) only when they appear in the logging calls is not. The %i in:
printf("this is an integer: %i", 1);
should be untouched.
I have tried playing with lookarounds to isolate the substrings between old_log_info( and the nearest );, but I can't figure out how to replace only the format specifiers in that match and not the whole match.
CodePudding user response:
You can use two layers of regex: one to find the function calls in which to make the changes, and one to make the changes inside each match.
Below is an example of the logic. Note that it doesn't work if the text in the function contains );, in which case it's better to replace the first level of regex by parsing (please provide examples of code if this is the case).
import re

code = '''some code %i
old_log_info("this is an integer: %i, this is a double: %d", 1, 2.0);
printf("this is an integer: %i", 1);'''

def repl(m):
    # Second layer: replace the specifiers only inside the matched call.
    inside = re.sub(r'%[id]', '{}', m.group(2))
    return f'new_log_inf({inside});'

# First layer: find each old_log_info(...); call. re.DOTALL lets .*? span newlines.
new_code = re.sub(r'(old_log_info)\((.*?)\);', repl, code, flags=re.DOTALL)
print(new_code)
Output:
some code %i
new_log_inf("this is an integer: {}, this is a double: {}", 1, 2.0);
printf("this is an integer: %i", 1);