This is a follow-up to the question: Replacing placeholders with dictionary keys/values
I have placeholders (the same as in the referenced question, except for the last one). I need to replace the placeholder $fil_TABLE_NAME1, where the $fil_ prefix stays the same but the table name differs (its parts are separated by underscores and can contain digits).
placeholders = {
    r'\$plc_hldr1': '1111',
    r'\$plc_hldr2': 'abcd',
    r'\$\d*date_placeholder': '20200101',
    r'\$fil_\w+': '(select * from table)'
}
For replacement I'm using the adjusted code from the referenced question
def remove_escape_chars(reggie):
    return re.sub(r'\\\$\\d\*|\$\d*|\\\$fil\\\_\\\w\\\+|\\', '', reggie)  # modification

def multiple_replace(escape_dict, text):
    # Create a second dictionary to lookup regex match replacement targets
    unescaped_placeholders = { remove_escape_chars(k): placeholders[k] for k in placeholders }
    # Create a regular expression from all of the dictionary keys
    regex = re.compile("|".join(escape_dict.keys()))
    return regex.sub(lambda match: unescaped_placeholders[remove_escape_chars(match.group(0))], text)
But when I execute it with
text = "sometext $fil_SAMPLE_TABLE_NAME some more text $plc_hldr2 some more more text $1234date_placeholder some text $5678date_placeholder"
result = multiple_replace(placeholders, text)
print(result)
I get:
sometext $fil_SAMPLE_TABLE_NAME some more text abcd some more more text 20200101 some text 20200101
That is, $fil_SAMPLE_TABLE_NAME is not replaced.
I think there is a problem in my regular expression, perhaps something escaped incorrectly, but after several modifications I still could not find it.
Could anybody help me, please?
CodePudding user response:
I would take a slightly different approach to this. Rather than trying to work out afterwards which regex matched a given part of the string, create one regex that puts each individual placeholder regex in its own capturing group, and then use the number of the group that matched to look up the replacement value. For your sample data, the regex would look like this:
(\$plc_hldr1)|(\$plc_hldr2)|(\$\d*date_placeholder)|(\$fil_\w+)
and the Python code would then be:
import re

placeholders = {
    r'\$plc_hldr1': '1111',
    r'\$plc_hldr2': 'abcd',
    r'\$\d*date_placeholder': '20200101',
    r'\$fil_\w+': '(select * from table)'
}
# Replacement values, in the same order as the dict keys (and so as the groups)
replacements = list(placeholders.values())
text = "sometext $fil_SAMPLE_TABLE_NAME some more text $plc_hldr2 some more more text $1234date_placeholder some text $5678date_placeholder"
# Wrap each placeholder regex in its own capturing group
regex = re.compile('(' + ')|('.join(placeholders.keys()) + ')')
# m.lastindex is the number of the group that matched
print(regex.sub(lambda m: replacements[m.lastindex - 1], text))
Output:
sometext (select * from table) some more text abcd some more more text 20200101 some text 20200101
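To see why the group number works as a lookup key, it may help to list each match together with its `lastindex`. The following is only a diagnostic sketch built on the same sample data; the name `matches` is introduced here for illustration.

```python
import re

placeholders = {
    r'\$plc_hldr1': '1111',
    r'\$plc_hldr2': 'abcd',
    r'\$\d*date_placeholder': '20200101',
    r'\$fil_\w+': '(select * from table)',
}
text = ("sometext $fil_SAMPLE_TABLE_NAME some more text $plc_hldr2 "
        "some more more text $1234date_placeholder some text $5678date_placeholder")

# One capturing group per placeholder regex, in dict order
regex = re.compile('(' + ')|('.join(placeholders.keys()) + ')')

# Pair every matched substring with the number of the group that matched it
matches = [(m.group(0), m.lastindex) for m in regex.finditer(text)]
for matched_text, group_number in matches:
    print(group_number, matched_text)
```

Group 4 is the `$fil_` pattern, group 2 is `$plc_hldr2`, and both date placeholders land in group 3, so `replacements[m.lastindex - 1]` picks the right value each time.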
Note that this requires any group inside any of the placeholder regexes to be non-capturing, i.e. (?:...) rather than (...). It also relies on the dict preserving insertion order, which Python guarantees from version 3.7 on.
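As a small illustration of that caveat, here is a sketch with two hypothetical placeholder dicts (`good` and `bad` are made up for this example, as is the wrapper function). An extra capturing group inside one key shifts the numbering of every group after it, so a later placeholder looks up the wrong index.

```python
import re

def multiple_replace(placeholders, text):
    # Same technique as above, wrapped in a function for reuse
    replacements = list(placeholders.values())
    regex = re.compile('(' + ')|('.join(placeholders.keys()) + ')')
    return regex.sub(lambda m: replacements[m.lastindex - 1], text)

# Inner group is non-capturing: group numbers line up with dict positions
good = {r'\$plc_hldr(?:1|2)': 'abcd', r'\$fil_\w+': '(select * from table)'}
# Inner group is capturing: the $fil_ pattern becomes group 3 instead of 2
bad = {r'\$plc_hldr(1|2)': 'abcd', r'\$fil_\w+': '(select * from table)'}

sample = "a $plc_hldr2 b $fil_T1"
good_result = multiple_replace(good, sample)
print(good_result)

try:
    bad_result = multiple_replace(bad, sample)
except IndexError as exc:
    # replacements has only two entries, but lastindex is 3 for $fil_T1
    bad_result = repr(exc)
print(bad_result)
```

With more placeholders the shifted index may still be in range, in which case the wrong replacement is inserted silently instead of raising an error.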