I am writing a Python script in which I need to replace strings surrounded by quotation marks with underscores, and the number of underscores should equal the length of each string. Here is what I have tried:
>>> re.sub(r'"(.+?)"', (len(r"\1") * "_"), '"hellohello"')
'__'
Apparently, I have got something wrong there. The expected outcome for the test above would be 10 underscores but I got two. Any ideas where it went wrong?
[EDIT] I think r"\1" does refer back to the first captured group, because
>>> re.sub(r'"(.+?)"', r"\1" + "pp", '"hellohello"')
'hellohellopp'
CodePudding user response:
Using re.sub with a callback function works nicely here:
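import re

inp = '"hellohello"'
# The callback receives the match object; the surrounding quotes are kept in the output
output = re.sub(r'"(.*?)"', lambda m: '"' + re.sub(r'.', '_', m.group(1)) + '"', inp)
print(output) # "__________"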
The trick here is to match "...", while capturing the quoted content in the first capture group. We then replace each character of that content with a single underscore.
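As for why the original attempt returned two underscores: len(r"\1") is evaluated by Python before re.sub is ever called, so it measures the two-character literal \1 rather than the captured text. The sketch below is one possible variant that computes the length inside the callback and drops the quotes, which matches the expected output in the question. The function name mask_quoted is only an illustrative choice, not something from the original posts.

```python
import re

def mask_quoted(text):
    # Hypothetical helper: replace every quoted string, quotes included,
    # with as many underscores as there are characters between the quotes.
    return re.sub(r'"(.*?)"', lambda m: "_" * len(m.group(1)), text)

# The raw string r"\1" is just a backslash and a digit, hence length 2.
print(len(r"\1"))                         # 2
print(mask_quoted('"hellohello"'))        # __________
print(mask_quoted('say "hi" and "bye"'))  # say __ and ___
```

To keep the quotes in the result, as in the answer above, the callback could return '"' + "_" * len(m.group(1)) + '"' instead.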