I have the following string:
string = r"string\032with\032backslash\032\092\032and\010new\035line"
What I want to do is replace every escaped triple of digits (meant to be read as a decimal number) with the corresponding character, using chr().
What I tried to do was
re.sub('(\\[0-9]{3})', chr('\1'), string)
as re.sub allows users to use matched groups in replacement. But this does not work. What would be the correct way to do this?
EDIT:
string = "string" + chr(32) + 'with' + chr(32) + 'backslash' + chr(32) + chr(92) + chr(32) + 'and' + chr(10) + 'new' + chr(35) + 'line'
returns (correctly)
string with backslash \ and
new#line
CodePudding user response:
You made 2 mistakes:
- Your pattern needs to be a raw string as well. Otherwise the `\\` becomes a string containing a single `\`, which has special meaning inside a regex.
- If you want to change the match before inserting it (here: remove the `\`, convert the number into an integer, and then into a character), you need to pass a function as the replacement.
>>> import re
>>> re.sub(r'(\\[0-9]{3})', lambda match: chr(int(match.group(0)[1:])), string)
'string with backslash \\ and\nnew#line'
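If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name `decode_decimal_escapes` is made up for illustration, and it moves the capture group so that it covers only the digits, which avoids slicing off the backslash. The last line shows why the non-raw pattern from the question finds nothing.

```python
import re


def decode_decimal_escapes(text):
    """Replace each backslash followed by three decimal digits with chr() of that number."""
    # Group 1 holds only the digits, so the callback can convert it directly.
    return re.sub(r'\\([0-9]{3})', lambda m: chr(int(m.group(1))), text)


string = r"string\032with\032backslash\032\092\032and\010new\035line"
decoded = decode_decimal_escapes(string)
print(repr(decoded))  # 'string with backslash \\ and\nnew#line'

# Without the r prefix, '(\\[0-9]{3})' is the regex (\[0-9]{3}): an escaped
# literal "[" followed by "0-9" and three "]", so it matches nothing here.
print(re.search('(\\[0-9]{3})', string))  # None
```

A function is needed as the replacement because `chr('\1')` would be evaluated once, before `re.sub` is even called, rather than once per match.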