I have the following string:
s = r'aaa (bbb (ccc)) ddd'
and I would like to find the innermost nested parentheses and replace them with {}. Desired output:
s = r'aaa (bbb {ccc}) ddd'
Let's start with the nested (. I use the following regex to find nested parentheses, which works pretty well:
match = re.search(r'\([^\)]+ (\()', s)
print(match.group(1))
(
Then I try to make the substitution:
re.sub(match.group(1), r'\{', s)
but I get the following error:
error: missing ), unterminated subpattern at position 0
I really don't understand what's wrong.
CodePudding user response:
The first argument to re.sub is interpreted as a regex pattern, not as literal text:
sub(pattern, repl, string, count=0, flags=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the Match object and must return a replacement string to be used.
The pattern comes first, and because you passed match.group(1), re.sub sees '(' as the pattern, which is an unmatched, unescaped parenthesis.
I think what you are after is something like:
re.sub(r'\([^\)]+ (\()', r'\1{', s)
'aaa ({ccc)) ddd'
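To illustrate the point about '(' being read as a pattern, here is a minimal sketch that reproduces the error and shows one way around it. Using re.escape is an assumption about what the asker wanted (treating the captured text literally), and the names error_message, escaped and replaced are only for illustration. Note that this replaces the first ( in the string, not the inner one, so it only demonstrates the cause of the error.

```python
import re

s = r'aaa (bbb (ccc)) ddd'

# A lone "(" is not a valid regex: it opens a group that is never closed.
try:
    re.compile('(')
except re.error as exc:
    error_message = str(exc)
    print(error_message)

# re.escape turns literal text into a pattern that matches it verbatim.
escaped = re.escape('(')
replaced = re.sub(escaped, '{', s, count=1)
print(replaced)  # aaa {bbb (ccc)) ddd
```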
CodePudding user response:
You can use
import re
s = r'aaa (bbb (ccc)) ddd'
print( re.sub(r'\(([^()]*)\)', r'{\1}', s) )
# => aaa (bbb {ccc}) ddd
Details:
- \( - a ( char
- ([^()]*) - Group 1 (\1): any zero or more chars other than ( and )
- \) - a ) char.
The replacement is a Group 1 value wrapped with curly braces.
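As a quick check of how this pattern behaves, the sketch below wraps it in a small helper (brace_innermost is a hypothetical name, not part of the answer). It suggests that the pattern rewrites every pair of parentheses that contains no other parentheses, so a non-nested top-level pair is converted as well.

```python
import re

def brace_innermost(text):
    # Hypothetical helper: rewrites every "(...)" pair that contains no other parentheses.
    return re.sub(r'\(([^()]*)\)', r'{\1}', text)

nested = brace_innermost('aaa (bbb (ccc)) ddd')
mixed = brace_innermost('(a) (b (c))')
print(nested)  # aaa (bbb {ccc}) ddd
print(mixed)   # {a} (b {c})
```

If only truly nested pairs should be converted, the third answer's anchored pattern may be a better fit for inputs like the second one.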
CodePudding user response:
With your shown samples and attempts, please try the following code, written and tested in Python 3.x.
import re
var = r'aaa (bbb (ccc)) ddd'
print( re.sub(r'(^.*?\([^(]*)\(([^)]*)\)(.*)', r'\1{\2}\3', var) )
Output for the shown sample:
aaa (bbb {ccc}) ddd
Explanation of Python code:
- Using Python's re library here for regex.
- Creating a variable named var which has the value aaa (bbb (ccc)) ddd in it.
- Then using the print function of Python 3 to print the value returned by the re.sub function, which performs the substitution to get the required output.
Explanation of the re.sub section: we use the regex (^.*?\([^(]*)\(([^)]*)\)(.*) (explained below), which creates 3 capturing groups. The 1st group captures everything before the ( that precedes ccc, the 2nd group holds ccc, and the 3rd group holds the rest of the string. The substitution \1{\2}\3 then puts the pieces back together with ccc wrapped in {..}
Explanation of regex:
(^.*?\([^(]*)  ##1st capturing group: lazily matches from the start of the string up to the first occurrence of (,
               ##then matches everything up to (but not including) the next (
\(             ##Literal (, not captured because we do not want it in the output.
([^)]*)        ##2nd capturing group: everything up to the next ).
\)             ##Literal ), not captured because we do not want it in the output.
(.*)           ##3rd capturing group: the rest of the string.
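The annotated breakdown above can also be kept inside the code itself. The following is a minimal sketch, assuming the re.VERBOSE flag is acceptable here; it uses the same regex and replacement as the answer, and the names pattern and result are only for illustration.

```python
import re

# Same regex as above, written with re.VERBOSE so each part carries its own comment.
pattern = re.compile(r"""
    (^.*?\([^(]*)   # group 1: start of string through the first (, then any chars that are not (
    \(              # literal ( opening the inner pair, not captured
    ([^)]*)         # group 2: everything up to the next )
    \)              # literal ) closing the inner pair, not captured
    (.*)            # group 3: the rest of the string
""", re.VERBOSE)

var = r'aaa (bbb (ccc)) ddd'
result = pattern.sub(r'\1{\2}\3', var)
print(result)  # aaa (bbb {ccc}) ddd
```

In verbose mode, whitespace outside character classes and anything after # is ignored, so the compiled pattern is equivalent to the one-line version.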