I have a string containing parentheses, which looks like this:
(abc)(abc(dbc))
desired output:
abc abc(dbc)
I tried using a regular expression:
cleanstring = re.sub('[^A-Za-z0-9._]+', ' ', string)
but it removes all the parentheses.
CodePudding user response:
A regex will probably not cut it (if/when there are more levels of parens), but a state machine like this seems to do the trick for your input:
def strip_first_level_parens(s):
    level = 0  # current nesting depth
    out = ""
    last_was_level0_paren = False
    for c in s:
        if c == "(":
            level += 1
            if level == 1:  # don't emit first paren
                if last_was_level0_paren:  # add space between otherwise unemitted )(
                    out += " "
                continue
        elif c == ")":
            if not level:
                raise ValueError("Unopened parens")
            level -= 1
            if level == 0:  # returned to level 0, don't emit the closing paren
                last_was_level0_paren = True
                continue
        out += c
        last_was_level0_paren = False
    if level:
        raise ValueError("Unclosed parens")
    return out
print(strip_first_level_parens("(abc)(abc(dbc)) (blah blah) (blah (blahbleh))"))
outputs
abc abc(dbc) blah blah blah (blahbleh)
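If you would rather get the groups back as a list, the same depth-counting idea can be sketched like this. The name split_top_level_groups is made up for illustration, and unlike the function above this variant assumes you only care about text inside parentheses, so anything outside a top-level group is dropped:

```python
def split_top_level_groups(s):
    """Return the contents of each top-level (...) group in s."""
    groups = []
    depth = 0
    start = None  # index just past the "(" that opened the current top-level group
    for i, c in enumerate(s):
        if c == "(":
            depth += 1
            if depth == 1:
                start = i + 1
        elif c == ")":
            if depth == 0:
                raise ValueError("Unopened parens")
            depth -= 1
            if depth == 0:  # a top-level group just closed
                groups.append(s[start:i])
    if depth:
        raise ValueError("Unclosed parens")
    return groups


print(" ".join(split_top_level_groups("(abc)(abc(dbc))")))  # abc abc(dbc)
```

Slicing the input instead of appending character by character also avoids repeated string concatenation on long inputs.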
CodePudding user response:
Use this:
import re
s = '(abc)(abc(dbc))'
' '.join(re.findall(r'\((\(*(?:[^)(]*|\([^)]*\))*\)*)\)', s))
Output:
'abc abc(dbc)'
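To make the pattern above easier to read, here is the same expression written with re.VERBOSE and commented piece by piece. The name TOP_LEVEL is only for illustration, and the comments are my reading of the pattern rather than the answerer's own explanation:

```python
import re

TOP_LEVEL = re.compile(r"""
    \(                  # opening paren of a top-level group (not captured)
    (                   # capture the contents of the group
        \(*             #   optional leading opening parens
        (?:
            [^)(]*      #   a run of non-paren characters
          | \([^)]*\)   #   or "(" followed by everything up to the next ")"
        )*
        \)*             #   optional trailing closing parens
    )
    \)                  # closing paren of the top-level group (not captured)
""", re.VERBOSE)

s = "(abc)(abc(dbc))"
print(" ".join(TOP_LEVEL.findall(s)))  # abc abc(dbc)
```

Because the pattern has exactly one capturing group, findall returns only the captured contents, which is why the outer parentheses disappear.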
CodePudding user response:
Something easy-peasy like this works for this exact input, since the counts of 2 are hard-coded to its two top-level groups. Less is more.
string = "(abc)(abc(dbc))"
cleanstring = string.replace('(', ' ', 2).replace(')', '', 2).lstrip()
print(cleanstring)
Output:
abc abc(dbc)
CodePudding user response:
Code:
import regex  # third-party module (pip install regex); supports recursion via (?R)
' '.join([subtext[1:-1] for subtext in regex.findall(r'\((?:\w+|(?R))*\)', '(abc)(abc(dbc))')])
Output:
'abc abc(dbc)'