Is there an easy way to remove unnecessary whitespaces inside of brackets that are in the middle of

Time:09-16

I have strings of the form:

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."

and I would like to get a cleaned string in the form of:

s = "Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors."

I tried to fix it with regex:

import re

regex = r" (?=[^(]*\))"
s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are some errors."
print(re.sub(regex, "", s))

But I get faulty results like this: Wow that is really nice, (2.1) shows that according to the drawings in (1.1)anda) there are some errors.

Does anyone know how to deal with this problem when you don't always have the same number of opening and closing brackets?

CodePudding user response:

I'm not sure this covers every case, but you can try the following:

s = s.replace('( ','(')
s = s.replace(' )',')')

Here replace(old, new) is the standard string method that replaces every occurrence of old with new. I hope it helps.

CodePudding user response:

If the only whitespace you want to remove is the space directly after an opening bracket (or directly before a closing one), then a simple string replace might work:

>>> s.replace("( ", "(").replace(" )", ")")
'Wow that is really nice, (2.1) shows that according to the drawings in (1. 1) and a) there are errors.'

CodePudding user response:

You can match all the innermost parentheses with a simple regex, and then run a substitution on each match to remove the spaces inside it.

import re
s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
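# Match "(", then any run of non-parenthesis characters, then ")".
regex = r"\([^()]*\)"
# m[0] is the whole matched "(...)"; strip the spaces inside it.
res = re.sub(regex, lambda m: m[0].replace(" ", ""), s)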

print(res)
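
A minimal sketch of the same idea wrapped in a reusable function and checked against the string from the question. The name `strip_spaces_in_parens` is made up for illustration. Because the pattern needs both an opening and a closing bracket, the unmatched `a)` is left alone.

```python
import re

# Same pattern as in the answer: "(", non-parenthesis characters, ")".
INNERMOST_PARENS = re.compile(r"\([^()]*\)")

def strip_spaces_in_parens(text):
    """Remove spaces inside each innermost (...) pair; leave the rest untouched.

    Hypothetical helper name, not part of any library.
    """
    return INNERMOST_PARENS.sub(lambda m: m.group(0).replace(" ", ""), text)

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
print(strip_spaces_in_parens(s))
```

With nested brackets only the innermost pair is cleaned, so `( a ( b ) c )` would become `( a (b) c )`.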

CodePudding user response:

Try:

 r" (?=[^()]*\))"

This also excludes ')' from the characters that can appear between the space and the closing parenthesis.
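
A quick check of this pattern on the sample string from the question, as a sketch rather than a fix. The lookahead only asks whether a `)` follows with no bracket in between; it cannot tell whether that `)` has a matching `(`.

```python
import re

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."

# Remove a space when only non-bracket characters separate it from the next ")".
lookahead_result = re.sub(r" (?=[^()]*\))", "", s)
print(lookahead_result)
```

On this input the spaces before the unmatched `a)` are still removed, so the result still contains `(1.1)anda)`.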

Whether this works depends on whether you have nested brackets in your text.

Nested brackets are not something a regex like this can handle reliably; you need a small parser (it may need to count the brackets).
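
One way such a bracket-counting parser might look, as a sketch under the assumption that only spaces inside properly matched `(...)` pairs should go. The function name `strip_spaces_in_matched_parens` is hypothetical. A stack pairs each `)` with the most recent unmatched `(`; stray brackets on either side never form a pair, so text around them is left untouched.

```python
def strip_spaces_in_matched_parens(text):
    """Remove spaces inside matched (...) pairs, including nested ones."""
    open_positions = []   # indices of "(" still waiting for a ")"
    spans = []            # (start, end) index pairs of matched brackets
    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            spans.append((open_positions.pop(), i))
        # A ")" with no pending "(" is unmatched and simply ignored.

    # Mark every character that lies inside at least one matched pair.
    inside = [False] * len(text)
    for start, end in spans:
        for j in range(start, end + 1):
            inside[j] = True

    return "".join(ch for i, ch in enumerate(text)
                   if not (ch == " " and inside[i]))

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
print(strip_spaces_in_matched_parens(s))
print(strip_spaces_in_matched_parens("f( a ( b ) c ) d"))
```

The marking step is written for clarity, not speed; deeply nested input revisits the same characters once per enclosing pair.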
