I have this string:
(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)
I need to grab the commas inside the parentheses for further processing, and I want the commas splitting the groups to remain.
Let's say I want to replace the target commas with FOO; the result should be:
(40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
I want the Regular Expression, I don't want some language specific functions for this.
CodePudding user response:
You can just use a negative lookbehind to find every , that is not preceded by a ) like this:
(?<!\)),
"I don't want some language specific functions for this"
The regex above does not depend on any language-specific function; it only needs an engine that supports lookbehind, as can be seen in the following code snippet or by testing it on regex101:
const x = '(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)';
const rgx = /(?<!\)),/g;
console.log(x.replace(rgx, ' FOO'));
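Since the question asks for a pattern rather than a language feature, here is a minimal sketch of the same lookbehind in Python's `re` module. The shortened `sample` string and the variable names are assumptions made for brevity; they are not from the original post.

```python
import re

# Shortened sample in the same shape as the question's string (illustrative only).
sample = "(40.95, -74.18),(40.96, -74.10),(40.91, -74.09)"

# Match a comma only when the character immediately before it is not ")".
inner_comma = re.compile(r"(?<!\)),")

replaced = inner_comma.sub(" FOO", sample)
print(replaced)
```

Note that this pattern matches any comma that does not directly follow a `)`, so it relies on the input containing no stray commas outside the groups.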
CodePudding user response:
For example:
import re
s = "(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)"
s = re.sub(r",(?=[^()]*\))", " FOO", s)
print(s)
# (40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
We use a positive lookahead to replace only those commas that are followed by a ) with no ( or ) in between, i.e. commas where ) comes before ( ahead in the string.
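To see which commas the lookahead selects, the following sketch lists the match positions on a short string. The `sample` value is a made-up stand-in for the question's data, and `positions` is an illustrative name.

```python
import re

# Two small groups separated by a comma (hypothetical data, same shape as the question).
sample = "(1.5, -2.5),(3.5, -4.5)"

# A comma qualifies only if a ")" follows it with no "(" or ")" in between.
pattern = re.compile(r",(?=[^()]*\))")

# Indexes of the commas that match; the separator comma at index 11 is skipped
# because the next bracket after it is "(" rather than ")".
positions = [m.start() for m in pattern.finditer(sample)]
print(positions)
```

The separator comma is rejected because `[^()]*` cannot step over the `(` that follows it, so the required `\)` is never reached.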
CodePudding user response:
Use re.sub with a callback function:
import re
inp = "(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)"
# Capture both numbers of each tuple, then rebuild the tuple with FOO in place of the comma.
output = re.sub(r'\((-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)\)', lambda m: r'(' + m.group(1) + r' FOO ' + m.group(2) + r')', inp)
print(output)
This prints:
(40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
The strategy here is to capture the two numbers in each tuple in separate groups. Then we rebuild each tuple, joining the two numbers with FOO instead of the original comma.
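The same capture-group strategy can also be written without a callback, using backreferences in the replacement string. This is a sketch of an alternative, not the answer's own code; the names `number`, `pair`, and `sample` are illustrative.

```python
import re

# Shortened sample in the same shape as the question's string (illustrative only).
sample = "(40.95, -74.18),(40.96, -74.10)"

# One signed decimal number: optional minus, digits, optional fractional part.
number = r"-?\d+(?:\.\d+)?"

# Capture the two numbers of each tuple in groups 1 and 2.
pair = re.compile(rf"\(({number}),\s*({number})\)")

# \1 and \2 in the replacement refer back to the captured numbers.
result = pair.sub(r"(\1 FOO \2)", sample)
print(result)
```

Because the pattern consumes the whole tuple, the commas between tuples are never part of a match and stay untouched.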