For removing everything between parentheses, currently i use:
SELECT
REGEXP_REPLACE('(aaa) bbb (ccc (ddd) / eee)', "\\([^()]*\\)", "");
Which is incorrect, because it gives bbb (ccc / eee)
, as that removes inner parentheses only.
How to remove everynting between nested parentheses? so expected result from this example is bbb
CodePudding user response:
In case of Google BigQuery, this is only possible if you know your maximum number of nestings. Because it uses re2 library that doesn't support regex recursions.
let r = /\((?:(?:\((?:[^()])*\))|(?:[^()]))*\)/g
let s = "(aaa) bbb (ccc (ddd) / eee)"
console.log(s.replace(r, ""))
CodePudding user response:
If you can iterate on the regular expression operation until you reach a fixed point you can do it like this:
repeat {
old_string = string
string := remove_non_nested_parens_using_regex(string)
} until (string == old_string)
For instance if we have
((a(b)) (c))x)
on the first iteration we remove (b)
and (c)
: sequences which begin with (
, end with )
and do not contain parentheses, matched by \([^()]*\)
. We end up with:
((a) )x)
Then on the next iteration, (a)
is gone:
( )x)
and after one more iteration, ( )
is gone:
x)
when we try removing more parentheses, there is no more change, and so the algorithm terminates with x)
.