Regex to find all strings that are not inside balanced parentheses

Time:07-15

I'm looking for a regex that works in JavaScript and finds all strings that are not inside balanced parentheses, i.e. all strings that start and end with the char " but are not enclosed by both a ( and a ).

For each of these inputs:

1. ("ggg" "H"
2. "ggg") "H"
3. ("ggg") "H"
4. "gg()g" "H"
5. "gg)g" "H"
6. "gg(g" "H"

I expect these matches:

1. "ggg", "H"
2. "ggg", "H"
3. "H"
4. "gg()g", "H"
5. "gg)g", "H"
6. "gg(g", "H"

This is what I have so far, but it doesn't work for strings that contain the char ):

(?<!\()"[^\)]+?"

CodePudding user response:

A common trick makes this easier: match what you don't want, but capture what you need, i.e. not this|(but that), and then process the matches on the JS side. For your task, e.g.:

\("[^"]+"\)|("[^"]+")

See this demo at Regex101 (in the multiline demo, \n is added to the negated class so that a match cannot run across lines).

To extract, use e.g. exec and check whether group 1 is set. If it is, the match came from the right side of the alternation (i.e. it is not inside balanced parentheses). Here is a JS demo at tio.run using exec or replace.
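
In case the linked demos are unavailable, here is a rough sketch of the exec approach described above. The function name stringsOutsideParens is made up for illustration, and the pattern assumes the quantifier after each [^"] is +. Like the regex itself, it only treats a quoted string as "inside" when the ( sits directly before the opening " and the ) directly after the closing ".

```javascript
// Hypothetical helper: returns every quoted string that is NOT directly
// wrapped in ( and ), using the "match what you don't want, capture what
// you need" alternation.
function stringsOutsideParens(text) {
  // Created per call so lastIndex always starts at 0.
  // Left side:  ("...")  is matched but not captured.
  // Right side: "..."    is captured in group 1.
  const pattern = /\("[^"]+"\)|("[^"]+")/g;
  const results = [];
  let m;
  while ((m = pattern.exec(text)) !== null) {
    // Group 1 is only set when the right side of the alternation matched.
    if (m[1] !== undefined) {
      results.push(m[1]);
    }
  }
  return results;
}

console.log(stringsOutsideParens('("ggg") "H"')); // [ '"H"' ]
console.log(stringsOutsideParens('("ggg" "H"'));  // [ '"ggg"', '"H"' ]
```

On the second input the left side fails at the ( because no ) follows the closing quote, so the engine moves on and the right side picks up "ggg" on its own.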
