RegEx that gives letters not enclosed by parentheses

Time:08-19

I want to write a regular expression that gives me letters that are not enclosed by parentheses. It should not give me any letters in any parentheses.

For example:

If the input is x y^(a b*c) z
it should give me x, y, and z, but not a, b, and c

I've tried this but it didn't work:

/[^\(][a-z][^\)]/g

Test strings:

"x y^(a b) z"         // should return ["x", "y", "z"]
"x y^(a b) z*(c-d)/w" // should return ["x", "y", "z", "w"]
"x y^(a b*(c)) z"     // should return ["x", "y", "z"]

Also, the indexes of all the letters in the original string are needed.

Please answer with an explanation

CodePudding user response:

Here are two solutions:

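const strings = [
  'x y^(a b) z',
  'x y^(a b c) q z',
  'x y^(a b) p q^(c d) z'
];
strings.forEach(str => {
  // match1: letters not preceded by an unclosed "(" (needs lookbehind support)
  let match1 = str.match(/(?<!\([^\)]*)[a-z]+/g);
  // match2: strip each "^(..." group first, then match the remaining letters
  let match2 = str.replace(/\^\([^\)]*/g, '').match(/[a-z]+/g);
  console.log(str + ' =>'
    + '\n  match1: ' + match1
    + '\n  match2: ' + match2);
});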
Output:

x y^(a b) z =>
  match1: x,y,z
  match2: x,y,z
x y^(a b c) q z =>
  match1: x,y,q,z
  match2: x,y,q,z
x y^(a b) p q^(c d) z =>
  match1: x,y,p,q,z
  match2: x,y,p,q,z

Explanation for match1:

  • .match(/(?<!\([^\)]*)[a-z]+/g) -- the negative lookbehind rejects any letter that is preceded by a ( with no ) in between
  • note that negative lookbehind does not work in all browsers, notably older versions of Safari
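
The question also asks for the position of each letter, which .match() with the g flag does not return. One way to get it is String.prototype.matchAll, whose match objects carry an index property. The following is only a sketch under the same assumptions as match1 (lookbehind support, no nested parentheses); the helper name lettersOutsideParens is made up for illustration:

```javascript
// Hypothetical helper (not from the answer above): letters outside
// parentheses together with their positions in the original string.
// Assumes lookbehind support and no nested parentheses.
function lettersOutsideParens(str) {
  // Same lookbehind as match1, but one letter per match.
  const pattern = /(?<!\([^)]*)[a-z]/g;
  // Each matchAll result exposes the position of the match as .index.
  return Array.from(str.matchAll(pattern), m => ({ letter: m[0], index: m.index }));
}

console.log(lettersOutsideParens('x y^(a b) z*(c-d)/w'));
```

For the second test string this should report x, y, z and w at positions 0, 2, 10 and 18.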

Explanation for match2:

  • .replace(/\^\([^\)]*/g, '') -- remove each ^(... group (the closing ) stays behind, which is harmless here)
  • .match(/[a-z]+/g) -- simple match for letters

Do you have nested parentheses? That is possible too, by pre-tagging each parenthesis with its nesting level. Let me know.
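
For nested parentheses, an alternative to a regex is a single pass over the string with a depth counter. This is not the pre-tagging approach mentioned above but a simpler substitute. It is a rough sketch, and the function name topLevelLetters is an assumption for illustration:

```javascript
// Hypothetical helper: walk the string once and track how many "("
// are currently open. A letter counts only when that depth is zero.
function topLevelLetters(str) {
  const result = [];
  let depth = 0;
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === '(') {
      depth++;
    } else if (ch === ')') {
      depth = Math.max(0, depth - 1); // ignore a stray ")"
    } else if (depth === 0 && ch >= 'a' && ch <= 'z') {
      result.push({ letter: ch, index: i });
    }
  }
  return result;
}

console.log(topLevelLetters('x y^(a b*(c)) z'));
```

Because the loop index is the position in the original string, the indexes come for free, and arbitrary nesting depth is handled without lookbehind.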

CodePudding user response:

For your specific case you could use negative lookbehind / lookahead:

(?<!\()[a-z](?!\))

https://regex101.com/r/C0L228/1

Example:

const rootSymbols = /(?<!\()[a-z](?!\))/g;
console.log("x y^(a b) z".match(rootSymbols)); // ["x", "y", "z"]

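If you have nested brackets, I think the simplest approach is to get rid of them first. Note that the greedy .* removes everything from the first ( to the last ), so this only works when the string contains a single outer group: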

const rootSymbols = (str) => str.replace(/\(.*\)/g, "").match(/[a-z]/g);
console.log(rootSymbols("x y^(a b-(c)) z")); // ["x", "y", "z"]
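
When a string contains several separate groups as well as nesting, one possible variation is to remove only innermost groups (those with no parentheses inside) and repeat until nothing changes. This is a sketch assuming balanced parentheses; stripParens and rootLetters are names invented here:

```javascript
// Hypothetical helper: repeatedly delete innermost "(...)" groups,
// i.e. groups whose content contains no further parentheses.
function stripParens(str) {
  const innermost = /\([^()]*\)/g;
  let prev;
  do {
    prev = str;
    str = str.replace(innermost, '');
  } while (str !== prev); // stop once a pass removes nothing
  return str;
}

// Letters left over after all groups are gone (empty array if none).
const rootLetters = (str) => stripParens(str).match(/[a-z]/g) ?? [];

console.log(rootLetters('x y^(a b) z*(c-d)/w'));
console.log(rootLetters('x y^(a b*(c)) z'));
```

Each pass peels off one nesting level, so the loop runs at most one more time than the deepest nesting level.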

CodePudding user response:

Ugly JavaScript solution:

const symbols = (str) => str.replace(/\(([^\)]+)/g, '').match(/[a-z]+/g)?.map(a => ({letter: a, index: str.indexOf(a)}));

console.log(symbols("x y^(a b) z")); // ["x", "y", "z"] with indexes [0, 2, 10]
console.log(symbols("x y^(a b) z*(c-d)/w")); // ["x", "y", "z", "w"] with indexes [0, 2, 10, 18]
console.log(symbols("x y^(a b*(c)) z")); // ["x", "y", "z"] with indexes [0, 2, 14]

The contents of the parentheses are removed first, then the remaining letters are matched, and finally the indexes are added. The resulting objects look like this:

// "x y^(a b) z*(c-d)/w"
[
    {
        "letter": "x",
        "index": 0
    },
    {
        "letter": "y",
        "index": 2
    },
    {
        "letter": "z",
        "index": 10
    },
    {
        "letter": "w",
        "index": 18
    }
]
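
One caveat with str.indexOf(a) is that it always returns the first occurrence, so a letter that appears more than once (or also appears inside parentheses earlier in the string) gets the wrong index. A possible workaround, sketched here under the assumption of balanced parentheses, is to overwrite each group with spaces of the same length instead of deleting it, so positions in the masked string still line up with the original. The name symbolsWithIndexes is hypothetical:

```javascript
// Hypothetical variant: mask parenthesized groups with equal-length
// padding so that match positions stay valid for the original string.
function symbolsWithIndexes(str) {
  let masked = str;
  let prev;
  do {
    prev = masked;
    // Blank out innermost groups; repeat to work outward through nesting.
    masked = masked.replace(/\([^()]*\)/g, m => ' '.repeat(m.length));
  } while (masked !== prev);
  return Array.from(masked.matchAll(/[a-z]/g), m => ({ letter: m[0], index: m.index }));
}

console.log(symbolsWithIndexes('a^(a) a'));
console.log(symbolsWithIndexes('x y^(a b*(c)) z'));
```

In the first call the two top-level a letters should come back with indexes 0 and 6, where an indexOf lookup would report 0 for both.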