I want to write a regular expression that matches only the letters that are not enclosed in parentheses. It should not return any letter that appears inside parentheses.
For example:
If the input is x y^(a b*c) z
it should give me x, y, and z, but not a, b, and c
I've tried this but it didn't work:
/[^\(][a-z][^\)]/g
Test strings:
"x y^(a b) z" // should return ["x", "y", "z"]
"x y^(a b) z*(c-d)/w" // should return ["x", "y", "z", "w"]
"x y^(a b*(c)) z" // should return ["x", "y", "z"]
I also need the index of each matched letter in the original string.
Please include an explanation with your answer.
CodePudding user response:
Here are two solutions:
[
  'x y^(a b) z',
  'x y^(a b c) q z',
  'x y^(a b) p q^(c d) z'
].forEach(str => {
  // match1: letters not preceded by an unclosed "("
  let match1 = str.match(/(?<!\([^\)]*)[a-z]+/g);
  // match2: strip each "^(..." part first, then match letters
  let match2 = str.replace(/\^\([^\)]*/g, '').match(/[a-z]+/g);
  console.log(str + ' =>'
    + '\n match1: ' + match1
    + '\n match2: ' + match2);
});
Output:
x y^(a b) z =>
match1: x,y,z
match2: x,y,z
x y^(a b c) q z =>
match1: x,y,q,z
match2: x,y,q,z
x y^(a b) p q^(c d) z =>
match1: x,y,p,q,z
match2: x,y,p,q,z
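Explanation for match1:
.match(/(?<!\([^\)]*)[a-z]+/g)
-- negative lookbehind: skips letters that are preceded by a "(" with no ")" in between, i.e. letters inside a ^(...) pattern
-- note that negative lookbehind is not supported in all browsers, notably older versions of Safari
Explanation for match2:
.replace(/\^\([^\)]*/g, '')
-- remove all ^(... patterns (everything from "^(" up to, but not including, the next ")")
.match(/[a-z]+/g)
-- simple match for letters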
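The question also asks for the position of each letter. A minimal sketch of how match1 could report indexes, assuming a runtime that supports both lookbehind and `String.prototype.matchAll`; the helper name `lettersWithIndex` is made up for illustration, and like match1 it assumes there are no nested parentheses:

```javascript
// Hypothetical helper: same lookbehind as match1, but keeps each match's position.
function lettersWithIndex(str) {
  const outside = /(?<!\([^)]*)[a-z]+/g;
  // matchAll yields one match object per hit; .index is the offset in str
  return [...str.matchAll(outside)].map(m => ({ letter: m[0], index: m.index }));
}

console.log(lettersWithIndex('x y^(a b) z'));
```

Because the matches are taken from the original string, the reported offsets need no further adjustment.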
Do you have nested parentheses? Those can be handled too, by pre-tagging each parenthesis with its nesting level. Let me know.
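One possible reading of the nesting-level idea, sketched without a regex: walk the string once, keep a depth counter, and collect letters only while the depth is zero. The function name `lettersAtDepthZero` is hypothetical, and the sketch assumes lowercase ASCII letters and reports single characters rather than runs of letters:

```javascript
// Hypothetical helper: single pass with a nesting-depth counter instead of a regex.
function lettersAtDepthZero(str) {
  const result = [];
  let depth = 0;
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === '(') {
      depth++;
    } else if (ch === ')') {
      depth = Math.max(0, depth - 1); // tolerate a stray ')'
    } else if (depth === 0 && ch >= 'a' && ch <= 'z') {
      result.push({ letter: ch, index: i }); // index in the original string
    }
  }
  return result;
}

console.log(lettersAtDepthZero('x y^(a b*(c)) z'));
```

This handles arbitrary nesting and several separate groups, and it yields the indexes for free.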
CodePudding user response:
For your specific case you could use negative lookbehind / lookahead:
(?<!\()[a-z](?!\))
https://regex101.com/r/C0L228/1
Example:
const rootSymbols = /(?<!\()[a-z](?!\))/g;
console.log("x y^(a b) z".match(rootSymbols)); // ["x", "y", "z"]
If you have nested parentheses, I think the simplest approach is to get rid of them first:
const rootSymbols = (str) => str.replace(/\(.*\)/g, "").match(/[a-z]/g);
console.log(rootSymbols("x y^(a b-(c)) z")); // ["x", "y", "z"]
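One caveat: `\(.*\)` is greedy, so with two separate groups, as in "x y^(a b) z*(c-d)/w", it removes everything from the first "(" to the last ")", including the z in between. A possible variant, sketched under the assumption that parentheses are balanced, strips the innermost groups repeatedly until nothing changes (the name `rootSymbolsNested` is made up here):

```javascript
// Hypothetical helper: strip innermost (...) groups until none are left.
function rootSymbolsNested(str) {
  const innermost = /\([^()]*\)/g; // a group that contains no other parenthesis
  let prev;
  do {
    prev = str;
    str = str.replace(innermost, '');
  } while (str !== prev);
  return str.match(/[a-z]/g) || [];
}

console.log(rootSymbolsNested('x y^(a b-(c)) z'));
console.log(rootSymbolsNested('x y^(a b) z*(c-d)/w'));
```

Each pass peels off one nesting level, so the loop runs once per level plus a final pass that detects no change.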
CodePudding user response:
Ugly JavaScript solution:
const symbols = (str) => str.replace(/\(([^\)]+)/g, '').match(/[a-z]+/g)?.map(a => ({letter: a, index: str.indexOf(a)}));
console.log(symbols("x y^(a b) z")); // ["x", "y", "z"] with indexes [0, 2, 10]
console.log(symbols("x y^(a b) z*(c-d)/w")); // ["x", "y", "z", "w"] with indexes [0, 2, 10, 18]
console.log(symbols("x y^(a b*(c)) z")); // ["x", "y", "z"] with indexes [0, 2, 14]
The parenthesized parts are removed first, then the remaining letters are matched, and finally the indexes are added. The resulting objects look like this:
// "x y^(a b) z*(c-d)/w"
[
  {
    "letter": "x",
    "index": 0
  },
  {
    "letter": "y",
    "index": 2
  },
  {
    "letter": "z",
    "index": 10
  },
  {
    "letter": "w",
    "index": 18
  }
]
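Note that `str.indexOf(a)` returns the first occurrence of a letter, so the index is off when the same letter appears more than once or also appears inside parentheses. One way around this, as a rough sketch: overwrite the parenthesized parts with spaces of the same length, so that match positions still refer to the original string. The name `symbolsWithIndex` is hypothetical, and balanced parentheses are assumed:

```javascript
// Hypothetical helper: blank out parenthesized parts (keeping the string length),
// so that match positions still refer to the original string.
function symbolsWithIndex(str) {
  const innermost = /\([^()]*\)/g; // a group that contains no other parenthesis
  let masked = str;
  let prev;
  do {
    prev = masked;
    masked = masked.replace(innermost, m => ' '.repeat(m.length));
  } while (masked !== prev);
  return [...masked.matchAll(/[a-z]+/g)].map(m => ({ letter: m[0], index: m.index }));
}

console.log(symbolsWithIndex('x y^(a b) z*(c-d)/w'));
console.log(symbolsWithIndex('a (a) a'));
```

Masking instead of deleting keeps every remaining character at its original offset, which is why `m.index` can be used directly.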