I have a program with hundreds of patterns similar to the following:
(^|\.|,|:|;|\"|-|‖|\[|\(|\{)m(z|y)aword(e|)(s|)($|\.|,|:|;|\"|-|‖|\]|\)|\})
The first and last groups match the beginning and end delimiters, respectively:
(^|\.|,|:|;|\"|-|‖|\[|\(|\{) << beginning
($|\.|,|:|;|\"|-|‖|\]|\)|\}) << end
Is there a simpler pattern that is semantically equivalent to the one above? The problem with this approach is that many words need to match this kind of pattern (I have more than 500 so far), which makes the program very slow. Any help or suggestions would be truly appreciated. Thanks. Bid
CodePudding user response:
An equivalent and less expensive regex would be:
(?:^|[.,:;"‖\[({-])m[zy]aworde?s?(?:[.,:;"‖\])}-]|$)
Notes:
- When possible, favor character classes [] over alternation with |.
- Favor non-capturing groups (?:). Capture groups cost resources, so shy away from them unless you need to explicitly reference something.
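To check that the two patterns really behave the same, a quick comparison along these lines may help. This is only a sketch in Python (the question does not say which language or regex engine is used), and the sample strings and the spans helper are made up for illustration:

```python
import re

# Original pattern from the question (capturing groups, one-character alternations).
ORIGINAL = re.compile(
    r'(^|\.|,|:|;|\"|-|‖|\[|\(|\{)m(z|y)aword(e|)(s|)($|\.|,|:|;|\"|-|‖|\]|\)|\})'
)

# Simplified pattern from the answer (character classes, non-capturing groups).
SIMPLIFIED = re.compile(
    r'(?:^|[.,:;"‖\[({-])m[zy]aworde?s?(?:[.,:;"‖\])}-]|$)'
)

# Hypothetical sample inputs, chosen to cover matches and non-matches.
SAMPLES = [
    "mzaword",
    "(myawords)",
    "a,mzaworde.",
    "xmzaword",
    "mzawordx",
    "mzaword myaword",
    "‖myawordes‖",
]


def spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]


for sample in SAMPLES:
    # Both patterns should find matches at exactly the same positions.
    assert spans(ORIGINAL, sample) == spans(SIMPLIFIED, sample), sample
```

Agreement on a handful of samples is not a proof of equivalence, but it catches mistakes such as a delimiter dropped from one of the character classes.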
My regex performance sample - 11,000 steps
Your regex performance sample - 46,000 steps
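Since the question mentions more than 500 such patterns, another thing that might be worth trying is to write the delimiter groups once and join the word-specific parts into a single alternation, so the text is scanned once rather than once per pattern. This goes beyond what was measured above and is only a sketch in Python: build_pattern, find_words, and the second entry in CORES are hypothetical names and data, and whether it actually helps depends on the regex engine.

```python
import re

# Shared delimiter groups, taken from the simplified pattern.
LEFT = r'(?:^|[.,:;"‖\[({-])'
RIGHT = r'(?:[.,:;"‖\])}-]|$)'


def build_pattern(cores):
    """Combine many word-specific sub-patterns into one compiled regex."""
    body = "|".join(f"(?:{core})" for core in cores)
    # A single named group captures whichever word matched.
    return re.compile(f"{LEFT}(?P<word>{body}){RIGHT}")


# The first entry comes from the question; the second is a made-up example.
CORES = [r"m[zy]aworde?s?", r"otherwords?"]
COMBINED = build_pattern(CORES)


def find_words(text):
    """Return every delimited word in text matched by any of the cores."""
    return [m.group("word") for m in COMBINED.finditer(text)]
```

For example, find_words('"mzawords",(otherword)') returns ['mzawords', 'otherword'].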