Home > OS >  Solve Catastrophic Backtracking in my regex
Solve Catastrophic Backtracking in my regex

Time:02-25

I am trying to detect names (French names) that can be in the following formats

CHARLES

CHARLES-ANTOINE

D'URENBURRY

CHARLES--ANTOINE

CHARLES ANTOINE

And for that I have the following regex

/^([A-Z] (([-]{0,2}|[']|[ ])?[A-Z] )) $/

Which works, but GitHub's code scanner shows this error

This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'AA'.

I understand the error, however, I'm not sure how to solve it.

My question is: how to solve it.

Thank you

CodePudding user response:

You need to re-write the pattern like

^[A-Z] (?:(?:-{1,2}|[' ])[A-Z] )*$

See the regex demo. Now, each subsequent pattern will match different string, at different locations inside the string.

Details:

  • ^ - start of string
  • [A-Z] - one or more uppercase ASCII letters
  • (?:(?:-{1,2}|[' ])[A-Z] )* - zero or more repetions of:
    • (?:-{1,2}|[' ]) - a non-capturing group matching one or two hyphens, a space or apostrophe
    • [A-Z] - one or more uppercase ASCII letters
  • $ - end of string.
  • Related