I am trying to detect names (French names) that can be in the following formats
CHARLES
CHARLES-ANTOINE
D'URENBURRY
CHARLES--ANTOINE
CHARLES ANTOINE
And for that I have the following regex
/^([A-Z] (([-]{0,2}|[']|[ ])?[A-Z] )) $/
Which works, but GitHub's code scanner shows this error
This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'AA'.
I understand the error, however, I'm not sure how to solve it.
My question is: how to solve it.
Thank you
CodePudding user response:
You need to re-write the pattern like
^[A-Z] (?:(?:-{1,2}|[' ])[A-Z] )*$
See the regex demo. Now, each subsequent pattern will match different string, at different locations inside the string.
Details:
^
- start of string[A-Z]
- one or more uppercase ASCII letters(?:(?:-{1,2}|[' ])[A-Z] )*
- zero or more repetions of:(?:-{1,2}|[' ])
- a non-capturing group matching one or two hyphens, a space or apostrophe[A-Z]
- one or more uppercase ASCII letters
$
- end of string.