I am using the following regex in JavaScript to validate an input field:
<textarea
id="kpf-message-textarea"
class="message-area"
name="message"
maxlength="1000"
aria-describedby="kpf-message-extra-text"
aria-invalid="true"
tabindex={this.kpfTabindex}
value={this.message}
onInput={(event) => this._handleChange(event)}>
</textarea>
this.message.match(/^([A-Za-z]|[0-9]|[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]|[ \\.^°!"²§³$%&\/\\{\\}\\(\\)=?´`@€ -\\*~'#<>|µ,;:_<CR><LF>]|[\n])+$/)
For the following input, 30 capital letters followed by one invalid character, the browser hangs, and only closing and reopening the browser helps:
ABCDEFGHIJKLMNOPAAAAAAAAAAAAAA¼
What's wrong here?
CodePudding user response:
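The pattern times out due to catastrophic backtracking. It cannot match the character ¼ at the end, yet it still tries to explore (backtrack through) all paths. With the outer repeating group and the alternations inside that group, there are a lot of options. Several alternatives can match the same character: the range ending in -\\ in the fourth character class also covers digits and uppercase letters, so each capital letter can be matched in more than one way and the number of paths grows exponentially with the input length.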
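To see the effect on a smaller scale, here is a minimal sketch. The reduced pattern `overlapping` is an assumption made for illustration, not the original regex: it keeps only two alternatives that both accept uppercase letters, which is the same kind of overlap as in the question. The helper name `timeMatch` is hypothetical too. The input length is capped at 20 so the demo finishes quickly. Timings depend on the engine, so none are quoted here.

```javascript
// Reduced illustration (assumption): two alternatives that both match A-Z.
// [ -\\] is a range from space (0x20) to backslash (0x5C), which includes A-Z.
const overlapping = /^([A-Z]|[ -\\])+$/;
// The same set of characters merged into one class: nothing to choose between.
const merged = /^[ -\\]+$/;

// Hypothetical helper: time a single test() call in milliseconds.
function timeMatch(regex, input) {
  const start = Date.now();
  const result = regex.test(input);
  return { result, ms: Date.now() - start };
}

for (const n of [10, 15, 20]) {
  // n valid letters followed by one character neither pattern accepts
  const input = "A".repeat(n) + "¼";
  const slow = timeMatch(overlapping, input);
  const fast = timeMatch(merged, input);
  console.log(n, "overlapping:", slow.ms + " ms,", "merged:", fast.ms + " ms");
}
```

In a backtracking engine, each letter can be consumed by either alternative of `overlapping`, so a failing input of n letters can require on the order of 2^n attempts before the match is rejected. The merged class has only one way to consume each character.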
Using [<CR><LF>] in the character class is the same as [><CFLR]: it matches those literal characters, not line breaks. If you want to match a carriage return or a line feed, you can use \r and \n inside the character class.
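A short check of that difference, as a sketch (the variable names `literal` and `lineBreaks` are only for illustration):

```javascript
// <CR><LF> inside brackets is just a set of literal characters: < C R > L F
const literal = /[<CR><LF>]/;
console.log(literal.test("C"));  // true: the letter C is in the set
console.log(literal.test("\r")); // false: a real carriage return is not

// \r and \n are the escapes for carriage return and line feed
const lineBreaks = /^[\r\n]+$/;
console.log(lineBreaks.test("\r\n")); // true
```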
As you use the alternation | for all single character matches (character classes without a quantifier), you can merge all character classes into a single character class and repeat that character class 1 or more times.
^[A-Za-z0-9ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ \\.^°!"²§³$%&/{} ()=?´`@€ *~'#<>|µ,;:_\r\n-]+$
A simplified example:
function handleChange() {
  const elm = document.getElementById("kpf-message-textarea");
  // A single character class repeated one or more times: no alternation to backtrack over.
  if (/^[A-Za-z0-9ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ \\.^°!"²§³$%&/{} ()=?´`@€ *~'#<>|µ,;:_\r\n-]+$/.test(elm.value)) {
    console.log("Match: " + elm.value);
  } else {
    console.log("No match: " + elm.value);
  }
}
<textarea id="kpf-message-textarea" class="message-area" name="message" maxlength="1000" aria-invalid="true" oninput="handleChange()">
</textarea>