Regex for hashtags tree takes too long to execute

Time:03-21

In our app, users can create topics, and each topic must have its own hashtag (or hashtag hierarchy). We validate it with this regex:

const REGEX_HASHTAG = /^(#[w]?((\/?)([a-z0-9]+)+)+)(,\s{0,1}#[a-z0-9]?((\/?)([a-z0-9]+)+)+)*$/g;

What I need is for users to be able to create hashtags with this structure:

  1. (#) symbol
  2. Text in lowercase
  3. Optional slash (/) followed by lowercase text to create hierarchy

Users can also add a comma (with optional whitespace) followed by another hashtag or hashtag hierarchy. When I enter many letters followed by a trailing slash, the regex takes too long to execute and appears to hang. What am I doing wrong?

regexr.com/6hpqo

CodePudding user response:

There are quite a few nested quantifiers and optional parts, which can cause catastrophic backtracking when there is no match.
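To make the effect concrete, here is a small sketch that times a nested quantifier against a flat one on an input that almost matches. The pattern is a simplified illustration, not the one from the question, and `timeTest` is a hypothetical helper; actual timings depend on the engine.

```javascript
// Simplified illustration (not the pattern from the question): a quantified
// group whose body is itself quantified, tested against a near-miss input.
const nested = /^(?:[a-z0-9]+)+$/;
const flat = /^[a-z0-9]+$/;

// 20 valid characters followed by one character that forces the match to fail.
const input = "a".repeat(20) + "/";

// Hypothetical helper: runs the regex once and reports the result and the
// elapsed wall-clock time in milliseconds.
function timeTest(regex, text) {
  const start = Date.now();
  const matched = regex.test(text);
  return { matched, ms: Date.now() - start };
}

const flatResult = timeTest(flat, input);
const nestedResult = timeTest(nested, input);

console.log("flat:  ", flatResult);
console.log("nested:", nestedResult);
```

Both patterns reject the input, but on a backtracking engine the nested version has to try many ways of splitting the run of letters between the inner and outer `+` before giving up, and that work tends to grow rapidly with each extra character.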

You could write the pattern as

^#[a-z0-9]+(?:\/[a-z0-9]+)*(?:,\s*#[a-z0-9]+(?:\/[a-z0-9]+)*)*$
  • ^ Start of string
  • #[a-z0-9]+ Match # and 1+ repetitions of the characters listed in the character class
  • (?:\/[a-z0-9]+)* Optionally repeat a / followed by 1+ characters from the same character class
  • (?: Non capture group
    • ,\s* Match a comma and optional whitespace chars
    • #[a-z0-9]+(?:\/[a-z0-9]+)* The same pattern as in the first part
  • )* Close the non capture group and optionally repeat it
  • $ End of string
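
As a rough usage sketch, the pattern above can be wrapped in a small validator. The names `HASHTAG_LIST` and `isValidHashtagList` are made up for this example, and the sample inputs are assumptions about what the app should accept or reject.

```javascript
// Pattern from the answer, with the `+` quantifiers written out.
// The `g` flag is omitted on purpose: with `g`, RegExp.prototype.test() keeps
// state in `lastIndex`, so repeated calls on the same regex object can give
// inconsistent results.
const HASHTAG_LIST = /^#[a-z0-9]+(?:\/[a-z0-9]+)*(?:,\s*#[a-z0-9]+(?:\/[a-z0-9]+)*)*$/;

// Hypothetical helper name, not from the original post.
function isValidHashtagList(input) {
  return HASHTAG_LIST.test(input);
}

console.log(isValidHashtagList("#news"));               // true
console.log(isValidHashtagList("#news/world, #sport")); // true
console.log(isValidHashtagList("#news/"));              // false (trailing slash)
console.log(isValidHashtagList("#News"));               // false (uppercase)

// The input shape that stalled the original pattern: many letters, then a slash.
console.log(isValidHashtagList("#" + "a".repeat(5000) + "/")); // false
```

Because every repeated group here starts with a mandatory separator (`/` or `,`), the engine has only one way to split the text between groups, so a failing input like the last one is rejected without the explosion in backtracking.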

