Matching comparison operators indefinitely in a string

Time:12-30

I'm trying to match the following pattern:

bla bla bla lorem ipsum bla bla [email protected] name!=Foo Bar

Here's my current approach:

(email|name)+\s*(>=|<=|>|<|=|!=)\s*([^ !=<>]+)

Since I'll always know the keys (email, name), the first part is easy, but I'm unable to match values with whitespace. What am I missing here?

A value should end where the next key-operator combination begins, or at the end of the string.
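To make the problem concrete, here is a small Python check. It assumes the value group is a negated character class of the form `([^ !=<>]+)`, and the sample text (including the address) is made up for illustration. Because the class excludes the space character, the value stops at the first space:

```python
import re

# Assumed form of the attempted pattern: the value group is a negated
# character class that excludes spaces, so it can never cross a space.
attempt = re.compile(r"(email|name)\s*(>=|<=|>|<|=|!=)\s*([^ !=<>]+)")

# Hypothetical sample input; the address is invented for this sketch.
text = "lorem ipsum email=john@example.com name!=Foo Bar"

matches = [m.groups() for m in attempt.finditer(text)]
print(matches)
# [('email', '=', 'john@example.com'), ('name', '!=', 'Foo')]
```

The second value comes back as `Foo` rather than `Foo Bar`, which is the behaviour described in the question.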

CodePudding user response:

You can use

(email|name)\s*([><]=?|!?=)\s*(.*?)(?=\s*(?:(?:email|name)\s*(?:[><]=?|!?=)|$))

Or, if the keys must be matched as whole words:

\b(email|name)\s*([><]=?|!?=)\s*(.*?)(?=\s*(?:\b(?:email|name)\s*(?:[><]=?|!?=)|$))
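As a rough illustration of the difference between the two variants, the Python sketch below runs both on a made-up input in which `name` also occurs inside the longer word `username`. The input string and variable names are assumptions for this example only:

```python
import re

# First variant: keys may match anywhere, even inside a longer word.
plain = re.compile(
    r"(email|name)\s*([><]=?|!?=)\s*(.*?)"
    r"(?=\s*(?:(?:email|name)\s*(?:[><]=?|!?=)|$))"
)

# Second variant: \b requires the key to start at a word boundary.
whole = re.compile(
    r"\b(email|name)\s*([><]=?|!?=)\s*(.*?)"
    r"(?=\s*(?:\b(?:email|name)\s*(?:[><]=?|!?=)|$))"
)

# Hypothetical input: "username" contains the key "name".
text = "username=admin name=Foo Bar"

plain_hits = [m.groups() for m in plain.finditer(text)]
whole_hits = [m.groups() for m in whole.finditer(text)]

print(plain_hits)  # [('name', '=', 'admin'), ('name', '=', 'Foo Bar')]
print(whole_hits)  # [('name', '=', 'Foo Bar')]
```

Without `\b`, the tail of `username` is picked up as a `name` key; with it, only the standalone key matches.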

Details:

  • \b(email|name) - a word boundary and Group 1 capturing either email or name
  • \s* - zero or more whitespaces
  • ([><]=?|!?=) - Group 2: < or > and then an optional =, or an optional ! and then a = char
  • \s* - zero or more whitespaces
  • (.*?) - Group 3: any zero or more chars other than line break chars, as few as possible
  • (?=\s*(?:\b(?:email|name)\s*(?:[><]=?|!?=)|$)) - a positive lookahead that requires (immediately to the right of the current position):
    • \s* - zero or more whitespaces
    • (?: - start of a non-capturing group:
      • \b(?:email|name)\s*(?:[><]=?|!?=) - either email or name as a whole word, then zero or more whitespaces and then a < or > and then an optional =, or an optional ! and then a = char
    • | - or
      • $ - end of string
    • ) - end of the group.
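The breakdown above can be tried out with a short Python sketch. The helper names `build_pattern` and `parse_filters` are hypothetical, and the sample address is made up, so treat this as one possible way to apply the whole-word pattern rather than a definitive implementation:

```python
import re

def build_pattern(keys):
    """Compile the whole-word pattern for a list of known keys (hypothetical helper)."""
    key_alt = "|".join(re.escape(k) for k in keys)
    op = r"[><]=?|!?="  # <, >, <=, >=, = or !=
    return re.compile(
        rf"\b({key_alt})\s*({op})\s*(.*?)"          # key, operator, lazy value
        rf"(?=\s*(?:\b(?:{key_alt})\s*(?:{op})|$))"  # stop before next key+operator or at end
    )

def parse_filters(text, keys=("email", "name")):
    """Return (key, operator, value) tuples found in text (hypothetical helper)."""
    return [m.groups() for m in build_pattern(keys).finditer(text)]

# Hypothetical input modelled on the question; the address is invented.
text = "bla bla lorem ipsum bla bla email=john@example.com name!=Foo Bar"
filters = parse_filters(text)
print(filters)
# [('email', '=', 'john@example.com'), ('name', '!=', 'Foo Bar')]
```

Because the value is matched lazily and the lookahead does the stopping, a value may contain spaces. One caveat of this approach: a value that itself contains a key followed by an operator (for example `name=a email=b` as literal text) would be split at that point.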