Match repeating groups with regex with optional parameter in .NET

Time:04-05

I need to validate a filter input with a regex that will be used in a [RegularExpression] attribute on a Filter field of a class used as an input model. The input has the following format:

[property]~[predicate]~[value]

For example:

lastname~eq~'John'

Multiple filters can also be chained any number of times:

[property]~[predicate]~[value]~[logicaloperator]~[property]~[predicate]~[value] ...

For example:

lastname~eq~'Doe'~and~firstname~eq~'John'~or~firstname~eq~'Jane'

I have to make sure that if logical operators are used then they are followed by the same pattern. I tried using named groups and lookbehinds but I couldn't get it to work properly.

I've created the following regex:

((((\w+)~(\blt\b|\blte\b|\beq\b|\bgt\b|\bgte\b|\bneq\b|\bcontains\b)~(.\w+.))(~(\bor\b|\band\b)~)?((\w+)~(\blt\b|\blte\b|\beq\b|\bgt\b|\bgte\b|\bneq\b|\bcontains\b)~(.\w+.))?)+)

I cannot get it to match only when the input is valid. The general pattern of the groups that I've tried to implement is:

(main group-
 (property group-any word)~(predicate group-list of operators)~(value -any value)
)
(~(logic operator)~)
(main group)

Targeted behavior:

Valid input:

lastname~eq~'Doe'                                                      -> should match
lastname~eq~'Doe'~and~firstname~eq~'John'                              -> should match
lastname~eq~'Doe'~and~firstname~eq~'John'~or~firstname~eq~'Jane'        -> should match

Invalid input:

lastname~eq~                                          ->should not match
lastname~eq~'Doe'~and~firstname~eq                    ->should not match
lastname~eq~'Doe'~and~firstname~eq~John~              ->should not match
lastname~eq~'Doe'~and~firstname~eq~John~or~           ->should not match

Any ideas how to make this work?

CodePudding user response:

You can use

^\w+~(?:lte?|n?eq|gte?|contains)~['"][^'"]+['"](?:~(?:and|or)~\w+~(?:lte?|n?eq|gte?|contains)~['"][^'"]+['"])*$

Or,

^(?:\w+~(?:lte?|n?eq|gte?|contains)~['"][^'"]+['"](?:~(?:and|or)~(?!$)|$))+$

See the regex demo.

Note that $ is preceded by \r? in the regex demo because the test string there is a multiline string with CRLF line endings, and the RegexOptions.Multiline option is enabled.
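A minimal C# sketch of why the demo needs \r? before $. It assumes a shortened stand-in pattern (only the eq predicate and single quotes) and a two-line sample string, both made up for illustration. With RegexOptions.Multiline, $ matches before \n or at the very end of the input, so the \r of a CRLF ending blocks the match unless the pattern allows for it:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CrlfDemo
{
    // Two filter lines separated by CRLF (illustrative sample text).
    public const string Text = "lastname~eq~'Doe'\r\nfirstname~eq~'John'";

    // Counts how many lines of Text match the pattern in Multiline mode.
    public static int CountLines(string pattern) =>
        Regex.Matches(Text, pattern, RegexOptions.Multiline).Count;

    public static void Main()
    {
        // Plain $: the \r left before \n stops the first line from matching.
        Console.WriteLine(CountLines(@"^\w+~eq~'[^']+'$"));     // prints 1
        // \r?$ consumes the optional carriage return first.
        Console.WriteLine(CountLines(@"^\w+~eq~'[^']+'\r?$"));  // prints 2
    }
}
```

When the pattern validates a single string without RegexOptions.Multiline, as in the attribute scenario from the question, the \r? should not be needed.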

The pattern matches

  • ^ - start of string
  • \w+ - one or more word chars
  • ~ - a ~ char
  • (?:lte?|n?eq|gte?|contains) - a predicate pattern (lt, lte, gt, gte, neq, eq, contains)
  • ~ - a ~ char
  • ['"][^'"]+['"] - a ' or ", then one or more chars other than ' and " and then a " or '
  • (?: - start of a non-capturing group
    • ~(?:and|or)~ - ~, and or or, and a ~ char
    • \w+~(?:lte?|n?eq|gte?|contains)~['"][^'"]+['"] - described above
  • )* - zero or more repetitions
  • $ - end of string.
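The pattern described above could be wired into the [RegularExpression] attribute roughly as follows. This is a sketch under assumptions: FilterRequest, FilterPattern and FilterDemo are hypothetical names, and the model is validated by hand with Validator.TryValidateObject rather than through a framework's model binding.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical input model; class and member names are illustrative.
public class FilterRequest
{
    // First pattern from the answer; "" is an escaped " in a verbatim string.
    public const string FilterPattern =
        @"^\w+~(?:lte?|n?eq|gte?|contains)~['""][^'""]+['""]" +
        @"(?:~(?:and|or)~\w+~(?:lte?|n?eq|gte?|contains)~['""][^'""]+['""])*$";

    [RegularExpression(FilterPattern)]
    public string Filter { get; set; } = "";
}

public static class FilterDemo
{
    // Runs DataAnnotations validation on a model holding the given filter.
    public static bool IsValid(string filter)
    {
        var model = new FilterRequest { Filter = filter };
        var context = new ValidationContext(model);
        var results = new List<ValidationResult>();
        return Validator.TryValidateObject(model, context, results, validateAllProperties: true);
    }

    public static void Main()
    {
        string[] samples =
        {
            "lastname~eq~'Doe'",
            "lastname~eq~'Doe'~and~firstname~eq~'John'~or~firstname~eq~'Jane'",
            "lastname~eq~",
            "lastname~eq~'Doe'~and~firstname~eq",
            "lastname~eq~'Doe'~and~firstname~eq~John~or~",
        };
        foreach (var s in samples)
            Console.WriteLine($"{IsValid(s)}  {s}");
    }
}
```

The first two samples pass validation and the last three fail, matching the targeted behavior in the question. Two details worth keeping in mind: RegularExpressionAttribute only accepts a value when the match spans the whole string, so the ^ and $ anchors are harmless but redundant here, and it treats a null or empty value as valid, so a [Required] attribute is needed if an empty filter should be rejected.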