I need a regex that finds substrings in a CQL string split by occurrences of AND and OR outside quot

Time:07-05

In this example source string:

index1 = "searchterm1" AND (index2 any "\"value2.1\" \"value2.2 AND sometext\" \"value2.3 OR sometext\"") OR index3 = "searchterm3"

The source needs to be split at the following bold text:

index1 = "searchterm1" AND (index2 any "\"value2.1\" \"value2.2 AND sometext\" \"value2.3 OR sometext\"") OR index3 = "searchterm3"

I expect this to be the result:

match 1 with group 1: index1 = "searchterm1"

match 2 with group 1: AND and group 2: (index2 any "\"value2.1\" \"value2.2 AND sometext\" \"value2.3

match 3 with group 1: OR and group 2: sometext\"") OR index3 = "searchterm3"

I tried, for example, this: \b(AND|OR)(?=([^\"]*\"[^\"]*\")*[^\"]*$) but the escaped quotes are giving me a hard time.

CodePudding user response:

You can use the following regex:

(AND|OR|^).*?(?:\1.*?)*(?=(AND|OR|$))

It will match:

  • (AND|OR|^): AND, OR, or the start of the string (captured as group 1)
  • .*?: as few characters as possible, followed by
  • (?:\1.*?)*: optionally, any number of repetitions of the text captured by group 1, each followed by as few characters as possible
  • (?=(AND|OR|$)): a lookahead requiring AND, OR, or the end of the string to come next (captured as group 2, not consumed)
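
To see how these pieces behave together, here is a minimal Python sketch that runs the pattern over the example string from the question. The names SOURCE, PATTERN and matches are only illustrative, and the use of Python's re module is an assumption, since the question does not name a regex engine.

```python
import re

# Example source string from the question; the backslashes are literal characters.
SOURCE = (
    r'index1 = "searchterm1" AND (index2 any "\"value2.1\" '
    r'\"value2.2 AND sometext\" \"value2.3 OR sometext\"") '
    r'OR index3 = "searchterm3"'
)

# The pattern from the answer, unchanged.
PATTERN = re.compile(r'(AND|OR|^).*?(?:\1.*?)*(?=(AND|OR|$))')

matches = list(PATTERN.finditer(SOURCE))
for number, match in enumerate(matches, start=1):
    operator = match.group(1)  # empty for the first chunk (start of string)
    rest = match.group(0)[len(operator):].strip()
    print(f"match {number}: operator={operator!r} rest={rest!r}")
```

Note that the pattern does not look at quotes at all: a chunk that starts with AND runs up to the next OR, and a chunk that starts with OR runs up to the next AND, wherever those keywords appear.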

Check the demo here.
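
If the goal is strictly the one in the title, splitting only at AND and OR outside quoted strings, a different technique may be easier to reason about: match each quoted string as a whole (including backslash-escaped quotes) and skip it, so only the bare keywords are left to split on. The following is a rough sketch of that idea in Python, not part of the regex above. The names TOKEN and split_top_level are hypothetical, and it assumes quotes are balanced and that a backslash escapes the character after it.

```python
import re

# Example source string from the question; the backslashes are literal characters.
SOURCE = (
    r'index1 = "searchterm1" AND (index2 any "\"value2.1\" '
    r'\"value2.2 AND sometext\" \"value2.3 OR sometext\"") '
    r'OR index3 = "searchterm3"'
)

# Either a whole double-quoted string (backslash escapes allowed) or a bare AND/OR.
# Only the keyword alternative is captured, so group 1 is None for quoted strings.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\b(AND|OR)\b')


def split_top_level(cql):
    """Split cql at AND/OR keywords that are not inside a quoted string."""
    parts = []
    start = 0
    for match in TOKEN.finditer(cql):
        if match.group(1) is None:
            continue  # a quoted string: skip it as a unit
        parts.append(cql[start:match.start()].strip())
        parts.append(match.group(1))
        start = match.end()
    parts.append(cql[start:].strip())
    return parts


parts = split_top_level(SOURCE)
for part in parts:
    print(part)
```

With this approach the AND and OR inside the quoted value stay part of the middle clause, so the example is cut into three clauses separated by the outer AND and OR.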
