I am trying to build a custom query-string validator (similar to JIRA's JQL) based on my business requirements. My regex works for the positive scenarios, but there are some corner cases where I am finding it hard to validate the query string. For example:
Product = 'Pepsi' AND Category = 'Beverages'
(Regex passes and the string is validated)
Product = "L'Oreal Professionnel Anti-hair Loss Regime" AND Category = 'Shampoo'
(Regex fails and reports an invalid string)
For reference, I have tested the second string here: https://regexr.com/74tn6
I have built the regex below to validate JQL-like strings.
/^\(*(NOT )*(\(*((([A-Za-z]\s*)|('[^']*'\s*)) ((?:^|\W)in\s*(\('[^']*'\s*([,]\s*'[^']*'\s*)*\))|(?:^|\W)not\s*in\s*([A-Za-z] \s*)|(?:^|\W)in\s*([A-Za-z] \s*)|(?:^|\W)= '[^']*'|(?:^|\W)!= '[^']*'|(?:^|\W)> '[^']*'|(?:^|\W)< '[^']*'|(?:^|\W)<= '[^']*'|(?:^|\W)>= '[^']*'|(?:^|\W)is '[^']*'|(?:^|\W)not is '[^']*')\)*((?:^|\W)AND\W|(?:^|\W)OR\W) (NOT\W)*)*\(*((([A-Za-z])|('[^']*')) ((?:^|\W)in (\('[^']*' *([,] *'[^']*')*\))|(?:^|\W)in ([A-Za-z] )|(?:^|\W)not in (\('[^']*'([,]'[^']*')*\))|(?:^|\W)not in ([A-Za-z] )|(?:^|\W)= '[^']*'|(?:^|\W)!= '[^']*'|(?:^|\W)> '[^']*'|(?:^|\W)< '[^']*'|(?:^|\W)<= '[^']*'|(?:^|\W)>= '[^']*'|(?:^|\W)is '[^']*'|(?:^|\W)not is '[^']*'))*\)*)* *\(*(((ORDER BY '[^']*')*)|((ORDER BY [A-Za-z]*)*) *(ASC|DESC)*)*\)*\)*$/gi
I know I have missed a few corner cases like the one above, where the Product is L'Oreal Professionnel Anti-hair Loss Regime.
This string contains the special character ' (an apostrophe). Can someone help me refine or fix the above regex so that it accepts these different string formats too? Thanks in advance.
CodePudding user response:
You can change the regex to:
^\(*(NOT )*(\(*((([A-Za-z]\s*)|(('[^']*'|"[^"]*")\s*)) ((?:^|\W)in\s*(\(('[^']*'|"[^"]*")\s*([,]\s*('[^']*'|"[^"]*")\s*)*\))|(?:^|\W)not\s*in\s*([A-Za-z] \s*)|(?:^|\W)in\s*([A-Za-z] \s*)|(?:^|\W)= ('[^']*'|"[^"]*")|(?:^|\W)!= ('[^']*'|"[^"]*")|(?:^|\W)> ('[^']*'|"[^"]*")|(?:^|\W)< ('[^']*'|"[^"]*")|(?:^|\W)<= ('[^']*'|"[^"]*")|(?:^|\W)>= ('[^']*'|"[^"]*")|(?:^|\W)is ('[^']*'|"[^"]*")|(?:^|\W)not is ('[^']*'|"[^"]*"))\)*((?:^|\W)AND\W|(?:^|\W)OR\W) (NOT\W)*)*\(*((([A-Za-z])|(('[^']*'|"[^"]*"))) ((?:^|\W)in (\(('[^']*'|"[^"]*") *([,] *('[^']*'|"[^"]*"))*\))|(?:^|\W)in ([A-Za-z] )|(?:^|\W)not in (\(('[^']*'|"[^"]*")([,]('[^']*'|"[^"]*"))*\))|(?:^|\W)not in ([A-Za-z] )|(?:^|\W)= ('[^']*'|"[^"]*")|(?:^|\W)!= ('[^']*'|"[^"]*")|(?:^|\W)> ('[^']*'|"[^"]*")|(?:^|\W)< ('[^']*'|"[^"]*")|(?:^|\W)<= ('[^']*'|"[^"]*")|(?:^|\W)>= ('[^']*'|"[^"]*")|(?:^|\W)is ('[^']*'|"[^"]*")|(?:^|\W)not is ('[^']*'|"[^"]*")))*\)*)* *\(*(((ORDER BY ('[^']*'|"[^"]*"))*)|((ORDER BY [A-Za-z]*)*) *(ASC|DESC)*)*\)*\)*$
The key change is that, instead of handling only single-quoted strings, it handles double-quoted strings too. I've replaced each occurrence of '[^']*' with ('[^']*'|"[^"]*")
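To see the effect of that substitution in isolation, here is a minimal sketch in JavaScript. It does not use the full regex above: `buildValidator`, `SINGLE_ONLY` and `EITHER_QUOTE` are hypothetical names, and the validator is a deliberately simplified stand-in that only accepts `field = value` clauses joined by AND/OR. It compares the single-quote-only value pattern with the one that accepts either quote style.

```javascript
// Simplified, hypothetical stand-in for the full validator: it only
// understands `field = value` clauses joined by AND / OR.

// Value pattern from the question: single-quoted strings only.
const SINGLE_ONLY = "'[^']*'";
// Value pattern from the answer: single- OR double-quoted strings.
const EITHER_QUOTE = "(?:'[^']*'|\"[^\"]*\")";

// Build a small anchored validator around the given value pattern.
function buildValidator(valuePattern) {
  const clause = `[A-Za-z]+\\s*=\\s*${valuePattern}`;
  // No `g` flag here, so repeated .test() calls do not carry lastIndex state.
  return new RegExp(`^${clause}(?:\\s+(?:AND|OR)\\s+${clause})*$`, "i");
}

const before = buildValidator(SINGLE_ONLY);
const after = buildValidator(EITHER_QUOTE);

const simple = "Product = 'Pepsi' AND Category = 'Beverages'";
const apostrophe =
  `Product = "L'Oreal Professionnel Anti-hair Loss Regime" AND Category = 'Shampoo'`;

console.log(before.test(simple));     // true
console.log(before.test(apostrophe)); // false: the value is double-quoted
console.log(after.test(simple));      // true
console.log(after.test(apostrophe));  // true
```

The apostrophe is accepted because, inside a double-quoted value, `[^"]*` is free to match a `'`. A value that contains both quote characters would still be rejected by this pattern. Also note that the sketch leaves out the `g` flag: with `g`, calling `.test()` repeatedly on the same regex object resumes from `lastIndex`, which can make a validator give alternating results.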