Apologies for not knowing exactly how to word this question. There is probably a better title, too. I'm open to suggestions.
I have the following subjects:
(Field1 = 'Value1') and (Field2 = 'Value2')
and
(Field1 = 'Value1') and (Field2 = 'Value2') or (Field3 = 'Value3')
I want to match in such a way that I have each thing between the () in groups and each conjunction in a group. So, for the second one, some variation of
0: Field1 = 'Value1'
1: and
2: Field2 = 'Value2'
3: or
4: Field3 = 'Value3'
The good news is, I've got regex that works on the first:
\(([A-Za-z0-9\s\'=]+)\) (and|or) \(([A-Za-z0-9\s\'=]+)\)
https://regex101.com/r/hMXAXS/1
But on the second subject it doesn't match the third group, the "or ()" part. I need to support an arbitrary number of groups. I can modify it to look only for "and ()", but then it doesn't match the first group.
How can I tell regex to do this? I either need to "double count" some groups (which is fine) or have some other way of optionally looking for additional patterns and matching them.
Thanks for the help!
PS: I was able to get my application to work with the regex ((and|or) \(([A-Za-z0-9\s\'=]+)\)) by accepting that the first group would never match and adding application logic to handle it. Still, I'd bet there's a better way.
CodePudding user response:
You may use preg_match_all here with the regex pattern (?<=\()(.*?)(?=\))|(?:and|or) as follows:
$input = "(Field1 = 'Value1') and (Field2 = 'Value2') or (Field3 = 'Value3')";
preg_match_all("/(?<=\()(.*?)(?=\))|(?:and|or)/", $input, $matches);
print_r($matches[0]);
This prints:
Array
(
[0] => Field1 = 'Value1'
[1] => and
[2] => Field2 = 'Value2'
[3] => or
[4] => Field3 = 'Value3'
)
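If you want the conditions and the conjunctions in separate lists, as the question describes, here is a minimal sketch built on the same pattern. The function name splitConditions and the shape of the returned array are my own assumptions for illustration, and it relies on the tokens strictly alternating condition, conjunction, condition.

```php
<?php
// Hypothetical helper: the name and the returned array shape are assumptions.
function splitConditions(string $input): array
{
    // Same pattern as above: text between parentheses, or a bare conjunction.
    preg_match_all("/(?<=\()(.*?)(?=\))|(?:and|or)/", $input, $matches);

    $conditions = [];
    $conjunctions = [];
    foreach ($matches[0] as $i => $token) {
        // Assumes even positions hold conditions and odd positions hold conjunctions.
        if ($i % 2 === 0) {
            $conditions[] = $token;
        } else {
            $conjunctions[] = $token;
        }
    }

    return ['conditions' => $conditions, 'conjunctions' => $conjunctions];
}

$result = splitConditions("(Field1 = 'Value1') and (Field2 = 'Value2') or (Field3 = 'Value3')");
print_r($result);
```

The alternation assumption only holds for well-formed input, so this is best used on strings you already trust.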
CodePudding user response:
If you are not worried about fringe cases where delimiting words or parentheses can exist within the parenthetical expressions, then preg_split() generates the desired flat array.
Code: (Demo)
$input = "(Field1 = 'Val and ue1') and (Field2 = 'Valu or e2') or (Field3 = 'Value3')";
var_export(
    preg_split(
        "~^\(|\)$|\) (and|or) \(~",
        $input,
        0,
        PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE
    )
);
Output:
array (
0 => 'Field1 = \'Val and ue1\'',
1 => 'and',
2 => 'Field2 = \'Valu or e2\'',
3 => 'or',
4 => 'Field3 = \'Value3\'',
)
Or simplify the pattern by pre-trimming the outermost parentheses. (Demo)
var_export(preg_split("~\) (and|or) \(~", trim($input, '()'), 0, PREG_SPLIT_DELIM_CAPTURE));
You can also use the \G metacharacter to continue matching from the end of the previous match: (Demo) This takes 88 steps to parse the string, versus 280 steps for Tim's pattern.
$input = "(Field1 = 'Val and ue1') and (Field2 = 'Valu or e2') or (Field3 = 'Value3')";
preg_match_all('~(?:^\(|\G(?!^)(?:\) | \())\K(?:(?:and|or)|[^)]+)~', $input, $m);
print_r($m[0]);
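Since none of these patterns reject malformed input on their own, one option is to check the whole string against an anchored pattern before tokenizing. Below is a rough sketch of that idea using the \G approach. The function name tokenizeConditions and the validation pattern are my own assumptions about the accepted grammar: conditions contain no ")" and are joined by a single space, "and" or "or", and another single space.

```php
<?php
// Hypothetical helper: the name and the validation pattern are assumptions.
function tokenizeConditions(string $input): ?array
{
    // Whole-string check: "(...)" followed by any number of " and (...)" or " or (...)".
    if (!preg_match('~^\([^)]+\)(?: (?:and|or) \([^)]+\))*$~', $input)) {
        return null; // input does not follow the expected shape
    }

    // Tokenize with the \G-based approach: each match continues where the last one ended.
    preg_match_all('~(?:^\(|\G(?!^)(?:\) | \())\K(?:(?:and|or)|[^)]+)~', $input, $m);

    return $m[0];
}

var_export(tokenizeConditions("(Field1 = 'Val and ue1') and (Field2 = 'Valu or e2') or (Field3 = 'Value3')"));
echo "\n";
var_export(tokenizeConditions("(Field1 = 'Value1') xor (Field2 = 'Value2')")); // NULL
```

Returning null for anything that fails the check keeps the caller from silently working with a partial token list.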