I want to split a string on commas, but not on commas inside parentheses, using preg_split. I came up with
preg_split('#,(?![^\(]*[\)])#',$str);
which works perfectly unless there is a comma before a nested parenthesis.
Works for
$str = "first (1,2),second (child (nested), child2), third";
Array
(
[0] => first (1,2)
[1] => second (child (nested), child2)
[2] => third
)
but not for
$str = "first (1,2),second (child, (nested), child2), third";
Array
(
[0] => first (1,2)
[1] => second (child
[2] => (nested), child2)
[3] => third
)
CodePudding user response:
Since commas inside the brackets must be ignored, this problem boils down to tracking whether the parentheses are balanced. If a comma sits inside a parenthesis that has not been closed yet, we ignore it; otherwise that comma is a delimiter for the split.
To collect the strings between these delimiters, we maintain a start pointer $sub_start that tracks the start index of the current substring, and we update it each time we come across a valid delimiter.
Snippet:
<?php
function splitCommaBased($str){
    $open_brac = 0; // number of parentheses currently open
    $len = strlen($str);
    $res = [];
    $sub_start = 0; // start index of the current piece
    for($i = 0; $i < $len; $i++){
        if($str[ $i ] == ',' && $open_brac == 0){
            // comma outside all parentheses: close the current piece
            $res[] = substr($str, $sub_start, $i - $sub_start);
            $sub_start = $i + 1;
        }else if($str[ $i ] == '('){
            $open_brac++;
        }else if($str[ $i ] == ')'){
            $open_brac--;
        }
    }
    // everything after the last delimiter is the final piece,
    // even when the string ends with ')' or ','
    $res[] = substr($str, $sub_start);
    return $res;
}
print_r(splitCommaBased('first (1,2),second (child, (nested), child2), third'));
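The counter approach assumes well-formed input. A stray ) with no matching ( would push the depth below zero, and no later comma would be treated as a delimiter. The following is only a sketch of one way to guard against that: the function name splitCommaBasedClamped and the choice to ignore an unmatched ) are assumptions, not part of the answer above.

```php
<?php
// Hypothetical variant: same loop, but the depth never drops below zero.
function splitCommaBasedClamped(string $str): array {
    $depth = 0;       // number of parentheses currently open
    $res = [];
    $sub_start = 0;   // start index of the current piece
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $ch = $str[$i];
        if ($ch === ',' && $depth === 0) {
            $res[] = substr($str, $sub_start, $i - $sub_start);
            $sub_start = $i + 1;
        } elseif ($ch === '(') {
            $depth++;
        } elseif ($ch === ')') {
            // Assumption: an unmatched ')' is ignored rather than counted.
            $depth = max(0, $depth - 1);
        }
    }
    $res[] = substr($str, $sub_start);
    return $res;
}

print_r(splitCommaBasedClamped('a),b (c,d),e'));
```

Note that the pieces keep their surrounding whitespace (for example " third"); apply trim to each piece if that matters.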
CodePudding user response:
You can use a recursive pattern to match each balanced pair of parentheses, then use (*SKIP)(*F) to discard those matches so that only the commas outside parentheses are matched as split points.
(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,
Example
$str = "first (1,2),second (child, (nested), child2), third";
$pattern = "/(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,/";
print_r(preg_split($pattern, $str));
Output
Array
(
[0] => first (1,2)
[1] => second (child, (nested), child2)
[2] => third
)
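To make the pattern easier to read, here is a sketch of the same idea written with the x (extended) flag so each part can carry a comment. The wrapper function splitOutsideParens and the trimming of each piece are assumptions added for illustration; they are not part of the answer.

```php
<?php
// Hypothetical helper wrapping the recursive SKIP/FAIL pattern.
function splitOutsideParens(string $str): array {
    $pattern = '/
        (                 # group 1: one balanced (...) block
            \(
            (?: [^()]++   #   a run of non-parenthesis characters
              | (?1)      #   or a nested block, by recursing into group 1
            )*
            \)
        )
        (*SKIP)(*F)       # throw the block away and continue after it
        |
        ,                 # a comma reached here is outside parentheses
    /x';
    $parts = preg_split($pattern, $str);
    if ($parts === false) {
        return [];        // preg_split reported an error
    }
    // Trimming is an optional extra, not something the pattern does.
    return array_map('trim', $parts);
}

print_r(splitOutsideParens('first (1,2),second (child, (nested), child2), third'));
```

Both strings from the question should split into three pieces with this pattern, because a nested group is consumed as part of its enclosing block before any comma inside it is considered.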