I'm having trouble finding the right regex to split a string on spaces, but only where the following word is longer than 2 characters.
$str = 'test 1 vitamin d3 test 2';
$str_parts = preg_split('#(?:\s+|(vitamin [cde]))#', $str, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
var_export($str_parts);
An ideal output would be:
['test 1', 'vitamin d3', 'test 2']
CodePudding user response:
You can use this regex with preg_split:
\h+(?=\w{3})
It matches one or more horizontal whitespace characters, but only when they are followed by a word of at least 3 characters. The check is done with a positive lookahead, so the word itself is not consumed.
Code:
$str = 'test 1 vitamin d3 test 2';
$str_parts = preg_split('/\h+(?=\w{3})/', $str);
print_r($str_parts);
Output:
Array
(
    [0] => test 1
    [1] => vitamin d3
    [2] => test 2
)
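As a minimal sketch, the same pattern can be wrapped in a small helper so the minimum word length is a parameter. The function name split_before_long_words and the $minLen argument are hypothetical and are not part of the answer above.

```php
<?php
// Hypothetical helper (not from the answer above): split on horizontal
// whitespace only when the next word has at least $minLen word characters.
function split_before_long_words(string $str, int $minLen = 3): array
{
    // \h+ matches one or more horizontal whitespace characters;
    // (?=\w{N}) asserts that N word characters follow, without consuming them.
    $pattern = '/\h+(?=\w{' . $minLen . '})/';

    return preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY);
}

var_export(split_before_long_words('test 1 vitamin d3 test 2'));
```

Note that \w also matches digits and underscores, so a token such as 2024 counts as a long word: 'omega 3 2024 batch' would be split into 'omega 3', '2024' and 'batch'.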
CodePudding user response:
You can reverse the logic: instead of splitting, extract each sequence made of a word with three or more characters followed by any number of one- or two-character words:
preg_match_all('~\b\w{3,}(?:\W+\w{1,2})*\b~', $text, $matches);
Details:
\b - a word boundary
\w{3,} - three or more word chars
(?:\W+\w{1,2})* - zero or more occurrences of:
    \W+ - one or more non-word chars
    \w{1,2} - one to two word chars
\b - a word boundary
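For completeness, here is a runnable sketch of the extraction approach described above, assuming $text holds the sample string from the question. The full matches end up in $matches[0].

```php
<?php
// Assumption: $text is the sample string from the question.
$text = 'test 1 vitamin d3 test 2';

// Match a word of 3+ word chars, then any run of 1-2 char words attached to it.
preg_match_all('~\b\w{3,}(?:\W+\w{1,2})*\b~', $text, $matches);

// $matches[0] holds the full matches:
// 'test 1', 'vitamin d3', 'test 2'
var_export($matches[0]);
```

One difference from the split approach: short words that come before the first long word are not matched at all. For 'a test 1' this pattern returns only 'test 1', whereas splitting on \h+(?=\w{3}) keeps 'a' as its own piece.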