I have a PHP function that selects the text in a string between two different character sequences.
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start); // move past the start marker
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = ',""Word1, Word2""';
$parsed = get_string_between($fullstring, ',""', '""');
echo $parsed; //Result = Word1, Word2
However, I would like to extend this to select all matches when there are multiple occurrences within the string. This is likely, since the string will be generated from a CSV file with hundreds of lines and hundreds of matches. For example:
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
And within each substring I will need to remove certain characters. In this example, I need to remove commas.
The intended output would be:
//Result2 = ',""Word1 Word2"" and another thing ,""Word3 Word4""'
Can anybody suggest the most straightforward way of achieving this? Thanks.
CodePudding user response:
So it was a good thing I asked for the output, because initially I had something else. Many people would use regular expressions here, but I often find those difficult to work with, so I took a more basic approach:
function extractWantedStuff($input)
{
$output = [];
$sections = explode('""', $input);
$changeThisSection = false;
foreach ($sections as $section) {
if ($changeThisSection) {
$section = str_replace(',', '', $section);
}
$output[] = $section;
$changeThisSection = !$changeThisSection;
}
return implode('""', $output);
}
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo extractWantedStuff($fullstring);
The output would be:
,""Word1 Word2"" and another thing ,""Word3 Word4""
Slightly more optimized, by removing the $changeThisSection boolean:
function extractWantedStuff($input)
{
$output = [];
$sections = explode('""', $input);
foreach ($sections as $key => $section) {
if ($key % 2 != 0) { // is $key odd?
$section = str_replace(',', '', $section);
}
$output[] = $section;
}
return implode('""', $output);
}
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo extractWantedStuff($fullstring);
And further optimized, by removing the $output array:
function extractWantedStuff($string)
{
$sections = explode('""', $string);
foreach ($sections as $key => $section) {
if ($key % 2 != 0) {
$sections[$key] = str_replace(',', '', $section);
}
}
return implode('""', $sections);
}
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo extractWantedStuff($fullstring);
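The same idea can be generalized a little. The following is only a sketch: the function name stripInsideDelimiters and its $delimiter and $remove parameters are hypothetical and not part of the answer above. It assumes the delimiter is non-empty and that delimiters come in balanced pairs, so every odd-numbered section lies between an opening and a closing delimiter.

```php
<?php
// Hypothetical helper: name, parameters and defaults are illustrative only.
function stripInsideDelimiters(string $input, string $delimiter = '""', array $remove = [',']): string
{
    // Assumes $delimiter is non-empty; explode() rejects an empty delimiter.
    $sections = explode($delimiter, $input);
    foreach ($sections as $key => $section) {
        // Odd-indexed sections sit between an opening and a closing delimiter.
        if ($key % 2 === 1) {
            $sections[$key] = str_replace($remove, '', $section);
        }
    }
    return implode($delimiter, $sections);
}

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo stripInsideDelimiters($fullstring), PHP_EOL;
```

For the sample string this prints the same result as the versions above. If the input can contain an unmatched delimiter, the odd/even assumption no longer holds and the input would need to be validated first.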
CodePudding user response:
You can perform a regex match that captures all characters between the start and end substrings in a non-greedy manner, and use preg_match_all to capture all of those in-between strings, like below:
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = ',""';
$end = '""';
preg_match_all('/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/', $fullstring, $matches);
print_r($matches[1]);
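The snippet above only extracts the in-between strings. To also produce the output the question asks for (commas removed inside each match, everything else left untouched), one possible approach is preg_replace_callback. This is a sketch rather than part of the original answer: the three capture groups and the s modifier (which lets . match newlines) are choices made here for illustration.

```php
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
$start = ',""';
$end = '""';

// Group 1 = start marker, group 2 = text in between (lazy), group 3 = end marker.
$pattern = '/(' . preg_quote($start, '/') . ')(.*?)(' . preg_quote($end, '/') . ')/s';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // Remove commas only from the in-between text and keep both markers.
        return $m[1] . str_replace(',', '', $m[2]) . $m[3];
    },
    $fullstring
);

echo $result, PHP_EOL;
```

Text outside the start and end markers is never passed to the callback, so the comma that belongs to each start marker is preserved.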
Update:
If you instead want a single match that spans from the first start substring to the last end substring, you can do a greedy match by removing the ? and using preg_match, like below:
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = ',""';
$end = '""';
preg_match('/' . preg_quote($start, '/') . '(.*)' . preg_quote($end, '/') . '/', $fullstring, $matches);
print_r($matches[0] ?? []);
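To make the difference between the two quantifiers concrete, here is a small side-by-side sketch on the same sample string. The variable names $lazy and $greedy are introduced here for illustration only.

```php
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = preg_quote(',""', '/');
$end = preg_quote('""', '/');

// Lazy: (.*?) stops at the nearest end marker, so each pair is captured separately.
preg_match_all('/' . $start . '(.*?)' . $end . '/', $fullstring, $lazy);
print_r($lazy[1]);

// Greedy: (.*) runs on to the last end marker, so one long capture comes back.
preg_match('/' . $start . '(.*)' . $end . '/', $fullstring, $greedy);
echo $greedy[1], PHP_EOL;
```

The lazy pattern yields three separate captures, while the greedy pattern yields one capture that still contains the inner markers.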
CodePudding user response:
You can simply use a string replace function if you have specific strings to remove. I just passed those in an array and replaced each one with an empty string.
function removeExtraCharacters($fullstring, $characters = array()){
foreach($characters as $char){
$fullstring = str_replace($char,"", $fullstring);
}
return $fullstring;
}
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo removeExtraCharacters($fullstring, array(',"', '"'));
//output: Word1, Word2 and another thing Word3, Word4
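As a side note, str_replace accepts an array of search strings directly and applies them in order, so the loop is optional. A minimal sketch of that variant, where the variable name $cleaned is introduced here for illustration:

```php
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

// str_replace() takes an array of search strings and applies them one after another.
$cleaned = str_replace([',"', '"'], '', $fullstring);

echo $cleaned, PHP_EOL;
```

This gives the same output as the loop-based function above for the sample string.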