Best way to replace/remove a specific character when it appears between 2 character sequences

Time:05-11

I have a PHP function that selects the text between two different character sequences in a string.

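function get_string_between($string, $start, $end){
    // Prepend a space so that a match at position 0 is not mistaken for "not found".
    $string = ' ' . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return '';
    // Move past the start marker to the first character of the wanted text.
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}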

$fullstring = ',""Word1, Word2""';
$parsed = get_string_between($fullstring, ',""', '""');

echo $parsed; //Result = Word1, Word2

However, I would like to extend this to select all matches when there are multiple occurrences within the string. This is likely, since the string will be generated from a CSV file with hundreds of lines and hundreds of matches. For example:

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

And within each substring I will need to remove certain characters. In this example, I need to remove commas.

The intended output would be:

//Result2 = ',""Word1 Word2"" and another thing ,""Word3 Word4""'

Can anybody suggest the most straightforward way of achieving this? Thanks.

CodePudding user response:

So it was a good thing I asked for the output, because initially I had something else. Many people would use regular expressions here, but I often find those difficult to work with, so I took a more basic approach:

function extractWantedStuff($input)
{
    $output = [];
    $sections = explode('""', $input);
    $changeThisSection = false;
    foreach ($sections as $section) {
        if ($changeThisSection) {
            $section = str_replace(',', '', $section);
        }
        $output[] = $section;
        $changeThisSection = !$changeThisSection;
    }
    return implode('""', $output);
}

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

echo extractWantedStuff($fullstring);

The output would be:

,""Word1 Word2"" and another thing ,""Word3 Word4""


Slightly more optimized, by removing the $changeThisSection boolean:

function extractWantedStuff($input)
{
    $output = [];
    $sections = explode('""', $input);
    foreach ($sections as $key => $section) {
        if ($key % 2 != 0) { // is $key odd?
            $section = str_replace(',', '', $section);
        }
        $output[] = $section;
    }
    return implode('""', $output);
}

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

echo extractWantedStuff($fullstring);


And further optimized, by removing the $output array:

function extractWantedStuff($string)
{
    $sections = explode('""', $string);
    foreach ($sections as $key => $section) {
        if ($key % 2 != 0) {
            $sections[$key] = str_replace(',', '', $section);
        }
    }
    return implode('""', $sections);
}

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

echo extractWantedStuff($fullstring);

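If the delimiter or the characters to strip vary, the same explode/implode idea can take them as parameters. This is only a sketch: the function name removeBetweenDelimiters and its parameters are made up for illustration, it assumes PHP 7 or later for the type declarations, and it assumes the delimiter is non-empty and always appears in opening/closing pairs.

```php
<?php
// Hypothetical helper; the name and parameters are illustrative only.
function removeBetweenDelimiters(string $input, string $delimiter, array $unwanted): string
{
    // Assumes $delimiter is non-empty and occurs in open/close pairs.
    $sections = explode($delimiter, $input);
    foreach ($sections as $key => $section) {
        // Odd-numbered sections sit between an opening and a closing delimiter.
        if ($key % 2 === 1) {
            $sections[$key] = str_replace($unwanted, '', $section);
        }
    }
    return implode($delimiter, $sections);
}

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo removeBetweenDelimiters($fullstring, '""', [',']);
// prints: ,""Word1 Word2"" and another thing ,""Word3 Word4""
```

Passing an array as the third argument lets you strip several characters in one call, for example [',', ';'].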

CodePudding user response:

You can use a regex that matches, in a non-greedy manner, all characters between the start and end substrings, and use preg_match_all to capture every one of those in-between strings:

<?php

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = ',""';
$end = '""';
preg_match_all('/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/', $fullstring, $matches);
print_r($matches[1]);


Update:

If you want a single match that runs from the first start marker to the last end marker, make the quantifier greedy by removing the ? and use preg_match:

<?php

$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = ',""';
$end = '""';
preg_match('/' . preg_quote($start, '/') . '(.*)' . preg_quote($end, '/') . '/', $fullstring, $matches);
print_r($matches[0] ?? []);

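The snippets above only extract the in-between strings. To get the output the question asks for (commas removed inside each pair, the rest left alone), one option is preg_replace_callback with the same non-greedy pattern. This is a sketch rather than part of the original answer; the extra capture groups around the markers and the /s modifier (so the text between markers may span lines) are my own assumptions.

```php
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4"",""Word5, Word6""';
$start = ',""';
$end = '""';

// Capture the start marker, the text in between (non-greedy), and the end marker.
$pattern = '/(' . preg_quote($start, '/') . ')(.*?)(' . preg_quote($end, '/') . ')/s';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // $m[1] = start marker, $m[2] = inner text, $m[3] = end marker
        return $m[1] . str_replace(',', '', $m[2]) . $m[3];
    },
    $fullstring
);

echo $result;
// prints: ,""Word1 Word2"" and another thing ,""Word3 Word4"",""Word5 Word6""
```

Only the inner text goes through str_replace, so commas outside the markers, and the comma that is part of the start marker itself, are kept.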

CodePudding user response:

You can simply use str_replace if you have specific strings to remove. I pass them in an array and replace each one with an empty string.

function removeExtraCharacters($fullstring, $characters = array()){
    foreach($characters as $char){
        $fullstring = str_replace($char,"", $fullstring);
    }
    return $fullstring;
}
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';
echo removeExtraCharacters($fullstring, array(',"', '"'));
//output: Word1, Word2 and another thing Word3, Word4
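As a side note, str_replace accepts an array of search strings directly and applies them in order, so the loop is optional. A minimal sketch of the same call; the variable name $cleaned is just for illustration:

```php
<?php
$fullstring = ',""Word1, Word2"" and another thing ,""Word3, Word4""';

// Searches are applied in array order: first ',"' and then '"'.
$cleaned = str_replace([',"', '"'], '', $fullstring);

echo $cleaned;
// prints: Word1, Word2 and another thing Word3, Word4
```

Keep in mind that this strips the quote markers and keeps the commas between the words, so it does not produce the output shown in the question.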