I have a csv file that looks like this:

13:BOOT2700        X;27
13:BOOT2700        X;27
13:BOOT2700        X;27
13:BOOT2700        X;27
13:BOXER1136       X;11.36
13:BOXER1364       X;13.64
13:BOXER1591       X;15.91
13:BOXER909        X;9.09
...

I would like to remove the duplicate values of $data[0], and also strip the spaces and the trailing "X" from that string. The second part works correctly, but I can't get rid of the duplicates: I tried the code below, but they remain, and the same value is printed every time it occurs even though the lines are identical.

In the end I would like this:

13:BOOT2700;27
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09

How can I do it? Thanks for your help

<?php

$file = "BI6_20211214_0905_15000.txt";

if (($handle = fopen($file, "r")) !== false) {

    while (($data = fgetcsv($handle, 9000000, ";")) !== false) {

        $uniqueStr = implode('X', array_unique(explode('X', $data[0]))); //doesn't work

        $clean_name = str_replace(' ', '', $data[0]);
        $clean_name2 = str_replace('X', '', $clean_name);

        echo $clean_name2; //13:BOOT2700
        echo ("<br>");

    }
    fclose($handle);
}

echo "good !";

?>

User response:

Here is the complete code, simplified and commented so that the OP and others can follow how the files are processed.

I have 2 files:

input.txt

13:BOOT2700        X;27
13:BOOT2700        X;28
13:BOOT2700        X;29
13:BOOT2700        X;29
13:BOXER1136       X;11.36
13:BOXER1364       X;13.64
13:BOXER1591       X;15.91
13:BOXER909        X;9.09

When you run the code below, its result will be

===> Processing input.txt
Result:
13:BOOT2700;27
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09

input2.txt

13:BOOT111         X;27
13:BOOT2700        X;29
13:BOOT2700        X;29
13:BOXER1136       X;11.36
13:BOXER1364       X;13.64
13:BOXER1591       X;15.91
13:BOXER909        X;9.09

Its output will be

===> Processing input2.txt
Result:
13:BOOT111;27
13:BOOT2700;29
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09

Code

<?php

# Remove byte order mark (BOM)
function remove_utf8_bom($text) {
    $bom = pack('H*','EFBBBF');
    $text = preg_replace("/^$bom/", '', $text);
    return $text;
}

# get list of all files
$dir   = 'your-path/';
$allFiles = scandir($dir);

# process each file
foreach($allFiles as $file) {

    if (in_array($file, array(".",".."))) {
        continue;
    }

    echo "===> Processing $file\n";
    $file = $dir.$file;
    $filename = basename( $file );

    # stores unique items like 13:BOOT2700, 13:BOXER1136 etc.
    $processedItems = array();

    # stores lines in the format we need
    $finalResult = array();

    $handle = fopen($file, 'r');
    if ($handle === false) {
        echo "Problem opening $file. Skipping.\n";
        continue;
    }

    # read each line; fgets() returns false at the end of the file
    while (($line = fgets($handle)) !== false) {
        $line = remove_utf8_bom($line);

        # skip empty lines
        if (strlen(trim($line)) === 0) {
            continue;
        }

        # split by X;, trim the first part
        $lineSplit = explode('X;', $line, 2);
        $lineSplit[0] = trim($lineSplit[0]);

        # skip malformed lines that do not contain the X; separator
        if (count($lineSplit) < 2) {
            continue;
        }

        # check if the first part (such as 13:BOOT2700) is processed already
        # if so, don't do anything else
        if (in_array($lineSplit[0], $processedItems, true)) {
            continue;
        }

        # store the first part in processed items and create the newly
        # formatted line; store that in final result
        $processedItems[] = $lineSplit[0];
        $finalResult[] = $lineSplit[0] . ';' . $lineSplit[1];
    }
    fclose($handle);

    # show the final result, one entry per line
    echo "Result:\n";
    foreach ($finalResult as $x) {
        echo rtrim($x), "\n";
    }
}
 
echo "Done";

?>
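
As a complement to the loop above, here is a minimal sketch of the same "first occurrence wins" deduplication that uses array keys instead of in_array, so the list of processed items is not rescanned for every line. The function name dedupe_by_key and the inline sample data are assumptions made for illustration; they are not part of the code above.

```php
<?php

# Hypothetical helper: keeps the first line seen for each key, where the
# key is the text before "X;" with the padding spaces removed.
function dedupe_by_key(array $lines): array {
    $result = array();

    foreach ($lines as $line) {
        # split into key and value at the first "X;"
        $parts = explode('X;', $line, 2);
        if (count($parts) < 2) {
            continue; # skip lines without the separator
        }

        $key = trim($parts[0]);

        # isset() on an array key is a hash lookup, unlike in_array()
        if (!isset($result[$key])) {
            $result[$key] = $key . ';' . trim($parts[1]);
        }
    }

    # drop the string keys and return a plain list
    return array_values($result);
}

# sample data mirroring the first lines of input.txt
$lines = array(
    '13:BOOT2700        X;27',
    '13:BOOT2700        X;28',
    '13:BOXER1136       X;11.36',
);

echo implode("\n", dedupe_by_key($lines)), "\n";
```

To use it on a real file, the lines could be loaded with file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) and passed in as they are.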

User response:

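The file is read into an array with file(). array_map and preg_replace then remove the padding spaces and the X from each line, and array_unique removes the duplicate entries. Because array_unique compares whole lines, this relies on duplicate keys carrying the same value, as they do in the question's data.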

$array = file('input.txt',FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$array = array_map(function($v){return preg_replace('/\s+X;/',';',$v);}, $array);
$array = array_unique($array);

The result is an array. array_unique keeps the key of each first occurrence, so the keys are not consecutive.

array (
  0 => "13:BOOT2700;27",
  4 => "13:BOXER1136;11.36",
  5 => "13:BOXER1364;13.64",
  6 => "13:BOXER1591;15.91",
  7 => "13:BOXER909;9.09",
)
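
If consecutive keys are preferred, array_values can renumber the result. The following is a small sketch of that behaviour; the sample data is shortened and hard-coded for illustration rather than read from the file.

```php
<?php

# shortened sample data, already cleaned of spaces and the X
$array = array(
    "13:BOOT2700;27",
    "13:BOOT2700;27",
    "13:BOXER1136;11.36",
);

# array_unique keeps the first occurrence together with its original key,
# so the remaining keys here are 0 and 2
$unique = array_unique($array);

# array_values renumbers the keys starting from 0
$reindexed = array_values($unique);

var_export($reindexed);
```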

If a file is required as a result, the array can be converted into a string with implode and written to a file with file_put_contents.

$str = implode("\r\n",$array);
file_put_contents('input.csv', $str);
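
file_put_contents returns false when the file cannot be written, so it can be worth checking the return value. Here is a minimal sketch under a few assumptions: the helper name write_lines is hypothetical, the sample lines are hard-coded, and a temporary file stands in for the real output path.

```php
<?php

# Hypothetical helper: joins the lines with CRLF and writes them to $path.
# Returns true on success, false if the file could not be written.
function write_lines(string $path, array $lines): bool {
    # terminate the last line as well
    $str = implode("\r\n", $lines) . "\r\n";
    return file_put_contents($path, $str) !== false;
}

$array = array("13:BOOT2700;27", "13:BOXER1136;11.36");

# temporary file used only for this demonstration
$path = tempnam(sys_get_temp_dir(), 'csv');

if (write_lines($path, $array)) {
    echo "Wrote ", filesize($path), " bytes to $path\n";
} else {
    echo "Could not write $path\n";
}

# remove the demonstration file again
unlink($path);
```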