I have a CSV file that looks like this:
13:BOOT2700 X;27
13:BOOT2700 X;27
13:BOOT2700 X;27
13:BOOT2700 X;27
13:BOXER1136 X;11.36
13:BOXER1364 X;13.64
13:BOXER1591 X;15.91
13:BOXER909 X;9.09
...
I would like to remove the rows whose first field ($data[0]) is a duplicate, and also strip the space and the "X" at the end of that field. The stripping works, but I can't get rid of the duplicates: with the code below every row is still printed, even when the first field is identical to the previous one.
In the end I would like this:
13:BOOT2700;27
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09
How can I do this? Thanks for your help.
<?php
$file = "BI6_20211214_0905_15000.txt";
if (($handle = fopen($file, "r")) !== false) {
    while (($data = fgetcsv($handle, 9000000, ";")) !== false) {
        $uniqueStr = implode('X', array_unique(explode('X', $data[0]))); //doesn't work
        $clean_name = str_replace(' ', '', $data[0]);
        $clean_name2 = str_replace('X', '', $clean_name);
        echo $clean_name2; //13:BOOT2700
        echo ("<br>");
    }
}
fclose($handle);
echo "good !";
?>
CodePudding user response:
Here is the complete code, simplified and commented, so that OP and others can follow how each line is processed.
I have 2 files:
input.txt
13:BOOT2700 X;27
13:BOOT2700 X;28
13:BOOT2700 X;29
13:BOOT2700 X;29
13:BOXER1136 X;11.36
13:BOXER1364 X;13.64
13:BOXER1591 X;15.91
13:BOXER909 X;9.09
Running the code below on it prints:
===> Processing input.txt
Result:
13:BOOT2700;27
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09
input2.txt
13:BOOT111 X;27
13:BOOT2700 X;29
13:BOOT2700 X;29
13:BOXER1136 X;11.36
13:BOXER1364 X;13.64
13:BOXER1591 X;15.91
13:BOXER909 X;9.09
Its output will be
===> Processing input2.txt
Result:
13:BOOT111;27
13:BOOT2700;29
13:BOXER1136;11.36
13:BOXER1364;13.64
13:BOXER1591;15.91
13:BOXER909;9.09
Code
<?php
# Remove the UTF-8 byte order mark (BOM) from the start of a string
function remove_utf8_bom($text) {
    $bom = pack('H*', 'EFBBBF');
    return preg_replace("/^$bom/", '', $text);
}
# get list of all files
$dir = 'your-path/';
$allFiles = scandir($dir);
# process each file
foreach ($allFiles as $file) {
    if (in_array($file, array('.', '..'))) {
        continue;
    }
    echo "===> Processing $file\n";
    $file = $dir . $file;
    # stores unique items like 13:BOOT2700, 13:BOXER1136 etc.
    $processedItems = array();
    # stores lines in the format we need
    $finalResult = array();
    $handle = fopen($file, 'r');
    if ($handle === false) {
        echo "Problem opening $file. Skipping.\n";
        continue;
    }
    # read each line; fgets returns false at the end of the file
    while (($line = fgets($handle)) !== false) {
        $line = remove_utf8_bom($line);
        # skip empty lines
        if (strlen(trim($line)) === 0) {
            continue;
        }
        # split by X; into the name part and the value part
        $lineSplit = explode('X;', $line, 2);
        # skip lines that do not contain "X;"
        if (count($lineSplit) < 2) {
            continue;
        }
        $name = trim($lineSplit[0]);
        # check if the first part (such as 13:BOOT2700) is processed already
        # if so, don't do anything else
        if (in_array($name, $processedItems, true)) {
            continue;
        }
        # store the first part in processed items and create the newly
        # formatted line; store that in final result
        $processedItems[] = $name;
        $finalResult[] = $name . ';' . trim($lineSplit[1]);
    }
    fclose($handle);
    # show the final result, one item per line
    echo "Result:\n";
    foreach ($finalResult as $x) {
        echo $x . "\n";
    }
}
echo "Done\n";
?>
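For larger files, the in_array lookup can be swapped for an array keyed by the name, so the "already seen" check becomes an isset. The following is only a minimal sketch of that idea, not part of the code above: the helper name dedupe_by_name and the inline sample lines are made up for illustration, and it assumes that lines without a ";" can be ignored.

```php
<?php
// Hypothetical helper: keep the first line seen for each name,
// using array keys for the "already processed" check.
function dedupe_by_name(array $lines): array
{
    $firstByName = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip empty lines
        }
        // Split "13:BOOT2700 X;27" into the name part and the value part.
        $parts = explode(';', $line, 2);
        if (count($parts) < 2) {
            continue; // assumption: lines without ';' are ignored
        }
        // Drop the trailing " X" marker from the name.
        $name = preg_replace('/\s+X$/', '', trim($parts[0]));
        if (!isset($firstByName[$name])) {
            $firstByName[$name] = $name . ';' . trim($parts[1]);
        }
    }
    // Return a plain list in order of first appearance.
    return array_values($firstByName);
}

// Sample lines for illustration only.
$lines = [
    "13:BOOT2700 X;27",
    "13:BOOT2700 X;28",
    "13:BOOT2700 X;29",
    "13:BOXER1136 X;11.36",
    "13:BOXER909 X;9.09",
];
$result = dedupe_by_name($lines);
echo implode("\n", $result), "\n";
```

As in the code above, the first value seen for a name wins, so the three 13:BOOT2700 lines collapse to the one ending in 27.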
CodePudding user response:
The file is read into an array with file(). array_map and preg_replace then remove the space and the X in front of the semicolon on each line, and array_unique removes the lines that are exact duplicates.
$array = file('input.txt',FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$array = array_map(function($v){return preg_replace('/ X;/',';',$v);}, $array);
$array = array_unique($array);
The result is an array.
array (
0 => "13:BOOT2700;27",
4 => "13:BOXER1136;11.36",
5 => "13:BOXER1364;13.64",
6 => "13:BOXER1591;15.91",
7 => "13:BOXER909;9.09",
)
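One thing to keep in mind: array_unique compares whole lines, so this works when duplicate names also carry the same value, as in the question. The sketch below uses made-up sample lines (not the asker's file) where the same name appears with different values; in that case only the exact repeat is dropped. It also shows array_values, which renumbers the keys if the gaps left by array_unique are unwanted.

```php
<?php
// Sample lines for illustration only: same name, partly different values.
$lines = [
    "13:BOOT2700 X;27",
    "13:BOOT2700 X;28",
    "13:BOOT2700 X;29",
    "13:BOOT2700 X;29",
    "13:BOXER1136 X;11.36",
];
// Same cleaning step as above: turn " X;" into ";".
$cleaned = array_map(function ($v) {
    return preg_replace('/ X;/', ';', $v);
}, $lines);
// array_unique compares whole strings, so only the repeated ";29" line goes.
$unique = array_unique($cleaned);
// The original keys are preserved (index 3 is now missing);
// array_values renumbers the list from 0.
$reindexed = array_values($unique);
print_r($reindexed);
```

If only one line per name should survive in such a file, the comparison has to be made on the part before the semicolon rather than on the whole line.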
If a file is required as a result, the array can be converted into a string with implode and written to a file with file_put_contents.
$str = implode("\r\n",$array);
file_put_contents('input.csv', $str);