I want to remove duplicate lines from a text file. Currently I use this:
$lines = file('input.txt');
$lines = array_unique($lines);
file_put_contents('output.txt', implode($lines));
The problem is that this code only removes exact duplicates, such as "beef bbq recipe" and "beef bbq recipe". In my case, the text file contains keywords like:
beef bbq recipe
beef easy recipe
beef steak recipe
bbq recipe beef
beef bbq recipe
recipe bbq beef
The code above returns this result:
beef bbq recipe
beef easy recipe
beef steak recipe
bbq recipe beef
recipe bbq beef
Instead, I want the result to look like this:
beef bbq recipe
beef easy recipe
beef steak recipe
So I want lines like "beef bbq recipe", "bbq recipe beef" and "recipe bbq beef" to be treated as duplicates too. Is there a solution for this? Thank you.
Answer:
You can use array_map, explode and sort to bring the keywords into the same order on every line before removing duplicates:
$lines = file('input.txt');
// sort keywords in each line
$lines = array_map(function($line) {
    $keywords = explode(" ", trim($line));
    sort($keywords);
    return implode(" ", $keywords);
}, $lines);
$lines = array_unique($lines);
file_put_contents('output.txt', implode("\n", $lines));
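This iterates over the array and sorts the keywords on each line alphabetically. Afterwards, array_unique removes the duplicated lines. Note that the output then contains the sorted form of each line (for example "bbq beef recipe" rather than "beef bbq recipe"), not the original wording.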
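If the output should keep each line exactly as it first appeared in the file (which is what the desired result in the question shows), one possible variation is to use the sorted keywords only as a lookup key and store the original line. The following is a minimal sketch, not the only way to do it. The function names keywordKey and uniqueByKeywords are made up for illustration, and the sample array stands in for the result of file('input.txt').

```php
<?php
// Hypothetical helper: builds an order-independent key for one line
// by sorting its keywords.
function keywordKey(string $line): string
{
    // Split on any run of whitespace and drop empty pieces.
    $keywords = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
    sort($keywords);
    return implode(' ', $keywords);
}

// Hypothetical helper: keeps the first line seen for each keyword set.
function uniqueByKeywords(array $lines): array
{
    $seen = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $key = keywordKey($line);
        if (!isset($seen[$key])) {
            $seen[$key] = $line; // remember the original wording
        }
    }
    return array_values($seen);
}

// Sample data standing in for file('input.txt').
$lines = [
    "beef bbq recipe\n",
    "beef easy recipe\n",
    "beef steak recipe\n",
    "bbq recipe beef\n",
    "beef bbq recipe\n",
    "recipe bbq beef\n",
];

echo implode("\n", uniqueByKeywords($lines)), "\n";
```

With the sample data this prints "beef bbq recipe", "beef easy recipe" and "beef steak recipe", one per line. To work on the real file, the sample array can be replaced with file('input.txt') and the echo with file_put_contents('output.txt', ...).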