I have an array that contains a set of unique elements: my_array = [aab, abc, def, fgh]. I also have a file that contains these elements, possibly repeated. I want to count how many times each unique element occurs in the file; if an element is not repeated, its count is 1.
Example of the file:
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
The output should be:
aab - 2
abc - 1
def - 1
I first tried to search for each element and print it, but it is not working:
use strict;
use warnings;
my @my_array;
@my_array =("abc", "aab", "def");
open (my $file, '<', 'filename.txt') or die;
my $value;
foreach $value (@my_array) {
while(<$file>) {
if ($_ =~ /$value/){
print "found : $value\n";
}
}
}
I also tried a second method:
use strict;
use warnings;
my @my_array;
@my_array =("abc", "aab", "def");
open (my $file, '<', 'filename.txt') or die;
while (<$file>) {
my $k=0;
if ($_ =~ /$my_array[$k]/) {
print "$my_array[$k]";
}
}
CodePudding user response:
The sample input does not specify whether lookup words can repeat within a line.
The following demo code assumes that lookup words do not repeat within a line.
If that assumption does not hold, the line should be split into tokens and each token inspected to get a correct count of the lookup words.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my(%count,@lookup);
@lookup =('abc', 'aab', 'def');
while( my $line = <DATA> ) {
for ( @lookup ) {
$count{$_}++ if $line =~ /\b$_\b/;
}
}
say Dumper(\%count);
exit 0;
__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
Output
$VAR1 = {
'aab' => 2,
'abc' => 1,
'def' => 1
};
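If lookup words can repeat within a line, the token-based approach mentioned above could look roughly like the following. This is only a sketch: count_words is a hypothetical helper name, the tokens are assumed to be plain \w+ words, and the third input line is an addition to the sample data that shows a repeat.

```perl
use strict;
use warnings;

# Count every occurrence of each lookup word, including repeats within a line.
# count_words() is a hypothetical helper, not part of the demo code above.
sub count_words {
    my ($lookup, $lines) = @_;
    my %wanted = map { $_ => 1 } @$lookup;    # quick membership test
    my %count;
    for my $line (@$lines) {
        # Assumption: words consist of \w characters only; adjust the pattern otherwise
        for my $token ($line =~ /\w+/g) {
            $count{$token}++ if $wanted{$token};
        }
    }
    return \%count;
}

my @lookup = ('abc', 'aab', 'def');
my @lines  = (
    'i want to have aab but no i dont want abc',
    'i want to have aab but no i dont want def',
    'aab aab',    # extra line, not in the original sample, to show a repeat
);

my $count = count_words(\@lookup, \@lines);
printf "%s - %d\n", $_, $count->{$_} // 0 for sort @lookup;
```

With the extra line included, aab is counted 4 times, while abc and def are counted once each.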
CodePudding user response:
I'm a fan of the Algorithm::AhoCorasick::XS module for performing efficient searches for multiple strings at once. An example:
#!/usr/bin/env perl
use warnings;
use strict;
use Algorithm::AhoCorasick::XS;
my @words = qw/abc aab def/;
my $aho = Algorithm::AhoCorasick::XS->new(\@words);
my %counts;
while (my $line = <DATA>) {
$counts{$_}++ for $aho->matches($line);
}
for my $word (@words) {
printf "%s - %d\n", $word, $counts{$word}//1;
}
__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
outputs
abc - 1
aab - 2
def - 1
The $counts{$word}//1 bit in the output will give you a 1 if that word doesn't exist in the hash because it wasn't encountered in the text.
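If installing an XS module is not an option, a single alternation regex built from the word list can do a similar one-pass scan in core Perl. This is a different technique from Aho-Corasick, shown only as a sketch: count_matches is a hypothetical helper name, and the \b anchors mean it counts whole words only.

```perl
use strict;
use warnings;

my @words = qw/abc aab def/;

# Build one pattern equivalent to \b(abc|aab|def)\b.
# quotemeta guards against regex metacharacters in the words.
my $alternation = join '|', map { quotemeta($_) } @words;
my $re = qr/\b($alternation)\b/;

# Hypothetical helper: count every whole-word match across the given lines.
sub count_matches {
    my ($pattern, @lines) = @_;
    my %counts;
    for my $line (@lines) {
        # In scalar context, /g resumes after the previous match on each iteration
        $counts{$1}++ while $line =~ /$pattern/g;
    }
    return \%counts;
}

my $counts = count_matches(
    $re,
    'i want to have aab but no i dont want abc',
    'i want to have aab but no i dont want def',
);
printf "%s - %d\n", $_, $counts->{$_} // 0 for @words;
```

This version prints 0 for a word that never occurs in the text.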