Search for strings in a file using the elements of an array in Perl

Time:12-11

I have an array that contains a set of unique elements: my_array = [aab, abc, def, fgh]. I also have a file that contains these elements, possibly repeated. I want to count how many times each unique element occurs in the file; if an element is not repeated, its count is 1.

Example of the file:

i want to have aab but no i dont want abc
i want to have aab but no i dont want def

The output should be:

aab - 2
abc - 1
def - 1

I first tried just to search for each element and print it when found, but it is not working:

use strict;
use warnings;
my @my_array;

@my_array =("abc", "aab", "def");
open (my $file, '<', 'filename.txt') or die;
my $value;
foreach $value (@my_array) {
    while(<$file>) {
        if ($_ =~ /$value/){
            print "found : $value\n"; 
        }
    }
}

I also tried a second method:

use strict;
use warnings;
my @my_array;

@my_array =("abc", "aab", "def");

open (my $file, '<', 'filename.txt') or die;
while (<$file>) {
    my $k=0;
    if ($_ =~ /$my_array[$k]/) {
        print "$my_array[$k]";
    }
}

CodePudding user response:

The sample input does not make clear whether a lookup word can occur more than once in a line.

The following demo code assumes that a lookup word occurs at most once per line.

If that assumption does not hold, each line should be split into tokens and every token inspected to get a correct count of the lookup words.

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my(%count,@lookup);

@lookup =('abc', 'aab', 'def');

while( my $line = <DATA> ) {
    for ( @lookup ) {
        $count{$_}++ if $line =~ /\b$_\b/;
    }
}

say Dumper(\%count);

exit 0;

__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def

Output

$VAR1 = {
          'aab' => 2,
          'abc' => 1,
          'def' => 1
        };
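
If lookup words can repeat within a line, the token-based approach mentioned above could look roughly like the sketch below. It is only an illustration under a few assumptions: words are separated by whitespace only (no punctuation attached), count_words is a hypothetical helper name, and the first sample line has been extended with an extra "aab" to show a repeat.

```perl
use strict;
use warnings;

# Count every occurrence of each lookup word, including repeats within a line.
# count_words is a hypothetical helper, not part of the answer above.
sub count_words {
    my ($lookup, $lines) = @_;
    my %wanted = map { $_ => 1 } @$lookup;   # lookup words as a set
    my %count;
    for my $line (@$lines) {
        # Assumption: tokens are separated by whitespace only.
        for my $token (split ' ', $line) {
            $count{$token}++ if $wanted{$token};
        }
    }
    return \%count;
}

my @lookup = ('abc', 'aab', 'def');
my @lines  = (
    'i want to have aab but no i dont want abc aab',   # extra "aab" added
    'i want to have aab but no i dont want def',
);

my $count = count_words(\@lookup, \@lines);
printf "%s - %d\n", $_, $count->{$_} // 0 for sort @lookup;
```

With the modified sample lines this prints "aab - 3", "abc - 1" and "def - 1". A hash lookup per token also avoids running one regex per lookup word on every line.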

CodePudding user response:

I'm a fan of the Algorithm::AhoCorasick::XS module for performing efficient searches for multiple strings at once. An example:

#!/usr/bin/env perl
use warnings;
use strict;
use Algorithm::AhoCorasick::XS;

my @words = qw/abc aab def/;
my $aho = Algorithm::AhoCorasick::XS->new(\@words);

my %counts;

while (my $line = <DATA>) {
    $counts{$_}++ for $aho->matches($line);
}

for my $word (@words) {
    printf "%s - %d\n", $word, $counts{$word}//1;
}

__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def

Output

abc - 1
aab - 2
def - 1

The $counts{$word}//1 expression in the output loop uses the defined-or operator (//, available since Perl 5.10): it yields 1 when that word has no entry in the hash because it was never encountered in the text.
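
As a small illustration of that fallback, the sketch below uses made-up data and a hypothetical helper named report. It shows that // only substitutes the default when the value is undefined, so a stored count of 0 would be kept.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %counts = (aab => 2, zero => 0);

# report is a hypothetical helper: the stored count, or 1 if the key is missing.
sub report {
    my ($word) = @_;
    return $counts{$word} // 1;   # falls back only when the value is undefined
}

printf "%s - %d\n", $_, report($_) for qw(aab zero missing);
```

This prints "aab - 2", "zero - 0" and "missing - 1". Using || instead of // would also turn the stored 0 into 1, because || tests truth rather than definedness.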
