How to match 2 arrays?

Time:02-04

I have 2 files I need to match.

File1.txt contains:

-----------------------------------------------
Words | Keyword | Sentence
-----------------------------------------------
Lunch  >WORDS>    when do you want to have lunch?.
Hate   >WORDS>    I hate you.
Other  >WORDS>    Other than that?

File2.txt contains:

I love you.
Other than that?. 
I like you.
when do you want to have lunch?.

File1 should be matched against File2 using only the text that follows the keyword >WORDS>. In this example, the sentences "Other than that?" and "when do you want to have lunch?" appear in both files, so those are the matches. The result should be the File1 lines whose text after >WORDS> also appears in File2. I am using arrays to do this.

The expected output will print:

Other  >WORDS>    Other than that?. 
Lunch  >WORDS>    when do you want to have lunch?.

CODE:

use strict;
use warnings;
use diagnostics;

use Data::Dumper;
use 5.010;

my $new= 'File1.txt';              #read File1
my $old= 'File2.txt';              #read File2
my $string1;
my $string2;
my @new_array;
my @old_array;
my $string11;
my @array1;

#---------------------------------------------------------------
#   Main
#---------------------------------------------------------------

open(NEW_FILE,"<", $new) || die "Cannot open file $new to Read! - $!"; 
open(OLD_FILE,"<", $old) || die "Cannot open file $old to Read! - $!";

while (<NEW_FILE>) {
    my $string1= $_;
    my $string11= $_;
    if ($string1=~ m/WORDS/){      #matching the Keyword >WORDS>
        $string1 = $';         #string1 will take after >WORDS>
        $string11 = $_;        #string11 will take the full.
        push (@new_array, ($string1));      #string1 = @new_array
        push (@array1, ($string11));    }}  #string11 = @array1

while (<OLD_FILE>) {
    my $string2= $_;
    if ($string2 =~ m/WORDS/){  #matching the Keyword >WORDS>
        $string2 = $';          #string2 will take after >WORDS>
        push (@old_array, ($string2));   #string2 = @old_array
        }}

#------Do comparison between new file and old file. (only after WORDS)
my @intersection =();
my @unintersection = ();
my %hash1 = map{$_ => 1} @old_array;

foreach (@new_array){
    if (defined $hash1{$_}){    
        push @intersection, $_; #this one will take the same array between new and old
    }
    else { 
        push @unintersection, $_;   #this one will take the new array only. So, will read this one.
    }}

Up to this point, if I print @unintersection, it produces:

other than that?
when do you want to have lunch?.

Next, compare @unintersection (the text after WORDS) with @array1 (the full lines):

my @same= ();
my @not_same= ();
my %hash2 = map{$_ => 1} @unintersection;

foreach (@array1) {
    if (@array1 = m/WORDS/){      
        @array1 = $';
        if (defined $hash2{$_}) {
            @array1 = $_;
            push @same, $_;             
        }
        else {
            push @not_same, $_;}}}

print @same;
print @not_same;

close(NEW_FILE);
close(OLD_FILE);
close(NEW_OUTPUT_FILE);

The result I get is only 1 line:

Other  >WORDS>    Other than that?

It should produce 2 lines of output: "Other >WORDS> Other than that?" and "Lunch >WORDS> when do you want to have lunch?"

CodePudding user response:

The problem can be solved with a lookup table (implemented as a hashref) built from the information in File1.txt (words_lookup.dat).

Once the lookup table is available, read File2.txt (words_data.dat) line by line and compare each line against the table. If an input line matches a stored sentence, print the stored line ($lookup->{$_}{line}) to the console.

use strict;
use warnings;
use feature 'say';

my($fh, $lookup);

my $fname_lookup = 'words_lookup.dat';    # File1.txt
my $fname_data   = 'words_data.dat';      # File2.txt
my $re_lookup    = qr/(\S+)\s+>WORDS>\s+(.*)/;

open $fh, '<', $fname_lookup
    or die "Couldn't open $fname_lookup";
    
while( <$fh> ) {
    chomp;
    next unless /$re_lookup/;
    $lookup->{$1}{sentence} = $2;
    $lookup->{$1}{line} = $_;
}

close $fh;

open $fh, '<', $fname_data
    or die "Couldn't open $fname_data";
    
while( my $line = <$fh> ) {
    # \Q...\E quotes regex metacharacters such as "?" and "." in the stored sentence
    $line =~ /\Q$lookup->{$_}{sentence}\E/ && say $lookup->{$_}{line} for keys $lookup->%*;
}

close $fh;

exit 0;

Output

Other  >WORDS>    Other than that?
Lunch  >WORDS>    when do you want to have lunch?.
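
For anyone who wants to try the idea without creating the two files first, here is a minimal sketch of the same lookup approach using in-memory copies of the sample data. The arrays @file1 and @file2 and the helper name match_lines are assumptions made for illustration; they are not part of the code above. It uses index() for a literal substring search instead of a regex, so "?" and "." in the sentences need no escaping.

```perl
use strict;
use warnings;

# Hypothetical in-memory copies of File1.txt and File2.txt from the question.
my @file1 = (
    'Lunch  >WORDS>    when do you want to have lunch?.',
    'Hate   >WORDS>    I hate you.',
    'Other  >WORDS>    Other than that?',
);
my @file2 = (
    'I love you.',
    'Other than that?. ',
    'I like you.',
    'when do you want to have lunch?.',
);

# match_lines is an illustrative helper name (an assumption, not from the answer).
# It returns the File1 lines whose sentence (the text after >WORDS>)
# occurs literally inside some File2 line, in File2 order.
sub match_lines {
    my ($lookup_lines, $data_lines) = @_;

    # Build the lookup entries: sentence after >WORDS> plus the full line.
    my @entries;
    for my $line (@$lookup_lines) {
        next unless $line =~ /^\S+\s+>WORDS>\s+(.*?)\s*$/;
        push @entries, { sentence => $1, line => $line };
    }

    my @matched;
    for my $data (@$data_lines) {
        for my $entry (@entries) {
            # index() searches for a literal substring, so "?" and "."
            # are not treated as regex metacharacters.
            push @matched, $entry->{line}
                if index($data, $entry->{sentence}) >= 0;
        }
    }
    return @matched;
}

my @result = match_lines(\@file1, \@file2);
print "$_\n" for @result;
```

With the sample data this should print the "Other" line followed by the "Lunch" line, matching the order of the sentences in File2. Because this is a substring match, a File2 line such as "Other than that?. " still matches the shorter File1 sentence "Other than that?"; if exact equality is required, comparing with eq after trimming both sides would be the stricter choice.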