I have two files whose contents I need to match against each other.
File1.txt contains:
-----------------------------------------------
Words | Keyword | Sentence
-----------------------------------------------
Lunch >WORDS> when do you want to have lunch?.
Hate >WORDS> I hate you.
Other >WORDS> Other than that?
File2.txt contains:
I love you.
Other than that?.
I like you.
when do you want to have lunch?.
File1 should be matched against File2 using only the text that follows the keyword >WORDS>. In other words, only the sentences "Other than that?" and "when do you want to have lunch?" are compared, and the result should be the File1 lines whose text after >WORDS> also appears in File2.
I am using arrays to do this.
The expected output is:
Other >WORDS> Other than that?.
Lunch >WORDS> when do you want to have lunch?.
CODE:
use strict;
use warnings;
use diagnostics;
use Data::Dumper;
use 5.010;
my $new = 'File1.txt'; # name of File1
my $old = 'File2.txt'; # name of File2
my $string1;
my $string2;
my @new_array;
my @old_array;
my $string11;
my @array1;
#---------------------------------------------------------------
# Main
#---------------------------------------------------------------
open(NEW_FILE, "<", $new) || die "Cannot open file $new to Read! - $!";
open(OLD_FILE, "<", $old) || die "Cannot open file $old to Read! - $!";
while (<NEW_FILE>) {
    my $string1 = $_;
    my $string11 = $_;
    if ($string1 =~ m/WORDS/) {          # match the keyword >WORDS>
        $string1 = $';                   # $string1 takes the text after >WORDS>
        $string11 = $_;                  # $string11 takes the full line
        push (@new_array, ($string1));   # text after >WORDS> goes into @new_array
        push (@array1, ($string11));     # full line goes into @array1
    }
}
while (<OLD_FILE>) {
    my $string2 = $_;
    if ($string2 =~ m/WORDS/) {          # match the keyword >WORDS>
        $string2 = $';                   # $string2 takes the text after >WORDS>
        push (@old_array, ($string2));   # text after >WORDS> goes into @old_array
    }
}
#------ Compare the new file with the old file (only the text after WORDS)
my @intersection = ();
my @unintersection = ();
my %hash1 = map { $_ => 1 } @old_array;
foreach (@new_array) {
    if (defined $hash1{$_}) {
        push @intersection, $_;      # entries present in both new and old
    }
    else {
        push @unintersection, $_;    # entries present only in new; these are used next
    }
}
Up to this point, if I print @unintersection, it produces:
other than that?
when do you want to have lunch?.
Next, I compare @unintersection (the text after WORDS) with @array1 (the full lines).
my @same = ();
my @not_same = ();
my %hash2 = map { $_ => 1 } @unintersection;
foreach (@array1) {
    if (@array1 = m/WORDS/) {
        @array1 = $';
        if (defined $hash2{$_}) {
            @array1 = $_;
            push @same, $_;
        }
        else {
            push @not_same, $_;
        }
    }
}
print @same;
print @not_same;
close(NEW_FILE);
close(OLD_FILE);
close(NEW_OUTPUT_FILE);
My code produces only one line of output:
Other >WORDS> Other than that?
It should produce two lines: "Other >WORDS> Other than that?" and "Lunch >WORDS> when do you want to have lunch?"
CodePudding user response:
The problem can be solved with a lookup table (implemented as a hashref) built from the information in File1.txt (words_lookup.dat).
Once the lookup table is available, read File2.txt (words_data.dat) and compare each line against it. If an input line matches an entry in the lookup table, print the stored value ($lookup->{$1}{line}) to the console.
use strict;
use warnings;
use feature 'say';
my($fh, $lookup);
my $fname_lookup = 'words_lookup.dat'; # File1.txt
my $fname_data = 'words_data.dat'; # File2.txt
# Capture the leading word ($1) and the sentence after >WORDS> ($2).
my $re_lookup = qr/(\S+)\s+>WORDS>\s+(.*)/;
open $fh, '<', $fname_lookup
    or die "Couldn't open $fname_lookup: $!";
while( <$fh> ) {
    chomp;
    next unless /$re_lookup/;
    $lookup->{$1}{sentence} = $2;
    $lookup->{$1}{line} = $_;
}
close $fh;
open $fh, '<', $fname_data
    or die "Couldn't open $fname_data: $!";
while( my $line = <$fh> ) {
    # \Q...\E matches the sentence literally ('?' and '.' are regex metacharacters).
    $line =~ /\Q$lookup->{$_}{sentence}\E/ && say $lookup->{$_}{line} for keys %$lookup;
}
close $fh;
exit 0;
Output
Other >WORDS> Other than that?
Lunch >WORDS> when do you want to have lunch?.
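As a possible variant, not part of the code above: if whole sentences should match exactly rather than as substrings, the hash can be keyed by the sentence itself, so each File2 line needs a single hash lookup instead of a scan over every key. The sketch below uses the sample data from the question held in arrays rather than read from files. The helper names normalize, build_lookup and matching_lines are hypothetical, and stripping trailing periods is an assumption made so that "Other than that?" and "Other than that?." compare equal.

```perl
use strict;
use warnings;
use feature 'say';

# Sample data copied from the question; real code would read the two files.
my @file1 = (
    'Lunch >WORDS> when do you want to have lunch?.',
    'Hate >WORDS> I hate you.',
    'Other >WORDS> Other than that?',
);
my @file2 = (
    'I love you.',
    'Other than that?.',
    'I like you.',
    'when do you want to have lunch?.',
);

# Hypothetical helper: trim leading whitespace and trailing whitespace/periods
# (an assumption about which differences between the files should be ignored).
sub normalize {
    my ($text) = @_;
    $text =~ s/^\s+//;
    $text =~ s/[\s.]+$//;
    return $text;
}

# Map each normalized sentence (the text after >WORDS>) to its full File1 line.
sub build_lookup {
    my ($lines) = @_;
    my %line_for;
    for my $line (@$lines) {
        next unless $line =~ /^\S+\s+>WORDS>\s+(.*)$/;
        $line_for{ normalize($1) } = $line;
    }
    return \%line_for;
}

# Return the File1 lines whose sentence appears in File2, in File2 order.
sub matching_lines {
    my ($line_for, $lines) = @_;
    my @found;
    for my $line (@$lines) {
        my $key = normalize($line);
        push @found, $line_for->{$key} if exists $line_for->{$key};
    }
    return @found;
}

my $line_for = build_lookup(\@file1);
say for matching_lines($line_for, \@file2);
```

Because the outer loop walks File2, the results come out in File2 order, and the hash lookup does not depend on the order of the hash keys.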