Home > database >  Match column elements of two files and replace it with the matched line using AWK/PERL
Match column elements of two files and replace it with the matched line using AWK/PERL

Time:03-20

I have two files each having 3 columns. I want to match element of column 3 of file1 with column3 of file2. If it matches, replace the entire line of file1 with the line corresponding to the match from file2, otherwise move to the next line.

Below is the example: In file2, the 3rd column element reg[2] and reg[9][9] are present in the column 3 of file1. So, the corresponding line of file1 is replaced with that of line from file2.

File1:

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]

File2:

Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

Desired output

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

Attempted code:

awk -F, 'NR==FNR{a[$3]=$0;next;}a[$3]{$0=a[$3]}1' file2 file1

I am still a novice in using oneliner awk commands. I am definitely doing something wrong in the above code. What I am trying to do is put the 3rd column in the form of key and the entire line as value. If the key exists in column3 of file1, replace fil1 current line with current value from file2. Otherwise skip and move to next line.

CodePudding user response:

I would use GNU AWK following way, let file1.txt content be

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]
and file2.txt content be
Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

then

awk 'FNR==NR{arr[$3]=$0;next}{print($3 in arr?arr[$3]:$0)}' file2.txt file1.txt

output

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

Explanation: start from processing file2.txt by storing each line in array arr under key being 3rd column ($3) value, do nothing else (thus next usage), then process file1.txt if 3rd value is present among arr keys ($3 in arr) do print corresponding value otherwise print current line ($0). In order to do so I employ so-called ternary operator condition?valueiftrue:valueiffalse

(tested in GNU Awk 5.0.1)

CodePudding user response:

Noticing the perl tag, here's a Perl solution:

perl -ane 'if ($eof) {
               if (exists $h{ $F[2] }) {
                   print $h{ $F[2] }
               } else { print }
           } else {
               $h{ $F[2] } = $_;
               $eof = 1 if eof;
           }' -- file2 file1
  • -n reads the input line by line, running the code for each line;
  • -a splits each line on whitespace into the @F array;
  • We set the variable $eof at the end of the first file, i.e. file2;
  • While reading the first file (file2), we store each line to a hash keyed by the third column;
  • While reading the second file (file1), we check whether the hash contains the line for the third column: if so, we print it, otherwise we print the current line.
  • Related