Joining two files, taking the first file as priority

I'm looking for help joining two files (file1 and file2) at the UNIX level, with values from file1 taking priority over values from file2. If a srcValue exists in file1, it should be used instead of file2's tmpValue. If there is no srcValue in file1, the value should come from file2's tmpValue.

Sample data:

file1:

id  name    srcValue
1   a   s123
2   b   s456
3   c

file2:

id  tmpValue    
1   Tva
3   TVb
4   Tvm

Desired output:

ID  Name    FinalValue
1   a   s123
2   b   s456
3   c   TVb

CodePudding user response:

I would approach this problem with an awk script; it is fairly powerful and flexible. The general approach here is to load the values from file2 first, then loop through file1 and substitute them as needed.

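awk 'BEGIN { print "ID Name FinalValue" }
     # First file on the command line (file2): remember tmpValue by id.
     FNR == NR && FNR > 1 { tmpValue[$1] = $2 }
     # Second file (file1): fall back to tmpValue when srcValue is missing.
     FNR != NR && FNR > 1 {
         if (NF == 2) {
             print $1, $2, tmpValue[$1]
         } else {
             print $1, $2, $3
         }
     }
    ' file2 file1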

The BEGIN block is executed before any files are read; its only job is to output the new header.

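The FNR == NR && FNR > 1 condition is true only while awk reads the first file named on the command line ("file2" here): NR counts records across all input files, while FNR restarts at 1 for each file, so the two are equal only in the first file. The FNR > 1 part skips that file's first line, since it's a header. The "action" block for that condition simply fills an associative array, keyed by id, with the tmpValue from file2.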
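
To see why FNR == NR singles out the first file, here is a small sketch that just prints both counters for every line. It assumes the sample data from the question, written space-separated into a scratch directory; the tmpdir and counters names are only for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf 'id name srcValue\n1 a s123\n2 b s456\n3 c\n' > "$tmpdir/file1"
printf 'id tmpValue\n1 Tva\n3 TVb\n4 Tvm\n' > "$tmpdir/file2"

# Print the file name, NR (records read so far overall) and FNR (records
# read so far in the current file) for every input line.
counters=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR }' file2 file1)
printf '%s\n' "$counters"
# file2 1 1
# file2 2 2
# file2 3 3
# file2 4 4
# file1 5 1
# file1 6 2
# file1 7 3
# file1 8 4

rm -rf "$tmpdir"
```

NR and FNR agree on every file2 line and never on a file1 line, which is what the two conditions in the script rely on.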

The FNR != NR && FNR > 1 condition corresponds to the second filename ("file1" here) and also skips the first (header) line. In this block of code, we check whether the line has only two fields (NF == 2), meaning there's no srcValue; if so, we substitute the saved value from file2 (assuming there is one; otherwise, it'll be blank). If the line does have a srcValue, we print its three values back out unchanged.
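
If a blank third column is not what you want when an id has neither a srcValue nor a tmpValue, one possible variation is to test for the key with the in operator and print a placeholder. This is only a sketch: the "N/A" placeholder, the extra id 5 row, and the fallback variable are made up for the illustration and are not part of the question.

```shell
# Small made-up input: id 5 has no srcValue and does not appear in file2.
tmpdir=$(mktemp -d)
printf 'id name srcValue\n1 a s123\n3 c\n5 e\n' > "$tmpdir/file1"
printf 'id tmpValue\n1 Tva\n3 TVb\n' > "$tmpdir/file2"

fallback=$(awk '
    # First file (file2): remember tmpValue by id, skipping the header.
    FNR == NR { if (FNR > 1) { tmpValue[$1] = $2 }; next }
    # Second file (file1): skip the header, then pick the value to print.
    FNR > 1 {
        if (NF >= 3) {
            print $1, $2, $3              # srcValue present: keep it
        } else if ($1 in tmpValue) {
            print $1, $2, tmpValue[$1]    # fall back to the file2 value
        } else {
            print $1, $2, "N/A"           # hypothetical placeholder
        }
    }
' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$fallback"
# 1 a s123
# 3 c TVb
# 5 e N/A

rm -rf "$tmpdir"
```

Using ($1 in tmpValue) rather than comparing tmpValue[$1] to "" avoids creating empty array entries as a side effect of the lookup.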

I assume that the <br> bits in the question are attempts at formatting, and that column 3 in file1 would actually be empty if there was no value there.
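
If the real files turn out to be tab-separated, with the third column present but empty, NF would be 3 on every line and the NF == 2 test would never fire. A minimal sketch for that case, assuming tab delimiters (the question doesn't say which delimiter is used), compares $3 against the empty string instead:

```shell
# Tab-separated copies of the sample data; row 3 has an empty third column.
tmpdir=$(mktemp -d)
printf 'id\tname\tsrcValue\n1\ta\ts123\n2\tb\ts456\n3\tc\t\n' > "$tmpdir/file1"
printf 'id\ttmpValue\n1\tTva\n3\tTVb\n4\tTvm\n' > "$tmpdir/file2"

result=$(awk -F'\t' -v OFS='\t' '
    BEGIN     { print "ID", "Name", "FinalValue" }
    FNR == 1  { next }                       # skip the header of each file
    FNR == NR { tmpValue[$1] = $2; next }    # file2: remember tmpValue by id
    { print $1, $2, ($3 != "" ? $3 : tmpValue[$1]) }   # file1: prefer srcValue
' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

On the sample data this prints the header followed by the rows for ids 1, 2 and 3, with TVb filled in for id 3; id 4 is dropped because it only appears in file2.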