I'm looking for help joining two files (file1 and file2) at the UNIX command-line level, with values from file1 taking priority over the values in file2. If a srcValue exists in file1, it should be used instead of file2's tmpValue. If there is no srcValue in file1, the value should come from file2's tmpValue.
Sample data:
file1:
id name srcValue
1 a s123
2 b s456
3 c
file2:
id tmpValue
1 Tva
3 TVb
4 Tvm
Desired output:
ID Name FinalValue
1 a s123
2 b s456
3 c TVb
CodePudding user response:
I would approach this problem with an awk script; it is fairly powerful and flexible. The general approach here is to load the values from file2 first, then loop through file1 and substitute them as needed.
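awk 'BEGIN { print "ID Name FinalValue" }
# First file (file2): remember tmpValue by id, skipping the header line.
FNR == NR && FNR > 1 { tmpValue[$1] = $2 }
# Second file (file1): skip the header, then print one merged row per line.
FNR != NR && FNR > 1 {
    if (NF == 2) {
        # No srcValue on this line: fall back to the saved tmpValue.
        print $1, $2, tmpValue[$1]
    } else {
        print $1, $2, $3
    }
}
' file2 file1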
The BEGIN block is executed before any files are read; its only job is to output the new header.
The FNR == NR && FNR > 1 condition is true for the first filename ("file2" here) and also skips the first line of that file (FNR > 1), since it's a header line. The "action" block for that condition simply fills an associative array with the id and tmpValue from file2.
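To see why FNR == NR singles out the first file, here is a small sketch that prints both counters for every line. The show_counters function, the demo_dir variable, and the shortened two-line sample files are made up for this illustration and are not part of the original script.

```shell
# Hypothetical demo: shortened copies of the sample files in a temp directory.
demo_dir=$(mktemp -d)
printf '%s\n' 'id tmpValue' '1 Tva' > "$demo_dir/file2"
printf '%s\n' 'id name srcValue' '1 a s123' > "$demo_dir/file1"

# show_counters DIR: print FILENAME, NR and FNR for each line of file2, then file1.
show_counters() {
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$1" && awk '{
        print FILENAME, "NR=" NR, "FNR=" FNR, (FNR == NR ? "first-file" : "later-file")
    }' file2 file1 )
}

show_counters "$demo_dir"
# file2 NR=1 FNR=1 first-file
# file2 NR=2 FNR=2 first-file
# file1 NR=3 FNR=1 later-file
# file1 NR=4 FNR=2 later-file
```

NR keeps counting across files while FNR restarts at 1 for each new file, so the two are equal only while the first file is being read.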
The FNR != NR && FNR > 1 condition corresponds to the second filename ("file1" here) and also skips the first (header) line. In this block of code, we check to see if there's a srcValue (a line with only two fields, NF == 2, has none); if there is one, print those three values back out; if not, substitute in the saved value (assuming there is one; otherwise, the third column will be blank).
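To try the whole thing end to end, here is a minimal sketch that recreates the sample data from the question in a temporary directory and runs the same awk program. The merge_values function name and the workdir variable are hypothetical, and the files are assumed to be whitespace-separated exactly as shown in the question.

```shell
# Hypothetical wrapper: merge_values FILE1 FILE2
# FILE2 is passed to awk first so its tmpValues are loaded before FILE1 is read.
merge_values() {
    awk 'BEGIN { print "ID Name FinalValue" }
    FNR == NR && FNR > 1 { tmpValue[$1] = $2 }
    FNR != NR && FNR > 1 {
        if (NF == 2) {
            print $1, $2, tmpValue[$1]
        } else {
            print $1, $2, $3
        }
    }
    ' "$2" "$1"
}

# Recreate the sample data from the question.
workdir=$(mktemp -d)
printf '%s\n' 'id name srcValue' '1 a s123' '2 b s456' '3 c' > "$workdir/file1"
printf '%s\n' 'id tmpValue' '1 Tva' '3 TVb' '4 Tvm' > "$workdir/file2"

merge_values "$workdir/file1" "$workdir/file2"
# ID Name FinalValue
# 1 a s123
# 2 b s456
# 3 c TVb
```

Id 4 exists only in file2, so it never appears in the output; only the rows of file1 are printed.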
I assume that the <br> bits in the question are attempts at formatting, and that column 3 in file1 would actually be empty if there was no value there.