I need your assistance to find the list of unmatched records in Employee.txt from the following examples on AIX 6.x.
Employee.txt
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
6|Jody|Ford|Chicago
Car.txt
100|red|1
110|green|9
120|yellow|2
130|yellow|6
140|red|8
150|white|0
bash-4.3$ awk -F"|" 'NR==FNR { empcar[$1]=$0; next } { if (empcar[$3]) print empcar[$3] "|" $1 "|" $2 > "match.txt"; else print $0 > "no_match.txt" }' Employee.txt Car.txt
110|green|9
140|red|8
150|white|0
match.txt
1|Sam|Smith|Seatle|100|red
2|Barry|Jones|Seatle|120|yellow
6|Jody|Ford|Chicago|130|yellow
no_match.txt
110|green|9
140|red|8
150|white|0
bash-4.3$ awk -F"|" 'NR==FNR { empcar[$1]=$0; next } !($3 in empcar)' employee.txt car.txt
This produced the same list as in no_match.txt.
However, I want the no_match.txt to be as follows:
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
In other words, print each row of Employee.txt whose employee no. does not appear in Car.txt. I couldn’t work out how to reference those unmatched records in the else statement.
I also encountered a lot of unexplained duplicates in match.txt when running this against my private, confidential data, which cannot be disclosed.
Many thanks, George
CodePudding user response:
"print the row in Employee.txt when does not have employee no. in Car.txt."
You may use this solution:
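awk -F"|" '
# First file (Car.txt): remember every employee no. found in field 3
NR == FNR {
    empcar[$3]
    next
}
# Second file (Employee.txt): route each row by whether its ID was seen
{
    print > ($1 in empcar ? "match.txt" : "no_match.txt")
}' Car.txt Employee.txt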
cat match.txt
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
6|Jody|Ford|Chicago
cat no_match.txt
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
Note that we process Car.txt as the first file and store all IDs from its 3rd field in the array empcar. Later, while processing Employee.txt, we redirect each row to match.txt or no_match.txt depending on whether $1 from the latter file exists in the associative array empcar.
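If you also want match.txt to keep the joined layout from the question (the employee row followed by the car number and colour), a minimal sketch could look like the following. It keeps the same idea of reading Car.txt first, but stores the car details as the array value instead of only the key. The array name car is my own choice, and the here-documents only recreate the sample files from the question so the snippet can be run on its own.

```shell
# Recreate the sample files from the question (illustration only)
cat > Employee.txt <<'EOF'
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
6|Jody|Ford|Chicago
EOF
cat > Car.txt <<'EOF'
100|red|1
110|green|9
120|yellow|2
130|yellow|6
140|red|8
150|white|0
EOF

awk -F"|" '
# Car.txt: key = employee no. (field 3), value = car no. and colour
NR == FNR {
    car[$3] = $1 "|" $2
    next
}
# Employee.txt: employee has a car, so append the car details
($1 in car) {
    print $0 "|" car[$1] > "match.txt"
    next
}
# Employee.txt: no car found for this employee
{
    print > "no_match.txt"
}' Car.txt Employee.txt
```

With the sample data, match.txt should contain employees 1, 2 and 6 with their car details appended, and no_match.txt employees 3, 4 and 5. Because car[$3] is overwritten on each assignment, an employee with several rows in Car.txt appears only once, with the last car read. The command in the question loops over Car.txt and prints once per car row, so repeated employee numbers in Car.txt would be one possible cause of the duplicates, although that is only a guess without seeing the real data.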