Home > Software design >  Merge two files using AWK with conditions
Merge two files using AWK with conditions

Time:10-07

I am new to bash scripting and need help with below Question. I parsed a log file to get below and now stuck on later part. I have a file1.csv with content as:

mac-test-1,10.32.9.12,15
mac-test-2,10.32.9.13,10
mac-test-3,10.32.9.14,11
mac-test-4,10.32.9.15,13

and second file2.csv has below content:

mac-test-3,10.32.9.14
mac-test-4,10.32.9.15

I want to do a file comparison and if the line in second file matches any line in first file then change the content of file 1 as below:

mac-test-1,10.32.9.12, 15, no match
mac-test-2,10.32.9.13, 10, no match
mac-test-3,10.32.9.14, 11, matched
mac-test-4,10.32.9.15, 13, matched

I tried this

awk -F "," 'NR==FNR{a[$1]; next} $1 in a {print $0",""matched"}' file2.csv file1.csv 

but it prints below and doesn't include the not matching records

mac-test-3,10.32.9.14,11,matched 
mac-test-4,10.32.9.15,13,matched

Also, in some cases the file2 can be empty so the result should be like this:

 mac-test-1,10.32.9.12,15, no match
 mac-test-2,10.32.9.13,10, no match
 mac-test-3,10.32.9.14,11, no match
 mac-test-4,10.32.9.15,13, no match

CodePudding user response:

With your shown samples please try following awk code. You need not to check condition first and then print the statement because when you are checking $1 in a then those items who doesn't exist will NEVER come inside this condition's block. So its better to print whole line of file1.csv and then print status of that particular line either its matched OR not-matched based on their existence inside array.

awk '
BEGIN  { FS=OFS="," }
FNR==NR{
  arr[$0]
  next
}
{
  print $0,(($1 OFS $2) in arr)?"Matched":"Not-matched"
}
' file2.csv file1.csv


EDIT: Adding a solution to handle empty file of file2.csv scenario here, same concept wise as above only thing it handles scenarios when file2.csv is an Empty file.

awk -v lines=$(wc -l < file2.csv) '
BEGIN  { FS=OFS=","}
(lines==0){
  print $0,"Not-Matched"
  next
}
FNR==NR{
  arr[$0]
  next
}
{
  print $0,(($1 OFS $2) in arr)?"Matched":"Not-matched"
}
' file2.csv file1.csv

CodePudding user response:

You are not printing the else case:

awk -F "," 'NR==FNR{a[$1]; next}
{
 if ($1 in a) {
  print $0 ",matched"
 } else {
  print $0 ",no match"
 }
}' file2.csv file1.csv

Output

mac-test-1,10.32.9.12,15,no match
mac-test-2,10.32.9.13,10,no match
mac-test-3,10.32.9.14,11,matched
mac-test-4,10.32.9.15,13,matched

Or in short, without manually printing the comma but using OFS:

awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1];next}{ print $0 OFS (($1 in a)?"":"no")"match"}' file2.csv file1.csv

Edit

I found a solution on this page handling FNR==NR on an empty file.

When file2.csv is empty, all output lines will be:

mac-test-1,10.32.9.12,15,no match

Example

awk -F "," '
ARGV[1] == FILENAME{a[$1];next}
{
 if ($1 in a) {
  print $0 ",matched"
 } else {
  print $0 ",no match"
 }
}' file2.csv file1.csv
  • Related