I managed to find half of the solution to my problem, but I cannot find a conditional to handle the other half. I am using awk. The field separator is ; and the values are enclosed in double quotes ("). Each file has only 3 fields.
I have two files (file1.txt and file2.txt) and want to store the differences in a third file (results.txt).
file1.txt
"SWITCH1";"rack7";"Datacenter1"
"SWTICH46";"rack1";"rack1"
"ROUTER3";"";"rack1"
"SWITCH7";"rack1";"rack1"
"ROUTER9";"rack1";"rack1"
"ROUTER22";"rack1";"Datacenter4"
file2.txt
"SWITCH1";"rack7";"Datacenter1"
"ROUTER22";";"Datacenter4"
"SWITCH51";"rack7";"Datacenter2"
If I use:
awk -F';' 'FNR==NR {a[$0];next} !($0 in a)' file1.txt file2.txt
I get:
"ROUTER22";";"Datacenter4"
"SWITCH51";"rack7";"Datacenter2"
But I do not want a lone " in $2 of file2.txt and rack1 in $2 of file1.txt to count as a difference between the files. Whenever an entry in file2.txt has " in field $2 and the entry in file1.txt with the same $1 has rack1 in field $2, I want to discard it instead of reporting it.
The files are generated dynamically every night, and when that happens, field $2 is rack1 in file1.txt while field $2 is " in file2.txt. This match should be excluded in addition to the identical lines I already exclude with the awk command above. Below is the expected output:
Desired results.txt
"SWITCH51";"rack7";"Datacenter2"
I am struggling to find a conditional to handle this scenario.
CodePudding user response:
You could check whether the value of field 2 is just " and replace it with "rack1". If, after the replacement, $0 is not in array a, print the unmodified row, which is stored in the tmp variable in the example.
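awk '
BEGIN { FS = OFS = ";" }          # same separator for input and output
FNR == NR { a[$0]; next }         # file1.txt: remember every full line
{
    tmp = $0                      # keep the unmodified row for printing
    sub(/^"$/, "\"rack1\"", $2)   # a lone " in field 2 becomes "rack1"
    if (!($0 in a)) print tmp     # still not in file1.txt: a real difference
}
' file1.txt file2.txt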
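To try the same command without touching real data, a self-contained sketch could recreate the sample files from the question and write the result to results.txt. The temporary directory and the workdir variable are assumptions made for illustration; the awk program itself is unchanged.

```shell
# Scratch directory so the sketch does not overwrite existing files (assumption).
workdir=$(mktemp -d)

cat > "$workdir/file1.txt" <<'EOF'
"SWITCH1";"rack7";"Datacenter1"
"SWTICH46";"rack1";"rack1"
"ROUTER3";"";"rack1"
"SWITCH7";"rack1";"rack1"
"ROUTER9";"rack1";"rack1"
"ROUTER22";"rack1";"Datacenter4"
EOF

cat > "$workdir/file2.txt" <<'EOF'
"SWITCH1";"rack7";"Datacenter1"
"ROUTER22";";"Datacenter4"
"SWITCH51";"rack7";"Datacenter2"
EOF

# Same program as above, with the differences redirected to results.txt.
awk '
BEGIN { FS = OFS = ";" }
FNR == NR { a[$0]; next }
{
    tmp = $0
    sub(/^"$/, "\"rack1\"", $2)
    if (!($0 in a)) print tmp
}
' "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/results.txt"

cat "$workdir/results.txt"
```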
Output
"SWITCH51";"rack7";"Datacenter2"
CodePudding user response:
Based on the samples you have shown, please try the following awk code. A simple explanation: while reading the first input file, create two arrays, a and b, indexed by $0 and by $1,$3 respectively. While reading the second input file, check two conditions: if $1,$3 is NOT present in b AND $0 is not present in a, print that line from the second input file.
awk -F';' '
FNR==NR{
  a[$0]        # full lines of file1.txt
  b[$1,$3]     # fields 1 and 3 of file1.txt
  next
}
!(($1,$3) in b) && !($0 in a)   # print file2.txt lines that match neither
' file1.txt file2.txt
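Note that the command above ignores field 2 for every line whose $1 and $3 also appear in file1.txt, whatever field 2 contains. If only the specific mismatch from the question (rack1 in file1.txt, a lone " in file2.txt) should be tolerated, one possible variation is sketched below. It is an assumption about the intended rule, not part of the answer above: the array names seen and rack and the temporary directory are made up for illustration, and if file1.txt repeats a ($1,$3) pair, the last line wins.

```shell
# Scratch directory holding the sample data from the question (assumption).
workdir=$(mktemp -d)

cat > "$workdir/file1.txt" <<'EOF'
"SWITCH1";"rack7";"Datacenter1"
"SWTICH46";"rack1";"rack1"
"ROUTER3";"";"rack1"
"SWITCH7";"rack1";"rack1"
"ROUTER9";"rack1";"rack1"
"ROUTER22";"rack1";"Datacenter4"
EOF

cat > "$workdir/file2.txt" <<'EOF'
"SWITCH1";"rack7";"Datacenter1"
"ROUTER22";";"Datacenter4"
"SWITCH51";"rack7";"Datacenter2"
EOF

awk -F';' '
FNR == NR {
    seen[$0]             # exact lines of file1.txt
    rack[$1,$3] = $2     # field 2 of file1.txt, keyed by fields 1 and 3
    next
}
$0 in seen { next }      # identical line: not a difference
# Tolerated mismatch: lone " in file2.txt where file1.txt has "rack1".
$2 == "\"" && (($1,$3) in rack) && rack[$1,$3] == "\"rack1\"" { next }
{ print }                # everything else is a real difference
' "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/results.txt"

cat "$workdir/results.txt"
```

With the sample data this prints the same single SWITCH51 line, but a file2.txt line whose field 2 differs in any other way would still be reported.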