I have a CSV file that contains duplicate values across columns, for example:
Field1;Field2;Field3;Field4;Field5
alpha;15;16;delta;delta
alpha;15;15;delta;kappa
alpha;15;15;delta;delta
alpha;15;16;delta;kappa
I want to delete rows that have the same value in Field2 and Field3, or in Field4 and Field5, or both.
Expected output:
Field1;Field2;Field3;Field4;Field5
alpha;15;16;delta;kappa
CodePudding user response:
Suggesting an awk script:
awk -F';' '$2==$3||$4==$5{next}1' input.csv
This prints input.csv without the rows where Field2 equals Field3 or Field4 equals Field5.
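As a quick check, the filter can be run against the sample data from the question. This is only a sketch: the file names sample.csv and filtered.csv are assumptions chosen for illustration.

```shell
# Recreate the sample data from the question (sample.csv is a hypothetical name).
cat > sample.csv <<'EOF'
Field1;Field2;Field3;Field4;Field5
alpha;15;16;delta;delta
alpha;15;15;delta;kappa
alpha;15;15;delta;delta
alpha;15;16;delta;kappa
EOF

# Skip rows where Field2 equals Field3 or Field4 equals Field5; print the rest.
# The header survives because "Field2" != "Field3" and "Field4" != "Field5".
awk -F';' '$2==$3||$4==$5{next}1' sample.csv > filtered.csv
cat filtered.csv
```

Only the header and the row alpha;15;16;delta;kappa should remain, matching the expected output in the question.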
awk -i inplace -F';' '$2==$3||$4==$5{next}1' input.csv
This updates input.csv in place, removing the same rows. The -i inplace option requires GNU awk (gawk) 4.1 or later.
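If GNU awk is not available, a temporary file is a common fallback for rewriting the file. The sketch below assumes a scratch name input.csv.tmp (hypothetical) and creates a small input.csv first so that it can be run on its own; with a real file, skip that first step.

```shell
# Small stand-in for input.csv so the example is runnable by itself.
printf '%s\n' \
  'Field1;Field2;Field3;Field4;Field5' \
  'alpha;15;15;delta;kappa' \
  'alpha;15;16;delta;kappa' > input.csv

# Write the filtered rows to a temporary file, then replace the original
# only if awk succeeded (input.csv.tmp is an assumed scratch name).
awk -F';' '$2==$3||$4==$5{next}1' input.csv > input.csv.tmp &&
  mv input.csv.tmp input.csv

cat input.csv
```

Chaining with && leaves the original file untouched if awk fails.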