Apache NiFi - Getting unique records from CSV files

Time:12-22

I have two CSV files, and both contain records. Some records appear in both files. I want to remove the duplicates and keep only the unique records. How can I do this with Apache NiFi?

Thank you!

input1.csv ;

id,surname,name
1,ali,veli
2,mert,tolga

input2.csv ;

id,surname,name
1,ali,veli
3,ahmet,ozan

output.csv ;

id,surname,name
1,ali,veli
2,mert,tolga
3,ahmet,ozan

CodePudding user response:

You can do this with record-based processing: use the MergeRecord processor to merge the two CSV files into one FlowFile, then use the QueryRecord processor to deduplicate with a query like:

SELECT * FROM FLOWFILE
INTERSECT
SELECT * FROM FLOWFILE

SELECT DISTINCT FROM FLOWFILE will not work. See the Calcite docs.

The output: [image]
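
To see why the self-INTERSECT query removes duplicates, here is a minimal sketch outside NiFi. It makes several assumptions: Python's built-in sqlite3 module stands in for the Calcite engine that QueryRecord uses, a plain loop of inserts stands in for MergeRecord, and the helper names read_rows and dedupe are made up for illustration. The ORDER BY clause is added only to make the printed order predictable. INTERSECT is a set operation, so intersecting the merged table with itself returns each distinct row once.

```python
import csv
import io
import sqlite3

# Sample data copied from the question.
INPUT1 = "id,surname,name\n1,ali,veli\n2,mert,tolga\n"
INPUT2 = "id,surname,name\n1,ali,veli\n3,ahmet,ozan\n"


def read_rows(text):
    """Parse CSV text into (id, surname, name) tuples, skipping the header."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    return [tuple(row) for row in reader]


def dedupe(*csv_texts):
    """Load all inputs into one table, then deduplicate with INTERSECT."""
    conn = sqlite3.connect(":memory:")
    # The table is named FLOWFILE only to mirror the query in the answer;
    # here it is an ordinary SQLite table, not a NiFi FlowFile.
    conn.execute("CREATE TABLE FLOWFILE (id TEXT, surname TEXT, name TEXT)")
    for text in csv_texts:  # stands in for the MergeRecord step
        conn.executemany(
            "INSERT INTO FLOWFILE VALUES (?, ?, ?)", read_rows(text)
        )
    query = """
        SELECT * FROM FLOWFILE
        INTERSECT
        SELECT * FROM FLOWFILE
        ORDER BY id
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


unique_rows = dedupe(INPUT1, INPUT2)
for row in unique_rows:
    print(",".join(row))
```

With the sample files from the question, this prints the three unique records: 1,ali,veli, then 2,mert,tolga, then 3,ahmet,ozan. Behavior inside NiFi depends on the configured record reader and writer, so treat this only as an illustration of the SQL semantics.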
