Split a csv file using unix based on header

Time:07-02

I have a single CSV file that contains two header rows.
I want to split it into two files, one per header (or use a better approach, if there is one).
I tried doing this in our ETL tool, but its limitations keep me from getting the desired result,
so I'm wondering whether it's possible with a Unix script.

Below is an example of a single file with two headers:

Name,Age,Profession
John,30,Programmer
Mike,40,Accountant
,,,
Company_Name,Department,Location
Microsoft,IT,USA
Deloitte,Finance,Australia

CodePudding user response:

awk -v RS= -F, '{print > (NR "-" $1)}' test.csv

Creates:

$ ls
1-Name
2-Company_Name

Including NR in the file name keeps the names unique, so two sections whose headers start with the same field are not written to the same file.

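Edit: this splits the file at runs of one or more blank lines (awk's paragraph mode, which RS set to the empty string turns on), so it assumes the sections are separated by blank lines rather than by a ,,, row.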
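
Since the sample in the question separates the sections with a ,,, row rather than a blank line, one way to still use paragraph mode is to blank out comma-only lines first. This is only a sketch: the sed step, the temporary directory, and the .csv suffix are my additions, not part of the answer above, and it assumes no data row consists solely of commas.

```shell
#!/bin/sh
# Sketch only: the temporary directory and the ".csv" suffix are illustrative choices.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample input from the question.
cat > test.csv <<'EOF'
Name,Age,Profession
John,30,Programmer
Mike,40,Accountant
,,,
Company_Name,Department,Location
Microsoft,IT,USA
Deloitte,Finance,Australia
EOF

# 1. sed turns every line made up only of commas into an empty line.
# 2. awk in paragraph mode (RS set to the empty string) then reads each
#    blank-line-separated block as one record; $1 is the first header field.
sed 's/^,*$//' test.csv |
  awk -v RS= -F, '{ out = NR "-" $1 ".csv"; print > out; close(out) }'

ls
```

With the sample input this should leave 1-Name.csv and 2-Company_Name.csv next to test.csv. Calling close(out) after each record keeps the number of open files small if there are many sections.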

CodePudding user response:

This works with true awk when the separator line is exactly ,,, as in the sample:

awk 'BEGIN {f="file1"} {if ($0==",,,") {f="file2"} else {print > (f ".csv")}}' file.csv
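
The command above hardcodes two output names. If the file might hold more than two sections, the same idea can be sketched with a counter. The file1.csv, file2.csv, ... naming follows the answer; treating any comma-only (or empty) line as a separator is my assumption, as is the temporary directory used for the demo.

```shell
#!/bin/sh
# Sketch only: runs in a throwaway directory with the sample from the question.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file.csv <<'EOF'
Name,Age,Profession
John,30,Programmer
Mike,40,Accountant
,,,
Company_Name,Department,Location
Microsoft,IT,USA
Deloitte,Finance,Australia
EOF

awk '
  # Separator row (only commas, or empty): close the current output and skip the row.
  /^,*$/    { if (out != "") { close(out); out = "" }; next }
  # First row after a separator (or at the start): pick the next numbered file.
  out == "" { out = "file" (++n) ".csv" }
  # Every non-separator row goes to the current section file.
            { print > out }
' file.csv

ls
```

Because the counter only advances when a new section actually starts, several separator rows in a row do not leave gaps in the numbering.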

CodePudding user response:

$ awk -F',' '!f{f=$1}!/,,,/{print $0>f;next}{f=""}' file
$ head Name Company_Name 
==> Name <==
Name,Age,Profession
John,30,Programmer
Mike,40,Accountant

==> Company_Name <==
Company_Name,Department,Location
Microsoft,IT,USA
Deloitte,Finance,Australia
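
For readers who find the one-liner above hard to follow, here is a spelled-out sketch of the same approach, naming each output after the first field of its header. Two things differ and are my assumptions, not the answerer's: the separator test is anchored, so a data row that merely contains ,,, is not mistaken for a separator, and a .csv suffix is added to the output names.

```shell
#!/bin/sh
# Sketch only: runs in a throwaway directory with the sample from the question.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file <<'EOF'
Name,Age,Profession
John,30,Programmer
Mike,40,Accountant
,,,
Company_Name,Department,Location
Microsoft,IT,USA
Deloitte,Finance,Australia
EOF

awk -F, '
  # A row made up only of commas ends the current section.
  /^,*$/    { if (out != "") close(out); out = ""; next }
  # The first row of a section is its header; its first field names the file.
  out == "" { out = $1 ".csv" }
  # Write the header and every data row to that file.
            { print > out }
' file

head Name.csv Company_Name.csv
```

Note that if two sections started with the same first header field, the later one would overwrite the earlier file; adding NR or a counter to the name avoids that.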