I have many CSVs with the same fields and the same name ("data.csv"). Each CSV has a header and multiple lines, and each one is inside a different folder.
E.g. folder1 has a CSV called data.csv:
NAME, COUNTRY
JOHN, USA
MARY, Panama
folder2 has a CSV called data.csv:
NAME, COUNTRY
James, UK
Jim, India
folder3 has a CSV called data.csv:
NAME, COUNTRY
James, UK
Jim, India
Now I want to combine all the CSVs into one file, but without repeating the headers.
So far I am doing:
find . -name "data.csv" | xargs cat > mergedCSV
Which works fine, except for the repeated headers.
CodePudding user response:
You can use csvstack from the handy csvkit package to concatenate multiple CSV files with the same layout:
find . -name data.csv | xargs csvstack > mergedCSV
CodePudding user response:
You can do this very easily with Miller, using its built-in "cat" verb:
find . -name data.csv | xargs mlr --csv cat
and if you want pretty formatting with 3 files as input:
mlr --opprint --barred --icsv cat a/data.csv b/data.csv c/data.csv
+-------+---------+
| NAME  | COUNTRY |
+-------+---------+
| JOHN  | USA     |
| MARY  | Panama  |
| James | UK      |
| Jim   | India   |
| James | UK      |
| Jim   | India   |
+-------+---------+
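If neither csvkit nor Miller is available, plain awk can do the same job. The following is only a minimal sketch, not part of either answer above. It assumes every file has exactly one header line and that the header is identical in all files. The demo/ tree and the merged.csv output name are made up for illustration. It relies on awk's FNR (line number within the current file) and NR (line number across all files) to drop every header except the first.

```shell
#!/bin/sh
# Sketch: merge every data.csv under a directory tree, keeping one header.
# The demo/ tree and merged.csv are hypothetical names used for illustration.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Recreate the three sample files from the question.
mkdir -p demo/folder1 demo/folder2 demo/folder3
printf 'NAME, COUNTRY\nJOHN, USA\nMARY, Panama\n' > demo/folder1/data.csv
printf 'NAME, COUNTRY\nJames, UK\nJim, India\n'   > demo/folder2/data.csv
printf 'NAME, COUNTRY\nJames, UK\nJim, India\n'   > demo/folder3/data.csv

# FNR == 1 is the first line of each file; NR != 1 excludes the very first
# line read overall, so only the first file's header is printed.
find demo -name data.csv \
  -exec awk 'FNR == 1 && NR != 1 { next } { print }' {} + > merged.csv

cat merged.csv
```

Two caveats: find does not guarantee the order in which files are visited, so the row order in the merged file may vary, and with a very large number of files `-exec ... {} +` may run awk more than once, in which case each batch would contribute its own header. Using `-exec` instead of piping to xargs also avoids problems with spaces in folder names.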