I have several CSV files that I'd like to combine by matching column headers while still keeping the unmatched columns. For example:
Input file1.csv:
col1,col2,col3,col5
a,b,c,d
d,e,b,g
c,a,d,h
Input file2.csv:
col1,col3,col4,col5
g,d,b,c
o,e,x,h
b,n,w,e
Desired output:
col1,col2,col3,col4,col5
a,b,c,,d
d,e,b,,g
c,a,d,,h
g,,d,b,c
o,,e,x,h
b,,n,w,e
Answer:
I would use Miller (available for several operating systems):
mlr --csv unsparsify file1.csv file2.csv
col1,col2,col3,col5,col4
a,b,c,d,
d,e,b,g,
c,a,d,h,
g,,d,c,b
o,,e,h,x
b,,n,e,w
Remark: the columns are output in the order in which they first appear across the input files, which is why col4 ends up last here. If need be, you can specify a custom ordering, but you'll need to know the column names in advance.
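If Miller is not an option, the same union-of-headers merge can be sketched with Python's standard csv module. This is only an illustration, not part of the Miller answer: the function name merge_csv and its sort_columns flag are made up for this example, and sorting the header names is just one way to get the col1..col5 order from the question without listing the columns in advance.

```python
import csv
import io


def merge_csv(sources, sort_columns=False):
    """Concatenate CSV inputs, using the union of their headers.

    `sources` is an iterable of open text files (or file-like objects).
    `merge_csv` and `sort_columns` are hypothetical names for this sketch.
    """
    fieldnames = []  # union of headers, in first-seen order
    rows = []
    for src in sources:
        reader = csv.DictReader(src)
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)  # remaining lines, as dicts keyed by header

    if sort_columns:
        # Plain string sort: fine for col1..col5, but "col10" would
        # sort before "col2".
        fieldnames.sort()

    out = io.StringIO()
    # restval="" fills in columns that a given row does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # The two inputs from the question, inlined so the sketch runs as-is.
    file1 = "col1,col2,col3,col5\na,b,c,d\nd,e,b,g\nc,a,d,h\n"
    file2 = "col1,col3,col4,col5\ng,d,b,c\no,e,x,h\nb,n,w,e\n"
    print(merge_csv([io.StringIO(file1), io.StringIO(file2)],
                    sort_columns=True), end="")
```

With real files, pass handles opened with newline="" (as the csv module documentation recommends) instead of the StringIO objects. Leaving sort_columns at False keeps the first-seen column order, which matches the Miller output above.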