How to Concat 2 Pandas dataframes with different column numbers-CodePudding

I have a .xlsx file I am creating two dataframes from and then trying to concat them so that the right columns append to the bottom of the left columns. How do I get the data in the 2nd dataframe to move directly below the 1st dataframe with each dataframe having a different number of columns?

import pandas as pd

data = pd.read_excel('test.xlsx')
header = data[['col_1', 'col_2', 'col_3']]
detail = data[['col_4', 'col_5', 'col_6', 'col_7']]
combined = pd.concat([header, detail], ignore_index=True)
combined.to_csv('output.csv')

test.xlsx

col_1	col_2	col_3	col_4	col_5	col_6	col_7
aaaaa	bbbbb	ccccc	ddddd	eeeee	fffff	ggggg

Actual: output.csv

col_1	col_2	col_3	col_4	col_5	col_6	col_7
aaaaa	bbbbb	ccccc
			ddddd	eeeee	fffff	ggggg

Expected: output.csv

col_1	col_2	col_3	col_4	col_5	col_6	col_7
aaaaa	bbbbb	ccccc
ddddd	eeeee	fffff	ggggg

CodePudding user response：

Can you try this:

header = data[['col_1', 'col_2', 'col_3']]
detail = data[['col_4', 'col_5', 'col_6', 'col_7']]

detail.columns=data.columns[:len(detail.columns)]
header.columns=data.columns[:len(header.columns)]
combined = pd.concat([header, detail])
for i in data.columns:
    if i in combined.columns:
        pass
    else:
        combined[i]=np.nan

CodePudding user response：

Here is a proposition with pandas.concat :

header = ['col_1', 'col_2', 'col_3']
detail = ['col_4', 'col_5', 'col_6', 'col_7']

func = lambda x: dict(zip(x, range(1, len(x) 1)))

out = (
        pd.concat([data.loc[:, header].rename(columns= func(header)),
                   data.loc[:, detail].rename(columns= func(detail))],
                  ignore_index=True, axis=0)
      )

out.loc[:, header] = np.NaN

out.columns = data.columns

# Output :

print(out)

   col_1  col_2  col_3  col_4  col_5  col_6  col_7
0  aaaaa  bbbbb  ccccc    NaN    NaN    NaN    NaN
1  ddddd  eeeee  fffff  ggggg    NaN    NaN    NaN