I have two very different pandas DataFrames with unique column names. And I want to stack them one under another.Nothing else like "union" or "intersection" needed. Like "copy paste" one under another. And I also don't want to lose any column names.
Df1:
Col1
1 A C
2 B D
3 X Y
DF2:
Col2 Col3 Col4 Col5
1 a c x t
2 b d y j
3 b e z k
I tried using pd.concat([df1,df2], axis=0, ignore_index = True) but this resulted in something like this:
Col1
1 A C Nan Nan Nan Nan
2 B D Nan Nan Nan Nan
3 X Y Nan Nan Nan Nan
4 Nan Nan Col2 Col3 Col4 Col5
5 Nan Nan a c x t
6 Nan Nan b b y j
7 Nan Nan b e z k
Expected Result:
Col1
1 A C Nan Nan
2 B D Nan Nan
3 X Y Nan Nan
4 Col2 Col3 Col4 Col5
5 a c x t
6 b b y j
7 b e z k
Original DFs
CodePudding user response:
Pandas does not allow for multiple unnamed columns (i.e. columns with the empty string as a name) since column names must be unique. That said, I think this gets close to your desired result.
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Col1':[*'ABX'],'':[*'CDY']})
df2 = pd.DataFrame({'Col2':[*'abb'],'Col3':[*'cde'],'Col4':[*'xyz'],'Col5':[*'tjk']})
mat1 = np.vstack([df1.columns,df1.to_numpy()])
mat2 = np.vstack([df2.columns,df2.to_numpy()])
df = pd.concat([pd.DataFrame(mat1),pd.DataFrame(mat2)])
The resulting dataframe df
is
0 1 2 3
0 Col1 NaN NaN
1 A C NaN NaN
2 B D NaN NaN
3 X Y NaN NaN
0 Col2 Col3 Col4 Col5
1 a c x t
2 b d y j
3 b e z k
CodePudding user response:
This is quite a weird reshape. You can't directly move the header as row, but a trick is to reset_index
on the transpose.
For the concat
, set aside the column header and put it back:
cols = df1.columns
(pd.concat([df1.set_axis(range(df1.shape[1]), axis=1),
df2.T.reset_index().T], ignore_index=True)
.rename(columns=dict(enumerate(cols))))
output:
Col1 2 3
0 A C NaN NaN
1 B D NaN NaN
2 X Y NaN NaN
3 Col2 Col3 Col4 Col5
4 a c x t
5 b d y j
6 b e z k