I have a loop which generates dataframes with 2 columns in each. Now, when I try to append the dataframes vertically (stacking those vertically), the code adds the new dataframes horizontally when I use pd.concat within a loop. However, the results do not merge the columns (with same lenght properly). Instead, it adds 2 new columns for every loop iteration, creating a bunch on Nans. How to solve?
df_master=pd.DataFrame()
columns=list(df_master)
data=[]
for i in range(1,3):
--do something and return a df2 with 2 columns
data.append(df2)
df_master = pd.concat(data, axis=1)
df_master.head()
How do I compress the new 2 column for every iteration within one dataframe?
CodePudding user response:
If you don't need to keep the column labels of original dataframes, you can try renaming the column labels of each dataframe to the same (e.g. 0
and 1
) before concat, for example:
df_master = pd.concat([dfi.rename({old: new for new, old in enumerate(dfi.columns)}, axis=1) for dfi in data], ignore_index=True)
Demo
df1
57 59
0 1 2
1 3 4
df2
138 140
0 11 12
1 13 14
data = [df1, df2]
df_master = pd.concat([dfi.rename({old: new for new, old in enumerate(dfi.columns)}, axis=1) for dfi in data], ignore_index=True)
df_master
0 1
0 1 2
1 3 4
2 11 12
3 13 14
CodePudding user response:
I suppose the problem is your columns have different names in each iteration, so you could easily solve it by calling df2.rename()
and renaming it to the same names
CodePudding user response:
It works for me if I change axis to 0 inside the concat command.
df_master = pd.concat(data, axis=0)
CodePudding user response:
Pandas would fill empty cells with NaNs
in each scenario and like the example you see below.
df1 = pd.DataFrame({'col1':[11,12,13], 'col2': [21,22,23], 'col3':[31,32,33]})
df2 = pd.DataFrame({'col1':[111,112,113, 114], 'col2': [121,122,123,124]})
merge / join / concatenate data frames [df1, df2]
vertically - add rows
pd.concat([df1,df2], ignore_index=True)
# output
col1 col2 col3
0 11 21 31.0
1 12 22 32.0
2 13 23 33.0
3 111 121 NaN
4 112 122 NaN
5 113 123 NaN
6 114 124 NaN
merge / join / concatenate data frames horizontally (aligning by index)
pd.concat([df1,df2], axis=1)
# output
col1 col2 col3 col1 col2
0 11.0 21.0 31.0 111 121
1 12.0 22.0 32.0 112 122
2 13.0 23.0 33.0 113 123
3 NaN NaN NaN 114 124