Home > database >  How to combine two datasets vertically in pandas?
How to combine two datasets vertically in pandas?

Time:10-06

I have a loop which generates dataframes with 2 columns in each. Now, when I try to append the dataframes vertically (stacking those vertically), the code adds the new dataframes horizontally when I use pd.concat within a loop. However, the results do not merge the columns (with same lenght properly). Instead, it adds 2 new columns for every loop iteration, creating a bunch on Nans. How to solve?

df_master=pd.DataFrame()
columns=list(df_master)

data=[]
for i in range(1,3):
        --do something and return a df2 with 2 columns
        data.append(df2)
            
df_master = pd.concat(data, axis=1)
df_master.head()

How do I compress the new 2 column for every iteration within one dataframe?

enter image description here

CodePudding user response:

If you don't need to keep the column labels of original dataframes, you can try renaming the column labels of each dataframe to the same (e.g. 0 and 1) before concat, for example:

df_master = pd.concat([dfi.rename({old: new for new, old in enumerate(dfi.columns)}, axis=1) for dfi in data], ignore_index=True)

Demo

df1

   57  59
0   1   2
1   3   4


df2

   138  140
0   11   12
1   13   14


data = [df1, df2]

df_master = pd.concat([dfi.rename({old: new for new, old in enumerate(dfi.columns)}, axis=1) for dfi in data], ignore_index=True)


df_master

    0   1
0   1   2
1   3   4
2  11  12
3  13  14

CodePudding user response:

I suppose the problem is your columns have different names in each iteration, so you could easily solve it by calling df2.rename() and renaming it to the same names

CodePudding user response:

It works for me if I change axis to 0 inside the concat command.

df_master = pd.concat(data, axis=0)

CodePudding user response:

Pandas would fill empty cells with NaNs in each scenario and like the example you see below.

df1 = pd.DataFrame({'col1':[11,12,13], 'col2': [21,22,23], 'col3':[31,32,33]})
df2 = pd.DataFrame({'col1':[111,112,113, 114], 'col2': [121,122,123,124]})

merge / join / concatenate data frames [df1, df2] vertically - add rows

pd.concat([df1,df2], ignore_index=True)
# output 

   col1  col2  col3
0    11    21  31.0
1    12    22  32.0
2    13    23  33.0
3   111   121   NaN
4   112   122   NaN
5   113   123   NaN
6   114   124   NaN

merge / join / concatenate data frames horizontally (aligning by index)

pd.concat([df1,df2], axis=1)
# output
   col1  col2  col3  col1  col2
0  11.0  21.0  31.0   111   121
1  12.0  22.0  32.0   112   122
2  13.0  23.0  33.0   113   123
3   NaN   NaN   NaN   114   124
  • Related