How to merge multiple DataFrames with pandas in Python

Time:07-13

FinalDf = pd.merge(HRTrainingData,HRDataSet1,HRDataSet2,HRDataSet3, on='EmployeeNumber', how='outer')

TypeError: merge() got multiple values for argument 'on'

I have only entered one 'on' argument, so I'm not sure what is going on here, but I am unable to merge these data frames. Any advice?

CodePudding user response:

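pd.merge accepts only two DataFrames (left and right). Its next positional parameters are how and on, so in your call HRDataSet2 is taken as how and HRDataSet3 as on, which then clashes with the on= keyword and raises the TypeError. To merge more than two frames, nest the calls: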

pd.merge(df3, pd.merge(df1,df2, on='EmployeeNumber', how='outer'), on='EmployeeNumber',how='outer')

Or with functools.reduce:

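import functools

import pandas as pd

# Fold the list pairwise: each step merges the running result with the next frame.
functools.reduce(lambda x, y: pd.merge(x, y,
                                       on='EmployeeNumber',
                                       how='outer'),
                 [df1, df2, df3, df4, ..., df15])  # "..." stands for the frames in between
# The call above is equivalent to:
# merge(merge(merge(merge(df1, df2), df3), df4), ..., df15)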
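As a minimal runnable sketch of the reduce approach: the three small frames and the helper name merge_all below are made up for illustration, and only the EmployeeNumber key comes from the question.

```python
import functools

import pandas as pd

# Hypothetical sample data; only the 'EmployeeNumber' key is from the question.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Age": [30, 41]})
df2 = pd.DataFrame({"EmployeeNumber": [2, 3], "Dept": ["HR", "IT"]})
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Salary": [5000, 6200]})


def merge_all(frames, key="EmployeeNumber", how="outer"):
    """Merge a list of DataFrames pairwise, left to right, on one key column."""
    return functools.reduce(
        lambda left, right: pd.merge(left, right, on=key, how=how),
        frames,
    )


final_df = merge_all([df1, df2, df3])
print(final_df)
```

With how='outer', every EmployeeNumber that appears in any frame is kept, and columns missing for a given employee are filled with NaN.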

CodePudding user response:

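Alternatively, you can use join:

# set_index(..., inplace=True) returns None, so build new indexed frames instead
df_list = [d.set_index('EmployeeNumber') for d in [df1, df2, df3, df4]]

# join defaults to how='left'; pass how='outer' to match the merge in the question
df_list[0].join(df_list[1:], how='outer')

One advantage of pd.DataFrame.join over pd.DataFrame.merge is that join accepts a list of DataFrames. Note that when a list is passed, the frames are joined on their indexes, and the non-key column names must not overlap.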
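A small self-contained sketch of the join approach, again with made-up sample frames (only the EmployeeNumber key comes from the question). It assumes each frame has unique EmployeeNumber values and that no other column names are shared between frames.

```python
import pandas as pd

# Hypothetical sample data; only the 'EmployeeNumber' key is from the question.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Age": [30, 41]})
df2 = pd.DataFrame({"EmployeeNumber": [2, 3], "Dept": ["HR", "IT"]})
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Salary": [5000, 6200]})

# Move the key into the index; set_index returns a new frame by default.
indexed = [d.set_index("EmployeeNumber") for d in (df1, df2, df3)]

# Join the first frame with all the others on the index, keeping every key.
joined = indexed[0].join(indexed[1:], how="outer")

# Turn the index back into a regular column (rename_axis makes the name explicit).
final_df = joined.rename_axis("EmployeeNumber").reset_index()
print(final_df)
```

The result should contain the same rows as the nested pd.merge version, though the row order is not guaranteed to match.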
