FinalDf = pd.merge(HRTrainingData,HRDataSet1,HRDataSet2,HRDataSet3, on='EmployeeNumber', how='outer')
TypeError: merge() got multiple values for argument 'on'
I have only entered one 'on' argument, so I'm not sure what is going on here, but I am unable to merge these data frames. Any advice?
CodePudding user response:
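pd.merge takes only two DataFrames (left and right). The extra positional arguments are bound to the next parameters, how and on, which is why on ends up with two values. You can nest the calls instead: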
pd.merge(df3, pd.merge(df1,df2, on='EmployeeNumber', how='outer'), on='EmployeeNumber',how='outer')
Or with functools.reduce:
import functools
functools.reduce(
    lambda x, y: pd.merge(x, y, on='EmployeeNumber', how='outer'),
    [df1, df2, df3, df4, ..., df15])
# the code above is evaluated like this:
# merge(merge(merge(merge(df1, df2), df3), df4), ..., df15)
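For reference, here is a minimal runnable sketch of the reduce approach. The three small frames and their Name, Dept and Score columns are made up for illustration; only the EmployeeNumber key comes from the question.

```python
import functools

import pandas as pd

# Hypothetical sample data; only the 'EmployeeNumber' key is from the question.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Name": ["Ann", "Bob"]})
df2 = pd.DataFrame({"EmployeeNumber": [2, 3], "Dept": ["HR", "IT"]})
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Score": [90, 75]})

# Fold the list pairwise: merge(merge(df1, df2), df3)
merged = functools.reduce(
    lambda left, right: pd.merge(left, right, on="EmployeeNumber", how="outer"),
    [df1, df2, df3],
)
print(merged)
```

Because every step is an outer merge, each employee number that appears in any frame is kept, and columns missing for that employee are filled with NaN.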
CodePudding user response:
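Or you can use join:
# set_index with inplace=True returns None, so use the returned copy instead
df_list = [d.set_index('EmployeeNumber') for d in [df1, df2, df3, df4]]
# how='outer' matches the merge in the question; join defaults to a left join
df_list[0].join(df_list[1:], how='outer')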
One advantage of pd.DataFrame.join over pd.DataFrame.merge is that join accepts a list of DataFrames.
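A small runnable sketch of the join approach, again with made-up sample frames (only EmployeeNumber is from the question). set_index is called without inplace=True so that it returns the re-indexed frame, and how="outer" is an assumption chosen to mirror the outer merge the question asks for.

```python
import pandas as pd

# Hypothetical sample data; only the 'EmployeeNumber' key is from the question.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Name": ["Ann", "Bob"]})
df2 = pd.DataFrame({"EmployeeNumber": [2, 3], "Dept": ["HR", "IT"]})
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Score": [90, 75]})

# Move the key into the index so join can align the frames on it.
frames = [d.set_index("EmployeeNumber") for d in [df1, df2, df3]]

# Join the first frame with the rest of the list in one call.
joined = frames[0].join(frames[1:], how="outer")
print(joined)
```

Note that when a list is passed, join does not support the lsuffix and rsuffix parameters, so the non-key column names need to be distinct across the frames (rename them first if they overlap).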