I have three DataFrames (df_1, df_2, df_3) with 3 of 4 columns ('Date', 'Plant', 'Product') in common and I need to merge them so I get only one DataFrame with those three columns in common plus other three different columns ('Column1', 'Column2', 'Column3'), one from each DataFrame. The name of the columns would look something like this: Date Plant Product Column1 Column2 Column3
I tried making a merge function but I can't make it consider more than one column in common and more than two DataFrames
df = pd.merge(
left=df_1,
right=df_2,
how='outer',
left_on='Date',
right_on='Product',
)
I got:
ValueError: You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat
But I don't want to concat because the rows of the three DataFrames don't match.
CodePudding user response:
You need to make sure that the data type of columns on which you match are identical. The error message tells you that the merge column in df_1
has a datetime type while in df_2
it has an object type.
The reason for this is that you are matching on Date
in df_1
and Product
in df_2
, that's not want you want.
Based on your explanation, you want records from both data frames to be combined that share the same values for columns Date
, Plant
, and Product
. In that case, try:
df = df_1.merge(
df_2,
on=['Date','Plant','Product'],
how='outer'
)