Home > front end >  How to merge 3 DataFrames based on 3 columns in common
How to merge 3 DataFrames based on 3 columns in common

Time:01-20

I have three DataFrames (df_1, df_2, df_3) with 3 of 4 columns ('Date', 'Plant', 'Product') in common and I need to merge them so I get only one DataFrame with those three columns in common plus other three different columns ('Column1', 'Column2', 'Column3'), one from each DataFrame. The name of the columns would look something like this: Date Plant Product Column1 Column2 Column3

I tried making a merge function but I can't make it consider more than one column in common and more than two DataFrames

df = pd.merge(
    left=df_1,
    right=df_2,
    how='outer',
    left_on='Date',
    right_on='Product',
)

I got:

ValueError: You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat

But I don't want to concat because the rows of the three DataFrames don't match.

CodePudding user response:

You need to make sure that the data type of columns on which you match are identical. The error message tells you that the merge column in df_1 has a datetime type while in df_2 it has an object type.

The reason for this is that you are matching on Date in df_1 and Product in df_2, that's not want you want.

Based on your explanation, you want records from both data frames to be combined that share the same values for columns Date, Plant, and Product. In that case, try:

df = df_1.merge(
    df_2,
    on=['Date','Plant','Product'],
    how='outer'
)
  • Related