Home > Enterprise >  keep only one column during merge when column names are similar
keep only one column during merge when column names are similar

Time:05-25

I perform a join like this:

df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
                    how='left')

Both of my dataframes have a column with the same name money. After the join, the new df has two columns

money_x  and   money_y

one from each table joined. How can I alter this such that for any duplicate columns, only the second one stays with the original one. For example, in this case, only money_y from the second df should stay and the original name money should be kept instead of money_y.

CodePudding user response:

This will make it where the 2nd df in the merge will essentially be the same while the first df in the merge will now have a '_y' included

df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
                    how='left', suffixes=('_y', ''))

CodePudding user response:

Select only the unique columns from the first DataFrame before join:

df1 = df[list(set(df.columns).difference(set(df2.columns)))]
output = df1.join(df2, left_on="code", right_on="countryCode"])
  • Related