I perform a join like this:
df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
how='left')
Both of my dataframes have a column with the same name money
. After the join, the new df has two columns
money_x and money_y
one from each table joined. How can I alter this such that for any duplicate columns, only the second one stays with the original one. For example, in this case, only money_y
from the second df should stay and the original name money
should be kept instead of money_y
.
CodePudding user response:
This will make it where the 2nd df in the merge will essentially be the same while the first df in the merge will now have a '_y' included
df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
how='left', suffixes=('_y', ''))
CodePudding user response:
Select only the unique columns from the first DataFrame before join
:
df1 = df[list(set(df.columns).difference(set(df2.columns)))]
output = df1.join(df2, left_on="code", right_on="countryCode"])