I have a bilateral data set which means that I have two columns of country-name.
df:
Country | Partner Country | year | var1 |
---|---|---|---|
Turkey | Spain | 2019 | 320 |
df2:
Country | year | index |
---|---|---|
Turkey | 2019 | 0 |
Spain | 2019 | 1 |
Desired result:
Country | index_country | Partner Country | index_partner | Year | var1 |
---|---|---|---|---|---|
Turkey | 0 | Spain | 1 | 2019 | 320 |
I would like to have the desired result without merging twice.
CodePudding user response:
It is possible by unpivot (melt), left join and pivot
:
df = (df1.rename_axis('idx').reset_index()
.melt(['idx','var1','year'], value_name='Country').merge(df2, how='left')
.pivot(index=['idx','var1','year'], columns='variable', values=['index','Country'])
.sort_index(axis=1, level=0, sort_remaining=False)
)
df.columns = [f'{a}_{b}' if a == 'index' else b for a, b in df.columns]
df = df.reset_index().drop('idx', axis=1)
print (df)
var1 year Country Partner Country index_Country index_Partner Country
0 320 2019 Turkey Spain 0 1
Solution with double merge
:
df = (df1.merge(df2.rename(columns={'index':'index_country'}),
how='left',
on=['Country','year'])
.merge(df2.rename(columns={'index':'index_partner'}),
how='left', left_on=['Partner Country','year'],
right_on=['Country','year'],
suffixes=('','_')).drop('Country_', axis=1)
)
print (df)
Country Partner Country year var1 index_country index_partner
0 Turkey Spain 2019 320 0 1