Home > front end >  How can I use one index from a data set for two different columns of a different data set without me
How can I use one index from a data set for two different columns of a different data set without me

Time:10-26

I have a bilateral data set which means that I have two columns of country-name.

df:

Country Partner Country year var1
Turkey Spain 2019 320

df2:

Country year index
Turkey 2019 0
Spain 2019 1

Desired result:

Country index_country Partner Country index_partner Year var1
Turkey 0 Spain 1 2019 320

I would like to have the desired result without merging twice.

CodePudding user response:

It is possible by unpivot (melt), left join and pivot:

df = (df1.rename_axis('idx').reset_index()
         .melt(['idx','var1','year'], value_name='Country').merge(df2, how='left')
         .pivot(index=['idx','var1','year'], columns='variable', values=['index','Country'])
           .sort_index(axis=1, level=0, sort_remaining=False)
         )
df.columns = [f'{a}_{b}' if a == 'index' else b for a, b in df.columns]
df = df.reset_index().drop('idx', axis=1)
print (df)
   var1  year Country Partner Country index_Country index_Partner Country
0   320  2019  Turkey           Spain             0                     1

Solution with double merge:

df = (df1.merge(df2.rename(columns={'index':'index_country'}), 
                how='left', 
                on=['Country','year'])
            .merge(df2.rename(columns={'index':'index_partner'}),
                      how='left', left_on=['Partner Country','year'],
                                 right_on=['Country','year'],
                                 suffixes=('','_')).drop('Country_', axis=1)
           )
print (df)
  Country Partner Country  year  var1  index_country  index_partner
0  Turkey           Spain  2019   320              0              1
  • Related