Home > Mobile >  More efficient way to do the same merge on multiple columns in a dataframe?
More efficient way to do the same merge on multiple columns in a dataframe?

Time:04-26

Input:

df1

    OFF_P1  OFF_P2  OFF_P3  OFF_P4  OFF_P5  GAME_ID
0   1629675 1627736 1630162 201976  1629020 22101224
1   201599  1630178 202699  1629680 201980  22101228
2   1630191 1630180 1630587 1630240 1628402 22101228
3   1627759 201143  1628464 1628369 203935  22101223
4   1630573 1630271 1630238 1628436 1630346 22101223

df2

    PLAYER_ID GAME_ID   PTS
0   201980    21900002  28
1   201586    21900001  13
2   1628366   21900001  8
3   200755    21900001  16
4   202324    21900001  6

Desired Output:

    OFF_P1  OFF_P2  OFF_P3  OFF_P4  OFF_P5  GAME_ID  OFF_P1_PTS  OFF_P2_PTS  etc...
0   1629675 1627736 1630162 201976  1629020 22101224 28          13          ...
1   201599  1630178 202699  1629680 201980  22101228 12
2   1630191 1630180 1630587 1630240 1628402 22101228 14
3   1627759 201143  1628464 1628369 203935  22101223 8
4   1630573 1630271 1630238 1628436 1630346 22101223 19

I would like to merge the PTS column from df2 to df1 but for each column of OFF_P1, OFF_P2, etc...

Is there a more efficient way to do this other than something like the below?

df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P1'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P2'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P3'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P4'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P5'], right_on=['GAME_ID', 'PLAYER_ID'])

CodePudding user response:

I would prefer the MultiIndex.map approach:

d = df2.set_index(['GAME_ID', 'PLAYER_ID'])['PTS']

for c in df1.filter(like='OFF_P'):
    df1[f'{c}_PTS'] = df1.set_index(['GAME_ID', c]).index.map(d)

print(df1)

    OFF_P1   OFF_P2   OFF_P3   OFF_P4   OFF_P5   GAME_ID  OFF_P1_PTS  OFF_P2_PTS  OFF_P3_PTS  OFF_P4_PTS  OFF_P5_PTS  OFF_P1_PTS_PTS  OFF_P2_PTS_PTS  OFF_P3_PTS_PTS  OFF_P4_PTS_PTS  OFF_P5_PTS_PTS
0  1629675  1627736  1630162   201976  1629020  22101224        28.0         NaN         NaN         NaN         NaN             NaN             NaN             NaN             NaN             NaN
1   201599  1630178   202699  1629680   201980  22101228         NaN        13.0         NaN         NaN         NaN             NaN             NaN             NaN             NaN             NaN
2  1630191  1630180  1630587  1630240  1628402  22101228         NaN         NaN         NaN         NaN         NaN             NaN             NaN             NaN             NaN             NaN
3  1627759   201143  1628464  1628369   203935  22101223        16.0         NaN         NaN         NaN         NaN             NaN             NaN             NaN             NaN             NaN
4  1630573  1630271  1630238  1628436  1630346  22101223         NaN         NaN         NaN         NaN         NaN             NaN             NaN             NaN             NaN             NaN
  • Related