Home > Back-end >  How do i create code for a vlookup in python?
How do i create code for a vlookup in python?

Time:11-29

df

Season Date Team Team_Season_Code TS L Opponent Opponent_Season_Code OS
2019 20181109 Abilene_Chr 1_2019 94 Home Arkansas_St 15_2019 73
2019 20181115 Abilene_Chr 1_2019 67 Away Denver 82_2019 61
2019 20181122 Abilene_Chr 1_2019 72 N Elon 70_2019 56
2019 20181123 Abilene_Chr 1_2019 73 Away Pacific 224_2019 71
2019 20181124 Abilene_Chr 1_2019 60 N UC_Riverside 306_2019 48

Overall_Season_Avg

Team_Season_Code Team TS OS MOV
15_2019 Arkansas_St 70.909091 65.242424 5.666667
70_2019 Elon 73.636364 71.818182 1.818182
82_2019 Denver 74.03125 72.15625 1.875
224_2019 Pacific 78.333333 76.466667 1.866667
306_2019 UC_Riverside 79.545455 78.060606 1.484848

I have these two dataframes and I want to be able to look up the Opponent_Season_Code from df in Overall_Season_Avg - "Team_Season_Code" and bring back "TS" and "OS" to create a new column in df called "OOS" and "OTS" So a new column for row 1 in df should have Column name OOS with data - 65.24... and Column name OTS with data 70.90...

In excel its a simple vlookup but i haven't been able to use the solutions that i have found to the vlookup question on overflow so i decided to post my own question. I will also say that the Overall_Season_Avg dataframe was created through by Overall_Season_Avg = df.groupby(['Team_Season_Code', 'Team']).agg({'TS': np.mean, 'OS': np.mean, 'MOV': np.mean})

CodePudding user response:

You can use a merge, after reworking a bit Overall_Season_Avg :

df.merge(Overall_Season_Avg
         .set_index(['Team_Season_Code', 'Team'])
         [['OS', 'TS']].add_prefix('O'),
         left_on=['Opponent_Season_Code', 'Opponent'],
         right_index=True, how='left'
        )

Output:

   Season      Date         Team Team_Season_Code  TS     L      Opponent Opponent_Season_Code  OS        OOS        OTS
0    2019  20181109  Abilene_Chr           1_2019  94  Home   Arkansas_St              15_2019  73  65.242424  70.909091
1    2019  20181115  Abilene_Chr           1_2019  67  Away        Denver              82_2019  61  72.156250  74.031250
2    2019  20181122  Abilene_Chr           1_2019  72     N          Elon              70_2019  56  71.818182  73.636364
3    2019  20181123  Abilene_Chr           1_2019  73  Away       Pacific             224_2019  71  76.466667  78.333333
4    2019  20181124  Abilene_Chr           1_2019  60     N  UC_Riverside             306_2019  48  78.060606  79.545455
merging only on Opponent_Season_Code/Team_Season_Code:
df.merge(Overall_Season_Avg
         .set_index('Team_Season_Code')
         [['OS', 'TS']].add_prefix('O'),
         left_on=['Opponent_Season_Code'],
         right_index=True, how='left'
        )

Output:

   Season      Date         Team Team_Season_Code  TS     L      Opponent Opponent_Season_Code  OS        OOS        OTS
0    2019  20181109  Abilene_Chr           1_2019  94  Home   Arkansas_St              15_2019  73  65.242424  70.909091
1    2019  20181115  Abilene_Chr           1_2019  67  Away        Denver              82_2019  61  72.156250  74.031250
2    2019  20181122  Abilene_Chr           1_2019  72     N          Elon              70_2019  56  71.818182  73.636364
3    2019  20181123  Abilene_Chr           1_2019  73  Away       Pacific             224_2019  71  76.466667  78.333333
4    2019  20181124  Abilene_Chr           1_2019  60     N  UC_Riverside             306_2019  48  78.060606  79.545455

CodePudding user response:

df.merge(Overall_Season_Avg, on=['Team_Season_Code', 'Team'], how='left')

and rename column's names



or use transform instead agg when make Overall_Season_Avg.

but i don remain transform code becuz you don provide reproducible example

make simple and reproducible example plz

https://stackoverflow.com/help/minimal-reproducible-example

  • Related