Home > Software design >  Python Pandas Column Calculation
Python Pandas Column Calculation

Time:12-10

Below is an example of a pandas database:

     player  games  result
855    Zoom    156       0
856    Zoom    213       1
857  Zothve      7       0
858  Zothve      2       1
859      zs     55       0
860      zs     20       1
861   Zvene     34       0
862   Zvene     25       1
863  Zz1tai     54       0
864  Zz1tai     60       1

I have another dataframe with players and their win rates

       player  winPercentage
0        Zoom        71.6049
1          zs        71.4286
2       Zvene        71.2025
3      zz1tai        70.3704
4  DREAMPYLLA        69.8113
5        Khan        69.2857
6      Looper        68.4932
7      Buggax        67.2269
8       Mouse        67.1053
9        Apii        66.6667

As you can see, each player has 2 columns associated with their name based on the result, where a 1 is a win and a 0 is a loss. I am trying to take the win percentage for the player from table B and put it in a new column on table A. I've tried a couple permutations including what is below and I haven't found a solution that works.

for name in a['player']:
    wp = b['winPercentage'].loc[b['player']==name].to_string(index=False)
    try:
        if zip(a['player'], a['result']) == (name, 1):
            a['WinPercentage'] = float(wp)
        else:
            a['WinPercentage'] = 1-float(wp)
    except Exception as e:
        print(e)

I know if it was all numeric I could use numpy and I know I'm not supposed to iterate down the pandas rows. I know that a lambda function is an option but I haven't been able to get one to work. I'm looking for advice in not only a way to make this work but also the most pythonic way to do it.

CodePudding user response:

You're just trying to join the two dataframes based on the player column. So :

df = df0.merge(df1, on="player")

Where df0 and df1 are the first and second dataframes you described, respectively.

CodePudding user response:

Merge the two dataframes, then search for where result = 0, and change to np.nan.

df = df1.merge(df2, on="player", how='left')
df['winPercentage'] = np.where(df['result']==0, np.nan, df['winPercentage'])

   player  games  result  winPercentage
0    Zoom    156       0            NaN
1    Zoom    213       1         71.605
2  Zothve      7       0            NaN
3  Zothve      2       1            NaN
4      zs     55       0            NaN
5      zs     20       1         71.429
6   Zvene     34       0            NaN
7   Zvene     25       1         71.203
8  Zz1tai     54       0            NaN
9  Zz1tai     60       1            NaN
  • Related