I have a pandas DataFrame of stats from four NBA seasons, starting with 2017-18, where the season has been converted into dummy variables as seen below.
Salary VORP ... Season_2019-20 Season_2020-21
Player ...
Nikola Jokić 29542010.0 0.931373 ... 0 1
James Harden 28299399.0 0.843137 ... 0 0
James Harden 30570000.0 1.000000 ... 0 0
Giannis Antetokounmpo 24157304.0 0.813725 ... 0 0
Rudy Gobert 23491573.0 0.558824 ... 0 0
I want to divide each row's salary by that season's salary cap using the function below.
def pct_cap(row):
    if row['Season_2017-18'] == 1:
        return final_data['Salary'] / 99093000
    if row['Season_2018-19'] == 1:
        return final_data['Salary'] / 101869000
    if row['Season_2019-20'] == 1:
        return final_data['Salary'] / 109140000
    if row['Season_2020-21'] == 1:
        return final_data['Salary'] / 109140000
    return 1
However, when I apply the function using the code below, it completely changes the shape of the dataframe as it appears to be applying the function to every column instead of just the Salary column.
What is the logic that is occurring with this function and what would be the best way to divide the salary by the salary cap? I'm a beginner and any help would be greatly appreciated.
x = final_data.apply(lambda row: pct_cap(row), axis=1)
Player Nikola Jokić James Harden ... Alec Burks Vince Carter
Player ...
Nikola Jokić 0.270680 0.259294 ... 0.099372 0.021934
James Harden 0.298124 0.285584 ... 0.109448 0.024158
James Harden 0.290000 0.277802 ... 0.106465 0.023500
Giannis Antetokounmpo 0.290000 0.277802 ... 0.106465 0.023500
Rudy Gobert 0.290000 0.277802 ... 0.106465 0.023500
Answer:
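The function is not being applied to every column. With axis=1, apply calls pct_cap once per row, but each call returns final_data['Salary'] / cap, which is the whole Salary column (a Series with one value per player) rather than a single number. pandas expands each returned Series into columns, so you end up with a players-by-players DataFrame. The function should return the salary of the current row's player, not the salaries of everyone.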
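To see the shape change in isolation, here is a minimal sketch. The three-row frame `demo` and the helper names `returns_series` and `returns_scalar` are made up for illustration and are not part of the original data; the divisor 100 is arbitrary.

```python
import pandas as pd

# Hypothetical three-row frame standing in for final_data.
demo = pd.DataFrame(
    {"Salary": [100.0, 200.0, 300.0], "Season_2017-18": [1, 1, 1]},
    index=["A", "B", "C"],
)

def returns_series(row):
    # Mirrors the bug: ignores `row` and divides the whole column,
    # so every call returns a Series with one value per player.
    return demo["Salary"] / 100

def returns_scalar(row):
    # Uses only the current row, so every call returns one number.
    return row["Salary"] / 100

wide = demo.apply(returns_series, axis=1)    # DataFrame, one column per player
narrow = demo.apply(returns_scalar, axis=1)  # Series, one value per player

print(wide.shape)
print(narrow.shape)
```

The first result is square (one row and one column per player), which matches the output in the question; the second keeps one value per row.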
Try it like this:
def pct_cap(row):
    if row['Season_2017-18'] == 1:
        return row['Salary'] / 99093000
    if row['Season_2018-19'] == 1:
        return row['Salary'] / 101869000
    if row['Season_2019-20'] == 1:
        return row['Salary'] / 109140000
    if row['Season_2020-21'] == 1:
        return row['Salary'] / 109140000
    return 1

x = final_data.apply(pct_cap, axis=1)
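If the row-by-row apply turns out to be slow, a vectorized version is one possible alternative. This is only a sketch: the sample rows, the `CAPS` dictionary name, and the `pct_cap` column name are assumptions for illustration, while the cap values are copied from the function above. It relies on each row having exactly one season dummy set to 1, so the dot product of the dummy columns with the caps picks out that row's cap.

```python
import pandas as pd

# Salary cap per season dummy column (values taken from pct_cap above).
CAPS = {
    "Season_2017-18": 99093000,
    "Season_2018-19": 101869000,
    "Season_2019-20": 109140000,
    "Season_2020-21": 109140000,
}

# Hypothetical stand-in for final_data; salaries are invented round fractions of the cap.
final_data = pd.DataFrame(
    {
        "Salary": [9909300.0, 20373800.0, 32742000.0],
        "Season_2017-18": [1, 0, 0],
        "Season_2018-19": [0, 1, 0],
        "Season_2019-20": [0, 0, 0],
        "Season_2020-21": [0, 0, 1],
    },
    index=["A", "B", "C"],
)

caps = pd.Series(CAPS)

# Each row has a single 1 among the dummies, so the dot product
# returns the cap for that row's season.
row_cap = final_data[caps.index].dot(caps)

# Element-wise division of two aligned Series: one value per row.
final_data["pct_cap"] = final_data["Salary"] / row_cap
print(final_data[["Salary", "pct_cap"]])
```

One difference from the apply version: a row with no season flag set would get a cap of 0 and a result of inf here, whereas pct_cap falls back to returning 1.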