Function to assign values to a category based on column value does not work, pd dataframe

Time:12-10

I want to assign each row to a category based on the value in a specific column of my dataframe. Here is the function:

def assign_SOP (df):
  if df['Strikeouts per Pitches'] <= 0.053571:  
    return 'Below average'
  elif df['Strikeouts per Pitches'] >= 0.053571 and df['Strikes per Pitches'] < 0.059794:
    return 'Average'
  elif df['Strikeouts per Pitches'] >= 0.059794 and df['Strikes per Pitches'] < 0.068870:
    return 'Above Average'
  elif df['Strikeouts per Pitches'] >= 0.068870:
    return 'Elite'
# Creating columns for each category
df_MLB['SP Category'] = df_MLB.apply(assign_SP, axis=1)
df_MLB['SOP Category'] = df_MLB.apply(assign_SOP, axis=1)

Somehow it only works for 'Below average' and 'Elite'.


I used almost the same function for another column and it worked:

def assign_SP (df):
  if df['Strikes per Pitches'] <= 0.645129:
    return 'Below average'
  elif df['Strikes per Pitches'] >= 0.645129 and df['Strikes per Pitches'] < 0.656995:
    return 'Average'
  elif df['Strikes per Pitches'] >= 0.656995 and df['Strikes per Pitches'] < 0.672696:
    return 'Above Average'
  elif df['Strikes per Pitches'] >= 0.672696:
    return 'Elite'


Can someone help me out here?

CodePudding user response:

I would use pandas.cut to save time, energy, and memory:

import numpy as np
import pandas as pd

categories = ['Below average', 'Average', 'Above Average', 'Elite']

# Bin edges; intervals are closed on the right by default
values = [0, 0.053571, 0.059794, 0.068870, np.inf]

df["SOP Category"] = pd.cut(df["Strikeouts per Pitches"], bins=values, labels=categories, include_lowest=True)

# Output:

print(df)
    Strikeouts per Pitches   SOP Category
0                 0.064281  Above Average
1                 0.054225        Average
2                 0.064516  Above Average
3                 0.063732  Above Average
4                 0.060326  Above Average
5                 0.056730        Average
6                 0.078766          Elite
7                 0.068870  Above Average
8                 0.058195        Average
9                 0.052836  Below average
10                0.050294  Below average
11                0.057866        Average
12                0.074221          Elite
13                0.059794        Average
14                0.052574  Below average
15                0.045643  Below average
16                0.048541  Below average
17                0.065417  Above Average
18                0.064903  Above Average
19                0.077328          Elite
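
As for why the original assign_SOP only ever returned 'Below average' and 'Elite': the two middle branches appear to test the wrong column. Their upper bounds are checked against 'Strikes per Pitches', which (judging by the thresholds in assign_SP) sits around 0.65 and so is never below 0.059794 or 0.068870. Rows in the middle ranges fall through every branch and the function returns None. Below is a minimal sketch of the function with every branch reading 'Strikeouts per Pitches'. The names LOW, MID, HIGH and df_demo are only for illustration, and the 'Strikes per Pitches' values in the demo frame are made up.

```python
import pandas as pd

# Thresholds copied from the question's assign_SOP.
LOW, MID, HIGH = 0.053571, 0.059794, 0.068870

def assign_SOP(row):
    # Every branch reads the same column, 'Strikeouts per Pitches'.
    sop = row['Strikeouts per Pitches']
    if sop <= LOW:
        return 'Below average'
    elif sop < MID:
        return 'Average'
    elif sop < HIGH:
        return 'Above Average'
    return 'Elite'

# Small illustrative frame (hypothetical data), one row per branch.
df_demo = pd.DataFrame({
    'Strikeouts per Pitches': [0.050294, 0.056730, 0.064281, 0.078766],
    'Strikes per Pitches': [0.640000, 0.650000, 0.660000, 0.680000],
})
df_demo['SOP Category'] = df_demo.apply(assign_SOP, axis=1)
print(df_demo)
```

Note that this keeps the question's edge handling (a value exactly equal to 0.059794 counts as 'Above Average'), whereas pd.cut with its default right-closed bins puts that value in 'Average', as the output above shows.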

NB: You have to make a separate cut for each column.
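
One way to handle both columns might be to keep the bin edges in a small dictionary and loop over it. This is only a sketch: the thresholds come from the two functions in the question, while the bins_by_column mapping and the df_demo frame (with made-up 'Strikes per Pitches' values) are assumptions for illustration.

```python
import numpy as np
import pandas as pd

categories = ['Below average', 'Average', 'Above Average', 'Elite']

# Source column -> (new column name, bin edges).
# Edges are taken from the thresholds in the question; the mapping itself is illustrative.
bins_by_column = {
    'Strikeouts per Pitches': ('SOP Category', [0, 0.053571, 0.059794, 0.068870, np.inf]),
    'Strikes per Pitches': ('SP Category', [0, 0.645129, 0.656995, 0.672696, np.inf]),
}

# Hypothetical data, just enough to hit every category in both columns.
df_demo = pd.DataFrame({
    'Strikeouts per Pitches': [0.050294, 0.056730, 0.064281, 0.078766],
    'Strikes per Pitches': [0.640000, 0.650000, 0.660000, 0.680000],
})

# One pd.cut call per column, each with its own edges.
for source_col, (target_col, edges) in bins_by_column.items():
    df_demo[target_col] = pd.cut(
        df_demo[source_col], bins=edges, labels=categories, include_lowest=True
    )

print(df_demo)
```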
