Home > database >  Python: Lambda Function to Return Values Based on Multiple Conditions
Python: Lambda Function to Return Values Based on Multiple Conditions

Time:03-30

I am trying to apply a lambda function to a DataFrame where I check multiple conditions.

The relevant columns in the DataFrame:

shooter_id shot_made player_id_1 total_points free_throw_id free_throw_made time
NaN NaN 42 NaN NaN NaN 1
42 True 42 2 NaN NaN 2
30 True 42 2 NaN NaN 3
NaN NaN 42 NaN 42 True 3
NaN NaN 42 NaN 42 False 4
42 True 42 5 NaN NaN 5

I want to add a column to the DataFrame that has the most recent total_points values, while also adding 1 to the rows where the free_throw_made = True, as the total_points does not reflect these...

shooter_id shot_made player_id_1 total_points free_throw_id free_throw_made time player_id_1_total_points
NaN NaN 42 NaN NaN NaN 1 0
42 True 42 2 NaN NaN 2 2
30 True 42 2 NaN NaN 3 2
NaN NaN 42 NaN 42 True 3 3
NaN NaN 42 NaN 42 False 4 3
42 True 42 5 NaN NaN 5 5

I've tried a few different bits of code, but I can't work out the proper logic or syntax.

For example, I ran this:

def points(x):
        if x == df['shooter_id'] & df['shot_made']:
            return df['total_points']
        elif x == df['free_throw_id'] & df['free_throw_made']:
            df['total_points']  = 1 
        else:
            return df['total_points']

df['player_id_1_total_points'] = df['player_id_1'].apply(lambda x: points(x))

Not only did this return an error (unsupported operand types float and bool), as I'm writing this I'm also realizing it would not return the most recent total_points value...

Any guidance would be extremely appreciated!

CodePudding user response:

First group the dataframe by player_id_1 then for each group forward fill the values in total_points then add with free_throw_made, finally take the cumulative max to calculate most recent total points:

def func(g):
    return (
        g['total_points'].ffill()
        .add(g['free_throw_made'], fill_value=0)
        .fillna(0).cummax().to_frame()
    )


df['player_id_1_total_points'] = df.groupby('player_id_1').apply(func)

Result

   shooter_id shot_made  player_id_1  total_points  free_throw_id free_throw_made  time  player_id_1_total_points
0         NaN       NaN           42           NaN            NaN             NaN     1                       0.0
1        42.0      True           42           2.0            NaN             NaN     2                       2.0
2        30.0      True           42           2.0            NaN             NaN     3                       2.0
3         NaN       NaN           42           NaN           42.0            True     3                       3.0
4         NaN       NaN           42           NaN           42.0           False     4                       3.0
5        42.0      True           42           5.0            NaN             NaN     5                       5.0

CodePudding user response:

You don't need a function, you can directly use vectorial code:

g = df.assign(free_throw_made=df['free_throw_made'].eq(True)).groupby('player_id_1')
df['player_id_1_total_points'] = (g['total_points'].ffill().fillna(0, downcast='infer')
                                   g['free_throw_made'].cumsum()
                                 )

output:

   shooter_id shot_made  player_id_1  total_points  free_throw_id  \
0         NaN       NaN           42           NaN            NaN   
1        42.0      True           42           2.0            NaN   
2        30.0      True           42           2.0            NaN   
3         NaN       NaN           42           NaN           42.0   
4         NaN       NaN           42           NaN           42.0   
5        42.0      True           42           5.0            NaN   

  free_throw_made  time  player_id_1_total_points  
0             NaN     1                         0  
1             NaN     2                         2  
2             NaN     3                         2  
3            True     3                         3  
4           False     4                         3  
5             NaN     5                         6  
  • Related