Home > OS >  Multiply values in certain columns by fixed metric if multiple conditions exist
Multiply values in certain columns by fixed metric if multiple conditions exist

Time:09-26

I have a dataset of hockey statistics and I want to apply a weight multiplier to certain statistics based on certain conditions.

A snippet of my dataset:

    Player            Pos    GP      G     GF/GP    S     Shots/GP    S%    TOI/GP
2   Andrew Cogliano   1.0   79.2    11.0    0.1   126.8      1.6     8.3    14.44
12  Artturi Lehkonen  2.0   73.0    14.6    0.2   158.6      2.2     9.3    15.29
28  Cale Makar        4.0   59.3    16.0    0.3   155.0      2.6     9.8    23.67
31  Darren Helm       1.0   66.6    10.5    0.2   125.0      1.9     8.6    14.37
61  Gabriel Landeskog 2.0   72.0    24.3    0.3   196.1      2.7     12.8   19.46
103 Nathan MacKinnon  1.0   73.8    27.8    0.4   274.4      3.7     9.9    19.69

What I am trying to do is create a function that multiplies 'G', 'GF/GP', 'S', and 'Shots/GP' by a specific weight - 1.1 for example. But I want to only do that for players based on two categories:

  • Defence ('Pos' = 4.0) with 50 or more games ('GP') and 20 min or more time on ice per game ('TOI/GP')
  • Offense ('Pos' != 4.0) with 50 or more games ('GP') and 14 min or more time on ice per game ('TOI/GP')

I can identify these groups by:

def_cond = df.loc[(df["Pos"]==4.0) & (df["GP"]>=50) & (df["TOI/GP"] >=20.00)]
off_cond = df.loc[(df["Pos"]!=4.0) & (df["GP"]>=50) & (df["TOI/GP"] >=14.00)]

Output for def_cond:

    Player        Pos   GP     G    GF/GP   S     Shots/GP   S%   TOI/GP
28  Cale Makar    4.0   59.3  16.0   0.3   155.0    2.6     9.8   23.67
41  Devon Toews   4.0   58.8  8.2    0.1   120.5    2.1     6.7   22.14
45  Erik Johnson  4.0   67.4  7.3    0.1   140.9    2.1     5.1   22.22
112 Samuel Girard 4.0   68.0  4.4    0.1   90.8     1.3     5.0   20.75

Issue:

What I want to do is take this output and multiply 'G', 'GF/GP', 'S', and 'Shots/GP' by a weight value - again 1.1 for example.

I have tried various things such as:

if def_cond == True:
        df[["G", "GF/GP", S", "Shots/GP"]].multiply(1.1, axis="index")

Or simply

if def_cond == True:
        df["G"] = (df["G"]*1.1)

Pretty much everything I try results in the following error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I am new to this so any and all advice is welcome!

CodePudding user response:

I would try this:

def f(df,weight):
    for i in df.index:
        if (
            (df.loc[i,'Pos']==4.0 and df.loc[i,'GP']>=50 
                                  and df.loc[i,'TOI/GP']>=20)
            or
            (df.loc[i,'Pos']!=4.0 and df.loc[i,'GP']>=50 
                                  and df.loc[i,'TOI/GP']>=14)
            ):
            df.loc[i,['G','GF/GP','S','Shots/GP']]*=weight

Though, i'm pretty sure, it is not the best solution...

  • Related