Function that creates new column based on filtering input selection

Time:11-14

I want to create a new column that multiplies the values of the column pt_nm by a predefined value, depending on which name is selected in a variable.

df["pt_nm"] looks like this:

0    0.0
1    1.0
2    1.0
3    2.0
4    1.0
dtype: float64

The variables available to select from are:

types = ["E", "S", "EK"]
r_type = "E"

pt_s= 25
pt_e = 60
pt_ek = 45

I tried the following, which doesn't work:

def race (r_type, pt_nm):
    if r_type == "E":
        pt_nm* pt_e
    elif r_type == "S":
        pt_nm* pt_s
    else:
        pt_nm* pt_ek

df["pt_new"] = df["pt_nm"].apply(race, axis = 1)

I assume the problem is probably in the arguments? An explanation on how the function would work is appreciated! :)

CodePudding user response:

Use Series.pipe to pass the complete Series to the function, and add return statements so the function actually returns the result:

types = ["E", "S", "EK"]
r_type = "E"

pt_s= 25
pt_e = 60
pt_ek = 45

# arguments swapped: the Series comes first, because pipe passes it as the first argument
def race (pt_nm, r_type):
    if r_type == "E":
        return  pt_nm* pt_e
    elif r_type == "S":
        return pt_nm* pt_s
    else:
        return pt_nm* pt_ek

df["pt_new"] = df["pt_nm"].pipe(race, r_type)
# alternative: call the function directly
#df["pt_new"] = race(df["pt_nm"], r_type)
print (df)
   pt_nm  pt_new
0    0.0     0.0
1    1.0    60.0
2    1.0    60.0
3    2.0   120.0
4    1.0    60.0
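
As for why the original attempt fails, two things seem to be going on. First, each branch computes a product but never returns it, so the function returns None. Second, axis is a parameter of DataFrame.apply, not Series.apply; Series.apply forwards unknown keyword arguments to the function, which does not accept one named axis. The sketch below illustrates only the first point, using simplified one-argument functions (race_no_return and race_with_return are illustrative names, not from the question):

```python
import pandas as pd

df = pd.DataFrame({"pt_nm": [0.0, 1.0, 1.0, 2.0, 1.0]})
pt_e = 60

def race_no_return(pt_nm):
    # The product is computed and then discarded; the function returns None.
    pt_nm * pt_e

def race_with_return(pt_nm):
    # Returning the product hands the value back to apply.
    return pt_nm * pt_e

broken = df["pt_nm"].apply(race_no_return)
fixed = df["pt_nm"].apply(race_with_return)

print(broken.isna().all())  # True: every entry is missing
print(fixed.tolist())       # [0.0, 60.0, 60.0, 120.0, 60.0]
```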

CodePudding user response:

Can you try this? The function needs return statements, and the lambda passes each value together with r_type:

def race (r_type, pt_nm):
    if r_type == "E":
        return pt_nm* pt_e
    elif r_type == "S":
        return pt_nm* pt_s
    else:
        return pt_nm* pt_ek

# r_type is the first parameter of race, so pass it first and the value second
df["pt_new"] = df["pt_nm"].apply(lambda x: race(r_type, x))
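
If you would rather avoid the lambda, Series.apply accepts an args tuple whose items are passed to the function after each value. A minimal sketch, assuming the function is defined with the value first and r_type second (the opposite order from the definition above), and using "S" as the selected type purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({"pt_nm": [0.0, 1.0, 1.0, 2.0, 1.0]})

pt_s = 25
pt_e = 60
pt_ek = 45
r_type = "S"  # illustrative choice; the question uses "E"

def race(pt_nm, r_type):
    # pt_nm is a single value here, because Series.apply calls race once per element.
    if r_type == "E":
        return pt_nm * pt_e
    elif r_type == "S":
        return pt_nm * pt_s
    else:
        return pt_nm * pt_ek

# args supplies the extra positional arguments that follow each value.
df["pt_new"] = df["pt_nm"].apply(race, args=(r_type,))
print(df["pt_new"].tolist())  # [0.0, 25.0, 25.0, 50.0, 25.0]
```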

CodePudding user response:

You can use a dictionary to look up the scalar for the provided type and use that scalar in the apply function. This gives you your desired output:

import pandas as pd

df = pd.DataFrame([0.0, 1.0, 1.0, 2.0, 1.0], columns = ["pt_nm"])

r_type = "E"

types = {"E": 60, "S": 25, "EK": 45}
scalar = types[r_type]
df["pt_new"] = df["pt_nm"].apply(lambda x: x*scalar)

print(df)

Out:

   pt_nm  pt_new
0    0.0     0.0
1    1.0    60.0
2    1.0    60.0
3    2.0   120.0
4    1.0    60.0
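
Since the multiplier is a single number, apply is not strictly needed: multiplying the Series by the scalar is vectorized and gives the same result. A minimal sketch of that variant, where add_points and points are hypothetical names, and where an unknown type raises an error instead of silently falling back to the "EK" value as the if/else version in the question would:

```python
import pandas as pd

df = pd.DataFrame({"pt_nm": [0.0, 1.0, 1.0, 2.0, 1.0]})

# Multiplier for each selectable type (values taken from the question).
points = {"E": 60, "S": 25, "EK": 45}

def add_points(frame, r_type):
    # Fail loudly on an unknown type rather than guessing a default.
    if r_type not in points:
        raise ValueError(f"unknown r_type: {r_type!r}")
    out = frame.copy()  # leave the caller's DataFrame untouched
    out["pt_new"] = out["pt_nm"] * points[r_type]  # whole-column multiplication
    return out

result = add_points(df, "EK")
print(result["pt_new"].tolist())  # [0.0, 45.0, 45.0, 90.0, 45.0]
```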