Change column values with an arithmetic sequence using df.loc in pandas

Time:01-31

Suppose I have the following DataFrame:

import pandas as pd

data = {"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60], "alpha": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}
df = pd.DataFrame(data)

I want to change the values of column alpha based on column age, using df.loc and an arithmetic sequence, but I get a syntax error:

df.loc[((df.age <= 4)), "alpha"] = ".4"
df.loc[((df.age >= 5)) & ((df.age <= 20)), "alpha"] = 0.4 + (1 - 0.4)*((df$age - 4)/(20 - 4))
df.loc[((df.age > 20)), "alpha"] = "1"

Thank you in advance.

CodePudding user response:

Reference the age column with a dot (df.age), not a $. The df$age form is R syntax, which Python cannot parse:

df.loc[((df.age >= 5)) & ((df.age <= 20)), "alpha"] = 0.4 + (1 - 0.4)*((df.age - 4)/(20 - 4))
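For completeness, here is a minimal runnable sketch of all three df.loc assignments with that fix applied. It is not the asker's exact code: it assumes alpha should be numeric, so it uses 0.4 and 1.0 instead of the strings ".4" and "1", and it starts alpha as a float column (recent pandas versions may warn when floats are written into an integer column). The mask names low, mid and high are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60]})
df["alpha"] = 0.0  # assumption: start with a float column so fractional values fit

# Boolean masks for the three age ranges from the question
low = df["age"] <= 4
mid = (df["age"] >= 5) & (df["age"] <= 20)
high = df["age"] > 20

df.loc[low, "alpha"] = 0.4
# Linear ramp from 0.4 (at age 4) up to 1.0 (at age 20), computed only for the selected rows
df.loc[mid, "alpha"] = 0.4 + (1 - 0.4) * ((df.loc[mid, "age"] - 4) / (20 - 4))
df.loc[high, "alpha"] = 1.0

print(df)
```

Restricting the right-hand side to df.loc[mid, "age"] is optional, since pandas aligns on the index either way, but it makes it explicit which rows are being computed.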

CodePudding user response:

Instead of multiple .loc assignments, you can combine all conditions at once using nested np.where calls:

import numpy as np

df['alpha'] = np.where(df.age <= 4, ".4", np.where((df.age >= 5) & (df.age <= 20),
                                         0.4 + (1 - 0.4) * ((df.age - 4)/(20 - 4)),
                                         np.where(df.age > 20, "1", df.alpha)))
print(df)

   age   alpha
0    2      .4
1    3      .4
2    2      .4
3    5  0.4375
4    9  0.5875
5   12     0.7
6   20     1.0
7   43       1
8   55       1
9   60       1
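Note that the column printed above mixes text such as ".4" with numbers, so it is not a numeric column. If alpha is meant for further arithmetic, a variant with numeric constants may be more convenient. The sketch below is one possible way to do it, not the answer's original code: it assumes integer ages (so the second branch only needs age <= 20), and the name ramp is just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60]})

# Linear ramp from 0.4 (at age 4) to 1.0 (at age 20); evaluated for every row,
# but only used where the middle condition holds
ramp = 0.4 + (1 - 0.4) * ((df["age"] - 4) / (20 - 4))

# Nested np.where: first match wins, so the inner call only sees ages above 4
df["alpha"] = np.where(df["age"] <= 4, 0.4,
                       np.where(df["age"] <= 20, ramp, 1.0))

print(df.dtypes)
```

With numeric constants in every branch, np.where returns a float array, so alpha ends up as float64.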

CodePudding user response:

Besides the syntax error (caused by the $), I would use numpy.select to reduce visual noise:

import numpy as np

conditions = [df["age"].le(4),
              df["age"].gt(4) & df["age"].le(20),
              df["age"].gt(20)]

values = [".4", 0.4 + (1 - 0.4) * ((df["age"] - 4) / (20 - 4)), 1]

df["alpha"] = np.select(condlist=conditions, choicelist=values)

Output:

print(df)

   age   alpha
0    2      .4
1    3      .4
2    2      .4
3    5  0.4375
4    9  0.5875
5   12     0.7
6   20     1.0
7   43       1
8   55       1
9   60       1
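As with the other approaches, mixing the string ".4" with floats gives a non-numeric column, and how NumPy handles that mix may depend on the version. A hedged sketch of the same np.select idea with purely numeric choices follows. The default=np.nan argument and the alpha_interp column are additions for illustration, not part of the answer above. np.interp is a different technique swapped in as a cross-check: because the rule is a linear ramp from 0.4 at age 4 to 1.0 at age 20 that stays flat outside that range, clamped linear interpolation should give the same values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60]})
age = df["age"]

conditions = [age.le(4),
              age.gt(4) & age.le(20),
              age.gt(20)]

# Numeric choices only, so the result is a float array
values = [0.4, 0.4 + (1 - 0.4) * ((age - 4) / (20 - 4)), 1.0]

# default=np.nan marks any row that matches none of the conditions
df["alpha"] = np.select(conditions, values, default=np.nan)

# Cross-check: np.interp holds the end values constant outside [4, 20]
df["alpha_interp"] = np.interp(age, [4, 20], [0.4, 1.0])

print(np.allclose(df["alpha"], df["alpha_interp"]))
```

The np.select form is easier to extend when the pieces are not all part of one straight line. The np.interp form is shorter when they are.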