I am trying to build an if/else condition for a data frame, but it gives me an invalid syntax error. The data is below:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 30, size=10),
                  columns=["Random"],
                  index=pd.date_range("20180101", periods=10))
df = df.reset_index()
df['Recommandation'] = ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No']
df['diff'] = [3, 2, 4, 1, 6, 1, 2, 2, 3, 1]
df
I am trying to create another column, 'new', using the following conditions:
If the 'index' is one of the first three dates, then 'new' = 'Random',
elif 'Recommandation' is 'Yes', then 'new' = value of the previous row of the 'Random' column + 'diff',
else: 'new' = value of the previous row of the 'Random' column.
My code is below:
def my_fun(df, Recommendation, random, index, diff):
    print (x)
    if df[(df['index']=='2018-01-01')|(df['index']=='2018-01-02')|(df['index']=='2018-01-03')] :
        x = df['random']
    elif (df[df['recommendation']=='Yes']):
        x = df['random'].shift(1) + df['diff']
    else:
        x = df['random'].shift(1)
    return x
#The expected output:
df['new'] = [22, 20, 10, 31, 26, 6, 27, 5, 10, 13]
df
CodePudding user response:
Following your conditions, the code should be:
import numpy as np
df['new'] = np.select([df['index'].isin(df['index'].iloc[:3]), df['Recommandation'].eq('Yes')],
                      [df['Random'], df['diff'] + df['Random'].shift(1)],
                      df['Random'].shift(1)
                      )
output:
       index  Random Recommandation  diff   new
0 2018-01-01      22             No     3  22.0
1 2018-01-02      21            Yes     2  21.0
2 2018-01-03      29             No     4  29.0
3 2018-01-04      19            Yes     1  30.0
4 2018-01-05       1            Yes     6  25.0
5 2018-01-06       8            Yes     1   2.0
6 2018-01-07       0             No     2   8.0
7 2018-01-08       4             No     2   0.0
8 2018-01-09      27            Yes     3   7.0
9 2018-01-10      27             No     1  27.0
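Since the question builds 'Random' from np.random.randint, the numbers change on every run. To check the logic against the table above, here is a minimal self-contained sketch that hard-codes the 'Random' values shown in that table (an assumption made only for reproducibility); the names prev, conditions and choices are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Fixed values copied from the output table above, so the result is reproducible.
df = pd.DataFrame({
    "index": pd.date_range("20180101", periods=10),
    "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
    "Recommandation": ["No", "Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"],
    "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
})

# Previous row's 'Random' value; the first row has no predecessor and becomes NaN.
prev = df["Random"].shift(1)

conditions = [
    df["index"].isin(df["index"].iloc[:3]),  # "if": one of the first three dates
    df["Recommandation"].eq("Yes"),          # "elif": recommendation is 'Yes'
]
choices = [
    df["Random"],       # value used when the first condition holds
    prev + df["diff"],  # value used when only the second condition holds
]

# np.select takes, per row, the choice of the first condition that is True,
# and falls back to `default` when none is True (the "else" branch).
df["new"] = np.select(conditions, choices, default=prev)
print(df)
```

The 'new' column comes out as floats because shift(1) introduces a NaN, which forces a float dtype even though that NaN is never selected here.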
CodePudding user response:
In your else clause, x[i] is wrong because i is not defined.
If you want to add a column to your dataframe, you must use code such as the following:
if ....:
    df['newcol']=...
elif ...:
    df['newcol']=...
.
.
.
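One caveat with the pattern above: a plain if on a whole DataFrame or Series raises a ValueError in pandas (the truth value is ambiguous), so the branches cannot be tested row by row that way. One possible way to keep the if/elif/else shape is to assign through boolean masks with df.loc. This is only a sketch, assuming the column names from the question and fixed 'Random' values chosen for illustration; first_three, is_yes and prev are hypothetical helper names.

```python
import pandas as pd

# Illustrative fixed data with the same columns as in the question.
df = pd.DataFrame({
    "index": pd.date_range("20180101", periods=10),
    "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
    "Recommandation": ["No", "Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"],
    "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
})

prev = df["Random"].shift(1)             # previous row's 'Random' value
first_three = df.index < 3               # mask for the first three rows (dates)
is_yes = df["Recommandation"].eq("Yes")  # mask for rows recommended 'Yes'

# Assign in reverse priority so that the highest-priority branch is written last.
df["newcol"] = prev                            # else branch
df.loc[is_yes, "newcol"] = prev + df["diff"]   # elif branch
df.loc[first_three, "newcol"] = df["Random"]   # if branch, overrides the others
print(df)
```

Writing the branches from lowest to highest priority mimics if/elif/else: a row matching several masks ends up with the value of the last assignment, which is the first condition in the original logic.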