I have a question about my code, where I am trying to use an if/else to derive a value in a DataFrame:
import pandas as pd
import numpy as np
c=15
s={'yr':[2014,2014,2014,2014],'value':[10,20,20,50]}
p=pd.DataFrame(data=s)
if (p['value'])>= c:
    p['qty']=c-p['value']
else:
    p['value']
The code above raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Basically, this is my expected output:
     yr  value  qty
0  2014     10   10
1  2014     20    5
2  2014     20    5
3  2014     50   35
How should I solve this error?
CodePudding user response:
You can select the rows to change with .loc:
# initialize the qty column
p['qty'] = p['value']
# subtract c where value is at least c
p.loc[p['value'] >= c, 'qty'] -= c
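For reference, here is a self-contained sketch of the .loc approach using the question's frame p and threshold c. It assumes the intended result is value - c for rows at or above the threshold (which is what the expected output shows), and it uses >= to follow the condition in the question.

```python
import pandas as pd

c = 15
p = pd.DataFrame({'yr': [2014, 2014, 2014, 2014], 'value': [10, 20, 20, 50]})

# Start qty as a copy of value, then subtract c only in the rows
# selected by the boolean mask (assumption: value - c is the intent).
p['qty'] = p['value']
p.loc[p['value'] >= c, 'qty'] -= c

print(p)
```

Rows that do not match the mask keep their original value, which plays the role of the else branch.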
CodePudding user response:
if expects a boolean value (True/False) but (p['value'])>= c is a Series, so you're getting that error. One way to get the desired output is to use mask:
p['qty'] = p['value'].mask(lambda x: x>=c, p['value']-c)
Another option is to use numpy.where:
import numpy as np
p['qty'] = np.where(p['value']>=c, p['value']-c, p['value'])
Output:
     yr  value  qty
0  2014     10   10
1  2014     20    5
2  2014     20    5
3  2014     50   35
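To see why the original if fails, it may help to look at what the comparison actually produces. The sketch below uses the question's data; the names cond, message, any_row and all_rows are only illustrative. Note that .any() and .all(), which the error message suggests, collapse the whole column to one True/False, so they answer a different question from the row-by-row one being asked here.

```python
import pandas as pd

c = 15
p = pd.DataFrame({'yr': [2014] * 4, 'value': [10, 20, 20, 50]})

cond = p['value'] >= c   # a boolean Series with one entry per row
print(cond.tolist())

message = None
try:
    if cond:             # a Series has no single truth value
        pass
except ValueError as err:
    message = str(err)
    print(message)

# Reducing to a single bool works, but describes the column as a whole:
any_row = bool(cond.any())    # is at least one row >= c?
all_rows = bool(cond.all())   # is every row >= c?
```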
CodePudding user response:
You're trying to operate element-wise on a column, which a plain if statement cannot do. One option is to apply a function to each value:
p['qty'] = p['value'].apply(lambda x: x - c if x >= c else x)
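The same idea can be written with a named function, which may be easier to read than a lambda. This is a sketch assuming the expected output's value - c for rows at or above the threshold; qty_for is a hypothetical helper name, not part of pandas. The last lines compare the result with the vectorised numpy.where version, which is typically faster on large frames because it avoids a Python-level call per row.

```python
import numpy as np
import pandas as pd

c = 15
p = pd.DataFrame({'yr': [2014] * 4, 'value': [10, 20, 20, 50]})

def qty_for(value, threshold=c):
    """Hypothetical helper: subtract the threshold only when value reaches it."""
    return value - threshold if value >= threshold else value

# apply calls qty_for once per element of the column
p['qty'] = p['value'].apply(qty_for)

# Vectorised equivalent, used here only as a cross-check
vectorised = np.where(p['value'] >= c, p['value'] - c, p['value'])
same = bool((p['qty'].to_numpy() == vectorised).all())
print(p)
```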
CodePudding user response:
Solution using np.where, assuming your intended calculation is p['value']-c rather than c-p['value']:
p['qty'] = np.where(p['value'] >= c, p['value']-c, p['value'])
Result:
     yr  value  qty
0  2014     10   10
1  2014     20    5
2  2014     20    5
3  2014     50   35