I have a data frame as follows:
return | Upper | lower |
---|---|---|
50 | 70 | 20 |
10 | 15 | 3 |
I'm trying to count how many times the return is between the upper and lower values. I tried to create another boolean column that flags whether the condition is true.
for val in data['return']:
    if data['return'] < data['upper'] or data['return'] > data['lower']:
        data['Predicted'] = 1
    else:
        data['Predicted'] = 0
where data['Predicted'] should be the new column.
However, I get the error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I have tried changing the operator to |, but it didn't work. I'm new to Python and am unsure how best to solve this.
For context, my goal is to calculate how many times it has predicted correctly. I am not sure if this method is the best way.
CodePudding user response:
how many times the return is in-between the upper and lower
It seems you need the AND operator rather than OR. You could use between here instead of iterating over the rows:
data['predicted'] = data['return'].between(data['lower'], data['Upper']).astype(int)
Output:
return Upper lower predicted
0 50 70 20 1
1 10 15 3 1
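Since the stated goal is a count, the 0/1 column can simply be summed. Below is a minimal sketch that rebuilds the sample data from the question; the names in_range, hits and strict_hits are only illustrative. Note that between includes both endpoints by default. The string form inclusive='neither' (assuming pandas 1.3 or later) gives the strict comparison instead.

```python
import pandas as pd

# Sample data from the question (column names as shown in the table).
data = pd.DataFrame({'return': [50, 10], 'Upper': [70, 15], 'lower': [20, 3]})

# Boolean Series: True where lower <= return <= Upper (inclusive by default).
in_range = data['return'].between(data['lower'], data['Upper'])
data['predicted'] = in_range.astype(int)

# Summing the 0/1 column gives the number of rows inside the range.
hits = int(data['predicted'].sum())

# Strict version: lower < return < Upper (assumes pandas >= 1.3).
strict_hits = int(
    data['return'].between(data['lower'], data['Upper'], inclusive='neither').sum()
)

print(hits, strict_hits)
```

For the two sample rows both counts are the same, because neither return value sits exactly on a bound.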
The error happens because data['return'], data['upper'] etc. are Series objects, so the comparisons yield boolean Series. You can't use one in an if-statement, because it expects a single True/False value.
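To illustrate this with the question's sample data, here is a small sketch (the frame is rebuilt here, and the names mask, error_message and n_correct are only illustrative). The element-wise & operator combines two boolean Series row by row, and each comparison needs its own parentheses because & binds more tightly than < and >.

```python
import pandas as pd

# Sample data from the question (column names as shown in the table).
data = pd.DataFrame({'return': [50, 10], 'Upper': [70, 15], 'lower': [20, 3]})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (data['return'] > data['lower']) & (data['return'] < data['Upper'])

# Using a whole Series as an `if` condition raises the error from the question.
try:
    if mask:
        pass
except ValueError as err:
    error_message = str(err)

# The mask can be stored as a 0/1 column and summed to get the count.
data['Predicted'] = mask.astype(int)
n_correct = int(mask.sum())
```

The same reasoning explains why swapping or for | alone does not help: without the parentheses, | is evaluated before the comparisons.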
CodePudding user response:
Another option:
import numpy as np
import pandas as pd
x = pd.DataFrame({'upper': [3, 4, 4], 'lower': [1, 1, 2], 'return': [2, 3, 5]})
x['pred'] = 0
x.loc[np.logical_and(x['return'] < x['upper'], x['return'] > x['lower']), 'pred'] = 1
I like this solution because the same pattern (a boolean mask passed to .loc) can be reused for other conditional assignments.