I have a DataFrame with a Price column containing 10,000 values ranging from $0 to $399.99.
I'm trying to sort the values into price bands, but I'm getting incorrect results.
The price values are:
array([ 0. , 4.99, 3.99, 6.99, 1.49, 2.99, 7.99, 5.99,
3.49, 1.99, 9.99, 7.49, 0.99, 9. , 5.49, 10. ,
24.99, 11.99, 79.99, 16.99, 14.99, 1. , 29.99, 12.99,
109.99, 154.99, 3.08, 2.59, 4.8 , 1.96, 19.4 , 3.9 ,
4.59, 15.46, 3.04, 4.29, 2.6 , 3.28, 4.6 , 28.99,
2.95, 2.9 , 1.97, 200. , 89.99, 2.56, 30.99, 3.61,
394.99, 1.26, 1.2 , 1.04], dtype=float32)
I tried the code below, but I get the wrong output even though there are values greater than 28:
if data['Price'].any() > 28:
    print('Max')
# is returning False
def Priceband():
    if data['Price'].any() < 7:
        print('Cheap')
    if data['Price'].any() >= 7 & data['Price'].any() < 14:
        print('Normal')
    if data['Price'].any() >= 14 & data['Price'].any() < 21:
        print('Slight Expensive')
    if data['Price'].any() >= 21 & data['Price'].any() < 28:
        print('Expensive')
    if data['Price'].any() > 28:
        print('Max')
I get False even for conditions that should be True.
CodePudding user response:
The use of any is not correct. Additionally, by using elif you avoid unnecessary if checks. Note that combining two conditions on a Series requires the element-wise & operator with each comparison in parentheses; using and raises a ValueError because the truth value of a Series is ambiguous:
if (data['Price'] < 7).any():
    print('Cheap')
elif ((data['Price'] >= 7) & (data['Price'] < 14)).any():
    print('Normal')
elif ((data['Price'] >= 14) & (data['Price'] < 21)).any():
    print('Slight Expensive')
elif ((data['Price'] >= 21) & (data['Price'] < 28)).any():
    print('Expensive')
elif (data['Price'] >= 28).any():
    print('Max')
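However, consider using pandas.cut. If one of the band boundaries changes, you only have to adjust one hard-coded number instead of two, and the code is easier to read. It also assigns a label to every row instead of printing a single label for the whole column. By default pandas.cut uses right-closed bins, so a price of 0 would fall outside the first bin and become NaN; passing right=False makes each bin include its left edge, which matches the < 7 / >= 7 comparisons in the question:
bins = [0, 7, 14, 21, 28, 400]
labels = ['Cheap', 'Normal', 'Slight Expensive', 'Expensive', 'Max']
# right=False gives the intervals [0, 7), [7, 14), [14, 21), [21, 28), [28, 400)
data['Price Band'] = pd.cut(data['Price'], bins=bins, labels=labels, right=False)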
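As a minimal sketch of how the binning behaves, the snippet below applies pd.cut to a handful of prices taken from the question's array. The right=False argument is an assumption about the intended bands: it makes each bin include its left edge, so 0 lands in 'Cheap' rather than becoming NaN.

```python
import pandas as pd

# A few prices taken from the array in the question.
data = pd.DataFrame(
    {'Price': [0.0, 4.99, 6.99, 7.49, 9.99, 14.99, 16.99, 24.99, 28.99, 394.99]}
)

bins = [0, 7, 14, 21, 28, 400]
labels = ['Cheap', 'Normal', 'Slight Expensive', 'Expensive', 'Max']

# right=False makes the intervals [0, 7), [7, 14), [14, 21), [21, 28), [28, 400).
data['Price Band'] = pd.cut(data['Price'], bins=bins, labels=labels, right=False)

print(data)
```

Each row gets its own label, so the result can be counted or grouped afterwards, for example with data['Price Band'].value_counts().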
CodePudding user response:
any checks whether any value is True. Thus, you need to do the comparison first and then call any on the result:
if (data['Price'] < 7).any():
    print('Cheap')
Any nonzero numeric value counts as True in a boolean check, so data['Price'].any() returns True as soon as one price is not 0, and True > 28 is then False:
>>> pd.DataFrame({'Price': [0., 0.1, 0.5, 1.0, 2.0]}).astype('bool')
   Price
0  False
1   True
2   True
3   True
4   True
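To make the difference concrete, here is a small sketch using a few of the question's prices (the variable names are only illustrative). Calling any() on the raw column collapses it to a single bool first, and comparing that bool with 28 is what yields False.

```python
import pandas as pd

# A few prices taken from the array in the question.
prices = pd.Series([0.0, 4.99, 28.99, 394.99], name='Price')

# any() on the raw numbers: True because at least one value is nonzero.
collapsed = prices.any()

# True behaves like 1 in a comparison, so this evaluates 1 > 28.
wrong = collapsed > 28

# Compare element-wise first, then ask whether any result is True.
right = (prices > 28).any()

print(collapsed, wrong, right)  # True False True
```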