I want to return a value (1, 2, 3, 4 or 5) based on the range a number falls in. I want to define a function and apply it to a column in a DataFrame using .apply().
In the code below, amount is a hypothetical column in a DataFrame. However, I get the error SyntaxError: invalid syntax on the line elif >= 40 amount < 60: (I believe it will raise the same error on all the other elif lines).
amount = pd.Series([20, 25, 65, 80])

def miles(amount):
    if 20 >= amount < 40:
        return 1
    elif >= 40 amount < 60:
        return 2
    elif >= 60 amount < 80:
        return 3
    elif >= 80 amount < 100:
        return 4
    elif >= 100 amount < 120:
        return 5
    else:
        pass
Any help is appreciated. Thank you!
CodePudding user response:
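For this particular case, you are mapping fixed-width integer ranges to consecutive numbers, which can be solved with a linear transform: every bin is 20 wide and the first bin starts at 20, so dividing by 20 and truncating gives the label directly. The offset in this case is 0.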
amount = pd.Series([20, 25, 65, 80])
out = amount.divide(20).astype(int)
out
# returns:
0 1
1 1
2 3
3 4
dtype: int32
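If the bins did not line up with multiples of the width, the same linear transform would need a non-zero offset. Here is a minimal sketch of that idea; the helper name linear_bin and its start and width parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd


def linear_bin(values, start, width):
    """Map values to 1-based bin labels for fixed-width bins.

    `start` is the left edge of the first bin and `width` is the bin size;
    both names are hypothetical and only used in this sketch.
    """
    # Shift so the first bin begins at 0, floor-divide by the width,
    # then add 1 so that the first bin is labelled 1 rather than 0.
    return (values - start) // width + 1


amount = pd.Series([20, 25, 65, 80])
labels = linear_bin(amount, start=20, width=20)
print(labels.tolist())  # [1, 1, 3, 4]

# Bins [5, 15), [15, 25), [25, 35), [35, 45): here the offset matters.
shifted = linear_bin(pd.Series([5, 14, 15, 37]), start=5, width=10)
print(shifted.tolist())  # [1, 1, 2, 4]
```

Note that values outside the intended range still get a number (0 or below, or above the last label), so this only fits data that is known to stay in range.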
For a more general case where the binning is not fixed-width, you can use pd.cut.
pd.cut(amount, [20, 40, 60, 80, 100, 120], right=False, labels=[1,2,3,4,5]).astype(int)
# returns:
0 1
1 1
2 3
3 4
dtype: int32
CodePudding user response:
You can use:
pd.cut(amount, range(20,121,20), labels = range(1,6), right = False)
#Output:
#0 1
#1 1
#2 3
#3 4
#dtype: category
#Categories (5, int64): [1 < 2 < 3 < 4 < 5]
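The first argument is the pandas.Series you want to cut and the second is the sequence of bin edges. labels assigns a label to each bin, and right controls which edge is included: when it is True each bin includes its right edge, and when it is False (as here) each bin includes its left edge instead.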
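To see what right changes in practice, here is a small sketch that cuts only values sitting exactly on bin edges. The variable names are chosen for illustration.

```python
import pandas as pd

# Values that sit exactly on bin edges, to expose the boundary behaviour.
edges = pd.Series([20, 40])
bins = [20, 40, 60]

# right=False: bins are [20, 40) and [40, 60), so 20 -> 1 and 40 -> 2.
left_closed = pd.cut(edges, bins, labels=[1, 2], right=False)

# right=True: bins are (20, 40] and (40, 60], so 20 falls outside every
# bin and becomes NaN, while 40 lands in the first bin.
right_closed = pd.cut(edges, bins, labels=[1, 2], right=True)

print(left_closed.tolist())   # [1, 2]
print(right_closed.tolist())  # [nan, 1]
```

Since the ranges in the question are written as lower <= amount < upper, right=False is the setting that matches them.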
For more details check the documentation: pandas.cut.
CodePudding user response:
You can use pandas.cut to do this.
It will separate your array elements into different sections.
Here is a link to the documentation: https://pandas.pydata.org/docs/reference/api/pandas.cut.html
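To tie this back to the question, here is a minimal sketch that applies pd.cut to a DataFrame column, next to a version of the original function with the comparisons rewritten as valid chained comparisons (lower <= amount < upper). The DataFrame df and the column names miles_apply and miles_cut are assumptions made for the example.

```python
import pandas as pd

# Hypothetical DataFrame with the same sample values as the question.
df = pd.DataFrame({"amount": [20, 25, 65, 80]})


def miles(amount):
    # In a chained comparison the variable goes between the two bounds;
    # `elif >= 40 amount < 60` is a SyntaxError because `>=` has no
    # left-hand operand.
    if 20 <= amount < 40:
        return 1
    elif 40 <= amount < 60:
        return 2
    elif 60 <= amount < 80:
        return 3
    elif 80 <= amount < 100:
        return 4
    elif 100 <= amount < 120:
        return 5
    return None  # value is outside every range


# Row-by-row version, as originally intended with .apply().
df["miles_apply"] = df["amount"].apply(miles)

# Vectorised version with pd.cut; right=False makes bins left-closed.
df["miles_cut"] = pd.cut(
    df["amount"],
    bins=[20, 40, 60, 80, 100, 120],
    labels=[1, 2, 3, 4, 5],
    right=False,
).astype(int)  # astype(int) assumes every value falls inside a bin

print(df)
```

Both columns should agree on this sample. On larger data, pd.cut is usually preferable since it avoids calling a Python function once per row.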
Hopefully this helps :)