I have a pandas DataFrame that contains scores such as:
score |
---|
0.1 |
0.15 |
0.2 |
0.3 |
0.35 |
0.4 |
0.5 |
etc.
I want to group these values into buckets of width 0.2, rounding each score up to the top of its bucket: if a score is at most 0.2 (for example, between 0.1 and 0.2), the value for that row should become 0.2; if it is above 0.2 and at most 0.4, the value should become 0.4; and so on.
So, for example, if the maximum score is 1, I will have 5 buckets: 0.2, 0.4, 0.6, 0.8 and 1.
Desired output:
score |
---|
0.2 |
0.2 |
0.2 |
0.4 |
0.4 |
0.4 |
0.6 |
CodePudding user response:
You can first define a function that does the rounding for you:
import numpy as np
def custom_round(x, base):
    # Round x up to the nearest multiple of base
    return base * np.ceil(x / base)
Then use .apply() to apply the function to your column:
df.score.apply(lambda x: custom_round(x, base=.2))
Output:
0 0.2
1 0.2
2 0.2
3 0.4
4 0.4
5 0.4
6 0.6
Name: score, dtype: float64
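One caveat with this approach: binary floating point can leave x / base a hair above a whole number (typically when a score is itself the result of arithmetic), and np.ceil then jumps to the next bucket. A minimal sketch that rounds the quotient before taking the ceiling is below. The name custom_round_safe and the decimals=9 tolerance are assumptions made for illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

def custom_round_safe(x, base, decimals=9):
    # Hypothetical variant of custom_round: round the quotient first so that
    # tiny floating-point noise does not push the value into the next bucket.
    quotient = np.round(x / base, decimals)
    # Round the product as well to tidy values such as 0.6000000000000001.
    return np.round(base * np.ceil(quotient), decimals)

# Sample data from the question
df = pd.DataFrame({"score": [0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5]})

# Works on a whole column at once, so .apply() is not needed here
result = custom_round_safe(df["score"], base=0.2)
print(result.tolist())
```

Whether 9 decimals is the right tolerance depends on the precision of your scores; treat it as a starting point.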
CodePudding user response:
Try np.ceil:
import pandas as pd
import numpy as np
data = {'score': {0: 0.1, 1: 0.15, 2: 0.2, 3: 0.3, 4: 0.35, 5: 0.4, 6: 0.5}}
df = pd.DataFrame(data)
base = 0.2
df['score'] = base * np.ceil(df.score/base)
print(df)
score
0 0.2
1 0.2
2 0.2
3 0.4
4 0.4
5 0.4
6 0.6
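If you would rather state the buckets explicitly, as in the question (0.2, 0.4, ..., 1), pd.cut is one possible alternative to the np.ceil approach. The sketch below is not taken from either answer; max_score = 1.0 is an assumed upper bound, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score": [0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5]})

base = 0.2
max_score = 1.0  # assumption: the largest score that can occur

# Bucket edges 0.0, 0.2, ..., 1.0 (rounded to avoid floating-point noise)
n_bins = int(round(max_score / base))
edges = np.round(np.arange(n_bins + 1) * base, 9)

# Intervals are right-closed by default, e.g. (0.2, 0.4], so a score of
# exactly 0.4 stays in the 0.4 bucket. Each bucket is labelled by its upper edge.
bucketed = pd.cut(df["score"], bins=edges, labels=edges[1:]).astype(float)
print(bucketed.tolist())
```

With these settings a score of exactly 0, or anything above max_score, falls outside every bucket and becomes NaN; passing include_lowest=True to pd.cut covers the zero case.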