grouping values in pandas column

Time:09-26

I have a pandas DataFrame that contains scores such as:

score
0.1
0.15
0.2
0.3
0.35
0.4
0.5

etc

I want to group these values into buckets of width 0.2: if a score is between 0.1 and 0.2, the value for that row in score should be 0.2; if it is above 0.2 and up to 0.4, the value should be 0.4.

So, for example, if the max score is 1, I will have 5 buckets of score: 0.2, 0.4, 0.6, 0.8 and 1.

desired output:

score
0.2
0.2
0.2
0.4
0.4
0.4
0.6

CodePudding user response:

You can first define a function that does the rounding for you:

import numpy as np

def custom_round(x, base):
    # Round x up to the nearest multiple of base
    return base * np.ceil(x / base)

Then use .apply() to apply the function to your column:

df.score.apply(lambda x: custom_round(x, base=.2))

Output:

0    0.2
1    0.2
2    0.2
3    0.4
4    0.4
5    0.4
6    0.6
Name: score, dtype: float64
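
One caveat with the ceil approach: floating-point division can land a hair above a whole number (a score computed as 0.1 + 0.2 is one such case), and np.ceil would then push it into the next bucket. Below is a minimal sketch of a guarded variant, not a definitive implementation. The name bucket_ceil and the decimals parameter are hypothetical, and 9 decimal places is an assumed tolerance that you may need to adjust for your data.

```python
import numpy as np
import pandas as pd

def bucket_ceil(values, base, decimals=9):
    """Round values up to the nearest multiple of base (hypothetical helper).

    The ratio is rounded to `decimals` places before np.ceil so that tiny
    floating-point noise does not move a value into the next bucket.
    """
    ratio = np.round(np.asarray(values, dtype=float) / base, decimals)
    return np.round(base * np.ceil(ratio), decimals)

# Sample data taken from the question
scores = pd.Series([0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5], name="score")

# Works on the whole column at once, so .apply() is not needed
bucketed = pd.Series(bucket_ceil(scores, base=0.2), index=scores.index, name="score")
print(bucketed)
```

Because the helper operates on the whole array, it also avoids calling a Python function once per row.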

CodePudding user response:

Try np.ceil, which works on the whole column at once:

import pandas as pd
import numpy as np

data = {'score': {0: 0.1, 1: 0.15, 2: 0.2, 3: 0.3, 4: 0.35, 5: 0.4, 6: 0.5}}
df = pd.DataFrame(data)

base = 0.2
df['score'] = base * np.ceil(df.score/base)

print(df)

   score
0    0.2
1    0.2
2    0.2
3    0.4
4    0.4
5    0.4
6    0.6
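
If you would rather state the bucket edges explicitly, pd.cut may also fit here. This is only a sketch under assumptions: max_score = 1.0 comes from the example in the question, and the bucket column name is made up for illustration. With right=True each interval includes its upper edge, so 0.2 stays in the 0.2 bucket.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score": [0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5]})

base = 0.2
max_score = 1.0  # assumption: taken from the example in the question

# Bin edges 0.0, 0.2, ..., 1.0; rounding removes floating-point noise
n_buckets = int(round(max_score / base))
edges = np.round(np.arange(n_buckets + 1) * base, 9)

# Label every interval (lower, upper] with its upper edge
binned = pd.cut(df["score"], bins=edges, labels=edges[1:], right=True)

# pd.cut returns a categorical column; convert it back to plain floats
df["bucket"] = binned.astype(float)
print(df)
```

Note that a score of exactly 0 falls outside the first interval and becomes NaN unless you pass include_lowest=True, and scores above max_score also become NaN.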