how to get quartiles and classify a value according to this quartile range-CodePudding

I have this df:

d = pd.DataFrame({'Name':['Andres','Lars','Paul','Mike'],
                  'target':['A','A','B','C'],
                  'number':[10,12.3,11,6]})

And I want classify each number in a quartile. I am doing this:

(d.groupby(['Name','target','number'])['number']
 .quantile([0.25,0.5,0.75,1]).unstack()
 .reset_index()
 .rename(columns={0.25:"1Q",0.5:"2Q",0.75:"3Q",1:"4Q"})
)

But as you can see, the 4 quartiles are all equal because the code above is calculating per row so if there's one 1 number per row all quartiles are equal.

If a run instead:

d['number'].quantile([0.25,0.5,0.75,1])

Then I have the 4 quartiles I am looking for:

0.25     9.000
0.50    10.500
0.75    11.325
1.00    12.300

What I need as output(showing only first 2 rows)

   Name    target   number     1Q      2Q      3Q        4Q     Rank
0   Andres  A       10.0       9.0    10.5    11.325    12.30     1
1   Lars    A       12.3       9.0    10.5    11.325    12.30     4

you can see all quartiles has the the values considering tall values in the number column. Besides that, now we have a column names Rank that classify the number according to it's quartile. ex. In the first row 10 is within the 1st quartile.

CodePudding user response：

Here's one way that build on the quantiles you've created by making it a DataFrame and joining it to d. Also assigns "Rank" column using rank method:

out = (d.join(d['number'].quantile([0.25,0.5,0.75,1])
              .set_axis([f'{i}Q' for i in range(1,5)], axis=0)
              .to_frame().T
              .pipe(lambda x: x.loc[x.index.repeat(len(d))])
              .reset_index(drop=True))
       .assign(Rank=d['number'].rank(method='dense')))

Output:

     Name target  number   1Q    2Q      3Q    4Q  Rank
0  Andres      A    10.0  9.0  10.5  11.325  12.3   2.0
1    Lars      A    12.3  9.0  10.5  11.325  12.3   4.0
2    Paul      B    11.0  9.0  10.5  11.325  12.3   3.0
3    Mike      C     6.0  9.0  10.5  11.325  12.3   1.0