Using groupby() and cut() in pandas

Time:08-01

I have a dataframe and, within each group, I want to label the values. If a value is less than its group mean, the label is 1; if it is more than the group mean, the label is 2.

The input data frame is:

  groups  num1
0      a     2
1      a     5
2      a     7
3      b    10
4      b     4
5      b     0
6      b     7
7      c     2
8      c     4
9      c     1

Here the mean values for groups a, b and c are 4.67, 5.25 and 2.33 respectively, and the desired output data frame is:

  groups  num1
0      a     1
1      a     2
2      a     2
3      b     2
4      b     1
5      b     1
6      b     2
7      c     1
8      c     2
9      c     1

I want to use pandas.cut, and maybe pandas.groupby and pandas.apply as well.

Thanks in advance

CodePudding user response:

cut is not really pertinent here. Use groupby.transform('mean') and numpy.where:

import numpy as np

# 1 where num1 is below its group mean, otherwise 2
df['out'] = np.where(df['num1'].lt(df.groupby('groups')['num1']
                                     .transform('mean')),
                     1, 2)

Output (as new column "out" for clarity):

  groups  num1  out
0      a     2    1
1      a     5    2
2      a     7    2
3      b    10    2
4      b     4    1
5      b     0    1
6      b     7    2
7      c     2    1
8      c     4    2
9      c     1    1

I really want cut

OK, but it's neither elegant nor performant:

(df.groupby('groups')['num1']
   .transform(lambda g: pd.cut(g, [-np.inf, g.mean(), np.inf], labels=[1, 2]))
)
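
For anyone who wants to try both approaches side by side, here is a minimal, self-contained sketch. The data is copied from the question; the column names `out_where` and `out_cut` and the helper `label_with_cut` are made up for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "groups": list("aaabbbbccc"),
    "num1": [2, 5, 7, 10, 4, 0, 7, 2, 4, 1],
})

# Group mean broadcast back to every row of its group.
group_mean = df.groupby("groups")["num1"].transform("mean")

# Approach 1: vectorised comparison, 1 if below the group mean, else 2.
df["out_where"] = np.where(df["num1"].lt(group_mean), 1, 2)


def label_with_cut(g):
    """Hypothetical helper: split one group's values at that group's mean."""
    return pd.cut(g, [-np.inf, g.mean(), np.inf], labels=[1, 2])


# Approach 2: pd.cut applied per group; the categorical result is cast to int.
df["out_cut"] = (
    df.groupby("groups")["num1"].transform(label_with_cut).astype(int)
)

print(df)
```

On this data both columns should agree. One difference to keep in mind: `pd.cut` closes its bins on the right by default, so a value exactly equal to its group mean would get label 1 from `cut` but label 2 from the `lt` comparison. No value in the sample data equals its group mean, so the difference does not show up here.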