Python exclude zero while ranking

Time:06-30

Please help me create a ranking that excludes values equal to 0, null, or NaN for the DataFrame below.

Input:

import numpy as np
import pandas as pd

df = pd.DataFrame(data={'Group1': ['A', 'A', 'A',
                                   'B', 'C', 'D'],
                        'Group2': ['A1', 'A2', 'A3',
                                   'B1', 'B2', 'D1'],
                        'Number': [3, 2, 4, 0, np.nan, '']})

Expected result:

Group1  Group2  Number  Rank
 A       A1      3       2
 A       A2      2       1
 A       A3      4       3
 B       B1      0  
 C       B2    NaN  
 D       D1     

Similar post, but it does not cover excluding zero, null, or NaN: Ranking order per group in Pandas

CodePudding user response:

Use Series.isin to filter out the unwanted values, then rank within each group with GroupBy.transform('rank'):

In [1704]: df['Rank'] = df[~df.Number.isin([0, '', np.nan])].groupby('Group1')['Number'].transform('rank')

In [1705]: df
Out[1705]: 
  Group1 Group2 Number  Rank
0      A     A1      3   2.0
1      A     A2      2   1.0
2      A     A3      4   3.0
3      B     B1      0   NaN
4      C     B2    NaN   NaN
5      D     D1          NaN
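
The Number column above mixes integers, NaN, and an empty string, so it has object dtype. As a variation on the same idea, here is a minimal sketch that first coerces the column to numbers and then masks the zeros before ranking. It assumes that anything non-numeric (such as '') should be treated as missing; the helper names numeric and valid are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'Group1': ['A', 'A', 'A', 'B', 'C', 'D'],
    'Group2': ['A1', 'A2', 'A3', 'B1', 'B2', 'D1'],
    'Number': [3, 2, 4, 0, np.nan, ''],
})

# Coerce the mixed column to floats; '' and other non-numeric values become NaN.
numeric = pd.to_numeric(df['Number'], errors='coerce')

# Mask zeros as NaN too, so only non-zero numbers take part in the ranking.
valid = numeric.where(numeric != 0)

# Rank within each Group1; rows that were masked keep a NaN rank.
df['Rank'] = valid.groupby(df['Group1']).rank()

print(df)
```

Because the masked rows stay in the frame as NaN, the result is already aligned with df and no filtering or re-indexing step is needed.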

CodePudding user response:

To do what you are planning, I would write it like this: 'Rank': [2, 1, 3, '', ''], so the output matches what you expect, with the unranked rows left visually blank.

df = pd.DataFrame(data={'Animal': ['cat', 'penguin', 'dog',
                               'spider', 'snake'],
                    'Number_legs': [4, 2, 4, 0, np.nan],
                    'Rank':[2,1,3,'', '']})

This would be the full code.

I recommend keeping the null there, so that you at least retain the information that the value is empty.
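
To illustrate why keeping the null can be preferable to a blank string, here is a small sketch comparing the two choices for the unranked rows. The series names blank and null are hypothetical and used only for this comparison.

```python
import numpy as np
import pandas as pd

# Unranked rows stored as empty strings: the whole column becomes object dtype.
blank = pd.Series([2, 1, 3, '', ''])

# Unranked rows stored as NaN: the column stays numeric.
null = pd.Series([2, 1, 3, np.nan, np.nan])

print(blank.dtype)         # object
print(null.dtype)          # float64

# Missing-value helpers only recognise the NaN version.
print(blank.isna().sum())  # 0
print(null.isna().sum())   # 2
```

With NaN, the rows without a rank can still be found with isna() and the column can be used in numeric operations, which is not the case for the empty-string version.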
