Please help me create a ranking that excludes values equal to 0, null, or NaN (and empty strings) for the DataFrame below.
Input:
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'Group1': ['A', 'A', 'A', 'B', 'C', 'D'],
                        'Group2': ['A1', 'A2', 'A3', 'B1', 'B2', 'D1'],
                        'Number': [3, 2, 4, 0, np.nan, '']})
Expected result:
Group1  Group2  Number  Rank
A       A1      3       2
A       A2      2       1
A       A3      4       3
B       B1      0
C       B2      NaN
D       D1
There is a similar post, "Ranking order per group in Pandas", but it does not show how to exclude zero, null, or NaN values.
CodePudding user response:
Use df.rank and Series.isin with Groupby.transform:
In [1704]: df['Rank'] = df[~df.Number.isin([0, '', np.nan])].groupby('Group1')['Number'].transform('rank')
In [1705]: df
Out[1705]:
  Group1 Group2 Number  Rank
0      A     A1      3   2.0
1      A     A2      2   1.0
2      A     A3      4   3.0
3      B     B1      0   NaN
4      C     B2    NaN   NaN
5      D     D1          NaN
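Because the Number column mixes integers, NaN, and an empty string, it has object dtype, and ranking it may behave differently across pandas versions. A possible variant, offered only as a sketch, first coerces the column to numeric with pd.to_numeric (so '' becomes NaN) and then masks zeros before ranking. The names num and valid are illustrative and not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Group1': ['A', 'A', 'A', 'B', 'C', 'D'],
    'Group2': ['A1', 'A2', 'A3', 'B1', 'B2', 'D1'],
    'Number': [3, 2, 4, 0, np.nan, ''],
})

# Coerce to a numeric Series; anything non-numeric (such as '') becomes NaN.
num = pd.to_numeric(df['Number'], errors='coerce')

# Rows that should take part in the ranking: not missing and not zero.
valid = num.notna() & num.ne(0)

# Blank out the excluded rows, then rank within each Group1.
# rank() leaves NaN inputs as NaN by default (na_option='keep').
df['Rank'] = num.where(valid).groupby(df['Group1']).rank()

print(df)
```

The result keeps the original row order and index, so the excluded rows stay in the frame with a missing Rank instead of being dropped.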
CodePudding user response:
To do what you are planning, I would write the ranks in directly, like this: 'Rank': [2, 1, 3, '', '']. That way the output matches what you expect, with the excluded rows shown as blank.
df = pd.DataFrame(data={'Animal': ['cat', 'penguin', 'dog',
'spider', 'snake'],
'Number_legs': [4, 2, 4, 0, np.nan],
'Rank':[2,1,3,'', '']})
This would be the full code.
I recommend keeping the null there, so that you at least retain the information that the value is empty.
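If the goal is whole-number ranks while still keeping an explicit null for the excluded rows, one option is pandas' nullable integer dtype. The sketch below assumes a float-valued Rank column like the one rank() produces; the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical ranks as returned by a float-valued rank(); NaN marks excluded rows.
rank = pd.Series([2.0, 1.0, 3.0, np.nan, np.nan], name='Rank')

# Nullable integer dtype: whole-number ranks, with <NA> for the missing ones.
rank_int = rank.astype('Int64')

print(rank_int)
```

This cast only works when every rank is a whole number. With ties, the default average method can yield fractional ranks, so a method such as 'min' or 'dense' would be needed first.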