I have a basic set of data like:
ID Value
A 0.1
B 0.2
C -0.1
D -0.01
E 0.15
If we use data.rank() we get the result:
ID Value
A 3
B 5
C 1
D 2
E 4
But I want the negative values to get a negative rank instead, such as:
ID Value
A 2
B 3
C -2
D -1
E 1
Basically, the negative values should get a negative rank and the positive values a positive rank, so instead of 1 to 5 we get 1 to 3 and -1 to -2. Any help is greatly appreciated.
CodePudding user response:
Rank your positive values and negative values separately and then concat
them:
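>>> # rank only the "Value" column; ranking the whole frame would also rank "ID"
>>> pos = df.loc[df["Value"].gt(0), "Value"].rank()
>>> neg = df.loc[df["Value"].lt(0), "Value"].mul(-1).rank().mul(-1)
>>> pd.concat([pos, neg]).sort_index()
0    1.0
1    3.0
2   -2.0
3   -1.0
4    2.0
Name: Value, dtype: float64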
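As a minimal sketch, the same rank-separately-then-concat idea can be wrapped in a small helper and run against the data from the question. The name `signed_rank` and the `Rank` column are made up for illustration, and the sketch assumes that zeros (which are neither positive nor negative) may simply be left unranked:

```python
import pandas as pd


def signed_rank(values: pd.Series) -> pd.Series:
    """Rank positives as 1..n and negatives as -1..-m, by distance from zero."""
    pos = values[values.gt(0)].rank()
    # Flip the sign so the negative closest to zero ranks first, then flip back
    neg = values[values.lt(0)].mul(-1).rank().mul(-1)
    # Zeros match neither mask, so they are absent from the result (assumption)
    return pd.concat([pos, neg]).sort_index()


df = pd.DataFrame({"ID": list("ABCDE"), "Value": [0.1, 0.2, -0.1, -0.01, 0.15]})
df["Rank"] = signed_rank(df["Value"])
print(df)
```

Because the assignment aligns on the index, any row that was dropped (a zero) would end up as NaN in `Rank`.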
CodePudding user response:
Another method similar to the concat answer, but not as compact:
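import pandas as pd

A = ['A', 'B', 'C', 'D']
B = [-1, 1, 3, -2]
df = pd.DataFrame({'ID': A, 'value': B})
# rank the non-negative values; negatives become NaN and stay unranked
pos = df.where(df['value'] >= 0)['value'].rank()
# rank the negatives in descending order so the one closest to zero gets rank 1
neg = df.where(df['value'] < 0)['value'].rank(ascending=False)
# overwrite the NaN slots in pos with the negated negative ranks
pos.update(-neg)
df['rank'] = pos
print(df)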
CodePudding user response:
Think outside the box & sort your values.
import pandas as pd

# create your dummy data
data = pd.DataFrame({'ID':list('ABCDE'), 'Value':[0.1,0.2,-0.1,-0.01,0.15]})
# sort the data so that we can make use of cumsum (and vectorize the operation)
data = data.sort_values('Value')
# running count of the non-negative values, smallest first: 1, 2, 3, ...
data['RANK'] = (data['Value']>=0).cumsum()
# running count of the negatives, starting from the one closest to zero: -1, -2, ...
data['RANK'] -= (data['Value']<0)[::-1].cumsum()
# Sort the values back to the original state ...
data.sort_values('ID')
  ID  Value  RANK
0  A   0.10     1
1  B   0.20     3
2  C  -0.10    -2
3  D  -0.01    -1
4  E   0.15     2
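A variation on the same idea that avoids sorting, offered only as a sketch: group the absolute values by their sign, rank within each group, and multiply the sign back in. It uses `np.sign` and the groupby `rank` method. In this version a zero would get rank 0, and tied values share an averaged rank, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'ID': list('ABCDE'), 'Value': [0.1, 0.2, -0.1, -0.01, 0.15]})

# -1.0 for negatives, 0.0 for zeros, 1.0 for positives
sign = np.sign(data['Value'])
# rank by distance from zero within each sign group, then restore the sign
data['RANK'] = data['Value'].abs().groupby(sign).rank() * sign
print(data)
```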