I have the following example:
import pandas as pd
names = ['a', 'b', 'c', 'd', 'e']
points = [10, 15, 15, 12, 20]
scores = pd.DataFrame({'name': names,
                       'points': points})
I want to create a new column called position that specifies the relative position of a player. The player with the most points is #1.
I sort the DataFrame using:
scores = scores.sort_values(by='points', ascending=False)
If there is a tie (the same number of points), I want position to be a T followed by the corresponding position. In my example, the position of b and c is T2.
Desired output:
name  points  position
e     20      1
b     15      T2
c     15      T2
d     12      3
a     10      4
Thank you
CodePudding user response:
I would use pandas.Series.rank:
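# is there a tie?
m = scores["points"].duplicated(keep=False)
# calculate dense rankings (tied players share a rank, no rank is skipped)
s = scores["points"].rank(method="dense", ascending=False)
scores["position"] = (
    s.where(~m, s.astype(str).radd("T"))  # prefix tied ranks with "T"
    .astype(str)
    .replace(r"\.0$", "", regex=True)  # drop the ".0" left over from the float rank
)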
out = scores.sort_values(by="points", ascending=False)
print(out)
# Output:
  name  points position
4    e      20        1
1    b      15       T2
2    c      15       T2
3    d      12        3
0    a      10        4
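As a possible variation on the same idea, the rank can be cast to an integer before it becomes a string, which avoids the regex cleanup of the trailing ".0". This is only a sketch: the helper name add_position and its points_col parameter are made up for illustration, and it assumes the points column has no missing values (the integer cast would fail on NaN ranks).

```python
import pandas as pd


def add_position(df, points_col="points"):
    """Return a copy sorted by points (descending) with a 'position' column.

    Hypothetical helper for illustration; assumes no missing points.
    """
    # A stable sort keeps tied rows in their original relative order.
    out = df.sort_values(by=points_col, ascending=False, kind="stable").copy()
    # Dense rank: tied players share a rank and the next rank is not skipped.
    rank = (
        out[points_col]
        .rank(method="dense", ascending=False)
        .astype(int)
        .astype(str)
    )
    # A row is tied when its points value occurs more than once.
    tied = out[points_col].duplicated(keep=False)
    # Prefix "T" only on the tied rows; leave the others unchanged.
    out["position"] = rank.mask(tied, "T" + rank)
    return out


scores = pd.DataFrame({"name": ["a", "b", "c", "d", "e"],
                       "points": [10, 15, 15, 12, 20]})
result = add_position(scores)
print(result)
```

The choice of method="dense" is what makes d come out as 3 in the desired output. With method="min" the ranks would be 1, 2, 2, 4, 5, so d would be 4 and a would be 5.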