Suppose I have a pandas Dataframe,
Name thick prio
Aabc1 20 1
Babc2 21 1
Cabc3 22 1
Aabc4 23 1
Axyz1 20 2
Bxyz2 21 2
Axyz3 22 2
I need to create a dataframe column such a way that expected output will be
Name thick prio newPrio
Aabc1 20 1 1
Babc2 21 1 3
Cabc3 22 1 4
Aabc4 23 1 2
Axyz1 20 2 5
Bxyz2 21 2 7
Axyz3 22 2 6
The logic behind this is:
1st group names based on thickness (ascending order) and priority. Then check the prio
column, for example, 1, there are multiple names, if a name starts with A give them 1st priority, if B then 2nd and if C then third. Then go to the prio
2 and do the same thing. In this way, I would like to create a newPrio
column.
I have tried it and it is working partially
x['newPrio'] = x.sort_values(['Name', 'thick', 'prio'])['thick'].index 1
CodePudding user response:
You can use sort_values
by prio
then Name
and thick
:
rank = df.sort_values(['prio', 'Name', 'thick']).index
df['newPrio'] = pd.Series(range(1, len(df) 1), index=rank)
print(df)
# Output
Name thick prio newPrio
0 Aabc1 20 1 1
1 Babc2 21 1 3
2 Cabc3 22 1 4
3 Aabc4 23 1 2
4 Axyz1 20 2 5
5 Bxyz2 21 2 7
6 Axyz3 22 2 6
CodePudding user response:
Use DataFrame.sort_values
with Index.argsort
for positions of indices:
df['newPrio'] = df.sort_values(['prio', 'Name', 'thick']).index.argsort() 1
print (df)
Name thick prio newPrio
0 Aabc1 20 1 1
1 Babc2 21 1 3
2 Cabc3 22 1 4
3 Aabc4 23 1 2
4 Axyz1 20 2 5
5 Bxyz2 21 2 7
6 Axyz3 22 2 6