Create a Dataframe priority/rank column based on conditions over multiple columns-CodePudding

Suppose I have a pandas Dataframe,

  Name    thick  prio
  Aabc1     20     1
  Babc2     21     1
  Cabc3     22     1
  Aabc4     23     1
  Axyz1     20     2
  Bxyz2     21     2
  Axyz3     22     2

I need to create a dataframe column such a way that expected output will be

  Name    thick  prio  newPrio
  Aabc1     20     1     1
  Babc2     21     1     3
  Cabc3     22     1     4
  Aabc4     23     1     2
  Axyz1     20     2     5
  Bxyz2     21     2     7
  Axyz3     22     2     6

The logic behind this is: 1st group names based on thickness (ascending order) and priority. Then check the prio column, for example, 1, there are multiple names, if a name starts with A give them 1st priority, if B then 2nd and if C then third. Then go to the prio 2 and do the same thing. In this way, I would like to create a newPrio column.

I have tried it and it is working partially x['newPrio'] = x.sort_values(['Name', 'thick', 'prio'])['thick'].index 1

CodePudding user response：

You can use sort_values by prio then Name and thick:

rank = df.sort_values(['prio', 'Name', 'thick']).index
df['newPrio'] = pd.Series(range(1, len(df) 1), index=rank)
print(df)

# Output
    Name  thick  prio  newPrio
0  Aabc1     20     1        1
1  Babc2     21     1        3
2  Cabc3     22     1        4
3  Aabc4     23     1        2
4  Axyz1     20     2        5
5  Bxyz2     21     2        7
6  Axyz3     22     2        6

CodePudding user response：

Use DataFrame.sort_values with Index.argsort for positions of indices:

df['newPrio'] = df.sort_values(['prio', 'Name', 'thick']).index.argsort()   1
print (df)
    Name  thick  prio  newPrio
0  Aabc1     20     1        1
1  Babc2     21     1        3
2  Cabc3     22     1        4
3  Aabc4     23     1        2
4  Axyz1     20     2        5
5  Bxyz2     21     2        7
6  Axyz3     22     2        6