Set row index in pandas based on whether another row's value has already been indexed or not

Time:12-10

Here is what I am trying to accomplish with pandas.

Let's say we have a DataFrame like this:
     transaction_code
1    4373-36
2    3626-68
3    3626-68
4    3281-23
5    4721-44
...
101  6273-56
102  2836-78
103  1657-28
104  3281-23
105  5323-64

I want to create a new column called 'transaction_code_new_index' that numbers the rows like the existing index, except that whenever a transaction_code is duplicated, all of its rows get the same index. For example, if the code 6273-75 appears 3 times, every row with 6273-75 must carry the same index.

Example:

     transaction_code transaction_code_new_index
1    4373-36          1
2    3626-68          2
3    3626-68          2 (because 3626-68 has already been indexed before)
4    3281-23          3
5    4721-44          4
...
101  6273-56          100
102  2836-78          101
103  1657-28          102
104  3281-23          3 (because 3281-23 has already been indexed before)
105  5323-64          103

Thanks.

CodePudding user response:

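You can take the minimum index of every group of identical codes. Using transform assigns that result back to each row of the group. Note that this gives every code the index label of its first occurrence, so the numbers are shared between duplicates but are not consecutive: a label that belongs to a repeated row is simply skipped.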

df['new_index'] = df.groupby('transaction_code')['transaction_code'].transform(lambda x: x.index.min())

Output

  transaction_code  new_index
1          4373-36          1
2          3626-68          2
3          3626-68          2
4          3281-23          4
5          4721-44          5
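
If the consecutive numbering from the question is needed (3281-23 gets 3, not 4), one possible alternative is pd.factorize, which numbers distinct values in order of first appearance. This is only a sketch on a small made-up sample: the six-row frame and the column name first_seen_index are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: the first five rows from the question plus one later repeat
df = pd.DataFrame(
    {"transaction_code": ["4373-36", "3626-68", "3626-68",
                          "3281-23", "4721-44", "3281-23"]},
    index=[1, 2, 3, 4, 5, 6],
)

# Approach from the answer: index label of the first row holding each code
df["first_seen_index"] = (
    df.groupby("transaction_code")["transaction_code"]
      .transform(lambda x: x.index.min())
)

# factorize assigns 0, 1, 2, ... to distinct codes in order of first
# appearance; adding 1 makes the numbering start at 1 as in the question
df["transaction_code_new_index"] = pd.factorize(df["transaction_code"])[0] + 1

print(df)
```

On this sample, first_seen_index skips 3 because row 3 is a repeat, while transaction_code_new_index counts distinct codes without gaps.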