Here is what I am trying to accomplish with pandas.
Let's say we have a DataFrame like this:
     transaction_code
1    4373-36
2    3626-68
3    3626-68
4    3281-23
5    4721-44
...
101  6273-56
102  2836-78
103  1657-28
104  3281-23
105  5323-64
I want to create a new column called 'transaction_code_new_index' that numbers the rows like the existing index does, but whenever a transaction_code is duplicated (e.g. the code 6273-75 might appear 3 times), all rows with that code must get the same index value (i.e. every row whose transaction_code is 6273-75 must share one index).
Example:
     transaction_code  transaction_code_new_index
1    4373-36           1
2    3626-68           2
3    3626-68           2    (because 3626-68 has already been indexed before)
4    3281-23           3
5    4721-44           4
...
101  6273-56           100
102  2836-78           101
103  1657-28           102
104  3281-23           3    (because 3281-23 has already been indexed before)
105  5323-64           103
Thanks.
Answer:
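You can take the minimum index of each group. Using transform broadcasts that result back to every row of the group. Note that this labels each code with the index of its first occurrence, so duplicates share a value but the values are not renumbered consecutively (in the output, 3281-23 gets 4 rather than the 3 shown in the question).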
df['new_index'] = df.groupby('transaction_code')['transaction_code'].transform(lambda x: x.index.min())
Output
     transaction_code  new_index
1    4373-36           1
2    3626-68           2
3    3626-68           2
4    3281-23           4
5    4721-44           5
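If the consecutive numbering from the question (1, 2, 2, 3, 4, ...) is what's needed, one possible alternative is pd.factorize, which labels values in order of first appearance. The snippet below is only a sketch: the sample DataFrame is rebuilt from the first five rows of the question, and the + 1 offset assumes the numbering should start at 1, as in the question's example.

```python
import pandas as pd

# Sample data rebuilt from the first five rows shown in the question.
df = pd.DataFrame(
    {"transaction_code": ["4373-36", "3626-68", "3626-68", "3281-23", "4721-44"]},
    index=[1, 2, 3, 4, 5],
)

# factorize returns integer labels 0, 1, 2, ... in order of first appearance,
# so repeated codes reuse the label given to their first occurrence.
labels, uniques = pd.factorize(df["transaction_code"])

# Shift by 1 so the numbering starts at 1 (an assumption based on the question).
df["transaction_code_new_index"] = labels + 1

print(df)
```

Something like df.groupby('transaction_code', sort=False).ngroup() + 1 should give the same numbering; sort=False matters there because it keeps groups in order of first appearance rather than sorted order.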