I am trying to iterate through a column and append a number based on how many times each duplicate value has occurred so far. This will give me a unique value for every row in my dataset. The data I have is below:
FY group item concat
0 2015 GROUP_A 1 2015-GROUP_A-1
1 2015 GROUP_A 1 2015-GROUP_A-1
2 2015 GROUP_A 1 2015-GROUP_A-1
3 2015 GROUP_A 10 2015-GROUP_A-10
4 2015 GROUP_A 7 2015-GROUP_A-7
5 2015 GROUP_B 23 2015-GROUP_B-23
6 2015 GROUP_B 23 2015-GROUP_B-23
7 2015 GROUP_B 23 2015-GROUP_B-23
8 2015 GROUP_B 23 2015-GROUP_B-23
9 2015 GROUP_B 21 2015-GROUP_B-21
I am looking to return something like this:
FY group item concat uid
2015 GROUP_A 1 2015-GROUP_A-1 2015-GROUP_A-1-1
2015 GROUP_A 1 2015-GROUP_A-1 2015-GROUP_A-1-2
2015 GROUP_A 1 2015-GROUP_A-1 2015-GROUP_A-1-3
2015 GROUP_A 10 2015-GROUP_A-10 2015-GROUP_A-10-1
2015 GROUP_A 7 2015-GROUP_A-7 2015-GROUP_A-7-1
2015 GROUP_B 23 2015-GROUP_B-23 2015-GROUP_B-23-1
2015 GROUP_B 23 2015-GROUP_B-23 2015-GROUP_B-23-2
2015 GROUP_B 23 2015-GROUP_B-23 2015-GROUP_B-23-3
2015 GROUP_B 23 2015-GROUP_B-23 2015-GROUP_B-23-4
2015 GROUP_B 21 2015-GROUP_B-21 2015-GROUP_B-21-1
I greatly appreciate any help!
CodePudding user response:
You could number the occurrences of each value within its concat
group with cumcount, adding 1 to each to offset the zero-based indexing:
df['uid'] = df['concat'] + '-' + (df.groupby('concat').cumcount() + 1).astype(str)
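To check the approach end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same groupby/cumcount idea. It assumes pandas is installed and that the concat column is simply FY, group, and item joined with hyphens, as the sample rows suggest; the DataFrame construction is my own reconstruction, not the asker's actual code.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how it was built).
df = pd.DataFrame({
    "FY": [2015] * 10,
    "group": ["GROUP_A"] * 5 + ["GROUP_B"] * 5,
    "item": [1, 1, 1, 10, 7, 23, 23, 23, 23, 21],
})

# Rebuild the concat key as FY-group-item, matching the sample rows.
df["concat"] = df["FY"].astype(str) + "-" + df["group"] + "-" + df["item"].astype(str)

# cumcount() numbers the rows inside each concat group starting at 0,
# so add 1 to start the suffix at 1, then append it as a string.
df["uid"] = df["concat"] + "-" + (df.groupby("concat").cumcount() + 1).astype(str)

print(df[["concat", "uid"]])
```

Note that cumcount numbers rows in the order they appear within each group, so equal values do not need to be adjacent for the suffixes to come out unique.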