Incremental counter if the value is the same before the point

I have the following STRING column on a pandas DataFrame.

HOURCENTSEG (string-column)
070026.16169
070026.16169
070026.16169
070026.16169
070052.85555
070052.85555
070109.43620
070202.56430
070202.56431
070202.56434
070202.56434

As you can see, many rows share the same time before the point. To avoid overlapping timestamps, I must replace the part after the point with an incremental counter in every row, as shown in the following example output.

HOURCENTSEG (string-column)
070026.00001
070026.00002
070026.00003
070026.00004
070052.00001
070052.00002
070109.00001  (if there is only one value it's just 00001)
070202.00001
070202.00002
070202.00003
070202.00004

The application was poorly designed in the past, and I have no other way to solve this.

Summary: add an incremental counter after the point, at most 5 digits wide and left-padded with zeros, that increases for each row where the number to the left of the point is the same.

Answer:

Split the values on . and take the first element of each resulting list, number the rows within each group with GroupBy.cumcount, then left-pad the counter with zeros using Series.str.zfill:

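# Part before the point, e.g. '070026'
s = df['HOURCENTSEG'].str.split('.').str[0]
# 1-based counter within each group of equal prefixes, zero-padded to 5 digits
df['HOURCENTSEG'] = s + '.' + s.groupby(s).cumcount().add(1).astype(str).str.zfill(5)
print(df)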
     HOURCENTSEG
0   070026.00001
1   070026.00002
2   070026.00003
3   070026.00004
4   070052.00001
5   070052.00002
6   070109.00001
7   070202.00001
8   070202.00002
9   070202.00003
10  070202.00004
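
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. It rebuilds the sample column from the question as strings. The names prefix and counter are illustrative only, and the check against 99999 is an assumption about how the "maximum size of 5" should be enforced: zfill only pads and never truncates, so this sketch raises an error instead of silently producing a longer counter.

```python
import pandas as pd

# Sample data copied from the question; the column holds strings, not floats.
df = pd.DataFrame({
    "HOURCENTSEG": [
        "070026.16169", "070026.16169", "070026.16169", "070026.16169",
        "070052.85555", "070052.85555",
        "070109.43620",
        "070202.56430", "070202.56431", "070202.56434", "070202.56434",
    ]
})

# Part before the point, e.g. "070026".
prefix = df["HOURCENTSEG"].str.split(".").str[0]

# 1-based position of each row within its group of equal prefixes.
counter = prefix.groupby(prefix).cumcount().add(1)

# Assumption: exceeding 5 digits should fail loudly, because zfill pads
# short strings but leaves longer ones untouched.
if counter.max() > 99999:
    raise ValueError("more than 99999 rows share the same prefix")

# Rebuild the column as "<prefix>.<zero-padded counter>".
df["HOURCENTSEG"] = prefix + "." + counter.astype(str).str.zfill(5)
print(df)
```

Note that the grouping is by value, not by consecutive runs: if the same prefix showed up again further down the column, its counter would continue from where it left off rather than restart at 00001.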