I have a dataset, df, and I would like to create a running count for the values in one column, grouped by specific columns, in Python.
Data
id date type
aa q1 23 hi
aa q1 23 hi
aa q1 23 bye
aa q1 23 bye
aa q2 23 hi
aa q2 23 bye
bb q1 23 hi
The count resets for every unique date and id.
Desired
id date type count
aa q1 23 hi hi01
aa q1 23 hi hi02
aa q1 23 bye bye01
aa q1 23 bye bye02
aa q2 23 hi hi01
aa q2 23 bye bye01
bb q1 23 hi hi01
Doing
I am trying to add leading zeros, but I keep getting a type error:
df['count'] = df[0].str.upper() + df[1].str.zfill(2)
Any suggestion is appreciated.
Answer:
You can use:
df['count'] = df['type'] + df.groupby([*df]).cumcount().add(1).astype(str).str.zfill(2)
Output:
id date type count
0 aa q1 23 hi hi01
1 aa q1 23 hi hi02
2 aa q1 23 bye bye01
3 aa q1 23 bye bye02
4 aa q2 23 hi hi01
5 aa q2 23 bye bye01
6 bb q1 23 hi hi01
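For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data and applies the same idea step by step. It assumes that "q1 23" is a single string in the date column, and the intermediate name `position` is only for illustration. It also spells out the grouping columns instead of using `[*df]`.

```python
import pandas as pd

# Sample data from the question; "q1 23" is assumed to be one date string.
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "aa", "aa", "bb"],
    "date": ["q1 23", "q1 23", "q1 23", "q1 23", "q2 23", "q2 23", "q1 23"],
    "type": ["hi", "hi", "bye", "bye", "hi", "bye", "hi"],
})

# Number the rows within each (id, date, type) group, starting at 1.
# "position" is an illustrative name, not part of the original answer.
position = df.groupby(["id", "date", "type"]).cumcount().add(1)

# cumcount() returns integers, so convert to strings before padding:
# the .str accessor (and therefore zfill) only works on string values.
df["count"] = df["type"] + position.astype(str).str.zfill(2)

print(df)
```

Naming the columns explicitly keeps the result stable if the line is run a second time: `[*df]` expands to every column in the frame, so once `count` exists it would become part of the grouping key as well.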