I have two columns, one with an ID and the orhter with transaction dates.
ID | Date |
---|---|
A | 2019-04-30 |
A | 2020-12-15 |
B | 2019-08-24 |
B | 2019-09-29 |
B | 2019-10-14 |
And so on...
I want to create a column that orders the date by ID.
ID | Date | Type |
---|---|---|
A | 2019-04-30 | Order_1 |
A | 2020-12-15 | Order_2 |
B | 2019-08-24 | Order_1 |
B | 2019-09-29 | Order_2 |
B | 2019-10-14 | Order_3 |
Thanks!
CodePudding user response:
Use GroupBy.cumcount
for counter with add 1
with cast to strings, so possible add Order_
:
df['Type'] = 'Order_' df.groupby('ID').cumcount().add(1).astype(str)
df['diff'] = df.groupby('ID')['Date'].diff().dt.days
I got a problem, when I pivot the table, we orders the date alphabetically (Order_1, Order_10, Order_100,..) and not Order_1, Order_2. Because it is a string. Is there a way around this?
Convert values to numeric and after pivoting use DataFrame.add_prefix
:
df['Type'] = df.groupby('ID').cumcount().add(1)
df1 = df.pivot('ID','Type','diff').add_prefix('Order_')