How can I group by below table from Customer ID and Product Code and get them to one row as below using Python?
Customer ID | Product Code | Days since the last transaction |
---|---|---|
A | 1 | 10 |
A | 1 | 23 |
A | 1 | 7 |
A | 2 | 8 |
A | 2 | 9 |
A | 3 | 6 |
B | 1 | 18 |
B | 2 | 4 |
B | 3 | 4 |
B | 3 | 12 |
C | 2 | 27 |
C | 2 | 15 |
Need to get below table by grouping them by Customer ID and Product Code.
Customer ID | Product Code | D1 | D2 | D3 |
---|---|---|---|---|
A | 1 | 10 | 23 | 7 |
A | 2 | 8 | 9 | N/A |
A | 3 | 6 | N/A | N/A |
B | 1 | 18 | N/A | N/A |
B | 2 | 4 | N/A | N/A |
B | 3 | 4 | 12 | N/A |
C | 2 | 27 | 15 | N/A |
CodePudding user response:
df[''] = df.groupby(['Customer ID', 'Product Code']).cumcount()
df = df.pivot(index=['Customer ID', 'Product Code'], columns='')
print(df)
Output:
Days since the last transaction
0 1 2
Customer ID Product Code
A 1 10.0 23.0 7.0
2 8.0 9.0 NaN
3 6.0 NaN NaN
B 1 18.0 NaN NaN
2 4.0 NaN NaN
3 4.0 12.0 NaN
C 2 27.0 15.0 NaN
CodePudding user response:
Below python code also worked for me.
#keep only the needed data
grouped = df.groupby(['Customer_ID','Product Code'], as_index=False).agg({"Days since the last transaction": lambda x: x.tolist()[:3] [x.iat[-1]]}).explode("Days since the last transaction")
#get the count for the age columns
grouped["idx"] = grouped.groupby(['Customer_ID','Product Code']).cumcount().add(1)
#pivot to get the required structure
output = grouped.pivot(["Customer_ID","Product Code"],"idx","Days since the last transaction").add_prefix("Days since the last transaction").reset_index().rename_axis(None, axis=1)
output.head()