Dataframe
d = {'logdate': ['2022-06-20' , '2022-06-21' , '2022-06-22' , '2022-06-23' , '2022-06-24' , '2022-06-25' , '2022-06-26' , '2022-06-27' , '2022-06-28' , '2022-06-29' , '2022-06-30' , '2022-07-01' , '2022-06-20' , '2022-06-21' , '2022-06-22' , '2022-06-23' , '2022-06-24' , '2022-06-25' , '2022-06-26' , '2022-06-27' , '2022-06-28' , '2022-06-29' , '2022-06-30' , '2022-07-01'], 'Username': [ 'UserA' , 'UserA' , 'UserB' , 'UserF' , 'UserA' , 'UserA' , 'UserC' , 'UserA' , 'UserA' , 'UserE' , 'UserA' , 'UserD' , 'UserA' , 'UserE' , 'UserB' , 'UserA' , 'UserF' , 'UserE' , 'UserA' , 'UserA' , 'UserB' , 'UserB' , 'UserA' , 'UserD']}
df = pd.DataFrame(data=d)
df
Output:
| logdate | Username |
|------------|----------|
| 2022-06-20 | UserA |
| 2022-06-21 | UserA |
| 2022-06-22 | UserB |
| 2022-06-23 | UserF |
| 2022-06-24 | UserA |
| 2022-06-25 | UserA |
I want to achieve the following, group by date and count of each users as individual columns
Expected:
| logdate | UserA | UserB | UserC | UserD | UserE | UserF |
|------------|-------|-------|-------|-------|-------|-------|
| 2022-06-20 | 2 | 0 | 0 | 0 | 0 | 0 |
| 2022-06-21 | 1 | 0 | 0 | 0 | 1 | 0 |
| 2022-06-22 | 0 | 2 | 0 | 0 | 0 | 0 |
| 2022-06-23 | 1 | 0 | 0 | 0 | 0 | 1 |
All I could try was to group by logdate and count Username
df.groupby('logdate')[['Username']].count()
Any suggestions would be helpful, Thanks.
CodePudding user response:
Try with crosstab
out = pd.crosstab(df['logdate'],df['Username']).reset_index()