Home > Enterprise >  Group by Date and count of unique users as column name
Group by Date and count of unique users as column name

Time:07-02

Dataframe

d = {'logdate': ['2022-06-20' , '2022-06-21' , '2022-06-22' , '2022-06-23' , '2022-06-24' , '2022-06-25' , '2022-06-26' , '2022-06-27' , '2022-06-28' , '2022-06-29' , '2022-06-30' , '2022-07-01' , '2022-06-20' , '2022-06-21' , '2022-06-22' , '2022-06-23' , '2022-06-24' , '2022-06-25' , '2022-06-26' , '2022-06-27' , '2022-06-28' , '2022-06-29' , '2022-06-30' , '2022-07-01'], 'Username': [ 'UserA' , 'UserA' , 'UserB' , 'UserF' , 'UserA' , 'UserA' , 'UserC' , 'UserA' , 'UserA' , 'UserE' , 'UserA' , 'UserD' , 'UserA' , 'UserE' , 'UserB' , 'UserA' , 'UserF' , 'UserE' , 'UserA' , 'UserA' , 'UserB' , 'UserB' , 'UserA' , 'UserD']}
df = pd.DataFrame(data=d)
df

Output:

| logdate    | Username |
|------------|----------|
| 2022-06-20 | UserA    |
| 2022-06-21 | UserA    |
| 2022-06-22 | UserB    |
| 2022-06-23 | UserF    |
| 2022-06-24 | UserA    |
| 2022-06-25 | UserA    |

I want to achieve the following, group by date and count of each users as individual columns

Expected:

| logdate    | UserA | UserB | UserC | UserD | UserE | UserF |
|------------|-------|-------|-------|-------|-------|-------|
| 2022-06-20 | 2     | 0     | 0     | 0     | 0     | 0     |
| 2022-06-21 | 1     | 0     | 0     | 0     | 1     | 0     |
| 2022-06-22 | 0     | 2     | 0     | 0     | 0     | 0     |
| 2022-06-23 | 1     | 0     | 0     | 0     | 0     | 1     |

All I could try was to group by logdate and count Username

df.groupby('logdate')[['Username']].count()

Any suggestions would be helpful, Thanks.

CodePudding user response:

Try with crosstab

out = pd.crosstab(df['logdate'],df['Username']).reset_index()
  • Related