df = pd.DataFrame({'id': ['id1', 'id1','id1', 'id2','id1','id1','id1'],
'activity':['swimming','running','jogging','walking','walking','walking','walking'],
'month':[2,3,4,3,4,4,3]})
pd.crosstab(df['id'], df['activity'])
I'd like to add another column for month in the output to get counts per user within each month for the respective activity.
df.set_index(['id','month'])['activity'].unstack().reset_index()
I get error.
edit: Expected output in the image. I do not know how to create a table.
CodePudding user response:
You can pass a list of columns to pd.crosstab
:
x = pd.crosstab([df["id"], df["month"]], df["activity"]).reset_index()
x.columns.name = None
print(x)
Prints:
id month jogging running swimming walking
0 id1 2 0 0 1 0
1 id1 3 0 1 0 1
2 id1 4 1 0 0 2
3 id2 3 0 0 0 1