I'm sure there is a super simple answer to this...
I have a dataframe like the one below:
The simulation is run 3 times for 4 time steps t. At each time step, a task (0,1,2) is chosen.
I want to find the percentage of times that task 1 is chosen at each time step t, averaged over the 3 simulations. I'm sure it's some sort of simple groupby().mean(), but I can't seem to get it. Any help would be appreciated!
t | simulation | chosen_task |
---|---|---|
0 | 0 | 1 |
1 | 0 | 2 |
2 | 0 | 0 |
3 | 0 | 1 |
0 | 1 | 0 |
1 | 1 | 1 |
2 | 1 | 1 |
3 | 1 | 1 |
0 | 2 | 0 |
1 | 2 | 1 |
2 | 2 | 2 |
3 | 2 | 0 |
CodePudding user response:
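You can use pd.crosstab with normalize='index' to compute, for each time step t, the fraction of simulations in which each task was chosen, and then select column 1 to get the share for task 1: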
pd.crosstab(df['t'], df['chosen_task'], normalize='index')[1]
t
0 0.333333
1 0.666667
2 0.333333
3 0.666667
Name: 1, dtype: float64
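Since the question mentions groupby().mean(), here is a minimal sketch of that route as an alternative, assuming the dataframe is called df and has the columns shown in the question. The variable names share_task_1 and by_crosstab are just illustrative. The idea is that the mean of a boolean column (chosen_task == 1) is the fraction of True values, so grouping that boolean by t gives the same numbers as the crosstab above.

```python
import pandas as pd

# Rebuild the example data from the question
df = pd.DataFrame({
    "t": [0, 1, 2, 3] * 3,
    "simulation": [0] * 4 + [1] * 4 + [2] * 4,
    "chosen_task": [1, 2, 0, 1, 0, 1, 1, 1, 0, 1, 2, 0],
})

# True where task 1 was chosen; the mean of booleans per time step
# is the fraction of simulations that chose task 1 at that step
share_task_1 = df["chosen_task"].eq(1).groupby(df["t"]).mean()

# Same quantity via the crosstab approach, for comparison
by_crosstab = pd.crosstab(df["t"], df["chosen_task"], normalize="index")[1]

print(share_task_1)
```

Multiply the result by 100 if you want percentages rather than fractions. Note that selecting [1] from the crosstab raises a KeyError if task 1 never appears in the data, whereas the boolean mean simply returns 0 for every time step.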