I have the following dataframe
I am trying to divide the 'centroid velocity' per group, where each group is specified by a given ['Genotype', 'Fly'] pair, by the average 'centroid velocity' in the ['Genotype', 'Fly', 'Surface'='smooth'] subgroup.
So this is what I want to divide
fly_avg_vel = max_vel_centroid.groupby(['Genotype','Fly'])
and this is the divisor
fly_avg_vel = max_vel_centroid.groupby(['Genotype','Fly','Surface']).mean()
fly_avg_vel.loc[(slice(None),slice(None),'smooth'),:]
Could you please help?
CodePudding user response:
IIUC:
mean = (
df.loc[df['Surface'] == 'smooth']
.groupby(['Genotype', 'Fly'])['centroid_velocity']
.mean().reindex(df[['Genotype', 'Fly']])
)
df['mean'] = df['centroid_velocity'] / mean.values
print(df)
# Output
Genotype Fly Surface centroid_velocity mean
0 SN16_TNT Fly2 smooth 27.493325 1.018077
1 SN1_TNT Fly3 smooth 26.220000 0.949446
2 SN16_TNT Fly1 rough 32.987707 0.960055
3 SN1_TNT Fly1 smooth 36.936724 1.000000
4 SN1_TNT Fly2 smooth 32.023062 1.000000
5 SN16_TNT Fly2 rough 34.133866 1.263976
6 SN16_TNT Fly2 rough 27.382251 1.013964
7 SN1_TNT Fly1 rough 30.802885 0.833937
8 SN1_TNT Fly1 rough 40.623492 1.099813
9 SN16_TNT Fly1 smooth 32.401770 0.943002
10 SN16_TNT Fly1 smooth 36.318703 1.056998
11 SN16_TNT Fly3 smooth 29.397323 0.944683
12 SN16_TNT Fly2 smooth 26.516987 0.981923
13 SN1_TNT Fly3 smooth 24.421761 0.884330
14 SN16_TNT Fly3 rough 40.352241 1.296719
15 SN1_TNT Fly3 smooth 27.658973 1.001552
16 SN1_TNT Fly1 rough 30.818247 0.834352
17 SN1_TNT Fly3 smooth 32.163673 1.164671
18 SN16_TNT Fly3 rough 37.384199 1.201341
19 SN16_TNT Fly3 smooth 32.840134 1.055317
Setup a MRE
import pandas as pd
import numpy as np
rng = np.random.default_rng(2022)
N = 20
data = {'Genotype': rng.choice(['SN1_TNT', 'SN16_TNT'], N),
'Fly': rng.choice(['Fly1', 'Fly2', 'Fly3'], N),
'Surface': rng.choice(['smooth', 'rough'], N),
'centroid_velocity': rng.normal(2, 4, N) 30}
df = pd.DataFrame(data)