I have the following simple dataframe.
import pandas as pd
data = [['Cntrl', 0.844961, '201030'],
['Cntrl', 0.815331, '201030'],
['Exp', 1.530297, '201030'],
['Exp', 2.36416, '201030'],
['Cntrl', 1.867359, '201024'],
['Cntrl', 1.274295, '201024'],
['Exp', 0.7058, '201024'],
['Exp', 2.657393, '201024'],]
df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])
df
Group Value Date
0 Cntrl 0.844961 201030
1 Cntrl 0.815331 201030
2 Exp 1.530297 201030
3 Exp 2.364160 201030
4 Cntrl 1.867359 201024
5 Cntrl 1.274295 201024
6 Exp 0.705800 201024
7 Exp 2.657393 201024
I would like to divide the values for each date by the minimum Cntrl
value for that date, producing:
Group Value Date
0 Cntrl 1.0363 201030
1 Cntrl 1.0000 201030
2 Exp 1.8769 201030
3 Exp 2.8996 201030
4 Cntrl 1.4654 201024
5 Cntrl 1.0000 201024
6 Exp 0.5538 201024
7 Exp 2.0853 201024
I gather that I can divide all values for each date by the minimum of all values for that date using:
df['Value'] = df.groupby('Date')['Value'].transform(lambda x: x / x.min())
However, this lumps both groups' values together instead of using only the Cntrl
group to calculate the minimum. Is there a way to nest an additional groupby call inside the lambda function, or to set up the transform differently, to accomplish this?
CodePudding user response:
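Filter to the Cntrl rows, take the per-date minimum with groupby, then use reindex to align that minimum with the original rows: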
df.Value /= df[df['Group'].eq('Cntrl')].groupby('Date')['Value'].min().reindex(df['Date']).values
df
Out[172]:
Group Value Date
0 Cntrl 1.036341 201030
1 Cntrl 1.000000 201030
2 Exp 1.876903 201030
3 Exp 2.899632 201030
4 Cntrl 1.465406 201024
5 Cntrl 1.000000 201024
6 Exp 0.553875 201024
7 Exp 2.085383 201024
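For readers who want the one-liner above unpacked, here is a minimal sketch of the same idea in two steps, plus a variant that stays closer to the transform approach from the question. The names cntrl_min, normalized_map, masked, and normalized_transform are illustrative and not part of the original answer. It assumes every date has at least one Cntrl row; a date without one would come out as NaN.

```python
import pandas as pd

data = [['Cntrl', 0.844961, '201030'],
        ['Cntrl', 0.815331, '201030'],
        ['Exp', 1.530297, '201030'],
        ['Exp', 2.36416, '201030'],
        ['Cntrl', 1.867359, '201024'],
        ['Cntrl', 1.274295, '201024'],
        ['Exp', 0.7058, '201024'],
        ['Exp', 2.657393, '201024']]
df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])

# Step 1: per-date minimum over the Cntrl rows only (a Series indexed by Date).
cntrl_min = df.loc[df['Group'].eq('Cntrl')].groupby('Date')['Value'].min()

# Step 2: look up each row's Date in that Series; the result shares df's index.
normalized_map = df['Value'] / df['Date'].map(cntrl_min)

# Variant: mask non-Cntrl values to NaN, then let transform broadcast the
# per-date minimum back to every row (min skips NaN by default).
masked = df['Value'].where(df['Group'].eq('Cntrl'))
normalized_transform = df['Value'] / masked.groupby(df['Date']).transform('min')

result = df.assign(Value=normalized_map)
print(result)
```

Both variants leave df itself untouched and return the normalized values as a new Series, which makes it easier to compare the result against the original column before overwriting it.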