Home > Blockchain >  pandas: get the percent of a baseline value while baseline and drug are in the same column
pandas: get the percent of a baseline value while baseline and drug are in the same column

Time:12-13

I have a dataframe of this kind:

import pandas as pd
df = pd.DataFrame({'condition': ['BL', 'washon', 'washoff', 'BL', 'washon', 'washoff'], 
                   'value': [406, 219, 'NaN', 54, 16, 52]},
                  index = ['cell1', 'cell1', 'cell1', 'cell2', 'cell2', 'cell2']) 

And I need to loop over the value column and get the percent of BL for the washon and washoff values for each cell1 and cell2 into a new column, aka

df['value_percent_BL'] = [100, 53, NaN, 100, 29, 96]

Any ideas are appreciated!

CodePudding user response:

First, use real NaN, not strings:

df = df.replace('NaN', float('nan'))

Then get the value of BL per group and divide the value:

df['value_percent_BL'] = df['value']/df['value'].where(df['condition'].eq('BL')).groupby(df.index).transform('max')*100

output:

      condition  value  value_percent_BL
cell1        BL  406.0        100.000000
cell1    washon  219.0         53.940887
cell1   washoff    NaN               NaN
cell2        BL   54.0        100.000000
cell2    washon   16.0         29.629630
cell2   washoff   52.0         96.296296
rounded values:
group = df['value'].where(df['condition'].eq('BL')).groupby(df.index).transform('max')
df['value_percent_BL'] = (df['value']/group*100).round()

output:

      condition  value  value_percent_BL
cell1        BL  406.0             100.0
cell1    washon  219.0              54.0
cell1   washoff    NaN               NaN
cell2        BL   54.0             100.0
cell2    washon   16.0              30.0
cell2   washoff   52.0              96.0

NB. if you really want floored values: use numpy.floor: np.floor(df['value']/group*100)

  • Related