Given the example table:
import pandas as pd
df = pd.DataFrame({'A':[8,4,8,4,9],'Ap':[0.001,0.06,0.001,0.1,0.002],'B':[7,3,9,3,6],
'Bp':[0.005,0.006,0.01,0.007,0.06],'C':[4,1,4,8,9],
'Cp':[0.004,0.008,0.2,0.006,0.00001]}, index=['x','y','z','zz','yz'])
That looks like this:
A Ap B Bp C Cp
x 8 0.001 7 0.005 4 0.00400
y 4 0.060 3 0.006 1 0.00800
z 8 0.001 9 0.010 4 0.20000
zz 4 0.100 3 0.007 8 0.00600
yz 9 0.002 6 0.060 9 0.00001
I'd like to keep/record, for each row, the lowest value among the columns (A, B, C):
new = pd.DataFrame()
new['Minimum'] = df[[df.columns[0],df.columns[2],df.columns[4]]].min(axis=1)
The result looks like this:
Minimum
x 4
y 1
z 4
zz 3
yz 6
But I'd also like to record the pval associated with the minimum value kept (Ap, Bp, Cp) and I'm unsure how to accomplish that.
So for example the final result should look like this
Minimum pVal
x 4 0.004
y 1 0.008
z 4 0.200
zz 3 0.007
yz 6 0.060
CodePudding user response:
Let's use idxmin to get the column names corresponding to the min values, then use advanced indexing with NumPy to pull out the min values and their associated p-values.
c = ['A', 'B', 'C']
x, y = range(len(df)), df[c].idxmin(1)
df['min'] = df.values[x, df.columns.get_indexer_for(y)]
df['pVal'] = df.values[x, df.columns.get_indexer_for(y + 'p')]
Result
A Ap B Bp C Cp min pVal
x 8 0.001 7 0.005 4 0.00400 4.0 0.004
y 4 0.060 3 0.006 1 0.00800 1.0 0.008
z 8 0.001 9 0.010 4 0.20000 4.0 0.200
zz 4 0.100 3 0.007 8 0.00600 3.0 0.007
yz 9 0.002 6 0.060 9 0.00001 6.0 0.060
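For reference, the steps above can be put together into a self-contained sketch that builds only the two columns asked for in the question. The names value_cols, rows, min_cols, values and new are illustrative, and the sketch assumes each p-value column is named by appending 'p' to its value column, as in the example table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [8, 4, 8, 4, 9], 'Ap': [0.001, 0.06, 0.001, 0.1, 0.002],
                   'B': [7, 3, 9, 3, 6], 'Bp': [0.005, 0.006, 0.01, 0.007, 0.06],
                   'C': [4, 1, 4, 8, 9], 'Cp': [0.004, 0.008, 0.2, 0.006, 0.00001]},
                  index=['x', 'y', 'z', 'zz', 'yz'])

value_cols = ['A', 'B', 'C']                  # columns to take the minimum over
rows = np.arange(len(df))                     # one row position per row
min_cols = df[value_cols].idxmin(axis=1)      # name of the min column per row

# Mixed int/float columns become a single float array here,
# so the minimum is cast back to int afterwards.
values = df.to_numpy()

new = pd.DataFrame({
    'Minimum': values[rows, df.columns.get_indexer_for(min_cols)].astype(int),
    'pVal': values[rows, df.columns.get_indexer_for(min_cols + 'p')],
}, index=df.index)

print(new)
```

Note that when a row has a tie for the minimum, idxmin picks the first matching column, so the p-value reported is the one belonging to that first column.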
Some details
idxmin(1): returns the name of the column holding the min value in each row.
df.columns.get_indexer_for: returns the zero-based integer positions of the given column names, which can then be used to access the corresponding columns.
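To make these two calls concrete, here is a small sketch of the intermediate values for the example table. The variable names min_cols, value_pos and pval_pos are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [8, 4, 8, 4, 9], 'Ap': [0.001, 0.06, 0.001, 0.1, 0.002],
                   'B': [7, 3, 9, 3, 6], 'Bp': [0.005, 0.006, 0.01, 0.007, 0.06],
                   'C': [4, 1, 4, 8, 9], 'Cp': [0.004, 0.008, 0.2, 0.006, 0.00001]},
                  index=['x', 'y', 'z', 'zz', 'yz'])

# Column name of the smallest value among A, B, C for each row
min_cols = df[['A', 'B', 'C']].idxmin(axis=1)
print(min_cols.tolist())        # ['C', 'C', 'C', 'B', 'B']

# Zero-based positions of those columns within df.columns
value_pos = df.columns.get_indexer_for(min_cols)
print(value_pos.tolist())       # [4, 4, 4, 2, 2]

# Appending 'p' to each name gives the matching p-value columns
pval_pos = df.columns.get_indexer_for(min_cols + 'p')
print(pval_pos.tolist())        # [5, 5, 5, 3, 3]
```

Pairing these positions with the row positions 0..4 is what lets the NumPy indexing pick exactly one cell per row.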