I can't find the min value (which is > 0) in each row in selected columns: df[df[col]>0]

This is my data, and I want to find the min value of the selected columns (a, b, c, d) in each row, then calculate the difference between that and dd. I need to ignore 0 in the rows; I mean, in the first row I need to find 8.

CodePudding user response：

You can use DataFrame.apply with axis=1: for each row, take the columns ['a','b','c','d'] as a Series, replace 0 with inf, and find the min. At the end, subtract column 'dd' from that min.

import numpy as np
import pandas as pd

# Zeros become inf so they can never be the row minimum; then subtract 'dd'
df['min_dd'] = df.apply(lambda row: row[['a','b','c','d']].replace(0, np.inf).min() - row['dd'], axis=1)
print(df)

   a   b  c  d  dd  min_dd
0  0  15  0  8   6     2.0   # min_without_zero : 8 , dd : 6 -> 8-6=2 
1  2   0  5  3   2     0.0   # min_without_zero : 2 , dd : 2 -> 2-2=0
2  5   3  3  0   2     1.0   # 3 - 2
3  0   2  3  4   2     0.0   # 2 - 2

CodePudding user response：

"need to ignore 0 in rows"

Then just replace it with NaN, which min skips by default. Consider the following simple example:

import numpy as np
import pandas as pd
df = pd.DataFrame({"A":[1,2,0],"B":[3,5,7],"C":[7,0,7]})
# Replace 0 with NaN, then take the row-wise min (NaN is skipped)
df["minvalue"] = df.replace(0,np.nan).apply("min",axis=1)
print(df)

This gives the output:

   A  B  C  minvalue
0  1  3  7       1.0
1  2  5  0       2.0
2  0  7  7       7.0
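
Applied to the question's columns, the same idea might look like the sketch below. The frame is retyped by hand from the output printed in the first answer, so the values are an assumption about the asker's data, and min_dd is simply the column name reused from that answer.

```python
import numpy as np
import pandas as pd

# Sample data retyped from the first answer's printed output
# (an assumption; the asker's original data is not shown in the thread)
df = pd.DataFrame({
    "a": [0, 2, 5, 0],
    "b": [15, 0, 3, 2],
    "c": [0, 5, 3, 3],
    "d": [8, 3, 0, 4],
    "dd": [6, 2, 2, 2],
})

cols = ["a", "b", "c", "d"]

# Zeros become NaN, and DataFrame.min skips NaN by default (skipna=True)
row_min = df[cols].replace(0, np.nan).min(axis=1)

# Difference between the non-zero row minimum and 'dd'
df["min_dd"] = row_min - df["dd"]
print(df)
```

Restricting the replacement to df[cols] keeps a zero in dd from being turned into NaN as well.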

CodePudding user response：

You can try masking out the zeros with ne(0) and taking the row-wise min:

cols = ['a','b','c','d']

df['res'] = df[cols][df[cols].ne(0)].min(axis=1) - df['dd']

print(df)

   a   b  c  d  dd  res
0  0  15  0  8   6  2.0
1  2   0  5  3   2  0.0
2  5   3  3  0   2  1.0
3  2   3  4  4   2  0.0
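
The title asks for values greater than 0 rather than just non-zero ones, so a variant with a strict > 0 mask may be closer to what is wanted if negative values can occur. The following is a minimal sketch on hypothetical data (not the asker's): row 1 contains a negative value, and row 2 has no positive value at all, in which case the result is NaN.

```python
import pandas as pd

# Hypothetical data chosen to show the edge cases
df = pd.DataFrame({
    "a": [0, -1, 0],
    "b": [15, 4, 0],
    "c": [0, 5, 0],
    "d": [8, 3, 0],
    "dd": [6, 2, 2],
})

cols = ["a", "b", "c", "d"]

# Keep only strictly positive entries; zeros and negatives become NaN
positive = df[cols].where(df[cols] > 0)

# Row-wise min skips NaN; a row with no positive value yields NaN
df["res"] = positive.min(axis=1) - df["dd"]
print(df)
```

Whether an all-zero row should stay NaN or be filled with some default is a decision for the caller; fillna on the result is one option.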