I have a data frame with three columns, and I want to check whether each row follows a logical sequence.
code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'low':[10,15,np.nan],'medium':[12,18,29],'high':[16,19,np.nan]})
df =
low medium high
0 10.0 12 16.0
1 15.0 18 19.0
2 NaN 29 NaN
# check if low<medium<high
df['check'] = (df['low']<df['medium'])&(df['medium']<df['high'])
print("All rows pass: %s"%(df['check'].all()))
Present output:
df['check']=
True #correct
True # correct
False # wrong output here, it should not consider this
Basically, any comparison with NaN evaluates to False, so rows with missing values are reported as failing the check. I want those rows to be skipped instead. How can I do that?
CodePudding user response:
You can mask the rows that contain NaN. Also, instead of a chained condition, you can use between:
df['check'] = df['medium'].between(df['low'], df['high'], inclusive='neither').mask(df[['low','high']].isna().any(axis=1))
Output:
low medium high check
0 10.0 12 16.0 True
1 15.0 18 19.0 True
2 NaN 29 NaN NaN
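If you also need a single overall pass/fail result, as in the question's print statement, here is a minimal sketch of one way to do it. It assumes pandas 1.3 or later (for inclusive='neither'), and the names missing, check, and all_ok are only illustrative. The masked rows are dropped before aggregating, so only rows that could actually be compared count:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'low': [10, 15, np.nan],
    'medium': [12, 18, 29],
    'high': [16, 19, np.nan],
})

# Rows where either bound is missing cannot be checked.
missing = df[['low', 'high']].isna().any(axis=1)

# Strict low < medium < high; rows that cannot be checked become NaN.
check = df['medium'].between(df['low'], df['high'], inclusive='neither').mask(missing)

# Drop the masked rows, then aggregate only the rows that were compared.
all_ok = bool(check.dropna().astype(bool).all())

print(check.tolist())
print("All comparable rows pass:", all_ok)
```

Note that masking turns the column into object dtype (True/False/NaN), which is why the sketch casts back to bool after dropna().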