How to check, for each row in a DataFrame, whether all its values are in a specified range?
import pandas as pd
new = pd.DataFrame({'a': [1,2,3], 'b': [-5,-8,-3], 'c': [20,0,0]})
For instance, with the inclusive range <-5, 5>:
>>    a  b   c
>> 0  1 -5  20   # abs(20) > 5, hence no
>> 1  2 -8   0   # abs(-8) > 5, hence no
>> 2  3 -3   0   # abs(-3) <= 5, hence yes
Solution with iteration
print(['no' if any(abs(i) > 5 for i in a) else 'yes' for _, a in new.iterrows()])
>> ['no', 'no', 'yes']
CodePudding user response:
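Compare the whole DataFrame against both bounds, then reduce across each row:
df = new  # the question's DataFrame
# ge/le keep the endpoints in range, matching the question's abs(i) > 5 test
out = (df.ge(-5) & df.le(5)).all(axis=1)
# Or, for a range symmetric about zero, supply a single value:
# out = df.abs().le(5).all(axis=1)
print(out)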
Output:
0 False
1 False
2 True
dtype: bool
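The question's range includes its endpoints (its own test is abs(i) > 5, so a value of exactly -5 or 5 counts as in range). A minimal sketch of the same idea wrapped in a reusable function, assuming inclusive bounds; the helper name rows_in_range and its parameters are hypothetical, not part of pandas:

```python
import pandas as pd

def rows_in_range(frame, low, high):
    """Return a boolean Series: True where every value in the row lies in [low, high]."""
    # ge/le are inclusive comparisons, so the endpoints count as in range
    return (frame.ge(low) & frame.le(high)).all(axis=1)

df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})
mask = rows_in_range(df, -5, 5)
print(mask.tolist())  # [False, False, True]

# Boundary check: a row whose values sit exactly on the endpoints
edge = pd.DataFrame({'a': [5], 'b': [-5], 'c': [0]})
print(rows_in_range(edge, -5, 5).tolist())  # [True]
```

With strict comparisons (gt/lt) the boundary row would come out False, so it is worth deciding up front whether the endpoints belong to the range.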
You could add this as a new column and map the booleans to no/yes if desired (which imo is a terrible idea):
import numpy as np
df['valid'] = np.where(df.abs().le(5).all(axis=1), 'yes', 'no')
print(df)
# Output:
   a  b   c valid
0  1 -5  20    no
1  2 -8   0    no
2  3 -3   0   yes
CodePudding user response:
For a purely numeric DataFrame, you can also do this with NumPy on the underlying array:
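import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})
df_ndarray = df.to_numpy()
# 1 where a value falls outside <-5, 5>, 0 otherwise
bin_mask = np.where((df_ndarray > 5) | (df_ndarray < -5), 1, 0)
# A row is valid when it contains no out-of-range values (sum along the row is 0)
res = bin_mask.sum(axis=1) == 0
print(res)
# [False False  True]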
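The intermediate 0/1 mask is not strictly needed. A shorter sketch of the same NumPy approach, assuming an all-numeric DataFrame and inclusive bounds (the variable names arr, in_range and labels are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})

# Work on the underlying 2-D array (rows x columns)
arr = df.to_numpy()

# Element-wise inclusive range test, then require every column in a row to pass
in_range = ((arr >= -5) & (arr <= 5)).all(axis=1)
print(in_range.tolist())  # [False, False, True]

# Optional: map the booleans to the 'yes'/'no' labels from the question
labels = np.where(in_range, 'yes', 'no')
print(labels.tolist())  # ['no', 'no', 'yes']
```

The reduction runs along axis=1 because the check is per row; reducing along axis=0 would instead report one result per column.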