How to Check Whether the Values in 2 Columns Are Approximately Equal ~ Pandas

I have a df in which I'm trying to compare 2 columns. If they have around the same value in the same row, I want that row to be dropped from the df. For example:

      A     B
1  3.21  3.15
2  6.98  2.07
3  5.41  8.95
4  0.32  0.30

I would want only rows 2 and 3 to remain in the df, because in rows 1 and 4 A and B are similar to each other.

I've tried something like "if the value in column A is within a range (+/- 15% of the value of B in that row), remove that row", but it didn't work. I didn't know whether pandas has some sort of built-in function for that.

CodePudding user response：

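You could do this by passing the rtol parameter to numpy.isclose. Note that the check is relative to the second argument: np.isclose(a, b) tests |a - b| <= atol + rtol * |b|, so here the 15% tolerance is taken from column B: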

import numpy as np

# keep only the rows where A is NOT within 15% of B
result = df[~np.isclose(df.A, df.B, atol=0, rtol=0.15)]
#       A     B
# 2  6.98  2.07
# 3  5.41  8.95
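
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question; the names `close` and `result` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, with the same 1-based index
df = pd.DataFrame(
    {"A": [3.21, 6.98, 5.41, 0.32], "B": [3.15, 2.07, 8.95, 0.30]},
    index=[1, 2, 3, 4],
)

# np.isclose(a, b) tests |a - b| <= atol + rtol * |b|,
# so with atol=0 the tolerance is 15% of the value in column B
close = np.isclose(df["A"], df["B"], rtol=0.15, atol=0)

# Keep only the rows where A and B are NOT close
result = df[~close]
print(result)
```

Setting atol=0 matters here: the default atol of 1e-08 is harmless for this data, but a larger absolute tolerance would make small values such as 0.32 and 0.30 match regardless of the percentage.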

CodePudding user response：

You could define lower and upper bounds on the permissible values

lower = df["A"]*0.85
upper = df["A"]*1.15

and then filter using pandas.Series.between

df[~df["B"].between(lower, upper)]
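
A runnable sketch of the same idea, assuming the sample data from the question. Two things to keep in mind: here the bounds are computed from column A rather than B, and for negative values of A the "lower" bound A*0.85 ends up above the "upper" bound A*1.15, so between would never match. The second variant below, based on the absolute difference, is one possible way around that; the names `tol`, `similar` and `result_abs` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"A": [3.21, 6.98, 5.41, 0.32], "B": [3.15, 2.07, 8.95, 0.30]},
    index=[1, 2, 3, 4],
)

tol = 0.15  # assumed tolerance: 15% of the value in column A

# Variant 1: explicit bounds; between() is inclusive on both ends by default
lower = df["A"] * (1 - tol)
upper = df["A"] * (1 + tol)
result = df[~df["B"].between(lower, upper)]

# Variant 2: absolute difference, which also behaves sensibly for negative A
similar = (df["B"] - df["A"]).abs() <= tol * df["A"].abs()
result_abs = df[~similar]

print(result)
print(result_abs)
```

For all-positive data like the example, both variants keep the same rows.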