Calculating tolerance

Time:01-19

I am working with a data set whose values have different decimal places. The data and code are shown below:

import numpy as np
import pandas as pd

data = {
    'value': [9.1, 10.5, 11.8,
              20.1, 21.2, 22.8,
              9.5, 10.3, 11.9],
}

df = pd.DataFrame(data, columns=['value'])

Which gives the following dataframe:

   value
0    9.1
1   10.5
2   11.8
3   20.1
4   21.2
5   22.8
6    9.5
7   10.3
8   11.9

Now I want to add a new column named adjusted. I want to calculate this column with the numpy.isclose function, using a tolerance of 2 (plus or minus 1). In the end I expect results like those in the following table:

   value  adjusted
0    9.1        10
1   10.5        10
2   11.8        10
3   20.1        21
4   21.2        21
5   22.8        21
6    9.5        10
7   10.3        10
8   11.9        10

I tried the line below, but it only returns True and False, and it only checks against a single value (10) rather than all of them.

np.isclose(df['value'], 10, atol=2)

Can anybody help me solve this problem and apply the tolerance for the values 10 and 21 in one line?

CodePudding user response:

For only two distinct values, one possible solution is to use np.where:

df['adjusted'] = np.where((df['value'] >= 8) & (df['value'] <= 12), 10, 21)
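As a minimal, self-contained sketch of the same idea (using Series.between in place of the two comparisons, and a made-up value of 50.0 purely for illustration), note that anything outside the 8 to 12 range is labeled 21, however far it is from 21:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [9.1, 10.5, 11.8,
                             20.1, 21.2, 22.8,
                             9.5, 10.3, 11.9]})

# between() is inclusive on both ends by default, so this is 8 <= value <= 12.
in_range = df['value'].between(8, 12)

# Values in range become 10; everything else falls through to 21.
df['adjusted'] = np.where(in_range, 10, 21)
print(df)

# Illustration only: 50.0 is not in the original data, yet it is labeled 21.
far_away = np.where(pd.Series([50.0]).between(8, 12), 10, 21)
print(far_away)
```

This is fine when the data are known to fall into exactly two bands, but it does not check the second band at all.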

CodePudding user response:

The exact logic and how this would generalize are not fully clear. Below are two options.

Assuming you want to test your values against a list of defined references, you can use the underlying numpy array and broadcasting:

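# reference values to test against
vals = np.array([10, 21])
a = df['value'].to_numpy()
# compare every value (rows) with every reference (columns)
m = np.isclose(a[:, None], vals, atol=2)
# take the first matching reference per row, or NaN if none matches
df['adjusted'] = np.where(m.any(axis=1), vals[m.argmax(axis=1)], np.nan)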
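To make the broadcasting step easier to follow, here is a hedged, runnable sketch that names the intermediate arrays. The extra value 15.0 and the names matched and nearest are additions for illustration and are not part of the original data or answer; 15.0 is there to show what happens when no reference is within tolerance.

```python
import numpy as np
import pandas as pd

# Original data plus one illustrative value (15.0) that matches no reference.
df = pd.DataFrame({'value': [9.1, 10.5, 11.8,
                             20.1, 21.2, 22.8,
                             9.5, 10.3, 11.9,
                             15.0]})

vals = np.array([10, 21])
a = df['value'].to_numpy()

# a[:, None] has shape (10, 1); broadcasting against vals (shape (2,))
# yields a boolean matrix of shape (10, 2): one row per value,
# one column per reference.
m = np.isclose(a[:, None], vals, atol=2)

matched = m.any(axis=1)            # True if at least one reference is close
nearest = vals[m.argmax(axis=1)]   # first reference whose column is True

# Rows without a match get NaN, which makes the column float.
df['adjusted'] = np.where(matched, nearest, np.nan)
print(df)
```

np.isclose also applies its default relative tolerance (rtol=1e-05) on top of atol, so the effective window is very slightly wider than 2. If two references were both within tolerance of a value, argmax would pick the first one in vals, not necessarily the closest.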

Assuming you want to group successive values, you can take the difference between consecutive rows and start a new group whenever its absolute value exceeds the threshold (2 here). Then round the values and take the median per group with groupby.transform:

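# a jump larger than 2 between consecutive rows starts a new group
group = df['value'].diff().abs().gt(2).cumsum()
# round each value, then broadcast the per-group median back to the rows
df['adjusted'] = df['value'].round().groupby(group).transform('median')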

Output:

   value  adjusted
0    9.1      10.0
1   10.5      10.0
2   11.8      10.0
3   20.1      21.0
4   21.2      21.0
5   22.8      21.0
6    9.5      10.0
7   10.3      10.0
8   11.9      10.0
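For the grouping variant, the following sketch splits the one-liners into named steps so the group labels can be inspected. The intermediate names jump and rounded are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'value': [9.1, 10.5, 11.8,
                             20.1, 21.2, 22.8,
                             9.5, 10.3, 11.9]})

# True wherever the absolute change from the previous row exceeds 2.
# The first row's diff is NaN, which compares as False.
jump = df['value'].diff().abs().gt(2)

# Each True increments the counter, so rows between jumps share a label.
group = jump.cumsum()

# Round to whole numbers, then assign each row its group's median.
rounded = df['value'].round()
df['adjusted'] = rounded.groupby(group).transform('median')

print(group.tolist())
print(df)
```

Here the adjusted value comes from the data in each group rather than from a fixed reference list, so a different set of values in a group could produce a different result. Series.round rounds halves to the nearest even number, which is why 10.5 becomes 10 and 9.5 becomes 10 in this data.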