Home > Mobile >  Calculate daily average excluding values
Calculate daily average excluding values

Time:01-19

In Python, I have a dataframe with daily information of different temperature sensors and satellite value, like this:

Date DeviceID Value
01/01/2022 Sensor1 35
01/01/2022 Sensor2 34.33
01/01/2022 Sensor3 22.33
01/01/2022 Satellite 36.45

For each day, I would like to calculate a definitive value based on the sensor mean and the satellite value:

  • Calculte a new average value of the 3 sensors but excluding the sensors that its values are 10% or more under the original average value. For example, in this day the original value average is (35 34.33 22.33)/3 = 30,55 so the new average will exclude Sensor3 and the calculated average will be (35 34.33)/2 = 34.665
  • Compare for each day the calculated average with the Satellite value, if the calculated average value is less than the satellite value, our definitive_value = calculated average and if not, definitive_value = Satellital

The result dataframe would be like this

Date DeviceID Value
01/01/2022 Sensor1 35
01/01/2022 Sensor2 34.33
01/01/2022 Sensor3 22.33
01/01/2022 Satellite 36.45
01/01/2022 Definitive 34.665

Hope you can help me with this

Thanks in advance

CodePudding user response:

Extract your dataframe column as numpy array and do the following:

df_col = np.array([35, 34.33, 22.33])

bool_ = np.ones(df_col.shape, bool)
bool_[np.where(df_col < abs(0.9 * np.mean(df_col)))] = False

print(np.mean(df_col, where=bool_))

CodePudding user response:

Assuming you want to perform everything by Date, You can use filtering and groups:

m = df['DeviceID'].str.startswith('Sensor')
ref = df[m].groupby('Date')['Value'].mean().mul(0.9)

corr_mean = (df.loc[df['Value'].ge(df.loc[m, 'Date'].map(ref)), 'Value']
               .groupby(df['Date']).mean()
            )

out = (pd.concat([df, corr_mean.reset_index(name='Value').assign(DeviceID='Definitive')])
         .sort_values(by='Date', kind='stable', ignore_index=True)
      )

Output (using a more complex input):

         Date    DeviceID   Value
0  01/01/2022     Sensor1  35.000
1  01/01/2022     Sensor2  34.330
2  01/01/2022     Sensor3  22.330
3  01/01/2022   Satellite  36.450
4  01/01/2022  Definitive  34.665
5  02/01/2022     Sensor1  35.000
6  02/01/2022     Sensor2  34.330
7  02/01/2022     Sensor3  22.330
8  02/01/2022   Satellite  36.450
9  02/01/2022  Definitive  34.665

Used input:

         Date   DeviceID  Value
0  01/01/2022    Sensor1  35.00
1  01/01/2022    Sensor2  34.33
2  01/01/2022    Sensor3  22.33
3  01/01/2022  Satellite  36.45
4  02/01/2022    Sensor1  35.00
5  02/01/2022    Sensor2  34.33
6  02/01/2022    Sensor3  22.33
7  02/01/2022  Satellite  36.45
  • Related