In Python, I have a dataframe with daily information of different temperature sensors and satellite value, like this:
Date | DeviceID | Value |
---|---|---|
01/01/2022 | Sensor1 | 35 |
01/01/2022 | Sensor2 | 34.33 |
01/01/2022 | Sensor3 | 22.33 |
01/01/2022 | Satellite | 36.45 |
For each day, I would like to calculate a definitive value based on the sensor mean and the satellite value:
- Calculte a new average value of the 3 sensors but excluding the sensors that its values are 10% or more under the original average value. For example, in this day the original value average is (35 34.33 22.33)/3 = 30,55 so the new average will exclude Sensor3 and the calculated average will be (35 34.33)/2 = 34.665
- Compare for each day the calculated average with the Satellite value, if the calculated average value is less than the satellite value, our definitive_value = calculated average and if not, definitive_value = Satellital
The result dataframe would be like this
Date | DeviceID | Value |
---|---|---|
01/01/2022 | Sensor1 | 35 |
01/01/2022 | Sensor2 | 34.33 |
01/01/2022 | Sensor3 | 22.33 |
01/01/2022 | Satellite | 36.45 |
01/01/2022 | Definitive | 34.665 |
Hope you can help me with this
Thanks in advance
CodePudding user response:
Extract your dataframe column as numpy array and do the following:
df_col = np.array([35, 34.33, 22.33])
bool_ = np.ones(df_col.shape, bool)
bool_[np.where(df_col < abs(0.9 * np.mean(df_col)))] = False
print(np.mean(df_col, where=bool_))
CodePudding user response:
Assuming you want to perform everything by Date, You can use filtering and groups:
m = df['DeviceID'].str.startswith('Sensor')
ref = df[m].groupby('Date')['Value'].mean().mul(0.9)
corr_mean = (df.loc[df['Value'].ge(df.loc[m, 'Date'].map(ref)), 'Value']
.groupby(df['Date']).mean()
)
out = (pd.concat([df, corr_mean.reset_index(name='Value').assign(DeviceID='Definitive')])
.sort_values(by='Date', kind='stable', ignore_index=True)
)
Output (using a more complex input):
Date DeviceID Value
0 01/01/2022 Sensor1 35.000
1 01/01/2022 Sensor2 34.330
2 01/01/2022 Sensor3 22.330
3 01/01/2022 Satellite 36.450
4 01/01/2022 Definitive 34.665
5 02/01/2022 Sensor1 35.000
6 02/01/2022 Sensor2 34.330
7 02/01/2022 Sensor3 22.330
8 02/01/2022 Satellite 36.450
9 02/01/2022 Definitive 34.665
Used input:
Date DeviceID Value
0 01/01/2022 Sensor1 35.00
1 01/01/2022 Sensor2 34.33
2 01/01/2022 Sensor3 22.33
3 01/01/2022 Satellite 36.45
4 02/01/2022 Sensor1 35.00
5 02/01/2022 Sensor2 34.33
6 02/01/2022 Sensor3 22.33
7 02/01/2022 Satellite 36.45