Finding when a value in a pandas Series crosses multiple threshold values from another Series

Time:11-30

I have two pandas Series, sensor_values and thresholds. The values in thresholds are in increasing order. How do I identify the points at which the values in sensor_values cross any of the thresholds?

What I'm trying to do is basically what was done in this answer, except that the threshold (called line in that answer) is itself a Series of multiple values. So I want to check against several thresholds rather than a single one: the data is time-varying, and in my use case different thresholds apply depending on the range a given value in sensor_values falls in.

I came up with the following, but it detects only some of the crossings, not all of them:

threshold_cross = (((sensor_values.rta >= thresholds.any()) & (sensor_values.next_rta < thresholds.any())) | ((sensor_values.next_rta > thresholds.any()) & (sensor_values.rta <= thresholds.any())) | (sensor_values.rta == thresholds.any()))

The values in sensor_values may be increasing or decreasing. For now, these thresholds apply only to increasing values. A different set of thresholds would apply to decreasing values, but I should be able to handle that myself once I have it working for one set of thresholds. So for now, assume that both sensor_values and thresholds are monotonically increasing.

Edit: sensor_values is a DataFrame containing two columns, rta and next_rta, the latter being the next value of rta (shifted by one position, as in the linked answer).

Answer:

You could use pd.cut to bin the sensor values into the intervals between thresholds, and then compare each value's interval with that of the previous value:

import numpy as np
import pandas as pd

sensors = pd.Series([0.1, 1.3, 2.1, 1.7, 2.6, 3.8, 4.1, 5.1, 4.4, 3.2, 1.6, 7.2])
thresholds = pd.Series([2, 5, 7])

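# Pad the thresholds with -inf/+inf so every sensor value falls into some bin
bins = pd.concat([pd.Series(-np.inf), thresholds, pd.Series(np.inf)])
binned = pd.cut(sensors, bins)

# A crossing occurs wherever the bin differs from the previous element's bin
crossings = binned != binned.shift()
crossings.iloc[0] = False  # Don't count the first element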

Output:

>>> crossings.tolist()
[False, False, True, True, True, False, False, True, True, False, True, True]
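For the layout described in the question (a DataFrame with rta and next_rta columns), the same idea could be sketched by binning both columns with identical bins and comparing the bin indices. This is only an illustration: the sample values and the crosses_next column name are made up, and it assumes next_rta is rta shifted by one position.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data in the question's layout
rta = pd.Series([0.1, 1.3, 2.1, 4.9, 5.3, 7.5])
sensor_values = pd.DataFrame({"rta": rta, "next_rta": rta.shift(-1)})
thresholds = pd.Series([2, 5, 7])

bins = [-np.inf, *thresholds, np.inf]

# labels=False returns the integer index of the bin each value falls in
current_bin = pd.cut(sensor_values["rta"], bins, labels=False)
next_bin = pd.cut(sensor_values["next_rta"], bins, labels=False)

# A crossing happens where the next reading lands in a different bin.
# The last row has no next reading (NaN), so it is excluded explicitly.
sensor_values["crosses_next"] = next_bin.notna() & (current_bin != next_bin)
```

Here the flag sits on the row before the crossing (current versus next), whereas the shift-based version above flags the row after it (current versus previous).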

Important Note:

This approach becomes ambiguous when a sensor reading is exactly equal to one of the thresholds. pd.cut treats the bins as right-closed, so a reading equal to a threshold falls into the lower interval; whether it is flagged as a crossing then depends on which interval the previous reading was in. If that can happen in your data, you'll need additional code to handle it.
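One possible way to make the tie-breaking explicit is to swap pd.cut for np.searchsorted, whose side argument decides whether a reading equal to a threshold counts as having reached it. The following is a rough sketch with made-up sample data, not part of the approach above; the names level and step are chosen here for illustration, and it assumes thresholds is sorted in increasing order.

```python
import numpy as np
import pandas as pd

# Hypothetical readings, including values exactly on a threshold (2.0 and 5.0)
sensors = pd.Series([0.1, 1.3, 2.0, 1.7, 2.6, 5.0, 4.1, 7.2])
thresholds = pd.Series([2, 5, 7])  # must be sorted in increasing order

# Number of thresholds each reading has reached.
# side="right" counts a reading equal to a threshold as having reached it;
# side="left" would treat it as still below, like pd.cut's right-closed bins.
level = pd.Series(
    np.searchsorted(thresholds.to_numpy(), sensors.to_numpy(), side="right"),
    index=sensors.index,
)

# Positive steps are upward crossings, negative steps are downward ones;
# the magnitude is the number of thresholds crossed in that step.
step = level.diff().fillna(0).astype(int)
crossings = step != 0
```

With this data, level comes out as [0, 0, 1, 0, 1, 2, 1, 3], so the readings at exactly 2.0 and 5.0 are treated as upward crossings. The sign of step also gives the direction, which may help once a separate set of thresholds is needed for decreasing values.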
