Pandas get max delta in a timeseries for a specified period

Time:03-17

Given a dataframe with an irregular time series as its index, I'd like to find the maximum delta (max minus min of the values) within any 10-second period. Here is some code that does this with a plain loop:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
xs = np.cumsum(np.random.rand(200))
# This function creates a general situation where the max is not always at the end or the beginning
ys = xs**1.2 + 10 * np.sin(xs)

plt.plot(xs, ys, '+-')

threshold = 10
xs_thresh_ind = np.zeros_like(xs, dtype=int)
deltas = np.zeros_like(ys)

for i, x in enumerate(xs):
  # First index past the time threshold (argmax returns 0 if there is none)
  period_end_ind = np.argmax(xs > x + threshold)

  # Only operate when the window is wide enough (this can be treated differently)
  if period_end_ind > 0:
    xs_thresh_ind[i] = period_end_ind

    # Find extrema in the period
    period_min = np.min(ys[i:period_end_ind + 1])
    period_max = np.max(ys[i:period_end_ind + 1])
    deltas[i] = period_max - period_min

max_ind_low = np.argmax(deltas)
max_ind_high = xs_thresh_ind[max_ind_low]
max_delta = deltas[max_ind_low]

print(
    'Max delta {:.2f} is in period x[{}]={:.2f},{:.2f} and x[{}]={:.2f},{:.2f}'
    .format(max_delta, max_ind_low, xs[max_ind_low], ys[max_ind_low],
            max_ind_high, xs[max_ind_high], ys[max_ind_high]))

df = pd.DataFrame(ys, index=xs)

OUTPUT:
Max delta 48.76 is in period x[167]=86.10,200.32 and x[189]=96.14,249.09

Is there an efficient, idiomatic pandas way to achieve something similar?

CodePudding user response:

Create a Series from the ys values, indexed by xs, but convert xs to actual timedelta values rather than plain floats, so that a time-based rolling window can be applied.

ts = pd.Series(ys, index=pd.to_timedelta(xs, unit="s"))
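To see what this conversion produces, here is a minimal sketch with made-up values (the `demo_` names are hypothetical and not part of the question). The point is that the index becomes a `TimedeltaIndex`, which is what allows `rolling` to take an offset string such as "10s" rather than a row count.

```python
import pandas as pd

# Hypothetical sample data: float seconds and arbitrary values.
demo_xs = [0.0, 1.5, 4.25]
demo_ys = [10.0, 12.0, 9.0]

# unit="s" tells pandas to interpret the floats as seconds.
demo_ts = pd.Series(demo_ys, index=pd.to_timedelta(demo_xs, unit="s"))

# The index is now a TimedeltaIndex, so offsets such as "10s" are meaningful.
print(type(demo_ts.index).__name__)
print(demo_ts.index[1])
```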

We want to apply a leading, 10 second window in which we calculate the difference between max and min. Because we want it to be leading, we'll sort the Series in descending order and apply a trailing window.

deltas = ts.sort_index(ascending=False).rolling("10s").agg(lambda s: s.max() - s.min())
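The reverse-then-roll trick can be checked by hand on a tiny series. The sketch below uses made-up timestamps and values (`toy` and `leading` are hypothetical names) and assumes a pandas version that accepts a monotonically decreasing index in time-based `rolling`.

```python
import pandas as pd

# Hypothetical toy data: irregular timestamps in seconds and their values.
times = [0, 4, 9, 12, 25]
values = [1.0, 5.0, 2.0, 8.0, 3.0]
toy = pd.Series(values, index=pd.to_timedelta(times, unit="s"))

# A trailing 10 s window on the reversed series acts as a leading
# window [t, t + 10 s) on the original series.
leading = (
    toy.sort_index(ascending=False)
    .rolling("10s")
    .agg(lambda s: s.max() - s.min())
    .sort_index()  # back to chronological order for readability
)
print(leading)
```

For instance, the window starting at 4 s covers the samples at 4, 9 and 12 s, so its delta is 8 - 2 = 6, while the window starting at 12 s contains only one sample and yields 0.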

Find the maximum delta with deltas[deltas == deltas.max()], which gives

0 days 00:01:26.104797298    48.354851

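meaning a delta of 48.35 was found in the interval [86.1, 96.1). This is slightly smaller than the 48.76 reported in the question, presumably because the loop's slice also includes the first sample past the 10-second mark (x[189]=96.14), which the rolling window excludes.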
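Putting the steps together, here is a self-contained sketch on the question's data. It swaps in two variations that are not in the answer above: the delta is computed as `rolling.max() - rolling.min()` instead of a Python lambda (same result, but it avoids calling Python code per window), and `idxmax` is used to locate the start of the best window. The names `rolled`, `start` and `end` are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
np.random.seed(0)
xs = np.cumsum(np.random.rand(200))
ys = xs**1.2 + 10 * np.sin(xs)

ts = pd.Series(ys, index=pd.to_timedelta(xs, unit="s"))

# Trailing window on the reversed series = leading window on the original.
rolled = ts.sort_index(ascending=False).rolling("10s")
deltas = (rolled.max() - rolled.min()).sort_index()

# Start of the window with the largest delta, and the window's open end.
start = deltas.idxmax()
end = start + pd.Timedelta("10s")
max_delta = deltas.loc[start]

print(
    f"Max delta {max_delta:.2f} in "
    f"[{start.total_seconds():.2f}s, {end.total_seconds():.2f}s)"
)
```

Note that windows near the end of the series are shorter than 10 seconds; whether to keep or discard them is a choice the original loop also leaves open.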
