Vectorized calculation of new timeseries in pandas dataframe

Time:02-04

I have a pandas dataframe and I am trying to estimate a new timeseries V(t) based on the values of an existing timeseries B(t). I have written a minimal reproducible example to generate a sample dataframe as follows:

import pandas as pd
import numpy as np

lenb = 5000
lenv = 200
l    = 5

B = pd.DataFrame({'a': np.arange(0, lenb, 1), 'b': np.arange(0, lenb, 1)},
                 index=pd.date_range('2022-01-01', periods=lenb, freq='2s'))

I want to calculate V(t) for all times 't' in the timeseries B as:

V(t) = (B(t-2*l) + 4*B(t-l) + 6*B(t) + 4*B(t+l) + 1*B(t+2*l))/16

How can I perform this calculation in a vectorized manner in pandas? Let's say that l = 5.

Would this be the correct way to do it?

def V_t(B, l):
    V = (B.shift(-2*l) + 4*B.shift(-l) + 6*B + 4*B.shift(l) + B.shift(2*l)) / 16
    return V

User response:

I would have done it the way you suggested in your latest edit. As an alternative, here is a version that avoids typing out every shift call by hand, which helps when the list of factors/multipliers is arbitrarily long:

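import numpy as np
import pandas as pd

def V_t(B, l):
    X = [1, 4, 6, 4, 1]            # weights from the formula; they sum to 16
    Y = [-2*l, -l, 0, l, 2*l]      # row offset paired with each weight
    # Sum the weighted, shifted copies of B, then divide by the weight total
    return pd.DataFrame(np.add.reduce([x*B.shift(y) for x, y in zip(X, Y)])/16,
                        index=B.index, columns=B.columns)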
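
As a quick sanity check, here is a minimal sketch that compares the explicit shift expression from the question with a weight-list version on the sample dataframe. The names v_explicit and v_weighted are made up for this illustration, and the weights (1, 4, 6, 4, 1) are taken from the formula in the question. Note that shift(n) without a freq argument moves values by n rows, so with a 2-second index and l = 5 each step of l corresponds to 10 seconds. This assumes the index is regularly spaced.

```python
import numpy as np
import pandas as pd

lenb = 5000
l = 5

# Sample data from the question: two linearly increasing columns
B = pd.DataFrame({'a': np.arange(0, lenb, 1), 'b': np.arange(0, lenb, 1)},
                 index=pd.date_range('2022-01-01', periods=lenb, freq='2s'))

def v_explicit(B, l):
    # Hypothetical helper: the formula written out term by term
    return (B.shift(-2*l) + 4*B.shift(-l) + 6*B + 4*B.shift(l) + B.shift(2*l)) / 16

def v_weighted(B, l, weights=(1, 4, 6, 4, 1)):
    # Hypothetical helper: same calculation driven by a list of weights,
    # with row offsets centred on zero (-2*l, -l, 0, l, 2*l for five weights)
    half = len(weights) // 2
    offsets = [(k - half) * l for k in range(len(weights))]
    total = sum(w * B.shift(o) for w, o in zip(weights, offsets))
    return total / sum(weights)

V1 = v_explicit(B, l)
V2 = v_weighted(B, l)

# Both versions should agree, including where the NaN values fall
pd.testing.assert_frame_equal(V1, V2)

# The first and last 2*l rows have no full window, so they are NaN
print(V1['a'].isna().sum())
```

Because the weights are symmetric and the sample columns are linear, the smoothed values in the interior rows equal the original values, which makes this dataframe a convenient test case. The 2*l rows at each end come out as NaN since shift does not wrap around.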