How to efficiently compute the first, second, and third derivatives of live updating data?

I have a running/decaying sum that updates over time with live data. I would like to efficiently compute the first, second, and third derivatives.

The simplest way I can think of doing this is to calculate deltas over some time difference in the running/decaying sum. e.g.

t_0 sum_0
t_1 sum_1
first_derivative = (sum_1 - sum_0) / (t_1 - t_0)

I can continue this process further with the second and third derivatives, which I think should work, but I'm not sure if this is the best way.
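For concreteness, here is a minimal Python sketch of that repeated-delta idea. The class name `StreamingDerivatives` and its layout are illustrative assumptions, not something from an existing library. Each update differences the new sample against the previous one, and each derivative estimate against its own previous value.

```python
class StreamingDerivatives:
    """Naive running estimates of the 1st, 2nd and 3rd derivatives."""

    def __init__(self):
        self.t = None                            # time of the previous sample
        self.levels = [None, None, None, None]   # value, d1, d2, d3

    def update(self, t, value):
        """Feed one sample; return (d1, d2, d3), with None where unknown."""
        if self.t is None:
            # First sample: nothing to difference against yet.
            self.t = t
            self.levels[0] = value
            return tuple(self.levels[1:])
        dt = t - self.t
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        new = [value, None, None, None]
        for k in range(1, 4):
            previous = self.levels[k - 1]
            if new[k - 1] is None or previous is None:
                break  # not enough history for this order yet
            new[k] = (new[k - 1] - previous) / dt
        self.t = t
        self.levels = new
        return tuple(new[1:])


# Usage: sample f(t) = t**3 at t = 0, 1, 2, 3.
estimator = StreamingDerivatives()
for t in range(4):
    latest = estimator.update(t, t ** 3)
```

It takes four samples before the third derivative becomes available, and this version does not track which time each estimate refers to.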

This running/decaying sum is not a defined function and relies on live updating data, so I can't just differentiate it analytically.

CodePudding user response：

I don't know what your real use case is, but it sounds like you're going about this the wrong way. For most cases I can imagine, what you really want to do is:

First determine the continuous signal that your time series represents; and then
You can exactly calculate the derivatives of this signal at any point.

Since you have already decided that your time series represents exponential decay with discontinuous jumps, you have decided that all your derivatives are simply proportional to the current value and provide no extra information.

This probably isn't what you really want.
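To spell out the point about exponential decay: between jumps, and assuming a constant decay rate (written here as the symbol \lambda, which is introduced only for illustration), the signal and its derivatives are

```latex
s(t) = s(t_0)\, e^{-\lambda (t - t_0)}
\quad\Longrightarrow\quad
s'(t) = -\lambda\, s(t), \qquad
s''(t) = \lambda^{2}\, s(t), \qquad
s'''(t) = -\lambda^{3}\, s(t)
```

so each derivative is just the current value scaled by a fixed constant.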

You would probably be better off applying a more sophisticated low-pass filter to your samples. In situations like yours, where you receive intermittent updates, it can be convenient to design the

CodePudding user response：

To be clear, you are looking to smooth out data AND to estimate rate of change. But rate of change inherently amplifies noise. Any solution is going to have to make some tradeoffs.

Here is a simple hack based on your existing technique.

First, let's look at a general version of a basic decaying sum. Let's keep the following variables:

average_value
average_time
average_weight

You also have a decay rate, decay.

To update with a new observation (value, time) you simply:

average_weight *= (1 - decay)**(time - average_time)
average_value = (average_value * average_weight + value) / (1 + average_weight)
average_time = (average_time * average_weight + time) / (1 + average_weight)
average_weight += 1
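The update above can be sketched as a small Python class. The class name `DecayingAverage` and the attribute names are illustrative, and handling the first observation separately is an assumption added here so that no starting time has to be chosen.

```python
class DecayingAverage:
    """Decaying weighted average that also tracks its own average time."""

    def __init__(self, decay):
        self.decay = decay    # fraction of weight lost per unit of time
        self.value = 0.0      # weighted average of the observed values
        self.time = 0.0       # weighted average of the observation times
        self.weight = 0.0     # total (decayed) weight of past observations

    def update(self, value, time):
        if self.weight == 0.0:
            # First observation: the average is the observation itself.
            self.value, self.time, self.weight = value, time, 1.0
            return
        # Decay the old weight, then blend in the new observation with weight 1.
        self.weight *= (1 - self.decay) ** (time - self.time)
        self.value = (self.value * self.weight + value) / (1 + self.weight)
        self.time = (self.time * self.weight + time) / (1 + self.weight)
        self.weight += 1


# Usage: two observations, one time unit apart, with decay 0.5.
avg = DecayingAverage(decay=0.5)
avg.update(10.0, 0.0)
avg.update(20.0, 1.0)
```

After the second update, `avg.time` lies between the two observation times, so the average describes the signal as of an earlier moment than the latest sample.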

Therefore this moving average represents where your value was some time ago, namely at average_time. The slower the decay, the farther back it goes and the more smoothed out it is. Given that we want a rate of change, the "when" is going to matter.

Now let's look at the first derivative. You have correctly given a formula for estimating it. But at what time does that estimate apply? The answer turns out to be time (t_0 + t_1) / 2. There the error depends only on the third and higher derivatives; at any other time you pick, the estimate will also be systematically off in proportion to the second derivative.
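A quick numerical check of the timing claim, as a sketch: the test function f(t) = t**3 and the helper name `difference_quotient` are chosen here purely for illustration.

```python
def difference_quotient(t0, v0, t1, v1):
    """Return (midpoint_time, slope) for two timestamped samples."""
    return (t0 + t1) / 2, (v1 - v0) / (t1 - t0)


def f(t):
    return t ** 3


def df(t):
    # Exact derivative of f, used only to measure the estimation error.
    return 3 * t ** 2


t0, t1 = 1.0, 3.0
mid, slope = difference_quotient(t0, f(t0), t1, f(t1))

error_at_midpoint = abs(slope - df(mid))  # small: the estimate fits here
error_at_t0 = abs(slope - df(t0))         # much larger
error_at_t1 = abs(slope - df(t1))         # much larger
```

Here the slope is 13, while the true derivative is 12 at the midpoint, 3 at t0, and 27 at t1, so stamping the estimate with the midpoint time is by far the best match.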

So you can play around with it, but you can estimate a derivative based on any source of values and timestamps. You can do it from your first derivative, or do it from a weighted average. You can even combine them. You can also do a running weighted average of the first derivative! But whatever you do, you need to keep track of WHEN it is a derivative FOR. (This is why I discussed how far back a weighted average is: you need to think clearly about timestamping every piece of data you have, averaged or not.)

Next comes the second derivative. You have all the same choices for the second derivative that you do for the first, except that your measurements don't give you a first derivative directly, so you have to work from an estimated one.

The third derivative follows the same pattern of choices.
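One way to follow that pattern while keeping every estimate timestamped is to apply the same differencing step repeatedly. This is a minimal sketch under the assumption that samples arrive as (time, value) pairs sorted by time; the function name `differentiate` is illustrative.

```python
def differentiate(points):
    """Turn sorted (time, value) pairs into (midpoint_time, slope) pairs."""
    return [
        ((t0 + t1) / 2, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]


# Usage: sample f(t) = t**3 at t = 0, 1, 2, 3, 4.
samples = [(float(t), float(t) ** 3) for t in range(5)]
first = differentiate(samples)    # stamped at 0.5, 1.5, 2.5, 3.5
second = differentiate(first)     # stamped at 1.0, 2.0, 3.0
third = differentiate(second)     # stamped at 1.5, 2.5
```

Each pass loses one point and pushes the latest timestamp further back: the newest sample is at time 4.0, but the newest third-derivative estimate is for time 2.5. Any of the three lists could also be fed through a weighted average before the next pass, as long as the averaged timestamps are carried along.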

However you do it, keep in mind the following.

Each derivative will be delayed.
The more up to date you keep them, the more noise will be a problem.
Make sure to think clearly about both what each measurement is and what time it is as of.

It may require experimentation to find what works best for your application.