I have the following problem. I have a function f
defined in python using numpy functions. The function is smooth and integrable on positive reals. I want to construct the double antiderivative of the function (assuming that both the value and the slope of the antiderivative at 0 are 0) so that I can evaluate it on any positive real smaller than 100.
Definition of antiderivative of f
at x
:
integrate f(s) with s from 0 to x
Definition of double antiderivative of f
at x
:
integrate (integrate f(t) with t from 0 to s) with s from 0 to x
The actual form of f
is not important, so I will use a simple one for convenience. But please note that even though my example has a known closed form, my actual function does not.
import numpy as np
f = lambda x: np.exp(-x)*x
My solution is to construct the antiderivative as an array using naive numerical integration:
N = 10000
delta = 100/N
xs = np.linspace(0,100,N 1)
vs = f(xs)
avs = np.cumsum(vs)*delta
aavs = np.cumsum(avs)*delta
This of course works but it gives me arrays instead of functions. But this is not a big problem as I can interpolate aavs
using a spline to get a function and get rid of the arrays.
from scipy.interpolate import UnivariateSpline
aaf = UnivariateSpline(xs, aavs)
The function aaf
is approximately the double antiderivative of f
.
The problem is that even though it works, there is quite a bit of overhead before I can get my function and precision is expensive.
My other idea was to interpolate f
by a spline and take the antiderivative of that, however this introduces numerical errors that are too big for what I want to use the function.
Is there any better way to do that? By better I mean faster without sacrificing accuracy.
Edit: What I hope is possible is to use some kind of Fourier transform to avoid integrating twice. I hope that there is some convenient transform of vs
that allows to multiply the values component-wise with xs
and transform back to get the double antiderivative. I played with this a bit, but I got lost.
Edit: I figured out that by using the trapezoidal rule instead of a naive sum, increases the accuracy quite a bit. Using Simpson's rule should increase the accuracy further, but it's somewhat fiddly to do with numpy arrays.
Edit: As @user202729 rightfully complains, this seems off. The reason it seems off is because I have skipped some details. I explain here why what I say makes sense, but it does not affect my question.
My actual goal is not to find the double antiderivative of f
, but to find a transformation of this. I have skipped that because I think it only confuses the matter.
The function f
decays exponentially as x approaches 0 or infinity. I am minimizing the numerical error in the integration by starting the sum from 0 and going up to approximately the peak of f
. This ensure that the relative error is approximately constant. Then I start from the opposite direction from some very big x and go back to the peak. Then I do the same for the antiderivative values.
Then I transform the aavs
by another function which is sensitive to numerical errors. Then I find the region where the errors are big (the values oscillate violently) and drop these values. Finally I approximate what I believe are good values by a spline.
Now if I use spline to approximate f
, it introduces an absolute error which is the dominant term in a rather large interval. This gets "integrated" twice and it ends up being a rather large relative error in aavs
. Then once I transform aavs
, I find that the 'good region' has shrunk considerably.
CodePudding user response:
You can compute the two np.cumsum
(and the divisions) at once more efficiently using Numba. This is significantly faster since there is no need for several temporary arrays to be allocated, filled, read again and freed. Here is a naive implementation:
import numba as nb
@nb.njit('float64[::1](float64[::1], float64)') # Assume vs is contiguous
def doubleAntiderivative_naive(vs, delta):
res = np.empty(vs.size, dtype=np.float64)
sum1, sum2 = 0.0, 0.0
for i in range(vs.size):
sum1 = vs[i] * delta
sum2 = sum1 * delta
res[i] = sum2
return res
However, the sum is not very good in term of numerical stability. A Kahan summation is needed to improve the accuracy (or possibly the alternative Kahan–Babuška-Klein algorithm if you are paranoid about the accuracy and performance do not matter so much). Note that Numpy use a pair-wise algorithm which is quite good but far from being prefect in term of accuracy (this is a good compromise for both performance and accuracy).
Moreover, delta
can be factorized during in the summation (ie. the result just need to be premultiplied by delta**2
).
Here is an implementation using the more accurate Kahan summation:
@nb.njit('float64[::1](float64[::1], float64)')
def doubleAntiderivative_accurate(vs, delta):
res = np.empty(vs.size, dtype=np.float64)
delta2 = delta * delta
sum1, sum2 = 0.0, 0.0
c1, c2 = 0.0, 0.0
for i in range(vs.size):
# Kahan summation of the antiderivative of vs
y1 = vs[i] - c1
t1 = sum1 y1
c1 = (t1 - sum1) - y1
sum1 = t1
# Kahan summation of the double antiderivative of vs
y2 = sum1 - c2
t2 = sum2 y2
c2 = (t2 - sum2) - y2
sum2 = t2
res[i] = sum2 * delta2
return res
Here is the performance of the approaches on my machine (with an i5-9600KF processor):
Numpy cumsum: 51.3 us
Naive Numba: 11.6 us
Accutate Numba: 37.2 us
Here is the relative error of the approaches (based on the provided input function):
Numpy cumsum: 1e-13
Naive Numba: 5e-14
Accutate Numba: 2e-16
Perfect precision: 1e-16 (assuming 64-bit numbers are used)
If f
can be easily computed using Numba (this is the case here), then vs[i]
can be replaced by calls to f
(inlined by Numba). This helps to reduce the memory consumption of the computation (N
can be huge without saturating your RAM).
As for the interpolation, the splines often gives good numerical result but they are quite expensive to compute and AFAIK they require the whole array to be computed (each item of the array impact all the spline although some items may have a negligible impact alone). Regarding your needs, you could consider using Lagrange polynomials. You should be careful when using Lagrange polynomials on the edges. In your case, you can easily solve the numerical divergence issue on the edges by extending the array size with the border values (since you know the derivative on each edges of vs
is 0). You can apply the interpolation on the fly with this method which can be good for both performance (typically if the computation is parallelized) and memory usage.
CodePudding user response:
First, I created a version of the code I found more intuitive. Here I multiply cumulative sum values by bin widths. I believe there is a small error in the original version of the code related to the bin width issue.
import numpy as np
f = lambda x: np.exp(-x)*x
N = 1000
xs = np.linspace(0,100,N 1)
domainwidth = ( np.max(xs) - np.min(xs) )
binwidth = domainwidth / N
vs = f(xs)
avs = np.cumsum(vs)*binwidth
aavs = np.cumsum(avs)*binwidth
Next, for visualization here is some very simple plotting code:
import matplotlib
import matplotlib.pyplot as plt
plt.figure()
plt.scatter( xs, vs )
plt.figure()
plt.scatter( xs, avs )
plt.figure()
plt.scatter( xs, aavs )
plt.show()
The first integral matches the known result of the example expression and can be seen on wolfram
Below is a simple function that extracts an element from the second derivative. Note that int
is a bad rounding function. I assume this is what you have implemented already.
def extract_double_antideriv_value(x):
return aavs[int(x/binwidth)]
singleresult = extract_double_antideriv_value(50.24)
print('singleresult', singleresult)
Whatever full computation steps are required, we need to know them before we can start optimizing. Do you have a million different functions to integrate? If you only need to query a single double anti-derivative many times, your original solution should be fairly ideal.
Symbolic Approximation:
Have you considered approximations to the original function f
, which can have closed form integration solutions? You have a limited domain on which the function lives. Perhaps approximate f
with a Taylor series (which can be constructed with known maximum error) then integrate exactly? (consider Pade, Taylor, Fourier, Cheby, Lagrange(as suggested by another answer), etc...)
Log Tricks:
Another alternative to dealing with spiky errors, would be to take the log of your original function. Is f
always positive? Is the integration error caused because the neighborhood around the max is very small? If so, you can study ln(f)
or even ln(ln(f))
instead. It would really help to understand what f
looks like more.
Approximation Integration Tricks
There exist countless integration tricks in general, which can make approximate closed form solutions to undo-able integrals. A very common one when exponetnial functions are involved (I think yours is expoential?) is to use Laplace's Method. But which trick to pull out of the bag is highly dependent upon the conditions which f
satisfies.