Fitting a curve to some datapoints

Time:03-11

The fitted curve doesn't fit the datapoints (xH_data, nH_data) as expected. Does anyone know what the issue might be here?

from scipy.optimize import curve_fit
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7, 2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11, 2.81, 0.665, 0.313, 0.090, 0.044, 0.029])*1.0e6

plt.plot(xH_data, nH_data)
plt.yscale("log")
plt.xscale("log")

def eTemp(x, A, a, B):
    n = B*(x + A)**a
    return n

parameters, covariance = curve_fit(eTemp, xH_data, nH_data, maxfev=200000)
fit_A = parameters[0]
fit_a = parameters[1]
fit_B = parameters[2]



print(fit_A)
print(fit_a)
print(fit_B)


r = np.logspace(0, 0.7, 1000)
ne = fit_B*(r + fit_A)**fit_a
plt.plot(r, ne)

plt.yscale("log")
plt.xscale("log")

Thanks in advance for the help.

CodePudding user response:

I know of two things that might help you:

  1. Provide the p0 input parameter to curve_fit with a set of appropriate starting parameters to the function. That can keep the algorithm from running wild.
  2. Change the function you are fitting so that it returns np.log(n), and then fit it to np.log(nH_data). As it is now, there is a far larger penalty for not fitting the first data points than for not fitting the last ones, since the first values are about 10^2 larger. The first data points therefore become "more important" for the algorithm to fit. Taking the logarithm puts all points on roughly the same scale, so they are weighted equally.

Go ahead and play around with it. I managed a pretty fine fit with these parameters:

[-7.21450545e-01 -3.36131028e+00  5.97293632e+06]
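Combining both suggestions gives a minimal sketch like the one below. The starting values in `p0` are rough guesses read off the fitted parameters quoted above, and the `x + A` form of the model is taken from the question:

```python
from scipy.optimize import curve_fit
import numpy as np

xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7,
                    2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11,
                    2.81, 0.665, 0.313, 0.090, 0.044, 0.029]) * 1.0e6

# Fit in log space so every datapoint is weighted roughly equally
def log_eTemp(x, A, a, B):
    return np.log(B) + a * np.log(x + A)

# Rough starting values based on the fit quoted above
p0 = [-0.72, -3.4, 6.0e6]
params, cov = curve_fit(log_eTemp, xH_data, np.log(nH_data), p0=p0)
print(params)
```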

CodePudding user response:

I think you're nearly there; you just need to fit on a log scale and provide a decent initial guess. To find the guess, make a plot like

plt.figure()
plt.plot(np.log(xH_data), np.log(nH_data))

and you'll see it's nearly linear. So your B will be the exponentiated intercept (i.e. exp(20ish)) and a is approximately the slope (-5ish). A is a weird one: does it have some physical meaning, or did you just throw it in there? If there's no physical meaning, I'd get rid of it.
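If you'd rather compute the guess than read it off the plot, a straight-line fit in log-log space does the job. This is just a sketch for getting starting values, ignoring the A offset:

```python
import numpy as np

xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7,
                    2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11,
                    2.81, 0.665, 0.313, 0.090, 0.044, 0.029]) * 1.0e6

# Least-squares line through (log x, log n): the slope approximates a,
# the intercept approximates log(B)
slope, intercept = np.polyfit(np.log(xH_data), np.log(nH_data), 1)
# slope comes out around -5, intercept around 19-20, so B is roughly exp(20)
print(slope, intercept)
```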

from scipy.optimize import curve_fit
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7, 2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11, 2.81, 0.665, 0.313, 0.090, 0.044, 0.029])*1.0e6

def eTemp(x, A, a, B):
    logn = np.log(B*(x + A)**a)
    return logn

parameters, covariance = curve_fit(eTemp, xH_data, np.log(nH_data), p0=[np.exp(0.1), -5, np.exp(20)], maxfev=200000)
fit_A = parameters[0]
fit_a = parameters[1]
fit_B = parameters[2]

print(fit_A)
print(fit_a)
print(fit_B)

r = np.logspace(0, 0.7, 1000)
ne = np.exp(eTemp(r, fit_A, fit_a, fit_B))
plt.plot(xH_data, nH_data)
plt.plot(r, ne)
plt.yscale("log")
plt.xscale("log")

CodePudding user response:

There is a problem with your fit equation: if A is less than -1 and your a parameter is negative, then (x + A)**a raises a negative base to a non-integer power within your fit range, which is not a real number (NumPy returns NaN). For this reason you need to pass constraints and an initial set of parameters to curve_fit, for example:

parameters, covariance = curve_fit(eTemp, xH_data, nH_data, method='dogbox', p0 = [100, -3.3, 10E8], bounds=((-0.9, -10, 0), (200, -1, 10e9)), maxfev=200000)

You need to change the method to 'dogbox' in order to perform this fit with the constraints.
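A quick sketch of the failure mode described above, showing why the lower bound on A matters:

```python
import numpy as np

# With A < -1, x + A is negative for the smallest x values (the data start
# at x = 1), and a negative base raised to a non-integer power is not a
# real number, so NumPy returns nan for float inputs.
x = np.array([1.0, 1.03, 1.06])
A, a = -1.5, -3.3
with np.errstate(invalid="ignore"):
    vals = (x + A) ** a
print(vals)  # [nan nan nan]
```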
