How do I force the regression line and its equation through 0 for this dataframe? I only know how to draw the best-fit line and compute the matching equation. Below is the code for the best-fit line without a forced intercept through 0. Any help forcing the intercept is appreciated!
import seaborn as sns
from scipy import stats
# Pearson correlation between mass and absorbance
r, p = stats.pearsonr(x=df['mass(mg)'], y=df['Abs'])
sns.regplot(x=df['mass(mg)'], y=df['Abs']).set(title='biuret standard plot')
def equation(x, y):
    # Ordinary least-squares slope (B1) and intercept (B0)
    x_mean = x.mean()
    y_mean = y.mean()
    B1_num = ((x - x_mean) * (y - y_mean)).sum()
    B1_den = ((x - x_mean)**2).sum()
    B1 = B1_num / B1_den
    B0 = y_mean - (B1*x_mean)
    reg_line = 'y = {} + {}β'.format(B0, round(B1, 3))
    return (B0, B1, reg_line)
equation(df['mass(mg)'], df['Abs'])
CodePudding user response:
Seaborn's linear regression won't let you do that.
If you want the line to go through the origin, i.e. (0, 0), then your equation is y = mx and there is no B0 term. We can use numpy.linalg.lstsq to do a least-squares fit of that model, which forces the line through the origin. I also add a (0, 0) point to the data so the plotted line starts at the origin; this does not change the fitted slope. Using your values for mass and Abs, see below:
import numpy as np
import matplotlib.pyplot as plt
# Add a 0 to both x and y so the plotted line starts at the origin.
x = np.array([0, 5, 4, 3, 2, 1, 0.5])
y = np.array([0, 0.596, 0.464, 0.333, 0.201, 0.121, 0.062])
# The model is just a*x, so we solve for a with lstsq from numpy.
# lstsq needs a two-dimensional x matrix, so turn the 1-D array into a column:
x = x[:,np.newaxis]
# rcond=None selects the current default cutoff and avoids a FutureWarning on older numpy
a, _, _, _ = np.linalg.lstsq(x, y, rcond=None)
plt.plot(x, y, 'bo')
plt.plot(x, a*x, 'r-')
plt.xlim([0, 6])
plt.ylim([0, 0.7])
plt.show()
print(f"y = {a[0]:.3f} x + 0")
This plots the data points in blue with the fitted line through the origin in red, and prints the equation.
Feel free to add labelling etc.
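If you also want an equation helper in the same style as the one in your question, here is a minimal sketch. It assumes the no-intercept model y = B1*x, for which least squares gives B1 = sum(x*y) / sum(x**2). The function name equation_through_origin and the variable names are my own, and the numbers are just your mass and Abs values typed in by hand.

```python
import numpy as np

def equation_through_origin(x, y):
    """Least-squares fit of y = B1 * x (no intercept term)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Minimising sum((y - B1*x)**2) over B1 gives B1 = sum(x*y) / sum(x**2)
    B1 = float((x * y).sum() / (x ** 2).sum())
    reg_line = 'y = {}x'.format(round(B1, 3))
    return B1, reg_line

# Values from the question (assumed to match df['mass(mg)'] and df['Abs'])
mass = np.array([5, 4, 3, 2, 1, 0.5])
absorbance = np.array([0.596, 0.464, 0.333, 0.201, 0.121, 0.062])

slope, line = equation_through_origin(mass, absorbance)
print(line)
```

With a dataframe you should be able to call it as equation_through_origin(df['mass(mg)'], df['Abs']). It gives the same slope as the lstsq call above, just without the matrix reshaping.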
Hope this helps, please mark this as the answer if it does!
CodePudding user response:
sns.regplot has a parameter truncate= which defaults to True and limits the line to the range of the given data. With truncate=False the line is extended to the axes limits. You can set the xlim to a suitable range before calling sns.regplot.
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset('tips')
plt.xlim(-10, 80)
sns.regplot(data=tips, x='total_bill', y='tip', truncate=False, color='turquoise')
plt.axvline(0, color='gray', ls='--')
plt.show()
If you want to extend the line to x=0 and still limit it on the right to the data limit, you can set the xlim both before and after the regplot call:
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset('tips')
plt.xlim(0, tips['total_bill'].max())
sns.regplot(data=tips, x='total_bill', y='tip', truncate=False, color='coral')
plt.axvline(0, color='gray', ls='--')
plt.xlim(-5, tips['total_bill'].max() + 5)
plt.show()
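One caveat: truncate=False only extends the ordinary best-fit line to the axis limits. As far as I know it does not refit the model, so the line will generally still miss (0, 0). Below is a small numpy-only sketch of the difference, using the mass and Abs values from the other answer. The variable names are my own, and np.polyfit stands in for regplot's default straight-line fit, which is an assumption on my part.

```python
import numpy as np

# Mass and Abs values taken from the other answer (without the added 0 point)
mass = np.array([5, 4, 3, 2, 1, 0.5])
absorbance = np.array([0.596, 0.464, 0.333, 0.201, 0.121, 0.062])

# Ordinary straight-line fit with a free intercept: y = slope_free*x + intercept_free
slope_free, intercept_free = np.polyfit(mass, absorbance, 1)

# Fit forced through the origin: y = slope_origin*x
slope_origin = (mass * absorbance).sum() / (mass ** 2).sum()

print(f"free intercept: y = {slope_free:.4f} x + {intercept_free:.4f}")
print(f"through origin: y = {slope_origin:.4f} x")
```

For this data the free-intercept line crosses x=0 slightly below zero, so extending it is not the same as forcing it through the origin.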