Fit the data to multivariable linear regression in Python

Time:09-19

I have the following data:

x1=[100, 100, 110, 110, 120, 120, 120, 130, 130, 130]
x2=[1, 2, 1, 2, 1, 2, 3, 1, 2, 3]
y=[113, 118, 127, 132, 136, 144, 138, 146, 156, 149]

And I want to fit a function of the form y = a0 + a1*x1 + a2*x2.

I managed to do it by defining matrix X=[1 x1[0] x2[0], ..., 1 x1[9] x2[9]], and then computing [a0 a1 a2]'=inv(X'X)X'y, which gives a0=3.7148, a1=1.1013, a2=1.8517. However, I want to use the linear_model.LinearRegression() as well.
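For reference, the normal-equation computation described above might look like the following NumPy sketch. The variable names (`X`, `beta`) are illustrative rather than taken from the original post, and `np.linalg.solve` is used in place of an explicit `inv(X'X)`, which solves the same system and is usually the more numerically stable choice.

```python
import numpy as np

x1 = np.array([100, 100, 110, 110, 120, 120, 120, 130, 130, 130], dtype=float)
x2 = np.array([1, 2, 1, 2, 1, 2, 3, 1, 2, 3], dtype=float)
y = np.array([113, 118, 127, 132, 136, 144, 138, 146, 156, 149], dtype=float)

# Design matrix with a leading column of ones for the intercept a0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the normal equations (X'X) beta = X'y for beta = [a0, a1, a2].
beta = np.linalg.solve(X.T @ X, X.T @ y)
a0, a1, a2 = beta
print(a0, a1, a2)
```

On this data the sketch should give a1 and a2 matching the values above and an a0 of roughly 3.716, close to the 3.7148 quoted here; the small difference is consistent with a1 and a2 having been rounded before a0 was computed.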

I have a file called multi_regress.csv that contains the following data:

x0,x1,x2,y
1.0,100.0,1.0,113.0
1.0,100.0,2.0,118.0
1.0,110.0,1.0,127.0
1.0,110.0,2.0,132.0
1.0,120.0,1.0,136.0
1.0,120.0,2.0,144.0
1.0,120.0,3.0,138.0
1.0,130.0,1.0,146.0
1.0,130.0,2.0,156.0
1.0,130.0,3.0,149.0

And I wrote the following code:

import pandas
from sklearn import linear_model

df = pandas.read_csv("multi_regress.csv")
X = df[['x0', 'x1', 'x2']]
y = df['y']
regr = linear_model.LinearRegression()
regr.fit(X,y)
print(regr.coef_)

I got the following output:

[0.         1.10129032 1.8516129 ]

The first coefficient, a0, is not correct. Where is my mistake?

CodePudding user response:

You're looking for the intercept.

Try regr.intercept_ to get the value you want.

Alternatively, when you define your model, set fit_intercept=False, since you've already added an intercept column (x0) to your data. The first entry of regr.coef_ will then be a0.
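Both options can be sketched as follows. To keep the example self-contained, the data is built inline instead of being read from the CSV file, and the names `with_intercept` and `no_intercept` are only illustrative. As far as I understand it, the 0 in the original output arises because LinearRegression centers the features when fit_intercept=True, so a constant x0 column carries no information and its coefficient comes out as zero; the fitted constant term ends up in intercept_ instead.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x1 = [100, 100, 110, 110, 120, 120, 120, 130, 130, 130]
x2 = [1, 2, 1, 2, 1, 2, 3, 1, 2, 3]
y = np.array([113, 118, 127, 132, 136, 144, 138, 146, 156, 149], dtype=float)

# Option 1: leave out the column of ones and let the model fit the intercept.
X = np.column_stack([x1, x2]).astype(float)
with_intercept = LinearRegression().fit(X, y)
print(with_intercept.intercept_)  # a0
print(with_intercept.coef_)       # [a1, a2]

# Option 2: keep the column of ones and switch off the built-in intercept.
X_ones = np.column_stack([np.ones(len(y)), x1, x2])
no_intercept = LinearRegression(fit_intercept=False).fit(X_ones, y)
print(no_intercept.coef_)         # [a0, a1, a2]
```

Either way, a1 and a2 should match the values already printed in the question; only the place where a0 is reported differs.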
