I have two columns, A and B. I would like to artificially increase the value in column A to 1000 and see what the corresponding value in column B would be.
Data
A B
500 20
200 10
100 5
Desired
A B
500 20
200 10
100 5
1000 ?
Doing
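Using Python, I want to check for correlation and treat this as a linear regression problem.
from matplotlib import pyplot
# df is assumed to be the DataFrame holding columns A and B
pyplot.scatter(x='A', y='B', s=100, data=df)
pyplot.show()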
Then I am thinking I can use linear regression to determine what the value of B will be if I increase the independent variable A. I am just not sure how to pass the what-if values of A to the model.
import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([500,200,100]).reshape((-1, 1))
y = np.array([20,10,5])
Any suggestions are appreciated.
Answer
You first need to create and fit a model before you can use it to make predictions.
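import numpy as np
from sklearn.linear_model import LinearRegression
# scikit-learn expects the features as a 2-D array of shape (n_samples, n_features)
x = np.array([500, 200, 100]).reshape(-1, 1)
y = np.array([20, 10, 5])
reg = LinearRegression().fit(x, y)
# The what-if value must also be 2-D: one sample with one feature
reg.predict(np.array([[1000]]))  # roughly 38.46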
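If you want to try several what-if values of A at once, one possible approach is to pass them all to `predict` as a single column. This is only a sketch: the values 750, 1000 and 1500 and the names `what_if_a` and `predicted_b` are made up for illustration and are not from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Observed data from the question; X must be 2-D for scikit-learn
x = np.array([500, 200, 100]).reshape(-1, 1)
y = np.array([20, 10, 5])
reg = LinearRegression().fit(x, y)

# Hypothetical what-if values for A (illustrative only), one per row
what_if_a = np.array([750, 1000, 1500]).reshape(-1, 1)
predicted_b = reg.predict(what_if_a)

# Print each what-if A next to the B the fitted line predicts for it
for a, b in zip(what_if_a.ravel(), predicted_b):
    print(f"A = {a}: predicted B = {b:.2f}")
```

With only three observations and what-if values far outside the observed range of A, these numbers are extrapolations and should be read as rough guesses.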
A graph might help. The three points do not lie exactly on a straight line, so the model is making a best guess. It is essentially the computer drawing a line of best fit through the data.
Here, the equation of the line of best fit would be Y = 0.03654*X + 1.923. Making a prediction just means plugging another X value into this formula to find the corresponding Y coordinate.
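As a cross-check on that formula, here is a small sketch that recovers the slope and intercept with NumPy's `np.polyfit` (a degree-1 least-squares fit, used here in place of scikit-learn) and then plugs X = 1000 into the line by hand. The variable names are my own.

```python
import numpy as np

# Observed data from the question
a = np.array([500, 200, 100], dtype=float)
b = np.array([20, 10, 5], dtype=float)

# Degree-1 least-squares fit; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(a, b, 1)

# Plug the what-if value into Y = slope * X + intercept
what_if = 1000
prediction = slope * what_if + intercept

print(f"Y = {slope:.5f}*X + {intercept:.3f}")
print(f"Predicted B for A = {what_if}: {prediction:.2f}")
```

The manual calculation should agree with what `reg.predict` returns for the same input, since both come from the same ordinary least-squares line.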