Estimate future values following sklearn linear regression of accumulated data over time

Time:08-04

I have 10 days' worth of data for the number of burpees completed, and I want to extrapolate from it to estimate the total number of burpees that will have been completed after 20 days.

import pandas as pd

data = {'Day': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'burpees': [12, 20, 28, 32, 52, 59, 71, 85, 94, 112]}
df = pd.DataFrame(data)

I have run sklearn LinearRegression on the data and extracted the coefficient:

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
mdl = reg.fit(df[['Day']], df[['burpees']])
mdl.coef_

How do I get an estimation of the number of burpees on day 20?

CodePudding user response:

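According to the sklearn documentation, the X input to the .fit() method should be array-like with shape (n_samples, n_features). The same shape is expected by .predict(), which is the method that returns an estimate for a new day.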

Below should work:

import numpy as np
import pandas as pd

# your data
data = {
    "Day": [1,2,3,4,5,6,7,8,9,10], 
    "burpees": [12, 20, 28, 32, 52, 59, 71, 85, 94, 112],
}
df = pd.DataFrame(data)

from sklearn.linear_model import LinearRegression
reg = LinearRegression()

X = df["Day"].values.reshape(-1, 1)
y = df["burpees"].values.reshape(-1, 1)

mdl = reg.fit(X, y)
print("Intercept, coef:", mdl.intercept_, mdl.coef_)

# predict() expects a 2-D array of shape (n_samples, n_features)
prediction_data = np.array([[20]])

print("Prediction:", mdl.predict(prediction_data)[0][0])

Output:

Intercept, coef: [-4.4] [[11.07272727]]
Prediction: 217.0545454545455
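
As a cross-check, the same prediction can be reproduced by hand from the fitted line, since a one-feature linear regression is just intercept + slope * day. The sketch below is a minimal illustration, not part of the original answer: it assumes y is passed as a 1-D array (so coef_ and intercept_ come back without the extra nesting), and the variable names days, burpees, by_hand and by_predict are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as above; X must be 2-D, y is kept 1-D here (an assumption
# made for this sketch so that coef_ has shape (1,) and intercept_ is a scalar).
days = np.arange(1, 11).reshape(-1, 1)                           # shape (10, 1)
burpees = np.array([12, 20, 28, 32, 52, 59, 71, 85, 94, 112])    # shape (10,)

model = LinearRegression().fit(days, burpees)

slope = model.coef_[0]
intercept = model.intercept_

day = 20
by_hand = intercept + slope * day                    # evaluate the fitted line directly
by_predict = model.predict(np.array([[day]]))[0]     # let sklearn do the same thing

print(by_hand, by_predict)
```

Both values should agree with the prediction shown in the output above (about 217.05). Keep in mind that this extrapolates a straight line well beyond the observed 10 days, so it is only as good as the assumption that the trend stays linear.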