Paul need a laptop that is fast enough. One of the main parameter of computers which he must focus on is CPU. In this project we need to forecast performance of CPU which is characterized in terms of cycle time and memory capacity and so on.
It is Linear Regression problem and you should predict the Estimated Relative Performance column.
I am new in Python. Could anybody help me with the code for this task?
Maybe your model needs more work, for now, it only uses a variable and maybe you will get a better prediction with another model that uses more variables.
y = b0 b1 * x1
According to the text "... CPU which is characterized in terms of cycle time and memory capacity and so on" is the problem.
A proposal will be to extend your models using statsmodels
API to write a formula. In your case I like to remove all spaces in columns names before.
# Rename columns without spaces
old_columns = data.columns
new_columns = [col.replace(' ', '_') for col in old_columns]
data = data.rename(columns={old:new for old, new in zip(old_columns, new_columns)})
# Fit a model using more variables
import statsmodels.formula.api as sm2
formula = ('Estimated_Relative_Performance ~ ',
'Machine_Cycle_Time_in_nanoseconds ',
'Maximum_Main_Memory_in_kilobytes ',
'Cache_Memory_in_Kilobytes ',
'Maximum_Channels_in_Units')
formula = ' '.join(formula)
print(formula)
results2 = sm2.ols(formula, data).fit()
results2.summary()
data['predicted2'] = results2.predict()