Below are two code snippets that differ only in indentation, yet they give different outputs. The first one outputs
array([[7.68717818e+11, 6.97681947e+11, 7.63307666e+11, 6.47349007e+11,
        5.99065727e+11],
       [7.66709788e+11, 6.99939453e+11, 7.64013608e+11, 6.45261906e+11,
        5.99177884e+11],
       [2.94202392e+11, 4.16042886e+12, 3.69055160e+11, 3.08724683e+11,
        3.71083543e+11],
       ...,
       [2.32972294e+11, 2.25518151e+11, 2.16985500e+11, 1.52392619e+11,
        1.87686750e+11],
       [1.31495500e+11, 4.03441481e+11, 1.66796570e+11, 7.37775506e+10,
        1.55474795e+11],
       [1.26216951e+11, 5.60385882e+11, 1.49446146e+11, 7.23769941e+10,
        1.45692856e+11]])
The second one gives
array([[7.66709788e+11, 6.99939453e+11, 7.64013608e+11, 6.45261906e+11,
        5.99177884e+11],
       [2.94202392e+11, 4.16042886e+12, 3.69055160e+11, 3.08724683e+11,
        3.71083543e+11],
       [2.92378752e+11, 2.93673677e+12, 3.58775694e+11, 3.06625155e+11,
        3.56276475e+11],
       ...,
       [1.31495500e+11, 4.03441481e+11, 1.66796570e+11, 7.37775506e+10,
        1.55474795e+11],
       [1.26216951e+11, 5.60385882e+11, 1.49446146e+11, 7.23769941e+10,
        1.45692856e+11],
       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00]])
Why do they give different results? Is this some general fact about how Python works? I thought the indentation was proper in both cases, but apparently there are several kinds of "proper indentation" that can give different results?
mses = np.zeros((len(all_models), 5))

kfold = KFold(5,
              random_state = 614,
              shuffle = True)

j = 0
for train_index, test_index in kfold.split(cars_train):
    cars_tt = cars_train.iloc[train_index]
    cars_ho = cars_train.iloc[test_index]

    i = 0
    for model in all_models:
        if model == "baseline":
            pred = np.power(10, cars_tt.log_sell.mean()*np.ones(len(cars_ho)))
            mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
        else:
            if len(model) == 1:
                reg = LinearRegression(copy_X = True)
                reg.fit(cars_tt[model].values.reshape(-1,1), cars_tt.log_sell.values)
                pred = np.power(10, reg.predict(cars_ho[model].values))
                mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
            else:
                reg = LinearRegression(copy_X = True)
                reg.fit(cars_tt[model].values, cars_tt.log_sell.values)
                pred = np.power(10, reg.predict(cars_ho[model].values))
                mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
        i = i + 1
    j = j + 1

mses
mses = np.zeros((len(all_models), 5))

kfold = KFold(5,
              random_state = 614,
              shuffle = True)

j = 0
for train_index, test_index in kfold.split(cars_train):
    cars_tt = cars_train.iloc[train_index]
    cars_ho = cars_train.iloc[test_index]

    i = 0
    for model in all_models:
        if model == "baseline":
            pred = np.power(10, cars_tt.log_sell.mean()*np.ones(len(cars_ho)))
            mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
        else:
            if len(model) == 1:
                reg = LinearRegression(copy_X = True)
                reg.fit(cars_tt[model].values.reshape(-1,1), cars_tt.log_sell.values)
                pred = np.power(10, reg.predict(cars_ho[model].values))
                mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
            else:
                reg = LinearRegression(copy_X = True)
                reg.fit(cars_tt[model].values, cars_tt.log_sell.values)
                pred = np.power(10, reg.predict(cars_ho[model].values))
                mses[i, j] = mean_squared_error(cars_ho.selling_price, pred)
            i = i + 1
    j = j + 1

mses
CodePudding user response:
In the first snippet, i = i + 1 is indented to line up with the outer if/else, so it runs after that whole block, once per model:

if model == "baseline":
    ...
else:
    if len(model) == 1:
        ...
    else:
        ...
i = i + 1
In the second snippet it is indented so that it sits inside the outer else: block:

if model == "baseline":
    ...
else:
    if len(model) == 1:
        ...
    else:
        ...
    i = i + 1
This difference affects when, and how many times, i = i + 1 is executed. In the first version i advances for every model, so every row of mses gets filled. In the second version i only advances for non-baseline models, so the baseline result is overwritten by the next model and the last row of mses is never written, which is why it stays all zeros in the second output.
If Python had curly braces it would be the difference between:
if model == "baseline" {
    ...
}
else {
    if len(model) == 1 {
        ...
    }
    else {
        ...
    }
}
i = i + 1
and:
if model == "baseline" {
    ...
}
else {
    if len(model) == 1 {
        ...
    }
    else {
        ...
    }
    i = i + 1
}
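
To see the effect in isolation, here is a minimal, self-contained sketch (a toy list standing in for all_models; not the asker's data) showing how the indentation of the increment changes how many times it runs:

items = ["baseline", "a", "b"]

# Increment lined up with the if/else: it runs once for every item.
i = 0
for item in items:
    if item == "baseline":
        pass
    else:
        pass
    i = i + 1
print(i)  # 3

# Increment inside the else: it is skipped on the "baseline" iteration.
i = 0
for item in items:
    if item == "baseline":
        pass
    else:
        i = i + 1
print(i)  # 2

Both versions are syntactically valid, which is why Python raises no error; the indentation alone decides which block the statement belongs to.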