I have a Gradient Boosting Regressor model and would like to save its results to a CSV file.
from sklearn import ensemble
clf = ensemble.GradientBoostingRegressor(n_estimators=400, max_depth=5, min_samples_split=2,
                                         learning_rate=0.1, loss='squared_error')
clf.fit(x_train, y_train)
I would like to see the actuals and predicted values.
CodePudding user response:
After fitting your model, you need to generate the predictions with clf.predict. Let's say we want to save y_pred_train and y_pred_test:
# predict takes the feature matrices, not the targets
y_pred_train = clf.predict(x_train)
y_pred_test = clf.predict(x_test)
Now that we have both the predicted and the actual values for the train and test sets, we can save each pair to a .csv file with np.savetxt:
import numpy as np
np.savetxt("train_preds.csv", np.column_stack((y_train, y_pred_train)), delimiter=",", fmt='%.5f')
And for the test set:
np.savetxt("test_preds.csv", np.column_stack((y_test, y_pred_test)), delimiter=",", fmt='%.5f')
We use np.column_stack to stack them as columns rather than as rows, since both are 1D NumPy arrays. Each row of the file then holds one sample: the actual value first, the predicted value second.
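To make the column-versus-row difference concrete, here is a minimal sketch. The arrays actual and predicted are made-up stand-ins for y_train and y_pred_train, not values from the question.

```python
import numpy as np

# Hypothetical stand-ins for y_train and y_pred_train (illustrative values only)
actual = np.array([1.0, 2.0, 3.0])
predicted = np.array([1.1, 1.9, 3.2])

# column_stack turns each 1D array into a column: one row per sample
stacked = np.column_stack((actual, predicted))

# vstack would instead put each array on its own row
as_rows = np.vstack((actual, predicted))

print(stacked.shape)  # (3, 2)
print(as_rows.shape)  # (2, 3)
```

With the column layout, every line of the CSV is an (actual, predicted) pair, which is usually what you want when inspecting results in a spreadsheet.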
Note that I set the floating-point format only for readability. If you are going to calculate the loss from the saved file afterwards, it is better to leave out fmt so the values are not rounded.
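As a rough illustration of the precision point, the sketch below writes the same data twice, once with fmt='%.5f' and once with NumPy's default format, and reads both back. The arrays, the file names, and the header text are assumptions made for this example. The header and comments arguments of np.savetxt are used here to add a column header line, which the original answer does not do.

```python
import os
import tempfile
import numpy as np

# Hypothetical values standing in for actuals and predictions
actual = np.array([0.1, 0.2, 0.3])
predicted = np.array([0.123456789, 0.2, 0.3000001])
data = np.column_stack((actual, predicted))

out_dir = tempfile.mkdtemp()
full_path = os.path.join(out_dir, "preds_full.csv")        # hypothetical file name
rounded_path = os.path.join(out_dir, "preds_rounded.csv")  # hypothetical file name

# Default format keeps full float64 precision; comments="" stops NumPy
# from prefixing the header line with "# "
np.savetxt(full_path, data, delimiter=",", header="actual,predicted", comments="")

# Rounded to 5 decimals, as in the answer above
np.savetxt(rounded_path, data, delimiter=",", fmt="%.5f")

full = np.loadtxt(full_path, delimiter=",", skiprows=1)  # skip the header line
rounded = np.loadtxt(rounded_path, delimiter=",")

# Mean squared error computed from each file
mse_full = np.mean((full[:, 0] - full[:, 1]) ** 2)
mse_rounded = np.mean((rounded[:, 0] - rounded[:, 1]) ** 2)
```

The file written with the default format reloads to the same values that were saved, while the rounded file does not, so a loss computed from it can differ slightly.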