I am wondering whether it is possible to use voting for classification tasks. I have seen plenty of blogs explaining how to use voting (averaging the predictions) for regression, as shown below.
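# imports needed by this snippet (X_train, y_target, X_test, y_test are assumed to exist)
import xgboost as xgb
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# initializing all the model objects with default parameters
model_1 = LinearRegression()
model_2 = xgb.XGBRegressor()
model_3 = RandomForestRegressor()
# training all the models on the training dataset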
model_1.fit(X_train, y_target)
model_2.fit(X_train, y_target)
model_3.fit(X_train, y_target)
# predicting the output on the validation dataset
pred_1 = model_1.predict(X_test)
pred_2 = model_2.predict(X_test)
pred_3 = model_3.predict(X_test)
# final prediction: the average of the predictions of all 3 models
pred_final = (pred_1 + pred_2 + pred_3) / 3.0
# printing the mean squared error between real value and predicted value
print(mean_squared_error(y_test, pred_final))
CodePudding user response:
Yes, the same idea works for classes; only the function that combines the votes changes. This is conceptually how Random Forests arrive at their prediction: the individual decision trees in the forest "vote" for a common prediction. You can, for example, take a majority vote over all classifiers. Alternatively, you can use the individual predictions to form a probability estimate for your prediction: each class gets as its output the fraction of votes it received.
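To make the two options concrete, here is a minimal sketch in plain Python. The helper names `majority_vote` and `vote_fractions` and the toy label lists are made up for illustration; they are not part of any library.

```python
from collections import Counter


def majority_vote(predictions):
    """Return, for each sample, the label predicted by the most models.

    predictions: list of per-model label lists, all of the same length.
    Ties are broken in favour of the label seen first among the models.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def vote_fractions(predictions):
    """Return, for each sample, a dict mapping label -> share of the votes."""
    n_models = len(predictions)
    return [
        {label: count / n_models for label, count in Counter(votes).items()}
        for votes in zip(*predictions)
    ]


# hypothetical predictions of three classifiers for four samples
pred_1 = ["cat", "dog", "dog", "cat"]
pred_2 = ["cat", "cat", "dog", "dog"]
pred_3 = ["dog", "cat", "dog", "cat"]

labels = majority_vote([pred_1, pred_2, pred_3])
fractions = vote_fractions([pred_1, pred_2, pred_3])

print(labels)        # one final label per sample
print(fractions[0])  # vote shares for the first sample
```

With an odd number of models and two classes there can be no ties; with more classes or an even number of models you would need to decide how ties should be resolved.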
CodePudding user response:
That can be done.
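# imports needed by this snippet (X_all and y are assumed to exist)
from sklearn import svm
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier
# initializing all the model objects with default parameters
model_1 = svm.SVC(kernel='rbf')
model_2 = XGBClassifier()
model_3 = RandomForestClassifier()
# making the final model using a voting classifier (hard = majority vote)
final_model = VotingClassifier(estimators=[('svc', model_1), ('xgb', model_2), ('rf', model_3)], voting='hard')
# applying 10-fold cross validation
scores = cross_val_score(final_model, X_all, y, cv=10, scoring='accuracy')
print(scores)
print('Model accuracy score : {0:0.4f}'.format(scores.mean()))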
You can add more than three machine learning models if necessary. Note that I have applied cross validation here and reported the mean accuracy across the folds.
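If you would rather average predicted class probabilities than count votes, `VotingClassifier` also accepts `voting='soft'`. The sketch below is only an illustration under some assumptions: it uses synthetic data from `make_classification` in place of your `X_all` and `y`, swaps XGBoost for `LogisticRegression` so it runs with scikit-learn alone, and sets `probability=True` on the SVC because soft voting needs `predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# synthetic stand-in data (an assumption; replace with your own X_all, y)
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)

# every estimator must expose predict_proba for soft voting,
# hence probability=True on the SVC
soft_model = VotingClassifier(
    estimators=[
        ("svc", SVC(kernel="rbf", probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)

# 5-fold cross validation to keep the demo quick
soft_scores = cross_val_score(soft_model, X_demo, y_demo, cv=5, scoring="accuracy")
print("Soft-voting accuracy : {0:0.4f}".format(soft_scores.mean()))
```

Whether hard or soft voting works better depends on the data and on how well calibrated the individual models' probabilities are, so it is worth comparing both with cross validation.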