I am new to machine learning, and I get an error when I try to calculate and return the accuracy on test data:
def NBAccuracy(features_train, labels_train, features_test, labels_test):
    """ compute the accuracy of your Naive Bayes classifier """
    ### import the sklearn module for GaussianNB
    from sklearn.naive_bayes import GaussianNB
    ### create classifier
    clf = GaussianNB() #TODO
    ### fit the classifier on the training features and labels
    clf.fit(features_train, labels_train, features_test, labels_test) #TODO
    ### use the trained classifier to predict labels for the test features
    pred = clf.predict(features_test, labels_test) #TODO
    ### calculate and return the accuracy on the test data
    ### this is slightly different than the example,
    ### where we just print the accuracy
    ### you might need to import an sklearn module
    from sklearn.metrics import accuracy_score
    accuracy = accuracy_score(features_test, labels_test, normalize=False) #TODO
    return accuracy
    return NBAccuracy
I got this error:
TypeError: fit() takes at most 4 arguments (5 given)
CodePudding user response:
You have training data and test data, but you should not pass the test data to fit() and then predict on that same data. Fit on the training set only:
clf.fit(features_train, labels_train)
GaussianNB.fit() takes just the features and labels (plus an optional sample_weight). batch_size and epochs are parameters of neural-network libraries such as Keras, not of scikit-learn's GaussianNB.
Accuracy should then be calculated between your predicted labels and the true test labels.
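To make the last point concrete, here is a minimal sketch in plain Python of what "accuracy" compares. The helper name fraction_correct and the label values are made up for illustration and are not from the question; sklearn's accuracy_score does the same comparison for you.

```python
def fraction_correct(true_labels, predicted_labels):
    """Return the share of predictions that match the true labels."""
    # count positions where the prediction equals the true label
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

# hypothetical values: true test labels vs. what a classifier predicted
labels_test = [0, 1, 1, 0, 1]
pred = [0, 1, 0, 0, 1]

score = fraction_correct(labels_test, pred)
print(score)  # 4 of 5 predictions match
```

Note that the features never enter this comparison: only the true labels and the predicted labels do.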
CodePudding user response:
These areas need to be corrected:
(1) fit() needs only 2 arguments, the training data used to train the model: features_train and labels_train
(2) prediction is done on the test data only, i.e., features_test
(3) accuracy is computed by comparing the true labels, labels_test, with the predictions, pred
(4) only 1 return statement is needed, and it should return the variable (return accuracy), not the function (return NBAccuracy)
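from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

def NBAccuracy(features_train, labels_train, features_test, labels_test):
    """ compute the accuracy of your Naive Bayes classifier """
    clf = GaussianNB()
    # train on the training set only
    clf.fit(features_train, labels_train)
    # predict labels for the held-out test features
    pred = clf.predict(features_test)
    # the default normalize=True returns the fraction of correct predictions;
    # normalize=False would return the count of correct predictions instead
    accuracy = accuracy_score(labels_test, pred)
    return accuracy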
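To try this out without the course data, the sketch below calls such a function on synthetic data. Everything about the data is an assumption made for illustration: two well-separated random clusters, a fixed seed, and a 150/50 train/test split. The function is repeated here only so the snippet runs on its own.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

def NBAccuracy(features_train, labels_train, features_test, labels_test):
    """Fit GaussianNB on the training set and score it on the test set."""
    clf = GaussianNB()
    clf.fit(features_train, labels_train)
    pred = clf.predict(features_test)
    return accuracy_score(labels_test, pred)

# synthetic data (hypothetical): two 2-D clusters centred at -5 and +5
rng = np.random.RandomState(0)
class0 = rng.normal(loc=-5.0, scale=1.0, size=(100, 2))
class1 = rng.normal(loc=5.0, scale=1.0, size=(100, 2))
features = np.vstack([class0, class1])
labels = np.array([0] * 100 + [1] * 100)

# shuffle, then keep the first 150 rows for training and the rest for testing
order = rng.permutation(len(labels))
features, labels = features[order], labels[order]
split = 150

acc = NBAccuracy(features[:split], labels[:split],
                 features[split:], labels[split:])
print(acc)  # a fraction between 0 and 1
```

Because the clusters are far apart, the score should be close to 1; on real data expect something lower.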