I'm following the book Hands-On Machine Learning by Aurélien Géron, specifically the part where it starts covering classifiers. I'm using the code from the book, but I get this error:
ValueError: The number of classes has to be greater than one; got 1 class
My Code:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = .20, random_state = 42)
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)
When I looked up the error online, a potential fix was to use np.unique(y_train_5), but this did not work either.
CodePudding user response:
The problem is that every value in the y_train_5 you passed is the same. If you run
print(set(y_train_5))
you will see just one value. Consider doing a stratified train/test split, which makes sure that each class ends up in both the training set and the test set. Alternatively, your y_train may not contain any 5s at all, in which case every value in both y_train_5 and y_test_5 is False.
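A minimal sketch of both checks, assuming (this is a guess, the question does not show how X and y were loaded) that the labels are stored as strings such as "5" rather than integers. In that case y_train == 5 matches nothing and the target has only one class. The data below is a synthetic stand-in, not the book's dataset.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 200 samples, 4 features,
# digit labels 0-9 stored as strings in an object array.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = np.array([str(i % 10) for i in range(200)], dtype=object)

# Comparing string labels with the integer 5 matches nothing,
# so the target contains a single class (all False).
bad_mask = (y == 5)
print(np.unique(bad_mask))  # [False]

# Fix 1: convert the labels to integers before comparing.
y_int = np.array([int(label) for label in y])

# Fix 2: stratify the split so every class lands in both train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_int, test_size=0.20, random_state=42, stratify=y_int
)

y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)

# Both False and True should now be present in the training target.
print(np.unique(y_train_5, return_counts=True))

sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)
```

If the first print shows only False on your own data, check the type of the labels (for example with type(y_train[0])) before changing the split.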