I keep getting this error, even though I tried to fix it by reshaping the array: data = data.reshape(-1, 1)
My output:
Traceback (most recent call last):
File "C:\Users\USER\Desktop\python\machine-learning\bot4.py", line 93, in <module>
predictions = model.predict(data)
File "C:\Users\USER\Desktop\python\machine-learning\machine-learningVenv\lib\site-packages\sklearn\naive_bayes.py", line 105, in predict
X = self._check_X(X)
File "C:\Users\USER\Desktop\python\machine-learning\machine-leaningVenv\lib\site-packages\sklearn\naive_bayes.py", line 579, in _check_X
return self._validate_data(X, accept_sparse="csr", reset=False)
File "C:\Users\USER\Desktop\python\machine-learning\machine-learningVenv\lib\site-packages\sklearn\base.py", line 546, in _validate_data
X = check_array(X, input_name="X", **check_params)
File "C:\Users\USER\Desktop\python\machine-learning\machine-learningVenv\lib\site-packages\sklearn\utils\validation.py", line 902, in check_array
raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=['The cat is sleeping in the sun.' 'The dog is barking at the moon.'].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I am expecting the output:
[{"Cat": "Sleeping", "Dog": "barking"}]
CodePudding user response:
scikit-learn expects its input as a 2D array: N×1 for N samples with a single feature, or 1×F for a single sample with F features.
A list like yours, ["a", "b"], is 1D rather than 2D, which causes the error.
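To make the shape difference concrete, here is a minimal sketch using NumPy. The array contents are placeholders, not your actual data:

```python
import numpy as np

# A 1D array of two items, like the one in the error message (placeholder values).
data = np.array(["a", "b"])
print(data.shape)  # one axis only

# Two samples with one feature each: shape (n_samples, 1).
as_column = data.reshape(-1, 1)

# One sample with two features: shape (1, n_features).
as_row = data.reshape(1, -1)

print(as_column.shape, as_row.shape)
```

Either reshape satisfies the 2D requirement, but it only changes the shape; the values are still strings.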
As Dr. Snoopy's comment said, you generally cannot pass strings to the model; you need to preprocess them first, for example with LabelEncoder and/or OneHotEncoder.
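A rough sketch of that preprocessing step, assuming a single categorical feature and string labels. The `animals` and `labels` values are made up for illustration and are not taken from your script:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical categorical feature and target labels (illustrative only).
animals = np.array(["cat", "dog", "cat"])
labels = ["sleeping", "barking", "sleeping"]

# OneHotEncoder itself needs 2D input, so reshape to (n_samples, 1) first.
feature_encoder = OneHotEncoder(handle_unknown="ignore")
X = feature_encoder.fit_transform(animals.reshape(-1, 1)).toarray()

# LabelEncoder is meant for the target: it maps each class name to an integer.
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(labels)

print(X)
print(y)
print(label_encoder.inverse_transform(y))  # back to the original strings
```

`X` and `y` are now numeric and can be passed to `fit`; at prediction time, new inputs would go through `feature_encoder.transform` in the same way before calling `predict`.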