I'm getting an error when trying to fit a sklearn model. TypeError: Only size-1 arrays can be c

Time:11-17

from PIL import Image
import glob
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X = []
y = []

classes = [r"Anthracnose", r"Leaf Crinkcle", r"Powdery Mildew", r"Yellow Mosaic", r"Healthy"]

for i in range(5):
    for filename in glob.glob(r"C:\-\-\-\-\-\-\\" + classes[i] + r"/*.jpg"):
        image = Image.open(filename)
        matrix_temp = np.array(image)
        X.append(matrix_temp)
        y.append(i)

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2)


X_train = np.array(X_train, dtype=object).reshape(-1,1)
y_train = np.array(y_train).reshape(-1,1)
X_test = np.array(X_test, dtype=object).reshape(-1,1)
y_test = np.array(y_test).reshape(-1,1)


model = MLPClassifier()
model.fit(X_train, y_train)

Error:

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\path_to_file\plants.py", line 32, in <module>
    model.fit(X_train, y_train)
  File "C:\path_to_file\_multilayer_perceptron.py", line 762, in fit
    return self._fit(X, y, incremental=False)
  File "C:\path_to_file\_multilayer_perceptron.py", line 394, in _fit
    X, y = self._validate_input(X, y, incremental, reset=first_pass)
  File "C:\path_to_file\_multilayer_perceptron.py", line 1109, in _validate_input
    X, y = self._validate_data(
  File "C:\path_to_file\base.py", line 596, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "C:\path_to_file\validation.py", line 1074, in check_X_y
    X = check_array(
  File "C:\path_to_file\validation.py", line 856, in check_array
    array = np.asarray(array, order=order, dtype=dtype)
ValueError: setting an array element with a sequence.

I'm confused about what this error means and don't know what to try. I would appreciate any help!

Answer:

The problem seems to be the shape of X_train. Before your reshape, X has the shape (number_of_samples, height, width), with an extra channel axis for colour images. MLPClassifier's fit method expects the shape (number_of_samples, number_of_features). You therefore need to flatten each 2D image into a 1D feature vector, for example by concatenating its rows, so that X has the shape (number_of_samples, height x width).

However, in your example you transform X_train with reshape(-1, 1), which produces the shape (number_of_samples x height x width, 1): every pixel becomes its own sample with a single feature.

Illustrative example: Let's say you have 2 images of size 3x3 each. The goal is to transform them such that each 'image' is of size 1x9 instead of size 3x3. For example, you can use numpy's reshape method:

import numpy as np

a = np.zeros((2, 3, 3))  # Shape is (2, 3, 3): like 2 images of size 3x3 each
a_reshaped = a.reshape((2, 9))  # Flatten each image into a row of 9 features
print(a_reshaped)
a_reshaped_2 = a.reshape((2, -1))  # Same result as above; -1 infers the 9
print(a_reshaped_2)
a_reshaped_wrong = a.reshape(-1, 1)  # What you do
print(a_reshaped_wrong.shape)  # (18, 1): compare with a_reshaped.shape, (2, 9)
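
Applied to the pipeline from the question, the same idea might look like the sketch below. The random arrays are stand-ins for the loaded images (an assumption: 20 RGB images, all 8x8), and max_iter and random_state are arbitrary choices that keep the example small and repeatable. The labels stay 1-D, which is what scikit-learn expects for y.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data (assumption): 20 equally sized 8x8 RGB "images", 5 classes
X = [rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8) for _ in range(20)]
y = [i % 5 for i in range(20)]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Stack the images into one numeric array, then flatten each image into a row
X_train = np.stack(X_train).reshape(len(X_train), -1) / 255.0
X_test = np.stack(X_test).reshape(len(X_test), -1) / 255.0

# Keep the labels 1-D: shape (number_of_samples,)
y_train = np.array(y_train)
y_test = np.array(y_test)

model = MLPClassifier(max_iter=50, random_state=0)
model.fit(X_train, y_train)  # may warn about convergence on this toy data
predictions = model.predict(X_test)
print(X_train.shape, predictions.shape)
```

Dividing by 255 only scales the pixel values to the range 0 to 1; it is optional but usually helps an MLP train.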

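Finally, note that all images must have the same size. Images with different heights and widths cannot be stacked into one numeric array, which also leads to an error. The dtype=object in your code suggests this may be the case here: an object array holding differently sized images is a likely source of the "setting an array element with a sequence" message.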
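
If the images do differ in size, one option is to resize them to a common size while loading. The following is a minimal sketch under that assumption: the helper name load_as_vector and the 64x64 target size are hypothetical choices, and the blank images created with Image.new stand in for files opened with Image.open.

```python
import numpy as np
from PIL import Image


def load_as_vector(image, size=(64, 64)):
    """Convert a PIL image to a flat float vector of fixed length.

    Hypothetical helper: forces RGB so that every image has 3 channels,
    resizes to a common size, then flattens to 1D.
    """
    image = image.convert("RGB").resize(size)
    return np.asarray(image).astype(np.float64).ravel() / 255.0


# Stand-in images with different sizes and modes (assumption)
images = [Image.new("RGB", (100, 80)), Image.new("L", (50, 120))]

X = np.stack([load_as_vector(img) for img in images])
print(X.shape)  # (2, 12288), i.e. 64 * 64 * 3 features per image
```

Resizing changes the aspect ratio of images that are not already square, so cropping or padding may be a better fit depending on the data.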

I hope that helps.
