I keep getting a ValueError when trying to fit my data to my scikit-learn ANN model. The error is: "ValueError: Unknown label type: (array([0.836, 0.741, 0.789, ..., 0.74 , 0.812, 0.748]),)"
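import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
# df is the DataFrame holding the song features
x = df[['danceability', 'energy', 'loudness', 'tempo']].values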
y = df['valence'].values
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25)
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
mlp = MLPClassifier(hidden_layer_sizes=(10, 10, 10), max_iter=1000)
mlp.fit(X_train, y_train)
All 5 columns passed in contain float values; 4 of them, including valence, contain only values between 0 and 1. I've tried np.array(__).reshape(-1,1) and similar, but I'm not sure what "unknown label type" means.
Answer:
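According to the scikit-learn documentation:
"When doing classification in scikit-learn, y is a vector of integers or strings."
Your y holds continuous floats, which MLPClassifier cannot interpret as class labels, hence the "Unknown label type" error.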
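To check how scikit-learn categorises a target array, sklearn.utils.multiclass.type_of_target can be called on it directly. A minimal sketch, using a few made-up values in place of the real valence column (the variable names are only for illustration):

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Made-up values standing in for df['valence'].values
y_float = np.array([0.836, 0.741, 0.789, 0.74])
y_int = np.array([0, 1, 1, 0])
y_str = np.array(["low", "high", "mid", "low"])

float_kind = type_of_target(y_float)  # "continuous"
int_kind = type_of_target(y_int)      # "binary"
str_kind = type_of_target(y_str)      # "multiclass"
print(float_kind, int_kind, str_kind)
```

Classifiers accept targets reported as "binary" or "multiclass"; a "continuous" target is what triggers the error in the question.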
Instead of the line:
y = df['valence'].values
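try this, which casts each value to a byte string of at most 6 characters, so the classifier treats every distinct value as its own class label: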
y = np.asarray(df['valence'], dtype="|S6")
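If valence is meant to be predicted as a number rather than as a category, two common alternatives are worth considering: fit a regressor, or bin the target into a few classes first. The sketch below is only an illustration under stated assumptions: the random arrays stand in for the real DataFrame columns, and the 0.5 threshold for "high" versus "low" valence is an arbitrary choice, not something taken from the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-ins for the four feature columns and df['valence']
rng = np.random.default_rng(0)
x = rng.random((200, 4))
y = rng.random(200)  # continuous target in [0, 1), like valence

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Option 1: treat valence as a continuous quantity and use a regressor
reg = MLPRegressor(hidden_layer_sizes=(10, 10, 10), max_iter=1000, random_state=0)
reg.fit(X_train, y_train)
pred = reg.predict(X_test)

# Option 2: bin valence into two classes (assumed threshold: 0.5)
y_train_binned = (y_train >= 0.5).astype(int)
clf = MLPClassifier(hidden_layer_sizes=(10, 10, 10), max_iter=1000, random_state=0)
clf.fit(X_train, y_train_binned)
pred_class = clf.predict(X_test)
```

With the regressor, predictions are floats on the same scale as valence; with the binned classifier, predictions are the integer class labels 0 and 1.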