X_train = df_train["Base_Reviews"].values
X_test = df_test["Base_Reviews"].values
y_train = df_train['category'].values
y_test = df_test['category'].values
num_words = 20000  # Max. number of words kept in the tokenizer vocabulary
max_features = 15000  # Max. number of unique words in the embedding vector
max_len = 200  # Max. number of words to use per text
embedding_dims = 128  # Embedding vector output dimension
num_epochs = 5  # Number of epochs (number of times the model is exposed to the training dataset)
val_split = 0.2
batch_size2 = 256
tokenizer = Tokenizer(num_words=num_words, lower=False)
tokenizer.fit_on_texts(list(X_train))
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)
X_train = sequence.pad_sequences(X_train, max_len)
X_test = sequence.pad_sequences(X_test, max_len)
print('X_train shape:', X_train.shape)
print('X_test shape: ', X_test.shape)
This is the shape of our dataset: X_train shape: (11419, 200), X_test shape: (893, 200)
X_tra, X_val, y_tra, y_val = train_test_split(X_train, y_train, train_size =0.8, random_state=233)
early = EarlyStopping(monitor="val_loss", mode="min", patience=4)
nn_model = Sequential([
    Embedding(input_dim=max_features, input_length=max_len, output_dim=embedding_dims),
    GlobalMaxPool1D(),
    Dense(50, activation='relu'),
    Dropout(0.2),
    Dense(5, activation='softmax')
])
def mean_pred(y_true, y_pred):
    return K.mean(y_pred)
nn_model.compile(loss="categorical_crossentropy", optimizer=Adam(0.01), metrics=['accuracy', mean_pred, fmeasure, precision, auroc, recall])
When I fit the model on the data, I get the error below. How can I resolve it? This is the error:
ValueError
Traceback (most recent call last)
<ipython-input-51-a3721a91aa0b> in <module>
----> 1 nn_model_fit = nn_model.fit(X_tra, y_tra, batch_size=batch_size2, epochs=num_epochs, validation_data=(X_val, y_val), callbacks=[early])
~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
~\anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
1145 except Exception as e: # pylint:disable=broad-except
1146 if hasattr(e, "ag_error_metadata"):
-> 1147 raise e.ag_error_metadata.to_exception(e)
1148 else:
1149 raise
ValueError: in user code:
**ValueError: Shapes (None, 1) and (None, 5) are incompatible**
CodePudding user response:
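You have to map your labels to integer values. Note that categorical_crossentropy expects one-hot targets of shape (None, 5), so after the mapping either one-hot encode the integers or compile with loss="sparse_categorical_crossentropy". The mapping itself: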
import numpy as np
labels_index = dict(zip(["issue", "supporting", "decision", "neutral", "attacking"], np.arange(5)))
y_train = np.array([labels_index[y] for y in y_train])
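To make the shape requirement concrete, here is a minimal sketch of the full label encoding using only NumPy. It assumes the five class names from the mapping above match the values in the `category` column. The helper `encode_labels` and the toy `sample` array are hypothetical and only stand in for the real labels.

```python
import numpy as np

# Assumed class names, copied from the mapping above; adjust to the real dataset.
class_names = ["issue", "supporting", "decision", "neutral", "attacking"]
labels_index = {name: i for i, name in enumerate(class_names)}


def encode_labels(labels, num_classes=len(class_names)):
    """Map string labels to integer ids, then to one-hot rows."""
    ids = np.array([labels_index[y] for y in labels], dtype="int64")
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype="float32")[ids]
    return ids, one_hot


# Toy labels standing in for df_train['category'].values (hypothetical data).
sample = np.array(["neutral", "issue", "attacking"])
ids, one_hot = encode_labels(sample)
print(ids.shape, one_hot.shape)  # (3,) (3, 5)
```

The one-hot array has shape (n_samples, 5), which is what `categorical_crossentropy` compares against the softmax output. In Keras, `tensorflow.keras.utils.to_categorical(ids, num_classes=5)` should produce the same array. Alternatively, keep the integer ids and compile with `loss="sparse_categorical_crossentropy"`. Either way, the encoding has to be applied to `y_train` before `train_test_split` (so that `y_tra` and `y_val` are both encoded) and to `y_test` as well.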