I am very new to deep learning in general. I tried to make a program that takes a number from 1-12 and returns one of the letters A, B, C, D or E based on which value it is. The generator randomly assigns each number to a letter upon generation, and the model is supposed to guess which number is assigned to which letter. I used HashingVectorizer to convert the letters into an array of 5 values, and my code is as below:
from keras.models import Sequential
from keras.layers import Dense
import pandas as pd
from sklearn.feature_extraction.text import HashingVectorizer
# load the dataset
dataset = pd.read_csv('testbase.csv')
vectorizer = HashingVectorizer(n_features=5)
# split into input (X) and output (y) variables
input = dataset['col2']
result = vectorizer.transform(list(dataset['col1'])).toarray()
# define the keras model
model = Sequential()
model.add(Dense(5, input_dim=1, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(5, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(input, result, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(input, result)
print('Accuracy: %.2f' % (accuracy*100))
The model doesn't seem to learn the mapping; can someone figure out why this is the case?
CodePudding user response:
binary_crossentropy is not a good loss function for this, because you're not doing binary classification (you have five categories to classify into). I would try categorical crossentropy first, and if that still yields bad results, experiment with the learning rate, the optimizer, etc.
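As a minimal sketch of what that switch amounts to (the sample labels and predicted probabilities below are made up for illustration): with one-hot targets, categorical cross-entropy averages -log of the probability the model assigns to the single true class, instead of scoring each of the five outputs as an independent yes/no decision.

```python
import numpy as np

# Hypothetical one-hot targets for three samples whose true classes
# (out of the five letters A-E) are A, C and E.
y_true = np.eye(5)[[0, 2, 4]]

# Softmax-style predicted probabilities, one row per sample (rows sum to 1).
y_pred = np.array([[0.70, 0.10, 0.10, 0.05, 0.05],
                   [0.10, 0.10, 0.60, 0.10, 0.10],
                   [0.20, 0.20, 0.20, 0.20, 0.20]])

# Categorical cross-entropy: mean of -log(probability of the true class).
cce = -np.mean(np.log(np.sum(y_true * y_pred, axis=1)))
print(round(cce, 4))  # -> 0.8256
```

In Keras terms this corresponds to a softmax output layer compiled with loss='categorical_crossentropy' (or 'sparse_categorical_crossentropy' if the labels are kept as integers rather than one-hot arrays).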