During a class at university we built a multiclass classification model. I wanted to repeat the task on my own, but my model does not work and I don't know where the problem is. The original dataframe has 20,000 rows; for this post I prepared a df with 20 rows. My model's accuracy stays at about 0.03 (on the 20k-row df) and does not change between epochs. I would be grateful for any suggestions about where I made a mistake. My code:
import pandas as pd
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
df = pd.DataFrame({'lettr': ['T','I','D','N','G','S','B','A','J','M','X','O','G','M','R','F','O','C','T', 'J'],
'x-box': [2, 5, 4, 7, 2, 4, 4, 1, 2, 11, 3, 6, 4, 6, 5, 6, 3, 7, 6, 2],
'y-box': [8, 12, 11, 11, 1, 11, 2, 1, 2, 15, 9, 13, 9, 9, 9, 9, 4, 10, 11, 2],
'width': [3, 3, 6, 6, 3, 5, 5, 3, 4, 13, 5, 4, 6, 8, 5, 5, 4, 5, 6, 3],
'high': [5, 7, 8, 6, 1, 8, 4, 2, 4, 9, 7, 7, 7, 6, 7, 4, 3, 5, 8, 3],
'onpix':[1, 2, 6, 3, 1, 3, 4, 1, 2, 7, 4, 4, 6, 9, 6, 3, 2, 2, 5, 1],
'x-bar':[8, 10, 10, 5, 8, 8, 8, 8, 10, 13, 8, 6, 7, 7, 6, 10, 8, 6, 6, 10],
'y-bar':[13, 5, 6, 9, 6, 8, 7, 2, 6, 2, 7, 7, 8, 8, 11, 6, 7, 8, 11, 6],
'x2bar':[0, 5, 2, 4, 6, 6, 6, 2, 2, 6, 3, 6, 6, 6, 7, 3, 7, 6, 5, 3],
'y2bar':[6, 4, 6, 6, 6, 9, 6, 2, 6, 2, 8, 3, 2, 5, 3, 5, 5, 8, 6, 6],
'xybar':[6, 13, 10, 4, 6, 5, 7, 8, 12, 12, 5, 10, 6, 7, 7, 10, 7, 11, 11, 12],
'x2ybr':[10, 3, 3, 4, 5, 6, 6, 2, 4, 1, 6, 7, 5, 5, 3, 5, 6, 7, 9, 4],
'xy2br':[8, 9, 7, 10, 9, 6, 6, 8, 8, 9, 8, 9, 11, 8, 9, 7, 8, 11, 4, 9],
'x-ege':[0, 2, 3, 6, 1, 0, 2, 1, 1, 8, 2, 5, 4, 8, 2, 3, 2, 2, 3, 0],
'xegvy':[8, 8, 7, 10, 7, 8, 8, 6, 6, 1, 8, 9, 8, 9, 7, 9, 8, 8, 12, 7],
'y-ege':[0, 4, 3, 2, 5, 9, 7, 2, 1, 1, 6, 5, 7, 8, 5, 6, 3, 5, 2, 1],
'yegvx':[8, 10, 9, 8, 10, 7, 10, 7, 7, 8, 7, 8, 8, 6, 11, 9, 8, 9, 4, 7],
})
def naiveEncode(col):
    values = list(col.unique())
    return col.apply(lambda x: values.index(x))
df["lettr"] = naiveEncode(df["lettr"])
X = df.iloc[:,1:].values
y = df.iloc[:, 0].values
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
y1 = encoder.fit_transform(y)
Y = pd.get_dummies(y1).values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="softmax"),
])
model.compile(Adam(learning_rate=0.04), 'categorical_crossentropy', metrics=['accuracy'])
model.summary()
model.fit(X_train,y_train,epochs=10)
CodePudding user response:
The problem comes from the fact that your y variable does not contain your desired target column.
y = df.iloc[:, 0].values
takes the first column as the target, whereas you want df["lettr"] as the target. Replace it with:
y = df["lettr"].values
Then adapt your inputs X in the same way:
X = df.loc[:, df.columns != "lettr"].values
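As a quick sanity check, here is a small sketch on a hypothetical three-row frame (made up for illustration, but with lettr as the first column, as in the question). It compares the positional selection with the selection by name. When lettr is the first column, the two appear to give identical arrays, so this change on its own may not alter training:

```python
import pandas as pd

# Hypothetical miniature frame: "lettr" is the first column, as in the question.
df = pd.DataFrame({
    "lettr": [0, 1, 2],
    "x-box": [2, 5, 4],
    "y-box": [8, 12, 11],
})

# Positional selection, as in the question.
y_pos = df.iloc[:, 0].values
X_pos = df.iloc[:, 1:].values

# Selection by column name, as suggested in this answer.
y_name = df["lettr"].values
X_name = df.loc[:, df.columns != "lettr"].values

same_target = (y_pos == y_name).all()
same_features = (X_pos == X_name).all()
print(same_target, same_features)
```

Selecting by name is still the more robust habit, since it keeps working if the column order changes.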
CodePudding user response:
Sorry for my first answer; I completely missed that you are using the categorical_crossentropy loss in a case where it does not apply.
Your labels are integer class indices, so you should use sparse_categorical_crossentropy instead of categorical_crossentropy, which expects one-hot encoded labels. See: https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other
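To illustrate the difference, here is a minimal pure-Python sketch. It is not Keras's implementation; the function names and the logits are made up for illustration. Both losses compute the same quantity and differ only in the label format they expect: an integer class index for the sparse variant, a one-hot vector for the other.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(probs, label):
    # Integer label: pick the predicted probability of the true class.
    return -math.log(probs[label])

def categorical_cross_entropy(probs, one_hot):
    # One-hot label: sum over classes; only the true class contributes.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

probs = softmax([2.0, 0.5, -1.0])  # hypothetical logits for 3 classes
label = 0                          # integer-encoded target
one_hot = [1.0, 0.0, 0.0]          # the same target, one-hot encoded

loss_sparse = sparse_cross_entropy(probs, label)
loss_onehot = categorical_cross_entropy(probs, one_hot)
print(loss_sparse, loss_onehot)
```

So either keep the integer labels and use the sparse loss, or one-hot encode the labels and keep categorical_crossentropy; mixing the two is what goes wrong.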
Also, you should change your output layer to a Dense layer whose number of units matches the number of classes you have (26 if all letters of the alphabet occur), with a softmax activation function.
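A small sketch of why a single softmax unit cannot work (plain Python, with made-up logits): softmax normalizes across the output units, so with only one unit the output is 1.0 whatever the input. That would likely explain an accuracy that never changes between epochs.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One output unit: whatever the logit is, the single probability is 1.0.
single_unit = [softmax([z])[0] for z in (-5.0, 0.0, 3.2)]
print(single_unit)

# Three output units: the probabilities depend on the logits and sum to 1.
three_units = softmax([2.0, 0.5, -1.0])
print(three_units, sum(three_units))
```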
By the way, I suggest passing your test data as validation data during model training.
The code with these modifications:
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows; y holds the integer class labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    # One output unit per class
    tf.keras.layers.Dense(26, activation="softmax"),
])
# The sparse loss expects integer labels rather than one-hot vectors
model.compile(Adam(learning_rate=0.04), 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=10)
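The 26 output units above assume that every letter occurs in the data. As a rough sketch, the number of units could instead be derived from the labels. Here naive_encode is a hypothetical list-based stand-in for the question's naiveEncode, applied to the letters of the 20-row sample:

```python
def naive_encode(values):
    # Same idea as naiveEncode in the question: index of first appearance.
    seen = []
    encoded = []
    for v in values:
        if v not in seen:
            seen.append(v)
        encoded.append(seen.index(v))
    return encoded, seen

letters = list("TIDNGSBAJMXOGMRFOCTJ")  # the 20-row sample from the question
y, classes = naive_encode(letters)

num_classes = len(classes)
print(num_classes)

# Sparse labels must lie in [0, number of output units).
assert min(y) >= 0 and max(y) < num_classes
```

On the 20-row sample only 15 distinct letters occur, so 26 units still work (every label is below 26), but some outputs would never be used. On the full 20,000-row data the count is presumably 26.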