Why isn't my neural network able to classify correctly?


I made a simple neural network to classify food into only two classes, egg or meat. However, every time I train the model it gives a constant result regardless of the input image: after one training run it recognizes every image as meat, and after another it recognizes every image as egg. I don't know whether there is a mistake in my code.

Here is where I read the data:

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    directory,
    labels="inferred",
    label_mode="int",
    class_names= None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False
    )
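Note that image_dataset_from_directory yields raw pixel values in the [0, 255] range; in the code below the scaling happens only inside the training loop. A minimal alternative sketch (not the original code, and assuming TF 2.4+ for tf.data.AUTOTUNE) that normalizes once in the input pipeline instead:

normalized_ds = train_ds.map(lambda x, y: (x / 255.0, y))   # scale pixels to [0, 1]
normalized_ds = normalized_ds.prefetch(tf.data.AUTOTUNE)    # overlap preprocessing and training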

Here is where I predict, using the softmax activation function after flattening the data:

def forward(x):
    return tf.matmul(x, W) + b

def model(x):
    x = flatten(x)
    return activate(x)

def activate(x):
    return tf.nn.softmax(forward(x))

Calculating the error using cross-entropy:

def cross_entropy(y_label, y_pred):
    return -tf.reduce_sum(y_label * tf.math.log(y_pred + 1.e-10))

Updating the parameters using gradient descent:

optimizer = tf.keras.optimizers.SGD(learning_rate=0.25)

def train_step(x, y):
    with tf.GradientTape() as tape:
        # compute the loss
        current_loss = cross_entropy(y, model(x))
    # compute the gradient of the loss w.r.t. W and b
    # (this is automatic, even with specialized functions!)
    grads = tape.gradient(current_loss, [W, b])
    # apply one SGD step to our Variables W and b
    optimizer.apply_gradients(zip(grads, [W, b]))
    return current_loss.numpy()

And finally, training the model:

# Weight tensor (256*256*3 = 196608 flattened inputs, 2 classes)
W = tf.Variable(tf.zeros([196608, 2],tf.float32))
# Bias tensor
b = tf.Variable(tf.zeros([2],tf.float32))

loss_values=[]
accuracies = []
epochs = 100

for i in range(epochs):
    j=0
    # each batch has 32 examples (batch_size set above)
    for x_train_batch, y_train_batch in train_ds:
        
        j += 1
        current_loss = train_step(x_train_batch/255.0, tf.one_hot(y_train_batch,2))
        if j%500 == 0: #reporting intermittent batch statistics
            print("epoch ", str(i), "batch", str(j), "loss:", str(current_loss) ) 
 
Update:
I have discovered that the problem is in the gradients: they are always zero except on the very first step.
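A self-contained toy snippet (illustrative numbers only, not the real data) shows why this can happen: once the logits are large, float32 softmax saturates to exact zeros and ones, and the gradient that flows back through it is numerically zero, so SGD never moves.

import tensorflow as tf

z = tf.Variable([[1000.0, 0.0]])   # huge logits, like the unscaled inputs produce
y = tf.constant([[0.0, 1.0]])      # one-hot label

with tf.GradientTape() as tape:
    p = tf.nn.softmax(z)                                 # [[1., 0.]] exactly: exp(-1000) underflows
    loss = -tf.reduce_sum(y * tf.math.log(p + 1.e-10))

print(tape.gradient(loss, z))      # numerically zero gradient, so W and b never change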

CodePudding user response:

Just to be sure: are you actually iterating over the training data and calling your training step? In the code you provided there is no training loop that goes over train_ds, so the behavior looks like it comes down to initialization, because you never actually train the model.

CodePudding user response:

After days of debugging I discovered that the gradients are zero, and then found out why: it was the softmax function. The values of x_train were very large, so softmax outputs exact zeros and ones, which makes the updates tend to zero. To work around it, I just divided the arguments of forward(x) by a large number.
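Dividing the arguments of forward(x) works, but the more conventional fix is to keep the model outputting raw logits and let TensorFlow fuse softmax and cross-entropy into one numerically stable op, together with a small random initialization instead of zeros. A minimal sketch, reusing the W, b and shapes from the question (the helper names model_logits and stable_loss are made up here):

# Small random init instead of zeros so the two classes can separate.
W = tf.Variable(tf.random.normal([196608, 2], stddev=0.01))
b = tf.Variable(tf.zeros([2]))

def model_logits(x):
    x = tf.reshape(x, [tf.shape(x)[0], -1])    # flatten each 256x256x3 image
    return tf.matmul(x, W) + b                 # raw logits, no softmax here

def stable_loss(y_onehot, logits):
    # softmax + cross-entropy computed together, numerically stable
    return tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_onehot, logits=logits))

Probabilities, if needed for prediction, can still be obtained afterwards with tf.nn.softmax(model_logits(x)).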
