Feeding a 2D image to a TensorFlow CNN for image classification

Thanks for helping with this. I'm trying to train a convolutional neural network to predict a binary aspect of my system using TensorFlow. I have roughly 1000 ndarrays of size 400x400x3 holding floats. This is similar (I think) to an RGB image; the three channels aren't colour but the outputs of three different functions, though that isn't material here. Each of these ndarrays is associated with a binary label, either 0 or 1. I'm new to TensorFlow, though reasonably solid on the principles of machine learning, so I would really appreciate guidance on decoding errors. I suspect the way I've arranged the layers is weird, but I've tried to follow the tutorial pretty closely, so I don't know what's wrong.

My code is as follows:

import numpy as np
import tensorflow as tf

no_of_samples = 1035
train_batches = 30
BATCH_SIZE = 23
SHUFFLE_BUFFER_SIZE = 50

data, labels = np.load("total_image_data.npy", allow_pickle=True), np.load("labels.npy", allow_pickle=True)

#divide into test and train sets

train_indices = np.random.choice([i for i in range(no_of_samples)], size=train_batches*BATCH_SIZE)
test_indices = [i for i in range(no_of_samples) if i not in train_indices]

data_train, labels_train = [data[i] for i in train_indices], [labels[i] for i in train_indices]
data_test, labels_test = [data[i] for i in test_indices], [labels[i] for i in test_indices]

train_dataset = tf.data.Dataset.from_tensor_slices((data_train, labels_train))
test_dataset = tf.data.Dataset.from_tensor_slices((data_test, labels_test))

train_dataset = train_dataset.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_dataset = test_dataset.batch(BATCH_SIZE)



model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(200, 5, strides=3, activation='relu', input_shape=(BATCH_SIZE, 400, 400, 3)),
    tf.keras.layers.Conv2D(100, 5, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(50, 5, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 3, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 2, activation="relu"),
    tf.keras.layers.Conv2D(25, 2, activation="relu"),
    
    tf.keras.layers.GlobalMaxPooling2D(),

    # Finally, we add a classification layer.
    tf.keras.layers.Dense(2)
])

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['sparse_categorical_accuracy'])

model.fit(labels_train, data_train, epochs=10, batch_size=BATCH_SIZE)

model.evaluate(labels_test, data_test)

And the error I get when I run it is as follows:


Traceback (most recent call last):
  File "training.py", line 36, in <module>
    model = tf.keras.Sequential([
  File "/local/**/ac3/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 530, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/local/**/ac3/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/local/**/ac3/lib/python3.8/site-packages/keras/engine/input_spec.py", line 213, in assert_input_compatibility
    raise ValueError(f'Input {input_index} of layer "{layer_name}" '
ValueError: Input 0 of layer "max_pooling2d" is incompatible with the layer: expected ndim=4, found ndim=5. Full shape received: (None, 23, 58, 58, 25)

Any help decoding this would be greatly appreciated.

UPDATE: new and exciting errors! After some help from some generous members of the community, my code now looks like this:


import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

no_of_samples = 1035
BATCH_SIZE = 16
SHUFFLE_BUFFER_SIZE = 50

data, labels = np.load("total_image_data.npy", allow_pickle=True), np.load("labels.npy", allow_pickle=True)

print(np.shape(data)) ##this outputs (1035, 400, 400, 3)
print(np.shape(labels)) ##this outputs (1035, )



#divide into test and train sets


dataset = tf.data.Dataset.from_tensor_slices((data, labels)).shuffle(SHUFFLE_BUFFER_SIZE)
test_dataset = dataset.take(100).batch(BATCH_SIZE)
train_dataset = dataset.skip(100).batch(BATCH_SIZE)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(200, 5, strides=3, activation='relu', input_shape=(400, 400, 3)),
    tf.keras.layers.Conv2D(100, 5, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(50, 5, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 3, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 2, activation="relu"),
    tf.keras.layers.Conv2D(25, 2, activation="relu"),
    
    tf.keras.layers.GlobalMaxPooling2D(),

    # Finally, we add a classification layer.
    tf.keras.layers.Dense(1)
])

print('Labels shape -->',labels.shape)
print('Labels -->', labels)

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

model.fit(train_dataset, epochs=10)
model.evaluate(test_dataset)

This compiles and runs fine, but it reports a loss of nan and an accuracy that sits unchanged at 58% across epochs (the baseline prevalence of the thing I'm looking for is 38%). Again, I throw myself upon your mercy.

CodePudding user response:

I think you need to change a few things. First, the input shape to your model does not need the batch size; it will be inferred during training, so change it to (400, 400, 3). Second, if you are working with binary labels, you need to change your loss function to tf.keras.losses.BinaryCrossentropy and your metric to tf.keras.metrics.BinaryAccuracy (or simply 'accuracy'). Finally, your output layer should have one output node instead of two: tf.keras.layers.Dense(1).

Here is a running example based on your code:

import numpy as np
import tensorflow as tf

no_of_samples = 250
BATCH_SIZE = 16
SHUFFLE_BUFFER_SIZE = 50

data, labels = np.random.random((no_of_samples, 400, 400, 3)), np.random.randint(2, size=no_of_samples)

dataset = tf.data.Dataset.from_tensor_slices((data, labels)).shuffle(SHUFFLE_BUFFER_SIZE)
test_dataset = dataset.take(50).batch(BATCH_SIZE)
train_dataset = dataset.skip(50).batch(BATCH_SIZE)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(200, 5, strides=3, activation='relu', input_shape=(400, 400, 3)),
    tf.keras.layers.Conv2D(100, 5, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(50, 5, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 3, activation="relu"),
    tf.keras.layers.Conv2D(25, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(3),
    tf.keras.layers.Conv2D(50, 2, activation="relu"),
    tf.keras.layers.Conv2D(25, 2, activation="relu"),
    
    tf.keras.layers.GlobalMaxPooling2D(),

    # Finally, we add a classification layer.
    tf.keras.layers.Dense(1)
])

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_dataset, epochs=10, validation_data=test_dataset)
print('Labels shape -->', labels.shape)
print('Labels -->', labels)

This prints:

Labels shape --> (250,)
Labels --> [1 0 0 0 0 1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 1 0 0 1 0 1 1 0 0 1 1 1 0 1 1 0 0
 0 1 0 1 1 1 0 1 1 1 1 0 0 0 0 0 1 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 1 1 1 0 0
 1 0 1 1 1 1 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 1 1 0 1
 1 1 1 1 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0 0 1 1 1 0 1 0 1 0 1 1 0 1 0 0 1 0
 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 0 0 0 0 1
 0 0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1
 0 0 1 1 1 1 1 0 0 0 1 0 0 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0]

CodePudding user response:

When the loss goes to nan, the easiest explanation is that you have either exploding or vanishing gradients. Try using gradient clipping (either by value or by norm) on your optimizer; this simply prevents the gradients from exceeding a certain magnitude.

Clipping can also affect the performance of the network, but it is always better than a loss of nan.

With your current code, the implementation would look like this:

model.compile(optimizer=tf.keras.optimizers.RMSprop(clipvalue=0.5),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

The value is just an example; if it works, try tweaking it a bit.
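
As a minimal sketch of the "by norm" option mentioned above (not part of the original answer), the same compile call with clipnorm instead of clipvalue would look like this; the 1.0 threshold is only a placeholder to tune:

model.compile(optimizer=tf.keras.optimizers.RMSprop(clipnorm=1.0),  # rescales each gradient whose norm exceeds 1.0
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

Clipping by norm preserves the direction of the gradient while limiting its size, whereas clipvalue clips each component independently.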
