I want to do some image processing: I put in an image and get back another one (a mask). For testing, my dataset consists of only one image and its mask:
train_data = tf.data.Dataset.from_tensors((img, mask))
Both are of shape (720, 1280, 3).
I tried using a simple model:
model = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=(720, 1280, 3)),
    tf.keras.layers.Conv2D(128, 5, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(128, 5, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2DTranspose(2, [720, 1280])
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])
But I receive this error:
ValueError: Input 0 of layer "sequential_24" is incompatible with the layer: expected shape=(None, 720, 1280, 3), found shape=(720, 1280, 3)
I'm pretty sure this is very easy to fix, but I couldn't manage to do so. Basically, the "array size" (batch dimension) is missing. I tried playing around with [] and (), and with tf.data.Dataset.from_tensor_slices, but the best I could achieve was a shape of (2, 720, 1280, 3), where the label column was then missing...
Any idea on how to correctly set up the dataset or adjust the model?
CodePudding user response:
- You need a batch dimension (an equivalent .batch(1) sketch is shown after this list):
from_tensors(([img], [mask]))
- You need to modify your model, because it produces output whose shape does not match the mask shape: the two MaxPooling2D layers downsample the input by a factor of 4 (to (180, 320)), so two stride-2 Conv2DTranspose layers are needed to upsample back to (720, 1280).
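If you prefer not to wrap the tensors in lists, the same batch dimension can be added with Dataset.batch (a minimal, equivalent sketch):
# build the dataset from the unbatched tensors, then let .batch(1)
# add the leading batch dimension (a batch of one sample)
train_data = tf.data.Dataset.from_tensors((img, mask)).batch(1)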
Complete example:
import tensorflow as tf
import numpy as np
# dummy data: one RGB image and a single-channel integer mask
img = np.zeros((720, 1280, 3), dtype=np.uint8)
mask = np.zeros((720, 1280, 1), dtype=np.uint8)
# wrapping each tensor in a list adds the batch dimension
train_data = tf.data.Dataset.from_tensors(([img], [mask]))
model = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=(720, 1280, 3)),
    tf.keras.layers.Conv2D(128, 5, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),   # -> (360, 640)
    tf.keras.layers.Conv2D(128, 5, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),   # -> (180, 320)
    tf.keras.layers.Conv2DTranspose(128, kernel_size=3, strides=2, padding='same', activation='relu'),  # -> (360, 640)
    # no activation on the last layer: the loss expects raw logits (from_logits=True)
    tf.keras.layers.Conv2DTranspose(2, kernel_size=3, strides=2, padding='same'),  # -> (720, 1280, 2)
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])
model.summary()
model.fit(train_data)
model.evaluate(train_data)
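To inspect the prediction as a single-channel mask, take the argmax over the two output channels (a usage sketch, not part of the original answer):
pred = model.predict(train_data)       # raw logits, shape (1, 720, 1280, 2)
pred_mask = tf.argmax(pred, axis=-1)   # per-pixel class ids, shape (1, 720, 1280)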
CodePudding user response:
I think it is because your images have shape (720, 1280, 3). But even if you are providing only one image, you should add another dimension that indicates the number of samples in your dataset. In this case, since you have only one element, the correct shapes are:
import tensorflow as tf
import numpy as np
# sample image and mask; the leading 1 is the number of samples
img = np.ones((1, 720, 1280, 3))
mask = np.ones((1, 720, 1280, 3))
train_data = tf.data.Dataset.from_tensors((img, mask))
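A quick sanity check on the resulting shapes (not in the original answer) is the dataset's element_spec:
print(train_data.element_spec)
# (TensorSpec(shape=(1, 720, 1280, 3), dtype=tf.float64, name=None),
#  TensorSpec(shape=(1, 720, 1280, 3), dtype=tf.float64, name=None))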
To convert an existing image into this format:
img = np.ones((720, 1280, 3))
img = np.expand_dims(img, axis=0) # new shape: (1, 720, 1280, 3)
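Equivalently, the same leading dimension can be added with NumPy indexing or in TensorFlow:
img = img[np.newaxis, ...]         # shape: (1, 720, 1280, 3)
img = tf.expand_dims(img, axis=0)  # TensorFlow alternative, same shape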