I am trying to train a convolutional neural network on PNG images. First, I convert an image into a tensor:
# Assumed imports (not shown in the original post)
import tensorflow as tf
from tensorflow import io

file_path = "./STFT_spectra/STFT_spectra0.png"
image = io.read_file(file_path)
image = io.decode_png(image)
image = tf.image.convert_image_dtype(image, tf.float32)
image = tf.image.resize(image, [128, 128])
print("----Here-----")
print(type(image))
print(image.shape)
Output:
----Here-----
<class 'tensorflow.python.framework.ops.EagerTensor'>
(128, 128, 4)
Then I convert all of my generated images the same way and save the resulting NumPy array to disk as "image_list.p":
# total = 100
# image_list = np.empty(shape=(total, 128, 128, 4))
# for i in tqdm(range(total)):
#     file_path = f"./STFT_spectra/STFT_spectra{i}.png"
#     image = io.read_file(file_path)
#     image = io.decode_png(image)
#     image = tf.image.convert_image_dtype(image, tf.float32)
#     image = tf.image.resize(image, [128, 128])
#     image_list[i] = image
# pickle.dump(image_list, open("image_list.p", "wb"))
As for the ground truth, each label is a vector of 10 floats, like [0.2, 0.3, 0.5, 0.6, 0.9, 0.5, 0.4, 0.6, 0.7, 0.1].
Then, I assemble the dataset:
labels = pickle.load(open(".././labels.p", "rb"))
fetched_image_list = pickle.load(open("../image_list.p", "rb"))
fetched_image_list = fetched_image_list.reshape(fetched_image_list.shape[0],
                                                fetched_image_list.shape[1],
                                                fetched_image_list.shape[2],
                                                fetched_image_list.shape[3],
                                                1)
dataset = tf.data.Dataset.from_tensor_slices((fetched_image_list, labels))
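For reference, the shape this reshape produces can be checked on a stand-in array. This is only a sketch: the zero-filled `images` array is hypothetical and simply mimics the 100 x 128 x 128 x 4 image list built above.

```python
import numpy as np

# Hypothetical stand-in for the pickled image list: 100 RGBA images of 128x128.
images = np.zeros((100, 128, 128, 4), dtype=np.float32)

# Same reshape as in the post: append a trailing axis of length 1.
reshaped = images.reshape(images.shape[0],
                          images.shape[1],
                          images.shape[2],
                          images.shape[3],
                          1)

print(images.shape)    # shape before the reshape (4-D)
print(reshaped.shape)  # shape after the reshape (5-D)
```

With the default channels_last data format, Keras Conv2D reads the last axis as channels and the two axes before it as height and width. For the 5-D array those two axes have sizes 128 and 4, which is consistent with the (None, 128, 63, 1, 32) shape reported in the error.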
The CNN model goes like this:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), strides=(2,2), dilation_rate=(1,1), input_shape=(128,128,4,1), activation='relu'),
    tf.keras.layers.Conv2D(71, (3, 3), strides=(2,2), dilation_rate=(1,1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 4), strides=(2,3), dilation_rate=(1,1),activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2,2), dilation_rate=(1,1),activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 4), strides=(2, 3), dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2, 2), dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Dropout(0.20),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10)
])
Everything looks fine to me, but building the model raises this error:
ValueError: Exception encountered when calling layer "conv2d_1" (type Conv2D).
Negative dimension size caused by subtracting 3 from 1 for '{{node conv2d_1/Conv2D/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true](conv2d_1/Conv2D/Reshape, conv2d_1/Conv2D/Conv2D/ReadVariableOp)' with input shapes: [?,63,1,32], [3,3,32,71].
Call arguments received by layer "conv2d_1" (type Conv2D):
• inputs=tf.Tensor(shape=(None, 128, 63, 1, 32), dtype=float32)
How can I fix this? Is there a problem with how the CNN is defined?
Answer:
You need to use padding in your convolutional layers, because the spatial dimensions shrink until the kernel no longer fits. The relevant part of the error is:
Negative dimension size caused by subtracting 3 from 1 [...] with input shapes: [?,63,1,32], [3,3,32,71].
The input to the second layer has a spatial dimension of size 1, which is smaller than the 3x3 kernel, and an output dimension cannot be negative. So add padding="same" to at least the second convolutional layer. The default is "valid", which means no padding.
tf.keras.layers.Conv2D(71, (3, 3), strides=(2,2), padding="same", dilation_rate=(1,1), activation='relu'),
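To see where the dimensions run out, the output sizes can be traced by hand. The sketch below is plain Python, not TensorFlow: `conv_output_size` and `trace_shapes` are hypothetical helpers that apply the usual output-size formulas for "valid" and "same" padding (dilation 1) to the kernel sizes and strides from the question. The starting spatial size of (128, 4) is an assumption taken from the shapes in the error message.

```python
import math

def conv_output_size(size, kernel, stride, padding="valid"):
    """Output length of one spatial axis after a strided convolution (dilation 1)."""
    if padding == "same":
        # "same" pads the input so only the stride reduces the size.
        return math.ceil(size / stride)
    if size < kernel:
        # This is the situation TensorFlow reports as a negative dimension size.
        raise ValueError(f"kernel of size {kernel} does not fit into input of size {size}")
    return (size - kernel) // stride + 1

# (kernel, strides) of the six Conv2D layers in the question.
LAYERS = [((3, 3), (2, 2)), ((3, 3), (2, 2)), ((3, 4), (2, 3)),
          ((3, 3), (2, 2)), ((3, 4), (2, 3)), ((3, 3), (2, 2))]

def trace_shapes(height, width, padding):
    """Return the (height, width) seen before the first layer and after each layer."""
    shapes = [(height, width)]
    for (kh, kw), (sh, sw) in LAYERS:
        height = conv_output_size(height, kh, sh, padding)
        width = conv_output_size(width, kw, sw, padding)
        shapes.append((height, width))
    return shapes

print(trace_shapes(128, 4, "same"))
try:
    trace_shapes(128, 4, "valid")
except ValueError as err:
    print("valid padding fails:", err)
```

With "valid" padding the first layer already reduces (128, 4) to (63, 1), so the second layer cannot apply a 3x3 kernel. Under the same formulas, later layers with "valid" padding would hit the same problem once a dimension drops below the kernel size, so it may be simplest to set padding="same" on every Conv2D layer.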