Why does my generator-based code reset with any batch size and fill up my RAM?
Import the required libraries:
import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
Load and split the data:
cifar10_data = tf.keras.datasets.cifar10
(train_images, train_labels), (test_images, test_labels) = cifar10_data.load_data()
CLASS_NAMES= ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
validation_images, validation_labels = train_images[:5000], train_labels[:5000]
train_images, train_labels = train_images[5000:], train_labels[5000:]
Build tf.data datasets from the (image, label) pairs:
train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
test_ds = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
validation_ds = tf.data.Dataset.from_tensor_slices((validation_images, validation_labels))
Define the preprocessing function:
def process_images(image, label, size=227):
    # Normalize images to have a mean of 0 and standard deviation of 1
    image = tf.image.per_image_standardization(image)
    # Resize images from 32x32 to 227x227
    image = tf.image.resize(image, (size, size))
    return image, label
Use tf.data to get the size of each dataset:
train_ds_size = tf.data.experimental.cardinality(train_ds).numpy()
test_ds_size = tf.data.experimental.cardinality(test_ds).numpy()
validation_ds_size = tf.data.experimental.cardinality(validation_ds).numpy()
print("Training data size:", train_ds_size)
print("Test data size:", test_ds_size)
print("Validation data size:", validation_ds_size)
Use tf.data methods to produce batches of 64:
train_ds = (train_ds
            .map(process_images)
            .shuffle(buffer_size=train_ds_size)
            .batch(batch_size=64, drop_remainder=True))
test_ds = (test_ds
           .map(process_images)
           .shuffle(buffer_size=train_ds_size)
           .batch(batch_size=64, drop_remainder=True))
validation_ds = (validation_ds
                 .map(process_images)
                 .shuffle(buffer_size=train_ds_size)
                 .batch(batch_size=64, drop_remainder=True))
Define the model:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters=96, kernel_size=(11,11), strides=(4,4), activation='relu', input_shape=(227,227,3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    tf.keras.layers.Conv2D(filters=256, kernel_size=(5,5), strides=(1,1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    tf.keras.layers.Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])
Compile the model:
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.SGD(learning_rate=0.001), metrics=['accuracy'])
# model.summary()
Fit the model on the dataset:
history = model.fit(train_ds,
                    epochs=1,
                    validation_data=validation_ds, verbose=1,
                    validation_freq=1)
How can I use a generator like this without running into this problem? I need a generator in my code to solve the memory problem, but I don't know how to use this type of generator correctly.
CodePudding user response:
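You must reduce the shuffle buffer size. Because shuffle runs after map in your pipeline, a buffer of train_ds_size elements tries to hold every resized 227x227x3 float32 image in RAM at once.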
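To put rough numbers on this, here is a minimal back-of-the-envelope sketch in plain Python. It assumes the shuffle buffer holds buffer_size fully materialized elements and ignores labels and any framework overhead. The helper names image_bytes and shuffle_buffer_gb and the constant TRAIN_SIZE are made up for illustration and are not part of TensorFlow.

```python
# Rough estimate of how much RAM a tf.data shuffle buffer holds.
# Assumption: the buffer stores `buffer_size` fully materialized images;
# labels and framework overhead are ignored.

def image_bytes(height, width, channels, bytes_per_value):
    """Size in bytes of one dense image tensor."""
    return height * width * channels * bytes_per_value


def shuffle_buffer_gb(buffer_size, bytes_per_image):
    """Approximate shuffle-buffer size in gigabytes (10**9 bytes)."""
    return buffer_size * bytes_per_image / 1e9


# 50,000 CIFAR-10 training images minus the 5,000 held out for validation.
TRAIN_SIZE = 45_000

# After process_images: 227x227x3 float32 (4 bytes per value).
resized = image_bytes(227, 227, 3, 4)
# Before process_images: 32x32x3 uint8 (1 byte per value).
raw = image_bytes(32, 32, 3, 1)

print(f"shuffle after map, full buffer:  {shuffle_buffer_gb(TRAIN_SIZE, resized):.1f} GB")
print(f"shuffle after map, buffer=1000:  {shuffle_buffer_gb(1_000, resized):.2f} GB")
print(f"shuffle before map, full buffer: {shuffle_buffer_gb(TRAIN_SIZE, raw):.2f} GB")
```

Under these assumptions the full buffer of resized images needs about 27.8 GB, a buffer of 1,000 resized images about 0.62 GB, and a full buffer of the original 32x32 uint8 images about 0.14 GB. So besides shrinking buffer_size, it may help to call .shuffle(...) before .map(process_images), so that the buffer stores the small original images and the resize happens only as batches are drawn.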
CodePudding user response:
The cause is the stack of dense layers with so many units (neurons), which leads to overflow and OOM. For this model, the two 4096-unit dense layers contain 37,752,832 and 16,781,312 trainable parameters respectively, which makes it a really enormous model.
Try again with fewer units in the dense layers. In convolutional models the dense layers only classify the extracted feature maps, so they do not need that many units. Put the emphasis on designing a good convolutional base instead.
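The parameter counts quoted above can be reproduced by hand. The sketch below is plain Python and follows the layer shapes of the model in the question, assuming Keras defaults ('valid' padding where none is given) and float32 weights. The helpers conv_out and dense_params are hypothetical names for this illustration, and the 512-unit alternative is only an example size, not a recommendation from the answer.

```python
# Sketch: reproduce the dense-layer parameter counts for the model in the question.
# Assumes Keras defaults: 'valid' padding unless padding="same" is given.

def conv_out(size, kernel, stride, padding):
    """Output side length of a conv or pooling layer."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


side = 227
side = conv_out(side, 11, 4, "valid")  # Conv2D 11x11, stride 4
side = conv_out(side, 3, 2, "valid")   # MaxPool2D 3x3, stride 2
side = conv_out(side, 5, 1, "same")    # Conv2D 5x5, same padding
side = conv_out(side, 3, 2, "valid")   # MaxPool2D 3x3, stride 2
for _ in range(3):                     # three Conv2D 3x3, same padding
    side = conv_out(side, 3, 1, "same")
side = conv_out(side, 3, 2, "valid")   # MaxPool2D 3x3, stride 2

flat = side * side * 256               # Flatten after the last 256-filter conv
fc1 = dense_params(flat, 4096)
fc2 = dense_params(4096, 4096)

print("flattened features:", flat)
print("first dense layer: ", fc1, "params,", fc1 * 4 / 1e6, "MB as float32")
print("second dense layer:", fc2, "params,", fc2 * 4 / 1e6, "MB as float32")

# Hypothetical smaller head, for comparison only.
small_fc1 = dense_params(flat, 512)
print("first dense layer with 512 units:", small_fc1, "params")
```

The weights alone come to roughly 151 MB and 67 MB in float32. Optimizer state and activations add to that during training, so the actual footprint depends on the batch size and the optimizer in use.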