RAM Overflow Colab, when running model.fit() in Image Classifier of AutoKeras for many images


I'm trying to create an Image Classifier on a dataset with 40'000 images, in order to let AutoKeras find the most appropriate model for me afterwards. The problem is that I can load all the images and get their labels, but when I run the normalization in Google Colab, there is a RAM overflow (even though I have a Pro account). Here is my code:

# Import TensorFlow
%tensorflow_version 2.x
import tensorflow as tf

import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import normalize, to_categorical
!pip install autokeras
import autokeras as ak

import glob
import os
import cv2
import numpy as np

images = glob.glob(path + '/*.png')  # path: directory containing the png files

import random

data = []
labels = []

for i in images:
    image=tf.keras.preprocessing.image.load_img(i, color_mode='rgb')
    image=np.array(image, dtype ='float32')
    image=cv2.resize(image, (180, 180))
    image/=255.0
    data.append(image)
    label=os.path.basename(str(i.replace('.png', '')))
    label=label.split()[0]
    labels.append(label)

data = np.array(data)
labels = np.array(labels)
print(labels)

Up until here everything works like a charm, but then I reach the part that creates the overflow:

# normalize feature and encode label
X = data 
y = np.zeros(labels.shape)
indices = np.unique(labels)
for i in range(labels.shape[0]):
  y[i] = np.where(labels[i] == indices)[0]
y = to_categorical(y)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                random_state=42) # this line seems to be the problem where Colab has a RAM overflow

clf= ak.ImageClassifier(overwrite=True, max_trials=20)
clf.fit(X_train, y_train, epochs=10) # and also this line seems to be the problem where Colab has a RAM overflow.

Does anyone know what the problem might be? A hint in any direction would be highly appreciated, as this is driving me crazy!

CodePudding user response:

I highly recommend using tf.data.Dataset for creating the dataset:

  1. Apply all the processing you want on the images (such as resizing and normalization) with dataset.map.
  2. Instead of using train_test_split, use dataset.take and dataset.skip to split the dataset.

Code for generating random images and labels:

# !pip install autokeras
import tensorflow as tf
import autokeras as ak
import numpy as np

data = np.random.randint(0, 255, (45_000,32,32,3))
label = np.random.randint(0, 10, 45_000)
label = tf.keras.utils.to_categorical(label) 

Convert data and labels to a tf.data.Dataset and process them (only about 55 ms for 45,000 images, benchmarked on Colab):

dataset = tf.data.Dataset.from_tensor_slices((data, label))
def resize_normalize_preprocess(image, label):
    image = tf.image.resize(image, (16, 16))
    image = image / 255.0
    return image, label

# %%timeit 
dataset = dataset.map(resize_normalize_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
# 1 loop, best of 5: 54.9 ms per loop
dataset_size = len(dataset)
train_size = int(0.8 * dataset_size)
test_size = int(0.2 * dataset_size)

dataset = dataset.shuffle(32)
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)

print(f'Size dataset : {len(dataset)}')
print(f'Size train_dataset : {len(train_dataset)}')
print(f'Size test_dataset : {len(test_dataset)}')

clf = ak.ImageClassifier(overwrite=True, max_trials=1)
clf.fit(train_dataset, epochs=1)
print(clf.evaluate(test_dataset))

Output:

Size dataset : 45000
Size train_dataset : 36000
Size test_dataset : 9000

Search: Running Trial #1

Value             |Best Value So Far |Hyperparameter
vanilla           |?                 |image_block_1/block_type
True              |?                 |image_block_1/normalize
False             |?                 |image_block_1/augment
3                 |?                 |image_block_1/conv_block_1/kernel_size
1                 |?                 |image_block_1/conv_block_1/num_blocks
2                 |?                 |image_block_1/conv_block_1/num_layers
True              |?                 |image_block_1/conv_block_1/max_pooling
False             |?                 |image_block_1/conv_block_1/separable
0.25              |?                 |image_block_1/conv_block_1/dropout
32                |?                 |image_block_1/conv_block_1/filters_0_0
64                |?                 |image_block_1/conv_block_1/filters_0_1
flatten           |?                 |classification_head_1/spatial_reduction_1/reduction_type
0.5               |?                 |classification_head_1/dropout
adam              |?                 |optimizer
0.001             |?                 |learning_rate

Result of the search for the best hyperparameters and of training:

Trial 1 Complete [00h 01m 16s]
val_loss: 2.3030436038970947

Best val_loss So Far: 2.3030436038970947
Total elapsed time: 00h 01m 16s
INFO:tensorflow:Oracle triggered exit
1125/1125 [==============================] - 68s 60ms/step - loss: 2.3072 - accuracy: 0.0979
INFO:tensorflow:Assets written to: ./image_classifier/best_model/assets
282/282 [==============================] - 26s 57ms/step - loss: 2.3025 - accuracy: 0.0970
[2.302501916885376, 0.09700000286102295]
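
For the original use case, where the 40'000 PNG files live on disk, a similar pipeline can be built directly from the file paths so the images are decoded lazily instead of being held in one big NumPy array. The following is only a minimal sketch under the question's assumptions (a `path` folder of PNGs whose file names start with the label); helper names such as `class_to_index` are illustrative, not part of AutoKeras.

# Minimal sketch (assumptions: `path` points to the folder with the PNG files,
# and the label is the first whitespace-separated token of the file name).
import os
import glob
import tensorflow as tf
import autokeras as ak

path = '/content/images'  # assumed location, adjust to your folder
files = glob.glob(path + '/*.png')

# One-hot encode the labels derived from the file names
label_names = [os.path.basename(f).replace('.png', '').split()[0] for f in files]
classes = sorted(set(label_names))
class_to_index = {c: i for i, c in enumerate(classes)}  # illustrative mapping
labels = tf.keras.utils.to_categorical(
    [class_to_index[name] for name in label_names], num_classes=len(classes))

def load_image(file_path, label):
    # Read, decode and preprocess one image at a time inside the input pipeline
    image = tf.io.read_file(file_path)
    image = tf.image.decode_png(image, channels=3)
    image = tf.image.resize(image, (180, 180))
    image = image / 255.0
    return image, label

dataset = tf.data.Dataset.from_tensor_slices((files, labels))
dataset = dataset.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)

# Split with take/skip as above
train_size = int(0.8 * len(files))
train_dataset = dataset.take(train_size).prefetch(tf.data.AUTOTUNE)
test_dataset = dataset.skip(train_size).prefetch(tf.data.AUTOTUNE)

clf = ak.ImageClassifier(overwrite=True, max_trials=20)
clf.fit(train_dataset, epochs=10)
print(clf.evaluate(test_dataset))

Since the images are only decoded as they are requested, peak RAM stays roughly constant no matter how many files are in the folder, which is what avoids the Colab overflow.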