Image normalization with tf.image.convert_image_dtype

According to the documentation for tf.image.convert_image_dtype, "Images that are represented using floating point values are expected to have values in the range [0,1)."

But in the Keras tutorial (https://keras.io/examples/vision/cutmix/) I have seen the following preprocessing function:

def preprocess_image(image, label):
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    image = tf.image.convert_image_dtype(image, tf.float32) / 255.0
    return image, label

My question is: why did they divide by 255, when tf.image.convert_image_dtype already did that job?

User response:

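tf.image.convert_image_dtype only rescales values when it converts between an integer type and a floating point type: a uint8 image converted to tf.float32 is scaled into [0, 1]. In this function, however, tf.image.resize runs first, and with its default bilinear method it already returns a float32 tensor whose values are still in the 0 to 255 range. Calling convert_image_dtype(image, tf.float32) on a tensor that is already float32 does nothing, so the values are not moved into [0,1). The division by 255.0 is what actually normalizes the image, and we do this before feeding the images to the convolutional layers.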

import tensorflow_datasets as tfds
import tensorflow as tf

dataset = tfds.load('cifar10', as_supervised=True, split='train').batch(1)

# Raw CIFAR-10 images are uint8 with values in 0..255.
for image, label in dataset.take(1):
    print(image[0])

IMG_SIZE = 64
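def preprocess_image(image, label):
    # resize (bilinear by default) returns float32, values still in 0..255
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # float32 -> float32 is a no-op here, so the division does the scaling
    image = tf.image.convert_image_dtype(image, tf.float32) / 255.0
    # or, equivalently:
    # image = tf.cast(image, tf.float32) / 255.0
    return image, label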

dataset = dataset.map(preprocess_image)
for image, label in dataset.take(1):
    print(image[0])

Output:

tf.Tensor(
[[[143  96  70]
  [141  96  72]
  [135  93  72]
  ...
  [212 177 147]
  [219 185 155]
  [221 187 157]]], shape=(32, 32, 3), dtype=uint8)


tf.Tensor(
[[[0.56078434 0.3764706  0.27450982]
  [0.5588235  0.3764706  0.2764706 ]
  [0.55490196 0.3764706  0.28039217]
  ...
  [0.8607843  0.72745097 0.6098039 ]
  [0.86470586 0.73137254 0.6137255 ]
  [0.8666667  0.73333335 0.6156863 ]]], shape=(64, 64, 3), dtype=float32)
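
To see the dtype dependence in isolation, here is a minimal sketch on a made-up one-pixel image (the tensor values and the variable names are illustrative, not taken from the tutorial). It compares convert_image_dtype on uint8 input, on the float32 output of tf.image.resize, and the alternative of converting before resizing:

```python
import numpy as np
import tensorflow as tf

# Hypothetical 1x1 RGB "image" with uint8 values in 0..255.
uint8_image = tf.constant([[[0, 128, 255]]], dtype=tf.uint8)

# Integer input -> float output: values are rescaled into [0, 1].
scaled = tf.image.convert_image_dtype(uint8_image, tf.float32)

# Default (bilinear) resize returns float32 but keeps the 0..255 range.
resized = tf.image.resize(uint8_image, (2, 2))

# float32 -> float32: nothing to convert, values stay in 0..255.
unchanged = tf.image.convert_image_dtype(resized, tf.float32)

# So the explicit division is what normalizes the resized image.
normalized = unchanged / 255.0

# Alternative order: convert first (uint8 -> [0, 1]), then resize.
converted_first = tf.image.resize(
    tf.image.convert_image_dtype(uint8_image, tf.float32), (2, 2)
)

print(scaled.dtype, float(tf.reduce_max(scaled)))
print(resized.dtype, float(tf.reduce_max(resized)))
print(unchanged.dtype, float(tf.reduce_max(unchanged)))
print(float(tf.reduce_max(normalized)), float(tf.reduce_max(converted_first)))
```

If convert_image_dtype is applied to the uint8 image before resizing, as in converted_first, the extra division by 255.0 should not be needed.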