Why does the tf.keras.Model training flag significantly alter the prediction result?

I was recently going through the TensorFlow pix2pix tutorial, and after playing with it a bit I unexpectedly realized that the predictions of a tf.keras.Model (in this case the Generator() from the tutorial) differ considerably depending on whether the prediction is made with the training flag set to True or to False.

Here is the code to demonstrate the issue:

# ...Tutorial steps where I load the model instead of creating a new one...
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))


for example_input, example_target in test_dataset.take(1):
  train_res1 = generate_images(generator, example_input, example_target)       # Function definition as per the tutorial, except that I return the 'Predicted Image'
  train_res2 = generate_images(generator, example_input, example_target)       # Since training is True (which will alter the model), a small RGB difference is expected
  notrain_res2 = generate_images2(generator, example_input, example_target)    # Identical to 'generate_images' except that 'training=False'; should be identical or similar to the previous one
  
  r_avg = np.average(train_res1[:,:, 0])
  g_avg = np.average(train_res1[:,:, 1])
  b_avg = np.average(train_res1[:,:, 2])
  print(f"Training flag true iteration#1 = R average: {r_avg}, G average: {g_avg}, B average: {b_avg}")
  r_avg = np.average(train_res2[:,:, 0])
  g_avg = np.average(train_res2[:,:, 1])
  b_avg = np.average(train_res2[:,:, 2])
  print(f"Training flag true iteration#2 = R average: {r_avg}, G average: {g_avg}, B average: {b_avg}")
  r_avg = np.average(notrain_res2[:,:, 0])
  g_avg = np.average(notrain_res2[:,:, 1])
  b_avg = np.average(notrain_res2[:,:, 2])
  print(f"Training flag false            = R average: {r_avg}, G average: {g_avg}, B average: {b_avg}")

Just to avoid any confusion, here is the code of generate_images2, which is identical to generate_images from the tutorial except that it uses 'training=False' and returns the prediction:

def generate_images2(model, test_input, tar):
  prediction = model(test_input, training=False)
  plt.figure(figsize=(15, 15))

  display_list = [test_input[0], tar[0], prediction[0]]
  title = ['Input Image', 'Ground Truth', 'Predicted Image']

  for i in range(3):
    plt.subplot(1, 3, i + 1)
    plt.title(title[i], color = "w")
    # Getting the pixel values in the [0, 1] range to plot.
    plt.imshow(display_list[i] * 0.5 + 0.5)
    plt.axis('off')
  plt.show()
  return display_list[2]

Here you can visualize my concerns with the training flag.

As expected, there are minor differences between the RGB values of iteration#1 and iteration#2 with training flag = True. This is expected due to training model alterations.
However, when training flag = False, I would expect the RGB values to be similar or identical to iteration#2 with training flag = True, but if you look at the door in the yellow and red circles, the RGB values are clearly different.
The result is pretty much always better with training=True.

Question: Why does the tf.keras.Model training flag significantly alter the prediction result?

CodePudding user response：

Here's an answer from the TensorFlow repo:

https://github.com/tensorflow/tensorflow/issues/36936

Some things only happen during training; for example, dropout is applied. If training=False, dropout layers are ignored and pass their input through unchanged (see https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout).
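
To make the effect of the flag concrete, here is a minimal NumPy sketch of how a layer can branch on a training argument. It is a simplified illustration, not the Keras implementation: the function names, the eps value, and the stored moving statistics are all made up for the example. Besides dropout, it also sketches a batch normalization layer, on the assumption that the model contains one, since that kind of layer also behaves differently in the two modes (statistics of the current batch while training, stored moving averages otherwise).

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout (hypothetical helper): only active when training is True."""
    if not training:
        return x  # inference mode: the input passes through unchanged
    keep_mask = rng.random(x.shape) >= rate   # each unit is kept with probability 1 - rate
    return x * keep_mask / (1.0 - rate)       # rescale so the expected value is preserved


def batch_norm(x, moving_mean, moving_var, training, eps=1e-3):
    """Simplified batch normalization (hypothetical helper, no learned scale/offset)."""
    if training:
        # training mode: normalize with the statistics of the current batch
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        # inference mode: normalize with statistics accumulated earlier
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(8, 4))  # a made-up batch of 8 samples

# Made-up stored statistics that do not match this particular batch.
moving_mean = np.zeros(4)
moving_var = np.ones(4)

drop_train_1 = dropout(x, 0.5, True, rng)
drop_train_2 = dropout(x, 0.5, True, rng)
drop_infer = dropout(x, 0.5, False, rng)

bn_train = batch_norm(x, moving_mean, moving_var, training=True)
bn_infer = batch_norm(x, moving_mean, moving_var, training=False)

print("dropout, two training calls identical:", np.array_equal(drop_train_1, drop_train_2))
print("dropout, inference equals input:      ", np.array_equal(drop_infer, x))
print("batch norm, training == inference:    ", np.allclose(bn_train, bn_infer))
```

In this sketch, two calls in training mode differ from each other because the dropout mask is random, which would be consistent with the small differences between iteration#1 and iteration#2 in the question. The larger gap with training=False would then come from dropout being switched off and, if batch normalization is involved, from the switch to stored statistics.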