preprocess_input changes array inplace, but doesn't change tensor


I've noticed some strange behaviour in preprocess_input, the function used to preprocess images so that their values are normalised correctly for the specific pre-trained network you are using. After several hours of debugging, it appears that when a tensor is used as input, the input tensor is left unmodified and the processed result is returned as a new tensor:

import tensorflow as tf

tensor = tf.ones(3) * 100
print(tensor)
tensor2 = tf.keras.applications.mobilenet_v2.preprocess_input(tensor)
print(tensor)   # input tensor is unchanged
print(tensor2)  # processed values come back as a new tensor

returns

tf.Tensor([100. 100. 100.], shape=(3,), dtype=float32)
tf.Tensor([100. 100. 100.], shape=(3,), dtype=float32)
tf.Tensor([-0.21568626 -0.21568626 -0.21568626], shape=(3,), dtype=float32)

However, when doing the exact same thing with a numpy array as input, besides returning the processed version as a new array, the function also changes the original array in place so that it matches the new one:

import numpy as np
import tensorflow as tf

array = np.ones(3) * 100
print(array)
array2 = tf.keras.applications.mobilenet_v2.preprocess_input(array)
print(array)    # the input array has been overwritten
print(array2)
array += 1      # modify the input after preprocessing
print(array)
print(array2)   # the output changes along with it

returns

[100. 100. 100.]
[-0.21568627 -0.21568627 -0.21568627]       # <== input has changed!!!
[-0.21568627 -0.21568627 -0.21568627]
[0.78431373 0.78431373 0.78431373]
[0.78431373 0.78431373 0.78431373]          # <== further changes to input change output
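
A quick way to see why further changes to the input show up in the output (not in the original snippet, just a sanity check you could run) is to test whether the returned array shares memory with the input:

import numpy as np
import tensorflow as tf

array = np.ones(3) * 100
array2 = tf.keras.applications.mobilenet_v2.preprocess_input(array)

# True when preprocessing wrote over the input: both names refer to the same
# buffer, so in-place edits to one are visible through the other.
print(np.shares_memory(array, array2))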

Three questions:

  1. Why is the behaviour not uniform?
  2. Why is it considered beneficial for the original array to be changed?
  3. Why does preprocess_input both return the new values and modify the input in place? Isn't it usually one or the other? Doing both is confusing.

CodePudding user response:

To be fair, the docs do mention this behaviour:

The preprocessed data are written over the input data if the data types are compatible. To avoid this behaviour, numpy.copy(x) can be used.

So that kind of answers Q1: tensors are immutable, so they can't be overwritten, whereas np arrays are mutable and can be changed in-place.
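
In practice, following the docs' suggestion and passing a copy keeps the original array intact; a minimal sketch of that advice:

import numpy as np
import tensorflow as tf

array = np.ones(3) * 100
# Pass a copy so preprocess_input writes over the copy, not the original.
array2 = tf.keras.applications.mobilenet_v2.preprocess_input(np.copy(array))

print(array)   # [100. 100. 100.]  original is untouched
print(array2)  # [-0.21568627 -0.21568627 -0.21568627]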

Note: this isn't a great answer. If this function is going to be used on both tensors and arrays, surely its behaviour should be fixed so that it treats arrays the same way it treats tensors, i.e. it should never change the input; then the behaviour is uniform and people know what to expect. A small wrapper that always copies numpy inputs (sketched below) would give you that today.
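
For example, something along these lines; preprocess_input_safe is a made-up name, not an existing Keras API:

import numpy as np
import tensorflow as tf

def preprocess_input_safe(x):
    # Hypothetical wrapper: never modifies the caller's input, mirroring
    # the behaviour already seen with tensors.
    if isinstance(x, np.ndarray):
        x = np.copy(x)  # work on a copy so the original array is left alone
    return tf.keras.applications.mobilenet_v2.preprocess_input(x)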
