I have a 2D tensor with a dynamic first dimension and a fixed second dimension. I update this tensor using segment_sum
, which may shrink the first dimension, so I want to zero-pad the modified tensor back to the same shape as the input.
The answers provided in this question did not help me, hence I am asking this question with the specifics of my use-case.
class MyLayer(layers.Layer):
    def call(self, inputs):
        x, segment_ids = inputs
        x_ = tf.math.segment_sum(x, segment_ids)
        # the first dimension of x and x_
        # may not be equal at this point, hence the zero-padding
        x_ = tf.pad(
            x_,
            [(0, tf.shape(x)[0] - tf.shape(x_)[0]), (0, 0)])
        return x_
If I'm reading how to pad unknown to fixed size and getting the shape of a dynamic size tensor correctly, the above should work!? However, a layer downstream of MyLayer
raises an error:
unsupported operand type(s) for *=: 'int' and 'NoneType'
I get this error before the first epoch even starts, regardless of the input I provide. If I remove the padding, I instead get an incompatible-shape error for only some of my inputs. I therefore suspect the error is related to the padding, and perhaps tf.shape
is the source of the NoneType
in the error message?
Example
This is a minimal example where I can get the expected output using a simple padding, as I show at the end. However, in my actual use-case, where the model is defined with a dynamic input size, i.e., the size is None
, a similar approach fails with the error explained above.
x = [[1, 10], [2, 20], [3, 30], [4, 40]]
i = [0, 1, 1, 1]
o = tf.math.segment_sum(x, i)
o = tf.get_static_value(o)
print(f"shape: {o.shape}")
print(f"\ntensor:\n{o}")
Output:
shape: (2, 2)
tensor:
[[ 1 10]
[ 9 90]]
Expected output
o = tf.pad(o, [(0, len(x) - o.shape[0]), (0, 0)])
shape: (4, 2)
tensor:
[[ 1 10]
[ 9 90]
[ 0 0]
[ 0 0]]
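I suspect the relevant distinction is between the static shape (o.shape[0], which is None when the model is built on dynamic input) and the dynamic shape (tf.shape(o)[0], a runtime tensor). A minimal sketch illustrating this, where the fixed row count 4 is just an assumption standing in for len(x):
```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=(None, 2), dtype=tf.int32)])
def pad_to_four(t):
    # Inside the traced function, t.shape[0] is None (static shape unknown),
    # but tf.shape(t)[0] is a runtime scalar tensor, so arithmetic on it works.
    pad_rows = 4 - tf.shape(t)[0]
    return tf.pad(t, [(0, pad_rows), (0, 0)])

out = pad_to_four(tf.constant([[1, 10], [9, 90]]))
print(out.shape)  # (4, 2)
```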
CodePudding user response:
You could try using tf.concat
for padding in your case:
import tensorflow as tf

class MyLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        x, segment_ids = inputs
        x_ = tf.math.segment_sum(x, segment_ids)
        # Compute the number of missing rows dynamically with tf.shape,
        # and match the dtype of x_ so the concat is valid.
        padding = tf.zeros(
            (tf.shape(x)[0] - tf.shape(x_)[0], tf.shape(x)[1]),
            dtype=x_.dtype)
        x_ = tf.concat([x_, padding], axis=0)
        return x_
inputs = tf.keras.layers.Input((2,))
l = MyLayer()
x = l([inputs, [0, 1, 1, 1]])
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
x = tf.constant([[1, 10], [2, 20], [3, 30], [4, 40]], dtype=tf.float32)
print(l([x, [0, 1, 1, 1]]))
y = tf.constant([1, 2, 1, 4])
model.fit(x, y, epochs=2)
tf.Tensor(
[[ 1. 10.]
[ 9. 90.]
[ 0. 0.]
[ 0. 0.]], shape=(4, 2), dtype=float32)
Epoch 1/2
1/1 [==============================] - 1s 683ms/step - loss: 2624.1050
Epoch 2/2
1/1 [==============================] - 0s 12ms/step - loss: 2361.9163
<keras.callbacks.History at 0x7f7fdd4cf810>
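For what it's worth, your original tf.pad approach should also work once the pad amount is computed entirely from tf.shape, so that no static None leaks into the padding spec. A sketch under that assumption (the layer name is hypothetical):
```python
import tensorflow as tf

class MyPadLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        x, segment_ids = inputs
        x_ = tf.math.segment_sum(x, segment_ids)
        # Both operands come from tf.shape, so pad_rows is a runtime tensor
        # even when the static first dimension is None.
        pad_rows = tf.shape(x)[0] - tf.shape(x_)[0]
        return tf.pad(x_, [(0, pad_rows), (0, 0)])

x = tf.constant([[1., 10.], [2., 20.], [3., 30.], [4., 40.]])
print(MyPadLayer()([x, [0, 1, 1, 1]]))
# [[1. 10.] [9. 90.] [0. 0.] [0. 0.]]
```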