I have a question similar to "TensorFlow - Pad unknown size tensor to a specific size?", but mine is harder, and none of the solutions there solve it. What if the rows of the given tensor have different lengths in the last dimension, and I want to pad them all to the same fixed length? For example, suppose the given tensor is
[[1],
[1, 2],
[1, 2, 3]]
I want to pad the rows so that I get
[[1, 0, 0, 0],
[1, 2, 0, 0],
[1, 2, 3, 0]]
The solutions in the original post all assume that every row has the same length in the last dimension. Any ideas on how to solve this problem? I am not even sure whether tf.pad() is the right function for this...
CodePudding user response:
Have a look at pad_sequences.
It works as follows:
import tensorflow as tf

sequence = [
    [1],
    [1, 2],
    [1, 2, 3]
]
# padding='post' appends the zeros at the end of each row
tf.keras.preprocessing.sequence.pad_sequences(sequence, padding='post')
Should give you:
array([[1, 0, 0],
       [1, 2, 0],
       [1, 2, 3]], dtype=int32)
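Note that the call above pads only to the longest row (length 3). To reach the fixed width of 4 from the question, pad_sequences also accepts a maxlen argument. A minimal sketch, assuming a TensorFlow 2.x install where tf.keras.preprocessing.sequence.pad_sequences is available and the input is a plain Python list of lists:

```python
import tensorflow as tf

sequence = [[1], [1, 2], [1, 2, 3]]

# maxlen fixes the output width; padding='post' appends zeros at the end.
# truncating='post' only matters for rows longer than maxlen (none here).
padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequence, maxlen=4, padding='post', truncating='post')

print(padded)
# [[1 0 0 0]
#  [1 2 0 0]
#  [1 2 3 0]]
```

The result is a NumPy array; wrap it in tf.constant(padded) if a tensor is needed. By default truncating is 'pre', which drops items from the start of rows that are too long, so it is set explicitly here.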
CodePudding user response:
Try combining tf.slice, tf.pad, and tf.map_fn.
- For TF1
import functools

import tensorflow as tf  # TensorFlow 1.x

"""
[
    [1],
    [1, 2],
    [1, 2, 3]
]
"""
# The rows above, stored as a sparse tensor; to_dense() fills the gaps with 0.
a = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 0], [1, 1], [2, 0], [2, 1], [2, 2]],
    values=[1, 1, 2, 1, 2, 3],
    dense_shape=[3, 3],
)

def cut_or_pad_1d(lst, max_len):
    origin_len = tf.shape(lst)[0]
    # cut
    lst = tf.cond(origin_len > max_len,
                  true_fn=lambda: lst[:max_len],
                  false_fn=lambda: lst)
    # pad
    lst = tf.cond(origin_len < max_len,
                  true_fn=lambda: tf.pad(lst, [[0, max_len - origin_len]]),
                  false_fn=lambda: lst)
    return lst

sess = tf.Session()
a_dense = tf.sparse.to_dense(a)
for MAX_LEN in (2, 5):
    # Apply cut_or_pad_1d to every row of the dense tensor
    a_regularized = tf.map_fn(functools.partial(cut_or_pad_1d, max_len=MAX_LEN), a_dense)
    a_regularized_val = sess.run(a_regularized)
    print(f'max_len={MAX_LEN}, a_regularized_val=')
    print(a_regularized_val)
- For TF2
import functools

import tensorflow as tf  # TensorFlow 2.x, eager execution

"""
[
    [1],
    [1, 2],
    [1, 2, 3]
]
"""
# The rows above, stored as a sparse tensor; to_dense() fills the gaps with 0.
a = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 0], [1, 1], [2, 0], [2, 1], [2, 2]],
    values=[1, 1, 2, 1, 2, 3],
    dense_shape=[3, 3],
)

def cut_or_pad_1d(lst, max_len):
    # In eager mode the Python `if` can read the tensor values directly.
    origin_len = tf.shape(lst)[0]
    if origin_len > max_len:
        # cut
        lst = lst[:max_len]
    elif origin_len < max_len:
        # pad
        lst = tf.pad(lst, [[0, max_len - origin_len]])
    return lst

a_dense = tf.sparse.to_dense(a)
for MAX_LEN in (2, 5):
    # Apply cut_or_pad_1d to every row of the dense tensor
    a_regularized = tf.map_fn(functools.partial(cut_or_pad_1d, max_len=MAX_LEN), a_dense)
    print(f'max_len={MAX_LEN}, a_regularized_val=')
    print(a_regularized.numpy())
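To sanity-check the expected results independently of TensorFlow, the per-row rule that cut_or_pad_1d applies (truncate rows longer than max_len, right-pad shorter rows with zeros) can be sketched in plain Python on a genuinely ragged nested list. The helper name cut_or_pad_rows and its pad_value parameter are made up for this illustration and are not part of any library:

```python
def cut_or_pad_rows(rows, max_len, pad_value=0):
    """Truncate or right-pad every row to exactly max_len items.

    Hypothetical helper for illustration; mirrors cut_or_pad_1d row by row.
    """
    result = []
    for row in rows:
        row = list(row)[:max_len]                   # cut (no-op for short rows)
        row += [pad_value] * (max_len - len(row))   # pad (no-op for full rows)
        result.append(row)
    return result


ragged = [[1], [1, 2], [1, 2, 3]]

padded_4 = cut_or_pad_rows(ragged, max_len=4)
cut_2 = cut_or_pad_rows(ragged, max_len=2)

print(padded_4)  # [[1, 0, 0, 0], [1, 2, 0, 0], [1, 2, 3, 0]]
print(cut_2)     # [[1, 0], [1, 2], [1, 2]]
```

With max_len=4 this reproduces the output asked for in the question; with max_len=2 the longest row is cut, matching what the TensorFlow versions print for MAX_LEN = 2.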