When you use tf.data.Dataset.zip to zip two datasets, it pairs each element of the first dataset with the element at the same index in the second dataset.
a = tf.data.Dataset.range(1, 4) # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7) # ==> [ 4, 5, 6 ]
ds = tf.data.Dataset.zip((a, b))
list(ds.as_numpy_iterator()) # ==> [(1, 4), (2, 5), (3, 6)]
This gives only one pairing per index: (1, 4), then (2, 5), then (3, 6). How can all possible combinations be generated instead, i.e. (1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)?
CodePudding user response:
You could use a list comprehension:
import tensorflow as tf

a = tf.data.Dataset.range(1, 4) # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7) # ==> [ 4, 5, 6 ]
# Build every (x, y) pair in Python, then wrap the result in a dataset
d = tf.data.Dataset.from_tensor_slices([(x, y) for x in a for y in b])
for el in d:
    print(el)
Output:
tf.Tensor([1 4], shape=(2,), dtype=int64)
tf.Tensor([1 5], shape=(2,), dtype=int64)
tf.Tensor([1 6], shape=(2,), dtype=int64)
tf.Tensor([2 4], shape=(2,), dtype=int64)
tf.Tensor([2 5], shape=(2,), dtype=int64)
tf.Tensor([2 6], shape=(2,), dtype=int64)
tf.Tensor([3 4], shape=(2,), dtype=int64)
tf.Tensor([3 5], shape=(2,), dtype=int64)
tf.Tensor([3 6], shape=(2,), dtype=int64)
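The pairs built by the comprehension are the Cartesian product of the two ranges. If the values are known up front as plain Python ranges (an assumption here, since the question starts from datasets), a minimal sketch with the standard library's itertools.product produces the same pairs in the same order. The name pairs is only illustrative.

```python
from itertools import product

# Plain Python ranges standing in for the two datasets (assumption:
# the values are available without iterating a tf.data.Dataset).
a_values = range(1, 4)  # 1, 2, 3
b_values = range(4, 7)  # 4, 5, 6

# product() yields every (x, y) combination, varying y fastest.
pairs = list(product(a_values, b_values))
print(pairs)
```

A list like this could then be handed to tf.data.Dataset.from_tensor_slices in the same way as the comprehension above. Either way, all pairs are materialized in memory first, which may matter for large datasets.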
CodePudding user response:
A pure TensorFlow approach without Python loops could look like this:
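import tensorflow as tf

a = tf.data.Dataset.range(1, 4)
b = tf.data.Dataset.range(4, 7)
repeats = 3  # both datasets have 3 elements; used as repeat count and batch size
# Turn b into repeated copies of the full batch [4, 5, 6], one per element of a
b = b.repeat(repeats).window(repeats, shift=repeats).flat_map(lambda x: x.batch(repeats))
# Pair each x with the whole batch, giving the rows [x, 4], [x, 5], [x, 6]
ds = tf.data.Dataset.zip((a, b)).map(lambda x, y: tf.data.Dataset.from_tensor_slices(tf.stack([tf.broadcast_to(x, (repeats,)), y], axis=1)))
# Flatten the nested datasets and unpack each row into an (x, y) tuple
ds = ds.flat_map(lambda x: x.batch(1).map(lambda y: (y[0][0], y[0][1])))
list(ds.as_numpy_iterator())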
[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
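A shorter variant of the same idea might nest map inside flat_map, so that each element of a is paired with every element of b. This is a sketch rather than a tested recommendation: it assumes TensorFlow 2.x with eager execution, and that b is small enough to be iterated once per element of a. The name pairs is only illustrative.

```python
import tensorflow as tf

a = tf.data.Dataset.range(1, 4)  # 1, 2, 3
b = tf.data.Dataset.range(4, 7)  # 4, 5, 6

# For each x in a, build the dataset of (x, y) for every y in b,
# then let flat_map concatenate those per-x datasets in order.
ds = a.flat_map(lambda x: b.map(lambda y: (x, y)))

# Convert the NumPy scalars to plain ints for easier inspection.
pairs = [(int(x), int(y)) for x, y in ds.as_numpy_iterator()]
print(pairs)
```

Unlike the version above, this does not need the dataset sizes to be known in advance, since nothing is repeated or batched by a fixed count.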