tf.data.Dataset.zip: Can we have some alternative method of tf.data.Dataset.zip?

Time:07-03

When tf.data.Dataset.zip is used to zip two datasets, it pairs each element of the first dataset with the element at the same position in the second dataset.

a = tf.data.Dataset.range(1, 4)  # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7)  # ==> [ 4, 5, 6 ]
ds = tf.data.Dataset.zip((a, b))
list(ds.as_numpy_iterator())  # ==> [(1, 4), (2, 5), (3, 6)]

This yields only one pairing per position: (1, 4), then (2, 5), then (3, 6). How can all possible combinations be generated instead, i.e. (1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)?

CodePudding user response:

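You could use a list comprehension. Note that this iterates over both datasets in Python and builds every pair in memory before the new dataset is created: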

a = tf.data.Dataset.range(1, 4)  # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7)  # ==> [ 4, 5, 6 ]
d = tf.data.Dataset.from_tensor_slices([(x, y) for x in a for y in b])
for el in d:
  print(el)

Output

tf.Tensor([1 4], shape=(2,), dtype=int64)
tf.Tensor([1 5], shape=(2,), dtype=int64)
tf.Tensor([1 6], shape=(2,), dtype=int64)
tf.Tensor([2 4], shape=(2,), dtype=int64)
tf.Tensor([2 5], shape=(2,), dtype=int64)
tf.Tensor([2 6], shape=(2,), dtype=int64)
tf.Tensor([3 4], shape=(2,), dtype=int64)
tf.Tensor([3 5], shape=(2,), dtype=int64)
tf.Tensor([3 6], shape=(2,), dtype=int64)

CodePudding user response:

A pure TensorFlow approach without Python loops could look like this:

import tensorflow as tf

a = tf.data.Dataset.range(1, 4)
b = tf.data.Dataset.range(4, 7)
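repeats = 3  # both datasets have 3 elements here; used as repeat count and window size
# Repeat b and regroup it into batches, so each element is a full copy of b: [4, 5, 6]
b = b.repeat(repeats).window(repeats, shift=repeats).flat_map(lambda x: x.batch(repeats))
# Pair each x from a with a full copy of b, producing a nested dataset of rows [x, y]
ds = tf.data.Dataset.zip((a, b)).map(lambda x, y: tf.data.Dataset.from_tensor_slices(tf.stack([tf.broadcast_to(x, (repeats,)), y], axis=1)))
# Flatten the nested datasets and unpack each row into an (x, y) tuple
ds = ds.flat_map(lambda x: x.batch(1).map(lambda y: (y[0][0], y[0][1])))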

list(ds.as_numpy_iterator())
[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
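
A shorter variant of the same idea, offered as a sketch rather than a definitive answer: nest a map over b inside a flat_map over a, so every element of a is paired with every element of b. The helper name pair_with_all_of_b and the variable pairs are made up for illustration, and this assumes TensorFlow 2.x with eager execution.

```python
import tensorflow as tf

a = tf.data.Dataset.range(1, 4)  # 1, 2, 3
b = tf.data.Dataset.range(4, 7)  # 4, 5, 6


def pair_with_all_of_b(x):
    # Hypothetical helper: for one element x of a, return a dataset
    # holding (x, y) for every y in b.
    return b.map(lambda y: (x, y))


# flat_map concatenates the per-element datasets in order,
# which gives the Cartesian product of a and b.
ds = a.flat_map(pair_with_all_of_b)

# Convert the NumPy scalars to plain ints for easy inspection.
pairs = [(int(x), int(y)) for x, y in ds.as_numpy_iterator()]
print(pairs)
```

Unlike the repeat/window version, this does not need the dataset sizes up front, and it stays inside the tf.data pipeline instead of building the pairs in a Python list first.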