Tensorflow dataset with variable number of elements

Time:12-01

I need a dataset structured to handle a variable number of input images (a set of images) to regress against an integer target variable.

The code I am using to source the images is like this:

import tensorflow as tf
from tensorflow import convert_to_tensor


def read_image_tf(path: str) -> tf.Tensor:
    image = tf.keras.utils.load_img(path)
    return tf.keras.utils.img_to_array(image)

def read_image_list(x, y):
    return tf.map_fn(read_image_tf, x), y


paths_list = [['image_1', 'image_2', 'image_3'], ['image_6'], ['image_4', 'image_5', 'image_8', 'image_19']]

x = tf.ragged.constant(paths_list)
y = tf.constant([1,2,3])

dataset = tf.data.Dataset.from_tensor_slices((x, y))
dataset = dataset.map(lambda x,y: read_image_list(x,y))

This code fails with a TypeError (TypeError: path should be path-like or io.BytesIO, not <class 'tensorflow.python.framework.ops.Tensor'>). It seems that the map operation is not extracting the paths correctly from the original RaggedTensor. I have also tried a generator, with similar results. Any help would be much appreciated.

User response:

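The paths are extracted fine. The problem is that tf.keras.utils.load_img is a plain Python function that needs a real path, while Dataset.map hands it a symbolic string tensor. Reading and decoding the files with TensorFlow ops avoids that, so maybe something like this: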

import tensorflow as tf

def read_image_tf(path: tf.Tensor) -> tf.Tensor:
    # TensorFlow ops work on the symbolic string tensors seen inside Dataset.map.
    img = tf.io.read_file(path)
    # For mixed formats: tf.io.decode_image(img, channels=3, expand_animations=False)
    return tf.io.decode_png(img, channels=3)

def read_image_list(x, y):
    # fn_output_signature replaces the deprecated dtype argument of tf.map_fn.
    # Images within one set must share a size so map_fn can stack them.
    return tf.map_fn(read_image_tf, x, fn_output_signature=tf.uint8), y

paths_list = [['/content/image1.png', '/content/image1.png', '/content/image1.png'], ['/content/image1.png'], ['/content/image1.png', '/content/image1.png', '/content/image1.png', '/content/image1.png']]

x = tf.ragged.constant(paths_list)
y = tf.constant([1,2,3])

dataset = tf.data.Dataset.from_tensor_slices((x, y))
dataset = dataset.map(read_image_list)

for x, y in dataset:
    print(x.shape, y)

Output:

(3, 100, 100, 3) tf.Tensor(1, shape=(), dtype=int32)
(1, 100, 100, 3) tf.Tensor(2, shape=(), dtype=int32)
(4, 100, 100, 3) tf.Tensor(3, shape=(), dtype=int32)
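
As a side check on the original TypeError, the sketch below records whether eager execution is active inside a mapped function. It is an illustration rather than part of the fix: the helper name inspect_path and the list eager_flags are made up here, and the placeholder paths are never opened.

```python
import tensorflow as tf

eager_flags = []  # hypothetical helper list, filled while Dataset.map traces the function

def inspect_path(path):
    # Dataset.map traces this function as a graph, so `path` is a symbolic
    # string tensor here, not a Python str that a Keras/PIL loader could open.
    eager_flags.append(tf.executing_eagerly())
    return path

paths = tf.constant(["image_1", "image_2"])  # placeholder names, never read from disk
dataset = tf.data.Dataset.from_tensor_slices(paths).map(inspect_path)

print(eager_flags)
```

If the recorded flags are all False, the function ran in graph mode, which is why a Python-level loader that expects a path string cannot be called directly there.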

You can also convert x back to a ragged tensor if you want.
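
One possible reading of that suggestion, as a minimal sketch: batch the variable-length image sets into a RaggedTensor with tf.data.experimental.dense_to_ragged_batch. The dummy 8x8 PNGs, the temporary directory, and the make_png helper are assumptions made only so the example runs without real files. Recent TensorFlow releases also offer Dataset.ragged_batch for the same purpose.

```python
import os
import tempfile

import tensorflow as tf

# Assumption: write small dummy PNGs so the sketch runs without real image files.
tmp_dir = tempfile.mkdtemp()

def make_png(name):
    # Hypothetical helper: saves a black 8x8 RGB image and returns its path.
    path = os.path.join(tmp_dir, name)
    pixels = tf.zeros([8, 8, 3], dtype=tf.uint8)
    tf.io.write_file(path, tf.io.encode_png(pixels))
    return path

paths_list = [
    [make_png("a.png"), make_png("b.png"), make_png("c.png")],
    [make_png("d.png")],
]
x = tf.ragged.constant(paths_list)
y = tf.constant([1, 2])

def read_image_tf(path):
    return tf.io.decode_png(tf.io.read_file(path), channels=3)

def read_image_list(paths, label):
    images = tf.map_fn(read_image_tf, paths, fn_output_signature=tf.uint8)
    return images, label

dataset = tf.data.Dataset.from_tensor_slices((x, y)).map(read_image_list)

# Sets of different sizes cannot be stacked into a dense batch,
# so batch them into a RaggedTensor instead.
batched = dataset.apply(tf.data.experimental.dense_to_ragged_batch(batch_size=2))

images, labels = next(iter(batched))
print(images.shape, labels.numpy())
print(images.row_lengths().numpy())  # number of images in each set
```

The labels stay a dense tensor because every label has the same scalar shape; only the image component becomes ragged.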
