Loading a numpy array into Tensorflow input pipeline

Time:04-25

I am following a tutorial on building a data loader for images (https://github.com/codebasics/deep-learning-keras-tf-tutorial/blob/master/44_tf_data_pipeline/tf_data_pipeline.ipynb).

My code looks roughly like this:

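import os

import numpy as np
import tensorflow as tf

images_ds = tf.data.Dataset.list_files("path/class/*")

def get_label(file_path):
    # The label is the name of the directory that contains the file.
    parts = tf.strings.split(file_path, os.path.sep)
    return parts[-2]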

## How the tutorial does it
def process_image(file_path):
    label = get_label(file_path)

    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img)

    return img, label

## How I want to do it
def process_image(file_path):
    label = get_label(file_path)


    img = np.load(file_path)
    img = tf.convert_to_tensor(img) 

    return img, label

train_ds = images_ds.map(process_image)

The tutorial's data consists of .jpeg files, but my data consists of .npy files.

Therefore, the tutorial's loading code does not work for my data:

img = tf.io.read_file(file_path)
img = tf.image.decode_jpeg(img)

I tried to work around this with np.load, but that does not work either:

img = np.load(file_path)
img = tf.convert_to_tensor(img) 

It works when I call process_image directly on a single file path. However, when I apply it through .map, I get an error.

Error:
TypeError: expected str, bytes or os.PathLike object, not Tensor

Is there an equivalent of tf.image.decode_image() for loading a .npy file, or can someone help me with this error?

CodePudding user response:

The comment from @André pointed me in the right direction. The code below works.


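def process_image(file_path):
    # Called through tf.numpy_function, so this runs as regular Python and
    # file_path is a concrete value (bytes) instead of a symbolic Tensor.
    label = get_label(file_path)
    label = np.uint8(label)  # assumes the class folder name is an integer

    img = np.load(file_path)
    # Scale to [0, 1], assuming 8-bit pixel values.
    img = tf.convert_to_tensor(img / 255, dtype=tf.float32)

    return img, label

train_ds = images_ds.map(lambda item: tf.numpy_function(
    process_image, [item], (tf.float32, tf.uint8)))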
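
For anyone hitting the same TypeError: Dataset.map traces the mapped function as a graph, so inside .map the file_path argument is a symbolic Tensor rather than a string, which np.load cannot open. tf.numpy_function runs the wrapped function as ordinary Python on concrete values instead. Below is a minimal, self-contained sketch of the same approach using only NumPy inside the loader. The names load_npy, make_dataset and image_shape are my own, not from the tutorial, and the sketch assumes that class folders are named with small integers ("0", "1", ...) and that the arrays hold 8-bit pixel values. Since tf.numpy_function does not keep static shape information, the sketch restores it with set_shape.

```python
import os

import numpy as np
import tensorflow as tf


def load_npy(file_path):
    # Runs as plain Python inside tf.numpy_function, where a scalar string
    # tensor arrives as a bytes object.
    path = file_path.decode("utf-8")
    # Assumption: the parent directory name is an integer class id.
    label = np.uint8(int(os.path.basename(os.path.dirname(path))))
    # Assumption: 8-bit pixel values, scaled here to [0, 1].
    img = (np.load(path) / 255.0).astype(np.float32)
    return img, label


def make_dataset(pattern, image_shape):
    # shuffle=False keeps the file order deterministic for this sketch.
    files = tf.data.Dataset.list_files(pattern, shuffle=False)

    def wrapped(file_path):
        img, label = tf.numpy_function(
            load_npy, [file_path], (tf.float32, tf.uint8))
        # Restore the static shapes that tf.numpy_function leaves unknown.
        img.set_shape(image_shape)
        label.set_shape(())
        return img, label

    return files.map(wrapped)
```

Calling make_dataset with a glob pattern that matches the .npy files and the shape shared by all arrays should yield (float32 image, uint8 label) pairs that can then be batched as usual. If the arrays differ in shape, the set_shape call on the image would need to be relaxed or dropped.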
