Create a TensorFlow dataset with tf.data.Dataset.from_tensor_slices and one property with dynamic size

Time:03-03

I have a pandas dataframe, and one of its columns is "bbox", with values such as [[94.0, 58.0, 469.0, 362.0]]. I want to convert this dataframe to a custom dataset with tf.data.Dataset.from_tensor_slices. I want the bbox element to have a shape of (None, 4), but it is created with shape (1, 4): tf.Tensor([[ 94. 58. 469. 362.]], shape=(1, 4), dtype=float32). I don't know what I am doing wrong.

My dataset is created with this code:

myimages = pd.DataFrame.from_dict(train_data).to_dict("list")
myimages = tf.data.Dataset.from_tensor_slices(myimages)

Thanks everyone in advance for your time

CodePudding user response:

The shape of the tensor is (1, 4) since the bbox column in your dataframe contains elements that are a list of lists, not a single list.

To get a shape of (4,) for each bbox, take the first (and only) inner list of every bbox element and write the result back before building the dataset, like so:

myimages["bbox"] = [bbox_element[0] for bbox_element in myimages["bbox"]]
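To make the indexing step concrete, here is a minimal sketch in plain Python. The sample values are made up for illustration (they are not from the question), and it assumes myimages is the dict of lists produced by to_dict("list"):

```python
# Hypothetical sample data shaped like the question's dict of lists:
# every bbox entry is a list that wraps a single 4-number box.
myimages = {
    "names": ["some_image.jpg", "other_image.jpg"],
    "bbox": [[[94.0, 58.0, 469.0, 362.0]], [[10.0, 20.0, 30.0, 40.0]]],
}

# Sanity check: each row holds exactly 1 box with 4 coordinates,
# which is why from_tensor_slices reports shape (1, 4).
assert all(len(b) == 1 and len(b[0]) == 4 for b in myimages["bbox"])

# Keep only the first (and only) box of every row, dropping the outer list.
flat_bboxes = [bbox_element[0] for bbox_element in myimages["bbox"]]

print(flat_bboxes[0])  # [94.0, 58.0, 469.0, 362.0]
```

Note that this only makes sense when every row really contains one box. If a row had several boxes, indexing with [0] would silently drop all but the first.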

CodePudding user response:

You can just use tf.data.Dataset.map and tf.squeeze to get rid of the extra dimension:

import tensorflow as tf
import pandas as pd

train_data = {'names': ['some_image.jpg', 'other_image.jpg'],
              'bbox': [[[94.0, 58.0, 469.0, 362.0]], [[94.0, 58.0, 469.0, 362.0]]]}
df = pd.DataFrame(train_data)
myimages = tf.data.Dataset.from_tensor_slices((df['names'].to_numpy(), df['bbox'].to_list()))
myimages = myimages.map(lambda x, y: (x, tf.squeeze(y, axis=0)))

for x, y in myimages:
  print(x, y)

Output:

tf.Tensor(b'some_image.jpg', shape=(), dtype=string) tf.Tensor([ 94.  58. 469. 362.], shape=(4,), dtype=float32)
tf.Tensor(b'other_image.jpg', shape=(), dtype=string) tf.Tensor([ 94.  58. 469. 362.], shape=(4,), dtype=float32)
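
Both answers fix the shape at (4,), which works when every image has exactly one box. If the goal is the (None, 4) shape from the question, meaning a variable number of boxes per image, one possible route is tf.data.Dataset.from_generator with an explicit output_signature. This is only a sketch and not something either answer proposed: it assumes TensorFlow 2.4 or later, the generate_examples helper is a hypothetical name, and the second image's extra box is invented sample data.

```python
import numpy as np
import tensorflow as tf

# Hypothetical data: the first image has one box, the second has two.
train_data = {
    "names": ["some_image.jpg", "other_image.jpg"],
    "bbox": [
        [[94.0, 58.0, 469.0, 362.0]],
        [[94.0, 58.0, 469.0, 362.0], [10.0, 20.0, 30.0, 40.0]],
    ],
}

def generate_examples():
    # Yield one (name, boxes) pair per image; boxes is a (num_boxes, 4) array.
    for name, boxes in zip(train_data["names"], train_data["bbox"]):
        yield name, np.asarray(boxes, dtype=np.float32)

myimages = tf.data.Dataset.from_generator(
    generate_examples,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        # None leaves the number of boxes per image unspecified.
        tf.TensorSpec(shape=(None, 4), dtype=tf.float32),
    ),
)

# The declared (static) shape of the bbox component.
print(myimages.element_spec[1].shape)

# The concrete shape of each element as it is produced.
for name, boxes in myimages:
    print(name.numpy(), boxes.shape)
```

Because the elements now differ in size, a plain batch() call would fail to stack them; something like padded_batch is usually needed before feeding such a dataset to a model.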