I have a pandas dataframe, and one of its columns is "bbox", with values such as [[94.0, 58.0, 469.0, 362.0]]. I want to convert this dataframe to a custom dataset with tf.data.Dataset.from_tensor_slices.
I want the bbox element to have a shape of (None, 4), but it is created with shape (1, 4): tf.Tensor([[ 94. 58. 469. 362.]], shape=(1, 4), dtype=float32)
I don't know what I am doing wrong.
My dataset is created with this code:
myimages = pd.DataFrame.from_dict(train_data).to_dict("list")
myimages = tf.data.Dataset.from_tensor_slices(myimages)
Thanks everyone in advance for your time
CodePudding user response:
The shape of the tensor is (1, 4) because each element of the bbox column in your dataframe is a list of lists, not a single list.
To get a shape of (4,) for each label, index each bbox element to obtain the first bbox and write the result back to the bbox column, like so:
myimages["bbox"] = [bbox_element[0] for bbox_element in myimages["bbox"]]
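The effect of that indexing step can be checked without TensorFlow. The following is a minimal sketch in plain Python; `nested_shape` is a hypothetical helper written only for this illustration (it is not part of pandas or TensorFlow), and the sample values are the ones from the question.

```python
def nested_shape(value):
    """Return the shape of a regularly nested list as a tuple.

    Hypothetical helper for illustration only; it inspects just the
    first element at each nesting level.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


# Two rows of the "bbox" column, each a list containing one box.
bbox_column = [[[94.0, 58.0, 469.0, 362.0]], [[94.0, 58.0, 469.0, 362.0]]]

before = nested_shape(bbox_column[0])        # one row before indexing
flattened = [bbox_element[0] for bbox_element in bbox_column]
after = nested_shape(flattened[0])           # the same row after indexing

print(before, after)  # prints: (1, 4) (4,)
```

Each row starts out as a list holding a single four-element list, which is where the leading dimension of 1 comes from; taking element 0 removes it.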
CodePudding user response:
You can just use tf.data.Dataset.map and tf.squeeze to get rid of the extra dimension:
import tensorflow as tf
import pandas as pd
train_data = {'names': ['some_image.jpg', 'other_image.jpg'],
              'bbox': [[[94.0, 58.0, 469.0, 362.0]], [[94.0, 58.0, 469.0, 362.0]]]}
df = pd.DataFrame(train_data)
myimages = tf.data.Dataset.from_tensor_slices((df['names'].to_numpy(), df['bbox'].to_list()))
myimages = myimages.map(lambda x, y: (x, tf.squeeze(y, axis=0)))
for x, y in myimages:
    print(x, y)
tf.Tensor(b'some_image.jpg', shape=(), dtype=string) tf.Tensor([ 94. 58. 469. 362.], shape=(4,), dtype=float32)
tf.Tensor(b'other_image.jpg', shape=(), dtype=string) tf.Tensor([ 94. 58. 469. 362.], shape=(4,), dtype=float32)
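The per-element shape can also be checked without iterating, by inspecting the dataset's element_spec. This is a minimal sketch, assuming TensorFlow 2.x; it uses plain Python lists in place of the dataframe columns, and the variable names are illustrative only.

```python
import tensorflow as tf

# Same sample data as above, as plain lists instead of dataframe columns.
names = ['some_image.jpg', 'other_image.jpg']
bboxes = [[[94.0, 58.0, 469.0, 362.0]], [[94.0, 58.0, 469.0, 362.0]]]

dataset = tf.data.Dataset.from_tensor_slices((names, bboxes))
spec_before = dataset.element_spec[1]   # spec of the bbox component

# Drop the leading dimension of size 1 from every bbox element.
dataset = dataset.map(lambda name, bbox: (name, tf.squeeze(bbox, axis=0)))
spec_after = dataset.element_spec[1]

print(spec_before.shape, spec_after.shape)
```

Passing axis=0 to tf.squeeze removes only the leading dimension, so a box would not be collapsed further if another of its dimensions happened to have size 1.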