Can't convert Python list to TensorFlow Dataset (InvalidArgumentError: Shapes of all inputs must match)

I'm trying to build a neural network by following a YouTube guide, but I had to change the data input code. I need a batched dataset for the train function to work properly (I don't know why, and I'm not even sure about it). But when I try to convert my list of training data to a Dataset using tensorflow.data.Dataset.from_tensor_slices(train_data), I receive an error message:

InvalidArgumentError
{{function_node __wrapped__Pack_N_3_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [105,105,3] != values[2].shape = [1] [Op:Pack] name: 0

The train_data list consists of 560 lists, each with 3 elements inside:

<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["105x105 3-dimensional image with my face"]]], dtype=float32)>
<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["different image with the same properties"]]], dtype=float32)>
<tf.Tensor: shape=(1,), dtype=float32, numpy=array(["1. or 0. (float), a label, showing if these pictures are actually the pictures of the same person"], dtype=float32)>

I am pretty sure that all of the shapes in the train_data list are exactly as described.

Some information about the shapes, obtained with the .shape attribute:

train_data.shape #"AttributeError: 'list' object has no attribute 'shape'" - main list
train_data[0].shape #"AttributeError: 'list' object has no attribute 'shape'" - sublist, with 3 elements
train_data[0][0].shape #"TensorShape([105, 105, 3])" - first image
train_data[0][0][0].shape #"TensorShape([105, 3])" - first row of image pixels, ig
train_data[0][0][0][0].shape #"TensorShape([3])" - pixel in the left upper corner

Here is what I tried: the label of the image pairs (1. or 0.) was previously just an integer. Then I received an error saying that everything should be of the same type, float32. I then tried to convert the label to a tensor, but that changed nothing except the last part of the current error message, which used to say "values[2].shape = []". I really have no idea what could lead to the error. I don't have any TensorFlow experience. Sorry if my English is bad.

Edit: here is the code that loads the images from a certain directory. May cause eye bleeding.

for i in os.listdir("t"):
    for ii in os.listdir(os.path.join("t", i)):
        td.append([
                   [
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\" + os.listdir(os.path.join("t", i, ii))[0])) / 255, 0),
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\2.jpeg")) / 255, 0)],
                    tensorflow.convert_to_tensor(
                     float(
                      os.listdir(os.path.join("t", i, ii))[0][0]
                     )
                    )
                  ])

I added some spaces in order to make it a bit more readable. td = train_data. Yea, I could've messed something up there.

CodePudding user response：

Replicating the problem:

import tensorflow as tf

x1 = tf.random.normal((105,105,3))
x2 = tf.random.normal((105,105,3))
y = tf.random.normal((1,))

array_list = [[x1, x2, y]] * 560
tf.data.Dataset.from_tensor_slices(array_list)
#InvalidArgumentError ... values[0].shape = [105,105,3] != values[2].shape = [1]
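
A likely reason for the error, shown as a minimal sketch: a nested Python list is treated as a single tensor to be built, so every leaf would need the same shape, whereas a tuple is treated as separate components that are each sliced along their first axis. The sample count of 4 and the names samples, images_a, images_b, and labels are illustrative assumptions, not part of the original post.

```python
import tensorflow as tf

# Small stand-ins for the real data (4 samples instead of 560; assumed values).
x1 = tf.zeros((105, 105, 3))
x2 = tf.zeros((105, 105, 3))
y = tf.ones((1,))
samples = [[x1, x2, y] for _ in range(4)]

# A nested list is packed into ONE tensor, so the three leaves of each
# sample would have to share a shape. They do not, so this fails.
try:
    tf.data.Dataset.from_tensor_slices(samples)
    packed_ok = True
except (tf.errors.InvalidArgumentError, ValueError):
    packed_ok = False

# A tuple is treated as separate components; each is sliced along axis 0.
images_a = tf.stack([s[0] for s in samples])  # shape (4, 105, 105, 3)
images_b = tf.stack([s[1] for s in samples])  # shape (4, 105, 105, 3)
labels = tf.stack([s[2] for s in samples])    # shape (4, 1)
ds = tf.data.Dataset.from_tensor_slices((images_a, images_b, labels))

print(packed_ok)
print(ds.element_spec)
```

The only requirement on the tuple components is that they agree on the size of the first axis (here, the number of samples).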

Fix:

#flatten to a single list
flatten_list = sum(array_list, [])

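#Separate the two image inputs and the labels
X1 = tf.stack(flatten_list[::3])    #first image of each pair, shape (560, 105, 105, 3)
X2 = tf.stack(flatten_list[1::3])   #second image of each pair, shape (560, 105, 105, 3)
y = tf.squeeze(tf.stack(flatten_list[2::3]), axis=-1)   #labels, shape (560,)

#construct the dataset
ds = tf.data.Dataset.from_tensor_slices(((X1, X2), y))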
for data in ds.take(1):
    print(data)
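
Since the question mentions needing a batched dataset for training, a dataset like this can then be shuffled and batched. The following is only a sketch: the batch size of 16, the 32 placeholder samples, and the random data are arbitrary assumptions, not values from the original post.

```python
import tensorflow as tf

# Placeholder tensors standing in for the stacked data (32 samples, not 560).
num_samples = 32
X1 = tf.random.normal((num_samples, 105, 105, 3))
X2 = tf.random.normal((num_samples, 105, 105, 3))
y = tf.cast(tf.random.uniform((num_samples,)) > 0.5, tf.float32)

# One element = ((image, image), label).
ds = tf.data.Dataset.from_tensor_slices(((X1, X2), y))

# Shuffle over the whole dataset, then group elements into batches.
batch_size = 16  # assumed value
batched = (
    ds.shuffle(buffer_size=num_samples)
    .batch(batch_size)
    .prefetch(tf.data.AUTOTUNE)
)

for (img_a, img_b), label in batched.take(1):
    print(img_a.shape, img_b.shape, label.shape)
```

Batching after from_tensor_slices adds the batch axis in front of each component, so the images and labels do not need to be grouped by hand.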

CodePudding user response：

Each sample in your data currently has these shapes:

x1 = tf.random.normal((105,105,3))
x2 = tf.random.normal((105,105,3))
y = tf.random.normal((1,))

When you do this:

tf.data.Dataset.from_tensor_slices(((x1 , x2) , y))

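you get an error, because from_tensor_slices slices every component along its first axis, so all components must have the same size along that axis. That axis plays the role of the sample (batch) axis, and here x1 and x2 have size 105 along it while y has size 1. Give the images a leading axis of size 1 first: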

x1 = tf.expand_dims(x1, axis=0)
x2 = tf.expand_dims(x2, axis=0)

Now do this:

tf.data.Dataset.from_tensor_slices(((x1 , x2) , y))

<TensorSliceDataset element_spec=((TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None), TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(), dtype=tf.float32, name=None))>
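
For the full list of samples, a different technique from the one above is to skip stacking and feed the list through tf.data.Dataset.from_generator. This is only a sketch under assumptions: train_data here is a 4-sample placeholder in the [image, image, label] layout described in the question, and sample_generator and image_spec are hypothetical names.

```python
import tensorflow as tf

# Placeholder list in the layout from the question: [image, image, label].
train_data = [
    [tf.zeros((105, 105, 3)), tf.ones((105, 105, 3)), tf.constant([1.0])]
    for _ in range(4)
]

def sample_generator():
    # Yield one ((image, image), scalar label) element at a time.
    for img_a, img_b, label in train_data:
        yield (img_a.numpy(), img_b.numpy()), label.numpy()[0]

image_spec = tf.TensorSpec(shape=(105, 105, 3), dtype=tf.float32)
ds = tf.data.Dataset.from_generator(
    sample_generator,
    output_signature=(
        (image_spec, image_spec),
        tf.TensorSpec(shape=(), dtype=tf.float32),
    ),
)

print(ds.element_spec)
```

The resulting element structure matches the one printed above, so the dataset can be batched in the same way.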