I am starting out with TensorFlow and trying to create a model for a simple classification problem. Let's say I want to recognise people based on fictive sizes/measurements.
The problem is that I can't get the input and output shapes to match correctly, and when I do, I can't call "predict" on the Keras model. Here's what I have so far:
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
# prints: TensorFlow version: 2.6.0
This is the limited dataset I want to train the model with (I know it needs to be way bigger, but first let's get this to compile):
# The names of people I want to identify
names = sorted({"Jack", "John", "Peter", "X"})
# The fictive sizes for each:
sizes = [
    [0.0, 0.5, 0.6, 0.7, 0.8, 0.3],  # Jack's sizes
    [0.2, 0.6, 0.7, 0.8, 0.5, 0.2],  # John's sizes
    [0.0, 0.1, 0.1, 0.4, 0.8, 0.9],  # Peter's sizes
    [0.3, 0.9, 0.2, 0.1, 0.0, 0.8]   # X's sizes
]
Then I create a Keras Sequential model with 6 inputs, one for each size, and 4 outputs, one for the probability of each name. A middle layer might be better, but I am trying to take out everything that could go wrong here:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(name="inputs", units=len(sizes[0]), activation='relu'),
    tf.keras.layers.Dense(name="outputs", units=len(names), activation='relu')
])
As far as I understand, this is called "one-hot" encoding, where each of the outputs generates a probability for that classification (name), and one will be "hot", meaning closer to 1.0. So this is what I use to compile the model:
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(),  # one-hot
    metrics=['accuracy']
)
Next up is training the model. As far as I understand, model.fit() can work with two arrays, X and Y, where X contains the input Tensors and Y contains the expected output Tensors (so X[0] should yield Y[0], and so on). This seemed to be the simplest setup, so I did:
x = [tf.constant(s) for s in sizes]  # Stack size Tensors for all names
y = [tf.constant(i) for i in [       # Stack categorization probabilities for each x -> y
    [1.0, 0.0, 0.0, 0.0],  # Jack's index
    [0.0, 1.0, 0.0, 0.0],  # John's index
    [0.0, 0.0, 1.0, 0.0],  # Peter's index
    [0.0, 0.0, 0.0, 1.0]   # X's index
]]
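(Side note: I realise the same 4x4 one-hot matrix could probably be generated with tf.one_hot rather than typed out by hand; something like the sketch below, where one_hot_rows is just my own name for it:)

# Presumably equivalent to the hand-written rows above:
# index i -> 1.0 at position i, 0.0 elsewhere
one_hot_rows = tf.one_hot(list(range(len(names))), depth=len(names))
print(one_hot_rows)  # 4x4 tensor with 1.0 on the diagonal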
When I call
model.fit(x, y, epochs=5)
I get the following error:
ValueError: Data cardinality is ambiguous:
x sizes: 6, 6, 6, 6
y sizes: 4, 4, 4, 4
Make sure all arrays contain the same number of samples.
This is strange, because my X and Y datasets are exactly the same size, and each Tensor in X contains 6 values for the 6 inputs of the model, while each Tensor in Y contains 4 values for the 4 outputs of the model.
Please help me understand what I am doing wrong here. I'd like to be able to train the model, and then call model.predict([0.0, 0.5, 0.6, 0.7, 0.8, 0.3]) on it and get probabilities back indicating that these are likely Jack's measurements.
CodePudding user response:
The problem lies in the way you provide the inputs to model.fit. If x or y is a list of tensors/arrays, each element of the list is interpreted as a separate input, i.e. observations of a different variable. In your case, the model gets 4 different variables with 6 observations each for x, and 4 different variables with 4 observations each for y, causing the mismatch.
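To make this concrete, here is a small sketch (reusing the sizes list from your question) showing the shapes Keras sees in each case:

import numpy as np
import tensorflow as tf

sizes = [
    [0.0, 0.5, 0.6, 0.7, 0.8, 0.3],
    [0.2, 0.6, 0.7, 0.8, 0.5, 0.2],
    [0.0, 0.1, 0.1, 0.4, 0.8, 0.9],
    [0.3, 0.9, 0.2, 0.1, 0.0, 0.8]
]

# As a list of tensors: 4 separate inputs of shape (6,) each,
# so Keras counts 6 samples per input -> "x sizes: 6, 6, 6, 6"
as_list = [tf.constant(s) for s in sizes]
print([t.shape for t in as_list])  # 4 x TensorShape([6])

# As a single array: one input of shape (4, 6),
# i.e. 4 samples with 6 features each
as_array = np.array(sizes)
print(as_array.shape)  # (4, 6)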
The solution is to simply provide x and y each as a single tensor or array instead of a list of tensors:

import numpy as np

x = np.array(sizes)

(or tf.constant would work as well). Same thing for y. Then, Keras will interpret the input as 4 observations of 6 variables for x, and 4 observations of 4 variables for y.
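Putting it all together, here is a minimal end-to-end sketch of the corrected setup. Two asides that go beyond the cardinality fix: I've swapped the output activation from relu to softmax so the 4 outputs actually behave like the probabilities you describe, and model.predict expects a batch dimension, so a single sample must be wrapped in an extra pair of brackets. np.eye is just a compact way to write the same one-hot rows.

import numpy as np
import tensorflow as tf

names = sorted({"Jack", "John", "Peter", "X"})
sizes = [
    [0.0, 0.5, 0.6, 0.7, 0.8, 0.3],  # Jack
    [0.2, 0.6, 0.7, 0.8, 0.5, 0.2],  # John
    [0.0, 0.1, 0.1, 0.4, 0.8, 0.9],  # Peter
    [0.3, 0.9, 0.2, 0.1, 0.0, 0.8]   # X
]

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(name="inputs", units=len(sizes[0]), activation='relu'),
    # softmax (instead of relu) makes the 4 outputs sum to 1, i.e. probabilities
    tf.keras.layers.Dense(name="outputs", units=len(names), activation='softmax')
])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=['accuracy']
)

x = np.array(sizes)      # shape (4, 6): 4 samples, 6 features
y = np.eye(len(names))   # shape (4, 4): one one-hot row per name
model.fit(x, y, epochs=5)

# predict also wants a batch dimension: shape (1, 6), not (6,)
probs = model.predict(np.array([[0.0, 0.5, 0.6, 0.7, 0.8, 0.3]]))
print(dict(zip(names, probs[0])))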