This reproducible example creates a basic regression model predicting MPG from Horsepower (I hope it is OK to just provide the link). As far as I understand, this bakes the transformation of the Horsepower feature into the model's training, also referred to as doing it "inside the model". This is appealing because the model then performs the necessary transformation of raw data during scoring/inference, e.g. after deployment (please correct me if I misunderstood). I am wondering how this could be implemented when one has more than one independent variable. This is taken from the reproducible code quoted above:
horsepower_normalizer = tf.keras.layers.Normalization(input_shape=[1, ], axis=None)
horsepower_normalizer.adapt(horsepower)
horsepower_model = Sequential([
    horsepower_normalizer,
    layers.Dense(units=1)
])
So let us say we have a list of numeric features X, Y, Z. Could the model definition code be produced based on this (e.g. via the functional API)? Any pointers would be very much welcome. Thanks!
PS:
I am currently trying to learn TensorFlow/Keras, and ideally I want the normalisation to be part of the model/training. I use very rudimentary code (to be improved!) along these lines:
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

train_data = pd.read_csv('train.csv')
val_data = pd.read_csv('val.csv')

target_name = 'ze_target'
y_train = train_data[target_name]
X_train = train_data.drop(target_name, axis=1)
y_val = val_data[target_name]
X_val = val_data.drop(target_name, axis=1)

def create_model():
    model = Sequential()
    # input_dim is only needed on the first layer
    model.add(Dense(20, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(1))
    # Compile model
    model.compile(optimizer=Adam(learning_rate=0.0001), loss='mse')
    return model

model = create_model()
model.summary()
model.fit(X_train, y_train, validation_data=(X_val, y_val), batch_size=128, epochs=30)
CodePudding user response:
You can use tf.concat to concatenate the three features on axis=1, and then use tf.keras.layers.Normalization on the combined tensor, like below. Because we want to normalize across three features, make sure to set input_shape=(3,) and axis=-1.
import tensorflow as tf

# Three numeric features with 100 samples each
x = tf.random.uniform((100, 1))
y = tf.random.uniform((100, 1))
z = tf.random.uniform((100, 1))

# Stack the features column-wise into a single (100, 3) tensor
xyz = tf.concat([x, y, z], 1)

# Adapt the normalizer so it learns a mean/variance per feature column
horsepower_normalizer = tf.keras.layers.Normalization(input_shape=(3,), axis=-1)
horsepower_normalizer.adapt(xyz)

horsepower_model = tf.keras.models.Sequential([
    horsepower_normalizer,
    tf.keras.layers.Dense(units=1)
])

horsepower_model(xyz)
Output:
<tf.Tensor: shape=(100, 1), dtype=float32, numpy=
array([[-0.17135675],
[-0.48248804],
[-2.2847023 ],
[-0.05702276],
[ 2.9332483 ],
[ 0.64826846],
[-2.1490448 ],
[-1.1697797 ],
[-0.01030668],
...
[-1.880199 ],
[ 1.2854142 ],
[-0.5471661 ]], dtype=float32)>
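To address the functional API part of the question: here is a minimal sketch of an alternative where each feature gets its own Input and its own Normalization layer, and the normalized columns are concatenated before the Dense head. The feature tensors x, y, z reuse the synthetic data from above; the input names are placeholders, so adapt them to your own columns.

import tensorflow as tf

# Placeholder feature data: three numeric columns, 100 samples each
x = tf.random.uniform((100, 1))
y = tf.random.uniform((100, 1))
z = tf.random.uniform((100, 1))

# One Normalization layer per feature, each adapted to its own column
norm_x = tf.keras.layers.Normalization(axis=None)
norm_y = tf.keras.layers.Normalization(axis=None)
norm_z = tf.keras.layers.Normalization(axis=None)
norm_x.adapt(x)
norm_y.adapt(y)
norm_z.adapt(z)

# One Input per feature; normalize, concatenate, then a Dense head
in_x = tf.keras.Input(shape=(1,), name='x')
in_y = tf.keras.Input(shape=(1,), name='y')
in_z = tf.keras.Input(shape=(1,), name='z')
features = tf.keras.layers.Concatenate()(
    [norm_x(in_x), norm_y(in_y), norm_z(in_z)])
output = tf.keras.layers.Dense(units=1)(features)
model = tf.keras.Model(inputs=[in_x, in_y, in_z], outputs=output)

# Forward pass with the raw (un-normalized) columns
model([x, y, z])

This variant is convenient when the features arrive as separate arrays, since each feature keeps its own statistics and the deployed model accepts the raw columns directly; if your features already live in one array, normalizing the concatenated tensor as shown above is simpler.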