I want to train a neural network to do nonlinear fitting for me. I generate a set of model parameters p and calculate the model output (the y values) on a given set of x values. The NN input is y, and its output should be the predicted parameters p'.

Instead of using a loss function like the MSE between p and p', I think it could work better to first generate the predicted curve y' from p' and then use the MSE between y and y'. (I was having issues where small differences in some of the parameters caused a massive "drift" in the fitted curve for x >> 0.)

I found that Keras evaluates the loss function symbolically (I think?), so I implemented a version of my model using Keras backend calls as follows:
import numpy as np
from tensorflow.keras import backend as K

x = np.linspace(0, 10, num=1000)

# regular model
def model(x, a):
    return np.cos(a * x)

def Kmodel(x, params):
    a = params[:, 0]
    # other parameters omitted, params[:, n]
    # simplified model for illustration
    return K.cos(a * x)

# THIS CAUSES ISSUES
keras_x = K.constant(x)

def regression_loss(p_true, p_pred):
    y_true = Kmodel(keras_x, p_true)
    y_pred = Kmodel(keras_x, p_pred)
    mse = K.mean(K.square(y_true - y_pred))
    return mse
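I attach this loss when compiling the network; a minimal sketch (the Dense architecture below is just a placeholder, not my actual model):

from tensorflow import keras

n_params = 3  # hypothetical number of model parameters

# hypothetical network: 1000 y values in, predicted parameters out
net = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(1000,)),
    keras.layers.Dense(n_params),
])
net.compile(optimizer="adam", loss=regression_loss)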
However, I can't figure out how to multiply everything by x: it requires increasing the dimensionality of the tensors, since there are 1000 values of x but only a handful of parameters per sample. Keras does not like it:
W tensorflow/core/framework/op_kernel.cc:1733] INVALID_ARGUMENT: required broadcastable shapes
Graph execution error:
<snip>
File "/tmp/ipykernel_4109/793156309.py", line 9, in Kmodel
return a * x
Node: 'regression_loss/mul'
required broadcastable shapes
[[{{node regression_loss/mul}}]] [Op:__inference_train_function_18838]
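A minimal illustration of the shape mismatch (the batch size of 32 and the 3 parameters are arbitrary choices):

import tensorflow as tf

params = tf.zeros((32, 3))  # a batch of 32 samples with 3 parameters each
a = params[:, 0]            # shape (32,)
# a * keras_x               # fails: shapes (32,) and (1000,) do not broadcast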
I found similar problems online, but none seem to deal with this broadcasting issue, for example: "Tensor indexing in custom loss function", "Custom loss function in Keras", "Is it possible to call/use instance attributes or global variables from a custom loss function using Keras?", and "InvalidArgumentError: required broadcastable shapes at loc(unknown)".
CodePudding user response:
Hmmm, you just have to make sure that a has the same shape as x. For example, you can use tf.repeat:
import tensorflow as tf
import numpy as np

x = np.linspace(0, 10, num=1000)

# regular model
def model(x, a):
    return np.cos(a * x)

def Kmodel(x, params):
    # stretch a to the length of x so the element-wise product is defined
    # (this assumes the batch size evenly divides the number of x samples)
    a = tf.repeat(params[:, 0], repeats=tf.shape(x)[0] // tf.shape(params)[0])
    # other parameters omitted, params[:, n]
    # simplified model for illustration
    return tf.keras.backend.cos(a * x)

keras_x = tf.keras.backend.constant(x)

def regression_loss(p_true, p_pred):
    y_true = Kmodel(keras_x, p_true)
    y_pred = Kmodel(keras_x, p_pred)
    mse = tf.keras.backend.mean(tf.keras.backend.square(y_true - y_pred))
    return mse

y_true = tf.constant([[0., 1.], [0., 0.]])
y_pred = tf.constant([[1., 1.], [1., 0.]])
regression_loss(y_true, y_pred)
<tf.Tensor: shape=(), dtype=float32, numpy=1.6316855>
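Quick sanity check of that number: the first column of p_true is [0, 0] and of p_pred is [1, 1], so the loss reduces to the mean of (cos(0 * x) - cos(1 * x))^2 over the 1000 grid points:

np.mean((1 - np.cos(x)) ** 2)  # ≈ 1.6317, matching the tensor above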
Or with tf.reshape (note that here x has to be tiled, rather than element-wise repeated, so that each row of y is one full copy of x):
def Kmodel(x, params):
    # tile x once per batch sample, then reshape to (batch_size, len(x))
    y = tf.tile(x, [tf.shape(params)[0]])
    y = tf.reshape(y, (tf.shape(params)[0], tf.shape(x)[0]))
    return tf.keras.backend.cos(tf.expand_dims(params[:, 0], axis=-1) * y)
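For what it's worth, plain broadcasting would also be enough here, with no repeat or reshape at all, since a (batch_size, 1) tensor multiplies cleanly against the (1000,) x vector:

def Kmodel(x, params):
    # (batch_size, 1) * (1000,) broadcasts to (batch_size, 1000)
    a = tf.expand_dims(params[:, 0], axis=-1)
    return tf.keras.backend.cos(a * x)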