I am developing a regression model to predict cryptocurrency prices, and I have created a simple custom loss function. The idea is simple: the Y target is the price change over a certain lookup window, so the values are either positive or negative. The loss first applies a base MAE loss and then penalizes samples where y_pred is positive while y_true is negative (and vice versa), and reduces the loss where y_pred is positive and y_true is also positive (and vice versa). Yet when I train with this loss function the precision does not get higher than 0.50, whereas it reaches around 0.535 with the regular MAE loss. Any idea what could cause this?
The loss function:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K

def loss_fn(
    # the loss mode [mae, rmse, mape, huber].
    mode="mae",
    # the threshold.
    threshold=0.0,
    # penalize incorrect predictions (predicted positive while the target is negative, and vice versa) (should be >= 1).
    penalizer=1.0,
    # reduce correct predictions (predicted positive while the target is positive, and vice versa) (should be <= 1).
    reducer=1.0,
):
    def loss_function(y_true, y_pred):
        if mode == "mae":
            loss = keras.losses.MAE(y_true, y_pred)
        elif mode == "rmse":
            loss = K.sqrt(K.mean(K.square(y_pred - y_true)))
        elif mode == "mape":
            loss = keras.losses.mean_absolute_percentage_error(y_true, y_pred)
        elif mode == "huber":
            loss = keras.losses.Huber()(y_true, y_pred)
        if penalizer != 1.0 or reducer != 1.0:
            # apply penalizer.
            mask = tf.where(
                tf.logical_or(
                    tf.logical_and(K.less_equal(y_pred, -1 * threshold), K.greater(y_true, 0.0)),
                    tf.logical_and(K.greater_equal(y_pred, threshold), K.less(y_true, 0.0)),
                ),
                penalizer,
                1.0,
            )[:, 0]
            loss = tf.multiply(loss, mask)
            # apply reducer.
            mask = tf.where(
                tf.logical_or(
                    tf.logical_and(K.less_equal(y_pred, -1 * threshold), K.less(y_true, 0.0)),
                    tf.logical_and(K.greater_equal(y_pred, threshold), K.greater(y_true, 0.0)),
                ),
                reducer,
                1.0,
            )[:, 0]
            loss = tf.multiply(loss, mask)
        loss = tf.math.reduce_mean(loss)
        return loss
    return loss_function
loss = loss_fn(mode="mae", threshold=0.0, penalizer=3.0, reducer=1.0/3.0)
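For reference, here is a minimal sanity check of the masks on toy tensors. It assumes TensorFlow 2.x and a single-column (batch, 1) output (which is what the [:, 0] slicing expects), so treat it as a sketch rather than a faithful reproduction of my training setup:

# Toy check of the penalizer / reducer masks (assumes TensorFlow 2.x and loss_fn from above).
import tensorflow as tf
from tensorflow import keras

y_true = tf.constant([[1.0], [-1.0], [1.0], [-1.0]])
y_pred = tf.constant([[0.5], [0.5], [-0.5], [-0.5]])  # samples 1 and 4 have the correct sign

check = loss_fn(mode="mae", threshold=0.0, penalizer=3.0, reducer=1.0 / 3.0)
print(float(check(y_true, y_pred)))                              # weighted loss (wrong signs x3, correct signs x1/3)
print(float(tf.reduce_mean(keras.losses.MAE(y_true, y_pred))))   # plain MAE for comparison

On this toy batch the wrong-sign samples get scaled by 3 and the correct-sign ones by 1/3, so the weighting itself seems to do what I intended.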
Does anyone see any errors or mistakes that could cause this?
Training logs from "mae" as loss:
Epoch 1/250
2829/2829 [==============================] - 44s 12ms/step - loss: 0.8713 - precision: 0.5311 - val_loss: 0.9731 - val_precision: 0.5343
Epoch 2/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8705 - precision: 0.5341 - val_loss: 0.9732 - val_precision: 0.5323
Epoch 3/250
2829/2829 [==============================] - 31s 11ms/step - loss: 0.8702 - precision: 0.5343 - val_loss: 0.9727 - val_precision: 0.5372
Epoch 4/250
2829/2829 [==============================] - 31s 11ms/step - loss: 0.8701 - precision: 0.5345 - val_loss: 0.9730 - val_precision: 0.5336
Epoch 5/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8700 - precision: 0.5344 - val_loss: 0.9732 - val_precision: 0.5316
Epoch 6/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8699 - precision: 0.5347 - val_loss: 0.9726 - val_precision: 0.5334
Epoch 7/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8697 - precision: 0.5346 - val_loss: 0.9731 - val_precision: 0.5331
Epoch 8/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8695 - precision: 0.5343 - val_loss: 0.9722 - val_precision: 0.5382
Epoch 9/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8693 - precision: 0.5346 - val_loss: 0.9724 - val_precision: 0.5330
Epoch 10/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8693 - precision: 0.5345 - val_loss: 0.9732 - val_precision: 0.5331
Epoch 11/250
2829/2829 [==============================] - 32s 11ms/step - loss: 0.8692 - precision: 0.5342 - val_loss: 0.9738 - val_precision: 0.5339
Epoch 12/250
2829/2829 [==============================] - 31s 11ms/step - loss: 0.8690 - precision: 0.5345 - val_loss: 0.9729 - val_precision: 0.5356
Epoch 13/250
2829/2829 [==============================] - 31s 11ms/step - loss: 0.8687 - precision: 0.5350 - val_loss: 0.9728 - val_precision: 0.5342
Training logs from the custom loss function (EarlyStopping enabled):
Epoch 1/250
2829/2829 [==============================] - 42s 12ms/step - loss: 1.4488 - precision: 0.5039 - val_loss: 1.5693 - val_precision: 0.5021
Epoch 2/250
2829/2829 [==============================] - 33s 12ms/step - loss: 1.4520 - precision: 0.5022 - val_loss: 1.6135 - val_precision: 0.5132
Epoch 3/250
2829/2829 [==============================] - 33s 12ms/step - loss: 1.4517 - precision: 0.5019 - val_loss: 1.6874 - val_precision: 0.4983
Epoch 4/250
2829/2829 [==============================] - 33s 12ms/step - loss: 1.4536 - precision: 0.5017 - val_loss: 1.6885 - val_precision: 0.4982
Epoch 5/250
2829/2829 [==============================] - 33s 12ms/step - loss: 1.4513 - precision: 0.5028 - val_loss: 1.6654 - val_precision: 0.5004
Epoch 6/250
2829/2829 [==============================] - 34s 12ms/step - loss: 1.4578 - precision: 0.4997 - val_loss: 1.5716 - val_precision: 0.5019
CodePudding user response:
Any idea what could cause this?
Assuming you set the seed for reproducibility (otherwise it could simply be the initialization): when you change the loss function, you change the surface over which gradient descent iterates to optimize your network.
And since you have no guarantee that your model will reach the global minimum but will most likely stop at a local minimum, it could just mean that, given the same initialization (set seed), the optimization process stopped at a different local minimum.
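As a rough sketch, this is the usual way to pin down the initialization in TensorFlow/Keras so the comparison between the two losses starts from the same weights (the exact helpers available depend on your TF version, so take this as an assumption rather than a guaranteed recipe):

import random
import numpy as np
import tensorflow as tf

# Fix all three seed sources so both training runs start from the same weights.
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)
# In newer TF releases, tf.keras.utils.set_random_seed(42) wraps all three calls.

With the seeds fixed, any remaining difference between the two runs comes from the loss surface itself rather than from a different starting point.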