The code for a paper I read has a loss function written in PyTorch. I tried to convert it to TensorFlow as best I could, but the model now predicts all zeros, so I would like to ask the following:
- Are the methods I used the correct TensorFlow equivalents?
- Why is the model predicting only zeros?
Here is the function:
# PyTorch
class AdjMSELoss1(nn.Module):
    def __init__(self):
        super(AdjMSELoss1, self).__init__()

    def forward(self, outputs, labels):
        outputs = torch.squeeze(outputs)
        alpha = 2
        loss = (outputs - labels)**2
        adj = torch.mul(outputs, labels)
        adj[adj>0] = 1 / alpha
        adj[adj<0] = alpha
        loss = loss * adj
        return torch.mean(loss)
# TensorFlow
def custom_loss_function(outputs, labels):
    outputs = tf.squeeze(outputs)
    alpha = 2.0
    loss = (outputs - labels) ** 2.0
    adj = tf.math.multiply(outputs, labels)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)
The function is accepted by model.compile and is used for both the loss and metrics parameters. The values it logs look plausible (similar to val_loss), but the trained model predicts all 0's:
model.compile(
    loss=custom_loss_function,
    optimizer=optimization,
    metrics=[custom_loss_function]
)
Model:
# Simplified for readability
model = Sequential()
model.add(LSTM(32,input_shape=(SEQ_LEN,feature_number),return_sequences=True,))
model.add(Dropout(0.3))
model.add(LSTM(96, return_sequences = False))
model.add(Dropout(0.3))
model.add(Dense(1))
return model
Inputs/features are the price pct_change for the previous SEQ_LEN days (given SEQ_LEN days, the model tries to predict the next day's target).
Outputs/targets are the next day's price pct_change * 100 (e.g. 5 for 5%), one value per row.
Note: the model predicts normally when RMSE() is set as the loss function. As mentioned, with the custom_loss_function above it predicts only zeros.
CodePudding user response:
Try this custom_loss:
# Keras calls the loss as loss(y_true, y_pred). This loss is symmetric in
# its two arguments, so the order does not change the value.
def custom_loss(y_true, y_pred):
    alpha = 2.0
    loss = (y_pred - y_true) ** 2.0
    adj = tf.math.multiply(y_pred, y_true)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)
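The main change from the function in the question is that the tf.squeeze call is gone. A possible explanation for why that matters, assuming Keras hands both tensors to the loss with shape (batch, 1): squeezing only one of them to (batch,) makes the subtraction broadcast to a (batch, batch) matrix, so every prediction is compared against every target instead of only its own. The sketch below illustrates this with NumPy, which follows the same broadcasting rules as TensorFlow. The array values are made up for illustration.

```python
import numpy as np

# Assumed shapes: both tensors arrive as (batch, 1). Values are illustrative.
y_true = np.array([[1.0], [-2.0], [3.0]])   # shape (3, 1)
y_pred = np.array([[0.5], [1.0], [2.0]])    # shape (3, 1)

# Squeezing only one side, as the question's function does to its first argument.
squeezed = np.squeeze(y_true)               # shape (3,)
pairwise = (squeezed - y_pred) ** 2         # (3,) against (3, 1) broadcasts to (3, 3)

# Without the squeeze, each prediction is paired only with its own target.
elementwise = (y_true - y_pred) ** 2        # shape (3, 1)

print(pairwise.shape)     # (3, 3)
print(elementwise.shape)  # (3, 1)
```

Whether this is what happens in the original model depends on the actual shapes, which can be checked by printing outputs.shape and labels.shape inside the loss function.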
I checked custom_loss with the following script, and it works correctly (it builds a model that learns to predict the sum of two variables using custom_loss):
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf
import numpy as np
x = np.random.rand(1000,2)
y = x.sum(axis=1)
y = y.reshape(-1,1)
# Keras calls the loss as loss(y_true, y_pred). This loss is symmetric in
# its two arguments, so the order does not change the value.
def custom_loss(y_true, y_pred):
    alpha = 2.0
    loss = (y_pred - y_true) ** 2.0
    adj = tf.math.multiply(y_pred, y_true)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)
model = Sequential()
model.add(Dense(128, activation='relu', input_dim=2))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1,))
model.compile(optimizer='adam', loss=custom_loss)
model.fit(x, y, epochs=200, batch_size=16)
for _ in range(10):
    rnd_num = np.random.randint(50, size=2)[None, :]
    pred_add = model.predict(rnd_num)
    print(f'predict sum of {rnd_num[0]} -> {pred_add}')
Output:
Epoch 1/200
63/63 [==============================] - 1s 2ms/step - loss: 0.2903
Epoch 2/200
63/63 [==============================] - 0s 2ms/step - loss: 0.0084
Epoch 3/200
63/63 [==============================] - 0s 2ms/step - loss: 0.0016
...
Epoch 198/200
63/63 [==============================] - 0s 2ms/step - loss: 3.3231e-07
Epoch 199/200
63/63 [==============================] - 0s 2ms/step - loss: 5.1004e-07
Epoch 200/200
63/63 [==============================] - 0s 2ms/step - loss: 9.8688e-08
predict sum of [43 44] -> [[82.81973]]
predict sum of [39 13] -> [[48.97299]]
predict sum of [36 46] -> [[78.05187]]
predict sum of [46 7] -> [[49.445843]]
predict sum of [35 11] -> [[43.311478]]
predict sum of [33 1] -> [[31.695848]]
predict sum of [6 8] -> [[13.433815]]
predict sum of [14 38] -> [[49.54941]]
predict sum of [ 1 40] -> [[39.709686]]
predict sum of [10 2] -> [[11.325197]]
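As a side note on the first question, the per-element weighting that both the PyTorch and TensorFlow versions apply can be sketched in plain Python. The helper name adjusted_squared_error is hypothetical and only serves the illustration. It makes one property of the code as written visible: when the product of prediction and target is exactly 0, neither condition matches, so the weight stays 0 and that element contributes no loss. Whether this plays a role in the all-zero predictions is not something this sketch can confirm.

```python
def adjusted_squared_error(pred, target, alpha=2.0):
    """Per-element term of the adjusted MSE (hypothetical helper for illustration)."""
    product = pred * target
    if product > 0:
        # Same sign: the squared error is scaled down by 1 / alpha.
        weight = 1.0 / alpha
    elif product < 0:
        # Opposite sign: the squared error is scaled up by alpha.
        weight = alpha
    else:
        # Product is exactly 0: neither mask matches, so the weight stays 0.
        weight = product
    return weight * (pred - target) ** 2


same_sign = adjusted_squared_error(1.0, 3.0)       # 0.5 * (1 - 3)^2
opposite_sign = adjusted_squared_error(-1.0, 3.0)  # 2.0 * (-1 - 3)^2
zero_pred = adjusted_squared_error(0.0, 3.0)       # 0.0 * (0 - 3)^2

print(same_sign)      # 2.0
print(opposite_sign)  # 32.0
print(zero_pred)      # 0.0
```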