How can I compute the Residual Standard Error (RSE) as a custom metric in Keras? The RSE is given by:

    RSE = sqrt(RSS / (n - 2))

where the RSS is:

    RSS = sum((y_true - y_pred)**2)

This question refers to a post on Stack Overflow in which a user by the name of Swain Subrat Kumar shows an implementation of the Residual Standard Error (RSE). He even provides a minimal working example (MWE), which I believe to be correct. I repost a shortened version here:
import math
import numpy as np

def RSE(y_true, y_predicted):
    '''
    y_true, y_predicted: np.array()
    '''
    RSS = np.sum(np.square(y_true - y_predicted))
    return math.sqrt(RSS / (len(y_true) - 2))
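As a quick sanity check (the data values here are my own, not from the original post), the NumPy version can be exercised directly:

```python
import math
import numpy as np

def RSE(y_true, y_predicted):
    '''
    y_true, y_predicted: np.array()
    '''
    RSS = np.sum(np.square(y_true - y_predicted))
    return math.sqrt(RSS / (len(y_true) - 2))

# five observations with a single unit error -> RSS = 1, n - 2 = 3
y_true = np.array([1, 2, 3, 4, 6], dtype=np.float32)
y_pred = np.array([1, 2, 3, 4, 5], dtype=np.float32)
print(RSE(y_true, y_pred))  # sqrt(1/3) ≈ 0.5774
```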
I am trying to translate this code into Keras/TensorFlow so that I can use it as a metric. So far, I have this:
import tensorflow as tf
import tensorflow.keras.backend as K

def rse(y_true, y_pred):
    '''
    y_true, y_pred: tensor
    '''
    tmp = tf.cast(len(y_true), tf.float32) - tf.constant(2.0)
    RSS = K.sum(K.square(y_true - y_pred))  # residual sum of squares
    return K.sqrt(tf.math.divide(RSS, tmp))
However, this is not correct. The RSS is fine; where it all goes wrong is in dividing the RSS by (len(y_true) - 2). How can I fix this? Many thanks in advance.
P.S.: I am having similar problems when trying to create my own variance metric.
CodePudding user response:
If you are using the rse function as a metric or a loss, it is applied to batches of data, i.e. tensors of shape (B, n), where B is the designated batch size and n is the number of elements in each vector (assuming each is 1-D). When you divide by len(y_true) - 2, the len function returns the number of samples in the batch, B (the first dimension), where it should be using the value of the second dimension, n. If you change the rse function to use the second dimension of the tensor (y_true.shape[1]), the results are correct:
def rse(y_true, y_pred):
    '''
    y_true, y_pred: tensor
    '''
    tmp = tf.cast(y_true.shape[1], tf.float32) - tf.constant(2.0)
    RSS = K.sum(K.square(y_true - y_pred))  # residual sum of squares
    return K.sqrt(tf.math.divide(RSS, tmp))
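One caveat beyond the answer as posted: y_true.shape[1] relies on the static shape being known, and it can be None when the metric runs inside a compiled graph. A sketch of a more defensive variant (my own variation, not from the original answer) that reads the dimension at runtime with tf.shape:

```python
import tensorflow as tf

def rse_dynamic(y_true, y_pred):
    # tf.shape reads the runtime shape, so this also works when the
    # static second dimension is unknown (None) in graph mode
    n = tf.cast(tf.shape(y_true)[1], tf.float32) - 2.0
    RSS = tf.reduce_sum(tf.square(y_true - y_pred))  # residual sum of squares
    return tf.sqrt(RSS / n)
```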
In a fully reproducible dummy example:
import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np

def rse(y_true, y_pred):
    '''
    y_true, y_pred: tensor
    '''
    tmp = tf.cast(y_true.shape[1], tf.float32) - tf.constant(2.0)
    RSS = K.sum(K.square(y_true - y_pred))  # residual sum of squares
    return K.sqrt(tf.math.divide(RSS, tmp))

if __name__ == "__main__":
    # NOTE: call `expand_dims` to simulate the idea of a batch (i.e. a 2D tensor
    # with shape (1, 5)), so B = 1, n = 5
    y_true = np.expand_dims(np.array([1, 2, 3, 4, 6], dtype=np.float32), axis=0)
    y_pred = np.expand_dims(np.array([1, 2, 3, 4, 5], dtype=np.float32), axis=0)
    print(rse(y_true, y_pred))
Output is:

    tf.Tensor(0.57735026, shape=(), dtype=float32)
which is correct: it is simply the square root of 1/3, since the example data contains a single unit error (RSS = 1, n - 2 = 3).
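To close the loop on the original question (using this as a Keras metric): the function can be passed to model.compile via the metrics argument. The model below is a hypothetical stand-in, only to show the wiring; I use tf.shape rather than the static shape for robustness inside fit, and note that the target dimension must exceed 2 for n - 2 to stay positive.

```python
import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K

def rse(y_true, y_pred):
    # runtime second dimension n, minus 2 degrees of freedom
    tmp = tf.cast(tf.shape(y_true)[1], tf.float32) - tf.constant(2.0)
    RSS = K.sum(K.square(y_true - y_pred))  # residual sum of squares
    return K.sqrt(tf.math.divide(RSS, tmp))

# hypothetical 5-output regression model, just to demonstrate the metric wiring
model = tf.keras.Sequential([tf.keras.layers.Dense(5, input_shape=(3,))])
model.compile(optimizer="adam", loss="mse", metrics=[rse])

x = np.random.rand(8, 3).astype(np.float32)
y = np.random.rand(8, 5).astype(np.float32)
history = model.fit(x, y, epochs=1, verbose=0)
print(history.history["rse"])
```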