This is a follow-up to an earlier question that has already been answered; I would like to ask it formally here as a new question. The original question is located here:
As mentioned, I am currently training TensorFlow models to predict parameters of different distributions. For this purpose, I create appropriate layers and modify the loss functions.
Unfortunately, when I use a multivariate t-distribution (tfp.distributions.MultivariateStudentTLinearOperator), the following error results:
InvalidArgumentError: Input matrix is not invertible.
[[node negative_t_loss_2/negative_t_loss_2_MultivariateStudentTLinearOperator/log_prob/LinearOperatorLowerTriangular/solve/triangular_solve/MatrixTriangularSolve (defined at d:\20_programming\python\virtualenvs\tensorflow-gpu-2\lib\site-packages\tensorflow_probability\python\distributions\multivariate_student_t.py:265) ]] [Op:__inference_train_function_1471]
Function call stack:
train_function
This time, the procedure for defining the loss function is as follows:
def negative_t_loss_2(y_true, y_pred):
    # Separate the parameters
    n, mu1, mu2, sigma11, sigma12, sigma22 = tf.unstack(y_pred, num=6, axis=-1)
    mu = tf.transpose([mu1, mu2], perm=[1, 0])
    sigma = tf.linalg.LinearOperatorLowerTriangular(tf.transpose([[sigma11, sigma12], [sigma12, sigma22]], perm=[2, 0, 1]))
    dist = tfp.distributions.MultivariateStudentTLinearOperator(df=n, loc=mu, scale=sigma)
    nll = tf.reduce_mean(-dist.log_prob(y_true))
    return nll
I have copied the complete (somewhat more extensive) code and the required data to
https://drive.google.com/drive/folders/1IIAtKDB8paWV0aFVFALDUAiZTCqa5fAN?usp=sharing
(notebook "normdist_2D_not_working_t.ipynb").
I am using Windows 10 and Python 3.6. All libraries listed in the sample code, including tensorflow-gpu, are at their latest versions.
I would be very grateful if the problem could be solved. The topic is particularly relevant for the financial sector, where such distributions play a major role, especially in risk management.
CodePudding user response:
The scale matrix must be lower triangular when you pass it to LinearOperatorLowerTriangular. To convert the tensor to a linear operator, just replace
sigma = tf.linalg.LinearOperatorLowerTriangular(tf.transpose([[sigma11, sigma12], [sigma12, sigma22]], perm=[2, 0, 1]))
by:
sigma = tf.linalg.LinearOperatorLowerTriangular(tf.transpose([[sigma11, tf.zeros_like(sigma12)], [sigma12, sigma22]], perm=[2, 0, 1]))
Also, the degrees-of-freedom parameter n of the Student-t distribution must be positive, so you should add n = tf.keras.activations.softplus(n) in the negative_t_layer_2 function.
Then, it should work.