import numpy as np
import tensorflow as tf
import torch

gat_key_t = np.random.normal(size = (8, 16, 64, 20)).astype(np.float32)
gat_query_t = np.random.normal(size = (8, 16, 30, 64)).astype(np.float32)
tf_key = tf.convert_to_tensor(gat_key_t)
tf_query = tf.convert_to_tensor(gat_query_t)
pt_key = torch.from_numpy(gat_key_t)
pt_query = torch.from_numpy(gat_query_t)
tf_output = tf.matmul(tf_query, tf_key)
pt_output = torch.matmul(pt_query, pt_key)
# False
np.allclose(tf_output.numpy(), pt_output.numpy(), rtol = 1e-5, atol = 1e-5, equal_nan = False)
# True
np.allclose(tf_output.numpy(), pt_output.numpy(), rtol = 1e-4, atol = 1e-4, equal_nan = False)
When I multiply the two tensors, the outputs of torch and tensorflow differ once the tolerance is tightened to 1e-5.
As above, the two results agree at a tolerance of 1e-4, but they diverge as the tolerance gets smaller.
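The actual size of the gap can be inspected directly (a quick sketch reusing the tensors above):
diff = np.abs(tf_output.numpy() - pt_output.numpy())
print(diff.max())  # largest elementwise difference between the TF and torch results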
How can I make the two outputs match within a tolerance of 1e-5?
CodePudding user response:
I encountered this issue recently when trying to port a transformer model from pytorch to TF. Only the CPU version of TF seems to come close to both pytorch's matmul and numpy's matmul. Casting the params to tf.float64 also improves the precision. TF's GPU implementation of matmul (which uses cublas) seems to suffer from precision issues.
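As a rough sketch of the float64 route (reusing the tensors from the question; whether this gets you all the way to 1e-5 will depend on your hardware):
tf_key64 = tf.cast(tf_key, tf.float64)
tf_query64 = tf.cast(tf_query, tf.float64)
# do the matmul in float64, then cast back to float32 for comparison with the torch result
tf_output64 = tf.cast(tf.matmul(tf_query64, tf_key64), tf.float32)
np.allclose(tf_output64.numpy(), pt_output.numpy(), rtol = 1e-5, atol = 1e-5)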
The closest I got to improving the precision was by implementing matmul natively:
tf_output = tf.reduce_sum(tf.expand_dims(tf.transpose(tf_query, (0, 1, 3, 2)), -1)*tf.expand_dims(tf_key,3), 2)
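This spells out the inner product via broadcasting: the query is transposed to (8, 16, 64, 30) and expanded to (8, 16, 64, 30, 1), the key is expanded to (8, 16, 64, 1, 20), and the elementwise product is summed over the shared 64-dimension, giving the same (8, 16, 30, 20) shape as tf.matmul. You can then re-run the comparison at the tighter tolerance to see how much it helps on your setup:
np.allclose(tf_output.numpy(), pt_output.numpy(), rtol = 1e-5, atol = 1e-5)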