I am working on a problem in which a matrix has to be mean-variance normalized row-wise. The normalization also has to be applied after splitting each row into small batches. The code seems to work with NumPy, but fails with PyTorch (which is required for training): the PyTorch and NumPy results differ. Any help would be greatly appreciated.
Example code:
import numpy as np
import torch
def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception('Number of columns must be a multiple of bsize')
    # split each row into batches of bsize columns
    x = x.reshape(-1, bsize)
    # per-batch mean and standard deviation
    m = x.mean(1).reshape(-1, 1)
    s = x.std(1).reshape(-1, 1)
    n = (x - m) / (s + eps)
    # restore the original shape
    n = n.reshape(-1, nc)
    return n
# numpy
a = np.float32(np.random.randn(8, 8))
n1 = normalize(a, 4)
# torch
b = torch.tensor(a)
n2 = normalize(b, 4)
n2 = n2.numpy()
print(abs(n1-n2).max())
CodePudding user response:
In the first example you are calling normalize with a, a numpy.ndarray, while in the second you call normalize with b, a torch.Tensor.
According to the documentation page of torch.std, Bessel's correction is used by default to compute the standard deviation. NumPy, on the other hand, defaults to the uncorrected (population) standard deviation (ddof=0), so the default behavior of numpy.ndarray.std and torch.Tensor.std differs.
If unbiased is True, Bessel's correction will be used. Otherwise, the sample deviation is calculated, without any correction.

torch.std(input, dim, unbiased, keepdim=False, *, out=None) → Tensor

Parameters
    input (Tensor) – the input tensor.
    unbiased (bool) – whether to use Bessel's correction (δN = 1).
You can verify this yourself:
>>> a.std(), b.std(unbiased=True), b.std(unbiased=False)
(0.8364538, tensor(0.8942), tensor(0.8365))