Why do NumPy and PyTorch give different results after mean and variance normalization?


I am working on a problem in which a matrix has to be mean-variance normalized row-wise. The normalization also has to be applied after splitting each row into small batches. The code seems to work with NumPy, but fails with PyTorch (which is required for training): the PyTorch and NumPy results differ. Any help will be greatly appreciated.

Example code:

import numpy as np
import torch


def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    s = x.std(1).reshape(-1, 1)
    n = (x - m) / (eps + s)
    n = n.reshape(-1, nc)
    return n

# numpy
a = np.float32(np.random.randn(8, 8))
n1 = normalize(a, 4)
# torch
b = torch.tensor(a)
n2 = normalize(b, 4)
n2 = n2.numpy()

print(abs(n1-n2).max())

CodePudding user response:

In the first example you are calling normalize with a, a numpy.ndarray, while in the second you call normalize with b, a torch.Tensor.

According to the documentation page of torch.std, Bessel's correction is used by default to compute the standard deviation. As such, the default behavior of numpy.ndarray.std and torch.Tensor.std differs.

If unbiased is True, Bessel’s correction will be used. Otherwise, the sample deviation is calculated, without any correction.

torch.std(input, dim, unbiased, keepdim=False, *, out=None) → Tensor
Parameters

  • input (Tensor) – the input tensor.
  • unbiased (bool) – whether to use Bessel’s correction (δN = 1).

You can try it yourself:

>>> a.std(), b.std(unbiased=True), b.std(unbiased=False)
(0.8364538, tensor(0.8942), tensor(0.8365))
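
Concretely, the two defaults only differ in the divisor of the variance: NumPy divides by N, while PyTorch divides by N - 1 (Bessel's correction). A minimal sketch showing both, with variable names chosen just for illustration:

import numpy as np
import torch

x = np.float32(np.random.randn(10))
t = torch.tensor(x)

n = x.size
m = x.mean()
biased = np.sqrt(((x - m) ** 2).sum() / n)          # divisor N   -> numpy default (ddof=0)
unbiased = np.sqrt(((x - m) ** 2).sum() / (n - 1))  # divisor N-1 -> torch default (unbiased=True)

print(np.allclose(x.std(), biased))           # True
print(np.allclose(t.std().item(), unbiased))  # True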
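If you want the NumPy behavior in both cases, one option is to pass unbiased=False to std when x is a tensor (or ddof=1 on the NumPy side to match torch instead). Another is to compute the biased standard deviation by hand so the exact same code works for numpy arrays and torch tensors. A sketch of the latter, not the only possible fix:

import numpy as np
import torch


def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    # Biased (ddof=0 / unbiased=False) std computed by hand: behaves
    # identically for numpy.ndarray and torch.Tensor.
    s = (((x - m) ** 2).mean(1) ** 0.5).reshape(-1, 1)
    return ((x - m) / (eps + s)).reshape(-1, nc)

With this version, abs(n1 - n2).max() should come down to float32 rounding error rather than reflecting the N vs. N - 1 discrepancy.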