I have a tensor of shape [batch_size, channel, depth, height, width]
:
torch.Size([1, 1, 32, 64, 64])
with data:
tensor([[[[[-1.8540, -2.8068, -2.7348, ..., -1.9074, -1.8227, -1.4540],
[-2.7012, -4.2785, -3.7421, ..., -3.1961, -2.7786, -1.8042],
[-2.1924, -4.2202, -4.4361, ..., -3.1203, -2.9282, -2.3800],
...,
[-2.7429, -4.3133, -4.4029, ..., -4.4971, -5.3288, -2.8659],
[-3.0169, -4.0198, -3.6886, ..., -3.7542, -4.5010, -2.4040],
[-1.6174, -2.5340, -2.3974, ..., -1.9249, -2.4107, -1.2664]],
[[-2.7840, -3.2442, -3.6118, ..., -3.1365, -2.8342, -1.9516],
[-3.5764, -4.9253, -5.9196, ..., -4.8373, -4.2233, -3.3809],
[-3.1701, -5.0826, -5.6424, ..., -5.2955, -4.6438, -3.4820],
...,
[-4.0111, -6.1946, -5.6582, ..., -6.7947, -6.5305, -4.2866],
[-4.2103, -6.6177, -6.0420, ..., -5.8076, -6.2128, -3.2093],
[-2.3174, -4.1081, -3.7369, ..., -3.5552, -3.1871, -1.9736]],
[[-2.8441, -4.1575, -3.8233, ..., -3.5065, -3.4313, -2.3030],
[-4.0076, -5.4939, -6.2451, ..., -4.6663, -4.9835, -3.1530],
[-3.4737, -5.6347, -6.0232, ..., -5.6191, -5.2626, -3.6109],
...,
[-3.8026, -5.3676, -6.1460, ..., -7.6695, -6.7640, -4.1681],
[-4.4012, -6.1293, -6.1859, ..., -6.0011, -6.1012, -3.5307],
[-2.7917, -4.2264, -4.1388, ..., -4.2080, -3.5555, -1.6384]],
...,
[[-2.2204, -3.5705, -4.3114, ..., -4.2249, -3.9628, -2.9190],
[-3.6343, -5.3445, -6.1638, ..., -6.3998, -6.7561, -4.8491],
[-3.4870, -5.5835, -5.6436, ..., -6.8527, -7.2536, -4.8143],
...,
[-2.4492, -3.7896, -5.4344, ..., -6.2853, -6.0766, -3.7538],
[-2.4723, -3.8393, -4.8480, ..., -5.6503, -5.0375, -3.5580],
[-1.6161, -2.9843, -3.2865, ..., -3.2627, -3.2887, -2.5750]],
[[-2.1509, -3.8303, -4.2807, ..., -3.7945, -3.7561, -3.0863],
[-3.1012, -5.1321, -6.1387, ..., -6.5191, -6.3268, -4.4283],
[-2.8346, -5.0640, -5.4868, ..., -6.6515, -6.5529, -4.3672],
...,
[-2.7278, -4.2538, -4.9776, ..., -6.4153, -6.0100, -3.9929],
[-2.8002, -4.0473, -4.7455, ..., -5.4203, -4.7286, -3.4111],
[-1.7964, -3.2307, -3.6329, ..., -3.2750, -2.3952, -1.9714]],
[[-1.4447, -2.1572, -2.4487, ..., -2.3859, -2.9540, -1.8451],
[-1.8075, -2.8380, -3.5621, ..., -3.8641, -3.5828, -2.7304],
[-1.7862, -2.9849, -3.8364, ..., -4.3380, -4.4745, -2.8476],
...,
[-1.8043, -2.5662, -2.7296, ..., -4.2772, -3.9882, -2.8654],
[-1.2364, -2.5228, -2.7190, ..., -4.1142, -3.6160, -2.2325],
[-1.0395, -1.7621, -2.5738, ..., -2.0349, -1.5140, -1.1625]]]]]
Now, to get the prediction from this, I use
torch.argmax(data, 1)
which should give me the location of the maximum values along the channel dimension, but instead I get a tensor containing only zeros. Even max(torch.argmax(data, 1))
produces 0
.
How can this be? The tensor has only a single channel and a single batch, so how can every index it returns be 0?
To get rid of the negative values I applied torch.nn.Sigmoid()
to it, but argmax
still failed to find a maximum value, which I don't understand: how can there not be a maximum value?
numpy.argmax(output.detach().numpy(), 1)
gives the same output, all 0
.
Am I not using argmax correctly?
CodePudding user response:
The documentation page for argmax
is confusing: the example tensor they selected is 4x4, so you cannot tell which dimension the result indexes. With a non-square tensor the difference is obvious:
a = torch.randn(5, 3)
print(a)
print(torch.argmax(a, dim=0))
print(torch.argmax(a, dim=1))
Out
tensor([[-1.0329, 0.2042, 2.5499],
[ 0.9893, 0.3913, 0.5096],
[ 0.4951, 0.2260, -0.3810],
[-1.8953, -0.6823, 0.8349],
[-0.6217, 0.4068, -1.0846]])
tensor([1, 4, 0])
tensor([2, 0, 0, 2, 1])
See how for dim=0
we get 3 values, one per column: the reduction runs down each column.
So it is telling you that the element with index 1 in the first column is the max in that column.
The other, dim=1
, reduces along each row, so we get 5 values, one per row.
For your example you can calculate the shape of the argmax
result for each dimension:
for i in range(data.dim()):
    print("dim", i)
    r = torch.argmax(data, i)
    print(r.shape)
dim 0
torch.Size([1, 32, 64, 64])
dim 1
torch.Size([1, 32, 64, 64])
dim 2
torch.Size([1, 1, 64, 64])
dim 3
torch.Size([1, 1, 32, 64])
dim 4
torch.Size([1, 1, 32, 64])
And for dim=0
and dim=1
you should get all zeros, since those dimensions have size 1 (the only valid index is 0).
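A quick check with a random tensor of the same shape as in the question makes this concrete:

```python
import torch

# A stand-in tensor shaped like the one in the question:
# [batch, channel, depth, height, width]
data = torch.randn(1, 1, 32, 64, 64)

# The channel dimension (dim=1) has size 1, so the only valid index
# along it is 0 -- argmax over it can never return anything else.
pred = torch.argmax(data, dim=1)
print(pred.shape)     # torch.Size([1, 32, 64, 64])
print(pred.unique())  # tensor([0])
```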
I tried that, but then how do I extract the maximum values that argmax points to, and along which dimension should it look?
data = torch.randn(32, 64, 64)
values, indices = data.max(0)
print(values, indices)
values, indices = values.max(0)
print(values, indices)
values, indices = values.max(0)
print(values, indices)
tensor([[1.9918, 1.6041, 2.6535, ..., 1.5768, 1.7320, 1.8234],
[1.6700, 2.4574, 1.8548, ..., 1.8770, 1.7674, 1.6194],
[1.8361, 1.6800, 1.8982, ..., 1.7983, 2.7109, 2.2166],
...,
[2.7439, 1.6215, 2.9740, ..., 1.7031, 1.4445, 1.6681],
[1.9437, 1.4507, 1.8551, ..., 2.5853, 1.9753, 2.4046],
[1.4198, 2.5250, 1.8949, ..., 3.2618, 2.8547, 2.0487]]) tensor([[ 4, 7, 21, ..., 27, 28, 17],
[16, 27, 18, ..., 29, 30, 19],
[ 6, 16, 14, ..., 22, 24, 29],
...,
[16, 16, 8, ..., 21, 27, 22],
[15, 0, 0, ..., 9, 12, 3],
[30, 14, 9, ..., 23, 20, 14]])
tensor([3.2089, 4.1386, 3.2650, 3.3497, 4.4210, 3.0439, 3.5144, 3.2356, 3.3058,
3.2702, 2.9981, 3.6997, 3.1719, 3.4962, 3.0889, 3.6220, 3.9256, 4.1314,
3.0804, 3.3636, 3.5517, 3.2052, 3.6548, 3.7064, 3.6531, 4.5144, 3.1287,
4.1465, 3.1906, 3.1493, 3.1996, 3.6754, 3.7610, 3.5968, 3.2109, 3.6037,
3.2799, 3.0069, 3.0386, 3.0240, 3.5372, 3.6539, 3.5571, 3.2047, 3.1218,
4.2479, 3.1230, 3.0372, 3.0258, 3.8679, 3.6409, 3.0938, 3.1246, 2.9426,
4.0824, 3.8124, 3.4226, 3.3459, 4.1600, 3.6566, 3.0351, 3.3969, 3.5842,
3.0997]) tensor([17, 21, 30, 62, 62, 63, 43, 31, 45, 63, 20, 4, 58, 23, 22, 43, 54, 30,
15, 28, 13, 4, 4, 28, 6, 52, 53, 19, 33, 20, 3, 1, 14, 40, 0, 0,
46, 62, 58, 45, 28, 50, 4, 55, 25, 5, 21, 16, 27, 32, 10, 19, 38, 30,
48, 27, 20, 9, 2, 39, 55, 58, 32, 6])
tensor(4.5144) tensor(25)
That was dimension by dimension. More simply,
m = values.max()
will give you the max value, and
a = torch.argmax(values)
idx = np.unravel_index(a, values.shape)
(with numpy imported as np) will give you its index.
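For instance, to locate the global maximum of the original 5-D tensor in one step (a sketch, where data stands in for the [1, 1, 32, 64, 64] tensor from the question):

```python
import numpy as np
import torch

data = torch.randn(1, 1, 32, 64, 64)  # stand-in for the tensor in the question

# torch.argmax with no dim argument returns an index into the flattened
# tensor; np.unravel_index converts it back into a 5-D coordinate.
flat_idx = torch.argmax(data)
coord = np.unravel_index(flat_idx.item(), data.shape)
print(coord)  # e.g. (0, 0, 17, 5, 42)

# Indexing with that coordinate recovers the maximum value.
assert data[coord] == data.max()
```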
CodePudding user response:
I am not sure what you mean by prediction here (usually predictions are made on vectors, or on a tensor of shape batch x N), so I can't tell you what you should do, but I will try to explain why you get zeros everywhere.
As you mentioned, the channel dimension has size 1, so every argmax along it is 0: there is only one value, at position 0, to compare. All the argmax results being 0 therefore makes sense.
Sigmoid is a monotonic function, so it preserves the ordering of the values and won't change the argmax result.
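The monotonicity point is easy to verify on a toy tensor:

```python
import torch

x = torch.randn(4, 10)  # a toy batch of 4 score vectors

# Sigmoid is strictly increasing, so it preserves the ordering of
# elements along any dimension -- argmax before and after must agree.
before = torch.argmax(x, dim=1)
after = torch.argmax(torch.sigmoid(x), dim=1)
print(torch.equal(before, after))  # True
```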