When training my CNN image classifier using PyTorch I noticed a ~20 % difference in accuracy when using a batch size of 4 vs 32. What might be causing such drastic differences?
batch_size 4
100%|██████████| 10/10 [02:50<00:00, 17.04s/it, TestAcc=71%, TrainAcc=74%, loss=0.328]
batch_size 32
100%|██████████| 10/10 [02:38<00:00, 15.85s/it, TestAcc=53%, TrainAcc=57%, loss=0.208]
Model:
class Net(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 5 * 5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, num_classes)
def forward(self, x):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = torch.flatten(x, 1) # flatten all dimensions except batch
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
CodePudding user response:
You can try to adjust your learning rate too.
With a larger batch size you should also use a larger learning rate.
This article has additional explanations for the relation of learning rate and batch size.