Note: I have already seen similar questions (one about the same error, another about telling torch not to use the GPU), but their answers do not work for me.
I have installed PyTorch version 1.13.0+cu117 (the latest), and the code structure is as follows (an image classification task):
# os.environ["CUDA_VISIBLE_DEVICES"]="" # required?
device = torch.device("cpu") # use CPU
...
train_set = DataLoader(
    torchvision.datasets.ImageFolder(path, transform), **kwargs
)
...
model = myCNN().to(device)
optimizer = SGD(args)
loss = CrossEntropyLoss()
train()
I want to train on the CPU.
For the dataloader, in accordance with this recommendation, I've set pin_memory=True and non_blocking=pin_memory. The error persists even after setting pin_memory=False.
The training loop has the following structure:
for epoch in range(n_epochs):
    model.train()
    for inputs, labels in train_set:
        inputs, labels = inputs.to(device, non_blocking=non_blocking), labels.to(device, non_blocking=non_blocking)
        # Compute loss, back-propagate
The error traceback (on calling train()):
Traceback (most recent call last):
  File "code.py", line 233, in <module>
    train()
  File "code.py", line 122, in train
    outputs = model(inputs)
  File "...\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "code.py", line 87, in forward
    output = self.network(input)
  File "...\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "...\torch\nn\modules\container.py", line 204, in forward
    input = module(input)
  File "...\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "...\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "...\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
Edit: There was a comment regarding possible issues due to the model. The model is roughly:
class myCNN(nn.Module):
    def __init__(self, ...other args...):
        super().__init__()
        self.network = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size),
            ... similar convolutional layers ...
            nn.Flatten(),
            nn.Linear(in_features, out_features)
        )

    def forward(self, input):
        output = self.network(input)
        return output
Since I have transferred both the model and the data to the same device, what could be the reason for this error, and how can I correct it?
Answer:
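The issue was due to incorrect usage of summary from torchinfo. It does a forward pass (if an input size is provided), and by default the device is selected on the basis of torch.cuda.is_available(). On a machine where CUDA is available, the model therefore ends up on the GPU after the summary call, while the inputs in the training loop are still sent to the CPU, which matches the type mismatch reported in the traceback.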
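A quick way to confirm this kind of mismatch is to check where the model's parameters live immediately before the forward pass. The following is only a minimal sketch: parameter_devices is a hypothetical helper (not part of PyTorch), and the small nn.Sequential is a stand-in for myCNN. It relies on the documented fact that nn.Module.to() modifies the module in place, so any utility that calls it changes the model it was handed.

```python
import torch
from torch import nn


def parameter_devices(model: nn.Module) -> set:
    """Return the device types that hold the model's parameters and buffers.

    Hypothetical helper for debugging; not a PyTorch API.
    """
    tensors = list(model.parameters()) + list(model.buffers())
    return {t.device.type for t in tensors}


# Stand-in for myCNN (assumption: the real model is larger).
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU())
model = model.to(torch.device("cpu"))
inputs = torch.randn(1, 3, 32, 32)  # created on the CPU

# Check right before the forward pass: if something moved the model
# in the meantime, this fails with a readable message instead of
# the conv2d type-mismatch error.
devices = parameter_devices(model)
if devices != {inputs.device.type}:
    raise RuntimeError(
        f"model is on {devices}, inputs are on {inputs.device.type}"
    )

outputs = model(inputs)
```

Placing such a check just before outputs = model(inputs) in train() would point at whichever call between model creation and training moved the model.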
If the device argument (as specified in the question) is passed to summary, training works fine.
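As a rough illustration of the fix, the sketch below passes the same CPU device to summary that the model was created on. The layer sizes and the (1, 3, 32, 32) input shape are assumptions chosen for the example, not values from the question, and the default device behaviour of summary may differ between torchinfo versions. The import is wrapped in try/except only so the snippet still runs where torchinfo is not installed.

```python
import torch
from torch import nn

try:
    from torchinfo import summary  # third-party package: pip install torchinfo
except ImportError:
    summary = None

device = torch.device("cpu")

# Stand-in for myCNN; the sizes are assumptions for this sketch.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 10),
).to(device)

if summary is not None:
    # Pass the device explicitly so summary does not choose one itself.
    # input_size includes the batch dimension.
    summary(model, input_size=(1, 3, 32, 32), device=device)

# The parameters should still be on the CPU after the summary call.
param_devices = {p.device.type for p in model.parameters()}
print(param_devices)
```

An alternative with the same effect would be to call model.to(device) again after summary, but passing device keeps the model where it was in the first place.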