PyTorch error when running on multiple GPUs

Time:11-14

In PyTorch, the model runs on a single GPU without error. With the following code, running on multiple GPUs (2) raises an error:
 
model = get_instance_segmentation_model(num_classes)

if torch.cuda.device_count() > 1:
    print("we use", torch.cuda.device_count(), "GPUs!")
    model = torch.nn.DataParallel(model)
# move the model to the right device
model.to(device)


The error occurs at:
 loss_dict = model(images, targets)

I printed images (a list; I set the batch size to 1). The shape of images[0] is [3, 255, 255]. The error message is as follows:
Exception has occurred: RuntimeError
Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/TJMT/local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/TJMT/local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/TJMT/local/lib/python3.6/site-packages/torchvision/models/detection/generalized_rcnn.py", line 66, in forward
    images, targets = self.transform(images, targets)
  File "/home/TJMT/local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/TJMT/local/lib/python3.6/site-packages/torchvision/models/detection/transform.py", line 46, in forward
    image = self.normalize(image)
  File "/home/TJMT/local/lib/python3.6/site-packages/torchvision/models/detection/transform.py", line 66, in normalize
    return (image - mean[:, None, None]) / std[:, None, None]
RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 0
  File "/home/TJMT/tjmt_new/notebook/cx/MaskRcnn-torch-test/engine.py", line 52, in train_one_epoch
    loss_dict = model(images, targets)
  File "/home/TJMT/tjmt_new/notebook/cx/MaskRcnn-torch-test/torch-object-detection-fudan.py", line 189, in train
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
  File "/home/TJMT/tjmt_new/notebook/cx/MaskRcnn-torch-test/torch-object-detection-fudan.py", line 277, in main
    train()
  File "/home/TJMT/tjmt_new/notebook/cx/MaskRcnn-torch-test/torch-object-detection-fudan.py", line 284, in <module>
    main()
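
One possible reading of the sizes in the error message, offered as a sketch and not as a confirmed diagnosis: torch.nn.DataParallel splits tensor inputs along dimension 0, in the same way torch.chunk does. Since images is a list of [3, 255, 255] tensors and not a single batched tensor, dimension 0 of each tensor is the channel dimension, so with 2 GPUs the 3 channels would be split into pieces of 2 and 1. Replica 0 would then get a [2, 255, 255] image, which cannot be combined with mean[:, None, None] of shape [3, 1, 1]. The pure-Python snippet below only reproduces that arithmetic; the helper names chunk_sizes and broadcastable are hypothetical and are not part of PyTorch.

```python
import math


def chunk_sizes(length, chunks):
    """Sizes obtained when a dimension of `length` is split into at most
    `chunks` pieces of size ceil(length / chunks), as torch.chunk does.
    (Hypothetical helper for illustration, not a PyTorch function.)"""
    step = math.ceil(length / chunks)
    sizes = []
    remaining = length
    while remaining > 0:
        sizes.append(min(step, remaining))
        remaining -= step
    return sizes


def broadcastable(a, b):
    """True if two sizes of the same dimension are compatible under
    broadcasting: they are equal, or one of them is 1."""
    return a == b or a == 1 or b == 1


num_gpus = 2
image_shape = (3, 255, 255)  # shape of images[0] from the post
mean_shape = (3, 1, 1)       # shape of mean[:, None, None]

# Assumption: each image tensor is split along dim 0 across the GPUs.
per_replica_channels = chunk_sizes(image_shape[0], num_gpus)
print("channels per replica:", per_replica_channels)

for replica, channels in enumerate(per_replica_channels):
    ok = broadcastable(channels, mean_shape[0])
    print(f"replica {replica}: image dim 0 = {channels}, "
          f"mean dim 0 = {mean_shape[0]}, compatible = {ok}")
```

If this reading is right, the 2-versus-3 mismatch on replica 0 comes from how the list of images is scattered, not from the images themselves. One direction that is often suggested for detection models is to run one process per GPU with torch.nn.parallel.DistributedDataParallel so that each replica receives whole images; that has not been tested against this code.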


I print