2D Matrix in a Convolutional Network


This may be a silly question, but I want to use a convolutional neural network in my deep reinforcement learning project and I ran into a problem I don't understand. I want to feed a 6x7 matrix into the network, which should be equivalent to a black-and-white picture of size 6x7 (42 pixels), right?

import torch
from torch import nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential()
        self.model.add_module("conv_1", torch.nn.Conv2d(in_channels=1, out_channels=16, kernel_size=4, stride=1))
        self.model.add_module("relu_1", torch.nn.ReLU())
        self.model.add_module("max_pool", torch.nn.MaxPool2d(2))
        self.model.add_module("conv_2", torch.nn.Conv2d(in_channels=16, out_channels=16, kernel_size=4, stride=1))
        self.model.add_module("relu_2", torch.nn.ReLU())
        self.model.add_module("flatten", torch.nn.Flatten())

        self.model.add_module("linear", torch.nn.Linear(in_features=16*16*16, out_features=7))

    def forward(self, x):
        x = self.model(x)
        return x

In conv_1, in_channels=1 because I only have one matrix (in image recognition terms, a single color channel). The other in_channels and out_channels values are fairly arbitrary until the linear layer. I have no idea where I should specify the size of the matrix, but the final output should have size 7, which I set in the linear layer.

The error I get is:

RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [6, 7]

CodePudding user response:

There are a few problems with your code. First, the reason you're getting that error message is that the CNN expects a tensor of shape (N, Cin, Hin, Win), where:

  • N is the batch size
  • Cin is the number of input channels
  • Hin is the input height in pixels
  • Win is the input width in pixels

You're only providing the height and width dimensions. You need to explicitly add a channel and a batch dimension, even if each of those dimensions has size 1:

model = CNN()

example_input = torch.randn(size=(6, 7))  # this is your input image

print(example_input.shape)  # torch.Size([6, 7])

# output = model(example_input)  # raises your original error

example_input = example_input.unsqueeze(0).unsqueeze(0)  # adds the batch and channel dimensions

print(example_input.shape)  # now torch.Size([1, 1, 6, 7])

output = model(example_input)  # the shape error is gone...

You'll notice, however, that you now get a different error:

RuntimeError: Calculated padded input size per channel: (1 x 2). Kernel size: (4 x 4). Kernel size can't be greater than actual input size

This is because after the first conv layer and the max-pooling layer, your data has spatial size 1x2, but the kernel size for the second conv layer is 4, which makes that convolution impossible. An input image of size 6x7 is quite small; either reduce the kernel size to something that fits, or use larger images.
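
To see where the 1x2 comes from, here's a quick sketch of the size arithmetic (the out_size helper is just for illustration; it assumes the default padding=0 and dilation=1):

# spatial size after a conv or pooling layer with no padding and no dilation:
#   out = floor((in - kernel) / stride) + 1
def out_size(in_size, kernel, stride=1):
    return (in_size - kernel) // stride + 1

h, w = 6, 7
h, w = out_size(h, 4), out_size(w, 4)        # conv_1, kernel 4            -> (3, 4)
h, w = out_size(h, 2, 2), out_size(w, 2, 2)  # max_pool, kernel 2, stride 2 -> (1, 2)
print(h, w)  # 1 2 -- a 4x4 kernel in conv_2 can't fit into a 1x2 input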

Here's a working example:

import torch
from torch import nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential()
        self.model.add_module(
            "conv_1",
            torch.nn.Conv2d(in_channels=1, out_channels=16, kernel_size=2, stride=1),
        )
        self.model.add_module("relu_1", torch.nn.ReLU())
        self.model.add_module("max_pool", torch.nn.MaxPool2d(2))
        self.model.add_module(
            "conv_2",
            torch.nn.Conv2d(in_channels=16, out_channels=16, kernel_size=2, stride=1),
        )
        self.model.add_module("relu_2", torch.nn.ReLU())
        self.model.add_module("flatten", torch.nn.Flatten())

        self.model.add_module("linear", torch.nn.Linear(in_features=32, out_features=7))

    def forward(self, x):
        x = self.model(x)
        return x


model = CNN()
x = torch.randn(size=(6, 7))
x = x.unsqueeze(0).unsqueeze(0)
output = model(x)
print(output.shape) # has shape (1, 7)

Note that I changed the kernel_size to 2 and the final linear layer now has an input size of 32. Also, the output has shape (1, 7); the 1 is the batch size, which in our case is just 1. If you want only the 7 output features, use output = torch.squeeze(output).
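
If you'd rather not work out the flattened size by hand, one common trick (just a sketch, not part of the original answer) is to push a dummy input through the convolutional part of the network and read the size off the result:

import torch
from torch import nn

# the convolutional part of the working example above
conv_part = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=2),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 16, kernel_size=2),
    nn.ReLU(),
    nn.Flatten(),
)

with torch.no_grad():
    n_features = conv_part(torch.zeros(1, 1, 6, 7)).shape[1]

print(n_features)  # 32 -- use this as in_features for the Linear layer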
