I am trying to convert a model that uses Flatten/Linear as the final layers to use global pooling with AdaptiveAvgPool1d/Linear. The output shape of the Linear layer after the global pooling breaks training, and I get the following error:
ValueError: operands could not be broadcast together with shapes (64,4) (64,)
Model with Flatten --> Linear (works)
conv1d --> relu --> maxpool1d --> Flatten --> Linear:
model = nn.Sequential(
nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
nn.ReLU(),
nn.MaxPool1d(kernel_size=2, stride=2),
nn.Flatten(),
nn.LazyLinear(n_classes)
)
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
Sequential -- --
├─Conv1d: 1-1 [64, 32, 505] 4,128
├─ReLU: 1-2 [64, 32, 505] --
├─MaxPool1d: 1-3 [64, 32, 252] --
├─Flatten: 1-4 [64, 8064] --
├─Linear: 1-5 [64, 4] 32,260
==========================================================================================
Model with AdaptiveAvgPool1d --> Linear (output dimension wrong)
I want the output of this implementation to match that of the previous one, where the output shape coming out of the Linear layer is [64, 4]:
model = nn.Sequential(
nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
nn.ReLU(),
nn.MaxPool1d(kernel_size=2, stride=2),
nn.AdaptiveAvgPool1d(1),
nn.LazyLinear(n_classes)
)
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
Sequential -- --
├─Conv1d: 1-1 [64, 32, 505] 4,128
├─ReLU: 1-2 [64, 32, 505] --
├─MaxPool1d: 1-3 [64, 32, 252] --
├─AdaptiveAvgPool1d: 1-4 [64, 32, 1] --
├─Linear: 1-5 [64, 32, 4] 8
==========================================================================================
Answer:
You can't simply replace nn.Flatten with nn.AdaptiveAvgPool1d because they don't do the same thing. nn.AdaptiveAvgPool1d(1) leaves the output as [64, 32, 1], and nn.Linear operates on the last dimension only, so the LazyLinear initializes with in_features=1 and produces [64, 32, 4]. You still need an nn.Flatten() after nn.AdaptiveAvgPool1d so the Linear layer receives [64, 32] and outputs [64, 4].
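A minimal sketch of the corrected model (n_classes = 4 and the input length of 8192 are assumptions inferred from the [64, 4] target shape and the Conv1d output of 505 in your summary):

import torch
import torch.nn as nn

n_classes = 4  # assumed from the [64, 4] output in your summary

model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2, stride=2),
    nn.AdaptiveAvgPool1d(1),   # [64, 32, 252] -> [64, 32, 1]
    nn.Flatten(),              # [64, 32, 1]   -> [64, 32]
    nn.LazyLinear(n_classes),  # [64, 32]      -> [64, 4]
)

x = torch.randn(64, 1, 8192)   # hypothetical input; length chosen so Conv1d yields 505
print(model(x).shape)          # torch.Size([64, 4])

Note that the LazyLinear now sees 32 input features (one per channel) instead of 8064, so the head has far fewer parameters than the Flatten --> Linear version, which is the usual point of global pooling.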