I'm new to computer vision model structures, and I'm using TensorFlow for Node.js (@tensorflow/tfjs-node) to make some models that detect objects. With the MobileNet and ResNet SSD models, which use the channels-last format, a tensor created with tf.node.decodeImage is channels last by default, e.g. shape: [1, 1200, 1200, 3] for 3 channels, and the prediction data works great: the models recognize objects.
But a model built in PyTorch, converted to ONNX and then to the protobuf (PB) format, produces a saved_model.pb that expects the channels-first format, e.g. shape: [1, 3, 1200, 1200].
Now I need to create a tensor from an image, but in channels-first format. I found many examples of creating conv1d or conv2d layers with dataFormat: 'channelsFirst', but I don't know how to apply that to image data. Here is the API: https://js.tensorflow.org/api/latest/#layers.conv2d
Here is the tensor code:
const tf = require('@tensorflow/tfjs-node');
let imgTensor = tf.node.decodeImage(new Uint8Array(subBuffer), 3);
imgTensor = imgTensor.cast('float32').div(255);
imgTensor = imgTensor.expandDims(0); // add a leftmost batch axis of size 1
console.log('tensor', imgTensor);
This gives me a channels-last shape that is not compatible with the model's channels-first shape:
tensor Tensor {
kept: false,
isDisposedInternal: false,
shape: [ 1, 1200, 1200, 3 ],
dtype: 'float32',
size: 4320000,
strides: [ 4320000, 3600, 3 ],
dataId: {},
id: 7,
rankType: '4',
scopeId: 4
}
I know about tf.reshape, but it only reinterprets the underlying data buffer without moving any values into the new axis order, so the reshaped tensor gives useless prediction results. I don't know what I'm missing.
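For example, a tiny 2x2 RGB image shows the problem (a toy check with made-up pixel values, just for illustration):
const tf = require('@tensorflow/tfjs-node');
// NHWC tensor of shape [1, 2, 2, 3]: four pixels, each with R, G, B values
const toy = tf.tensor4d([[
  [[1, 2, 3], [4, 5, 6]],
  [[7, 8, 9], [10, 11, 12]],
]]);
toy.reshape([1, 3, 2, 2]).print();
// The first "channel" plane comes out as [[1, 2], [3, 4]], a mix of
// R, G and B values from different pixels, rather than the red plane
// [[1, 4], [7, 10]] that a channels-first model expects.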
Answer:
You can transpose the axes from NHWC (channels last) to NCHW (channels first) with tf.transpose:
const nchw = tf.transpose(nhwc, [0, 3, 1, 2]);
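Applied to the pipeline in the question, a minimal sketch could look like this (assuming subBuffer holds the encoded image bytes, as in the original code):
const tf = require('@tensorflow/tfjs-node');
// Decode to an NHWC tensor of shape [height, width, 3]
let imgTensor = tf.node.decodeImage(new Uint8Array(subBuffer), 3);
// Normalize pixel values to [0, 1]
imgTensor = imgTensor.cast('float32').div(255);
// Add the leftmost batch axis: [1, height, width, 3]
imgTensor = imgTensor.expandDims(0);
// Permute the axes to channels first: [1, 3, height, width]
const nchw = tf.transpose(imgTensor, [0, 3, 1, 2]);
console.log(nchw.shape); // [1, 3, 1200, 1200] for a 1200x1200 image
Unlike reshape, transpose actually moves the data into the new axis order, so every pixel's channel values end up in the plane the model expects.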