I am training a model to recognize different Lego parts. When I train the model on Google Teachable Machine and try the sample objects, it predicts them accurately 100% of the time. However, when I upload the same model to my React Native app and run it through Expo Go on my phone, the predictions are wrong almost all of the time.
I think it has to do with the tensor image, but I am not sure.
My model can be found here: https://teachablemachine.withgoogle.com/models/NSTiRzrtZ/
[Image: accurate part prediction on Google Teachable Machine.] When I take a picture of the green piece on my phone, it predicts the red piece. The prediction order is grey, tan, red, green.
My code:
import React, {useRef, useState, useEffect} from 'react';
import { View, StyleSheet, Dimensions, Pressable, Modal, Text, ActivityIndicator } from 'react-native';
import * as MediaLibrary from 'expo-media-library';
import { getModel, convertBase64ToTensor, startPrediction } from '../../helpers/tensor-helper';
import {cropPicture} from '../../helpers/image-helper';
import {Camera} from 'expo-camera';
// import { Platform } from 'react-native';
import * as tf from "@tensorflow/tfjs";
import { cameraWithTensors } from '@tensorflow/tfjs-react-native';
import {bundleResourceIO, decodeJpeg} from '@tensorflow/tfjs-react-native';
const initialiseTensorflow = async () => {
  await tf.ready();
  tf.getBackend();
};
const TensorCamera = cameraWithTensors(Camera);
const modelJson = require('../../model/model.json');
const modelWeights = require('../../model/weights.bin');
const modelMetaData = require('../../model/metadata.json');
const RESULT_MAPPING = ['grey', 'tan', 'red', 'green'];
const CameraScreen = () => {
  const [hasCameraPermission, setHasCameraPermission] = useState();
  const [hasMediaLibraryPermission, setHasMediaLibraryPermission] = useState();
  const [isProcessing, setIsProcessing] = useState(false);
  const [presentedShape, setPresentedShape] = useState('');

  useEffect(() => {
    (async () => {
      const cameraPermission = await Camera.requestCameraPermissionsAsync();
      const mediaLibraryPermission = await MediaLibrary.requestPermissionsAsync();
      setHasCameraPermission(cameraPermission.status === "granted");
      setHasMediaLibraryPermission(mediaLibraryPermission.status === "granted");
      // load model
      await initialiseTensorflow();
    })();
  }, []);

  if (hasCameraPermission === undefined) {
    return <Text>Requesting permissions...</Text>;
  } else if (!hasCameraPermission) {
    return <Text>Permission for camera not granted. Please change this in settings.</Text>;
  }
  let frame = 0;
  const computeRecognitionEveryNFrames = 60;

  const handleCameraStream = async (images: IterableIterator<tf.Tensor3D>) => {
    const model = await tf.loadLayersModel(
      bundleResourceIO(modelJson, modelWeights, modelMetaData),
    );
    const loop = async () => {
      // only run a prediction every N frames
      if (frame % computeRecognitionEveryNFrames === 0) {
        const nextImageTensor = images.next().value;
        if (nextImageTensor) {
          const tensor = nextImageTensor.reshape([1, 224, 224, 3]);
          const prediction = await startPrediction(model, tensor);
          console.log(prediction);
          tf.dispose([nextImageTensor]);
        }
      }
      frame += 1;
      frame = frame % computeRecognitionEveryNFrames;
      requestAnimationFrame(loop);
    };
    loop();
  };
  return (
    <View style={styles.container}>
      <Modal visible={isProcessing} transparent={true} animationType="slide">
        <View style={styles.modal}>
          <View style={styles.modalContent}>
            <Text>Your current shape is {presentedShape}</Text>
            {presentedShape === '' && <ActivityIndicator size="large" />}
            <Pressable
              style={styles.dismissButton}
              onPress={() => {
                setPresentedShape('');
                setIsProcessing(false);
              }}>
              <Text>Dismiss</Text>
            </Pressable>
          </View>
        </View>
      </Modal>
      <TensorCamera
        style={styles.camera}
        type={Camera.Constants.Type.back}
        onReady={handleCameraStream}
        resizeHeight={224}
        resizeWidth={224}
        resizeDepth={3}
        autorender={true}
        cameraTextureHeight={1920}
        cameraTextureWidth={1080}
      />
    </View>
  );
};
CodePudding user response:
You're doing
const tensor = nextImageTensor.reshape([1,224,224,3]);
which takes the image and just reshapes the tensor to the new shape, regardless of the actual pixels.
What you probably want to use is tf.image.resizeBilinear
to resize the image to the desired shape.
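For example, a minimal sketch, assuming nextImageTensor is the [height, width, 3] tensor coming from the TensorCamera stream and the model expects 224x224 input:
// resize the actual pixels to 224x224, then add a batch dimension -> [1, 224, 224, 3]
const resized = tf.image.resizeBilinear(nextImageTensor, [224, 224]);
const batched = resized.expandDims(0);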
EDIT:
To normalize pixel values from 0..255 to -1..1, you'd do something like
const normalized = input.cast('float32').div(127.5).sub(1);
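Putting the two together, the prediction branch in the question's loop might look like this; just a sketch, assuming the Teachable Machine image model expects a [1, 224, 224, 3] float input in the -1..1 range:
if (nextImageTensor) {
  // resize real pixels (not reshape), normalize to [-1, 1], and add a batch dimension
  const input = tf.tidy(() =>
    tf.image.resizeBilinear(nextImageTensor, [224, 224])
      .cast('float32')
      .div(127.5)
      .sub(1)
      .expandDims(0),
  );
  const prediction = await startPrediction(model, input);
  console.log(prediction);
  tf.dispose([nextImageTensor, input]);
}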