How can I improve recall value in Deep learning?


I'm using VGG16 as a feature extractor, but my instructor requires a high recall (90%-95%). About my dataset: the data are labeled videos of traffic signs in foggy weather (labeled as visible, not visible, poor visibility). I extracted frames from the videos as images and randomly split them into training and test/validation folders, and I'm applying deep learning to classify the images. As you can see, my model is doing okay but not well enough, and I can't get more videos from my instructor.

  • How can I possibly improve my model?

  • Can I add more layers to my model?

  • Can I feed the feature-extraction base model into a Conv2D model that I create?

  • Can I feed the features extracted by VGG16 into a transfer-learning setup?

  • How can I feed VGG16 feature extraction into an SVM?


    import tensorflow as tf
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    from tensorflow.keras.callbacks import EarlyStopping

    BATCH = 50
    IMG_WIDTH = 224
    IMG_HEIGHT = 224

    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=(IMG_HEIGHT, IMG_WIDTH, 3))  # size of the input images

    conv_base.trainable = False   # freeze the VGG16 convolutional base
    
    
    datagen = ImageDataGenerator(rescale=1.0/255.0                  
                                      #  ,brightness_range=(1,1.5),
                                      #  zoom_range=0.1,
                                      #  rotation_range=45,
                                      #  horizontal_flip=True,
                                      #  vertical_flip=True,
    )
    
    train = datagen.flow_from_directory(train_path
                                                ,class_mode='categorical'
                                                ,batch_size = BATCH
                                                ,target_size=(IMG_HEIGHT, IMG_WIDTH))
    
    # test / validation data
    val = datagen.flow_from_directory(val_path
                                                ,class_mode='categorical'
                                                ,batch_size = BATCH
                                                ,target_size=(IMG_HEIGHT, IMG_WIDTH))
    
    
    
    
    model = tf.keras.models.Sequential()
    #We now add the vggModel directly to our new model
    model.add(conv_base)
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(129, activation='relu'))
    model.add(tf.keras.layers.Dropout((0.5)))
    model.add(tf.keras.layers.Dense(5, activation='softmax'))
    
    
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001)
                   ,loss='categorical_crossentropy'
                   , metrics=["accuracy",
                               tf.keras.metrics.Precision(),
                               tf.keras.metrics.Recall()]
                    )
    
    early_stopping = EarlyStopping(monitor='val_loss'
                                  ,patience=2
                                  )
    history_1 = model.fit(train
                           ,validation_data=val
                           ,epochs=10
                           ,steps_per_epoch=len(train)
                           ,validation_steps=len(val)
                           ,verbose=1
                           ,callbacks =[early_stopping]
                            )
    
    
    Training loss       : 0.5572120547294617
    Training accuracy   : 0.8088889122009277
    Training precision: 0.9959514141082764
    Training recall: 0.437333345413208
    Test loss       : 0.5427007079124451
    Test accuracy   : 0.8233333230018616
    Test precision: 1.0
    Test recall: 0.44333332777023315

CodePudding user response:

Recognition rates depend on many variables, not only the noise in the image. Even when our own recognition is around 80 percent, we also use extra information in the recognition process. You talked about traffic signs and foggy images: both locals and foreigners can recognize a sign they cannot see clearly, because they already know the signs from a driving-instruction guidebook.

  • How can I possibly improve my model?

    You do not specify the model in detail. I use a small, simple model working with camera input: I use grids and multiple outputs, treated as a sequence, to help determine the label for the input, e.g. whether object 1 is present is read from { P( Object | Grid 1 ), P( Object | Grid 2 ), P( Object | Grid 3 ) ... }.

  • Can I add more layers to my model?

    That depends on your application in practice. A camera input does not carry much information compared with the variety you are trying to cover, so you may try cascading models or using multiple resolutions when extracting features. VGG16 is a series of convolution layers, and the path you concatenate onto it is your application on top of it. I use only a few dense layers together with convolution layers, because my input images are small.

  • Can I feed the feature-extraction base model into a Conv2D model that I create?

    The same answer: convolution layers are the feature-extracting functions, but you can also convert the input with time-related transforms. Those are not limited to sound or frequency data; something like MFCC feature extraction can serve as extended information to improve recognition rates, alongside masking the sign region or concatenating models.

  • Can I feed the features extracted by VGG16 into a transfer-learning setup?

    The same answer as above.

  • How can I feed VGG16 feature extraction into an SVM?

    The same answer: VGG16 is a stack of convolution layers, so you can take its output features and feed them into another classifier, or concatenate models or concatenate inputs; see the sketch after this list.
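
As a concrete illustration of the VGG16-to-SVM question, here is a minimal sketch that reuses the frozen conv_base and the train/val generators from the question and trains a scikit-learn SVC on the flattened VGG16 features (the SVC and its parameters are an assumption, not something from the original answer):

    # Minimal sketch (assumed): frozen VGG16 features -> scikit-learn SVM.
    # Reuses conv_base, train and val from the question's code.
    import numpy as np
    from sklearn.svm import SVC

    def extract_features(generator, steps):
        feats, labels = [], []
        for _ in range(steps):
            x_batch, y_batch = next(generator)
            f = conv_base.predict(x_batch)              # (batch, 7, 7, 512) for 224x224 inputs
            feats.append(f.reshape(len(f), -1))         # flatten each feature map to one vector
            labels.append(np.argmax(y_batch, axis=1))   # one-hot labels -> class indices
        return np.concatenate(feats), np.concatenate(labels)

    x_train, y_train = extract_features(train, len(train))
    x_val, y_val = extract_features(val, len(val))

    svm = SVC(kernel='rbf', class_weight='balanced')    # 'balanced' can help recall on rare classes
    svm.fit(x_train, y_train)
    print('validation accuracy:', svm.score(x_val, y_val))

The flattened features are 25,088-dimensional, so the SVM can be slow to train; pooling the VGG16 output (for example with a global average) before the SVM is a common way to shrink them.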

Sample: a working template. The matplotlib animation below is driven by OS response callbacks, so it does not need explicit time delays.

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Functions
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
def f1( picture ):
    return tf.constant( picture ).numpy()

def animate( i ):
    ret0, frame0 = video_capture_0.read()
    if (ret0):      
        
        frame0 = tf.image.resize(frame0, [29, 39]).numpy()
        
        temp = img_array = tf.keras.preprocessing.image.img_to_array(frame0[:,:,2:3])
        temp2 = img_array = tf.keras.preprocessing.image.img_to_array(frame0[:,:,1:2])
        temp3 = img_array = tf.keras.preprocessing.image.img_to_array(frame0[:,:,0:1])

        temp = tf.keras.layers.Concatenate(axis=2)([temp, temp2])
        temp = tf.keras.layers.Concatenate(axis=2)([temp, temp3])
        # 480, 640
        temp = tf.keras.preprocessing.image.array_to_img(
            temp,
            data_format=None,
            scale=True
        )
        temp = f1( temp )
        
        im.set_array( temp )
        result = predict_action( temp )
        print( result )
    return im,

def predict_action ( image ) :
    predictions = model.predict(tf.constant(image, shape=(1, 29, 39, 3) , dtype=tf.float32))
    result = tf.math.argmax(predictions[0])
    return result

Output: a simple camera feed with object classification. I wrote this simple code with a small model (no Object Detection API); there are more examples in the linked Sample.

CodePudding user response:

You stated:

my data are labeled videos of traffic signs in foggy weather (they are labeled as visible, not visible, poor visibility)

From this I assume you have 3 classes. However, in your model you have the code

    model.add(tf.keras.layers.Dense(5, activation='softmax'))

which implies you have 5 classes?
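
If the dataset really has the three classes named in the question, the final layer should match them, for example (an illustrative correction, not code from the original answer):

    model.add(tf.keras.layers.Dense(3, activation='softmax'))  # one output unit per class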
In model.fit you have the code

    ,steps_per_epoch=len(train) ,validation_steps=len(val)

In your generators you set the batch size to 50, in which case you should have steps_per_epoch=int(len(train)/50) and validation_steps=int(len(val)/50).

You are using the early stopping callback, but you should add the parameter restore_best_weights=True and also change patience to 4.

This way your model will be set to the weights from the epoch with the lowest validation loss. I also recommend you use the Keras callback ReduceLROnPlateau. Documentation is [here.][1] My recommended code for this callback is:

    rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.4, patience=2, verbose=1)

In model.fit change the callbacks to callbacks=[early_stopping, rlronp].

Also run for more epochs.
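
Putting these suggestions together, a minimal sketch of the revised callbacks and fit call could look as follows (the epoch count of 30 is an illustrative choice, and steps_per_epoch/validation_steps are simply omitted so Keras infers them from the generators):

    early_stopping = EarlyStopping(monitor='val_loss',
                                   patience=4,
                                   restore_best_weights=True)    # keep the best-epoch weights

    rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                                  factor=0.4,
                                                  patience=2,
                                                  verbose=1)      # cut the LR when val_loss stalls

    history_1 = model.fit(train,
                          validation_data=val,
                          epochs=30,     # run longer; early stopping ends training when needed
                          verbose=1,
                          callbacks=[early_stopping, rlronp])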
   



  [1]: https://keras.io/api/callbacks/reduce_lr_on_plateau/