Real time object detection lag

Time:08-18

I'm trying to capture the position of a license plate from a webcam feed using YOLOv4-tiny, then feed the result to easyOCR to extract the characters. The detection works well in real time, but when I apply the OCR the webcam stream becomes really laggy. Is there any way I can improve this code to make it less laggy?

My YOLOv4 detection code:

#detection
while 1:
    #_, pre_img = cap.read()
    #pre_img = cv2.resize(pre_img, (640, 480))
    _, img = cap.read()
    #img = cv2.flip(pre_img, 1)
    height, width, _ = img.shape
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (416, 416), (0, 0, 0), swapRB=True, crop=False)

    net.setInput(blob)

    output_layers_name = net.getUnconnectedOutLayersNames()

    layerOutputs = net.forward(output_layers_name)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.7:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .5, .4)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .8, .4)
    font = cv2.FONT_HERSHEY_PLAIN
    colors = np.random.uniform(0, 255, size=(len(boxes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[i]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            # detection = cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            detected_image = img[y:y + h, x:x + w]
            cv2.putText(img, label + " " + confidence, (x, y + 400), font, 2, color, 2)
            #print(detected_image)
            cv2.imshow('detection', detected_image)

            cv2.imwrite('lp5.jpg', detected_image)
            cropped_image = cv2.imread('lp5.jpg')
            cv2.waitKey(5000)
            print("system is waiting")
            result = OCR(cropped_image)
            print(result)

My easyOCR function:

def OCR(cropped_image):
    reader = easyocr.Reader(['en'], gpu=False)  # what the reader expects from the image
    result = reader.readtext(cropped_image)
    text = ''
    for res in result:
        text += res[1] + ' '

    spliced = remove(text)
    return spliced

CodePudding user response:

There are several points to address.

  1. cv2.waitKey(5000) in your loop blocks for up to five seconds per detection (less if a key is pressed). Remove it unless you are debugging.

  2. You are saving each detected region to a JPEG file and loading it back every time. Don't do that; pass the OpenCV image (a NumPy array) straight into the OCR module.

  3. EasyOCR is a DNN model based on ResNet, but you are not using a GPU (gpu=False). If you have one, use it. See this benchmark by Liao.

  4. You are creating a new EasyOCR Reader instance on every loop iteration. Create one instance before the loop and reuse it inside the loop. I think this is the most important bottleneck.
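Putting points 1, 2, and 4 together, a minimal sketch (the function names and surrounding structure are my illustration, not a drop-in replacement for your exact loop):

```python
def make_reader():
    import easyocr  # assumes easyocr is installed; imported lazily for this sketch
    # Point 4: build the Reader ONCE, before the loop.
    # Point 3: set gpu=True here if a GPU is available.
    return easyocr.Reader(['en'], gpu=False)

def ocr_plate(reader, plate_img):
    # Point 2: pass the cropped NumPy array directly -- no imwrite/imread round-trip.
    # Point 1: no cv2.waitKey(5000) pause anywhere in the loop.
    return ' '.join(r[1] for r in reader.readtext(plate_img))
```

In the detection loop you would then call `ocr_plate(reader, detected_image)`, with the reader created once at startup.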

CodePudding user response:

You are essentially saying "the while loop must be fast," and of course the OCR() call is slow. Good, that is the right framing.

Don't call OCR() from within the loop.

Rather, enqueue a request, and let another thread / process / host worry about the OCR computation, while the loop quickly continues upon its merry way.

You could use a threaded Queue, or a subprocess, or blast it over to RabbitMQ or Kafka. The simplest approach would be to simply overwrite /tmp/cropped_image.png within the loop, and have another process notice such updates and (slowly) call OCR(), appending the results to a log file.
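The threaded-queue variant can be sketched with just the standard library (`slow_ocr` and `submit` are illustrative names; `slow_ocr` stands in for the real OCR() call):

```python
import queue
import threading

def slow_ocr(image):
    # Stand-in for the real, slow OCR() call.
    return "ABC123"

ocr_queue = queue.Queue(maxsize=1)  # hold at most one pending crop

def ocr_worker():
    while True:
        image = ocr_queue.get()
        if image is None:           # sentinel: shut down the worker
            break
        print(slow_ocr(image))      # or append the result to a log file
        ocr_queue.task_done()

threading.Thread(target=ocr_worker, daemon=True).start()

def submit(cropped):
    # Called from the detection loop: never block the video thread.
    try:
        ocr_queue.put_nowait(cropped)
    except queue.Full:
        pass                        # shed load: drop redundant crops
```

The `maxsize=1` plus `put_nowait` is what implements the load-shedding: if OCR is still busy, new crops are silently dropped instead of piling up.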

There might be a couple of updates to the image file while a single OCR call is in progress, and that's fine. The two are decoupled from one another, each progressing at their own pace. Downside of a queue would be OCR sometimes falling behind -- you actually want to shed load by skipping some (redundant) cropped images.


The two are racing, and that's fine. But take care to do things in atomic fashion -- you wouldn't want to OCR an image that starts with one frame and ends with part of a subsequent frame. Write to a temp file and, after close(), use os.rename() to atomically make those pixels available under the name that the OCR daemon will read from. Once it has a file descriptor open for read, it will have no problem reading to EOF without interference, the kernel takes care of that for us.
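The atomic handoff can be sketched like this (`publish_atomically` is an illustrative name; in the real loop you would `cv2.imwrite` the crop to the temp path instead of writing raw bytes):

```python
import os
import tempfile

def publish_atomically(data: bytes, final_path: str):
    # Write to a temp file in the SAME directory, then rename it into place.
    # A reader opening final_path always sees a complete file, never a torn one.
    directory = os.path.dirname(final_path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
    os.replace(tmp_path, final_path)  # atomic; plain os.rename() also works on POSIX
```

The temp file must live on the same filesystem as the destination, otherwise the rename degrades to a copy and loses its atomicity.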
