Why are the Bounding Boxes from Tesseract not aligned on the image text?-CodePudding

I'm using the tesseract R package to recognize text within an image file. However, when plotting the bounding box for a word, the coordinates don't seem to be right.

Why is the bounding box for the word "This" not aligned with the text "This" in the image?
Is there an easier way to plot all bounding box rectangles on the image?

library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

CodePudding user response：

This is simply because the x, y co-ordinates of images are counted from the top left, whereas rect counts from the bottom left. The image is 480 pixels tall, so we can do:

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = 480 - text$y1[1], 
  xright = text$x2[1], 
  ytop = 480 - text$y2[1])

Or, to show this generalizes:

plot(image)

rect(
  xleft = text$x1, 
  ybottom = magick::image_info(image)$height - text$y1, 
  xright = text$x2, 
  ytop = magick::image_info(image)$height - text$y2,
  border = sample(128, nrow(text)))