I am using the Google Cloud Vision API to detect objects in images. The API returns a response in the following form, with an array of normalized vertices for each detected object. However, I only need the four points for a RectF. I searched before posting here but couldn't find a proper solution.
{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01c648",
          "name": "Laptop",
          "score": 0.885833,
          "boundingPoly": {
            "normalizedVertices": [
              { "x": 0.16581687, "y": 0.5996421 },
              { "x": 0.5108573, "y": 0.5996421 },
              { "x": 0.5108573, "y": 0.9928019 },
              { "x": 0.16581687, "y": 0.9928019 }
            ]
          }
        },
        {
          "mid": "/m/04brg2",
          "name": "Tableware",
          "score": 0.8071477,
          "boundingPoly": {
            "normalizedVertices": [
              { "x": 0.61909163, "y": 0.8264213 },
              { "x": 0.7196966, "y": 0.8264213 },
              { "x": 0.7196966, "y": 0.9963302 },
              { "x": 0.61909163, "y": 0.9963302 }
            ]
          }
        },
        {
          "mid": "/j/984ysm",
          "name": "Table top",
          "score": 0.66904813,
          "boundingPoly": {
            "normalizedVertices": [
              { "y": 0.8069201 },
              { "x": 0.86148286, "y": 0.8069201 },
              { "x": 0.86148286, "y": 0.99502665 },
              { "y": 0.99502665 }
            ]
          }
        },
        {
          "mid": "/m/0d4v4",
          "name": "Window",
          "score": 0.5146187,
          "boundingPoly": {
            "normalizedVertices": [
              { "x": 0.004114019, "y": 0.00019616824 },
              { "x": 0.3921472, "y": 0.00019616824 },
              { "x": 0.3921472, "y": 0.25323766 },
              { "x": 0.004114019, "y": 0.25323766 }
            ]
          }
        }
      ]
    }
  ]
}
I want to draw a rectangle around each detected object, but I am not sure how to get the rectangle's points from the polygon vertices. What is the algorithm for converting the polygon into a rectangle?
CodePudding user response:
The API provides you with four points, which are the four corners of an axis-aligned rectangle. The four corners can be referred to as:
- The top-left corner;
- The top-right corner;
- The bottom-right corner;
- The bottom-left corner.
Each corner is a point with two coordinates; for instance, the coordinates of the top-left corner are (x=left, y=top), and the coordinates of the bottom-right corner are (x=right, y=bottom).
Identify which point is the top-left corner and which point is the bottom-right corner, and that will give you the four values you seek:
left = topleft.x
top = topleft.y
right = bottomright.x
bottom = bottomright.y
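As a rough illustration of the four assignments above, here is a minimal Java sketch. It assumes the vertices arrive in the order seen in the sample response (top-left, top-right, bottom-right, bottom-left), which is an assumption about the data rather than something established here. The class name `CornerRect`, the method `fromCorners`, and the `imageWidth` / `imageHeight` parameters are hypothetical; the multiplication simply scales the normalized 0..1 values to pixels.

```java
// Sketch only: CornerRect and fromCorners are hypothetical names,
// not part of any Google or Android library.
final class CornerRect {
    /**
     * xs and ys hold the normalized coordinates of the four vertices,
     * assumed to be ordered top-left, top-right, bottom-right, bottom-left.
     * Returns {left, top, right, bottom} scaled to pixels.
     */
    static float[] fromCorners(float[] xs, float[] ys, int imageWidth, int imageHeight) {
        float left = xs[0] * imageWidth;     // topleft.x
        float top = ys[0] * imageHeight;     // topleft.y
        float right = xs[2] * imageWidth;    // bottomright.x
        float bottom = ys[2] * imageHeight;  // bottomright.y
        return new float[] {left, top, right, bottom};
    }
}
```

On Android, the returned array `r` could then be passed along as `new RectF(r[0], r[1], r[2], r[3])`.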
As an additional note, these values are easy to identify if you take the minimum or the maximum of the vertex coordinates, since for instance:
right = max(x over all vertices)
left = min(x over all vertices)
Whether top is the maximum or the minimum of the y values depends on the orientation of the coordinate system, so you'll have to figure that one out for yourself. For instance, in a mathematical plot we almost always use top = max(y), but when describing pixel coordinates on a screen, we more often use top = min(y).
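The min/max idea can be sketched in Java as follows, without relying on the order of the vertices. This is only a sketch under stated assumptions: it uses the screen convention (top = min y), it treats a missing coordinate as 0 (the "Table top" entry in the question omits `x` on two vertices, and zero-valued fields are commonly left out of such JSON, but that is an assumption here), and the names `BoundsFromVertices`, `bounds`, `imageWidth`, and `imageHeight` are hypothetical.

```java
// Sketch only: BoundsFromVertices and bounds are hypothetical names.
final class BoundsFromVertices {
    /**
     * xs and ys hold the normalized vertex coordinates in any order.
     * A null entry stands for a field missing from the JSON and is
     * treated as 0 (an assumption). Returns {left, top, right, bottom}
     * in pixels, using the screen convention where y grows downward.
     */
    static float[] bounds(Float[] xs, Float[] ys, int imageWidth, int imageHeight) {
        float minX = Float.POSITIVE_INFINITY;
        float minY = Float.POSITIVE_INFINITY;
        float maxX = Float.NEGATIVE_INFINITY;
        float maxY = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < xs.length; i++) {
            float x = xs[i] == null ? 0f : xs[i];
            float y = ys[i] == null ? 0f : ys[i];
            minX = Math.min(minX, x);
            maxX = Math.max(maxX, x);
            minY = Math.min(minY, y);
            maxY = Math.max(maxY, y);
        }
        return new float[] {
            minX * imageWidth,   // left
            minY * imageHeight,  // top (smallest y on a screen)
            maxX * imageWidth,   // right
            maxY * imageHeight   // bottom (largest y on a screen)
        };
    }
}
```

For a coordinate system where y grows upward, the top and bottom entries would be swapped.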