I generated a dataset of 200 x 200 x 3 images in which each image contains a 40 x 40 box of a different color. I want to create a model using TensorFlow that can predict the coordinates of this 40 x 40 box.
The code I used for generating these images:
from PIL import Image, ImageDraw
from random import randrange

colors = ["#ffd615", "#f9ff21", "#00d1ff",
          "#0e153a", "#fc5c9c", "#ac3f21",
          "#40514e", "#492540", "#ff8a5c",
          "#000000", "#a6fff2", "#f0f696",
          "#d72323", "#dee1ec", "#fcb1b1"]

def genrate_image(color):
    # Create a 200 x 200 RGB image filled with the background color.
    img = Image.new(mode="RGB", size=(200, 200), color=color)
    return img

def save_image(img, imgname):
    img.save(imgname)

def draw_rect(image, color, x, y):
    # Draw a 40 x 40 box whose top-left corner is at (x, y).
    draw = ImageDraw.Draw(image)
    coords = ((x, y), (x + 40, y), (x + 40, y + 40), (x, y + 40))
    draw.polygon(coords, fill=color)
    #return image, str(coords)
    # Returned in the order x0, x1, y0, y1.
    return image, coords[0][0], coords[2][0], coords[0][1], coords[2][1]

FILE_NAME = "train_annotations.txt"
for i in range(0, 100):
    img = genrate_image(colors[randrange(0, len(colors))])
    img, x0, x1, y0, y1 = draw_rect(img, colors[randrange(0, len(colors))], randrange(200 - 50), randrange(200 - 50))
    save_image(img, "dataset/train_images/img" + str(i) + ".png")
    # The with block closes the file, so no explicit f.close() is needed.
    with open(FILE_NAME, "a+") as f:
        f.write(f"{x0} {x1} {y0} {y1}\n")
Can anyone help me by suggesting how I can build a model that predicts the coordinates of the box in a new image?
CodePudding user response:
The easiest way to separate the box from the background is K-means clustering with K = 2. Record the RGB values of all pixels, then use K-means to split the pixels into two groups: one is the background group, the other is the box color group. Map the pixels in the box color group back to their original coordinates, then take the mean of those coordinates to get the center of the 40x40 box.
Documentation on how to do K-means in TensorFlow: https://www.tensorflow.org/api_docs/python/tf/compat/v1/estimator/experimental/KMeans
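The clustering steps described above could look roughly like the sketch below. It is an illustration only: it uses a small hand-written K-means in NumPy rather than the TensorFlow estimator linked above, and the function name `locate_box_kmeans`, the centroid initialisation, and the rule "the smaller cluster is the box" are assumptions that are not part of the original answer.

```python
import numpy as np

def locate_box_kmeans(image, n_iter=10):
    """Cluster the pixels of an (H, W, 3) array into 2 colour groups.

    Returns ((cx, cy), (x0, x1, y0, y1)) for the smaller group, which is
    assumed to be the box, or None if the image has only one colour.
    """
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(np.float64)

    # Assumed initialisation: the first pixel and the pixel farthest from it.
    first = pixels[0]
    farthest = pixels[np.argmax(np.linalg.norm(pixels - first, axis=1))]
    centroids = np.stack([first, farthest])

    for _ in range(n_iter):
        # Distance of every pixel to both centroids, shape (N, 2).
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = pixels[labels == k].mean(axis=0)

    # The box covers far fewer pixels than the background.
    box_label = np.argmin(np.bincount(labels, minlength=2))
    ys, xs = np.nonzero((labels == box_label).reshape(h, w))
    if xs.size == 0:
        return None  # box and background share the same colour

    center = (float(xs.mean()), float(ys.mean()))
    box = (int(xs.min()), int(xs.max()), int(ys.min()), int(ys.max()))
    return center, box
```

With a file from the dataset, one could pass `np.asarray(Image.open(path).convert("RGB"))` to this function. Note that the generator in the question can pick the same color for the box and the background, in which case no clustering method can find the box and the sketch returns `None`.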
CodePudding user response:
Bounding box regression is enough here: add a fully connected layer after the CNN with 4 output values, x1, y1, x2, y2, where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner. Something similar can be found here: https://github.com/sabhatina/bounding-box-regression
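A minimal Keras sketch of this idea follows. The layer sizes, the sigmoid output with coordinates scaled to [0, 1], the MSE loss, and the helper names `build_box_regressor` and `load_dataset` are all assumptions made for illustration; they are not taken from the linked repository. The loader assumes the file layout and the `x0 x1 y0 y1` column order produced by the generator script in the question, and that the annotation file has exactly one line per image.

```python
import numpy as np
import tensorflow as tf

IMG_SIZE = 200  # images in the question are 200 x 200 x 3

def build_box_regressor():
    """Small CNN followed by a fully connected layer with 4 outputs."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
        tf.keras.layers.Rescaling(1.0 / 255),           # pixel values to [0, 1]
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        # 4 outputs: x1, y1, x2, y2 as fractions of the image size (assumed).
        tf.keras.layers.Dense(4, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def load_dataset(image_dir="dataset/train_images",
                 annotation_file="train_annotations.txt"):
    """Load images and boxes written by the generator script (assumed layout)."""
    from PIL import Image  # imported here so the model code runs without Pillow

    boxes = np.loadtxt(annotation_file, dtype=np.float32, ndmin=2)
    # File columns are x0 x1 y0 y1; reorder to x1, y1, x2, y2 and normalise.
    boxes = boxes[:, [0, 2, 1, 3]] / IMG_SIZE
    images = np.stack([
        np.asarray(Image.open(f"{image_dir}/img{i}.png").convert("RGB"),
                   dtype=np.float32)
        for i in range(len(boxes))
    ])
    return images, boxes
```

Training would then be something like `images, boxes = load_dataset()` followed by `build_box_regressor().fit(images, boxes, epochs=20, validation_split=0.2)`, and predictions are multiplied by 200 to get pixel coordinates. The epoch count is a placeholder, and 100 images is likely too few for good accuracy; since the data is synthetic, generating more is cheap.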