PYTHON PIL: remove everything from image except text (based on pixel color)

Time:06-16

I have some images with subtitles/text on them, and I want to remove everything from the picture except the text. The most important thing is that the text stays clear and sharp enough for any OCR program to read it.

In the original image (1.png) the text should be white, but that doesn't mean it is exactly RGB 255,255,255; the value varies from pixel to pixel. That is the problem: I can't find a way to isolate just the text.

Maybe I need to convert RGB to something different, such as a percentage-based value; I'm not sure.

I tried the code below to convert 1.png to 2.png. Here are the results, but they are not good enough:

1.png: enter image description here

2.png enter image description here

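from PIL import Image

# Thresholds used to decide which pixels count as text
RGB_min = [180, 180, 180]
RGB_max = [245, 245, 245]

def level(img):
    # Work on an RGB copy so every pixel is an (R, G, B) tuple
    copy = img.convert('RGB')
    for x in range(copy.size[0]):
        for y in range(copy.size[1]):
            pxl = copy.getpixel((x, y))
            # Only the red and green channels are tested; blue is ignored
            if (pxl[0] < RGB_min[0] and pxl[1] < RGB_min[1]) or (pxl[0] > RGB_max[0] or pxl[1] > RGB_max[1]):
                # Too dark (red and green both below RGB_min) or too bright
                # (red or green above RGB_max): paint white
                copy.putpixel((x, y), (255, 255, 255))
            else:
                # Everything else is assumed to be text: paint black
                copy.putpixel((x, y), (0, 0, 0))
    return copy

image = Image.open('1.png')
leveled = level(image)
leveled.save('2.png')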

Here you can see what the pixels in the text look like when you zoom in:

enter image description here

CodePudding user response:

You could try looking for bright values in enter image description here
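The suggestion above is cut off where its image used to be, so the colourspace it had in mind is unknown. As a rough sketch, assuming HSV was meant, whitish text could be picked out as pixels that are bright (high V) and nearly colourless (low S). The function name `extract_bright_text` and both threshold defaults are hypothetical and would need tuning on the real 1.png.

```python
from PIL import Image, ImageChops

def extract_bright_text(img, min_value=180, max_saturation=60):
    """Return a black-on-white "L" image of bright, nearly colourless pixels.

    min_value and max_saturation are guessed starting points (0-255 scale),
    not values taken from the original post.
    """
    # Pillow can convert RGB to HSV; each band is an 8-bit "L" image
    _, s, v = img.convert("RGB").convert("HSV").split()

    # 255 where the pixel is bright enough, 0 elsewhere
    bright = v.point(lambda p: 255 if p >= min_value else 0)
    # 255 where the pixel is close to grey/white, 0 elsewhere
    colourless = s.point(lambda p: 255 if p <= max_saturation else 0)

    # multiply() acts as a logical AND on 0/255 masks
    text_mask = ImageChops.multiply(bright, colourless)

    # Invert so the text is black on a white background
    return ImageChops.invert(text_mask)
```

Something like `extract_bright_text(Image.open('1.png')).save('2.png')` would then replace the per-pixel loop. Unlike the fixed RGB range in the question, this keeps pure white pixels as text and rejects bright but strongly coloured areas.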
