I have a label with many lines of text, as shown below. I want to find a way to detect whether there is a scratch in a line (for example, in the first and the last line). Any recommendations? Thanks.
CodePudding user response:
Assuming the text lines are aligned horizontally:
- Threshold the black text to get a mask.
- Count the black pixels in each row of the image. This gives you a vector with one entry per image row.
- In this vector, you should see a pattern of low and high values, which correspond to the empty gaps and the printed text lines.
- A scratch shows up as a narrow run of low values between two high values, much narrower than the normal gap between text lines.
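
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the mask is already thresholded (True where there is ink), and the function name find_scratch_rows and the parameters max_scratch_height and ink_ratio are made up for this example and would need tuning on real labels.

```python
import numpy as np

def find_scratch_rows(mask, max_scratch_height=2, ink_ratio=0.3):
    """Return (first_row, last_row) for narrow low-ink runs between filled rows.

    mask: 2D boolean array, True where a pixel is ink (black text).
    max_scratch_height and ink_ratio are illustrative values, not tuned ones.
    """
    profile = mask.sum(axis=1)            # ink pixels per image row
    if profile.max() == 0:
        return []                         # no ink at all
    filled = profile > ink_ratio * profile.max()

    scratches = []
    n = len(filled)
    row = 0
    while row < n:
        if filled[row]:
            row += 1
            continue
        start = row
        while row < n and not filled[row]:
            row += 1
        # The run covers rows start .. row - 1. It only counts as a scratch
        # if it has filled rows on both sides and is narrow enough.
        bounded = start > 0 and row < n
        if bounded and (row - start) <= max_scratch_height:
            scratches.append((start, row - 1))
    return scratches

# Synthetic example: two solid "text lines" with a one-row scratch in the first.
mask = np.zeros((20, 30), dtype=bool)
mask[2:8] = True      # first text line
mask[12:18] = True    # second text line, 4 empty rows below the first
mask[4] = False       # simulated scratch
print(find_scratch_rows(mask))
```

The normal gap between the two lines is 4 rows high, so it is not reported; only the one-row dip is. On real text the profile is much less uniform than in this toy mask, so the thresholds would have to be chosen from the actual images.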
Another way is to detect the scratch as a straight line using the Hough transform: https://learnopencv.com/hough-transform-with-opencv-c-python/
CodePudding user response:
This is a fairly difficult problem, since the defect is actually a lack of print within a line rather than a strike-through.
Can you tell me the process you are following for OCR, so that I can work on a solution?
My best guess right now is to perform a Sobel operation in the horizontal direction, with the text aligned horizontally. A row with no gradient change is most likely a blank row. If such a blank run is very narrow and it intersects or overlaps detected text, it is the scratch you are looking for.
But for me to give a better answer, I will need to know the steps you are following.
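
In the meantime, here is a rough sketch of the horizontal-gradient idea. It uses a plain horizontal pixel difference in NumPy as a stand-in for Sobel (a 3x3 Sobel kernel also smooths vertically, which could hide a one-pixel blank row). It assumes the top and bottom rows of a detected text line are already known, for example from the OCR bounding box. The function name flat_runs_in_text and the parameters max_height and eps are hypothetical.

```python
import numpy as np

def flat_runs_in_text(gray, top, bottom, max_height=2, eps=0):
    """Find narrow runs of rows with no horizontal gradient inside a text box.

    gray: 2D uint8 grayscale image with horizontally aligned text.
    top, bottom: rows of the detected text line (bottom is exclusive);
                 assumed to come from an earlier detection or OCR step.
    eps: largest gradient sum still treated as "no change" (raise for noise).
    """
    # Cast first so the subtraction cannot wrap around in uint8.
    diff = np.abs(np.diff(gray.astype(np.int32), axis=1))
    energy = diff.sum(axis=1)             # total horizontal change per row

    runs = []
    start = None
    for r in range(top, bottom + 1):
        is_flat = r < bottom and energy[r] <= eps
        if is_flat and start is None:
            start = r                     # a flat run begins
        elif not is_flat and start is not None:
            if r - start <= max_height:   # keep only narrow runs
                runs.append((start, r - 1))
            start = None
    return runs

# Synthetic example: vertical stripes as "text" in rows 3..8, row 5 wiped out.
gray = np.full((12, 16), 255, dtype=np.uint8)
gray[3:9, ::2] = 0
gray[5, :] = 255
print(flat_runs_in_text(gray, top=3, bottom=9))
```

On a real scan the blank rows will not have a gradient of exactly zero, so eps would need to be set from the noise level of the image.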