Bounty note:
The original title is "Detecting obscure shapes with Opencv Python".
However, I am interested in image-processing concepts that would solve this problem: how to find a pasted image inside a bigger image?
Assume the following:
- The jigsaw shapes always have the same (rectangular) boundary size.
(ie: a template-based searching method could work)
- The jigsaw shape is not rotated to any angle.
(ie: there will be straight(-ish) horizontal and vertical lines to find)
- The jigsaw shape is always "pasted" into some other "original" image.
(ie: a paste-detection method could work)
The solution can be OpenCV (as requested by the Asker), but the core concepts should be applicable when using any appropriate software (ie: anything that can loop through image pixels to process their values, in order to achieve the described solution).
I myself use Javascript, but of course I will understand that openCV.calcHist() becomes a histogram function in JS code. I have no problem translating a good explanation into code. I will consider OpenCV code as pseudo-code towards a working idea.
I'm trying to find a way to reliably determine the location of a puzzle piece in an image. The puzzle piece varies in both shape and how easy it is to find it. What algorithm(s) in the opencv module would help me with the task at hand? Or is what I'm trying to do beyond the scope of the module?
Example images below
CodePudding user response:
In my opinion the best approach for a canonical answer was suggested in the comments by Christoph, which is training a CNN:
- Implement a generator for shapes of puzzle pieces.
- Get a large set of natural images from the net.
- Generate tons of sample images with puzzle pieces.
- Train your model to detect those puzzle pieces.
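As a rough illustration of the first and third steps, here is a minimal NumPy sketch of a sample generator. It is only a sketch under assumptions: the piece shape is simplified to a square body with one circular knob, and the names `make_piece_mask` and `make_sample` are hypothetical helpers, not part of any library. Training the detector itself is not shown.

```python
import numpy as np

def make_piece_mask(size, rng):
    """Boolean mask of a simplified puzzle piece inside a size x size box.

    Assumption: a square body plus one circular knob on a random side.
    """
    margin = size // 5
    yy, xx = np.mgrid[0:size, 0:size]
    mask = np.zeros((size, size), dtype=bool)
    mask[margin:size - margin, margin:size - margin] = True  # square body
    c = size // 2
    # Knob centres (row, col): middle of the top, bottom, left, right edge of the body.
    centres = [(margin, c), (size - margin - 1, c), (c, margin), (c, size - margin - 1)]
    cy, cx = centres[int(rng.integers(0, 4))]
    mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= margin ** 2
    return mask

def make_sample(background, donor, size, rng):
    """Paste a piece cut from `donor` into a copy of `background`.

    Both images are H x W x 3 uint8 arrays at least `size` pixels wide and high.
    Returns the composite image and its label, the bounding box (x, y, w, h).
    """
    h, w = background.shape[:2]
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    dy = int(rng.integers(0, donor.shape[0] - size + 1))
    dx = int(rng.integers(0, donor.shape[1] - size + 1))
    mask = make_piece_mask(size, rng)
    out = background.copy()
    region = out[y:y + size, x:x + size]  # view into `out`
    region[mask] = donor[dy:dy + size, dx:dx + size][mask]
    return out, (x, y, size, size)
```

Calling `make_sample` in a loop over many background/donor pairs would give the labelled images for the training step; since the box size is fixed, the label reduces to the top-left corner.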
CodePudding user response:
I am still open to Answers even during the upcoming grace period.
I need a reason to award the bounty.
I'll throw in my own attempt.
It fails on the first image and only works on the next two images.
I am open to other pixel-processing based techniques where possible.
I do not use OpenCV so the process is explained with words (and pictures). It is up to the reader to implement the solution in their own chosen programming language/tool.
Background:
I wondered if there was something inherent in pasted images (something maybe revealed by pixel processing or even by frequency domain analysis, eg: could a Fourier signal analysis help here?).
After some research I came across Error Level Analysis (or ELA).
This page has a basic introduction for beginners.
Process: In 7 easy steps, this detects the location of a pasted puzzle piece.
(1) Take a provided cat picture and re-save 3 times as JPEG in this manner:
- Save copy #1 as JPEG of quality setting 2.
- Reload (to force a decode of) copy #1 then re-save copy #2 as JPEG of quality setting 5.
- Reload (to force a decode of) copy #2 then re-save copy #3 as JPEG of quality setting 2.
(2) Do a difference blend-mode with the original cat picture as base layer versus the re-saved copy #3 image. The image will be black, so we increase Levels.
(3) Increase Levels to make the ELA detected area(s) more visible.
note: I recommend working in BT.709 or BT.601 grayscale at this point. Not necessary, but it gives "cleaner" results when blurring later on.
(4) Alternate between applying a box blur to the image and increasing levels, until the islands disappear and a large blob remains.
(5) The blob itself is also emphasized with an increase of levels.
(6) Finally, a Gaussian blur is used to smooth the selection area.
(7) Mark the blob area (draw an outline stroke) and compare to input image...
CodePudding user response:
Histogram of Largest Error
This is a rough concept of a possible algorithm.
The idea comes from an unfounded premise that seems plausible enough. The premise is that adding the puzzle piece drastically changes the histogram of the image.
Let's assume that the puzzle piece is bounded by a 100px by 100px square. We are going to use this square as a mask to mask out pixels that are used to calculate the histogram.
The algorithm is to find the placement of the square mask on the image such that the error between the histogram of the masked image and the original image is maximized.
There are many norms to experiment with to measure the error: you could start with the sum of the squared error components.
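A minimal NumPy sketch of this search is below, to make the idea concrete. It rests on a few assumptions of mine: the image is grayscale, the square mask excludes its pixels from the histogram, both histograms are normalized to sum to 1 before comparing, and the names `histogram` and `find_largest_histogram_error` are hypothetical. It is brute force, so use a larger `step` on big images.

```python
import numpy as np

def histogram(values, bins):
    """Count pixels in `bins` equal-width intensity bins over 0..255."""
    idx = (values.astype(np.int64) * bins) // 256
    return np.bincount(idx.ravel(), minlength=bins).astype(np.float64)

def find_largest_histogram_error(gray, size=100, bins=32, step=1):
    """Slide a size x size mask over `gray` (2-D uint8 array).

    For each placement, compare the normalized histogram of the pixels
    outside the mask with the normalized histogram of the whole image,
    using the sum of squared differences as the error.
    Returns ((x, y), error) for the placement with the largest error.
    """
    h, w = gray.shape
    if size >= h and size >= w:
        raise ValueError("mask must be smaller than the image")
    full = histogram(gray, bins)
    full_norm = full / full.sum()
    best_err, best_xy = -1.0, None
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            inside = histogram(gray[y:y + size, x:x + size], bins)
            rest = full - inside  # histogram of everything outside the mask
            rest_norm = rest / rest.sum()
            err = float(np.sum((rest_norm - full_norm) ** 2))
            if err > best_err:
                best_err, best_xy = err, (x, y)
    return best_xy, best_err
```

Subtracting the window histogram from the full one avoids recounting the whole image for every placement. Whether the maximum actually lands on the puzzle piece depends on the premise above holding for the image at hand.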