I'm trying to create a 64 x 64 image dataset for machine learning from a 2D (576 x 768) geographic array.
The array contains NaN values at random locations (random i, j), and the extracted 64 x 64 arrays (images) must not contain any NaN values. There is also a mask that limits the number of suitable arrays.
What I've done so far is generate random (i, j) pairs and check each one for NaN values and for too much overlap.
What I need help with is figuring out how to check whether, after x extractions, there still remains a 64 x 64 area that hasn't been sampled too many times.
For illustration purposes, this task is equivalent to locating
11
11
in
000111000
001110000
001111100
011001100
000111100
It is crucial that the 2 x 2 shape stays intact.
Any ideas how to do this efficiently?
CodePudding user response:
I think you want to perform a 2D cross-correlation (see the SciPy documentation for scipy.signal.correlate2d).
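It returns an array in which each value measures the similarity between the pattern and the image at that value's coordinates. With a normalized correlation, a value of 1 means the image is exactly the same as the pattern at that position. With a plain (unnormalized) correlation of a 0/1 image against an all-ones pattern, an exact match instead gives the number of cells in the pattern (4 for a 2 x 2 pattern).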
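As a rough sketch of that idea on the toy grid from the question, assuming 1 marks a usable cell and 0 an unusable one: correlating with an all-ones pattern counts the usable cells in every window, and a window is fully usable when the count equals the window area. The helper name valid_window_corners is made up for this illustration, and the unnormalized scipy.signal.correlate2d is used here, so the exact-match value is 4 rather than 1.

```python
import numpy as np
from scipy.signal import correlate2d

# Toy grid from the question: 1 = usable cell, 0 = unusable cell
# (assumption: unusable stands for NaN or masked-out data).
grid = np.array([
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0],
])

def valid_window_corners(valid, size):
    """Return the (row, col) top-left corners of all size x size
    windows that contain only usable cells.

    Hypothetical helper written for this example.
    """
    pattern = np.ones((size, size), dtype=int)
    # mode="valid" keeps only windows that lie fully inside the grid,
    # so output index (i, j) is the window's top-left corner.
    counts = correlate2d(valid.astype(int), pattern, mode="valid")
    # A window is fully usable when every one of its cells is 1.
    return np.argwhere(counts == size * size)

corners = valid_window_corners(grid, 2)
print(corners)
```

For the real data, the 0/1 grid could be built as something like ~np.isnan(data) & mask, assuming data is the 576 x 768 array and mask is a boolean array of the same shape (both names are placeholders), and size would be 64. If no corner is returned, no clean 64 x 64 window is left. A direct correlation with a 64 x 64 pattern may be slow, so this is only meant to show the principle.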