Counting coloured swatches - faster method required

I'm currently trying to find a way to count coloured swatches. They are all regular, same size, arranged in a chess board pattern, colours vary. The number of swatches will vary from a few hundred to about 90,000.

In this example, which uses only three colours, I need to count the number of light green squares, which is 8.

In Photoshop I can take a sample of each swatch, where x, y are the centre of the swatch, and colArr is an array of all the RGB values of those swatches.

var pointSample = app.activeDocument.colorSamplers.add([x,y]);

// Obtain array of RGB values.
var rgb = [
  // Math.round takes a single argument and already rounds to an integer
  Math.round(pointSample.color.rgb.red),
  Math.round(pointSample.color.rgb.green),
  Math.round(pointSample.color.rgb.blue)
];

colArr.push(rgb);
delete_all_colour_samples();

function delete_all_colour_samples()
{
  app.activeDocument.colorSamplers.removeAll();
}

If the colour matches one in a previously established array, it gets counted.

It works! It's fairly instantaneous for small samples. However, with an image of over 3000 samples it takes about 25 seconds. Whilst I'm content to know that the image has been processed, I am curious whether it can be done quicker.

My question is this: could this be sped up by using, say, Python, and if so, how?

CodePudding user response：

To count the number of contiguous regions with a given color:

First find which pixels have that color,
then use connected component analysis to count the number of contiguous regions.

Most image processing libraries will have the tools needed to do this easily. For example using DIPlib (disclosure: I'm an author) it would be:

import diplib as dip

img = dip.ImageRead('Qcza1.png')

color = dip.Create0D([159,244,141])
squares = dip.AllTensorElements(img == color)
squares = dip.Label(squares, connectivity=1)
count = dip.Maximum(squares)[0][0]

For OpenCV it would be:

import cv2

img = cv2.imread('Qcza1.png')  # Uses BGR ordering!

color = (141,244,159)
squares = cv2.inRange(img, color, color)
count, squares = cv2.connectedComponents(squares, connectivity=4)
count = count - 1  # count included background label

There are two things that can go wrong here:

1. The colors are not exact (illumination changes, compression artifacts, etc.). In this case you need to use a range of colors: dip.InRange(img, color1, color2) instead of img == color, or cv2.inRange(img, color1, color2).
2. The squares are not perfectly separated like in the example. If the joins happen only at vertices, a small erosion of the binary squares image before labeling should do the trick. If the squares can be side by side, a more complex method is needed: look at the histogram of region sizes, expecting a first peak at the size of an individual square and further peaks at multiples of this size. From this histogram we can compute how many squares there are by multiplying the bin counts near the 2x size by 2, near the 3x size by 3, etc., and summing all the values.
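
The size-based idea in the second point can be sketched roughly as below. This is only one reading of it, under the assumption that the pixel area of a single swatch is known (the question says all swatches are the same size): each region's area is divided by the single-swatch area and rounded, which amounts to weighting the histogram peaks at 1x, 2x, 3x by 1, 2, 3. The function name estimate_square_count and the example areas are hypothetical; in practice the areas might come from something like cv2.connectedComponentsWithStats.

```python
import numpy as np

def estimate_square_count(areas, unit_area):
    """Estimate how many squares a set of connected regions contains.

    areas     -- pixel area of each connected region (background excluded)
    unit_area -- assumed pixel area of one isolated square
    """
    areas = np.asarray(areas, dtype=float)
    # A region near 2x the unit area counts as 2 squares, near 3x as 3, etc.
    per_region = np.rint(areas / unit_area).astype(int)
    return int(per_region.sum())

# Made-up areas: three single squares, one pair, one triple (64x64 swatches)
areas = [4096, 4100, 8190, 4090, 12300]
print(estimate_square_count(areas, unit_area=64 * 64))
```

Rounding absorbs small deviations in area (for example from a light erosion), but it becomes unreliable if regions lose or gain a large fraction of a swatch.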

CodePudding user response：

The details of what you are asking are not quite clear to me yet. For example: are the squares always the same size in every picture? Do you know in advance how many there are in a given image? Are they always square (you describe them as "regular")? And do you actually want to count more than one colour in each image? Your example is "light green", but you seem to have an array of "previously established" colours, so it is not clear whether you want to know how many pixels are of each colour in that array.

Anyway, let's make a first stab at an answer, using OpenCV and basic Numpy indexing to get the centres.

import cv2
import numpy as np

# Load an image
im = cv2.imread('Qcza1.png')

# Assume the swatches are 64x64 pixels
sw, sh = 64, 64

# Get colours at each swatch centre, so we start 1/2 a swatch in and step by a whole swatch
centres = im[sh//2::sh, sw//2::sw]

That gives us this array of shape [4,6,3], because Numpy arrays are [height, width, channels] and we have 4 swatch centres vertically, 6 swatch centres horizontally, and each has 3 colour channels (in BGR order, since the image was loaded with OpenCV).

array([[[141, 244, 159],
    [ 85, 202, 105],
    [255, 255, 255],
    [ 85, 202, 105],
    [141, 244, 159],
    [ 85, 202, 105]],

   [[255, 255, 255],
    [141, 244, 159],
    [ 85, 202, 105],
    [141, 244, 159],
    [ 85, 202, 105],
    [141, 244, 159]],

   [[141, 244, 159],
    [ 85, 202, 105],
    [255, 255, 255],
    [ 85, 202, 105],
    [141, 244, 159],
    [255, 255, 255]],

   [[ 85, 202, 105],
    [141, 244, 159],
    [ 85, 202, 105],
    [255, 255, 255],
    [ 85, 202, 105],
    [255, 255, 255]]], dtype=uint8)

In case you are unfamiliar with Numpy indexing, it is basically:

array[start:end:stride]

so in the code above I am starting half a swatch width into the array (to get to the centre of the first swatch) and then using a stride equal to the swatch width to get to the next centre.
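
As a small illustration of that slicing on a toy array (the 6x6 grid and the 2-pixel swatch size are made up for this example, not taken from the image above):

```python
import numpy as np

# Hypothetical 6x6 single-channel "image" made of 2x2-pixel swatches
grid = np.arange(36).reshape(6, 6)

sw, sh = 2, 2  # assumed swatch width and height

# Start half a swatch in, then step a whole swatch in each direction
centres = grid[sh // 2::sh, sw // 2::sw]

print(centres.shape)  # one sample per swatch
print(centres)
```

Basic slicing like this returns a view into the original array rather than a copy, so sampling the centres is cheap even for large images.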

If you now want to tally the number of swatch centres having color [141,244,159], you can do:

tally = np.sum(np.all(centres==[141,244,159], axis=-1))

which gets the result 8.
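
If you want a tally for every colour at once rather than for one target colour, one possible approach (a sketch, not something timed here) is np.unique with axis=0 and return_counts=True. The centres array below is retyped from the output shown above so that the snippet runs on its own; the names G, D, W and tally are just for illustration.

```python
import numpy as np

# Swatch-centre colours copied from the 4x6 output above (BGR order)
G, D, W = [141, 244, 159], [85, 202, 105], [255, 255, 255]
centres = np.array([
    [G, D, W, D, G, D],
    [W, G, D, G, D, G],
    [G, D, W, D, G, W],
    [D, G, D, W, D, W],
], dtype=np.uint8)

# Flatten to a list of pixels, then count each distinct colour
colours, counts = np.unique(centres.reshape(-1, 3), axis=0, return_counts=True)

# Map each colour triplet to the number of swatches that have it
tally = {tuple(int(v) for v in c): int(n) for c, n in zip(colours, counts)}
print(tally)
```

Looking up tally[(141, 244, 159)] then gives the same 8 as the single-colour count.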

Note that you can do this all with Python Imaging Library PIL/Pillow, if that is an easier install for you, by replacing:

import cv2
im = cv2.imread(...)

with:

from PIL import Image

# Open PIL Image and make into Numpy array
im = np.array(Image.open(...).convert('RGB'))

and bear in mind that OpenCV uses BGR ordering whereas PIL uses RGB, so the colour triplets you are looking for will be reversed.
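
To make the channel-order point concrete, here is a minimal sketch using a made-up 1x2 pixel array rather than a real file: reversing the last axis turns RGB data into BGR (and back), and the target triplet is reversed in the same way.

```python
import numpy as np

# Light green from the example, written in RGB order (as PIL would give it)
target_rgb = (159, 244, 141)
target_bgr = target_rgb[::-1]  # (141, 244, 159), as OpenCV would give it

# Hypothetical 1x2 RGB image: one light green pixel, one white pixel
im_rgb = np.array([[[159, 244, 141], [255, 255, 255]]], dtype=np.uint8)

# Reversing the channel axis converts RGB <-> BGR
im_bgr = im_rgb[..., ::-1]

# The same pixel matches in either ordering as long as the triplet agrees
match_rgb = np.all(im_rgb == target_rgb, axis=-1)
match_bgr = np.all(im_bgr == target_bgr, axis=-1)
print(match_rgb, match_bgr)
```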

I generated a sample image of 90,000 swatches of 10x10 pixels using 1,000 unique colours like this:

import cv2
import numpy as np

# Define 10 possible values for R, G and B
vals = np.arange(0,255,26)

# Make 90,000 pixel image using approx 10*10*10 = 1000 colours
h, w = 200, 450
np.random.seed(42)
im = np.random.choice(vals, (h,w,3)).astype(np.uint8)

# Scale up to make swatches 10x10
sw, sh = 10, 10
# Pass interpolation by keyword; extra positional arguments would bind to dst and fx
im = cv2.resize(im, (w*sw, h*sh), interpolation=cv2.INTER_NEAREST)

I then timed it: it took about 1.9 ms and found 83 swatches with colour [0, 78, 156], which is roughly what you'd expect with 90,000 swatches spread over 1,000 colours (about 90 per colour):

%%timeit
centres = im[sh//2::sh, sw//2::sw]
np.sum(np.all(centres==[0,78,156], axis=-1))

1.95 ms ± 30.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)