Choose a random triple that isn't in list python-CodePudding

I am trying to generate a large amount of triples while insuring they aren't in a list as well as a few other factors. The goal is to create a 1024x1024 image but the difference between two pixels is never the same but it also has to ensure that the difference doesn't make the pixel have color values outside of the acceptable range. The problem with the code I have currently is that it is too slow and take over 12 hours to even get half way. Here is the segment of the code that chooses a single pixel's color values:

def choose_color():
global color
global difference
global usedDifferences
found = 0
while found == 0:
    difference = [random.randint(-256, 256), random.randint(-256, 256), random.randint(-256, 256)]
    if difference not in usedDifferences:
        if 0 <= color[0]   difference[0] <= 255 and 0 <= color[1]   difference[1] <= 255 and 0 <= color[2]   difference[2] <= 255:
            usedDifferences.append(difference)
            found = 1
            color[0]  = difference[0]
            color[1]  = difference[1]
            color[2]  = difference[2]

How can I make this faster?

CodePudding user response：

Your code can be modified to run in under 10 seconds on my machine by fixing a few issues.

As a cautionary note, I'm not sure if you meant to only look at the diff with the one previous color, or maybe all the adjacent colors. If you want to ensure that ANY TWO pixels have a unique diff, that's a completely different question.

The biggest problem is using a list. You need to store the used differences in a set. This data structure exists for that basic purpose -- efficiently checking to see if a value is unique. This requires using something hashable like a tuple for the color, which I believe is better anyway.

Additionally, you are wasting a lot of effort selecting from a random range that includes invalid values. You should compute the valid range first, then select a random number from that. Or in my solution I just picked a random color and computed the diff, so it has to be valid.

Lastly, I implemented this as an generator. If we're just checking the diff with the previous one pixel, we don't need to store these values anywhere besides directly in the image (which I did not do here). Keeping them in a list is a big waste of space and time.

from itertools import islice
from random import randint
from datetime import datetime

def compute_diff(color1: tuple, color2: tuple):
    return (
        color2[0] - color1[0],
        color2[1] - color1[1],
        color2[2] - color1[2],
    )

def choose_color(last_color=(0, 0, 0), MAX_TRIES=1000):
    # This is absolutely the most important thing:
    # used_diffs MUST be a set.
    used_diffs = set()
    tries = 0
    while tries < MAX_TRIES:
        # Don't pick a random diff and check for bounds.
        # That's a lot of wasted work.
        # Instead just pick a valid color and compute the diff.
        color = (randint(0, 255), randint(0, 255), randint(0, 255))
        diff = compute_diff(color, last_color)
        if diff not in used_diffs:
            # Using a generator is generally more efficient
            yield color
            used_diffs.add(diff)
            last_color = color
            tries = 0
        tries  = 1
        if tries > 10:
            # I added this just for debugging purposes. It never triggered.
            print('.', end='')
    raise RecursionError('choose_color took too many tries')

try:
    # demonstrate that we can pick a few valid colors
    print('demo 10 pixels')
    for color in islice(choose_color(), 10):
        print(color)

    # now time the selection of enough pixels to fill the image
    print('building full image...')
    start = datetime.now()
    for color in islice(choose_color(), 1024*1024):
        pass
    end = datetime.now()
    print(end-start)
except RecursionError as e:
    print (str(e))

CodePudding user response：

What you are trying to do is (probably) not possible. There are two big issues with your code currently in terms of correctness. First off, it is possible that the difference (5, 5, 5) is used, and that (-5, -5, -5) is also used. The absolute difference is the same, meaning that you may be using differences that are not unique.

Secondly, (and much more important) you are not checking that the difference between all pixels is never repeated. Imagine this scenario:

I start at (0,0,0) and generate a difference of (1, 1, 1)
I am now at (1, 1, 1) and generate a difference of (10, 10, 10)
I am now at (11, 11, 11) and generate a difference of (-2, -2, -2)
I am now at (9, 9, 9) and generate a difference of (3, 3, 3)
I am now at (12, 12, 12) which has a difference of (1, 1, 1) to (11, 11, 11)!

Here is some sample code that fixes both of these problems and selects random colors that never have the same difference between each other (note: this code heavily benefits from this data structure that allows for quick insertion, deletion, and random sampling.)

import random
import operator

class ListDict(object):
    def __init__(self):
        self.item_to_position = {}
        self.items = []

    def add_item(self, item):
        if item in self.item_to_position:
            return
        self.items.append(item)
        self.item_to_position[item] = len(self.items)-1

    def remove_item(self, item):
        position = self.item_to_position.pop(item)
        last_item = self.items.pop()
        if position != len(self.items):
            self.items[position] = last_item
            self.item_to_position[last_item] = position

    def choose_random_item(self):
        return random.choice(self.items)

def get_colors():
    colors = ListDict()
    for i in range (0, 256):
        for j in range(0, 256):
            for k in range(0, 256):
                colors.add_item((i,j,k))
    return colors

def remove_color(color, colors):
    if color in colors.item_to_position:
            colors.remove_item(color)

print("Generating colors!")
colors = get_colors()

initial_color = [random.randint(0, 256), random.randint(0, 256), random.randint(0, 256)]
output_colors = [initial_color]
all_differences = []

for i in range(1024 * 1024):
    try:
        new_color = colors.choose_random_item()
    catch IndexError:
        print("Ran out of colors!")
        break

    differences = []
    for color in output_colors:
        difference = tuple(map(operator.sub, color, new_color))
        difference_neg = tuple(map(operator.mul, difference, (-1, -1, -1)))
        differences.append(difference)
        differences.append(difference_neg)

    for color in output_colors:
        for difference in differences:
            remove_color(tuple(map(operator.add, color, difference)), colors)

    all_differences  = differences
    output_colors.append(new_color)

    for difference in all_differences:
        remove_color(tuple(map(operator.add, new_color, difference)), colors)

    print(i, len(colors.items))

print(output_colors)

There is an issue with this code: it doesn't finish! It can typically generate around ~1250 colors before it runs out of viable options. In addition, as you may have noticed it starts to slow down a lot after a while. On my machine, it takes around 20 minutes to run all the way. This is because we need to check every single difference with every new color we add, and check each new difference with all the colors we have already randomly selected!

To answer your question "can my code performance be improved?" if you do not care about the two issues I just pointed out in your approach here is a much faster version of your code which runs in around 4 minutes on my machine

import random
import operator

class ListDict(object):
    def __init__(self):
        self.item_to_position = {}
        self.items = []

    def add_item(self, item):
        if item in self.item_to_position:
            return
        self.items.append(item)
        self.item_to_position[item] = len(self.items)-1

    def remove_item(self, item):
        position = self.item_to_position.pop(item)
        last_item = self.items.pop()
        if position != len(self.items):
            self.items[position] = last_item
            self.item_to_position[last_item] = position

    def choose_random_item(self):
        return random.choice(self.items)


def get_differences():
    differences = ListDict()
    for i in range (0, 256):
        for j in range(0, 256):
            for k in range(0, 256):
                differences.add_item((i,j,k))
    return differences

def check_bounds(color, diff):
    for i in range(len(color)):
        if (color[i]   diff[i] < 0) or (color[i]   diff[i] > 255):
            return False
    return True

print("Generating differences!")
differences = get_differences()

initial_color = [random.randint(0, 256), random.randint(0, 256), random.randint(0, 256)]
colors = [initial_color]


print("Generating colors!")
while(len(colors) != 1024 * 1024):
    random_diff = differences.choose_random_item()
    viable_differences = []
    current_color = colors[-1]

    for one in range(2):
        for two in range(2):
            for three in range(2):
                viable_diff = (random_diff[0] if one else -random_diff[0],
                                random_diff[1] if two else -random_diff[1],
                                random_diff[2] if three else -random_diff[2])
                if check_bounds(current_color, viable_diff):
                    viable_differences.append(viable_diff)

    if len(viable_differences):
        random_viable_diff = random.choice(viable_differences)
        colors.append(tuple(map(operator.add, current_color, random_viable_diff)))

    differences.remove_item(random_diff)

print(colors)

In summary: you are not currently checking to make sure that no two differences are ever repeated. If we check for this thoroughly we actually run out of colors without even finding enough to fill a single 1024-pixel row! With the second piece of code, you will still repeat, but it will be rare. Let me know if you have any questions.