Home > Software engineering >  What's the most efficient way to check if a value will be a duplicate when using a for loop?
What's the most efficient way to check if a value will be a duplicate when using a for loop?

Time:01-14

I'm trying to get all unique permutations for the below list. Collecting all of them and then finding the uniques is impossible since it'll crash the computer (I need combinations of 18). To solve this I'm trying to check if a value is unique before adding it to the list or set. Just doing that for permutations of 6 or 7 has proven to take a long time though, so how I have it set up now just won't work for 18. Is there a more efficient way of doing this?

pos = ['QB', 'QB', 'QB',
       'RB', 'RB', 'RB', 'RB', 'RB', 'RB', 'RB', 'RB', 'RB', 'RB', 'RB',
       'WR', 'WR', 'WR', 'WR', 'WR', 'WR', 'WR', 'WR', 'WR', 'WR', 'WR',
       'TE', 'TE', 'TE', 'TE', 'TE']

combos = set()

for combo in itertools.permutations(pos, 18):
    if combo not in combos:
        combos.add(combo)

The above takes about 40 seconds to run for permutations of size 6. No way it'll work for size 18.

CodePudding user response:

The unique permutations you want are far too numerous to work with.

Let's compute a lower bound. You have 3 QBs, 11 RBs, 11 WRs, and 5 TEs, and each of your permutations will contain at least 1 QB, 2 RBs, 3 WRs, and 1 TE.

That's...

>>> from math import comb
>>> comb(3, 1) * comb(11, 2) * comb(11, 3) * comb(5, 1) * comb(23, 11)
184051617750

... 184 billion possibilities, each of which has many unique permutations. For example, 1 QB, 11 RB, 3 WR, and 1TE allows for ...

>>> 18 * 17 * comb(16, 3)
171360

... over 171,000 unique orderings (pick one of the 18 slots for the QB, one of the 17 remaining slots for the TE, and 3 of the remaining 16 for the TEs, with the RBs filling the rest)

So the final set of possible draft orders is much, much, much too large to even iterate over in a reasonable amount of time, let alone store in memory. Efficiency is not the problem: the sheer number of possibilities is.

CodePudding user response:

I found a lot better algorithm , but still not sure it is the best. We can use tree structure , we can think all different items as different tree nodes. We will start with an empty root and add all different tree nodes if we have it, until it reaches desired length. In this case we have 4 different tree nodes ("QB", "RB", "WR", "TE"), so our tree will grow roughly 4**n, we will store path information and if we reach desired length we will add the path.Here is my solution:

from time import time

# We will store all paths as a list in this global variable list
all_paths = []


class TreeNode:
    max_length_of_permutation = 6

    def __init__(self,remaining_tree_dict,length,path):
        self.remaining_tree_dict = remaining_tree_dict
        self.length = length # Lentgh of root to this tree
        self.path = path # Path of root to this tree
        self.create_childs() 

    def create_childs(self):
        # We check if we reached the max length
        if self.length < self.max_length_of_permutation:

            #We will create all possible child trees
            for tree in self.remaining_tree_dict:

                # We will copy because wo don't want to change original remaining_tree_dict
                new_remaining_tree_dict = self.remaining_tree_dict.copy()

                # We update new_remaining_tree_dict by subtracting the tree we will create
                new_remaining_tree_dict[tree] -= 1

                # If we don't have this tree in new_remaining_tree_dict anymore, we will delete
                if new_remaining_tree_dict[tree] == 0:
                    del new_remaining_tree_dict[tree]

                # We are creating new tree node with updated remaining trees, length and path, No need to assign
                TreeNode(new_remaining_tree_dict,self.length   1,self.path   [tree])

        # if we reach to max_length we add path to our global path variable
        else:
            all_paths.append(self.path)







initial_remaining_trees = {
    'QB':3,
    'RB':11,
    'WR':11,
    'TE':5,
}

cur_time = time()

# We create root tree with initial values
root = TreeNode(remaining_tree_dict = initial_remaining_trees, length = 0, path = [])

print(time() - cur_time)
print(all_paths)

so if we set max_length_of_permutation = 6 it is really fast.I tried a little bit larger different lengths also mathematically calculated number of permutations here is the results:

# 11 -> 2853248     num, 3.5 sec
# 12 -> 10048638    num, 11 sec
# 13 -> 34615984    num, 40 sec
# 14 -> 116607556   num, 242 sec
# 15 -> 384175688   num, ?
# 16 -> 1238482830  num, ?
# 17 -> 3908904792  num, ?
# 18 -> 12083761976 num, ?

So we can estimate it will take hours to complete. And i also calculated that since number of permutations for 18 is 12083761976, in best algorithm let's say we calculate every permutation O(1) time , adding this permutations to store in a list will take roughly 10 minutes.So don't expect really fast algorithm

I hope this is helpful

  • Related