Compare and iterate over two 2D arrays

I have two 2D arrays like these:

mix = [[1,'Blue'],[2,'Black'],[3,'Black'],[4,'Red']]

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
         [2,'Green'],[2,'Black'],
         [3,'Black'],[3,'Pink'],
         [4,'White'],[4,'Blue'],[4,'Yellow'],
         [5,'Purple'],[5,'Blue']
        ]

I want to loop through the possibilities list, find the exact index at which it matches the mix list, then append the correct mix to a new list. If it does not match the mix list, append it to a "bad list" and then move on to the next iteration. For now this is the idea I had. Note that it totally DOES NOT work! :)

i = 0
j = 0
bad_guess = []
correct_guess = []
while i < len(mix):
    while possibilities[j] != mix[i]:
        j += 1
    if possibilities[j] == mix[i]:
        i += 1
        correct_guess.append(possibilities[j])
        j = 0
    elif possibilities[j] != mix[i]:
        bad_guess.append(mix[i])
        break

Output:

Basically, the output I would want for this example is:

correct_guess = [[1,'Blue'],[2,'Black'],[3,'Black'],[5,'Purple']]
bad_guess = [4,'Red']

CodePudding user response：

There are a number of ways of solving this. The simple way is to loop through the lists, but that's not very pythonic. You can remove the inner loop using a containment check:

bad_guess = []
correct_guess = []
for item in mix:
    if item in possibilities:
        correct_guess.append(item)
    else:
        bad_guess.append(item)

The in operator is going to do a linear search through possibilities at every iteration. For a small list like this, it's probably fine, but for something larger, you will want a faster lookup.

Sets offer faster lookup. Unfortunately, sets cannot contain non-hashable types such as lists. The good news is that they can contain tuples:

mix = [tuple(item) for item in mix]
possibilities = {tuple(item) for item in possibilities}
bad_guess = []
correct_guess = []
for item in mix:
    if item in possibilities:
        correct_guess.append(item)
    else:
        bad_guess.append(item)
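
The set-based lookup above replaces mix and possibilities with tuple versions. If the original lists need to stay as they are, the same idea can be wrapped in a small helper. This is only a sketch; the function name split_guesses is hypothetical and not part of the question.

```python
def split_guesses(mix, possibilities):
    """Split mix into (correct, bad) using a set of tuples for fast lookup."""
    # Lists are unhashable, so each pair is converted to a tuple for the set.
    allowed = {tuple(item) for item in possibilities}
    correct, bad = [], []
    for item in mix:
        # Convert only for the membership test; the original item is kept.
        if tuple(item) in allowed:
            correct.append(item)
        else:
            bad.append(item)
    return correct, bad


mix = [[1, 'Blue'], [2, 'Black'], [3, 'Black'], [4, 'Red']]
possibilities = [
    [1, 'Black'], [1, 'Red'], [1, 'Blue'], [1, 'Yellow'],
    [2, 'Green'], [2, 'Black'],
    [3, 'Black'], [3, 'Pink'],
    [4, 'White'], [4, 'Blue'], [4, 'Yellow'],
    [5, 'Purple'], [5, 'Blue'],
]

correct_guess, bad_guess = split_guesses(mix, possibilities)
print(correct_guess)  # [[1, 'Blue'], [2, 'Black'], [3, 'Black']]
print(bad_guess)      # [[4, 'Red']]
```

Building the set costs one pass over possibilities, after which each membership test is O(1) on average.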

Another way to get the same result is to first sort mix by whether an item appears in possibilities or not, and then use itertools.groupby to create the output lists. This approach is fun to parse, but is not particularly legible, and therefore not recommended:

import itertools
# False sorts before True, so the non-matching group comes first
key = lambda item: item in possibilities
bad_guess, correct_guess = (list(g) for k, g in itertools.groupby(sorted(mix, key=key), key=key))

This last method has higher algorithmic complexity than the set-based loop: sorting is an O(N log N) operation, whereas the loop performs N set lookups at O(1) each on average, for O(N) overall.
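
One caveat with the groupby approach: it unpacks exactly two groups, so it raises ValueError when every item matches or none does. A hedged sketch of a variant that tolerates an empty group follows. The helper name group_guesses is hypothetical, and possibilities is shortened here for brevity.

```python
import itertools


def group_guesses(mix, possibilities):
    """Return (bad, correct) lists; either may be empty."""
    def is_match(item):
        return item in possibilities

    # Start with empty lists so a missing group does not break unpacking.
    groups = {False: [], True: []}
    ordered = sorted(mix, key=is_match)  # False sorts before True
    for matched, group in itertools.groupby(ordered, key=is_match):
        groups[matched] = list(group)
    return groups[False], groups[True]


# Shortened set of allowed pairs, for illustration only.
possibilities = {(1, 'Blue'), (2, 'Black'), (3, 'Black'), (4, 'White')}
mix = [(1, 'Blue'), (2, 'Black'), (3, 'Black'), (4, 'Red')]

bad_guess, correct_guess = group_guesses(mix, possibilities)
print(correct_guess)  # [(1, 'Blue'), (2, 'Black'), (3, 'Black')]
print(bad_guess)      # [(4, 'Red')]

# Every item matches here, so the bad list is simply empty.
print(group_guesses([(1, 'Blue')], possibilities))  # ([], [(1, 'Blue')])
```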

CodePudding user response：

This should do the job:

mix = [[1,'Blue'],[2,'Black'],[3,'Black'],[4,'Red']]

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
         [2,'Green'],[2,'Black'],
         [3,'Black'],[3,'Pink'],
         [4,'White'],[4,'Blue'],[4,'Yellow'],
         [5,'Purple'],[5,'Blue']
        ]

bad_guess = []
correct_guess = []
for i in mix:
    if i in possibilities:
        correct_guess.append(i)
    else:
        bad_guess.append(i)

CodePudding user response：

mix = {1:'Blue' , 2:'Black' , 3:'Black' , 4:'Red'}

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
     [2,'Green'],[2,'Black'],
     [3,'Black'],[3,'Pink'],
     [4,'White'],[4,'Blue'],[4,'Yellow'],
     [5,'Purple'],[5,'Blue']
    ]

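good_guess = []
bad_guess = []

for poss in possibilities:
    # .get() avoids a KeyError for numbers absent from mix, such as 5
    if mix.get(poss[0]) == poss[1]:
        if poss not in good_guess:
            good_guess.append(poss)
    elif poss not in bad_guess:
        bad_guess.append(poss)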

You could try making the mix list a dictionary instead, so you don't need to iterate over it.
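
If mix starts out as a list of pairs, as in the question, it does not have to be retyped as a dictionary. A minimal sketch, assuming each number appears only once in mix (a repeated number would silently keep only its last color); the name mix_by_number is made up for illustration.

```python
mix = [[1, 'Blue'], [2, 'Black'], [3, 'Black'], [4, 'Red']]

# dict() accepts any iterable of two-item sequences as key/value pairs.
mix_by_number = dict(mix)

print(mix_by_number)         # {1: 'Blue', 2: 'Black', 3: 'Black', 4: 'Red'}
print(mix_by_number.get(5))  # None, because 5 is not a key
```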

CodePudding user response：

As is often the case in Python, you don't actually have to muck about with indices at all to do this.

First, here is a simple solution using your existing data structure. Iterate over mix, and append each item to the appropriate list, depending on whether it's in possibilities or not. (This is the same idea presented in this answer to "How to split a list based on a condition?")

mix = [
    [1, 'Blue'],
    [2, 'Black'],
    [3, 'Black'],
    [4, 'Red'],
]

possibilities = [
    [1, 'Black'], [1, 'Red'], [1, 'Blue'], [1, 'Yellow'],
    [2, 'Green'], [2, 'Black'],
    [3, 'Black'], [3, 'Pink'],
    [4, 'White'], [4, 'Blue'], [4, 'Yellow'],
    [5, 'Purple'], [5, 'Blue'],
]

correct_guesses = []
bad_guesses = []

for item in mix:
    if item in possibilities:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

print(correct_guesses)
print(bad_guesses)

Output:

[[1, 'Blue'], [2, 'Black'], [3, 'Black']]
[[4, 'Red']]

However, this does a lot of unnecessary looping. Each time you check item in possibilities, the code has to iterate over possibilities (which is a list) to see whether or not item is there.

As others have commented, the issue here is your data structure. Instead of a list, possibilities could be a dictionary. Checking whether a dictionary has a given key, or accessing the value associated with a given key, is O(1) on average; essentially it's "instant" instead of having to go look for it.

possibilities = {
    1: ['Black', 'Red', 'Blue', 'Yellow'],
    2: ['Green', 'Black'],
    3: ['Black', 'Pink'],
    4: ['White', 'Blue', 'Yellow'],
    5: ['Purple', 'Blue']
}

Here each key is an integer, and each value is a list of the colors that number allows. Then your for loop would look like this, checking whether the color is one allowed for that number:

for item in mix:
    number, color = item
    if color in possibilities[number]:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

Do you see the problem, though? We're still doing the same thing: using in on a list. We could turn each of those lists into a set instead, which can much more efficiently check whether or not it contains something:

possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    2: {'Green', 'Black'},
    3: {'Black', 'Pink'},
    4: {'White', 'Blue', 'Yellow'},
    5: {'Purple', 'Blue'}
}

Checking whether a set contains an item is much faster than for a list. The for loop would remain the same.
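
If possibilities already exists as a list of pairs, the dictionary of sets can be built from it rather than typed by hand. This is one possible sketch using collections.defaultdict; the names pairs and grouped are chosen here for illustration.

```python
from collections import defaultdict

pairs = [
    [1, 'Black'], [1, 'Red'], [1, 'Blue'], [1, 'Yellow'],
    [2, 'Green'], [2, 'Black'],
    [3, 'Black'], [3, 'Pink'],
    [4, 'White'], [4, 'Blue'], [4, 'Yellow'],
    [5, 'Purple'], [5, 'Blue'],
]

# Collect the colors for each number into a set.
grouped = defaultdict(set)
for number, color in pairs:
    grouped[number].add(color)

# Convert to a plain dict so that looking up a missing number raises
# KeyError instead of silently creating an empty entry.
possibilities = dict(grouped)

print(sorted(possibilities))       # [1, 2, 3, 4, 5]
print('Blue' in possibilities[4])  # True
print('Red' in possibilities[4])   # False
```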

With all that in mind, here's a complete solution. I've also changed the two-item lists to tuples, which makes no functional difference in this case, but is more idiomatic.

mix = [
    (1, 'Blue'),
    (2, 'Black'),
    (3, 'Black'),
    (4, 'Red'),
]

possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    2: {'Green', 'Black'},
    3: {'Black', 'Pink'},
    4: {'White', 'Blue', 'Yellow'},
    5: {'Purple', 'Blue'}
}

correct_guesses = []
bad_guesses = []

for item in mix:
    number, color = item
    if color in possibilities[number]:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

print(correct_guesses)
print(bad_guesses)

Output:

[(1, 'Blue'), (2, 'Black'), (3, 'Black')]
[(4, 'Red')]
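
The complete solution above assumes every number in mix is a key in possibilities; otherwise possibilities[number] raises KeyError. If that cannot be guaranteed, one option is dict.get with an empty default, so unknown numbers count as bad guesses. A sketch under that assumption, where (6, 'Green') is a made-up entry added to show the behavior:

```python
possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    2: {'Green', 'Black'},
    3: {'Black', 'Pink'},
    4: {'White', 'Blue', 'Yellow'},
    5: {'Purple', 'Blue'},
}

# (6, 'Green') is hypothetical: its number has no key in possibilities.
mix = [(1, 'Blue'), (4, 'Red'), (6, 'Green')]

correct_guesses = []
bad_guesses = []

for item in mix:
    number, color = item
    # .get() returns an empty set for unknown numbers instead of raising.
    if color in possibilities.get(number, set()):
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

print(correct_guesses)  # [(1, 'Blue')]
print(bad_guesses)      # [(4, 'Red'), (6, 'Green')]
```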