Issues removing words from a list in Python

Time:02-14

I'm building a Wordle solver. Basically, it removes words from a list if they don't have specific characters, or don't have them at specific locations. I'm not concerned about the statistics for optimal choices yet.

When I run the code below (I think all relevant sections are included), my output clearly shows that it found a letter in the same position as in the 'word of the day'. But on the next iteration, it chooses a word that doesn't have that letter in that position, when it should only select from the remaining words.

Are words not actually being removed? Or is there something shadowing a scope I can't find? I've rewritten whole sections, with the exact same problem happening.

#Some imports and reading the word list here. 

def word_compare(word_of_the_day, choice_word):
    results = []
    index = 0
    letters[:] = choice_word
    for letter in letters:
        if letter is word_of_the_day[index]:
            results.append((letter, 2, index))
        elif letter in word_of_the_day:
            results.append((letter, 1, index))
        else:
            results.append((letter, 0, index))
        index += 1
    print("\nIteration %s\nWord of the Day: %s,\nChoice Word: %s,\nResults: %s" % (
        iteration, word_of_the_day, choice_word, results))
    return results


def remove_wrong_words():
    for item in results:
        if item[1] == 0:
            for word in words:
                if item[0] in word:
                    words.remove(word)
    for item in results:
        if item[1] == 2:
            for word in words:
                if word[item[2]] != item[0]:
                    words.remove(word)
    print("Words Remaining: %s" % len(words))
    return words


words, letters = prep([])
# choice_word = best_word_choice()
choice_word = "crane"
iteration = 1
word_of_the_day = random.choice(words)

while True:
    if choice_word == word_of_the_day:
        break
    else:
        words.remove(choice_word)
        results = word_compare(word_of_the_day, choice_word)
        words = remove_wrong_words()
        if len(words) < 10:
            print(words)
        choice_word = random.choice(words)
        iteration += 1

Output I'm getting:

Iteration 1
Word of the Day: stake,
Choice Word: crane,
Results: [('c', 0, 0), ('r', 0, 1), ('a', 2, 2), ('n', 0, 3), ('e', 2, 4)]
Words Remaining: 386

Iteration 2
Word of the Day: stake,
Choice Word: lease,
Results: [('l', 0, 0), ('e', 1, 1), ('a', 2, 2), ('s', 1, 3), ('e', 2, 4)]
Words Remaining: 112

Iteration 3
Word of the Day: stake,
Choice Word: paste,
Results: [('p', 0, 0), ('a', 1, 1), ('s', 1, 2), ('t', 1, 3), ('e', 2, 4)]
Words Remaining: 81

Iteration 4
Word of the Day: stake,
Choice Word: spite,

... This continues for a while until solved. In this output, 'a' is found to be in the correct place (value of 2 in the tuple) on the second iteration. This should remove all words from the list that don't have 'a' as the third character. Instead 'paste' and 'spite' are chosen for later iterations from that same list, instead of having been removed.

CodePudding user response:

I think one of your issues is the following line: if letter is word_of_the_day[index]:. This should be ==, not is, because is checks whether the two objects being compared are the same object (i.e. have the same id()), not whether they have the same value. Whether two equal strings happen to be the same object is an implementation detail: CPython tends to reuse single-character strings, which would explain why your output still contains tuples with a value of 2, but you shouldn't rely on that. There may be more going on, but I'd like a concrete example to run before digging in further.
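
For illustration, here is a minimal sketch of the comparison written with == and enumerate. The name score_guess is hypothetical (it is not in your code), and it takes both words as arguments instead of relying on the global letters list, which is my assumption about what you intended:

```python
def score_guess(word_of_the_day, choice_word):
    """Score each letter: 2 = right position, 1 = elsewhere in the word, 0 = absent.

    Hypothetical rewrite of word_compare from the question.
    """
    results = []
    for index, letter in enumerate(choice_word):
        if letter == word_of_the_day[index]:    # compare values, not identity
            results.append((letter, 2, index))
        elif letter in word_of_the_day:
            results.append((letter, 1, index))
        else:
            results.append((letter, 0, index))
    return results


# A string assembled at runtime is equal to the literal by value;
# whether it is also the same object is up to the interpreter.
built_at_runtime = "".join(["sta", "ke"])
print(built_at_runtime == "stake")      # True

print(score_guess("stake", "crane"))
# [('c', 0, 0), ('r', 0, 1), ('a', 2, 2), ('n', 0, 3), ('e', 2, 4)]
```

The result matches the tuples from your first iteration, so the scoring itself seems fine once the comparison no longer depends on object identity.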

CodePudding user response:

Your issue has to do with removing an item from a list while you iterate over it. This often results in skipping later values, as the list iteration is being handled by index, under the covers.

Specifically, the problem is here (and probably in the other loop too):

for word in words:
    if item[0] in word:
        words.remove(word)

If the if condition is true for the first word in the words list, the second word will not be checked. That's because when the for loop asks the list iterator for the next value, it's going to yield the second value of the list as it now stands, which is going to be the third value from the original list (since the first one is gone).
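
A small, self-contained demonstration of the skipping, using a made-up four-word list rather than your real word list:

```python
words = ["crane", "crate", "stake", "craze"]

# Remove every word containing "c" while iterating over the same list.
for word in words:
    if "c" in word:
        words.remove(word)

# "crate" slid into index 0 when "crane" was removed, so the loop,
# which had already moved on to index 1, never looked at it.
print(words)  # ['crate', 'stake']
```

Even though "crate" contains a "c", it survives, which is the same kind of leftover you're seeing with 'paste' and 'spite'.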

There are a few ways you could avoid this problem.

One approach is to iterate on a copy of the list you're going to modify. This means that the iterator won't ever skip over anything, since the copied list is not having anything removed from it as you go (only the original list is changing). A common way to make the copy is with a slice:

for word in words[:]:       # iterate on a copy of the list
    if item[0] in word:
        words.remove(word)  # modify the original list here

Another option is to build a new list full of the valid values from the original list, rather than removing the invalid ones. A list comprehension is often good enough for this:

words = [word for word in words if item[0] not in word]

This may be slightly complicated in your example because you're using global variables. You would either need to change that design (and e.g. accept a list as an argument and return the new version), or add a global words statement to let the function's code rebind the global variable (rather than modifying it in place).
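
As a rough sketch of the first option, the function below takes the word list and the results as arguments and returns a new list. The name filter_words and the sample candidates are hypothetical; the rules simply mirror the two loops in your remove_wrong_words (a status of 1 is ignored, as in your code):

```python
def filter_words(words, results):
    """Return a new list without the words ruled out by results.

    Hypothetical replacement for remove_wrong_words; it never mutates
    the list it is given.
    """
    remaining = words
    for letter, status, index in results:
        if status == 0:
            # Letter is absent: drop every word that contains it.
            remaining = [w for w in remaining if letter not in w]
        elif status == 2:
            # Letter is in the right place: keep only words that match there.
            remaining = [w for w in remaining if w[index] == letter]
    return remaining


results = [("c", 0, 0), ("r", 0, 1), ("a", 2, 2), ("n", 0, 3), ("e", 2, 4)]
candidates = ["stake", "paste", "spite", "crane", "slate"]
candidates = filter_words(candidates, results)
print(candidates)  # ['stake', 'slate']
```

In your main loop this would be called as words = filter_words(words, results), so no global statement is needed.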
