How to get word suggestions from a few known letters in Python?

Time:04-10

Let's say I have these letters, where an underscore marks an unknown letter: B_t

How can I get all the possible words from this, like Bat, Bot, Bet, Bit, etc.?

CodePudding user response:

A text with n underscores has 26**n possible completions, where 26 is the size of the alphabet. Here is one possibility, assuming the unknown characters are lowercase:

import itertools as it
import string

text = 'B_t_'
missing_chars = text.count('_')

guesses = []
text_pattern = text.replace('_', '{}')
for chars in it.product(string.ascii_lowercase, repeat=missing_chars):
    guesses.append(text_pattern.format(*chars))

print(guesses)
# ['Bata', 'Batb', 'Batc', 'Batd', 'Bate', 'Batf', ...]
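
The product above yields every letter combination rather than only real words, so the list grows as 26**n. A minimal sketch of narrowing it down, assuming a set of known words is available. The `candidate_words` helper and the `known_words` sample are hypothetical and stand in for a real dictionary:

```python
import itertools as it
import string

def candidate_words(text, known_words):
    """Fill each '_' with every lowercase letter and keep only known words."""
    text_pattern = text.replace('_', '{}')
    missing_chars = text.count('_')
    return [
        guess
        for chars in it.product(string.ascii_lowercase, repeat=missing_chars)
        if (guess := text_pattern.format(*chars)) in known_words
    ]

# Hypothetical sample vocabulary; a real run would load a full word list.
known_words = {'Bat', 'Bot', 'Bet', 'Bit', 'Ball', 'run'}
print(candidate_words('B_t', known_words))
# ['Bat', 'Bet', 'Bit', 'Bot']
```

Using a set keeps each membership check cheap, but the loop still visits all 26**n combinations, so this approach only suits patterns with a few unknown letters.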

CodePudding user response:

Using regular expressions

Code

import re

def find_matches(word_to_find):
    ' finds words in word_list that match the word_to_find pattern '
    # Convert to a regex pattern: each _ (an unknown letter) becomes \w,
    # which matches any single word character
    find_pattern = word_to_find.replace("_", r"\w")
    pattern = re.compile(find_pattern)               # compiled regular expression

    # Words in the list that match the whole pattern
    return [word for word in word_list if pattern.fullmatch(word)]

Tests

# Test list of words (more complete list at https://pypi.org/project/english-words/)
word_list = ['store', 'run', 'Bat', 'Ball', 'Bat', 'Bot', 'Bet', 'Bit', 'oven', 'go', 'warm', 'move']

print(f"Matches for B_t are: {find_matches('B_t')}")
# Output: Matches for B_t are: ['Bat', 'Bat', 'Bot', 'Bet', 'Bit']

print(f"Matches for ru_ are: {find_matches('ru_')}")
# Output: Matches for ru_ are: ['run']
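
The pattern above is case-sensitive, and `\w` also accepts digits and underscores. A possible variant, assuming unknown positions should hold ASCII letters only and case should be ignored. The name `find_matches_ignore_case` and the `sample` list are illustrative, not part of the answer:

```python
import re

def find_matches_ignore_case(word_to_find, words):
    """Return words matching the pattern, with '_' as any ASCII letter, ignoring case."""
    # Escape the known parts so characters like '.' are treated literally.
    parts = [re.escape(part) for part in word_to_find.split('_')]
    pattern = re.compile('[a-z]'.join(parts), re.IGNORECASE | re.ASCII)
    # fullmatch rejects longer words such as 'Bath' for the pattern 'B_t'.
    return [word for word in words if pattern.fullmatch(word)]

# Hypothetical sample list for illustration.
sample = ['bat', 'Bath', 'BOT', 'b2t', 'bet', 'run']
print(find_matches_ignore_case('B_t', sample))
# ['bat', 'BOT', 'bet']
```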