Home > OS >  How to get all dictionary words from a list of letters?
How to get all dictionary words from a list of letters?

Time:11-25

I have an input string, like "fairy", and I need to get the English words that can be formed from it. Here's an example:

5: Fairy
4: Fray, Airy, Fair, Fiar
3: Fay, fry, arf, ary, far, etc.

I have an std::unordered_set<std::string> of dictionary words so I can easily iterate over it. I've created permutations before as shown below:

std::unordered_set<std::string> permutations;

// Finds every permutation (non-duplicate arrangement of letters)
std::sort(letters.begin(), letters.end());
do {
    // Check if the word is a valid dictionary word first
    permutations.insert(letters);
} while (std::next_permutation(letters.begin(), letters.end()));

That's perfect for length 5. I can check each letters to see if it matches and I end up with "fairy", which is the only 5 letter word that can be found from those letters.

How could I find words of a smaller length? I'm guessing it has to do with permutations as well, but I wasn't sure how to implement it.

CodePudding user response:

You can keep an auxiliary data structure and add a special symbol to mark an end-of-line:

#include <algorithm>
#include <string>
#include <set>
#include <list>
#include <iostream>
 
int main()
{
    std::list<int> l = {-1, 0 ,1, 2, 3, 4};
    std::string s = "fairy";
    std::set<std::string> words;
    do {
        std::string temp = "";
        for (auto e : l)
            if (e != -1) temp  = s[e];
            else break;
        words.insert(temp);
    } while(std::next_permutation(l.begin(), l.end()));
}

Here the special symbol is -1

CodePudding user response:

Okay, you have to ask yourself a question. Can you reuse letters? For instance, if you're given the word friend, is fee legal? Friend has 1 e and fee has 2. That's an important but minor detail.

Algorithm 1: Brute Force

You can iterate over your entire list of possible words and write a method "does this word consist only of letters in this other word"? If so, add it to your final list.

That algorithm changes very slightly based on your answer to the first question, but it's not hard to write.

Algorithm 2: Recursive Approach

Create a method addWords().

/**
 * letters is the list of letters you're allowed to use
 * word may not be empty
 */
void addWords(string letters, string word) {
    size_t length = word.length();

    for (int index = 0; index < length;   index) {
         string newWord = word   letters[index];
         string remainingLetters = letters.substr(0, index)   letters(index   1);
         // if newword is in your dictionary, add it to the output
         ...
         addWords(remainingLetters, newWord);
    }
}

Let's look how this works with addWords("fairy", "") --

First loop: add f to the empty string and check if f is a word. Then recurse with addWords("airy", f"). We'll look at recursion shortly.

Second loop: add a to the empty string and check if a is a word. It is, so we'll add it to the output and recurse with addWords("firy", "a").

Repeat, checking each one-letter word (5 times total).

Now, let's look at one level of recursion -- addWords("airy", "f"). Now, we're going to try in order fa, fi, etc. Then we'll recurse again with something like addWords("iry", "fa") (etc).

From recursing the second loop, we would try words beginning with a instead of f. So we would end up testing af, ai, etc.

This works if the answer to your first question is "no, we don't reuse letters". This method does NOT work if the answer is yes.

CodePudding user response:

Algorithmically, you can use something like a counter to generate all subsets of your word, and then find all the permutations:

For example:

00000 ---> null
00001 ---> y
00010 ---> r
00011 ---> ry
00100 ---> i
00101 ---> iy
...
11110 ---> Fair
11111 ---> Fairy

Note: Now, do your permutation function for each word to generate other orders of the chars. See here for the permutation.

For implementing the counter, use something like a boolean array, and change the lowest bit, and update others if it needs. In each level, choose those "chars" that their indices are true in your boolean array.

  • Related