I would like to store a dictionary in a vector of lists. Each list contains all words that start with the same letter of the alphabet (e.g. ananas, apple). My problem is that I cannot read any words starting with "z" into the list using my const char* array. Could someone explain why this happens and how to fix it? Is there a way to do this with const char*? Thank you!
#include <iostream>
#include <list>
#include <vector>
#include <iterator>
#include <algorithm>
#include <string>
#include <fstream>
std::pair<bool, std::vector<std::list<std::string>> > loadwithList()
{
    const char* prefix = "abcdefghijklmnopqrstuvwxyz";
    std::vector<std::list<std::string>> dictionary2;
    std::ifstream infile("/Users/User/Desktop/Speller/Dictionaries/large", std::ios::in);
    if (infile.is_open())
    {
        std::list<std::string> data;
        std::string line;
        while (std::getline(infile, line))
        {
            if (line.starts_with(*prefix) && *prefix != '\0')
            {
                data.push_front(line);
            }
            else
            {
                dictionary2.push_back(data);
                data.clear();
                prefix++;
            }
        }
        infile.close();
        return std::make_pair(true, dictionary2);
    }
    std::cout << "Cant find file\n";
    return std::make_pair(false, dictionary2);
}
int main()
{
    auto [loaded, dictionary2] = loadwithList();
    if (!loaded) return 1;
}
CodePudding user response:
You lose the first word of each letter after 'a'. When you reach the first word of the next letter, the condition if (line.starts_with(*prefix) && *prefix != '\0') fails. Only then do you advance to the next letter, but the loop also moves on to the next word, so the word that triggered the switch is never stored.
You lose the whole letter 'z' because, after the last line of your file, the condition has just succeeded, the loop while (std::getline(infile, line)) terminates, and the final dictionary2.push_back(data); never runs.
CodePudding user response:
The answer has already been given and the problems have been explained.
Basically, you would need a doubly nested loop: the outer loop reads word by word, and the inner loop checks each word against every character in "prefix". That is a lot of looping and not very efficient. It would be better to store the data in a std::map in the first place. If you really need a std::vector of std::lists, the data can then be moved into it. We take care to use only lowercase alphabetic characters as keys of the std::map.
For test purposes I loaded a list of words from here. There are roughly 450,000 words in this list, and I used it for my demo program.
Please see one possible solution below:
#include <iostream>
#include <fstream>
#include <map>
#include <list>
#include <vector>
#include <utility>
#include <string>
#include <cctype>
std::pair<bool, std::vector<std::list<std::string>> > loadwithList() {
    std::vector<std::list<std::string>> data{};
    bool resultOK{};
    // Open the file and check whether it could be opened
    if (std::ifstream ifs{ "r:\\words.txt" }; ifs) {
        // Here we will store the dictionary, keyed by lowercase first letter
        std::map<char, std::list<std::string>> dictionary{};
        // Fill the dictionary: read the complete file and group by first character
        for (std::string line{}; std::getline(ifs, line); ) {
            // Skip empty lines
            if (line.empty()) continue;
            // Cast to unsigned char: passing a negative char to <cctype> functions is undefined
            const unsigned char first = static_cast<unsigned char>(line.front());
            // Store only words that start with an alphabetic character,
            // using the lowercase start character as the map key
            if (std::isalpha(first))
                dictionary[static_cast<char>(std::tolower(first))].push_back(line);
        }
        // Reserve space for the resulting vector
        data.reserve(dictionary.size());
        // Move the lists into the vector (std::map iterates in key order)
        for (auto& [letter, words] : dictionary)
            data.push_back(std::move(words));
        // All good
        resultOK = true;
    }
    else
        std::cerr << "\n\n*** Error: Could not open source file\n\n";
    // And give back the result
    return { resultOK, data };
}
int main() {
    auto [result, data] = loadwithList();
    if (result)
        // Every list holds at least one word, so front() is safe here
        for (const std::list<std::string>& wordList : data)
            std::cout << (char)std::tolower(static_cast<unsigned char>(wordList.front().front()))
                      << " has " << wordList.size() << "\twords\n";
}