C++ Storing char input of words into a matrix

Time:04-11

I am having difficulty in splitting an inputted paragraph into separate words, and storing each word into a row of an array:

For example, given the text input 'my name is john' (assume input is always lowercase), I would like to store each word in a separate row:

    0 1 2 3 4 5 .... 30
0   m y
1   n a m e
2   i s
3   j o h n 
.
.
.
(w)

However, my output is significantly different, and I suspect I have lost my way in my use of loops.

#include <iostream>
#include <fstream>

using namespace std;

int main(){
    
    char c[100];
    int i, j, z, t /*text size array*/, w /*number of words*/;
    
    
    cin.getline(c,100);
    
    for(i=0; c[i]!=' '; i++); //calculating spaces
    cout << "No. of spaces = " << i << endl;
    w = i + 1; //number of words

    for(t=0; c[t]!='\0'; t++);
    cout << t << endl; //total characters, the size of the array we need going forward 
    
    int k[t]; //array where the indexes of the spaces in c will be tracked
    
    for(int i=0; i<=t; i++){
        if(c[i]!=32){ //32 is the ASCII code for a space bar
            k[i] = i;  
            cout<<k[i]<<endl;
        }
    }
    
    char words[w][30]; //30 max characters in a word
    
    for(int i=0; i < w; i++){ //we have 5 words here, so 5 columns
        for(int j=0; j<t;j++){ //going through all the characters to find the spaces again
            if(j != k[j-1] + 1){ //here is the index numbers don't match, then theres a space
                
                //im not quite sure as to how to keep track of the indexes here as we need to calculate the difference between the next space and last space also
                for(int z=0; z<=j; z++){ //filling the matrix
                   words[i][z] = c[z]; //column = i, the word number. row z tracks the index number where the character in the initial input at index z, is stored.
                    cout<<words[i][z]; 
                }
            }
        }
    }

}

I would appreciate any sort of help. I recognise this is very messy code; however, I want to be able to apply the concept of matrices to it for the overall purpose (this is still the beginning of the project).

Please note, I can only use the iostream and fstream libraries here. The couts are irrelevant; they are just there to help me keep track of my code while debugging.

Thanks!

CodePudding user response:

As Maker R noted in a comment below the post, it would be really helpful if you could be a little more specific on the task itself. If it is an online task, sharing the text of it would be pretty helpful.

As a start, I don't see you using fstream anywhere in the code you shared, so I'm going to presume you're more interested in the loop itself.

If the goal of the task is to just store each word on a new row of nxn char array, then this would do the job:

    const short inputMaxSize = 100, maxRows = 31; // 30 characters + 1 for '\0'
    char input[inputMaxSize], result[maxRows][maxRows];

    cout << "Enter string: \n";
    cin.getline(input, inputMaxSize);

    int row = 0, col = 0, currentIndex = 0;

    while (input[currentIndex] != '\0') { // loop until the terminating null, which .getline appends automatically
        if (input[currentIndex] == ' ') { // a space marks the end of a word
            result[row][col] = '\0'; // terminate the word so the row can be used as a C-string later
            row++; // increment to move on to the next row of the matrix
            currentIndex++; // so the next iteration of the while loop starts from the next element
            col = 0; // reset the column index of result
            cout << '\n';
            continue;
        }
        // else case
        result[row][col] = input[currentIndex]; // save the current char from input on the current row
        cout << result[row][col] << ' '; // printing with spaces
        currentIndex++;
        col++;
    }
    result[row][col] = '\0'; // terminate the last word, which is not followed by a space

As it stores the characters, it also prints them to the console, separated by spaces.

If you want the case where you save the 'spaces' to the char array itself, you would have to be a bit more specific on the edge cases. Do you want the last character to have a space after it or not? If so, you could just do

result[row][col] = input[currentIndex];
result[row][col + 1] = ' '; // adds space after each element

But, of course, you also have to check the bounds of the array!
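
As a rough illustration of what that bounds checking might look like, here is a minimal sketch. The names `splitChecked`, `MAX_WORDS` and `MAX_LEN` are made up for this example, and the limits are assumed to match the 31x31 array above. It needs no headers, so it stays within the iostream/fstream restriction.

```cpp
// Hypothetical limits, assumed to match the 31x31 array used above.
const int MAX_WORDS = 31; // rows available
const int MAX_LEN = 31;   // columns per row, including the '\0'

// Copies space-separated words from input into result, one word per row.
// Words that do not fit in a row are truncated; words beyond MAX_WORDS are dropped.
// Returns the number of rows that were filled.
int splitChecked(const char* input, char result[MAX_WORDS][MAX_LEN])
{
    int row = 0, col = 0;
    for (int i = 0; input[i] != '\0' && row < MAX_WORDS; i++) {
        if (input[i] == ' ') {
            if (col > 0) {              // ignore repeated or leading spaces
                result[row][col] = '\0';
                row++;
                col = 0;
            }
        } else if (col < MAX_LEN - 1) { // keep the last slot free for '\0'
            result[row][col] = input[i];
            col++;
        }
    }
    if (row < MAX_WORDS && col > 0) {   // the last word has no trailing space
        result[row][col] = '\0';
        row++;
    }
    return row;
}
```

Truncating overlong words is only one possible policy; reporting an error instead might suit the task better.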

If you want spaces beneath them as well, you would have to make every even row a blank row filled with spaces, and so on.

If you want numbers in the array too (representing indexes, just like in the example you showed), you would have to add their char equivalents to the array, meaning an extra iteration for the column indexes, plus a row index written at the start of each row (roughly speaking).

p.s. I don't see you using dynamic memory (the new operator) so I'm just keeping it static by limiting the inputs.

CodePudding user response:

What is a matrix?

In computer science, a matrix is nothing more than an array of arrays. It can be a std::vector<std::vector<char>> or, like in my first implementation, a std::vector<std::string> (a std::string is nothing more than a null-terminated array of characters).

A matrix may be stored in row- or column-major order. The order that is best suited for your case is clearly row-major, meaning matrix[0][4] accesses the element in the first row and fifth column (i.e. the fifth character of the first word).

https://en.wikipedia.org/wiki/Row-_and_column-major_order
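
To make the row-major idea concrete, here is a small sketch that stores the matrix in one flat buffer. `COLS`, `flatIndex`, `getAt` and `setAt` are hypothetical names used only for illustration; a built-in `char words[w][31]` array is laid out the same way by the compiler.

```cpp
// Hypothetical row width: 30 characters plus the '\0'.
const int COLS = 31;

// Row-major order: rows are stored one after another, so element
// (row, col) sits row * COLS + col positions from the start.
int flatIndex(int row, int col)
{
    return row * COLS + col;
}

// Reads the character at (row, col) from a flat row-major buffer.
char getAt(const char* flat, int row, int col)
{
    return flat[flatIndex(row, col)];
}

// Writes a character at (row, col) in a flat row-major buffer.
void setAt(char* flat, int row, int col, char value)
{
    flat[flatIndex(row, col)] = value;
}
```

With this layout, all characters of one word are adjacent in memory, which is why row-major suits storing one word per row.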

Separating strings into words

We can then use the following algorithm:

  • Keep track of two indices/iterators (I'll be using iterators), one for the beginning of the word (named begin), the other for the end (one character after the last character, named end);
  • Then, go through each character of the input string, with the following steps:
  1. If the end iterator points at a separating character, store the [begin, end) word. Then, make the begin iterator point to one character after the end iterator (skipping the separating character). My implementation also ignores empty words (when the [begin, end) range is empty), so runs of consecutive separating characters do not produce blank words.
  2. Increment the end iterator to include a new character at the end of the [begin, end) word.
  • Store the [begin, end) word for the last time. This is necessary because the string rarely ends with a separating character, which was the only condition to store a word.

Implementation using modern C++

#include <iostream>
#include <vector>
#include <string>

bool separatesWords(char c)
{   
    // Edit this if you want more separating characters
    // e.g. tabs, punctuation, etc.
    // (std::isblank and std::isspace might help)
    return c == ' ';
}

std::string getInput() // Get user input
{
    std::string line;
    std::getline(std::cin, line);
    return line;
}

std::vector<std::string> splitIntoWords(const std::string& str)
{
    std::vector<std::string> words;
    // Create iterators
    auto wordBegin(str.cbegin());
    auto wordEnd(str.cbegin());

    while (wordEnd != str.cend()) // Never dereference the end iterator
    {
        if (separatesWords(*wordEnd)) // If character can separate words
        {
            if (wordBegin < wordEnd) // At least one character to make a word
                words.emplace_back(wordBegin, wordEnd); // Store the word

            wordBegin = wordEnd + 1; // Update begin iterator
        }

        ++wordEnd; // Update end iterator
    }

    if (wordBegin < wordEnd) // At least one character to make a word
        words.emplace_back(wordBegin, wordEnd); // Store the word

    return words;
}
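
The comment in separatesWords hints at std::isspace for broader separators. A possible variant is sketched below; `isWordSeparator` is a hypothetical name, and the sketch assumes the `<cctype>` header is allowed, which goes beyond the iostream/fstream restriction in the question.

```cpp
#include <cctype>

// Hypothetical replacement for separatesWords: treats any whitespace
// or punctuation character as a word separator.
bool isWordSeparator(char c)
{
    // The <cctype> functions expect a value representable as unsigned char
    // (or EOF), so cast first to avoid undefined behaviour on negative chars.
    const unsigned char uc = static_cast<unsigned char>(c);
    return std::isspace(uc) != 0 || std::ispunct(uc) != 0;
}
```

Note that treating punctuation as a separator also splits on apostrophes, so "don't" would become two words; whether that is acceptable depends on the task.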

Implementation without STL features

I used std::string and std::vector, which you asked not to do. To transform the code accordingly, the following changes are required:

  • Use a C-style array instead of a std::vector
  • Use C-style strings instead of std::string
  • Use indices instead of iterators

Please do note that not using more modern approaches is a poor design decision as it often results in messier, less concise and less maintainable code.

#include <iostream>

constexpr size_t MAX_WORD_COUNT = 10;
constexpr size_t MAX_WORD_SIZE = 30;

bool separatesWords(char c)
{   
    // Edit this if you want more separating characters
    // e.g. tabs, punctuation, etc.
    // (std::isblank and std::isspace might help)
    return c == ' ';
}

void getInput(char* input)
{
    std::cin.getline(input, MAX_WORD_COUNT * MAX_WORD_SIZE);
}

void splitIntoWords(const char* input, char words[MAX_WORD_COUNT][MAX_WORD_SIZE])
{
    // Create indices
    size_t wordBegin(0);
    size_t wordEnd(0);
    // Word counter
    size_t wordCount(0);

    while (wordCount < MAX_WORD_COUNT)
    {
        const char c = input[wordEnd];
        // Column in the current row, clamped so the last slot stays free for '\0'
        size_t col = wordEnd - wordBegin;
        if (col > MAX_WORD_SIZE - 1)
            col = MAX_WORD_SIZE - 1;

        if (c == '\0' || separatesWords(c)) // End of input or separating character
        {
            if (col > 0) // At least one character to make a word
            {
                words[wordCount][col] = '\0'; // Null-terminate the word
                ++wordCount; // Update the word count
            }
            if (c == '\0') // Nothing left to read
                break;
            wordBegin = wordEnd + 1; // Update begin index
        }
        else if (col < MAX_WORD_SIZE - 1)
            words[wordCount][col] = c; // Store one character (overlong words are truncated)

        ++wordEnd; // Update end index
    }

    // Initialize remaining rows to empty strings
    for (; wordCount < MAX_WORD_COUNT; ++wordCount)
        words[wordCount][0] = '\0';
}
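
Once the matrix is filled, each row can be read back as an ordinary C-string. The sketch below assumes unused rows are empty strings, as the final loop above intends; `WORD_COUNT`, `WORD_SIZE`, `countWords` and `printWords` are hypothetical names chosen for this example.

```cpp
#include <iostream>

// Hypothetical limits, assumed to mirror MAX_WORD_COUNT and MAX_WORD_SIZE.
const int WORD_COUNT = 10;
const int WORD_SIZE = 30;

// Counts the filled rows, assuming the first empty row marks the end.
int countWords(const char words[WORD_COUNT][WORD_SIZE])
{
    int count = 0;
    while (count < WORD_COUNT && words[count][0] != '\0')
        ++count;
    return count;
}

// Prints one word per line, prefixed with its row index.
void printWords(const char words[WORD_COUNT][WORD_SIZE])
{
    const int count = countWords(words);
    for (int row = 0; row < count; ++row)
        std::cout << row << ": " << words[row] << '\n';
}
```

Because every row is null-terminated, `std::cout << words[row]` prints just that word without a manual character loop.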