How to split a string into separate words and create the array of these words in C language?

Time:11-09

So, the task is the following:

Find the number of words in the text in which the first and last characters are the same.

In order to do this, I think I first should split the text and create the array of separate words.

For example, the string is:

"hello goodbye river dog level"

I want to split it and get the following array:

{"hello", "goodbye", "river", "dog", "level"}

I have the code that splits the string:

#include <stdio.h>
#include <string.h>

int main() {
   char string[100] = "hello goodbye river dog level";
   // Extract the first token
   char * token = strtok(string, " ");
   // loop through the string to extract all other tokens
   while( token != NULL ) {
      printf( " %s\n", token ); //printing each token
      token = strtok(NULL, " ");
   }
   return 0;
}

However, it only prints the words, and I need to append each word to an array. The array shouldn't have a fixed size, because the text could contain any number of words. How can I do this?

CodePudding user response:

I don't see any reason to split into words. Just iterate the string while keeping a flag that tells whether you are inside or outside a word (i.e. a state variable). Then have variables for first and last character that you maintain as you iterate. Compare them when you go out of a word or reach end-of-string.

A simple approach could look like:

#include <stdio.h>

int count(const char* s)
{
    int res = 0;
    int in_word = 0;
    char first = '\0';
    char last = '\0';

    while (*s)
    {
        if (in_word)
        {
            if (*s == ' ')
            {
                // Found end of a word
                if (first == last) ++res;
                in_word = 0;
            }
            else
            {
                // Word continues so update last
                last = *s;
            }
        }
        else
        {
            if (*s != ' ')
            {
                // Found start of new word. Update first and last
                first = *s;
                last = *s;
                in_word = 1;
            }
        }
        ++s;
    }
    if (in_word && first == last) ++res;
    return res;
}

int main(void) 
{
    char string[100] = "hello goodbye river dog level";
    printf("found %d words\n", count(string));
    return 0;
}

Output:

found 2 words

Note: The current code assumes that the word delimiter is always a space. It also doesn't handle punctuation such as , or . attached to a word, but that can be added fairly easily.
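To illustrate the note above, here is a minimal sketch of the same state-machine idea where anything that is not a letter or digit ends a word. The function name count_matching_words and the "word = run of alphanumeric characters" rule are my own assumptions, not part of the original answer; adjust the test in isalnum to whatever definition of a word the task requires.

```c
#include <ctype.h>

/* Counts words whose first and last characters match.
   Assumption: a "word" is a maximal run of alphanumeric characters,
   so spaces, commas, periods, etc. all act as delimiters. */
int count_matching_words(const char *s)
{
    int res = 0;
    int in_word = 0;
    char first = '\0';
    char last = '\0';

    for (; *s; ++s)
    {
        /* Cast to unsigned char: passing a negative char to isalnum is undefined. */
        if (isalnum((unsigned char)*s))
        {
            if (!in_word)
            {
                first = *s;   /* start of a new word */
                in_word = 1;
            }
            last = *s;        /* always track the most recent word character */
        }
        else if (in_word)
        {
            if (first == last) ++res;   /* the word just ended */
            in_word = 0;
        }
    }
    if (in_word && first == last) ++res;   /* word that runs to end of string */
    return res;
}
```

The comparison is case-sensitive, so "Anna" would not count; converting both characters with tolower before comparing would be one way to change that.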

CodePudding user response:

Here is a simple (but naive) implementation based on the existing strtok code. It doesn't just count the matching words but also records which ones were found, by storing a pointer to each of them in a separate array of pointers.

This works since strtok changes the string in-place, replacing spaces with null terminators.

#include <stdio.h>
#include <string.h>

int main(void)
{
  char string[100] = "hello goodbye river dog level";
  char* words[10]; // this is just assuming there's not more than 10 words
  size_t count=0;

  for(char* token=strtok(string," "); token!=NULL; token=strtok(NULL, " ")) 
  {
    if( token[0] == token[strlen(token)-1] ) // strlen(token)-1 gives index of last character
    {
      words[count] = token;
      count++;
    }
  }

  printf("Found: %zu words. They are:\n", count);
  for(size_t i=0; i<count; i++)
  {
    puts(words[i]);
  }
  
  return 0;
}

Output:

Found: 2 words. They are:
river
level

CodePudding user response:

With strtok, based on Alexander's code, and with punctuation added to the delimiter set:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char string[] = "hello, goodbye; river, dog; level.";
    char *token = strtok(string, " ,;.");
    int counter =0;
    while( token != NULL )
    {
        if(token[0]==token[strlen(token)-1]) counter++;
        token = strtok(NULL, " ,;.");
    }
    printf("found : %d\n", counter);

    return 0;
}