Define and return two dimensional string array from function in C

I want to use a two-dimensional string array (new_sentence) that I produced in the function named parse() in the main function (or any other function). The code I wrote is below.

strtok() and strcpy() functions work fine. I can access all elements of the new_sentence array within the parse() function. However, I could not pass this array into the main function.

I can understand that there is a problem with the return value of the parse() function and the definition inside the main function. I did some research on similar issues but couldn't figure it out. Can you help me?

#include <string.h>
#include <stdio.h>

const char *parse(char *s);

int main () {
    
   char str[1000] = "This-is-test-sentence";
   
   char *n_sentence = parse(str);
   for (int j = 0; j < 4; j++)
   {
       printf("%s\n", n_sentence[j]);
   }
   return 0;
}

const char *parse(char *s)
{
   char *delim = "-";
   char *token;
   char new_sentence[4][15];

   token = strtok(s, delim);

   int numberOfWord = 0;
   while( token != NULL ) {

      strcpy(new_sentence[numberOfWord], token);
      token = strtok(NULL, delim);
      numberOfWord++;
   }
   
   return new_sentence;
}

CodePudding user response：

This is one way to go about your problem, based on your code and my own snippets:

#include <string.h>
#include <stdio.h>

#define MAX_WORDS 1000

int parse(char *s, char *n_sentence[], int max_size);

int main () {

   char str[] = "This-is-test-sentence";
   char* n_sentence[MAX_WORDS];

   int tot_words = parse(str, n_sentence, MAX_WORDS);

   printf("Total number of words: %d\n", tot_words);
   for (int j = 0; j < tot_words; j++)
   {
       printf("%s\n", n_sentence[j]);
   }
   return 0;
}

int parse(char *s, char* new_sentence[], int max_size)
{
   char *delim = "-";
   int numberOfWord = 0;

   new_sentence[numberOfWord++] = strtok(s, delim);

   while (numberOfWord < max_size && (new_sentence[numberOfWord] = strtok(NULL, delim)) != NULL) {
      ++numberOfWord;
   }

   return numberOfWord;
}

The code declares your array of parsed strings in main and passes it to parse as an argument (to be filled with words). This avoids the undefined behavior caused by returning a pointer to a local array created in parse, which ceases to exist once parse returns.

n_sentence is now a proper array of strings - the code only specifies how many strings, at most, it can store. But the length of an individual string can be anything. parse returns the number of strings that were retrieved from parsing - it is a convenient approach that makes it easier to work with that array as you can see from the printing part.

Lastly, parse itself got rearranged - I used my old approach for parsing lines instead of your strcpy one. I recommend you study this code in detail with a book/documentation. Also, never ignore compiler warnings and ideally, set them to the highest possible level.

EDIT: following a suggestion by @JonathanLeffler, parse now takes the capacity of the output array and stops adding strings once that capacity is reached. This way the code avoids writing past the end of the array. In this case the maximum is 1000 words. I made it a separate argument to parse instead of using the macro MAX_WORDS directly, to make the function more self-contained. The while() condition works correctly because of short-circuit evaluation: the assignment on the right-hand side of && is not evaluated once numberOfWord reaches max_size.

CodePudding user response：

Your parse() function is the classic case of a string-split function. If you allocate the result on the heap then you can handle an arbitrary number of results. I personally would not use strtok here: it adds a bit of complexity that isn't needed, it modifies the input, and it isn't re-entrant (it keeps hidden static state between calls, so you can't safely use it from multiple threads or in nested loops; check out strtok_r).

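#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Splits s at every occurrence of delim. Returns a heap-allocated array of
   heap-allocated strings and stores its length in *count. The caller frees
   each string and then the array. On failure returns NULL with *count == 0. */
char **split(const char *s, char delim, size_t *count)
{
        *count = 0;
        const char *p = s;
        char **result = NULL;
        for (; *p != '\0'; ++p) {
                if (*p != delim)
                        continue;

                /* Grow the pointer array by one slot. */
                char **new_result = realloc(result, (*count + 1) * sizeof(char *));
                if (new_result == NULL)
                        goto error;

                result = new_result;
                /* calloc zero-fills, so the copy below stays null-terminated. */
                result[*count] = calloc(p - s + 1, sizeof(char));
                if (result[*count] == NULL)
                        goto error;

                strncpy(result[*count], s, p - s);
                ++*count;
                s = p + 1;
        }

        /* The text after the last delimiter (possibly empty) is the final piece. */
        char **new_result = realloc(result, (*count + 1) * sizeof(char *));
        if (new_result == NULL)
                goto error;

        result = new_result;
        result[*count] = calloc(p - s + 1, sizeof(char));
        if (result[*count] == NULL)
                goto error;

        strncpy(result[*count], s, p - s);
        ++*count;
        return result;

error:
        for (size_t i = 0; i < *count; ++i)
                free(result[i]);

        free(result);
        *count = 0;
        return NULL;
}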

Here is a test driver for the split function:

int main(int argc, char **argv)
{
        for (int i = 1; i < argc; ++i) {
                size_t count = 0;
                char **result = split(argv[i], '-', &count);
                printf("'%s'\n", argv[i]);
                for (size_t j = 0; j < count; ++j)
                        printf("\t'%s'\n", result[j]);

                for (size_t j = 0; j < count; ++j)
                        free(result[j]);

                free(result);
        }

        return EXIT_SUCCESS;
}

Some tests covering a few edge cases:

$ ./split -string-one string-two- string--three string-four
'-string-one'
        ''
        'string'
        'one'
'string-two-'
        'string'
        'two'
        ''
'string--three'
        'string'
        ''
        'three'
'string-four'
        'string'
        'four'

If this is a homework assignment I highly recommend studying the code and trying to recreate it from scratch.