C Code to split a string and return an array?

Time:11-08

It's been a while since I've used C, and I am trying to write a simple utility that accepts an input string and returns an array of strings. So for example if the input is

draw 1 2 3

the output array is:

out[0] = draw
out[1] = 1
out[2] = 2
out[3] = 3

I appreciate any and all help, and please be kind: I've mainly used C# lately, which has a very nice string.Split function!

Here's what I tried, found some example code and tried to modify it.

#include<stdio.h>
#include <string.h>


char outPut[50][50] splitString(char str[200])
{
  int iCount = 1;
  // Returns first token
  char* token = strtok(str, " ");
  outArray[0] = token;
  
  // Keep printing tokens while one of the
  // delimiters present in str[].
  while (token != NULL) 
  {
    printf("%s\n", token);
    token = strtok(NULL, " ");
    outArray[iCount] = token;
    iCount++;
  }
    
}
int main ()
{
  char str[200];
  char outArr[50][50];
  puts ("Enter text:");
  gets(str);
  outArr = splitString(str);
  for (int i=0; i<outArr.length; i++)
  {
     printf(Array element %d is %s", i, outArr[i]);       
  }
  return 0;
}

CodePudding user response:

How to return an array from a function

In C, you cannot directly return an array. The function main must tell the function splitString the address of the array it should write to. Therefore, you should change the function parameters of splitString to the following:

int splitString( char input[200], char output[50][50] )

Since arrays can't actually be passed as function parameters, both array parameters will decay to a pointer to the first element of the array. In the case of the 2D array, the parameter will decay to a pointer to the first element of the outer array, that is, a pointer to an array of 50 char.

I have changed the return type to int, because the function splitString must somehow tell the function main how many tokens it found and wrote to the array. Therefore, it makes sense for the function splitString to return this number.

Now you can call the function splitString from the function main like this:

num_tokens = splitString( str, outArr );

which is equivalent to

num_tokens = splitString( &str[0], &outArr[0] );

because the arrays decay to a pointer to their first element.
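To illustrate the decay described above, here is a minimal sketch. The names fillFirstRow and fillFirstRowPtr are hypothetical helpers invented for this illustration. Both declarations should mean the same thing to the compiler, and both write directly into the caller's array:

```c
#include <string.h>

#define MAX_TOKENS 50
#define MAX_TOKEN_LENGTH 50

/* Hypothetical helper: declared with array syntax, but the parameter
   is adjusted to the pointer type char (*)[MAX_TOKEN_LENGTH]. */
int fillFirstRow( char output[MAX_TOKENS][MAX_TOKEN_LENGTH] )
{
    /* writes into the caller's array, not into a copy */
    strcpy( output[0], "draw" );
    return 1;
}

/* Hypothetical helper: the same parameter written with explicit
   pointer syntax. */
int fillFirstRowPtr( char (*output)[MAX_TOKEN_LENGTH] )
{
    strcpy( output[0], "draw" );
    return 1;
}
```

Given an array declared as char outArr[MAX_TOKENS][MAX_TOKEN_LENGTH], calling fillFirstRow( outArr ) or fillFirstRowPtr( &outArr[0] ) should leave the string "draw" in outArr[0] in both cases.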

How to copy a string

In C, you cannot copy a string like this:

outArray[iCount] = token;

What you are actually telling the compiler to do is to copy the pointer token (not the string that it is pointing to) and to assign it to the array. This will not work, because you cannot assign a value to an array itself. You can only assign values to the individual elements of an array.

If you want to copy the string that the pointer is pointing to instead of the pointer itself, then you should use the function strcpy instead, like this:

strcpy( outArray[iCount], token );

Due to array to pointer decay, this function call is equivalent to:

strcpy( &outArray[iCount][0], token );

However, since the destination array only has space for 49 characters plus the terminating null character, you should first verify that the token is not too long, otherwise you may cause a buffer overflow, which could cause your program to crash or misbehave in some other way.
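One possible way to perform this length check is sketched below. The helper name copyToken is hypothetical, and it assumes a row size of 50 bytes as in the question. It refuses to copy a token that would not fit and reports this to the caller:

```c
#include <stdbool.h>
#include <string.h>

#define MAX_TOKEN_LENGTH 50

/* Hypothetical helper: copies token into dest only if the token,
   including its terminating null character, fits into
   MAX_TOKEN_LENGTH bytes. Returns true on success and false
   (leaving dest untouched) if the token is too long. */
bool copyToken( char dest[MAX_TOKEN_LENGTH], const char *token )
{
    if ( strlen( token ) >= MAX_TOKEN_LENGTH )
    {
        return false;
    }

    strcpy( dest, token );
    return true;
}
```

With this sketch, a token of 49 characters is accepted and a token of 50 characters is rejected, because the latter would leave no room for the terminating null character.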

Whether to use the function gets

The function gets is so dangerous that it was removed from the ISO C standard in C11, because it offers no way to prevent a buffer overflow unless the input originates from a trusted source. Therefore, you should stop using this function and use fgets instead. See this question for further information:

Why is the gets function so dangerous that it should not be used?

Note that if you use fgets instead of gets, there will usually be a newline character at the end of the input. You will usually want to remove it. See this question on how to do so:

Removing trailing newline character from fgets() input
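As a rough sketch of the two points above, the following hypothetical helper readLine (the name and signature are assumptions made for this illustration) reads one line with fgets and strips the trailing newline using strcspn, which is one common idiom for this:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from stream into buf, which has
   room for size bytes, and removes the trailing newline character if
   one is present. Returns 1 on success and 0 on end-of-file or error. */
int readLine( char *buf, int size, FILE *stream )
{
    if ( fgets( buf, size, stream ) == NULL )
    {
        return 0;
    }

    /* strcspn returns the index of the first '\n', or the length of
       the string if there is none, so this is safe in both cases */
    buf[strcspn( buf, "\n" )] = '\0';

    return 1;
}
```

A call such as readLine( str, sizeof str, stdin ) could then stand in for gets( str ). Note that this sketch does not detect input lines that are too long for the buffer; the remainder of such a line would simply stay in the stream.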

Fixed code

After applying all of the fixes mentioned above, and after applying some other minor improvements, your code should look like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 50
#define MAX_TOKEN_LENGTH 50

int splitString( char input[], char output[MAX_TOKENS][MAX_TOKEN_LENGTH] )
{
    //this variable will keep track of the number of found tokens
    int num_tokens = 0;

    //find first token
    char* token = strtok( input, " " );

    while ( token != NULL )
    {
        //verify that array is large enough to store this many tokens
        if ( num_tokens == MAX_TOKENS )
        {
            fprintf( stderr, "Too many tokens to fit in array!\n" );
            exit( EXIT_FAILURE );
        }

        //verify that token length is not too long to fit in buffer
        if ( strlen( token ) >= MAX_TOKEN_LENGTH )
        {
            fprintf( stderr, "Error: Token length too long to fit in buffer!\n" );
            exit( EXIT_FAILURE );
        }

        //print the token for debugging purposes
        printf( "Adding token to array: %s\n", token );

        //copy the token to the output buffer
        strcpy( output[num_tokens], token );

        //increment the number of found tokens
        num_tokens++;

        //attempt to find next token for next loop iteration
        token = strtok( NULL, " " );
    }

    //return the number of tokens found to the calling function
    return num_tokens;
}

int main( void )
{
    char line[200];
    char outArr[MAX_TOKENS][MAX_TOKEN_LENGTH];
    char *p;
    int num_tokens;

    //prompt user for input
    printf( "Enter text: ");

    //attempt to read one line of input
    if ( fgets( line, sizeof line, stdin ) == NULL )
    {
        fprintf( stderr, "Input error!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find newline character in input
    p = strchr( line, '\n' );

    //verify that input line was not too long to fit in buffer
    if ( p == NULL && !feof( stdin ) )
    {
        fprintf( stderr, "Input was too long for input buffer!\n" );
        exit( EXIT_FAILURE );
    }

    //remove newline character from input by overwriting it with
    //a null terminating character
    if ( p != NULL )
    {
        *p = '\0';
    }

    //perform the function call
    num_tokens = splitString( line, outArr );

    //output the result of the function call
    for ( int i = 0; i < num_tokens; i++ )
    {
        printf( "Array element %d is %s.\n", i, outArr[i]); 
    }

    return EXIT_SUCCESS;
}

This program has the following behavior:

Enter text: draw 1 2 3
Adding token to array: draw
Adding token to array: 1
Adding token to array: 2
Adding token to array: 3
Array element 0 is draw.
Array element 1 is 1.
Array element 2 is 2.
Array element 3 is 3.