How can I split a string of numbers in order to have only couples of numbers in C?

Time:11-12

I want to do two things:

1)

char *myStr = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";

The goal is to get the string into this form:

2 1
3 2 
5 2
...
...

And so on until the end of the string.

2) With a string like this, I want to put these values in a 2D array in the form myArr[0][0] = 2, myArr[0][1] = 1, myArr[1][0] = 3, myArr[1][1] = 2, and so on and so forth.

I firstly tried with strtok but I think it's not appropriate as the delimiters are not enough for this problem. I then tried iterating character by character to split the string, but I don't know how to do that. This is my strtok attempt:

const char * separator = " ";
char * strToken = strtok ( out, separator );
while ( strToken != NULL ) {
  printf ( "%s\n", strToken);
  strToken = strtok ( NULL, separator);
}

What I get:

2
1
3
2
5
2
5
4
6
1
6
2
7
1
7
3
7
4
8
1

CodePudding user response:

In C, it's tempting to try to make things tiny, but sometimes it's clearer to do things a longer way.

In this case, you're trying to make a custom parser. There are a few ways of doing this, of varying complexity, but I'll describe a simple top-down approach to go from the input string to the 2D array of integer values.

You'll need to keep track of how far into the string you've scanned, and where in the 2D array you've written. I'll assume the array has a constant length; you can make it dynamic if needed.

char * myStrPtr = myStr;

int myArr[NUM_PAIRS][2];
int myArrIdx = 0;

The basic operation you want is to scan a number, then skip the spaces after it. Here's a function for that. It takes the character pointer and a pointer to the integer, then returns a pointer to the next number (error checking is omitted).

// Requires <ctype.h> (isspace) and <stdlib.h> (strtol).
char * getInt(char * myStrPtr, int * i) {
    char *myEndPtr = NULL;
    *i = strtol(myStrPtr, &myEndPtr, 0);
    // Error if myEndPtr == myStrPtr. Skip spaces now.
    while (isspace((unsigned char)*myEndPtr)) {
        myEndPtr++;
    }
    return myEndPtr;
}

You want to scan 2 numbers at a time, so here's a function that does that. It takes the character pointer, and a 1D int array, and returns the new character pointer.

char * get2Ints(char * myStrPtr, int pair[2]) {
    char *myEndPtr = NULL;

    // Skipping check for end of string.
    myEndPtr = getInt(myStrPtr, &pair[0]);
    // Error if myEndPtr == myStrPtr, skipping that check.

    // Comments above apply here. Continue from where the first number ended.
    myEndPtr = getInt(myEndPtr, &pair[1]);

    return myEndPtr;
}

Finally, you want to scan all the pairs in the string. I'll assume the number of pairs is known, otherwise you would count them first and allocate an array for them, or use a linked list to store them.

for (myArrIdx = 0; myArrIdx < NUM_PAIRS; myArrIdx++) {
    char * myEndPtr = get2Ints(myStrPtr, myArr[myArrIdx]);
    // Error if myStrPtr == myEndPtr.
    myStrPtr = myEndPtr;
}

That should do it.
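
Putting the pieces above together with the skipped error checks filled in, a minimal sketch might look like the following. The names get_int and parse_pairs, the -1 error convention, and the choice of base 10 are assumptions made for illustration; they are not part of the answer above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Parses one integer at s, stores it in *out, and returns a pointer just
 * past the number and any whitespace that follows it.
 * Returns NULL if no digits could be converted. */
const char *get_int(const char *s, int *out)
{
    char *end = NULL;
    long value = strtol(s, &end, 10);

    if (end == s)
        return NULL;              /* nothing was converted */
    *out = (int)value;            /* assumption: the value fits in an int */
    while (isspace((unsigned char)*end))
        end++;
    return end;
}

/* Fills pairs[0..max_pairs-1] from s and returns the number of complete
 * pairs stored. Returns -1 on malformed input (non-numeric text or a
 * dangling single number). Input beyond max_pairs pairs is ignored. */
int parse_pairs(const char *s, int pairs[][2], int max_pairs)
{
    int count = 0;

    assert(s != NULL && pairs != NULL);
    while (isspace((unsigned char)*s))
        s++;                      /* skip leading whitespace */
    while (*s != '\0' && count < max_pairs) {
        s = get_int(s, &pairs[count][0]);
        if (s == NULL)
            return -1;
        s = get_int(s, &pairs[count][1]);
        if (s == NULL)
            return -1;
        count++;
    }
    return count;
}
```

Calling parse_pairs(myStr, myArr, NUM_PAIRS) would fill myArr and report how many rows are valid, so the caller does not have to know the exact number of pairs in advance, only an upper bound.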

CodePudding user response:

Parse things with strtol. The following is extremely fragile and depends on precise input, but it gives the general idea:

#include <ctype.h>
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>

int
main(void)
{
    char myStr[] = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";
    int arr[10][2];
    char *t = myStr;
    int (*a)[2] = arr;
    while( t < myStr + sizeof myStr && *t ){
        char *end;
        a[0][0] = strtol(t, &end, 10);
        assert(isspace(*end));
        a[0][1] = strtol(end + 1, &end, 10);
        assert(isspace(*end));
        *end = '\n';
        t = end + 1;
        a += 1;
    }
    printf("%s", myStr);
    for( int i = 0; i < 10; i += 1 ){
        printf("%d, %d\n", arr[i][0], arr[i][1]);
    }
}
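
If the fragility of the program above is a concern, one alternative is to let sscanf read two numbers at a time and use %n to find out how far it got. This is a different technique from the strtol loop above, shown only as a rough sketch; the function name scan_pairs and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Reads up to max_pairs "a b" pairs from s without modifying s.
 * Returns the number of complete pairs stored in pairs.
 * Parsing stops quietly at the first thing that is not a complete pair. */
int scan_pairs(const char *s, int pairs[][2], int max_pairs)
{
    int count = 0;
    int consumed = 0;

    assert(s != NULL && pairs != NULL);
    /* sscanf returns 2 only when both %d conversions succeed;
     * %n then holds the number of characters read so far. */
    while (count < max_pairs &&
           sscanf(s, "%d %d%n",
                  &pairs[count][0], &pairs[count][1], &consumed) == 2) {
        s += consumed;            /* continue after the pair just read */
        count++;
    }
    return count;
}
```

Because the count is bounded by max_pairs and the input is never written to, this version also works with a pointer to a string literal. It does not report why parsing stopped, so a caller that needs to detect trailing garbage would have to add that check.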

CodePudding user response:

I firstly tried with strtok but I think it's not appropriate as the delimiters are not enough for this problem

That is not the problem with using strtok. The problem with strtok is that it must overwrite the delimiters with null terminating characters, so that the individual fields become strings terminated by a null character. However, since myStr is a pointer to a string literal, you are not allowed to modify it.

Therefore, if you want to use strtok, you must either copy the string literal to a memory buffer which is writable, or declare myStr not as a pointer to a string literal, but rather as a char array, like this:

char myStr[] = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";
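
The other option mentioned above, copying the string literal into a writable buffer before handing it to strtok, might look roughly like this. The helper name count_tokens is made up for illustration, and it only counts the tokens to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the space-separated tokens in text without modifying text.
 * Returns -1 if the temporary copy cannot be allocated. */
int count_tokens(const char *text)
{
    size_t len;
    char *copy;
    int count = 0;

    assert(text != NULL);
    len = strlen(text) + 1;       /* include the terminating null */
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);      /* strtok will write into this copy */

    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " "))
        count++;

    free(copy);
    return count;
}
```

With the string from the question, count_tokens(myStr) would be safe to call even though myStr points to a string literal, because only the heap copy is overwritten.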

Here is a solution which uses strtok:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ROWS 100
#define COLS_PER_ROW 2

int main( void )
{
    char myStr[] = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";
    const char delimiters[] = " ";
    char myArr[MAX_ROWS][COLS_PER_ROW];
    int num_rows = 0;
    char *p;

    //find first token
    p = strtok( myStr, delimiters );

    //process one row per loop iteration
    for (;;) //infinite loop, equivalent to while(1)
    {
        //process one column per loop iteration
        for ( int i = 0; i < COLS_PER_ROW; i++ )
        {
            //determine whether there are more tokens
            if ( p == NULL )
            {
                //print warning message if ran out of tokens in the
                //middle of a row, as this should only happen at the
                //start of a row
                if ( i != 0 )
                {
                    fprintf(
                        stderr,
                        "Warning: Ran out of tokens in the middle of "
                        "a row!\n"
                    );
                }

                //we cannot use "break" here, because that would
                //only break out of the innermost loop, but we must
                //break out of two levels of nested loops
                goto break_out_of_nested_loop;
            }

            //verify that we are not going to write to the array
            //out of bounds
            if ( num_rows == MAX_ROWS )
            {
                fprintf(
                    stderr,
                    "Too many rows to fit in the array! Stopping...\n"
                );

                goto break_out_of_nested_loop;
            }

            //print warning message and stop parsing if found token
            //is larger than one character
            if ( strlen( p ) > 1 )
            {
                fprintf(
                    stderr, 
                    "Warning: Found token is larger than "
                    "one character! Stopping...\n"
                );

                goto break_out_of_nested_loop;
            }

            //add found character to the array
            myArr[num_rows][i] = p[0];

            //find next token for next loop iteration
            p = strtok( NULL, delimiters );
        }

        //increase the number of valid rows in the array
        num_rows++;
    }

break_out_of_nested_loop:

    //print the content of the array
    for ( int i = 0; i < num_rows; i++ )
    {
        for ( int j = 0; j < COLS_PER_ROW; j++ )
        {
             printf( "%c ", myArr[i][j] );
        }

        printf( "\n" );
    }
}

This program has the following output:

2 1 
3 2 
5 2 
5 4 
6 1 
6 2 
7 1 
7 3 
7 4 
8 1 

Note that goto should generally be avoided, if possible. However, for breaking out of nested loops, there usually are no better alternatives, so it is generally considered acceptable in this case.

Since you are only dealing with individual characters and not strings, you don't really need strtok. If you don't need strtok, then you can also use a pointer to a string literal, as you defined it in your question:

char *myStr = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";

Here is a corresponding solution:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

#define MAX_ROWS 100
#define COLS_PER_ROW 2

int main( void )
{
    char *myStr = "2 1 3 2 5 2 5 4 6 1 6 2 7 1 7 3 7 4 8 1 ";
    char myArr[MAX_ROWS][COLS_PER_ROW];
    int num_rows = 0;
    char *p = myStr;

    //process one row per loop iteration
    for (;;) //infinite loop, equivalent to while(1)
    {
        //process one column per loop iteration
        for ( int i = 0; i < COLS_PER_ROW; i++ )
        {
            //skip all whitespace characters
            while ( isspace( (unsigned char)*p ) )
                p++;

            //determine whether next character is a digit
            if ( !isdigit( (unsigned char)*p ) )
            {
                //check if end of string is reached
                if ( *p != '\0' )
                {
                    //unexpected character was found, so print a
                    //warning message and stop parsing
                    fprintf(
                        stderr,
                        "Warning: Unexpected character found! "
                        "Stopping...\n"
                    );
                }

                //print warning message if ran out of digits in the
                //middle of a row, as this should only happen at the
                //start of a row
                else if ( i != 0 )
                {
                    fprintf(
                        stderr,
                        "Warning: Ran out of digits in the middle of "
                        "a row!\n"
                    );
                }

                //we cannot use "break" here, because that would
                //only break out of the innermost loop, but we must
                //break out of two levels of nested loops
                goto break_out_of_nested_loop;
            }

            //verify that we are not going to write to the array
            //out of bounds
            if ( num_rows == MAX_ROWS )
            {
                fprintf(
                    stderr,
                    "Too many rows to fit in the array! Stopping...\n"
                );

                goto break_out_of_nested_loop;
            }

            //add found character to the array
            myArr[num_rows][i] = p[0];

            //go to next character
            p++;

            //print warning message and stop if next character exists
            //and is not a whitespace character (i.e. space,
            //newline, etc.)
            if ( !isspace( (unsigned char)*p ) )
            {
                if ( *p != '\0' )
                {
                    fprintf(
                        stderr, 
                        "Warning: Unexpected character found! "
                        "Stopping...\n"
                    );
                    goto break_out_of_nested_loop;
                }
            }
            else
            {
                //NOTE: This block will be skipped if we
                //are already at the end of the string, due to
                //the nested "if" statements above

                //go to next character
                p++;
            }
        }

        //increase the number of valid rows in the array
        num_rows++;
    }

break_out_of_nested_loop:

    //print the content of the array
    for ( int i = 0; i < num_rows; i++ )
    {
        for ( int j = 0; j < COLS_PER_ROW; j++ )
        {
             printf( "%c ", myArr[i][j] );
        }

        printf( "\n" );
    }
}

This program has the same output as the first program.
