Home > Back-end >  Solve a problem: to reverse the words in a sentence. AddressSanitizer: stack-buffer-overflow
Solve a problem: to reverse the words in a sentence. AddressSanitizer: stack-buffer-overflow

Time:10-06

im trying to solve a problem in C where we need to reverse the words in a sentence. For example "hello my name is" to "is name my hello". the problem im facing is a stack overflow error. not exactly sure if my code works perfectly as well. Will be grateful to know what I did wrong. Thank you!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int my_strlen(char sentence[]) {
    int count = 0;
    for (int i = 0 ; sentence[i] != '\0' ; i  ) {
        count  ;
    }
    return count;
}

void reverse_words(char sentence[]) {

    int len = my_strlen(sentence);
    int word_start = len;
    int word_end = len;
    int index = 0;
    int i;
    char reverse[len   1];

    while (word_start > 0) {
        if (sentence[word_start] == ' ') {
            i = word_start   1;
            while (i <= word_end) {
                reverse[index] = sentence[i];
                i  ;
                index  ;
            }
            reverse[index  ] = ' ';
            word_end = word_start - 1;
        }
        word_start--;
    }

    for (int j = 0 ; j <= word_end ; j  ) {
        reverse[index] = sentence[j];
        index  ;
    }

    reverse[index] = '\0';

    for (int j = 0 ; j < index ; j  ) {
        sentence[index] = reverse[index];
    }

}

int main(void) {
    char sentence[] = "abcd efgh ijkl mnop qrst uvxy z";
    reverse_words(sentence);
    for (int i = 0 ; i < sizeof(sentence) / sizeof(sentence[0]) - 1 ; i  ) {
        printf("%c", sentence[i]);
    }
    printf("\n");
}

CodePudding user response:

Just consider the case when the passed string contains only one word as for example "A".

In this case the variable len will be equal to 1.

The array reverse will have only two elements

char reverse[len   1];

Now in this while loop

while (word_start > 0) {
    if (sentence[word_start] == ' ') {
        i = word_start   1;
        while (i <= word_end) {
            reverse[index] = sentence[i];
            i  ;
            index  ;
        }
        reverse[index  ] = ' ';
        word_end = word_start - 1;
    }
    word_start--;
}

the body of the if statement will not get the control because the string does not contain a space. So in fact the while loop will look like

while (word_start > 0) {
    //...
    word_start--;
}

So word_start will became equal to 0 after the loop.

And due to the following for loop

for (int j = 0 ; j <= word_end ; j  ) {
    reverse[index] = sentence[j];
    index  ;
}

you will have reverse[0] = 'A', reverse[1] = '\0' and then after this statement

reverse[index] = '\0';

you will have reverse[2] = '\0'. That is there is an attempt to write beyond the array.

Also this for loop in general uses an invalid value of the variable word_end because the variable can be decreased in the preceding while loop.

Another problem is the condition in the while loop

while (i <= word_end) {

In the first iteration of the loop you can insert in the middle of the result string the terminting zero character '\0'.

Pay attention to that the string can contain adjacent spaces. In this case your approach will be incorrect.

Usually such a task is implemented the following way.

At first you need to reverse the whole string and then to reverse each word in the string starting from its beginning. There is no need to use an auxiliary array. Using an auxiliary variable length array makes the function unsafe.

Here is a demonstration program where the functions do not use standard C string functions.

#include <stdio.h>

size_t my_strlen( const char *s ) 
{
    const char *p = s;

    while ( *p )   p;

    return p - s;
}

void reverse_n( char *s, size_t n )
{
    for ( size_t i = 0; i < n / 2; i   )
    {
        char c = s[i];
        s[i] = s[n-i-1];
        s[n-i-1] = c;
    }
}

char * reverse_words( char *s ) 
{
    size_t n = my_strlen( s );

    reverse_n( s, n );
    
    for ( char *p = s; *p; )
    {
        while ( *p == ' ' || *p == '\t' )   p;

        if ( *p )
        {
            char *q = p;

            while ( *  p && *p != ' ' && *p != '\t' );
            reverse_n( q, p - q );
        }
    }

    return s;
}

int main( void ) 
{
    char sentence[] = "hello my name is";

    puts( sentence );
    puts( reverse_words( sentence ) );
}

The program output is

hello my name is
is name my hello

As you can see the function reverse_words does not use an auxiliary variable length array.

CodePudding user response:

Below is the code of one of the original functions with three highly visible comments including two changes:

void reverse_words(char sentence[]) {

    int len = my_strlen(sentence);
    int word_start = len;
    int word_end = len;
    int index = 0;
    int i;
    char reverse[len   1];

    while (word_start > 0) {
        if (sentence[word_start] == ' ') {
            i = word_start   1;
            while (i <= word_end) {
                reverse[index] = sentence[i];
                i  ;
                index  ;
            }
            reverse[index  ] = ' '; //  *** Comment #1 below ***
            word_end = word_start - 1;
        }
        word_start--;
    }

#if 0 // Comment #2 below
    for (int j = 0 ; j <= word_end ; j  ) {
#else
    for (int j = 0 ; j < word_end ; j  ) {
#endif
        reverse[index] = sentence[j];
        index  ;
    }

    reverse[index] = '\0';

    for (j = 0 ; j < index ; j  ) {
#if 0 // Comment #3 below
        sentence[index] = reverse[index];
#else
        sentence[j] = reverse[j]; 
#endif
    }
}

#1 - A previously nonexistent space is added to the end of the first "word" copied
#2 - Trimming the copy of the space after the last "word" copied fixes the overflow
#3 - This would never work. Fixed.


Fixing #3 (as is required) leads to another observation.
Juggling the indexes, in midstream the code uses word_end as the index of the last character of the word. But, initially, word_end is that index 1. With #3 fixed, a very simple correction would be:

int len = my_strlen(sentence);
int word_start = len;
int word_end = len - 1; // index of last character of last "word"

leaving the rest of the code as it was originally.

Off by one errors are a common cause of bugs.

  • Related