Home > Net >  How does strcat affect the strtok?
How does strcat affect the strtok?

Time:12-29

Assume we need to copy user's input into another string by concatenating the tokens of input, e.g., "hello world" -> "helloworld".

#include <stdio.h>
#include <string.h>

int main(void) {
  char buffer[50];

  printf("\nEnter a string: ");

  while (fgets(buffer, sizeof(buffer), stdin) != 0) {
    size_t size = strlen(buffer);

    if (size > 0 && buffer[size - 1] == '\n') {
      char input[1]; // set it too small 
      buffer[size - 1] = '\0';

      char *tok = strtok(buffer, " "); // works fine
      do {
        strcat(input, tok); // append to "input" that has not enough space
        printf("\nfound token: %s", tok);
        tok = strtok(NULL, " "); // produces garbage
      } while (tok);

     break;
  }
}

Running the code above:

Enter a string: hello world

found token: hello
found token: w
found token: r
*** stack smashing detected ***: <unknown> terminated

I struggle to understand how is strtok related to strcat failing to append tok. They are not sharing variables except for tok which is (according to the docs) copied by strcat, so whatever strcat is doing shouldn't affect the strtok behavior and the program should crash on the second strcat call at least, right? But we see that strcat is getting called 3 times before stack smashing gets detected. Can you please explain why?

CodePudding user response:

For starters this array

char input[1];

is not initialized and does not contain a string.

So this call of strcat

strcat(input, tok);

invokes undefined behavior also because the array input is not large enough to store the copied string. It can overwrite memory beyond the array.

CodePudding user response:

There are multiple problems in the code:

  • char input[1]; is too small to do anything. You cannot concatenate the tokens from the line into this minuscule array. You must define it with a sufficient length, namely the same length as buffer for simplicity.
  • input must be initialized as an empty string for strcat(input, tok); to have defined behavior. As coded, the first call to strcat corrupts other variables causing the observed behavior, but be aware anything else could happen as a result of this undefined behavior.
  • char *tok = strtok(buffer, " "); works fine but may return a null pointer if buffer contains only whitespace if anything. The do loop will then invoke undefined behavior on strcat(input, tok). Use a for or while loop instead.
  • there is a missing } in the code, it is unclear whether you mean to break from the while loop after the first iteration or only upon getting the end of the line.

Here is a modified version:

#include <stdio.h>
#include <string.h>

int main(void) {
    char buffer[50];
    char input[sizeof buffer] = "";

    printf("Enter a string: ");

    if (fgets(buffer, sizeof(buffer), stdin)) {
        char *tok = strtok(buffer, " \n");
        while (tok) {
            strcat(input, tok);
            printf("found token: %s\n", tok);
            tok = strtok(NULL, " \n");
        }
        printf("token string: %s\n", input);
    }
    return 0;
}
  • Related