Malloc, dynamically allocating wrong size memory for a string

Time:12-16

int main() {
    char A[200], B[200];
    printf("Enter 2 words or sentences.\n");
    gets(A);
    gets(B);
    char* C = (char*)malloc((strlen(A) + strlen(B)) * sizeof(char));
    for (int i = 0; i < strlen(A); i++)
        C[i] = A[i];
    for (int i = 0; i < strlen(B); i++)
        C[i + strlen(A)] = B[i];
    printf("%s", C);
}

The initial value of C is ÍÍÍÍÍÍÍÍÍÍýýýý, which is 4 symbols longer than requested and the 4 symbols also show up as output when printing C. I have no idea why there are 4 symbols, which is why I'm here seeking an explanation.

CodePudding user response:

Adding C[strlen(A) + strlen(B)] = '\0'; after the malloc gets rid of the junk. Thanks everyone for the ideas.

CodePudding user response:

malloc just returns a pointer to some memory it allocated for you. It doesn't initialize that memory, zero it out, or anything like that. So what you're seeing when you print it out, is whatever junk was in there before.

Frankly, you're lucky you didn't open up a wormhole or something. C strings are nul-terminated, so when you pass that pointer around, you're technically not passing a string yet. When you pass it to a function that expects a string, all kinds of wackiness can ensue.

You should initialize the memory when you get it. The simplest initialization would be something like *C = '\0'; or C[0] = '\0';, which turns the memory into a zero-length string. But you probably already have something to put there, or why would you be allocating memory in the first place? :P
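To make the initialization point concrete, here is a minimal sketch. It is not from the original post, and `join_with_strcat` is a hypothetical helper name. It relies on the fact that `strcat` looks for a terminator in the destination, so the buffer has to be a valid (here, empty) string before the first call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns a newly
   allocated copy of a followed by b, or NULL if malloc fails.
   The caller must free() the result. */
char *join_with_strcat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);

    /* Room for both strings plus the terminating nul. */
    char *c = malloc(strlen(a) + strlen(b) + 1);
    if (c == NULL)
        return NULL;

    /* malloc leaves the bytes indeterminate; this turns c into a
       valid zero-length string so strcat has a terminator to find. */
    c[0] = '\0';
    strcat(c, a);
    strcat(c, b);
    return c;
}
```

Without the `c[0] = '\0';` line, the first `strcat` would scan uninitialized memory looking for a nul, which is the same kind of undefined behavior as printing the buffer with `%s`.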

Now that there's code, we can tweak it a bit to fix the issue...

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char A[200], B[200];
    printf("Enter 2 words or sentences.\n");

    // BTW: you should never, ever be using `gets`.
    // use `fgets` and pass the size of your buffer to avoid overruns.
    fgets(A, sizeof A, stdin);
    fgets(B, sizeof B, stdin);

    // you don't want to call `strlen` over and over. save these lengths
    size_t Alen = strlen(A);
    size_t Blen = strlen(B);

    // lop off the newlines :P
    // (this assumes each line was read successfully and ended with '\n')
    A[--Alen] = B[--Blen] = '\0';

    // You need enough space for both strings, plus a nul at the end.
    // side note: you don't need to cast the pointer to a `char *`.
    // also, sizeof(char) is 1 by definition, so no need to multiply by it.
    char* C = malloc(Alen + Blen + 1);

    // compare to the length variable instead
    // (size_t index, so the comparison isn't signed vs. unsigned)
    for (size_t i = 0; i < Alen; i++)
        C[i] = A[i];

    for (size_t i = 0; i < Blen; i++)
        C[i + Alen] = B[i];

    // important: nul-terminate the string
    C[Alen + Blen] = '\0';

    printf("%s", C);
}
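The `--Alen` trick above only works when the line really ends in a newline. If the input is longer than the buffer, or the last line of input has no trailing `'\n'`, it cuts off a real character instead. A hedged alternative, with `chomp` as a hypothetical helper name, uses `strcspn` to find the newline only if one is present:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cuts s at its first '\n', if there is one,
   and returns the length of the resulting string. */
size_t chomp(char *s) {
    assert(s != NULL);

    /* strcspn returns the index of the first '\n', or strlen(s)
       when there is none, so the write below is always in bounds. */
    size_t n = strcspn(s, "\n");
    s[n] = '\0';
    return n;
}
```

With this, `size_t Alen = chomp(A);` would replace both the `strlen` call and the decrement. It is still worth checking that `fgets` did not return NULL first, since the buffer may not hold a valid string in that case.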

CodePudding user response:

Problems:

Not null character terminated

Code attempts printf("%s", C); which is undefined behavior as C[] is not a string as needed by "%s".

// Append a \0
C[strlen(A) + strlen(B)] = '\0';

Insufficient memory allocated

Make room for the null character.

// (strlen(A) + strlen(B)) * sizeof(char)
strlen(A) + strlen(B) + 1

gets() is no longer part of the standard C library since C11

Use fgets() and lop off a potential trailing '\n' for similar behavior.

int vs. size_t

int is insufficient for very long strings. size_t works for all strings.

Avoid potentially recalculating the string length


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char A[200], B[200];
    printf("Enter 2 words or sentences.\n");
    fgets(A, sizeof A, stdin);
    A[strcspn(A, "\n")] = '\0'; // Lop off potential \n
    fgets(B, sizeof B, stdin);
    B[strcspn(B, "\n")] = '\0';

    size_t A_len = strlen(A);
    size_t B_len = strlen(B);
    char* C = malloc(A_len + B_len + 1);

    if (C) {
      for (size_t i = 0; A[i]; i++) {
        C[i] = A[i];
      }
      for (size_t i = 0; B[i]; i++) {
        C[A_len + i] = B[i];
      }
      C[A_len + B_len] = '\0';  // Terminator goes right after B's last character.

      printf("%s\n", C);
      free(C);  // Good housekeeping to free allocations.
    }
}
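The two copy loops can also be written with `memcpy`, and the whole operation wrapped in a function. The following is only a sketch under that assumption, with `concat` as a hypothetical name. It copies `b` together with its terminator, so no separate `'\0'` assignment is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding a
   followed by b, or NULL if the allocation fails.
   The caller must free() the result. */
char *concat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);

    size_t a_len = strlen(a);
    size_t b_len = strlen(b);

    /* One extra byte for the terminating nul. */
    char *c = malloc(a_len + b_len + 1);
    if (c == NULL)
        return NULL;

    memcpy(c, a, a_len);             /* a without its terminator */
    memcpy(c + a_len, b, b_len + 1); /* b including its terminator */
    return c;
}
```

In the program above this would look like `char *C = concat(A, B);`, followed by the same `if (C)` check, `printf`, and `free(C)`.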