I need to understand the behaviour of these C pointers

Time:03-22

While playing around with pointers I produced a behavior that I can't explain. When I print a string in a helper function, I get a segfault in main when printing an uninitialized char* (I see this is reasonable). The tricky part is that if I do everything else the same way but don't use the printf line in the helper function, the program works perfectly, to the point that the function surprisingly returns the address of the stack string implicitly.

Here is the program as I wrote and ran it (on Fedora 35, in an installation as vanilla as it gets). Even if you can't fully explain the issue, any help in understanding it will be most welcome.

#include<stdio.h>
#include <stdlib.h>
#include<string.h>

/* This function generates a stack string and fills it */
char* generate_stack_string()
{
    char stack_string[70];
    strcpy(stack_string,"test");
    
    /*the next line is the one that when added crashes the program*/
    
    //printf("adress ( %p ) and text ( %s ) of stack string inside the function\n\n",stack_string, stack_string);
    
    
    //function compiles and runs normally despite no return value, but seems to return something anyhow
}
int main()
{   char *pointer_in_main;  
    //this pointer is never malloced, this is intentional, to provoke the behaviour
    
    printf("pointer_in_main adress before function call  %p\n\n",pointer_in_main);
    pointer_in_main = generate_stack_string();
    printf("pointer_in_main adress after function call  %p\n\n",pointer_in_main);
    
    //This is where program crashses when the heap string is printed
    printf("pointer_in_main content after function call %s\n\n",pointer_in_main);
    

}
 

CodePudding user response:

With the understanding that this is all undefined behavior, let's take a look at your function:

char* generate_stack_string()
{
    char stack_string[70];
    strcpy(stack_string,"test");
    
    /*the next line is the one that when added crashes the program*/
    
    //printf("adress ( %p ) and text ( %s ) of stack string inside the function\n\n",stack_string, stack_string);
    
    
    //function compiles and runs normally despite no return value, but seems to return something anyhow
}

This function does not return a value, even though it is declared to return one. A return value is typically passed back to the caller by placing it in a register. Since there is no return statement, whatever value happens to be in that register when the function ends is what the caller receives. Here the last statement executed in the function is itself a function call, so the register still holds what that call returned. In the case of strcpy, that is the value of its first parameter, i.e. stack_string converted to a pointer. So by luck you're returning the pointer you intended to return.

Running this code I get the following output:

pointer_in_main adress before function call  (nil)

pointer_in_main adress after function call  0x7ffe3c7cd3e0

pointer_in_main content after function call test

You were also "lucky" that the memory contents previously used by stack_string weren't overwritten when the last call to printf in main happens.
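The one well-defined link in that chain, strcpy handing back its destination pointer, can be checked in isolation. The sketch below is only an illustration; the helper name copy_and_report is made up here and is not part of the original program.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the question): copies "test" into a
   caller-provided buffer and passes on whatever strcpy returned. */
char *copy_and_report(char *dest, size_t dest_size)
{
    /* The buffer must exist and be large enough for "test" plus its '\0'. */
    assert(dest != NULL && dest_size >= sizeof "test");

    /* strcpy is specified to return its first argument. */
    return strcpy(dest, "test");
}
```

Calling copy_and_report(buffer, sizeof buffer) on a local array in the caller gives back buffer itself. Unlike the original function, the storage here belongs to the caller, so the returned pointer is still valid afterwards.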

Now if we uncomment the printf call in generate_stack_string, I get the following output:

pointer_in_main adress before function call  (nil)

adress ( 0x7ffec0587ee0 ) and text ( test ) of stack string inside the function

pointer_in_main adress after function call  0x51

Segmentation fault (core dumped)

Here we can see that a value was returned that is outside the valid address space of the process, so attempting to dereference it causes a segfault. But what is this value?

Looking at the updated function, the last line executed is a call to printf. printf returns the number of characters printed, and 0x51 (decimal 81) is exactly the length of the line printed inside the function, including the two trailing newlines. That value was left in the register used for return values, so it is what the caller received.
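The count of 81 can be reproduced without relying on any undefined behavior. This is a rough sketch: printed_length is a hypothetical helper, and the address is passed in as text (an assumption made so the result does not depend on how %p is rendered on a given platform).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: length of the line the question's printf call
   would produce for the given address text and string content. */
int printed_length(const char *address_text, const char *content)
{
    assert(address_text != NULL && content != NULL);

    /* With a NULL buffer and size 0, snprintf writes nothing and returns
       the number of characters the full output would contain. A successful
       printf call with the same arguments returns that same count. */
    return snprintf(NULL, 0,
                    "adress ( %s ) and text ( %s ) of stack string inside the function\n\n",
                    address_text, content);
}
```

For the values in the output above, printed_length("0x7ffec0587ee0", "test") evaluates to 81, which is 0x51 in hexadecimal.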

But again, this is all undefined behavior. With the modified code, I get the same segfault regardless of optimization level. If I run the original code with -O1 or higher, I get this output:

pointer_in_main adress before function call  (nil)

pointer_in_main adress after function call  (nil)

pointer_in_main content after function call (null)

So when your program has undefined behavior, all bets are off.

CodePudding user response:

There are many problems in your program, most of which lead to undefined behavior. For example, the pointer pointer_in_main is uninitialized, and you use it in the call to printf when you wrote:

printf("pointer_in_main adress before function call  %p\n\n",pointer_in_main);//this is undefined behavior because pointer_in_main is uninitialized
//whatever happens after this is not reliable

In the above call to printf, you're using the uninitialized pointer pointer_in_main, which leads to undefined behavior.

Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.

So the output that you're seeing (or may be seeing) is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB. The program may just crash.

So the first step to make the program correct would be to remove the UB. Then, and only then, can you start reasoning about the output of the program.
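One possible way to remove the UB is sketched below, assuming the goal is simply to get the string "test" back to the caller. The names fill_string and run_demo are invented for this sketch; run_demo stands in for the question's main.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for generate_stack_string: the caller owns the
   storage, so the pointer stays valid after the function returns. */
char *fill_string(char *buffer, size_t size)
{
    assert(buffer != NULL && size > 0);
    snprintf(buffer, size, "%s", "test");  /* never writes past size bytes */
    return buffer;                         /* explicit return value */
}

/* Mirrors the question's main without the UB. Returns 0 on success. */
int run_demo(void)
{
    char storage[70];
    char *pointer_in_main = NULL;          /* initialized before first use */

    printf("pointer_in_main before function call %p\n", (void *)pointer_in_main);
    pointer_in_main = fill_string(storage, sizeof storage);
    printf("pointer_in_main after function call  %p\n", (void *)pointer_in_main);
    printf("pointer_in_main content              %s\n", pointer_in_main);

    return strcmp(pointer_in_main, "test") == 0 ? 0 : 1;
}
```

With the pointer initialized, the function returning a value explicitly, and the array living in the caller, every printf call here reads a valid value, so the output can actually be reasoned about.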


[1] A more technically accurate definition of undefined behavior says that there are no restrictions on the behavior of the program.

CodePudding user response:

For starters, this statement

printf("pointer_in_main adress before function call  %p\n\n",pointer_in_main);

has undefined behavior because the pointer was not initialized and has an indeterminate value.

char *pointer_in_main;  

This function returns nothing

char* generate_heap_string()
{
    char heap_string[70];
    strcpy(heap_string,"test");
    
    /*the next line is the one that when added crashes the program*/
    
    //printf("adress ( %p ) and text ( %s ) of heap string inside the function\n\n",heap_string,heap_string);
    
    
    //function compiles and runs normally despite no return value, but seems to return something anyhow
}

So again these calls of printf

pointer_in_main = generate_heap_string();
printf("pointer_in_main adress after function call  %p\n\n",pointer_in_main);

//This is where program crashses when the heap string is printed
printf("pointer_in_main content after function call %s\n\n",pointer_in_main);

invoke undefined behavior.

Even if you change the function as follows

char* generate_heap_string()
{
    char heap_string[70];
    strcpy(heap_string,"test");
    
    /*the next line is the one that when added crashes the program*/
    
    //printf("adress ( %p ) and text ( %s ) of heap string inside the function\n\n",heap_string,heap_string);
    
    return heap_string;    
}

this call to printf

pointer_in_main = generate_heap_string();
//...
//This is where program crashses when the heap string is printed
printf("pointer_in_main content after function call %s\n\n",pointer_in_main);

will still invoke undefined behavior, because the local array declared in the function is no longer alive after the function exits, so the pointer pointer_in_main holds an invalid value.

That is, in this call

//This is where program crashses when the heap string is printed
printf("pointer_in_main content after function call %s\n\n",pointer_in_main);

there is an attempt to dereference the invalid pointer.

The difference between the two calls to printf is that in one case you are only trying to print the invalid pointer value itself, whereas in the other case

//This is where program crashses when the heap string is printed
printf("pointer_in_main content after function call %s\n\n",pointer_in_main);

you are using the invalid value to access memory.
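If the intent was a string that outlives the function, one option is to allocate it dynamically. This is a minimal sketch under that assumption; generate_heap_string_fixed and HEAP_STRING_SIZE are names made up for illustration, and the size 70 is simply carried over from the original array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { HEAP_STRING_SIZE = 70 };  /* same size as the original local array */

/* Hypothetical variant that really returns a heap string.
   Returns NULL if the allocation fails; the caller must free() the result. */
char *generate_heap_string_fixed(void)
{
    assert(sizeof "test" <= HEAP_STRING_SIZE);  /* the literal must fit */

    char *heap_string = malloc(HEAP_STRING_SIZE);
    if (heap_string == NULL) {
        return NULL;
    }
    strcpy(heap_string, "test");
    return heap_string;  /* the allocation stays valid until free() */
}
```

The caller would check the result for NULL before passing it to printf with %s, and call free() on it when done.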
