Returning allocated buffer, vs buffer passed to a function

Time:09-21

When designing my functions, I often weigh returning an allocated buffer from the function against having the function take a buffer as an argument. I was trying to figure out if there is any significant benefit to passing a buffer to my function, e.g.:

void f(char **buff) {
   /* operations */
   strcpy(*buff, value);
}

Versus

char *f() {
    char *buff = malloc(BUF_SIZE);
    /* operations */
    return buff;
}

These are obviously not super advanced examples, but I think the point stands. But yeah, are there any benefits to letting the user pass an allocated buffer, or is it better to return an allocated buffer?

CodePudding user response:

Are there any benefits to using one over the other, or is it just useless?

This is a specific case of the more general question of whether a function should return data to its caller via its return value or via an out parameter. Both approaches work fine, and the pros and cons are mostly stylistic, not technical.

The main technical consideration is that each function has only one return value, but can have any number of out parameters. That can be worked around, but doing so might not be acceptable. For example, if you want to reserve your functions' return values for use as status codes such as many standard library functions produce, then that limits your options for sending back other data.
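As a minimal sketch of that pattern, the function below keeps its return value for a status code and sends two results back through out parameters. The name split_minutes and its error convention (0 for success, -1 for failure) are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper: splits a duration in minutes into hours and minutes.
 * The return value is a status code (0 = success, -1 = invalid input);
 * the actual results travel through the two out parameters. */
int split_minutes(int total, int *hours, int *minutes) {
    if (total < 0 || hours == NULL || minutes == NULL) {
        return -1;              /* failure: out parameters are left untouched */
    }
    *hours = total / 60;
    *minutes = total % 60;
    return 0;                   /* success */
}
```

A caller would check the status first, for example `if (split_minutes(135, &h, &m) == 0)`, and only then read `h` and `m`.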

Some of the stylistic considerations are

  • using the return value is more aligned with the idiom of a mathematical function;
  • many people have trouble understanding pointers; and in particular,
  • non-local modifications effected through pointers sometimes confuse people. On the other hand,
  • the return value of a function can be used directly in an expression.

With respect to modifications to the question since this answer was initially posted, if the question is about whether to dynamically allocate and populate a new object vs populating an object presented by the caller, then there are these additional considerations:

  • allocating the object inside the function frees the caller from allocating it themselves, which is a convenience. On the other hand,
  • allocating the object inside the function prevents the caller from allocating it themselves (maybe automatically or statically), and does not provide for re-initializing an existing object. Also,
  • returning a pointer to an allocated object can obscure the fact that the caller has an obligation to free it.

Of course, you can have it both ways:

void init_thing(thing *t, char *name) {
    t->name = name;
}

thing *create_thing(char *name) {
    /* plain malloc; `new` is not C */
    thing *t = malloc(sizeof(*t));

    if (t) {
        init_thing(t, name);
    }
    return t;
}
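To make the "both ways" idea concrete, here is a self-contained sketch of the same pair of functions. The layout of thing is an assumption (the answer does not show it), and destroy_thing is a hypothetical counterpart added so that the obligation to free is visible in the API.

```c
#include <stdlib.h>

/* Assumed layout; the original answer does not show the struct. */
typedef struct thing {
    char *name;
} thing;

/* Populates an object the caller already owns (automatic, static, or heap). */
void init_thing(thing *t, char *name) {
    t->name = name;
}

/* Allocates a new object and populates it; returns NULL if malloc fails. */
thing *create_thing(char *name) {
    thing *t = malloc(sizeof(*t));

    if (t) {
        init_thing(t, name);
    }
    return t;
}

/* Hypothetical counterpart to create_thing: pairs every create with a destroy. */
void destroy_thing(thing *t) {
    free(t);
}
```

A caller that wants an automatic object declares `thing local;` and calls `init_thing(&local, "stack")`, with nothing to free afterwards. A caller that needs the object to outlive the current scope calls `create_thing("heap")` and later hands the pointer to `destroy_thing`.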

CodePudding user response:

Both options work.
But in general, returning information through parameters is often preferable, because the return value of the function is usually reserved for reporting errors, and several pieces of information can be returned through multiple parameters. It is then easy for the caller to check whether the function succeeded by first checking the returned value. Many C library functions and Linux system calls work like this.

Concerning your examples, both options work because you are referencing a constant string which is globally allocated at program's loading time. So, in both solutions, you return the address of this string.
But if you do something like the following:

char *func(void) {
   char buff[] = "example";
   return buff;
}

Here the contents of the string literal "example" are copied into the array buff, which lives in the stack frame of func. Once func returns, the returned address is no longer valid: it refers to a stack location that can be reused by any function called afterwards.
Let's compile a program using this function:

#include <stdio.h>

char *func(void) {
   char buff[] = "example";
   return buff;
}

int main(void) {

  char *p = func();

  printf("%s\n", p);

  return 0; 

}

If the compiler has the relevant warning enabled, we get a first red flag like this:

$ gcc -g bad.c -o bad
bad.c: In function 'func':
bad.c:5:11: warning: function returns address of local variable [-Wreturn-local-addr]
    5 |    return buff;
      |           ^~~~

The compiler points out that func() returns the address of a local variable on its stack, which is no longer valid once the function returns. The option -Wreturn-local-addr controls this warning. Let's disable it to remove the warning:

$ gcc -g bad.c -o bad -Wno-return-local-addr

So now we have a program that compiles with no warnings, but this is misleading: the behavior is undefined, and the program may crash or behave unpredictably:

$ ./bad
Segmentation fault (core dumped)

CodePudding user response:

You can't return the address of local memory.

Your first example works because the memory holding "example" is never deallocated. But if you use local (aka automatic) memory, it is automatically deallocated when the function returns, and the returned pointer is invalid.

char *func() {
   char buff[10];

   // Copy into local memory
   strcpy(buff, "example");

   // buff will be deallocated after returning.
   // warning: function returns address of local variable
   return buff;
}

You can either return dynamic memory allocated with malloc, which the caller must then free:

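#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *func(void) {
  char *buf = malloc(10);
  // malloc can fail; only copy into a valid buffer
  if (buf != NULL) {
    strcpy(buf, "example");
  }
  return buf;
}

int main(void) {
  char *buf = func();
  if (buf != NULL) {
    puts(buf);
  }
  // The caller owns the memory and must free it
  free(buf);
}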

Or you let the caller allocate the memory and pass it in.

#include <stdio.h>
#include <string.h>

void func(char *buff) {
   // Copy a string into the caller's memory
   strcpy(buff, "example");
}

int main(void) {
  // The caller owns the buffer; here it is automatic memory
  char buf[10];
  func(buf);
  puts(buf);
}

The upside is that the caller has full control of the memory. They can reuse existing memory, and they can use local memory.

The downside is that the caller must allocate the correct amount of memory. Allocating too much wastes memory, and allocating too little leads to a buffer overflow unless the function is also told the size of the buffer.
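One common way to soften this downside is to pass the buffer size along with the buffer and have the function report how much space it needed, in the style of snprintf. The sketch below assumes a hypothetical fill_example function; it is one possible shape for such an API, not the only one.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variant of func(): writes "example" into a caller-supplied
 * buffer of the given size, truncating if needed. Like snprintf, it returns
 * the length the full string requires, not counting the terminating NUL. */
int fill_example(char *buff, size_t size) {
    return snprintf(buff, size, "%s", "example");
}
```

If the returned value is greater than or equal to the size passed in, the output was truncated and the caller can retry with a larger buffer. Calling `fill_example(NULL, 0)` writes nothing and only reports the required length.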

An additional downside is that the function has no control over the memory which has been passed in. It cannot grow, shrink, or free the memory.
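For contrast, here is a sketch of what becomes possible when the function is allowed to manage heap memory itself: it can grow the buffer as needed. The name append_heap and its contract (the string must be NULL or come from malloc) are assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends suffix to a heap string that this function
 * is allowed to resize. *str must be NULL or a malloc'd, NUL-terminated
 * string. Returns 0 on success, or -1 if allocation fails, in which case
 * *str is left unchanged and still valid. */
int append_heap(char **str, const char *suffix) {
    size_t old_len = (*str != NULL) ? strlen(*str) : 0;
    size_t add_len = strlen(suffix);
    char *grown = realloc(*str, old_len + add_len + 1);

    if (grown == NULL) {
        return -1;
    }
    /* Copy the suffix including its terminating NUL. */
    memcpy(grown + old_len, suffix, add_len + 1);
    *str = grown;
    return 0;
}
```

The caller still has to free the final string, and may not pass automatic or static memory, which is exactly the trade-off described above.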

You can only return one thing from a function.

For example, if you want to convert a string to an integer, you could return the integer like atoi does: int atoi( const char *str ).

int num = atoi("42");

But then what happens when the conversion fails? atoi returns 0, but how do you tell the difference between atoi("0") and atoi("purple")?

You can instead pass in an int * for the converted value: int my_atoi( const char *str, int *ret ).

int num;
int err = my_atoi("42", &num);
if(err) {
  exit(1);
}
else {
  printf("%d\n", num);
}
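The answer only gives the signature of my_atoi, so the following is one possible sketch of it built on strtol. The error convention (0 for success, -1 for failure) and the decision to reject trailing characters are assumptions.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Sketch of the hypothetical my_atoi: returns 0 on success, -1 on failure.
 * *ret is written only when the conversion succeeds. */
int my_atoi(const char *str, int *ret) {
    char *end;
    long value;

    errno = 0;
    value = strtol(str, &end, 10);

    if (end == str || *end != '\0') {
        return -1;   /* no digits were consumed, or trailing characters remain */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;   /* the value does not fit in an int */
    }
    *ret = (int)value;
    return 0;
}
```

With this version, `my_atoi("0", &num)` succeeds and stores 0, while `my_atoi("purple", &num)` returns a nonzero status and leaves `num` alone, so the two cases are no longer ambiguous.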