Avoiding Side Effects in C

Time:09-23

This question particularly applies to arrays.

In many languages, I would do something like this:

    #This is in python for simplicity:

    def increment(mylist):
        for i in mylist:
            i += 1
            
        return mylist
    

    mylist = {0,1,2,3}
    mylist = increment(mylist)

I tried several ways to return arrays in C, none of which worked as above. It seems that C simply was not intended to work this way. Instead, I had to do this:

#include <stdio.h>

void increment(int *myarray, int size) {
    for(int i = 0; i < size; i++){
        myarray[i] += 1;
        
    }
}

int main(){
    int myarray[4] = {0,1,2,3};
    increment(myarray, 4);

}

Needless to say, the C function changes the state of the array, so it has a side effect. There are good reasons to avoid this (which are not the topic of this question).

Is there a way to avoid these types of side effects in C?

CodePudding user response:

Beware: in Python, objects are passed by reference.

def increment(mylist):
    # mylist is a local reference to the original list
    for i in mylist:
        i += 1   # i is a local value: nothing is changed in mylist!

    return mylist  # returns a reference to the original (and unchanged...) list

What should be done to alter the original list:

def increment(mylist):
    for i in range(len(mylist)):
        mylist[i] += 1
    # returning mylist is optional since the caller's list has been modified

This is the exact equivalent of the following C code:

int *increment(int array[], int size) {
    for (int i=0; i<size; ++i) {
        array[i] += 1;
    }
    return array;
}

But in Python you can build and return a brand new list this way:

def increment(mylist):
    return [i + 1 for i in mylist]

Which cannot be done as easily in C. The idiomatic way is either to let the caller provide the array and size (as above) or to return a dynamically allocated array:

int *increment(int array[], int size) {
    int *new_array = malloc(size * sizeof(int));
    for (int i=0; i<size; ++i) {
        new_array[i] = array[i] + 1;
    }
    return new_array;
}

and transfer ownership to the caller, who must free the returned array when done.
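The allocating version above does not check whether malloc succeeded. A minimal sketch of the same idea with that check added might look like the following; the name `increment_copy` and the choice to return NULL for a non-positive size are assumptions made for illustration, not part of the original answer.

```c
#include <stdlib.h>

/* Hypothetical name: returns a newly allocated array holding each input
   element plus one, or NULL if size is not positive or malloc fails.
   The input is const, so it is never modified. */
int *increment_copy(const int *array, int size) {
    if (size <= 0) {
        return NULL;
    }
    int *new_array = malloc((size_t)size * sizeof *new_array);
    if (new_array == NULL) {
        return NULL;   /* allocation failed: nothing for the caller to free */
    }
    for (int i = 0; i < size; ++i) {
        new_array[i] = array[i] + 1;
    }
    return new_array;  /* ownership passes to the caller */
}
```

The caller would test the result against NULL before using it and call `free` on it exactly once when finished.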

CodePudding user response:

First, in Python, {0,1,2,3} is not a list but a set.

The Python code that would be more directly equivalent to what you did in C would be:

def increment(mylist):
    for i in range(len(mylist)):
        mylist[i] += 1
        
    return mylist


mylist = [0,1,2,3]
mylist = increment(mylist)

In that case there is a side effect on the list in Python too. That is because the most common way to pass an array is by reference (or by pointer in the case of C).

C code that would be closer to what you did in your Python code is:

void increment(int *myarray, int size) {
    for(int i = 0; i < size; i++){
        int v = myarray[i]; // copy of the array value here
        v += 1;
    }
}

int main(){
    int myarray[4] = {0,1,2,3};
    increment(myarray, 4);
}

In that case there is no side effect on the array either, because I just made a copy of the array value before using it.

If you want to avoid side effects, the general rule is that you have to make copies, either of your array or of your individual array values.
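One way to apply the copy rule in C without heap allocation is to have the caller supply a separate destination array. The sketch below assumes a hypothetical helper named `increment_into`; it is not from the answer above. Strictly speaking it still writes through `dst`, so it is not free of side effects, but the input array is left untouched and the `const` qualifier lets the compiler reject accidental writes to it.

```c
#include <stddef.h>

/* Hypothetical helper: reads from src and stores src[i] + 1 in dst.
   Assumes dst has room for at least size elements and does not overlap src. */
void increment_into(int *dst, const int *src, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        dst[i] = src[i] + 1;   /* only the destination is written */
    }
}
```

A caller would declare something like `int result[4];` next to the original array and call `increment_into(result, myarray, 4);`, after which `myarray` still holds its original values.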

EDIT: what you probably wanted to do in your python function was

def increment(mylist):
    mylist = list(mylist) # copy array
    for i in range(len(mylist)):
        mylist[i] += 1
    return mylist

CodePudding user response:

In the C code, you pass a pointer to the first element of the array, while the array stays in memory. What you can do is create a new array, then return a pointer to it. However, be careful. If you create an auto array (created on stack), it will only exist inside of the function, so the returned pointer will point to garbage memory.

int* increment(int *myarray, int size) {
    int tempArray[size]; //only exists inside of the function.
    for(int i = 0; i < size; i++){
        tempArray[i] = myarray[i] + 1;
    }
    return tempArray; //don't do this, tempArray will not exist outside of this function.
}

You can instead use the malloc function, which allocates heap memory that continues to exist after the function returns. (You will need to include stdlib.h.)

#include <stdio.h>
#include <stdlib.h>

int* increment(int *myarray, int size) {
    int* tempArray = malloc(size*sizeof(int)); //heap memory: stays valid until freed
    for(int i = 0; i < size; i++){
        tempArray[i] = myarray[i] + 1;
    }
    return tempArray;
}

int main(){
    int myarray[4] = {0,1,2,3};
    int* newarray = increment(myarray, 4);
    //use the newarray - myarray stays the same.
    free(newarray); //don't forget to free when you no longer need it
}

CodePudding user response:

Changing the contents of an array is always a "side-effect" in C, as the formal definition goes. If you are rather looking for a way to make an array etc immutable, as in read-only and always creating a new object upon manipulation, there are ways to do that too.

You have to be aware that this typically involves a "hard copy" of the data contents, so it comes with execution overhead. C gives you the option not to be that inefficient if you don't want to. But if you want it, then the more flexible option is dynamic allocation. Something like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int* increment(const int* myarray, int size) 
{
  int* new_obj = malloc(sizeof(int[size]));
  for(int i=0; i<size; i++)
  {
    new_obj[i] = myarray[i] + 1;
  }
  return new_obj;
}

int main (void)
{
  int* myarray = malloc(sizeof(int[4]));
  memcpy(myarray, (int[]){0,1,2,3}, sizeof(int[4]));
  for(int i=0; i<4; i++)
  {
    printf("%d ", myarray[i]);
  }
  puts("");
  
  int* another_array = increment(myarray, 4);
  free(myarray);
  
  for(int i=0; i<4; i++)
  {
    printf("%d ", another_array[i]);
  }

  free(another_array);
}

Note that this is significantly slower than modifying the original array in place. The heap allocation and the data copy are both relatively slow.

You could create "bad API" functions in C though, such as

int* increment(int *myarray, int size) {
    for(int i = 0; i < size; i++){
        myarray[i] += 1;
        
    }
    return myarray;
}

This returns a pointer to the same array that was passed along. It's bad API because it's confusing, though some C standard functions were designed just like this (strcpy etc). And in order to use this function you need a pointer to the first element of the array, rather than the array itself.
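To see the strcpy behaviour mentioned above, here is a small sketch. The helper name `strcpy_returns_dest` is made up for illustration; the only standard call used is `strcpy`, which returns its destination argument.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst and reports whether strcpy
   returned the very pointer it was given as the destination.
   Assumes dst has room for src including the terminating '\0'. */
int strcpy_returns_dest(char *dst, const char *src) {
    char *result = strcpy(dst, src);  /* modifies dst in place */
    return result == dst;             /* nonzero: same pointer came back */
}
```

The returned pointer carries no new information, which is why this style can mislead a reader into thinking a fresh object was created.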
