Understanding uninitialised value(s) C

Time:10-14

I was writing a tokenizer in C, and although it works exactly as expected, Valgrind reports an uninitialised value(s) error. I know of more than one change to the code that fixes the problem. The tokenizer is compiled and put into a static library before use.

The tokenizer in question

char ** tokenizer (char *source, const char *delim, int *length){
    char * string;
    int dimension = 2;
    char ** final_tokenization = malloc(sizeof(char *) * dimension);

    string = strtok(source,delim);
    int i = 0;

    while(string != NULL){
        if(i < dimension){
            final_tokenization[i] = string;
            i++;
        }else{
            dimension *= 2;
            final_tokenization = realloc(final_tokenization,sizeof(char *) * dimension);
            if(final_tokenization == NULL){
                return NULL;
            }
            final_tokenization[i] = string;
            i++;
        }
        string = strtok(NULL,delim);
    }
    //length being the final length of final_tokenization
    *length = i;
    return final_tokenization;
}

A simple main to test it

int main(int argc,char **argv){
if(argc == 3){
    int * length;

    char ** string = tokenizer(argv[2],argv[1],length);

    if((*length) == -1 && string == NULL){
        exit(-1);
    }
    
    for(int i = 0; i < (*length); i++){
        printf("%s \n",string[i]);
    }

    free(string);
}else{
    exit(-1);
}
return 0;
}

A call of ./main " " "token1 token2 token3" should then print one "tokenN" per line.

Now, as I understand it, when I call tokenizer(argv[2],argv[1],length) I'm passing the value of the address of length to the function. What is 'inside' said address is unknown (as I've not yet initialized the variable), so reading from it would return some garbage. Moreover, modifying main to

int main(int argc,char **argv){
    if(argc == 3){
        int length;
        
        char ** string = tokenizer(argv[2],argv[1],&length);
[...]
}

does not raise any errors. Why does this happen? Shouldn't the result be the same?

CodePudding user response:

When you do this:

int * length;

char ** string = tokenizer(argv[2],argv[1],length);

The value of length, which is itself a pointer, is indeterminate. You are reading that indeterminate value and passing it to a function, and that read is what Valgrind is telling you about.

This code will most likely segfault because tokenizer will attempt to read that indeterminate value as a pointer and then attempt to dereference it.

In contrast, when you do this:

    int length;
    
    char ** string = tokenizer(argv[2],argv[1],&length);

The value you're passing to the function is the address of length which is well defined. Then when tokenizer dereferences that pointer value and writes a value to the dereferenced pointer, it writes to length in main.
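To make the difference concrete, here is a minimal sketch of the working pattern. The tokenizer is a lightly tidied variant of the one in the question (it sets *length to -1 on allocation failure, which is my assumption about what the test in main expects), and count_tokens is a hypothetical helper added only for illustration. The point is that length is an actual int object, so &length is a valid address even though the int itself has not been assigned yet.

```c
#include <stdlib.h>
#include <string.h>

/* Splits source in place with strtok and stores the token count through
   length. Assumption: on allocation failure it returns NULL and sets
   *length to -1. */
char **tokenizer(char *source, const char *delim, int *length) {
    int capacity = 2;
    int count = 0;
    char **tokens = malloc(sizeof *tokens * capacity);
    if (tokens == NULL) {
        *length = -1;
        return NULL;
    }

    for (char *tok = strtok(source, delim); tok != NULL;
         tok = strtok(NULL, delim)) {
        if (count == capacity) {
            /* Grow through a temporary so the old block is not leaked
               if realloc fails. */
            capacity *= 2;
            char **grown = realloc(tokens, sizeof *tokens * capacity);
            if (grown == NULL) {
                free(tokens);
                *length = -1;
                return NULL;
            }
            tokens = grown;
        }
        tokens[count++] = tok;
    }

    *length = count;
    return tokens;
}

/* Hypothetical helper: counts the tokens in text without modifying it. */
int count_tokens(const char *text, const char *delim) {
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);          /* strtok needs a writable buffer */
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, text, n);

    int length;                      /* unassigned int, but it has a valid address */
    char **tokens = tokenizer(copy, delim, &length);  /* tokenizer writes into it */

    free(tokens);                    /* free(NULL) is a no-op on failure */
    free(copy);
    return length;
}
```

With this version, count_tokens("token1 token2 token3", " ") gives 3. If you really want a pointer variable in main, it has to point at storage that exists first, for example int storage; int *length = &storage;, before it is passed to tokenizer.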
