Why is it hashed differently when the same value is entered? In C

Time:03-30

I am using SHA1 to hash my ID.

However, even if I enter the same ID, it is hashed differently.

This is my code:

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>

char *sha1_hash(char *input_url, char *hashed_url) {
    unsigned char hashed_160bits[20];
    char hashed_hex[41];
    int i;
    
    SHA1(input_url, 160, hashed_160bits);

    for(i=0; i < sizeof(hashed_160bits); i++) {
        sprintf(hashed_hex + i*2, "%02x", hashed_160bits[i]);
    }        

    strcpy(hashed_url, hashed_hex);

    return hashed_url;
}

int main()
{   
    char *input_url;
    char *hashed_url;
    
    while(1) {
        input_url = malloc(sizeof(char)* 1024);
        hashed_url = malloc(sizeof(char) * 1024);
        
        printf("input url> ");
        scanf("%s", input_url);
        
        if (strcmp(input_url, "bye") == 0) {
            free(hashed_url);
            free(input_url);
            break;
        }

        sha1_hash(input_url, hashed_url);
    
        printf("hashed_url: %s\n", hashed_url);
        free(hashed_url);
        free(input_url);
    }

    return 0;
}

If I enter the same value on the first and second attempts, the two hashes differ, but the third attempt produces the same hash as the second.

I think the dynamic allocation is the problem, but I cannot think of a way to fix it.

CodePudding user response:

The problem seems to be in the uninitialized memory you are allocating.

malloc reserves memory for you, but the contents are 'whatever has been in there before'. And since you are not only hashing the string contents, but the entire buffer, you get different results each time.

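Try using calloc, running memset over the buffer, or limiting your hashing to strlen(input), and see if that helps. Note that zeroing the buffer only makes the result repeatable: you would still be hashing 160 bytes (the string plus zero padding), so only the strlen approach gives the standard SHA1 of the string itself.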
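To illustrate why the bytes after the string matter, here is a minimal sketch. It uses a toy FNV-1a checksum as a stand-in for SHA1, an assumption made only so the example builds without OpenSSL; toy_hash and fill_buffer are hypothetical helper names, and the filler byte simulates whatever stale data malloc might hand back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy 32-bit FNV-1a checksum. This is NOT SHA1; it only stands in for
 * "some hash over len bytes" so the effect of len is easy to observe. */
uint32_t toy_hash(const unsigned char *data, size_t len) {
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Fill the whole buffer with a filler byte (simulating leftover heap
 * contents), then copy the string, including its NUL, to the front. */
void fill_buffer(unsigned char *buf, size_t size,
                 unsigned char filler, const char *text) {
    assert(strlen(text) < size);       /* text plus NUL must fit */
    memset(buf, filler, size);
    strcpy((char *)buf, text);
}
```

With two buffers that hold the same string but different filler bytes, hashing a fixed 160 bytes mixes the filler into the result, while hashing only strlen bytes ignores it. Zero-filling both buffers (what calloc would do) makes the 160-byte results agree with each other, but they still cover the padding and not just the string.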

CodePudding user response:

You're not calling SHA1 correctly:

SHA1(input_ID, 160, hashed_ID_160bits);

The second parameter is the length of the data to hash. You're instead passing in the number of bits in the hash. As a result, you're reading past the end of the string contained in input_ID into uninitialized memory and possibly past the end of the allocated memory segment. This triggers undefined behavior.

You instead want:

SHA1(input_ID, strlen(input_ID), hashed_ID_160bits);

CodePudding user response:

SHA1(input_ID, 160, hashed_ID_160bits);

That line is wrong. You are always hashing 160 bytes of input. I assume you want the hash of the input text only, so use its length:

SHA1(input_ID, strlen(input_ID), hashed_ID_160bits);

SHA1 always produces a 160-bit hash, so there is no need to pass 160 as a parameter. If you want a different hash size, you need to use a different function, documented here, and then of course modify the rest of the code to match that hash size.


You get different hashes at different times because you are reading the uninitialized part of the malloc buffer. This is Undefined Behavior, so "anything" can happen, and it is generally not useful to try to figure out exactly what happens, because the outcome is not necessarily deterministic. If you want to dig deeper, you could, for example, use a debugger to examine the memory addresses and contents on different loop iterations to see what changed. However, since this is Undefined Behavior, it is notoriously common for bad code to behave differently when you run it under a debugger or add debug prints.
