I did a lot of searching but can't find an answer to my problem. I'm trying to write a function that hashes larger files in chunks. I wrote two different versions, yet both produce the same output for every file I hash, namely:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
which I know is the SHA256 hash of an empty byte string.
The two solutions I tried to use were:
from hashlib import sha256 as hashfun
bs = 1048576 # I'll just use 1024kb chunks as an example
selectfile = "/PATH/TO/FILE"
with open(selectfile, 'rb') as f:
    for chunk in iter(lambda: f.read(bs), b''):
        hashfun().update(chunk)
print(hashfun().hexdigest())
and
from hashlib import sha256 as hashfun
bs = 1048576
selectfile = "/PATH/TO/FILE"
with open(selectfile, 'rb') as f:
    while True:
        data = f.read(bs)
        if not data:
            break
        hashfun().update(data)
print(hashfun().hexdigest())
Neither of these works. I know the file is not the issue, since a simple version that reads the whole file at once, without chunking, worked before:
from hashlib import sha256 as hashfun
selectfile = "/PATH/TO/FILE"
with open(selectfile, 'rb') as f:
    rbytes = f.read()
    readable_hash = hashfun(rbytes).hexdigest()
    print(readable_hash)
Is there something super obvious that I am doing wrong here (probably lol)? Thanks in advance!
CodePudding user response:
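The code calls the hash constructor repeatedly instead of creating a single hash instance and updating it. Every hashfun() call returns a new, empty hash object, so each update() goes to a temporary object that is discarded right away, and the final hashfun().hexdigest() is the digest of an object that never received any data. That is why you always get the hash of an empty byte string.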
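To make the difference concrete, here is a minimal sketch of both behaviours side by side. The names empty_digest, real_digest and h are only illustrative and not from the question:

```python
from hashlib import sha256 as hashfun

# Each call to hashfun() builds a brand-new hash object with no data in it.
hashfun().update(b"some data")        # updates a temporary object that is then discarded
empty_digest = hashfun().hexdigest()  # a second, fresh object: nothing was ever fed to it

# A single object that is kept around accumulates the data across update() calls.
h = hashfun()
h.update(b"some data")
real_digest = h.hexdigest()

print(empty_digest == hashfun(b"").hexdigest())           # True
print(real_digest == hashfun(b"some data").hexdigest())   # True
```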
This should work:
from hashlib import sha256 as hashfun
bs = 1048576 # I'll just use 1024kb chunks as an example
selectfile = "/PATH/TO/FILE"
hash_ = hashfun()
with open(selectfile, 'rb') as f:
    for chunk in iter(lambda: f.read(bs), b''):
        hash_.update(chunk)
print(hash_.hexdigest())
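If you want to reuse this, one possible way to wrap it up is a small helper function. This is only a sketch: the name hash_file and the chunk_size parameter are assumptions of mine, and the self-check at the bottom uses a throwaway temporary file rather than your real path.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=1048576):
    """Return the SHA-256 hex digest of the file at path, reading it in chunks."""
    digest = hashlib.sha256()  # one hash object, created once and reused
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Self-check on a temporary file: the chunked digest should match hashing in one go.
payload = b"x" * 3000000  # larger than one chunk, so several reads happen
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    print(hash_file(tmp.name) == hashlib.sha256(payload).hexdigest())  # True
finally:
    os.remove(tmp.name)
```

With your own file you would call it as hash_file("/PATH/TO/FILE"). On Python 3.11 and later, hashlib.file_digest(f, "sha256") does the chunked reading for you, given a file object opened in binary mode.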