Why is my sha256 checksum incompatible with aws glacier checksum response?

Time:10-10

I have an archive file on an Ubuntu server. I uploaded this file to AWS Glacier using the AWS CLI. When the upload finished, AWS returned a checksum like this:

{"checksum": "6c126443c882b8b0be912c91617a5765050d7c99dc43b9d30e47c42635ab02d5"}

But when I computed the checksum on my own server like this:

sunny@server:~$ sha256sum backup.zip

it returned this checksum:

5ba29292a350c4a8f194c78dd0ef537ec21ca075f1fe649ae6296c7100b25ba8

Why do the two checksums differ?

Answer:

The checksum returned by Glacier is based on SHA-256, but it is not a plain SHA-256 sum over the entire file. It is a tree hash: Glacier computes a SHA-256 hash for each 1 MiB (1,048,576-byte) chunk of data, then hashes each pair of adjacent hashes together, carrying an unpaired hash up to the next level unchanged, and repeats the process until a single hash remains. For more information, see the documentation.
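To make the pairing concrete, here is a minimal sketch that builds the tree by hand for a three-chunk input. The sample data and the variable names are made up for illustration; only the 1 MiB chunk size and the pairing scheme come from the description above.

```python
import hashlib

MIB = 1024 * 1024  # chunk size: 1 MiB (1,048,576 bytes)

def sha256(data):
    """Return the raw (binary) SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

# Hypothetical sample data: 2.5 MiB, which splits into three chunks
data = b"a" * MIB + b"b" * MIB + b"c" * (MIB // 2)

# Level 0: one hash per 1 MiB chunk (the last chunk may be shorter)
h0 = sha256(data[0:MIB])
h1 = sha256(data[MIB:2 * MIB])
h2 = sha256(data[2 * MIB:])

# Level 1: hash the concatenated pair; the unpaired h2 is carried up unchanged
h01 = sha256(h0 + h1)

# Level 2: one pair is left, so its hash is the root of the tree
root = sha256(h01 + h2)

print("tree hash:   ", root.hex())
print("plain sha256:", hashlib.sha256(data).hexdigest())
```

The two printed values differ, which is the mismatch seen in the question. For a file of 1 MiB or less there is only one chunk, so under this scheme the tree hash is the same as the plain SHA-256.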

Here is a simple implementation in Python:

#!/usr/bin/env python3
import hashlib
import sys
import binascii

# Given a file object (opened in binary mode), calculate the checksum used by glacier
def calc_hash_tree(fileobj):
    chunk_size = 1048576

    # Calculate a list of hashes for each chunk in the fileobj
    chunks = []
    while True:
        chunk = fileobj.read(chunk_size)
        if len(chunk) == 0:
            break
        chunks.append(hashlib.sha256(chunk).digest())
    
    # Now calculate each level of the tree till one digest remains
    while len(chunks) > 1:
        next_chunks = []
        while len(chunks) > 1:
            next_chunks.append(hashlib.sha256(chunks.pop(0) + chunks.pop(0)).digest())
        if len(chunks) > 0:
            next_chunks.append(chunks.pop(0))
        chunks = next_chunks

    # The final remaining hash is the root of the tree:
    return binascii.hexlify(chunks[0]).decode("utf-8")

if __name__ == "__main__":
    with open(sys.argv[1], "rb") as f:
        print(calc_hash_tree(f))

You can call it on a single file like this:

$ ./glacier_checksum.py backup.zip
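If the goal is to confirm an upload, the computed value can be compared against the checksum that Glacier reported. The following is only a sketch under the same 1 MiB tree-hash scheme described above: `tree_hash` and `matches_glacier_checksum` are hypothetical helper names, not part of any AWS tool, and the block repeats the hashing logic so that it runs on its own. It assumes non-empty input.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB

def tree_hash(fileobj):
    """Return the hex tree hash of a binary file object (assumes non-empty input)."""
    # Leaf level: one SHA-256 digest per 1 MiB chunk
    level = []
    while True:
        chunk = fileobj.read(CHUNK_SIZE)
        if not chunk:
            break
        level.append(hashlib.sha256(chunk).digest())

    # Combine adjacent pairs until a single digest remains
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        if len(level) % 2 == 1:
            # An unpaired digest is carried up unchanged
            next_level.append(level[-1])
        level = next_level
    return level[0].hex()

def matches_glacier_checksum(fileobj, reported):
    """Compare the local tree hash with a reported hex checksum string."""
    return tree_hash(fileobj) == reported.strip().lower()

if __name__ == "__main__":
    # Stand-in for a real file: in-memory bytes spanning four chunks
    sample = io.BytesIO(b"x" * (3 * CHUNK_SIZE + 123))
    reported = tree_hash(sample)  # stand-in for the value returned by the upload
    sample.seek(0)
    print(matches_glacier_checksum(sample, reported))
```

With a real archive, the file would be opened with `open(path, "rb")` and `reported` would be the `checksum` field from the upload response.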