Validate S3 signature Python

Time:06-14

I want to validate the signature of an S3 presigned URL (the storage is hosted at DigitalOcean) in Python. As far as I know, the signature is computed from the full URL together with the secret key.

I've already tried approaches like the one in "AWS S3 presigned urls with boto3 - Signature mismatch", but that produces a different signature.

I want to check the signature given in the URL (of an image for example) by recreating it with the hashing algorithm.

How would I go about doing this?

CodePudding user response:

I had the same problem and was hoping the boto package would provide an easy way to do this, but unfortunately it doesn't.

I also tried to use boto to create the same signature based on the URL, but the problem is the timestamp (X-Amz-Date in the URL). To get the exact same signature, the timestamp provided in the URL must be used when generating it. I went down the rabbit hole trying to 'override' the datetime, but it seems to be impossible.
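Since the signing time is embedded in the URL, it can be read back directly instead of being overridden. Here is a minimal sketch, assuming the usual `X-Amz-Date` format (`YYYYMMDDTHHMMSSZ`) and an `X-Amz-Expires` value in seconds; `signing_window` is a made-up helper name, not something boto provides.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs


def signing_window(url):
    """Return (signed_at, expires_at) as UTC datetimes read from a presigned URL."""
    query = parse_qs(urlparse(url).query)
    # X-Amz-Date is the exact timestamp that went into the signature
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the validity period in seconds, counted from X-Amz-Date
    expires_at = signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at, expires_at
```

A caller could compare `expires_at` with `datetime.now(timezone.utc)` to reject an expired URL before recomputing the signature.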

So what's left is generating the signature from scratch, like you said you tried. The code in the question you linked does work but it's not straightforward.

Inspired by that link and the boto3 source, this is what I've created and it seems to work:

from urllib.parse import urlparse, parse_qs, urlencode, quote
import hashlib
import hmac
from django.conf import settings

def validate_s3_url(url, method='GET'):
    """
    Check whether the signature in the given S3 URL is valid,
    considering the other parts of the URL.
    This requires access to the secret access key that was used
    to sign the request (the access key ID is available in the
    URL).
    """
    parts = urlparse(url)
    querydict = parse_qs(parts.query)
    # get relevant query parameters
    url_signature = querydict['X-Amz-Signature'][0]
    credentials = querydict['X-Amz-Credential'][0]
    algorithm = querydict['X-Amz-Algorithm'][0]
    timestamp = querydict['X-Amz-Date'][0]
    signed_headers = querydict['X-Amz-SignedHeaders'][0]

    # if we have multiple access keys we could use access_key_id to get the right one
    access_key_id, credential_scope = credentials.split("/", maxsplit=1)
    host = parts.netloc

    # important: the keys are listed in sorted order, which is essential;
    # dicts preserve insertion order (Python 3.7+), they are not sorted automatically
    canonical_querydict = {
        'X-Amz-Algorithm': [algorithm],
        'X-Amz-Credential': [credentials],
        'X-Amz-Date': [timestamp],
        'X-Amz-Expires': querydict['X-Amz-Expires'],
        'X-Amz-SignedHeaders': [signed_headers],
    }
    # this is optional (to force download with a specific name)
    # if used, it's passed in as the 'ResponseContentDisposition' param when signing.
    # adding it last keeps the keys sorted: lowercase sorts after uppercase in ASCII
    if 'response-content-disposition' in querydict:
        canonical_querydict['response-content-disposition'] = querydict['response-content-disposition']
    canonical_querystring = urlencode(canonical_querydict, doseq=True, quote_via=quote)

    # build the request, hash it and build the string to sign
    # (this assumes 'host' is the only signed header and the payload is unsigned)
    canonical_request = f"{method}\n{parts.path}\n{canonical_querystring}\nhost:{host}\n\n{signed_headers}\nUNSIGNED-PAYLOAD"
    hashed_request = hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()
    string_to_sign = f"{algorithm}\n{timestamp}\n{credential_scope}\n{hashed_request}"

    # generate signing key from credential scope.
    signing_key = f"AWS4{settings.AWS_SECRET_ACCESS_KEY}".encode('utf-8')
    for message in credential_scope.split("/"):
        signing_key = hmac.new(signing_key, message.encode('utf-8'), hashlib.sha256).digest()

    # sign the string with the key and check if it's the same as the one provided in the url
    signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()

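    # compare_digest compares in constant time, so it does not leak timing information
    return hmac.compare_digest(url_signature.encode('utf-8'), signature.encode('utf-8'))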

This uses Django settings to get the secret key, but it could really come from anywhere.
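For use outside Django, the same steps can be sketched with only the standard library, taking the secret key as an argument. This is a rough sketch rather than a definitive implementation. The names `derive_signing_key`, `compute_signature` and `is_valid_signature` are made up for illustration. It assumes `host` is the only signed header and the payload is `UNSIGNED-PAYLOAD`, as in the function above. Unlike that function, it canonicalises every query parameter except `X-Amz-Signature` instead of listing them by hand.

```python
import hashlib
import hmac
from urllib.parse import urlparse, parse_qsl, quote


def derive_signing_key(secret_key, credential_scope):
    """Chain HMAC-SHA256 over the scope parts (date/region/service/aws4_request)."""
    key = f"AWS4{secret_key}".encode("utf-8")
    for part in credential_scope.split("/"):
        key = hmac.new(key, part.encode("utf-8"), hashlib.sha256).digest()
    return key


def compute_signature(url, secret_key, method="GET"):
    """Recompute the hex signature of a presigned URL (assumes only 'host' is signed)."""
    parts = urlparse(url)
    # every query parameter except the signature itself goes into the canonical request
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k != "X-Amz-Signature"]
    query = dict(params)
    # percent-encode names and values, then sort by encoded name
    encoded = sorted((quote(k, safe="-_.~"), quote(v, safe="-_.~")) for k, v in params)
    canonical_query = "&".join(f"{k}={v}" for k, v in encoded)

    canonical_request = "\n".join([
        method,
        parts.path,
        canonical_query,
        f"host:{parts.netloc}",
        "",
        query["X-Amz-SignedHeaders"],
        "UNSIGNED-PAYLOAD",
    ])
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()

    _access_key_id, credential_scope = query["X-Amz-Credential"].split("/", maxsplit=1)
    string_to_sign = "\n".join([
        query["X-Amz-Algorithm"], query["X-Amz-Date"], credential_scope, hashed_request,
    ])
    signing_key = derive_signing_key(secret_key, credential_scope)
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_signature(url, secret_key, method="GET"):
    """Compare the signature in the URL with the recomputed one in constant time."""
    provided = dict(parse_qsl(urlparse(url).query)).get("X-Amz-Signature", "")
    expected = compute_signature(url, secret_key, method)
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

Calling `is_valid_signature(url, secret_key)` returns `True` only if the URL is unchanged since it was signed with that key. A URL that signs additional headers would need them added to the canonical request.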
