Checking if AWS S3 presigned link exists using wget --spider


I've read several threads on SO about checking whether a URL exists or not in bash, e.g. #37345831, and the recommended solution was to use wget with --spider. However, the --spider option appears to fail when used with AWS S3 presigned URLs.

Calling:

wget -S --spider "${URL}" 2>&1

Results in:

HTTP request sent, awaiting response...
  HTTP/1.1 403 Forbidden
  x-amz-request-id: [REF]
  x-amz-id-2: [REF]
  Content-Type: application/xml
  Date: [DATE]
  Server: AmazonS3
Remote file does not exist -- broken link!!!

Whereas the following returns HTTP/1.1 200 OK, as expected, for the same input URL:

wget -S "${URL}" -O /dev/stdout | head

The version of wget I'm running is:

GNU Wget 1.20.3 built on linux-gnu.

Any clue as to what's going on?

CodePudding user response:

Any clue as to what's going on?

There are several HTTP request methods, also known as HTTP verbs. Two of them are relevant here:

  • GET
  • HEAD

When not instructed otherwise, wget issues the first of them (GET). With the --spider option it issues the second (HEAD), to which the server should respond with headers only (no body).
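
To see the two verbs side by side against the same URL, you can compare wget's default GET with the HEAD that --spider sends; the curl lines are illustrative equivalents (a minimal sketch, assuming ${URL} holds your presigned link):

# GET -- wget's default; the body is downloaded and discarded here
wget -S -O /dev/null "${URL}"

# HEAD -- what wget --spider sends; headers only, no body
wget -S --spider "${URL}"

# curl equivalents for comparison
curl -sS -o /dev/null -D - "${URL}"   # GET, dump response headers
curl -sS -I "${URL}"                  # HEAD

With a GET-signed presigned link, the GET variants are expected to return 200 while the HEAD variants return 403, matching the output shown in the question.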

AWS S3 presigned link

According to Signing and authenticating REST requests - Amazon Simple Storage Service, one of the steps in preparing the signature is building the following string:

StringToSign = HTTP-Verb + "\n" +
    Content-MD5 + "\n" +
    Content-Type + "\n" +
    Date + "\n" +
    CanonicalizedAmzHeaders +
    CanonicalizedResource;
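
To make the role of the verb concrete, here is a rough sketch of the legacy query-string (SigV2) signing that the quoted excerpt describes, done with openssl; all values below are placeholders, and for presigned links the Date field is replaced by an Expires timestamp. Newer links are signed with SigV4, but the HTTP verb is part of the signed string there as well.

VERB="GET"                         # changing this to HEAD changes the signature
EXPIRES="1700000000"               # placeholder Unix timestamp
RESOURCE="/my-bucket/path/to/key"  # placeholder bucket/key
SECRET_KEY="..."                   # placeholder secret access key

# Content-MD5, Content-Type and CanonicalizedAmzHeaders are empty here
Signature=$(printf '%s\n\n\n%s\n%s' "${VERB}" "${EXPIRES}" "${RESOURCE}" \
  | openssl sha1 -hmac "${SECRET_KEY}" -binary | base64)
echo "${Signature}"

Because the verb is the very first component of the string being signed, a signature computed for GET cannot validate a HEAD request.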

Therefore we might conclude that an AWS S3 presigned link works with exactly one HTTP verb, namely the one it was signed for. The link you have was signed for GET. Ask whoever crafted that link to furnish you with a presigned link made for HEAD if you wish to use --spider successfully.
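
If obtaining a HEAD-signed link is not practical, one workaround is to use the GET-signed link you already have but fetch as little as possible and inspect only the status code (a sketch; S3 generally honors Range requests on GET, and curl's -r 0-0 asks for just the first byte, so a successful check returns 206):

status=$(curl -s -o /dev/null -r 0-0 -w '%{http_code}' "${URL}")
case "${status}" in
    200|206) echo "object exists" ;;
    *)       echo "object missing or link invalid (HTTP ${status})" ;;
esac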
