I've read several threads on SO about checking whether a URL exists or not in bash, e.g. #37345831, and the recommended solution was to use wget with --spider. However, the --spider option appears to fail when used with AWS S3 presigned URLs.
Calling:
wget -S --spider "${URL}" 2>&1
Results in:
HTTP request sent, awaiting response...
HTTP/1.1 403 Forbidden
x-amz-request-id: [REF]
x-amz-id-2: [REF]
Content-Type: application/xml
Date: [DATE]
Server: AmazonS3
Remote file does not exist -- broken link!!!
Whereas the following returns HTTP/1.1 200 OK, as expected, for the same input URL:
wget -S "${URL}" -O /dev/stdout | head
The version of wget I'm running is:
GNU Wget 1.20.3 built on linux-gnu.
Any clue as to what's going on?
CodePudding user response:
Any clue as to what's going on?
There are several HTTP request methods, also known as HTTP verbs; two of them are relevant here:
- GET
- HEAD
When not instructed otherwise, wget sends the first of them. When the --spider option is used, the second one is sent, to which the server should respond with just the headers (no body).
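To make the verb difference visible, here is a minimal sketch, assuming curl is available and using a placeholder in place of your presigned URL; it sends each verb against the same URL and prints only the status code:

URL="https://example-bucket.s3.amazonaws.com/key?X-Amz-..."   # placeholder presigned URL
# GET -- what plain wget sends; expect 200 for a URL signed for GET
curl -s -o /dev/null -w 'GET  -> %{http_code}\n' "${URL}"
# HEAD -- what wget --spider sends; expect 403 here, because the
# signature embedded in the URL was computed for GET, not HEAD
curl -s -o /dev/null -w 'HEAD -> %{http_code}\n' -I "${URL}"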
AWS S3 presigned link
According to Signing and authenticating REST requests - Amazon Simple Storage Service, one of the steps in preparing the request is as follows:
StringToSign = HTTP-Verb "\n"
Content-MD5 "\n"
Content-Type "\n"
Date "\n"
CanonicalizedAmzHeaders
CanonicalizedResource;
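As a rough illustration (placeholder credentials only, and simplified to the older SigV2 query-string scheme quoted above; real signing also URL-encodes the result and folds in Content-MD5, Content-Type and any x-amz headers), the verb is the very first component of StringToSign, so signing for GET and for HEAD yields different signatures:

SECRET="fake-secret-key"               # placeholder, not a real key
EXPIRES=1893456000                     # placeholder expiry timestamp
RESOURCE="/example-bucket/some/key"    # placeholder CanonicalizedResource
sign() {
    local verb=$1
    # Verb \n Content-MD5 \n Content-Type \n Expires \n CanonicalizedResource
    printf '%s\n\n\n%s\n%s' "${verb}" "${EXPIRES}" "${RESOURCE}" \
        | openssl dgst -sha1 -hmac "${SECRET}" -binary \
        | base64
}
echo "GET  signature: $(sign GET)"
echo "HEAD signature: $(sign HEAD)"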
Therefore we might conclude that an AWS S3 presigned link works with exactly one of the HTTP verbs. The one you have was signed for GET. Consult whoever crafted that link to furnish you with an AWS S3 presigned link made for HEAD if you wish to use --spider successfully.
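If regenerating the link is not an option, one possible workaround is to stay with the GET verb the URL was signed for but fetch only the first byte and inspect the status code. This is only a sketch, assuming curl is available and that the Range header is not among the signed headers (it normally is not for presigned URLs):

URL="https://example-bucket.s3.amazonaws.com/key?X-Amz-..."   # placeholder presigned URL
status=$(curl -s -o /dev/null -w '%{http_code}' -r 0-0 "${URL}")
case "${status}" in
    200|206) echo "Remote file exists (HTTP ${status})" ;;
    *)       echo "Remote file check failed (HTTP ${status})" ;;
esac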