I feed cURL multiple URLs at a time and have difficulty parsing the output log to get the original addresses back. Namely, if a URL resolves, the output is as follows:
$ curl --head --verbose https://www.google.com/
* Trying 64.233.165.106...
* TCP_NODELAY set
* Connected to www.google.com (64.233.165.106) port 443 (#0)
<...>
> HEAD / HTTP/2
> Host: www.google.com
<...>
which can eventually be parsed back to https://www.google.com/.
However, when the connection fails, it does not:
$ curl --head --verbose --connect-timeout 3 https://imap.gmail.com/
* Trying 74.125.131.109...
* TCP_NODELAY set
* After 1491ms connect time, move on!
* connect to 74.125.131.109 port 443 failed: Operation timed out
<...>
* Failed to connect to imap.gmail.com port 443: Operation timed out
The error message contains the URL in this case, but in other cases it does not, so I can't rely on it.
So I need either to disable URL-to-IP resolution in the output, like this:
* Trying https://imap.gmail.com/...
or to somehow prepend each URL from the list to its corresponding output, like this:
$ curl --head --verbose --connect-timeout 3 https://imap.gmail.com/ https://www.google.com/
https://imap.gmail.com/
* Trying 64.233.162.108...
* TCP_NODELAY set
* After 1495ms connect time, move on!
* connect to 64.233.162.108 port 443 failed: Operation timed out
<...>
https://www.google.com/
* Trying 74.125.131.17...
* TCP_NODELAY set
* Connected to www.google.com (74.125.131.17) port 443 (#0)
<...>
Wget and HTTPie are not an option. How can one achieve that with cURL?
CodePudding user response:
Perhaps this is the solution:
# Read each URL from the list, echo it as a marker, then run curl,
# appending both stdout and stderr (the verbose log) to the same file.
while read -r LINE ; do
    echo "REQUESTED URL: $LINE" >> output.txt
    curl --head --verbose --connect-timeout 3 "$LINE" >> output.txt 2>&1
done < url-list.txt
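Alternatively, curl itself can print the requested URL after each transfer via --write-out, which also works when several URLs are passed in a single invocation. A minimal sketch (the %{url} variable requires curl 7.75.0 or newer; older versions only have %{url_effective}, which holds the last URL actually used):

```shell
# After each transfer (successful or not), curl appends the --write-out
# template, so every verbose block ends with a line naming its URL.
curl --head --verbose --connect-timeout 3 \
     --write-out 'REQUESTED URL: %{url_effective}\n' \
     https://imap.gmail.com/ https://www.google.com/ 2>&1
```

The trade-off is that the marker line comes after each block rather than before it, but it is emitted per transfer, so the log can still be split unambiguously.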