There are about 1,000 small JPEG files (about 20 KB each) located at URLs like:
https://example.com/file=1.0
https://example.com/file=1.1
...
https://example.com/file=1.973
https://example.com/file=1.974
How can I download them with a bash script? I do not know how to write scripts, but I think there is some simple way, using wget for example.
They all have the same filename.jpeg, so I need to save them under consecutive names like filename-1.jpg, filename-2.jpg, ...
CodePudding user response:
curl has a built-in feature for downloading multiple URLs from a generated sequence and applying that same sequence to the saved file names:
curl \
--user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36' \
--parallel \
--url "https://example.com/file=1.[0-974]" \
--output "filename-#1.jpg"
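Note that `--parallel` requires curl 7.66.0 or later; on older versions, drop that flag and the sequence glob still works. A quick way to see the `[N-M]` glob and the `#1` output variable in action without touching the network is curl's `file://` scheme (a self-contained sketch using a temporary directory, not the real URLs from the question):

```shell
# Offline demo of curl's [N-M] URL glob and the matching "#1"
# output variable, using file:// URLs instead of https://.
dir=$(mktemp -d)
for i in 0 1 2; do printf 'payload %s' "$i" > "$dir/file=1.$i"; done
curl -s "file://$dir/file=1.[0-2]" --output "$dir/filename-#1.jpg"
ls "$dir"/filename-*.jpg
```

Each value the glob generates is substituted for `#1` in the output name, producing filename-0.jpg through filename-2.jpg.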
See: https://curl.se/docs/manpage.html#-o

-o, --output <file>
Write output to <file> instead of stdout. If you are using {} or [] to fetch multiple documents, you should quote the URL and you can use '#' followed by a number in the <file> specifier. That variable will be replaced with the current string for the URL being fetched. Like in:
curl "http://{one,two}.example.com" -o "file_#1.txt"
or use several variables like:
curl "http://{site,host}.host[1-5].com" -o "#1_#2"
You may use this option as many times as the number of URLs you have. For example, if you specify two URLs on the same command line, you can use it like this:
curl -o aa example.com -o bb example.net
and the order of the -o options and the URLs does not matter, just that the first -o is for the first URL and so on, so the above command line can also be written as
curl example.com example.net -o aa -o bb
See also the --create-dirs option to create the local directories dynamically. Specifying the output as '-' (a single dash) will force the output to be done to stdout.
To suppress response bodies, you can redirect output to /dev/null:
curl example.com -o /dev/null
Or for Windows use nul:
curl example.com -o nul
Examples:
curl -o file https://example.com
curl "http://{one,two}.example.com" -o "file_#1.txt"
curl "http://{site,host}.host[1-5].com" -o "#1_#2"
curl -o file https://example.com -o file2 https://example.net
See also -O, --remote-name, --remote-name-all and -J, --remote-header-name.
CodePudding user response:
You can use a for loop with seq like this (wget's -O flag saves each file under the consecutive name the question asks for):
for i in $(seq 0 974); do wget -O "filename-$i.jpg" "https://example.com/file=1.$i"; done