tac FILE | sed -n -e 's/^.*URL: //p' | SEND TO WGET HERE
The one-liner above prints a list of URLs extracted from a file, one per line. I am trying to pipe these directly into wget.
Each URL points to a thumbnail image, and I need to download a large number of them, so I want a single one-liner that handles the whole process.
CodePudding user response:
This one liner above gives a list of URLs from a file, one per line. I am trying to (...) pipe these into wget directly.
To do so, you can use the -i file option: if you give - as file, wget reads the URLs from standard input. From the wget man page:
-i file
--input-file=file
Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (...) If this function is used, no URLs need be present on the command line. (...)
So in your case
command | wget -i -
where command is any command that outputs one URL per line.
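Applied to the pipeline from the question, this could look like the sketch below. The function names extract_urls and download_urls are hypothetical helpers introduced here for illustration only; the tac and sed commands are taken from the question, and wget -i - is the option quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the URLs found in the file given as $1,
# one per line, using the same tac + sed pipeline as in the question.
extract_urls() {
    tac "$1" | sed -n -e 's/^.*URL: //p'
}

# Hypothetical helper: download every URL read from standard input.
# "-i -" tells wget to take its URL list from stdin instead of a file.
download_urls() {
    wget -i -
}

# Usage (FILE is the input file from the question):
#   extract_urls FILE | download_urls
```

Splitting the pipeline this way makes it easy to inspect the URL list on its own (run extract_urls FILE) before handing it to wget.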
CodePudding user response:
Use xargs to turn standard input into the arguments of a command:
tac FILE | sed -n -e 's/^.*URL: //p' | xargs wget
Here each whitespace-separated word on the standard input of xargs is passed as a positional argument to wget, so a line that contains two URLs results in two downloads.
Demo:
$ cat FILE
URL: https://google.com https://netflix.com
asdfdas URL: https://stackoverflow.com
$ tac FILE | sed -n -e 's/^.*URL: //p' | xargs wget
--2021-12-30 12:53:17-- https://stackoverflow.com/
Resolving stackoverflow.com (stackoverflow.com)... 151.101.65.69, 151.101.193.69, 151.101.129.69, ...
Connecting to stackoverflow.com (stackoverflow.com)|151.101.65.69|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.7’
index.html.7 [ <=> ] 175,76K 427KB/s in 0,4s
2021-12-30 12:53:18 (427 KB/s) - ‘index.html.7’ saved [179983]
--2021-12-30 12:53:18-- https://google.com/
Resolving google.com (google.com)... 142.250.184.142, 2a00:1450:4017:80c::200e
Connecting to google.com (google.com)|142.250.184.142|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.google.com/ [following]
--2021-12-30 12:53:18-- https://www.google.com/
Resolving www.google.com (www.google.com)... 142.250.187.100, 2a00:1450:4017:807::2004
Connecting to www.google.com (www.google.com)|142.250.187.100|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://consent.google.com/ml?continue=https://www.google.com/&gl=GR&m=0&pc=shp&hl=el&src=1 [following]
--2021-12-30 12:53:19-- https://consent.google.com/ml?continue=https://www.google.com/&gl=GR&m=0&pc=shp&hl=el&src=1
Resolving consent.google.com (consent.google.com)... 216.58.206.206, 2a00:1450:4017:80c::200e
Connecting to consent.google.com (consent.google.com)|216.58.206.206|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.8’
index.html.8 [ <=> ] 12,16K --.-KB/s in 0,01s
2021-12-30 12:53:19 (1,25 MB/s) - ‘index.html.8’ saved [12450]
--2021-12-30 12:53:19-- https://netflix.com/
Resolving netflix.com (netflix.com)... 54.155.246.232, 18.200.8.190, 54.73.148.110, ...
Connecting to netflix.com (netflix.com)|54.155.246.232|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.netflix.com/ [following]
--2021-12-30 12:53:19-- https://www.netflix.com/
Resolving www.netflix.com (www.netflix.com)... 54.155.178.5, 3.251.50.149, 54.74.73.31, ...
Connecting to www.netflix.com (www.netflix.com)|54.155.178.5|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://www.netflix.com/gr-en/ [following]
--2021-12-30 12:53:20-- https://www.netflix.com/gr-en/
Reusing existing connection to www.netflix.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.9’
index.html.9 [ <=> ] 424,83K 1003KB/s in 0,4s
2021-12-30 12:53:21 (1003 KB/s) - ‘index.html.9’ saved [435027]
FINISHED --2021-12-30 12:53:21--
Total wall clock time: 4,1s
Downloaded: 3 files, 613K in 0,8s (725 KB/s)
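To check what xargs will run before downloading anything, one option is a dry run that puts echo in front of the command. The sketch below reuses the sample data from the demo (held in a hypothetical shell variable named sample) and adds -n 1, which makes xargs start one wget per URL.

```shell
#!/bin/sh
# Sample data mirroring the demo above: the first line carries two URLs.
sample='URL: https://google.com https://netflix.com
asdfdas URL: https://stackoverflow.com'

# Dry run: "echo wget" prints each command instead of executing it.
# -n 1 passes a single URL to each invocation.
preview=$(printf '%s\n' "$sample" | tac | sed -n -e 's/^.*URL: //p' | xargs -n 1 echo wget)
printf '%s\n' "$preview"

# Real run, once the preview looks right (drop the echo):
#   tac FILE | sed -n -e 's/^.*URL: //p' | xargs -n 1 wget
# Assumption: GNU or BSD xargs, which also accept -P N to run N jobs in parallel:
#   tac FILE | sed -n -e 's/^.*URL: //p' | xargs -n 1 -P 4 wget -q
```

The preview lists the URLs in the same order as the demo (stackoverflow.com, google.com, netflix.com), because tac reverses the lines of the file before sed extracts the URLs.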