How to stream into wget?

Time:12-31

tac FILE | sed -n -e 's/^.*URL: //p' | SEND TO WGET HERE

The one-liner above prints a list of URLs from a file, one per line. I am trying to stream/pipe these into wget directly. Each URL is a thumbnail picture, and I need to download a large number of them, so I am trying to write a one-liner to make this easier.

CodePudding user response:

This one liner above gives a list of URLs from a file, one per line. I am trying to (...) pipe these into wget directly.

To do this, use wget's -i file option: if you give - as the file, wget reads the URLs from standard input. From the wget man page:

-i file
--input-file=file

Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (...) If this function is used, no URLs need be present on the command line. (...)

So in your case

command | wget -i -

where command is any command whose output is one URL per line.
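Applied to the pipeline from the question, this could look like the sketch below. The function names extract_urls and download_all are hypothetical, introduced only for illustration, and the sketch assumes GNU tac is available and that FILE contains lines with "URL: " followed by the address, as in the question.

```shell
# Hypothetical helper: print the URLs found in a file, last line first.
# Assumes each relevant line contains "URL: " followed by the address.
extract_urls() {
    tac "$1" | sed -n -e 's/^.*URL: //p'
}

# Hypothetical helper: download every extracted URL.
# "-i -" tells wget to read the list of URLs from standard input.
download_all() {
    extract_urls "$1" | wget -i -
}

# Usage (not executed here): download_all FILE
```

Splitting the extraction from the download makes it easy to inspect the URL list with extract_urls FILE before starting a large download.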

CodePudding user response:

Use xargs to set the argument of a command from standard input:

tac FILE  | sed -n -e 's/^.*URL: //p' | xargs wget

Here each whitespace-separated word on the standard input of xargs is passed as a positional argument to wget.
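To see how xargs groups its input before anything is downloaded, one option is to put echo in front of wget, so the commands are printed instead of run. This is only a sketch: preview_wget is a hypothetical helper name, and the example.com URLs are placeholders.

```shell
# Hypothetical helper: print the wget commands xargs would build,
# without running them. The first argument is the maximum number of
# URLs per wget invocation (xargs -n); it defaults to 1.
preview_wget() {
    xargs -n "${1:-1}" echo wget
}

# Two URLs on one line and one on the next: xargs splits on any
# whitespace, so line boundaries in the input do not matter.
printf 'https://example.com/a.jpg   https://example.com/b.jpg\nhttps://example.com/c.jpg\n' \
    | preview_wget 2
# prints:
# wget https://example.com/a.jpg https://example.com/b.jpg
# wget https://example.com/c.jpg
```

Because xargs splits on whitespace and also interprets quotes and backslashes, URLs containing such characters may need extra care with this approach.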

Demo:

$ cat FILE
URL: https://google.com   https://netflix.com
asdfdas URL: https://stackoverflow.com

$ tac FILE  | sed -n -e 's/^.*URL: //p' | xargs wget
--2021-12-30 12:53:17--  https://stackoverflow.com/
Resolving stackoverflow.com (stackoverflow.com)... 151.101.65.69, 151.101.193.69, 151.101.129.69, ...
Connecting to stackoverflow.com (stackoverflow.com)|151.101.65.69|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.7’

index.html.7            [   <=>              ] 175,76K   427KB/s    in 0,4s    

2021-12-30 12:53:18 (427 KB/s) - ‘index.html.7’ saved [179983]

--2021-12-30 12:53:18--  https://google.com/
Resolving google.com (google.com)... 142.250.184.142, 2a00:1450:4017:80c::200e
Connecting to google.com (google.com)|142.250.184.142|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.google.com/ [following]
--2021-12-30 12:53:18--  https://www.google.com/
Resolving www.google.com (www.google.com)... 142.250.187.100, 2a00:1450:4017:807::2004
Connecting to www.google.com (www.google.com)|142.250.187.100|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://consent.google.com/ml?continue=https://www.google.com/&gl=GR&m=0&pc=shp&hl=el&src=1 [following]
--2021-12-30 12:53:19--  https://consent.google.com/ml?continue=https://www.google.com/&gl=GR&m=0&pc=shp&hl=el&src=1
Resolving consent.google.com (consent.google.com)... 216.58.206.206, 2a00:1450:4017:80c::200e
Connecting to consent.google.com (consent.google.com)|216.58.206.206|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.8’

index.html.8            [ <=>                ]  12,16K  --.-KB/s    in 0,01s   

2021-12-30 12:53:19 (1,25 MB/s) - ‘index.html.8’ saved [12450]

--2021-12-30 12:53:19--  https://netflix.com/
Resolving netflix.com (netflix.com)... 54.155.246.232, 18.200.8.190, 54.73.148.110, ...
Connecting to netflix.com (netflix.com)|54.155.246.232|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.netflix.com/ [following]
--2021-12-30 12:53:19--  https://www.netflix.com/
Resolving www.netflix.com (www.netflix.com)... 54.155.178.5, 3.251.50.149, 54.74.73.31, ...
Connecting to www.netflix.com (www.netflix.com)|54.155.178.5|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://www.netflix.com/gr-en/ [following]
--2021-12-30 12:53:20--  https://www.netflix.com/gr-en/
Reusing existing connection to www.netflix.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.9’

index.html.9            [   <=>              ] 424,83K  1003KB/s    in 0,4s    

2021-12-30 12:53:21 (1003 KB/s) - ‘index.html.9’ saved [435027]

FINISHED --2021-12-30 12:53:21--
Total wall clock time: 4,1s
Downloaded: 3 files, 613K in 0,8s (725 KB/s)