copy_stream to download file from remote URL

Time:10-13

We want to download files from a remote URL into memory and then upload them to a public cloud. I am planning to use IO.copy_stream in Ruby, but I am not sure whether it can do this, because I also need to keep memory and CPU usage low enough that performance does not suffer.

Any suggestions or examples of how to achieve this with IO.copy_stream in Ruby? Or is there another library better suited to this, given the performance constraints?

https://ruby-doc.org/core-2.5.5/IO.html

CodePudding user response:

You can set up src and dst as simple IO abstractions that respond to read and write:

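# srchost and dsthost are the SSH host names of the two machines.
# Read the source file from the remote host, gzip-compressed in transit.
src = IO.popen(["ssh", srchost, "cat /path/to/source_file | gzip"], "r")
# Decompress on the destination host and write the result to disk there.
dst = IO.popen(["ssh", dsthost, "gunzip > /path/to/dest_file"], "w")

# Stream from src to dst without loading the whole file into memory.
IO.copy_stream(src, dst)

src.close
dst.close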
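
For the HTTP case in the question, the same idea can be sketched with a pipe: a background thread writes the response body into the write end, and IO.copy_stream drains the read end into the destination, so only a chunk at a time is held in memory. This is a rough sketch, not a tested uploader. stream_into and download_to are hypothetical helper names, and dst is assumed to be any object that responds to write (for example whatever writable handle your cloud client exposes).

```ruby
require "net/http"
require "uri"

# Hypothetical helper: runs the block in a background thread, handing it the
# write end of a pipe, and copies everything the block writes into dst.
# Returns the number of bytes copied.
def stream_into(dst)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  producer = Thread.new do
    begin
      yield writer
    ensure
      writer.close # signals end-of-stream to the reading side
    end
  end

  copied = IO.copy_stream(reader, dst)
  producer.value # re-raises any error from the producer thread
  copied
ensure
  reader&.close
end

# Hypothetical usage: stream an HTTP response body into dst chunk by chunk.
def download_to(url, dst)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request_get(uri.request_uri) do |response|
      stream_into(dst) do |writer|
        response.read_body { |chunk| writer.write(chunk) }
      end
    end
  end
end
```

If the upload client wants a readable IO rather than a writable one, the read end of the pipe could be handed to it directly in place of the IO.copy_stream call.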

CodePudding user response:

  • Set up src to be the downloadable file.
  • Set up dst to be the cloud resource, with write permission.
  • Make sure the two are compliant with sendfile().

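sendfile() is a kernel-level copy routine: the data moves between the two file descriptors inside the kernel, without passing through your process. In terms of RAM use and performance, it is very hard to beat, because your application is not involved in moving the bytes.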

For sendfile(), the output socket must have zero-copy support and the input file must have mmap() support. In general, this means the file has already been downloaded to local disk, it does not change during the copy, and you have an open socket to the destination.
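
In Ruby you do not call sendfile() yourself. As far as I know, IO.copy_stream tries sendfile() on Linux when the source is a regular file and falls back to an ordinary read/write loop otherwise, so treat the zero-copy path as an optimisation rather than a guarantee. A minimal sketch, assuming a Unix-like system: send_file_over is a hypothetical helper name, and the local socket pair only stands in for a real connection to the upload endpoint.

```ruby
require "socket"
require "tempfile"

# Hypothetical helper: streams an already-downloaded local file to an open
# socket and returns the number of bytes copied.
def send_file_over(path, socket)
  File.open(path, "rb") do |file|
    # With a regular file as the source, Ruby may use sendfile() here
    # (assumption: Linux build); otherwise it copies in chunks.
    IO.copy_stream(file, socket)
  end
end

# Demo: a local socket pair stands in for the connection to the upload endpoint.
Tempfile.create("payload") do |tmp|
  tmp.binmode
  tmp.write("x" * 200_000)
  tmp.flush

  left, right = UNIXSocket.pair
  reader = Thread.new { right.read } # drain the other end concurrently
  sent = send_file_over(tmp.path, left)
  left.close                         # signals EOF to the reader
  received = reader.value
  right.close

  puts "sent #{sent} bytes, received #{received.bytesize} bytes"
end
```

Either way the file is never read into a Ruby string in one piece, so memory use stays roughly constant regardless of file size.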
