I am trying to print visible feedback for the user in the terminal while my application downloads a file from the web and writes it to the hard drive, but I could not figure out how to do this from the documentation or by googling.
This is my code:
res = requests.get(url_to_file)
with open("./downloads/%s" % (file_name), 'wb') as f:
f.write(res.content)
I was hoping to figure out how to produce something like this:
Downloading file ........
# the dots keep appearing until the download is finished and the file is written
Done!
I am really struggling even to get started, because none of the methods returns a "promise" (like in JS).
Any help would be much appreciated! Thanks!
CodePudding user response:
requests.get by default downloads the entirety of the requested resource before it returns to you. However, it has an optional argument stream, which allows you to invoke .iter_content or .iter_lines on the Response object. This lets you take action every N bytes (or as each chunk of data arrives), or at every line, respectively. Something like this:
import requests

chunks = []
chunk_size = 16384  # 16 KiB chunks
# alternatively:
# chunk_size = None  # act on each chunk as it arrives
res = requests.get(url_to_file, stream=True)
for chunk in res.iter_content(chunk_size):
    chunks.append(chunk)
    print(".", end="", flush=True)  # flush so each dot shows up immediately
data = b''.join(chunks)
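Applied to your original goal of writing the download to disk, a minimal sketch (reusing the url_to_file and file_name variables from your question; the chunk size is arbitrary) could look like this:

import requests

chunk_size = 16384  # 16 KiB per chunk; tune to taste
res = requests.get(url_to_file, stream=True)
print("Downloading file ", end="", flush=True)
with open("./downloads/%s" % file_name, "wb") as f:
    for chunk in res.iter_content(chunk_size):
        f.write(chunk)                   # write each chunk as it arrives
        print(".", end="", flush=True)   # one dot per chunk received
print("\nDone!")

Writing the chunks as they arrive also means the whole download never has to sit in memory at once.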
Either way, this still blocks, so nothing else will be happening in the meantime. If you want something closer to the JavaScript style, per Grismar's comment, you should run under Python's async event loop. In that case, I suggest using aiohttp rather than requests, as it was created with async style in mind.
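For illustration, a minimal aiohttp sketch of the same idea (streaming the body and printing one dot per chunk; <some_url> is a placeholder) might look like this:

import asyncio
import aiohttp

async def download(url: str) -> bytes:
    chunks = []
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            # resp.content is a stream; iter_chunked yields pieces of up to the given size
            async for chunk in resp.content.iter_chunked(16384):
                chunks.append(chunk)
                print(".", end="", flush=True)
    print("\nDone!")
    return b"".join(chunks)

data = asyncio.run(download("<some_url>"))

Other coroutines can then run concurrently with the download, which is the behaviour you would expect from a JavaScript-style promise.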
CodePudding user response:
Here's a version that downloads the file into a bytearray in a separate thread.
As mentioned in other answers and comments, there are alternatives that were developed with async operations in mind, so don't read too much into the decision to go with threading; it's just to demonstrate the concept (and it is convenient, since it ships with Python).
In the code below, if the size of the file is known, each . will correspond to 1% of the file. As a bonus, the number of bytes downloaded and the total number of bytes are printed at the start of the line, like (1234 B / 1234567 B). If the size is not known, the fallback is to have each . represent a chunk.
import requests
import threading


def download_file(url: str):
    headers = {"<some_key>": "<some_value>"}
    data = bytearray()
    with requests.get(url, headers=headers, stream=True) as request:
        if file_size := request.headers.get("Content-Length"):
            file_size = int(file_size)
        else:
            file_size = None

        received = 0
        for chunk in request.iter_content(chunk_size=2**15):
            received += len(chunk)  # total bytes received so far
            data += chunk           # append the chunk to the bytearray
            try:
                num_dots = int(received * 100 / file_size)  # one dot per percent
                print(
                    f"({received} B/{file_size} B) " + "." * num_dots,
                    end="\r",
                    flush=True,
                )
            except TypeError:
                # file_size is None, so fall back to one dot per chunk
                print(".", end="", flush=True)
    print("\nDone!")
url = "<some_url>"
thread = threading.Thread(target=download_file, args=(url,))
thread.start()
# Do something in the meantime
thread.join()
Do keep in mind that I've left out the lock that would protect against simultaneous access to stdout, to reduce the noise. I've also left out writing the bytearray to a file at the end (or writing the chunks to the file as they are received, if the file is large), but keep in mind that you may want a lock for that as well if you read and/or write to the same file in any other part of your script.
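For reference, a minimal sketch of such a lock, assuming both the download thread and the main thread print to stdout (the names print_lock and log are mine, not something the code above defines):

import threading

print_lock = threading.Lock()  # shared by every thread that prints

def log(message: str, end: str = "\n") -> None:
    # Serialize access to stdout so output from different
    # threads does not get interleaved mid-line.
    with print_lock:
        print(message, end=end, flush=True)

Inside download_file, the bare print(...) calls would then be replaced with log(...), and the main thread would use log(...) for anything it prints while the download is running.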