Hi Stackoverflow Community,
I am trying to import ONLY Australia's Google mobility CSV file into Python from: https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip (which is available at: https://www.google.com/covid19/mobility/). However, Google now provides a zipped folder of all the CSV files.
I am wondering if someone can point me in the right direction on how to import a file from within a zipped folder online.
I would like to achieve this 'without downloading' the zip folder to my PC and then importing the AUS csv. Is there a way to do all this in code, so that every time I run it, Python grabs the latest AUS csv file from the URL?
Thanks
CodePudding user response:
Looks like you're trying to do 3 things in sequence:
- download the zip file
- read the zip file
- read Australia's csv (assuming you know its filename)
We can do all this with just Python's built-in modules!
First, download the file with urllib.request:
import urllib.request
with urllib.request.urlopen("https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip") as f:
    zip_data: bytes = f.read()
Second, read the archive. zipfile can help you with that.
import zipfile
from io import BytesIO
z = zipfile.ZipFile(BytesIO(zip_data))
In this step, we wrapped zip_data (bytes) in BytesIO (a file-like object) because zipfile.ZipFile takes "a path to a file (a string), a file-like object or a path-like object".
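If the exact filename inside the archive is not known in advance, the same BytesIO wrapping can be used to list the archive members first. The following is a minimal sketch, not part of the original answer: the helper name find_members is hypothetical, and the small archive is built in memory with made-up contents so the example runs without a network connection.

```python
import zipfile
from io import BytesIO


def find_members(zip_bytes: bytes, needle: str) -> list[str]:
    """Return the names of archive members whose filename contains `needle`."""
    # BytesIO gives ZipFile the seekable file-like object it expects.
    with zipfile.ZipFile(BytesIO(zip_bytes)) as z:
        return [name for name in z.namelist() if needle in name]


# Build a small stand-in archive in memory (made-up contents, not real data).
buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr("2020_AU_Region_Mobility_Report.csv", "country_region_code,date\nAU,2020-02-15\n")
    z.writestr("2020_NZ_Region_Mobility_Report.csv", "country_region_code,date\nNZ,2020-02-15\n")
sample_zip = buffer.getvalue()

print(find_members(sample_zip, "_AU_"))
```

With the real download, zip_data from the first step would take the place of sample_zip.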
Last, parse the csv file with the csv module.
import csv
from io import StringIO
with z.open("2020_AU_Region_Mobility_Report.csv") as au_csv:
    # Decode the raw bytes, then wrap the text so csv.reader can iterate over its lines
    australia_data = csv.reader(StringIO(au_csv.read().decode("utf8")))
    for row in australia_data:
        print(row)
The line where we parse the csv is a little convoluted because csv.reader takes an iterator of strings, but au_csv.read() returns one blob of bytes. So we have to decode those bytes into a string, then wrap that string in StringIO so the reader can iterate over it line by line.
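As an alternative to decoding the whole file at once, the binary file object returned by z.open can be wrapped in io.TextIOWrapper, which decodes it as it is read. This is a sketch of that swapped-in technique rather than the approach used above; the helper name read_csv_member and the sample archive are made up for illustration.

```python
import csv
import io
import zipfile


def read_csv_member(zip_bytes: bytes, member: str) -> list[list[str]]:
    """Parse one CSV member of an in-memory zip archive into a list of rows."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        with z.open(member) as raw:
            # TextIOWrapper turns the binary stream into a text stream;
            # newline="" is the setting the csv module documentation recommends.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))


# Stand-in archive with made-up contents so the example runs offline.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr("sample.csv", "date,value\n2020-02-15,-3\n")

rows = read_csv_member(buffer.getvalue(), "sample.csv")
for row in rows:
    print(row)
```

For a file of this size either approach should be fine; the wrapper mainly avoids holding a second, fully decoded copy of the text in memory.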
Combine all of the above:
import csv
import urllib.request
import zipfile
from io import BytesIO, StringIO
# Download
with urllib.request.urlopen("https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip") as f:
    zip_data: bytes = f.read()
# Open zip file and parse csv
with zipfile.ZipFile(BytesIO(zip_data)) as z:
    with z.open("2020_AU_Region_Mobility_Report.csv") as au_csv:
        australia_data = csv.reader(StringIO(au_csv.read().decode("utf8")))
        for row in australia_data:
            print(row)
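Since the goal is to pick up the latest Australian data on every run, it may help to avoid hard-coding a single filename. The sketch below assumes (this is not confirmed above) that the archive may contain several files per country, named like the 2020 file, and collects every member whose name contains the country code. The names download_zip and country_rows are hypothetical, and the demo archive is made up so the example runs offline.

```python
import csv
import io
import urllib.request
import zipfile

ZIP_URL = "https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip"


def download_zip(url: str = ZIP_URL) -> bytes:
    """Fetch the archive into memory; nothing is written to disk."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def country_rows(zip_bytes: bytes, country_code: str) -> list[dict[str, str]]:
    """Collect rows from every CSV member whose name contains _<country_code>_."""
    rows: list[dict[str, str]] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        # Sorting the names keeps files in filename order (assumed to be year order).
        for name in sorted(z.namelist()):
            if f"_{country_code}_" not in name or not name.endswith(".csv"):
                continue
            with z.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows.extend(csv.DictReader(text))
    return rows


# Offline demo with a made-up archive (not real mobility data).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("2021_AU_Region_Mobility_Report.csv", "date,value\n2021-01-01,5\n")
    z.writestr("2020_AU_Region_Mobility_Report.csv", "date,value\n2020-02-15,-3\n")
    z.writestr("2020_NZ_Region_Mobility_Report.csv", "date,value\n2020-02-15,1\n")

au_rows = country_rows(demo.getvalue(), "AU")
print(len(au_rows), au_rows[0]["date"])
```

With network access, country_rows(download_zip(), "AU") would fetch the current archive each time the script runs.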