Ingesting/importing a country specific Google Mobility CSV file into Python

Time:09-17

Hi Stack Overflow community,

I am trying to import ONLY Australia's Google mobility CSV file into Python from https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip (linked from https://www.google.com/covid19/mobility/). However, Google now provides a single zipped folder containing all of the CSV files.

I am wondering if someone can point me in the right direction on how to import a file from within a zipped folder online.

I would like to achieve this without first downloading the zip folder to my PC and then importing the AUS CSV. Is there a way to do all of this in code, so that every time I run it, Python grabs the latest AUS CSV file from the URL?

Thanks

Answer:

Looks like you're trying to do 3 things in sequence:

  1. download the zip file
  2. read the zip file
  3. read Australia's csv (assuming you know its filename)

We can do all of this using only Python's standard library!

First, download the file with urllib.request:

import urllib.request

with urllib.request.urlopen("https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip") as f:
    zip_data: bytes = f.read()

Second, read the archive. The zipfile module can help you with that.

import zipfile
from io import BytesIO

z = zipfile.ZipFile(BytesIO(zip_data))

In this step, we wrapped zip_data (bytes) in BytesIO (a file-like object) because zipfile.ZipFile takes "a path to a file (a string), a file-like object or a path-like object".
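
If you don't know the exact filename, the archive can list its own members. Here's a minimal sketch, assuming the files follow a "<year>_<country code>_Region_Mobility_Report.csv" naming pattern (for example 2020_AU_Region_Mobility_Report.csv). find_country_files is a made-up helper name for illustration, not part of zipfile:

```python
import zipfile
from io import BytesIO


def find_country_files(zip_data: bytes, country_code: str) -> list[str]:
    """Return the archive members that look like reports for one country."""
    # Assumption: names end with "_<code>_Region_Mobility_Report.csv".
    suffix = f"_{country_code}_Region_Mobility_Report.csv"
    with zipfile.ZipFile(BytesIO(zip_data)) as z:
        # namelist() returns the names of every member in the archive.
        return sorted(name for name in z.namelist() if name.endswith(suffix))
```

Calling find_country_files(zip_data, "AU") on the bytes downloaded in the first step should give you the Australian filenames, if the naming assumption holds.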

Last, parse the CSV file with the csv module.

import csv
from io import StringIO

with z.open("2020_AU_Region_Mobility_Report.csv") as au_csv:
    australia_data = csv.reader(StringIO(au_csv.read().decode("utf8")))

for row in australia_data:
    print(row)

The line where we parse the CSV is a little convoluted because csv.reader takes an iterable of strings (one per line), but au_csv.read() returns one blob of bytes. So we have to decode those bytes into a string, then wrap that string in StringIO, which the reader can iterate over line by line.
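
As an optional alternative to the decode-then-StringIO step, io.TextIOWrapper can decode the member as it is read. This is only a sketch: read_csv_member is a hypothetical helper name, and it assumes the file is UTF-8 with a header row (it uses csv.DictReader, so each row comes back as a dict keyed by column name):

```python
import csv
import io
import zipfile


def read_csv_member(zip_data: bytes, member_name: str) -> list[dict[str, str]]:
    """Decode one CSV inside the archive and return its rows as dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_data)) as z:
        with z.open(member_name) as raw:
            # TextIOWrapper turns the byte stream into a text stream;
            # newline="" is what the csv module docs recommend for file input.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # Read everything before the member is closed.
            return list(csv.DictReader(text))
```

This avoids building one big decoded string by hand, though for a file of this size either approach should be fine.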

Combine all of the above:

import csv
import urllib.request
import zipfile
from io import BytesIO, StringIO

# Download
with urllib.request.urlopen("https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip") as f:
    zip_data: bytes = f.read()

# Open zip file and parse csv
with zipfile.ZipFile(BytesIO(zip_data)) as z:
    with z.open("2020_AU_Region_Mobility_Report.csv") as au_csv:
        australia_data = csv.reader(StringIO(au_csv.read().decode("utf8")))

for row in australia_data:
    print(row)
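
To get the "grab the latest data every time I run it" behaviour from the question, the same steps can be wrapped in a function. This is a sketch under assumptions, not a definitive implementation: rows_for_country and fetch_country_rows are hypothetical names, and it assumes every file for a country has "_<code>_" in its name and shares the same header row.

```python
import csv
import io
import urllib.request
import zipfile

URL = "https://www.gstatic.com/covid19/mobility/Region_Mobility_Report_CSVs.zip"


def rows_for_country(zip_data: bytes, country_code: str) -> list[list[str]]:
    """Collect the rows of every member whose name contains "_<code>_"."""
    rows: list[list[str]] = []
    header_seen = False
    with zipfile.ZipFile(io.BytesIO(zip_data)) as z:
        for name in sorted(z.namelist()):
            if f"_{country_code}_" not in name:  # assumed naming pattern
                continue
            with z.open(name) as member:
                reader = csv.reader(io.StringIO(member.read().decode("utf8")))
                header = next(reader, None)
                if header is None:  # empty file, nothing to add
                    continue
                if not header_seen:  # keep the header row only once
                    rows.append(header)
                    header_seen = True
                rows.extend(reader)
    return rows


def fetch_country_rows(country_code: str, url: str = URL) -> list[list[str]]:
    """Download the archive into memory and return one country's rows."""
    with urllib.request.urlopen(url) as f:
        return rows_for_country(f.read(), country_code)
```

With this, fetch_country_rows("AU") downloads the zip into memory on each run and returns the Australian rows, so nothing is saved to disk.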