Home > Software design >  How to cleanup BigQuery export to CSV using pandas
How to cleanup BigQuery export to CSV using pandas

Time:05-08

I exported my BigQuery data to CSV but can't figure out how to clean up the data as the headers are all appended on the backend in the same row.

Here's my code:

from google.cloud import bigquery
import pandas as pd

project = 'project1'
client = bigquery.Client(project=project)

defineQuery = """
    SELECT *  
    FROM table
    LIMIT 5;    
"""

df = pd.DataFrame(client.query(defineQuery)) 
df.to_csv(r'C:/file-name.csv') 

The BigQuery CSV file export came out with just 2 columns like the following:-

|     | 0     |
|-----|-------|
|    1|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
|    2|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
|    3|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
|    4|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
|    5|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})

How do I clean it up so it's like the following?

|     | Field 1 | Field 2 | Field 3 |
|-----|---------|---------|---------|
|    1| 123     | data    | data2   |
|    2| 123     | data    | data2   |
|    3| 123     | data    | data2   |
|    4| 123     | data    | data2   |
|    5| 123     | data    | data2   |

Thanks in advance!

CodePudding user response:

Try this:

df = client.query(defineQuery).to_dataframe() 

instead of

df = pd.DataFrame(client.query(defineQuery)) 
  • Related