I want to export all of my existing and future Firestore documents to BigQuery in a tabular format.
Example: if I have a User collection in Firestore like this:
{
  "name": "Jon",
  "score": 10
}
Then the BigQuery table should look like this:
-------------------------------------
| id            | name     | score  |
-------------------------------------
| <document_id> | Jon      | 10     |
-------------------------------------
I have tried:
- The "Export Collections to BigQuery" extension. This does not export the data in the above format; rather, it keeps a changelog of each insert/update/delete event.
- Manually creating a table in the dataset and importing a Firestore backup. This does give me the BigQuery table view I want, but it is not updated automatically (the way the extension is).
How do I get the tabular format shown above, with the table updating automatically whenever the Firestore documents change? Do I have to write my own Cloud Function to push all the data to BigQuery?
CodePudding user response:
The schema-views scripts that come with the "Export Collections to BigQuery" extension allow you to create views that let you query your data in a more tabular form, like you suggested. You set them up once and can then query against them even as new data comes in.
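For example, you would describe the fields of your User documents in a schema file and run the script against the changelog table the extension maintains. The following is only a sketch: the project, dataset, and prefix values are placeholders, and the exact flags are documented in the extension's GENERATE_SCHEMA_VIEWS guide.

Schema file (e.g. users_schema.json):
{
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "score", "type": "number" }
  ]
}

Generate the views:
npx @firebaseextensions/fs-bq-schema-views \
  --non-interactive \
  --project=my-project-id \
  --dataset=firestore_export \
  --table-name-prefix=users \
  --schema-files=./users_schema.json

This produces a "latest" view you can query for the current value of each document, in the tabular shape shown above, and it stays current as the extension records new changes.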
CodePudding user response:
- The "Export Collections to BigQuery" extension. This does not export the data in the above format; rather, it keeps a changelog of each insert/update/delete event.
As explained by Michael, with the "Export Collections to BigQuery" extension you can run the schema-views script, which ships with the extension, to create a view matching your desired data model.
Another approach is to write a Cloud Function that writes directly to BigQuery. The following simple example shows how a scheduled Cloud Function reads the docs from the users collection and adds them to a BigQuery table.
It's up to you to adapt it. In particular, if you want the BigQuery table to be updated whenever Firestore documents are created/updated/deleted, you should use an onWrite() Cloud Function instead; a sketch of that approach follows the scheduled example below.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
const { BigQuery } = require('@google-cloud/bigquery');

admin.initializeApp();
const bigquery = new BigQuery();

exports.exportToBigQuery = functions.pubsub.schedule('every 24 hours').onRun(async (context) => {
  try {
    // Get the user docs. !! UP TO YOU TO ADAPT THE QUERY !!
    const usersQuerySnapshot = await admin.firestore().collection("users").get();

    // Map each document to a flat row matching the BigQuery table schema
    const rows = [];
    usersQuerySnapshot.forEach((doc) => {
      rows.push({ id: doc.id, name: doc.get("name"), score: doc.get("score") });
    });

    // Prefer environment configuration for the dataset/table names:
    // https://firebase.google.com/docs/functions/config-env
    const dataset = bigquery.dataset("dataset-name");
    const table = dataset.table("table-name");

    // Stream the rows into BigQuery
    await table.insert(rows);
    return null;
  } catch (error) {
    console.log(error);
    return null;
  }
});
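And here is a minimal sketch of the onWrite() approach, reusing the requires and initialization from the scheduled example above. The collection path and the dataset/table names are placeholders to adapt. Note that insert() only appends rows, so updates and deletes are not reflected as true row mutations; handling those properly would need a MERGE query, or a changelog-plus-view design like the extension uses.

// Runs on every create/update/delete of a document in the users collection.
// Reuses the `functions`, `admin` and `bigquery` constants from the example above.
exports.syncUserToBigQuery = functions.firestore
  .document('users/{userId}')
  .onWrite(async (change, context) => {
    // A deleted document has no "after" snapshot; decide how you want to handle it.
    if (!change.after.exists) {
      console.log(`User ${context.params.userId} was deleted`);
      return null;
    }

    const data = change.after.data();
    const row = { id: context.params.userId, name: data.name, score: data.score };

    // insert() appends a new row; it does not update rows already in the table.
    await bigquery.dataset('dataset-name').table('table-name').insert([row]);
    return null;
  });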